On Sat, 9 Apr 2022 at 03:22, Vinícius dos Santos Oliveira
Can you clarify what you mean by "erased afterwards"? Afterwards when? Before or after the delivery to the user? When? I need to know when before I can comment much further.
Let me give you an example. A map data type with two elements looks like this on the wire:

   "%2\r\n$4\r\nkey1\r\n$6\r\nvalue1\r\n$4\r\nkey2\r\n$6\r\nvalue2\r\n"

The parser starts by reading the message header with async_read_until (\r\n) and sees it is a map with size 2. This information is passed to the user by means of a callback (the adapter in my examples):

   callback({type::map, 2, ...}, ...);

After that, the read operation consumes the "%2\r\n" and the buffer content is reduced to

   "$4\r\nkey1\r\n$6\r\nvalue1\r\n$4\r\nkey2\r\n$6\r\nvalue2\r\n"

Reading the next element works likewise, but now the element is a blob type and not a map. The parser reads the header (again with read_until) to learn the size of the blob, and "$4\r\n" is consumed, reducing the buffer to

   "key1\r\n$6\r\nvalue1\r\n$4\r\nkey2\r\n$6\r\nvalue2\r\n"

Then it reads the blob "key1" with a read of size 6 (two extra bytes to consume the \r\n) and passes that info to the user:

   callback({type::blob_string, 1, 1, "key1"}, ...);

"key1\r\n" is then consumed, resulting in the buffer

   "$6\r\nvalue1\r\n$4\r\nkey2\r\n$6\r\nvalue2\r\n"

The same procedure is applied to the remaining elements until the map is completely processed. In simple terms: as soon as data becomes available, it is passed to the user and then consumed,

   callback(...);
   x.consume(n);
It's not clear to me at all how aedis manages the buffer. If the buffer were an internal implementation detail (as in wrapping the underlying socket) I wouldn't care, but as it is... it's part of the public interface and I must understand how to use it.
Sure, does the explanation above make things clearer?
Golang's bufio.Scanner implementation avoids excessive memcpys to the head of the buffer by using a "moving window" over the buffer. It only uses the tail of the buffer for new read operations. Only when the buffer is completely full does it memcpy the current message to the head of the buffer, so as to free up space.
That is something I would like to see in Asio. It would definitely improve performance.
The pattern to parse the textual protocol is simple: get message, process message, discard message.
Upon accumulating a whole message, you decode its fields.
What is the point of accumulating the whole message if I am done with what has already been read, for example when reading a Redis Hash with millions of elements into a std::unordered_map?
Does Redis's usage pattern feel similar to this? If it doesn't, then how does it differ? If it differs, I should reevaluate my thoughts for this discussion.
I hope these points were also addressed in the comments above. If not, please ask.
As for "[rather] than keeping it in intermediate storage", that's more complex. The deserialized object *is* intermediate storage. The question is: can I use pointers to the original stream to put less pressure on the allocator (even if we customize the allocator, the gains only accumulate)? For instance, suppose the deserialized object is a map of views into the stream:

   for (;;) {
       dynamic_buffer buf;
       map<string_view, string_view> result;
       auto message_size = read(socket, buf, result);
       process(result);
       buf.consume(message_size);
   }

Now the container is cheaper.
Ditto. This is indeed something nice that I would like to support, but as I said above, I don't know how to achieve it with the current Asio buffers, nor whether it would really pay off.

Regards,
Marcelo