On Sat, 9 Apr 2022 at 03:22, Vinícius dos Santos Oliveira
Can you clarify what you mean by "erased afterwards"? Afterwards when? Before or after the delivery to the user? When? I need to know when before I can comment much further.
Let me give you an example. A map data type with two elements looks like this on the wire:

   "%2\r\n$4\r\nkey1\r\n$6\r\nvalue1\r\n$4\r\nkey2\r\n$6\r\nvalue2\r\n"

The parser starts by reading the message header with async_read_until (\r\n) and sees it is a map with size 2. This information is passed to the user by means of a callback (the adapter in my examples):

   callback({type::map, 2, ...}, ...);

After that, the read operation consumes the "%2\r\n" and the buffer content is reduced to

   "$4\r\nkey1\r\n$6\r\nvalue1\r\n$4\r\nkey2\r\n$6\r\nvalue2\r\n"

Reading the next element works likewise, but now the element is a blob type and not a map. The parser reads the header (again with read_until) to learn the size of the blob, and "$4\r\n" is consumed, reducing the buffer to

   "key1\r\n$6\r\nvalue1\r\n$4\r\nkey2\r\n$6\r\nvalue2\r\n"

Then it reads the blob "key1" with a read of size 6 (two extra bytes to consume the \r\n) and passes that info to the user:

   callback({type::blob_string, 1, 1, "key1"}, ...);

"key1\r\n" is then consumed, resulting in the buffer

   "$6\r\nvalue1\r\n$4\r\nkey2\r\n$6\r\nvalue2\r\n"

The same procedure is applied to the remaining elements until the map is completely processed. In simple terms: as soon as data becomes available, it is passed to the user and then consumed,

   callback(...);
   x.consume(n);
It's not clear to me at all how aedis manages the buffer. If the buffer were an internal implementation detail (as in wrapping the underlying socket) I wouldn't care, but as it is... it's part of the public interface and I must understand how to use it.
Sure, does the explanation above make things clearer?
Golang's bufio.Scanner implementation avoids excessive memcpys to the head of the buffer by using a "moving window" over the buffer. It only uses the tail of the buffer for new read operations. Only when the buffer is completely full does it memcpy the current message to the head of the buffer, so as to free up space.
That is something I would like to see in Asio. It would definitely improve performance.
The pattern to parse the textual protocol is simple: get message, process message, discard message.
Upon accumulating a whole message, you decode its fields.
What is the point of accumulating the whole message if I am done with what has already been read, for example when reading a Redis Hash with millions of elements into a std::unordered_map?
Does Redis's usage pattern feel similar to this? If it doesn't, then how does it differ? If it differs, I should reevaluate my thoughts for this discussion.
I hope these points were also addressed in the comments above. If not, please ask.
As for "[rather] than keeping it in intermediate storage", that's more complex. The deserialized object *is* intermediate storage. The question is: can I use pointers to the original stream to put less pressure on the allocator (even if we customize the allocator, the gains only accumulate)? For instance, suppose the deserialized object is a map of views into the stream:

   for (;;) {
       dynamic_buffer buf;
       map<string_view, string_view> result;
       auto message_size = read(socket, buf, result);
       process(result);
       buf.consume(message_size);
   }

Now the container is cheaper.
Ditto. This is indeed something nice that I would like to support, but as I said above, I don't know how to achieve it with the current Asio buffers, nor whether it would really pay off.

Regards,
Marcelo