On Sat, Aug 15, 2015 at 3:06 PM, Vinícius dos Santos Oliveira <vini.ipsmaker@gmail.com> wrote:

> 2015-08-15 14:32 GMT-03:00 Lee Clagett:
>
>> Adding a size_t maximum read argument should be possible at a minimum. I do not see how this could hamper any possible backend; its only role is to explicitly limit how many bytes are inserted at the back of the container in a single call. With this feature, a client could at least reserve bytes in the container and prevent further allocations through a max_read argument.
>
> This is not a TCP socket, it's an HTTP socket. An HTTP message is not a stream of bytes.

HTTP has two streaming modes: chunked, and "connection:close" with no chunked encoding or defined length. It's not always viewed that way, but certain applications [1] take advantage of these "features" available in HTTP. The problem for many libraries (as you've partially mentioned before) is that if data is being sent in that fashion, they will consume memory without bounds if designed to read until the end of the payload.

> There is already a max read size. It's the buffer size you pass to basic_socket.
But this is not defined by the http::Socket concept. It's not possible to write a function that takes any http::Socket and limits the number of bytes being pushed into the container. A conforming http::Socket implementation is currently allowed to keep resizing the container as necessary to add data (even until the end of the payload), and I thought the prevention of exactly that scenario was being touted as a benefit of Boost.Http. Adding a size_t parameter or a fixed buffer to `async_read_some` is a strong signal of intent to implementors; a weaker one would be a statement in the documentation that a conforming implementation of the concept may only read/insert an unspecified fixed number of bytes before invoking the callback.
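To make the stronger signal concrete, here is a minimal sketch of the kind of bounded signature I have in mind. The names (bounded_socket, max_read, the handler shape) are mine for illustration only, not Boost.Http's:

    #include <cstddef>

    // Sketch only: what a bounded read in the http::Socket concept
    // could look like. Declaration only; the point is the contract,
    // not an implementation.
    struct bounded_socket {
        // Appends at most max_read bytes to message.body() before
        // invoking the handler, no matter how the backend buffers
        // internally. A caller that reserves N bytes up front and
        // never asks for more than the remaining capacity is thereby
        // guaranteed no further allocation in the container.
        template<class Message, class Handler>
        void async_read_some(Message& message, std::size_t max_read,
                             Handler&& handler);
    };

With that guarantee stated in the concept, a generic function written against any http::Socket could enforce its own memory budget.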
> Filling HTTP headers is the responsibility of the socket. The socket is the communication channel, after all. A blacklist of headers wouldn't always work, as the client can easily use different headers. A whitelist of allowed headers can work better. A more generic solution is a predicate. It can go into the parser options later.
A predicate design would either have to buffer the entire field, which would make it an allocating design, or it would have to provide partial values, which would make it similar to a SAX parser but with the confusion of being called a predicate. The larger point is that a system that needs ultimate control over memory management will likely need a parser (push or pull) that notifies the client at pre-defined boundaries.
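For comparison, a sketch of the notification (push/SAX-style) alternative I am describing; all names here are illustrative, not an existing API:

    #include <cstddef>

    // The parser calls back with successive fragments of each header
    // name and value. Fragments point into the parser's fixed input
    // buffer, so observing a field never forces it to be buffered
    // whole or copied; the client alone decides what to store.
    // Returning false aborts parsing.
    struct header_events {
        virtual bool on_header_name(const char* data, std::size_t size) = 0;
        virtual bool on_header_value(const char* data, std::size_t size) = 0;
        virtual bool on_headers_complete() = 0;
        virtual ~header_events() {}
    };

A whitelist, a blacklist, or any predicate can then be layered on top of this interface without the interface itself ever allocating.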
> You should also be able to choose a maximum header name size, so it's possible to use a stack-allocated buffer.
The memory requirements are affected by the max header size. With a push/pull parser it is possible to rip out information in a fixed amount of memory. The HTTP parser this library is using is a good example - it never allocates memory, does not require the client to allocate any memory, and the #define for the max header size does _not_ change the size requirements of its data structures. It keeps the necessary state and information in a fixed amount of space, yet is still able to know whether transfer-encoding: chunked was sent, etc.

The initial source of my parser thoughts was how to combine ideas from boost::spirit into an HTTP system. A client could do a POST/PUT/DELETE, and then issue `msg_socket.async_read(http::parse_response_code(), void(uint16_t, error_code))`, which would construct an HTTP parser that extracts the response code from the server and tracks a minimal set of headers (content-length, transfer-encoding, connection), yet still operates in a fixed memory budget even if max header / max payload were size_t::max. I still don't see how this is possible without a notification parser exposed somewhere in the design.

Again, I'm not downvoting Boost.Http because it lacks this capability. I'm not sure of the demand for such a complicated library just to manipulate HTTP sockets.
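For concreteness, this is roughly how I picture that call being used; parse_response_code() and msg_socket are hypothetical names, only meant to illustrate the fixed-budget idea:

    // Hypothetical usage sketch, not an existing API. The parser
    // object keeps only the status line plus the content-length,
    // transfer-encoding, and connection headers in fixed-size state,
    // letting every other header byte stream through unstored.
    msg_socket.async_read(
        http::parse_response_code(),
        [](std::uint16_t status, boost::system::error_code ec) {
            if (!ec) {
                // react to the status code; no header storage was
                // allocated on our behalf
            }
        });

[1] http://ajaxpatterns.org/HTTP_Streaming

Lee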