Re: [boost] [http] Formal Review

14 Aug 2015

      2015-08-14 2:49 GMT-03:00 Lee Clagett <forum@leeclagett.com>:
...
...
No. That's a way to avoid memory copies. That's not necessary to achieve
zero allocations.
You can have a custom backend tied to a single type of message that will
do the HTTP parsing for you. It could be inefficient by maybe forcing the
parsing every time you call headers(), but without an implementation I
don't have much to say.
But would this only be able to handle one HTTP message type? Or would it
drop some useful information? I think it would be difficult to implement a
non-allocating HTTP parser unless it was SAX, or stopped at defined points
(essentially notifying you like a SAX parser).
By type of message, I was referring to the Message concept:
https://boostgsoc14.github.io/boost.http/reference/message_concept.html

And yes, this implementation would work with only one type.

The idea is: the message object is just a buffer with an embedded parser
and the socket will just transfer responsibility to the message. The user
API stays the same. A buffer the same size would still be interesting in
the socket to efficiently support HTTP pipelining (we cannot have data from
different messages in the same message object, as it might be dropped at
any time by the user).
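
A minimal sketch of the "buffer with an embedded parser" idea, assuming
C++17 for std::string_view. The fixed_buffer_message type and its member
names are hypothetical and not part of Boost.Http. The sketch keeps the raw
header lines in a fixed-size array and re-scans them on every lookup, so it
never allocates but pays the parsing cost on each call:

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical type, not part of Boost.Http: a message that owns a fixed
// buffer and re-parses it on each lookup instead of building a header map.
template <std::size_t N>
class fixed_buffer_message {
public:
    // Copies raw header lines ("Name: value\r\n...") into the buffer.
    // Returns false and leaves the message unchanged if they do not fit.
    bool assign(std::string_view raw) {
        if (raw.size() > N) return false;
        std::copy(raw.begin(), raw.end(), buf_.begin());
        size_ = raw.size();
        return true;
    }

    // Scans the buffer line by line: no allocation, but O(size) per call.
    // Returns an empty view if the header is absent. Names are compared
    // case-sensitively to keep the sketch short.
    std::string_view header(std::string_view name) const {
        std::string_view rest(buf_.data(), size_);
        while (!rest.empty()) {
            const std::size_t eol = rest.find("\r\n");
            const std::string_view line = rest.substr(0, eol);
            rest = (eol == std::string_view::npos) ? std::string_view{}
                                                   : rest.substr(eol + 2);
            const std::size_t colon = line.find(':');
            if (colon == std::string_view::npos) continue;
            if (line.substr(0, colon) != name) continue;
            std::string_view value = line.substr(colon + 1);
            while (!value.empty() &&
                   (value.front() == ' ' || value.front() == '\t')) {
                value.remove_prefix(1);  // skip optional leading whitespace
            }
            return value;
        }
        return {};
    }

private:
    std::array<char, N> buf_{};
    std::size_t size_ = 0;
};
```

With `fixed_buffer_message<64> m;`, after `m.assign("Host: example.org\r\n")`
the call `m.header("Host")` yields a view of `example.org` that points into
the message's own storage.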

I'm not even slightly worried about the problem you mention with the
parser. I know it's possible. It won't show itself as a problem in the
future.
...
Like you guessed, you pass a buffer to basic_socket. It won't read more
bytes than the buffer size.
But how can this be combined with higher order functions? For example a
`async_read_response(Socket, MaxDataSize, void(uint16_t, std::string,
vector<uint8_t>, error_code))`? However such a utility is defined, it will
have to be tied to a specific implementation currently, because there's no
way to control the max-read size via the Socket concept. Or would such a
function omit a max read size (several other libraries don't have one
either)? Or would it just overread a _bit_ into the container?
The problem isn't "how can this be combined with higher order functions?".
The problem is "how can this feature be exposed portably among different
HTTP backends?" and the answer is "it can't because it might not even make
sense in all HTTP backends". Of course this comment is about the hacky
solution (use a buffer of limited size in the HTTP backend).

On the non-hacky front, some traits exposing extra API could be defined.
The basic_socket could implement these traits without hampering the
implementation of other backends that have different characteristics.
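
One possible shape for such a trait, as a sketch only. The names
supports_read_limit, buffered_socket, queue_socket, max_read_size and
try_limit_reads are all hypothetical; none of them exist in Boost.Http.
Generic code asks the trait whether a backend has a read limit and skips the
call for backends where the notion makes no sense:

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical trait: backends opt in by specializing it to true_type.
template <class Socket>
struct supports_read_limit : std::false_type {};

// Stand-in for a buffer-based backend such as basic_socket.
struct buffered_socket {
    std::size_t limit = 0;
    void max_read_size(std::size_t n) { limit = n; }
};

// Stand-in for a backend with no byte buffer to limit (for example one fed
// by a message queue).
struct queue_socket {};

template <>
struct supports_read_limit<buffered_socket> : std::true_type {};

namespace detail {
// Chosen when the backend opted in: forward the limit.
template <class Socket>
bool limit_reads(Socket& s, std::size_t n, std::true_type) {
    s.max_read_size(n);
    return true;
}
// Chosen otherwise: nothing to do, report that no limit was applied.
template <class Socket>
bool limit_reads(Socket&, std::size_t, std::false_type) {
    return false;
}
}  // namespace detail

// Applies the limit when the backend supports it; returns whether it did.
template <class Socket>
bool try_limit_reads(Socket& s, std::size_t n) {
    return detail::limit_reads(s, n,
                               typename supports_read_limit<Socket>::type{});
}
```

A higher-order utility could call `try_limit_reads(socket, 4096)` and still
compile against a backend like queue_socket, where it simply returns false.
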
...
About the embedded device situation, it'll be improved when I expose the
parser options. Then you'll be able to set max headers size, max header
name size, max header value size and so on. With all upper limits figured
out, you can provide a single chunk of memory for all data.
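
To illustrate how fixed upper limits translate into one statically sized
chunk, here is a sketch under stated assumptions. The parser_limits struct
and header_storage_bytes function are hypothetical, and the max_fields cap
on the number of header fields is an extra assumption that the options
listed above do not name. Each field is budgeted as name, ": ", value and
"\r\n":

```cpp
#include <array>
#include <cstddef>

// Hypothetical options struct; Boost.Http does not define this type.
struct parser_limits {
    std::size_t max_fields;      // assumed cap on the number of header fields
    std::size_t max_name_size;   // longest accepted header name, in bytes
    std::size_t max_value_size;  // longest accepted header value, in bytes
};

// Worst-case bytes for all header fields: each one needs its name, the
// 2-byte ": " separator, its value and the 2-byte "\r\n" terminator.
constexpr std::size_t header_storage_bytes(parser_limits l) {
    return l.max_fields * (l.max_name_size + l.max_value_size + 4);
}

// Example limits chosen arbitrarily for the illustration.
constexpr parser_limits embedded_limits{16, 32, 128};

// One chunk of memory, sized at compile time, for every header field.
std::array<char, header_storage_bytes(embedded_limits)> header_storage;
```

With these example limits the chunk is 16 * (32 + 128 + 4) = 2624 bytes.
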
But what if an implementation wanted to discard some fields to really keep
the memory low? I think that was the point of the OP. I think this is
difficult to achieve with a notifying parser. It might be overkill for
Boost.Http; people under this duress can seek out existing HTTP parsers.
Filling HTTP headers is the responsibility of the socket. The socket is the
communication channel, after all. A blacklist of headers wouldn't always
work, as the client can easily use different headers. A whitelist of
allowed headers can work better. A more generic solution is a predicate.
It can go into the parser options later.
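
A sketch of the predicate approach, with hypothetical names throughout
(header_map, store_header and is_whitelisted are not Boost.Http API, and
std::multimap merely stands in for whatever container a message uses). The
socket would consult the predicate before storing each parsed field, and a
whitelist becomes just one possible predicate:

```cpp
#include <map>
#include <string>
#include <utility>

// Stand-in for a message's header container.
using header_map = std::multimap<std::string, std::string>;

// Hypothetical helper: stores the field only if keep(name) returns true.
// Returns whether the field was stored.
template <class Predicate>
bool store_header(header_map& headers, std::string name, std::string value,
                  Predicate keep) {
    if (!keep(name)) return false;  // discarded before any storage is used
    headers.emplace(std::move(name), std::move(value));
    return true;
}

// A whitelist expressed as a predicate; the two names are only examples.
inline bool is_whitelisted(const std::string& name) {
    return name == "host" || name == "content-length";
}
```

Calling `store_header(h, "x-tracking", "abc", is_whitelisted)` leaves `h`
untouched and returns false, while a lambda that always returns true
reproduces the keep-everything behaviour.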

A trait could be defined to also expose the same API in different HTTP
backends that might not need a parser.
...
A simple use case: You're not directly exposing your application to the
network. You're using proxies with auto load balancing. You're not using
the HTTP wire protocol to glue communication among the internal nodes.
You're using ZeroMQ. There is no HTTP wire format usage in your application
at all, and Boost.Http would still be used, with the current API, as is. A
different HTTP backend would be provided and you're done.
It makes no sense to enforce the use of HTTP wire format in this use case
at all. And if you're an advocate of HTTP wire format, keep in mind that
the format changed in HTTP 2.0. There is no definitive set-in-stone
serialized representation.
Yes, if HTTP were converted into a different (more efficient) wire format
(as I've seen done in various ways - sandstorm/capnproto now does this
too), a new implementation of http::ServerSocket could re-read that format
and be compatible. It would be useful to state this more clearly in the
documentation, unless I missed it (sorry).
You can already have any message representation you want: It's the message
concept. And it was crafted **very** carefully:
https://boostgsoc14.github.io/boost.http/reference/message_concept.html

-- 
Vinícius dos Santos Oliveira
https://about.me/vinipsmaker