On Fri, Oct 13, 2017 at 11:59 AM, Phil Endecott via Boost
Dear All, A "push" parser, which invokes client callbacks as tokens are processed, is easier to implement but harder to use as the client has to track its state between callbacks with e.g. an explicit FSM. On the other hand, a "pull parser" (possibly using an iterator interface) is easier for the client but instead now the parser may need the explicit state tracking.
That is generally true, and especially true for XML and other
languages that have a similar structure. Specifically, they have
opening and closing tags which determine the validity of subsequent
grammar, and they have a recursive structure (like HTML).
But this is not the case for HTTP. There are no opening and closing
tags. There is no need to keep a "stack" of "open tags". It is quite
straightforward. Therefore, when designing an HTTP parser we can place
less emphasis on the style of parser and instead focus those energies
on other considerations (as I described in my previous post, regarding
the separation of concerns for stream algorithms and parser
consumers).
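
To illustrate the point about there being no stack, here is a rough
sketch of a push-style parser for the start-line and header fields
only. This is not Beast code: push_parser, state, put, on_start_line
and on_field are all hypothetical names made up for this example, and
it assumes C++17 for std::string_view. It skips malformed lines rather
than reporting errors and ignores obsolete line folding, so treat it
as a sketch and not a conforming implementation.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is not Beast's parser.
enum class state { start_line, fields, done };

struct push_parser
{
    // Client callbacks, invoked as each element is recognized.
    std::function<void(std::string_view)> on_start_line;
    std::function<void(std::string_view, std::string_view)> on_field;

    // The entire parser state: one enum, no stack of open elements.
    state st = state::start_line;

    // Consume complete CRLF-terminated lines from `in`.
    // Returns the number of bytes used; the caller keeps the rest.
    std::size_t put(std::string_view in)
    {
        std::size_t used = 0;
        while (st != state::done)
        {
            auto const eol = in.find("\r\n", used);
            if (eol == std::string_view::npos)
                break; // incomplete line, need more input
            auto const line = in.substr(used, eol - used);
            used = eol + 2;

            if (st == state::start_line)
            {
                if (on_start_line)
                    on_start_line(line);
                st = state::fields;
            }
            else if (line.empty())
            {
                st = state::done; // blank line ends the header
            }
            else
            {
                auto const colon = line.find(':');
                if (colon == std::string_view::npos)
                    continue; // sketch only: skip malformed lines
                auto value = line.substr(colon + 1);
                while (!value.empty() &&
                       (value.front() == ' ' || value.front() == '\t'))
                    value.remove_prefix(1); // trim leading whitespace
                if (on_field)
                    on_field(line.substr(0, colon), value);
            }
        }
        return used;
    }
};
```

A caller would set the two callbacks, feed put() whatever bytes it has,
and discard the number of bytes returned. The only thing carried from
one call to the next is the single enum value.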
If you look at the Beast parser derived class, you can see that the
state is quite minimal:
template
Here's a very very rough sketch of what I have in mind, for the case of HTTP header parsing; note that I don't even have a compiler that supports coroutines yet so this is far from real code:
I think it is great that you're providing an example, but you have chosen the simplest, most regular part of HTTP, which is the headers. I suspect that if you try to use the iterator model for the start-line (which is different for requests and responses) and then try to express the message body using iterators, you will run into considerable difficulty coming up with a design that is elegant and feature-rich, especially when you consider the need to transform the chunk-encoding while providing the metadata to the caller.

I know this because I went through many iterations before settling on what is in Beast currently.

Thanks