Dear All,
This is related to the ongoing discussion of the Beast HTTP parser.
I have been thinking in general about how best to implement parser
APIs in modern and future C++. Specifically, I've been wondering
whether the imminent arrival of low-overhead coroutines ought to
change best practice for this sort of interface.
In the past, I have found that there is a trade-off between parser
implementation complexity and client code complexity. A "push" parser,
which invokes client callbacks as tokens are processed, is easier to
implement but harder to use, as the client has to track its state
between callbacks with e.g. an explicit FSM. On the other hand, a
"pull" parser (possibly using an iterator interface) is easier for
the client, but now the parser itself may need the explicit state
tracking.
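To make the "push" side of that trade-off concrete, here is a minimal
sketch without coroutines. All the names (header_callbacks,
push_parse_headers, header_collector) are made up for illustration and
are not Beast's API. It assumes the whole header block is already in
memory and does no real HTTP validation. The parser is a plain loop,
but the client needs a small FSM just to pair each value with the name
that preceded it:

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical callback interface for a push parser (not Beast's API).
struct header_callbacks {
    std::function<void(const std::string&)> on_name;
    std::function<void(const std::string&)> on_value;
};

// Push parser: easy to write, because its state lives in ordinary
// local variables and control flow.
inline void push_parse_headers(const std::string& in,
                               const header_callbacks& cb)
{
    std::size_t pos = 0;
    while (pos < in.size()) {
        std::size_t eol = in.find("\r\n", pos);
        if (eol == std::string::npos) eol = in.size();
        if (eol == pos) break;                  // blank line ends the headers
        std::size_t colon = in.find(':', pos);
        if (colon == std::string::npos || colon > eol) break;  // malformed: stop
        std::size_t v = colon + 1;
        while (v < eol && in[v] == ' ') ++v;    // skip spaces after the colon
        cb.on_name(in.substr(pos, colon - pos));
        cb.on_value(in.substr(v, eol - v));
        pos = eol + 2;                          // step over "\r\n"
    }
}

// Client: must remember, between callbacks, which name the next value
// belongs to. That is the explicit FSM.
class header_collector {
public:
    // The returned callbacks capture `this`, so the collector must
    // outlive them.
    header_callbacks callbacks() {
        return { [this](const std::string& n) { on_name(n); },
                 [this](const std::string& v) { on_value(v); } };
    }
    const std::map<std::string, std::string>& headers() const {
        return headers_;
    }

private:
    enum class state { expect_name, expect_value };

    void on_name(const std::string& n) {
        assert(state_ == state::expect_name);
        pending_ = n;
        state_ = state::expect_value;
    }
    void on_value(const std::string& v) {
        assert(state_ == state::expect_value);
        headers_[pending_] = v;
        state_ = state::expect_name;
    }

    state state_ = state::expect_name;
    std::string pending_;
    std::map<std::string, std::string> headers_;
};
```

A pull parser would invert this: the client would simply loop over
name/value pairs, and the state_/pending_ bookkeeping would move into
the parser's iterator instead.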
Now, with stackless coroutines due "real soon now", we can avoid
needing explicit state on either side. In the parser we can
co_yield tokens as they are processed, and in the client we can
consume them using input iterators. The use of coroutines doesn't
need to be explicit in the API; the parser can be documented as
returning a range<T>, while actually returning a generator<T>.
Here's a very very rough sketch of what I have in mind, for the case
of HTTP header parsing; note that I don't even have a compiler that
supports coroutines yet so this is far from real code:
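#include <unistd.h>   // ::read

// generator<T> is assumed to be a coroutine return type that yields Ts.
generator<char> read_input(int fd)
{
  char buf[4096];
  while (true) {
    ssize_t r = ::read(fd, buf, sizeof buf);
    if (r <= 0) co_return;   // 0 is EOF, <0 is an error; stop either way
    for (ssize_t i = 0; i < r; ++i) {
      co_yield buf[i];       // hand out one byte at a time
    }
  }
}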
template <typename INPUT_RANGE>
generator< pair