Re: [boost] Reminder: Review for Zach's parsing library starts on Feb 19th

1 Mar 2024

      Hi everyone,

This is my review of the proposed Boost.Parser. I'm posting my review
without an Accept/Reject vote.

# Are you knowledgeable about the problem domain?

I've had no prior experience using parser combinators. Therefore, I'm
writing this review as someone who attempted to read the documentation
and utilize the library to create a parser for parsing chunk-encoded
HTML bodies.

# How much effort did you put into your evaluation? A glance? A quick
reading? In-depth study?

I've spent approximately 1 day reading the documentation and 1 day
experimenting with the code and comparing it with Boost.Spirit.

# Does this library bring real benefit to C++ developers for
real-world use-case?

Yes, especially given that we frequently encounter applications or
libraries that necessitate parsing strings or plain text network
protocols. Hand-crafted parsers are prone to bugs and require thorough
reviews and fuzz tests to ensure accuracy. Having a library that
enables the construction of complex parsers by combining simple and
verifiable rules can significantly reduce development time.

# Do you have an application for this library?

I find it useful for testing serializers and parsers in network
libraries that implement a plain-text network protocol like HTTP. We
can load test the serializer or parts of it against simple parsers
made using this library, in addition to the more complex parser that
is built for efficiency but is more bug-prone.

# What is your evaluation of the documentation?

I noticed a few areas where the documentation could be improved:
1. The documentation assumes that the reader is already familiar with
using former parsers in Boost libraries like Spirit.QI, making it
nearly unusable for those who lack prior knowledge of Spirit.
2. There are very few examples demonstrating how to use directives,
such as 'repeat', for instance.
3. There is a lack of a step-by-step guide that explains fundamental
concepts through examples.
4. The table of contents is either overly verbose or not categorized
properly, making it difficult to locate related sections.

# What is your evaluation of the design?

I found the necessity to go through multiple steps and use macros for
defining rules a bit cumbersome. It would be much more ergonomic to
enable the definition of rules in a single line, if possible.

# Did you try to use the library? With what compiler? Did you have any problems?

Parsing chunk-encoded HTML bodies requires the utilization of
attributes from a preceding parser to extract each chunk. I found the
placeholders in Boost.Spirit very handy for accessing attributes and
local variables. here is how the parser appears in Boost.Spirit:
```
chunk_rule %= qi::hex[_a = _1] >> qi::eps(_a) >> "\r\n" >>
qi::repeat(_a)[qi::char_] >> "\r\n";
```
In comparison to using lambda expressions and external state variables
in the proposed library:
```
unsigned int size = 0;
auto chunk_parser =
bp::hex[([&size](auto & ctx) { size = bp::_attr(ctx); })] >>
bp::eps([&size](auto &) { return size != 0; }) >> "\r\n" >>
bp::repeat(std::ref(size))[bp::char_] >> "\r\n";
```
I attempted to define a rule with local variables as well, but there
was no way to access the variable within the repeat directive. I'm
unsure if it's technically feasible, but it would be very ergonomic to
write concise code like using placeholders in Boost.Spirit.

Another issue was the fact writing:
```
bp::repeat(std::ref(size))[bp::char_]
```
Is not an efficient method for extracting a chunk of string. This
approach loops and executes the `bp::char_ parser` on each character
and there is no parser available that can chunk a portion of the input
string.

In summary, my experience wasn't great. I anticipated a simple parser
that could be written in one line and was easy to reason about.
However, it ended up requiring manual state management and was quite
convoluted.

I'd like to thank Zach for his work, and I hope this review might
prove useful for improving the library.

- Affiliation disclosure

Please note that I'm affiliated with the C++ Alliance.

Regards,
Mohammad Nejati

On Fri, Mar 1, 2024 at 3:43 PM Andrzej Krzemienski via Boost
<boost@lists.boost.org> wrote:
...
czw., 29 lut 2024 o 23:30 Christian Mazakas via Boost <boost@lists.boost.org>
napisał(a):
...
I'm not entirely sold on the value of Hana's `operator[]` for tuple
accesses.
Especially because in later C++, we have placeholders so we can always just
do:
auto const& [first,_,_] = my_tuple;
https://godbolt.org/z/vvnKxjGos
I largely agree with Alan, Hana's a heavyweight dependency for what I think
is dubious
benefit. Maybe others feel differently here but I'd prefer to just always
work with the
standard tuple and structured bindings than Hana's `operator[]`.
Once/if https://github.com/tzlaine/parser/issues/106 is implemented. The
motivation for using Hana over std::tuple will be reduced practically to
zero.
https://github.com/tzlaine/parser/issues/106 handles the simple cases, and
for more complicated logic, you do not mind calling std::get<>.
Regards,
&rzej;
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
On Mon, Feb 12, 2024 at 7:34 PM Marshall Clow via Boost
<boost@lists.boost.org> wrote:
...
That’s a week from now.
Github: https://github.com/tzlaine/parser
Docs:   https://tzlaine.github.io/parser
— Marshall
_______________________________________________
Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Re: [boost] Reminder: Review for Zach's parsing library starts on Feb 19th

Mohammad Nejati [ashtum]