Re: [boost] Reminder: Review for Zach's parsing library starts on Feb 19th

1 Mar 2024

      On Thu, Feb 29, 2024 at 7:16 AM Дмитрий Архипов via Boost
<boost@lists.boost.org> wrote:
...
This is my review of Boost.Parser.
[snip]
...
# What is your evaluation of the documentation?
Currently documentation looks a bit undercooked. It needs some restructuring.
For example, on one page there is a table of attributes for library-provided
parsers. On another, there is a table of attributes for parser operators. On
yet another those two pages are repeated, and also there's a bunch of
extra rules for
how sequences and alternatives affect attributes, followed by a table with
examples. This creates some confusion at first on where to look up information
on attributes.
Hopefully, this is addressed by the Cheat Sheet I made.  It has all
the tables copied into one place.
...
Another complaint I have is that the fact that the attribute for
parsers with semantic
actions is none is not advertised well enough. This is a very important piece of
information, because most parsers use actions. And yet, this is only mentioned
in operator tables, and never mentioned in the section on semantic actions.
Right, good point.   Ticket: https://github.com/tzlaine/parser/issues/148
...
Also, I had a lot of success using _locals. But the docs only mention it in
passing. I feel like overall the documentation would have benefited from more
examples which solve some specific tasks using various library components.
I agree; there's already a ticket open for this one.

[snip]
...
One thing I want to note is that the library seems to have copied several
naming conventions from Spirit. I'm not convinced there's much value in naming
context accessors with a leading underscore. I'm also not convinced dependency
on Boost.Hana for the sole purpose of using its tuple's operator[] is
warranted.
I feel strongly that it is, having used op[] a lot.  That being said,
I'm becoming aware that this is a minority opinion.  Perhaps I should
make std::tuple the default, instead of the other way around.  I don't
want to remove hana::tuple entirely, because I want to use it in my
own parsers.

[snip]
...
While I was writing those parsers using trace::on was of great help for
debugging errors. I still had a bunch of several pages-long template errors,
particularly when parser attributes were not of the type I expected it to be.
A related issue is that I have been using the branch that allowed returning
values from semantic actions, and the library often couldn't correctly figure
out whether I wanted to pass a context to the action, or just the attribute.
In the end, I had to add explicit return type of void to all of the former.
With the latter the problem sometimes manifested when a lambda had several
return statements. I predict that this will be a significant usability issue.
Yeah, I implemented it to see how it works, and it only kind of does.
I might experiment with alternatives that cannot be ambiguous, or
might strike it.  It was an interesting idea, but being forced to
rigorously constrain your lambda's is not fun.
...
Several people have asked for a way to define rules without a macro. I have a
different request: I think that there should be a macro which rather than
relying on a naming convention allows you to explicitly name the parser that
implements the rule. This would allow people to use their own conventions.
This is easy for me to add, but it's also really easy for you to
write.  I find such low-effort macros unappealing as a library
feature, since they're so easy for the user to use, if desired.
...
I tried using nocase and discovered that the symbols parser ignores it. Is this
by design? If so, there should be a separate parser that allows case-insensitive
symbol matching.
Nope, that's just an oversight.  I went through all the parsers one by
one, but the symbol table is its own thing, and got missed.  Ticket:
https://github.com/tzlaine/parser/issues/149
...
Finally, several people asked for benchmarks. Conveniently, the library
includes a JSON parsing example, and conveniently I am maintaining Boost.JSON
and I know how to add new parser implementations to its benchmark runner. The
results aren't very good: Boost.JSON with default allocator is approximately
500 times faster than Parser on apache_builds.json. Niels Lohmann's JSON
library is 100 times faster.
Ha! That's hilariously bad.  I'll have to take a look at what might be
going on there.  Did you happen to take a look at the callback JSON
parser for comparison?
...
* The main requirement is to fix build failures on popular C++ implementations.
  Related to this would be a well-functioning CI setup.
* Either symbols should support case-insensitive mode, or an alternative parser
  should be added to the library.
* There should be a rule definition macro that allows explicitly specifying
  parser that implements the rule.
Overall, I enjoyed my experience with the library. Given that Spirit is being
sunsetted, Boost needs a replacement.
I fully agree with the first two.  I'll probably only do the last one
if Marshall approves the library, and makes me.

Zach

Re: [boost] Reminder: Review for Zach's parsing library starts on Feb 19th

Zach Laine