On Fri, Feb 23, 2024 at 2:59 AM Matt Borland via Boost
Below is my review of the proposed Boost.Parser.
## What is your evaluation of the design?
I thought it was easier to pick-up and use than Spirit. How to use the library is clear, and I didn't see any real gotchas floating around. Like others I appreciate that the library has a descriptive name.
## What is your evaluation of the implementation?
It's generally pretty clean. The numeric parsing in
seems to be a reduced version of the parser from Spirit X3. To test if it was similar I ran this library through the Boost.Charconv benchmarks for performance evaluation with parsing floats, doubles, uint32, and uint64. I ran this using Homebrew GCC 13.2.0 on an M1 Mac. The test set consists of a test set of 2'000'000 random numbers of each type, and that test set is run 5 times. The timing in the left column is the average time to parse 1 number (i.e. total time / 10'000'000). ----from_chars float---- 29.7 ns | std::strtof float scientific 36.7 ns | std::strtod double scientific 519.9 ns | Boost.lexical_cast float scientific 574.0 ns | Boost.lexical_cast double scientific 28.3 ns | Boost.Spirit.Qi float scientific 34.7 ns | Boost.Spirit.Qi double scientific 72.1 ns | Boost.Parser float scientific 78.9 ns | Boost.Parser double scientific 22.0 ns | std::from_chars float scientific 22.2 ns | std::from_chars double scientific 23.0 ns | Boost.Charconv::from_chars float scientific 22.7 ns | Boost.Charconv::from_chars double scientific 65.6 ns | Google double-conversion float scientific 115.2 ns | Google double-conversion double scientific
-----from_chars int----- 23.1 ns | std::strtoul uint32_t 26.8 ns | std::strtoull uint64_t 40.4 ns | Boost.lexical_cast uint32_t 51.4 ns | Boost.lexical_cast uint64_t 29.3 ns | Boost.Spirit.Qi uint32_t 29.4 ns | Boost.Spirit.Qi uint64_t 53.7 ns | Boost.Parser uint32_t 66.3 ns | Boost.Parser uint64_t 11.1 ns | std::from_chars uint32_t 16.4 ns | std::from_chars uint64_t 10.0 ns | Boost.Charconv::from_chars uint32_t 16.3 ns | Boost.Charconv::from_chars uint64_t
The library requires C++17 so it may be worth offloading some of the numeric cases to <charconv> if possible.
Interesting. I would have expected the perf to be basically the same as Qi, since you're right, I took it from Spirit. However, I took it from Spirit X3, not Spirit 2's Qi. Maybe that's the difference? In any case, I think that using one of the from/to_chars implementations would be best. Ticket for that: https://github.com/tzlaine/parser/issues/113
The only thing I didn't like seems to be a carry-over from Spirit. Most numeric types have a parser of the form type_ (e.g. float_, int_, etc). ulong_long and long_long are both missing the trailing underscore which is a small enough difference to be surprising and a bit annoying.
Well, in all cases, there's an extra '_' after the first token of the original type name. See? Consistent. :) Basically, I saw no need to change it -- you strictly need the trailing '_' for float, and you don't need it for long_long.
I think the CI could be more comprehensive. All of the current testing is run on x86_64. The boost drone has ARM Mac, S390X, and aarch64 all running natively. Boost.CI also has some docker images of other architectures. I concur with Chris about adding CodeCov to the CI. I have added it to a number of boost libraries recently, and it has helped us to find, and fix dark corners of the codebase.
I'm all for this -- patches welcome. I don't understand how Drone works at all, or even where the builds happen, or I would have tried setting this up myself. [snip] Thanks for the review, Matt! And the PR. Zach