On Thu, Feb 29, 2024 at 9:21 PM Peter Dimov via Boost wrote:
Zach Laine wrote:
I'm shocked too. That's really crazy. Believe me, I'm not interested in merging something with orders of magnitude of perf overhead into Boost. I will be getting to the bottom of this long before a merge could take place.
My profiler (VS2022) says that the top performance problem is the construction of a stringstream here
https://github.com/tzlaine/parser/blob/f99ae3b94ad0acef0cc92166d5108aade41da4ea/include/boost/parser/detail/printing.hpp#L624
This constructs a std::locale, which is apparently very slow.
When I fix it
diff --git a/include/boost/parser/detail/printing.hpp b/include/boost/parser/detail/printing.hpp
index 1e204796..6cbec059 100644
--- a/include/boost/parser/detail/printing.hpp
+++ b/include/boost/parser/detail/printing.hpp
@@ -621,11 +621,19 @@ namespace boost { namespace parser { namespace detail {
         flags f,
         Attribute const & attr)
     {
-        std::stringstream oss;
         if (detail::do_trace(f))
+        {
+            std::stringstream oss;
             detail::print_parser(context, parser, oss);
-        return scoped_trace_t(
-            first, last, context, f, attr, oss.str());
+
+            return scoped_trace_t(
+                first, last, context, f, attr, oss.str());
+        }
+        else
+        {
+            return scoped_trace_t(
+                first, last, context, f, attr, {});
+        }
     }
 
     template
This change takes me from ~6130ms to ~520ms.
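
For readers who want to see the pattern in isolation, here is a minimal sketch of the same idea outside the library. The names print_name, describe_eager and describe_lazy are hypothetical and do not exist in Boost.Parser; print_name merely stands in for detail::print_parser. The only point is that the std::stringstream, and the std::locale it constructs, should be created inside the branch that actually needs it.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for detail::print_parser: writes a parser name.
inline void print_name(std::ostream & os) { os << "json::value"; }

// Eager version: the stringstream (and its std::locale) is constructed
// on every call, even when tracing is off.
inline std::string describe_eager(bool trace)
{
    std::stringstream oss;
    if (trace)
        print_name(oss);
    return oss.str();
}

// Lazy version: the stream is constructed only when tracing is requested,
// so the common non-tracing path pays for nothing but the branch.
inline std::string describe_lazy(bool trace)
{
    if (trace) {
        std::stringstream oss;
        print_name(oss);
        return oss.str();
    }
    return {};
}
```

Both functions return the same string for the same argument; only the cost of the non-tracing path differs. How large that difference is will depend on the standard library in use.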
The top function then becomes the constructor of scoped_trace_t. (It's probably not getting inlined.)
Looks like the tracing functionality costs a lot even when off.
Commenting out the bodies of scoped_trace_t and ~scoped_trace_t takes me to ~360ms.
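
One possible way to make disabled tracing cost nothing is to select it at compile time rather than through a runtime flag. The sketch below is only an illustration under that assumption: scoped_trace and count_digits are hypothetical names, not the library's scoped_trace_t or any of its parsers, and nothing here is taken from the Boost.Parser sources.

```cpp
#include <cstddef>
#include <string>

// Hypothetical trace guard. The primary template is used when tracing is
// disabled: it stores nothing and does nothing, so it is trivial to inline.
template<bool Trace>
struct scoped_trace
{
    scoped_trace(std::string &, char const *) {}
};

// Specialization used when tracing is enabled: logs on entry and on exit.
template<>
struct scoped_trace<true>
{
    scoped_trace(std::string & log, char const * name) :
        log_(log), name_(name)
    {
        log_ += std::string("[begin ") + name_ + "]";
    }
    ~scoped_trace() { log_ += std::string("[end ") + name_ + "]"; }

    std::string & log_;
    char const * name_;
};

// Hypothetical parse step: counts the leading decimal digits of `s`.
template<bool Trace = false>
std::size_t count_digits(std::string const & s, std::string & log)
{
    scoped_trace<Trace> guard(log, "digits");
    std::size_t n = 0;
    while (n < s.size() && '0' <= s[n] && s[n] <= '9')
        ++n;
    return n;
}
```

With count_digits<false> the guard is an empty object and the log is never touched; with count_digits<true> the log receives the begin and end markers around the call.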
The top two functions are now `skip`
https://github.com/tzlaine/parser/blob/f99ae3b94ad0acef0cc92166d5108aade41da...
and `omit_parser::call`
https://github.com/tzlaine/parser/blob/f99ae3b94ad0acef0cc92166d5108aade41da...
both with 5.5%.
The `SkipParser` of `skip` is
`boost::parser::rule<json::ws,boost::parser::detail::nope,boost::parser::detail::nope,boost::pars...`.
Thanks! Template parameter it is then.

Zach