Boost.Parser parsing length-prefixed strings efficiently
Hi everyone,

I have two questions about the following code, which attempts to parse length-prefixed strings using the proposed Boost.Parser:

```
std::string input = "[5]HELLO[5]WORLD";

unsigned int size = 0;
auto action = [&size](auto & ctx) { size = bp::_attr(ctx); };

auto lps_parser = '[' >> bp::uint_[action] >> ']' >> bp::repeat(std::ref(size))[bp::char_];

auto result = bp::parse(input, *lps_parser);
```

1. Is it possible to eliminate the need for the action and size variable by using some kind of placeholder in bp::repeat that utilizes the attribute of the first parser?

2. How can we make this part of the parser more efficient: bp::repeat(std::ref(size))[bp::char_]
   The current implementation seems to loop and execute bp::char_ parser on each character, while all it needs to do is chunk a portion of the input string.

Regards,
Mohammad Nejati
On Thu, Feb 29, 2024 at 2:13 PM Mohammad Nejati [ashtum] via Boost wrote:
Hi everyone,
I have two questions about the following code, which attempts to parse length-prefixed strings using the proposed Boost.Parser:
```
std::string input = "[5]HELLO[5]WORLD";

unsigned int size = 0;
auto action = [&size](auto & ctx) { size = bp::_attr(ctx); };

auto lps_parser = '[' >> bp::uint_[action] >> ']' >> bp::repeat(std::ref(size))[bp::char_];

auto result = bp::parse(input, *lps_parser);
```
1. Is it possible to eliminate the need for the action and size variable by using some kind of placeholder in bp::repeat that utilizes the attribute of the first parser?
Well, if you put everything into a rule, you could have a locals struct that had room for that size. Then you could refer to that local in the parameter to repeat(). I'll be adding a more explicit example for locals and parameters soon. In the meantime, you can look at the tests to see them in action.
2. How can we make this part of the parser more efficient: bp::repeat(std::ref(size))[bp::char_] The current implementation seems to loop and execute bp::char_ parser on each character, while all it needs to do is chunk a portion of the input string.
Huh. Yeah, I definitely don't provide a way for a parser to bump the input position forward. You could do it in a custom parser you wrote yourself. Each parser has a mutable reference to the current iterator position; you could write one that parses a number N in brackets and then does "first += N" if first is a random-access iterator. That is at odds with how the iterators are assumed to work, though: Parser only assumes forward_ranges, except for the string_view[] directive, which requires contiguous_ranges.

Zach
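To make the "first += N" idea concrete, here is a rough sketch in plain standard C++ (C++17). It deliberately does not use or implement the Boost.Parser custom-parser interface; `take_length_prefixed` and `parse_all` are hypothetical helpers that only show the chunking step on a contiguous character range, where the payload can be taken as a `std::string_view` without visiting each character.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <optional>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical helper: reads "[N]" at `first`, returns the next N characters
// as a view, and advances `first` past them. Leaves `first` untouched on failure.
std::optional<std::string_view> take_length_prefixed(char const *& first, char const * last)
{
    char const * it = first;
    if (it == last || *it != '[')
        return std::nullopt;
    ++it;

    unsigned int n = 0;
    auto const [end, ec] = std::from_chars(it, last, n);
    if (ec != std::errc{} || end == last || *end != ']')
        return std::nullopt;
    it = end + 1;

    // Refuse to jump past the end of the input.
    if (static_cast<std::size_t>(last - it) < n)
        return std::nullopt;

    std::string_view const chunk(it, n);
    first = it + n; // the "first += N" step: one jump, no per-character parsing
    assert(first <= last);
    return chunk;
}

// Hypothetical driver: splits the whole input into length-prefixed chunks,
// or returns nullopt if any part of the input is malformed.
std::optional<std::vector<std::string_view>> parse_all(std::string_view input)
{
    std::vector<std::string_view> result;
    char const * first = input.data();
    char const * const last = first + input.size();
    while (first != last) {
        auto const chunk = take_length_prefixed(first, last);
        if (!chunk)
            return std::nullopt;
        result.push_back(*chunk);
    }
    return result;
}
```

With the input from the question, `parse_all("[5]HELLO[5]WORLD")` yields the two views `"HELLO"` and `"WORLD"`, both pointing into the original buffer. The pointer arithmetic relies on contiguous storage, which is the same restriction mentioned above for the string_view[] directive; with a forward-only range the skip would still cost N increments.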
participants (2)
- Mohammad Nejati [ashtum]
- Zach Laine