Re: [Boost-users] [Spirit] Qi lexeme only taking the first word

6 Nov 2018

On 7/11/2018 11:01, Michael Powell wrote:
...
> I've got a couple of rules that are perplexing to me. First,
>
> rule<It, std::string(), St> id %= lexeme[qi::alpha >> *char_("A-Za-z0-9_")];
>
> In and of itself, id is working fine. Then I've got a "full id":
>
> rule<It, full_id_t(), St> full_id %= id >> *(char_('.') >> id);
>
> Where:
>
> struct full_id_t {
>      std::string val;
> };
>
> full_id_t::val is quite intentional for reasons elsewhere in the grammar.
> The perplexity comes in, it seems lexeme is only shaving off the first
> word as the val.
>
> For instance, parsing "two.oranges.red.test", I receive back "two" in the AST.

Again, I don't really know anything about Spirit, but it's reasonable to 
assume that "lexeme" will group its input sequence into a single token 
output, which is the result of id as a single std::string.

Meanwhile in full_id you're specifying a sequence of input tokens, so it 
will also output a sequence of tokens (which can presumably be captured 
as a std::vector<std::string>, not simply a std::string).

Most likely (though again this is just a guess) given the input 
"two.oranges.red.test" you should end up with std::vector<std::string> { 
"two", "oranges", "red", "test" }.

This is probably what you want (as it will simplify later use of 
subcomponents), especially if the language allows whitespace around the ".".

If you want to disallow whitespace around the "." and get it as a single 
string token, then yes, you will probably have to make full_id call 
lexeme.  I don't know whether that will require extracting the inner 
part of id to a separate rule so that lexeme only ends up being called 
once or if you can "nest" uses of lexeme.

Gavin Lambert