On 04/01/2016 12:19 AM, Lee Clagett wrote:
On Thu, 31 Mar 2016 23:02:05 +0200 Daniel Hofmann wrote:
Suppose I want to parse a list of ";"-separated floating point pairs with "," being the pair separator as in "1,2;3,4". Following this list comes a string literal representing a file extension, such as ".txt".
Therefore I want to successfully parse input like the following:
1,2;3,4.txt
(For the record, the input could also be 1.1,2.2;3.3,4.4.txt)
The parser I came up with is a 1:1 translation of the description above into the Spirit DSL and shows Spirit's expressive power:
((double_ % ",") % ";") >> ".txt"
Unfortunately, the parser fails on the input with the integral values above. Why? Because the fundamental parser for double_ greedily matches on the "4." in "4.txt". Changing the "4" to "4.0" as in
1,2;3,4.0.txt
parses successfully (but is not an option, as it requires the user to always add a trailing ".0" whenever the last value is integral).
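The greedy match on "4." can be reproduced outside of Spirit. The sketch below uses std::strtod purely as an analogy: it is not what double_ does internally, but in the default "C" locale it accepts a trailing dot in the same way. The helper name rest_after_real is made up for this illustration.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: returns whatever is left of `input` after std::strtod
// has consumed a leading floating point number. Assumes the default "C"
// locale, where '.' is the radix character.
std::string rest_after_real(const std::string& input) {
    const char* begin = input.c_str();
    char* end = nullptr;
    // Only the end position matters here, so the parsed value is discarded.
    static_cast<void>(std::strtod(begin, &end));
    return std::string(end);
}
```

For "4.txt" the remainder is "txt", so a following ".txt" literal has nothing to match against. For "4.0.txt" the remainder is ".txt", which is why adding the ".0" makes the original parser succeed.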
I read about Spirit's DSL mapping to Parsing Expression Grammar (PEG) with the choice operator | being evaluated in order. So the next logical step for me was to try making use of it and adapting the parser:
(((int_ | double_) % ",") % ";") >> ".txt"
which works on
1,2;3,4.txt
but no longer on
1,2;3,4.0.txt
Is there a way to adapt the parser to handle both cases?
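To see why the ordered choice fixes one input and breaks the other, here is a minimal hand-rolled model of (int_ | double_) >> ".txt" for a single number. It is only a sketch: the functions digits, integer, real, literal and int_first are hypothetical stand-ins for the Spirit parsers and ignore signs and exponents.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Each helper returns true on a match and advances `pos` past what it consumed.
bool digits(const std::string& s, std::size_t& pos) {
    std::size_t start = pos;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) ++pos;
    return pos > start;
}

// Stand-in for int_: digits only.
bool integer(const std::string& s, std::size_t& pos) { return digits(s, pos); }

// Stand-in for double_: digits, then optionally '.' and optional further digits.
bool real(const std::string& s, std::size_t& pos) {
    if (!digits(s, pos)) return false;
    if (pos < s.size() && s[pos] == '.') {
        ++pos;
        digits(s, pos);  // fractional digits are optional, so "4." matches
    }
    return true;
}

bool literal(const std::string& s, std::size_t& pos, const std::string& lit) {
    if (s.compare(pos, lit.size(), lit) != 0) return false;
    pos += lit.size();
    return true;
}

// Model of (integer | real) >> ".txt" with PEG ordered choice.
bool int_first(const std::string& s) {
    std::size_t pos = 0;
    bool ok = integer(s, pos);  // first alternative
    if (!ok) {
        pos = 0;
        ok = real(s, pos);      // tried only if the first alternative failed
    }
    // Once an alternative has succeeded, a later failure of ".txt" does not
    // send the parser back to try the other alternative.
    return ok && literal(s, pos, ".txt") && pos == s.size();
}
```

With "4.txt" the integer branch consumes "4" and ".txt" follows. With "4.0.txt" the integer branch also succeeds on "4", so the real branch is never tried and the literal then fails on ".0.txt".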
I asked this on IRC and got the answer to try a solution based on
((double_ >> ".") | (int_ >> ".")) >> "txt"
but when I use this to parse "4.txt" into a std::vector<double> via
parse(first, last, ((double_ >> ".") | (int_ >> ".")) >> "txt", into);
the vector contains: {4, 4} and its size() is 2, which I can make no sense of at all (but this may be a different problem).
That was me in IRC. I assumed you would be using `variant` or `double` as your attribute type, and not `std::vector<double>`. If this is part of a larger expression and you need to use a std::vector for some reason, look into the hold directive [0]:
(hold[double_ >> "."] | (int_ >> ".")) >> "txt"
The sequence operator will immediately call push_back if the left side expression (`double_`) succeeds. `hold` creates a copy of the vector, and swaps it in iff everything inside the directive succeeds. If you use a `variant` or a `double` as your attribute, then the attribute is overwritten by `int_` and the `hold` is not needed.
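The push_back and copy-and-swap behaviour described here can be modelled without Spirit. The following is a rough sketch, not Spirit's implementation: real, integer, dot and parse_number_dot are hypothetical names, std::strtod stands in for double_ (it also accepts "4."), and the trailing "txt" literal is left out for brevity.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Stand-in for double_: appends to `out` as soon as a number has matched.
bool real(const std::string& s, std::size_t& pos, std::vector<double>& out) {
    const char* begin = s.c_str() + pos;
    char* end = nullptr;
    double value = std::strtod(begin, &end);
    if (end == begin) return false;
    out.push_back(value);
    pos += static_cast<std::size_t>(end - begin);
    return true;
}

// Stand-in for int_: digits only, also appends on success.
bool integer(const std::string& s, std::size_t& pos, std::vector<double>& out) {
    std::size_t start = pos;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos]))) ++pos;
    if (pos == start) return false;
    out.push_back(std::stod(s.substr(start, pos - start)));
    return true;
}

bool dot(const std::string& s, std::size_t& pos) {
    if (pos < s.size() && s[pos] == '.') { ++pos; return true; }
    return false;
}

// Model of (real >> '.') | (integer >> '.'), optionally with the first
// branch wrapped in a hold-like copy-and-swap.
std::vector<double> parse_number_dot(const std::string& s, bool hold_first) {
    std::vector<double> out;
    std::size_t pos = 0;
    bool ok = false;
    if (hold_first) {
        std::vector<double> copy = out;  // work on a copy of the attribute
        ok = real(s, pos, copy) && dot(s, pos);
        if (ok) out.swap(copy);          // commit only if the whole branch matched
    } else {
        ok = real(s, pos, out) && dot(s, pos);  // a partial match leaves its element behind
    }
    if (!ok) {
        pos = 0;                         // the second alternative restarts at the same input
        ok = integer(s, pos, out) && dot(s, pos);
    }
    return out;
}
```

In this model, parse_number_dot("4.txt", false) returns two elements, because the first branch pushes 4 before failing on the dot and the second branch pushes 4 again. With hold_first set to true only the second branch's element remains.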
I see, so parsers immediately push_back into the vector and in case of failure the items remain in the vector, unless I'm using hold. This perfectly explains what I'm seeing here.
I am not sure why you want to use a `double` in this situation, but
std::vector<unsigned> out;
parse(first, last, (+(uint_ >> '.') >> "txt"), out);
or
unsigned one = 0;
boost::optional<unsigned> two;
parse(first, last, (uint_ >> '.' >> -(uint_ >> '.') >> "txt"), one, two);
will prevent inputs that contain '-' or the various inputs that the real parser [1] accepts. uint_ [2] can also be specialized to have a min,max number of digits which might be useful to your situation.
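As a rough illustration of what the first uint_ variant accepts, here is a standard-library model of +(uint_ >> '.') >> "txt". It is a sketch under simplifying assumptions: parse_uint_segments is a hypothetical name, overflow is not checked, and `out` is only written on success.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of +(uint_ >> '.') >> "txt".
bool parse_uint_segments(const std::string& s, std::vector<unsigned>& out) {
    std::vector<unsigned> segments;
    std::size_t pos = 0;
    for (;;) {
        std::size_t p = pos;
        unsigned value = 0;
        while (p < s.size() && std::isdigit(static_cast<unsigned char>(s[p]))) {
            value = value * 10 + static_cast<unsigned>(s[p] - '0');
            ++p;
        }
        // Stop repeating once "unsigned followed by '.'" no longer matches.
        if (p == pos || p >= s.size() || s[p] != '.') break;
        segments.push_back(value);
        pos = p + 1;
    }
    if (segments.empty()) return false;                              // '+' needs at least one match
    if (s.compare(pos, std::string::npos, "txt") != 0) return false; // rest must be exactly "txt"
    out = segments;
    return true;
}
```

In this model "4.txt" yields {4} and "4.0.txt" yields {4, 0}, while "-4.txt" is rejected because only digits are accepted. Each segment is stored as a plain unsigned, so "4.05.txt" and "4.5.txt" both come out as {4, 5}.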
I'm parsing into a std::vector<double> since I want both 1,2;3,4.txt as well as 1.1,2.2;3.3,4.4.txt to succeed. With a uint_ based parser as you suggest, I get a vector of {1,1,..} for the second example, which neither represents the input nor lets me reconstruct it. Looking at strict_real_policies<double>, I was under the impression that the default real policy should work for both inputs above, parsing them into vectors of {1.0, 2.0, 3.0, 4.0} and {1.1, 2.2, 3.3, 4.4} respectively.
Lee
[0] http://www.boost.org/doc/libs/1_60_0/libs/spirit/doc/html/spirit/qi/referenc...
[1] http://www.boost.org/doc/libs/1_60_0/libs/spirit/doc/html/spirit/qi/referenc...
[2] http://www.boost.org/doc/libs/1_60_0/libs/spirit/doc/html/spirit/qi/referenc...