[Spirit] Qi lexeme only taking the first word

Michael Powell

6 Nov 2018

10:01 p.m.

Hello,

I've got a couple of rules that are perplexing to me. First,

rule<It, std::string(), St> id %= lexeme[qi::alpha >> *char_("A-Za-z0-9_")];

In and of itself, id is working fine. Then I've got a "full id":

rule<It, full_id_t(), St> full_id %= id >> *(char_('.') >> id);

Where:

struct full_id_t { std::string val; };

full_id_t::val is quite intentional for reasons elsewhere in the grammar.

The perplexity comes in, it seems lexeme is only shaving off the first word as the val.

For instance, parsing "two.oranges.red.test", I receive back "two" in the AST.

Perhaps I should defer specifying the lexeme part of id until later?

Thoughts? Suggestions?

Thank you!

Best regards,

Michael Powell


Gavin Lambert

6 Nov

10:35 p.m.

On 7/11/2018 11:01, Michael Powell wrote:

...

I've got a couple of rules that are perplexing to me. First,

rule<It, std::string(), St> id %= lexeme[qi::alpha >> *char_("A-Za-z0-9_")];

In and of itself, id is working fine. Then I've got a "full id":

rule<It, full_id_t(), St> full_id %= id >> *(char_('.') >> id);

Where:

struct full_id_t { std::string val; };

full_id_t::val is quite intentional for reasons elsewhere in the grammar.

The perplexity comes in, it seems lexeme is only shaving off the first word as the val.

For instance, parsing "two.oranges.red.test", I receive back "two" in the AST.

Again, I don't really know anything about Spirit, but it's reasonable to assume that "lexeme" will group its input sequence into a single token output, which is the result of id as a single std::string.

Meanwhile in full_id you're specifying a sequence of input tokens, so it will also output a sequence of tokens (which can presumably be captured as a std::vector<std::string>, not simply a std::string).

Most likely (though again this is just a guess) given the input "two.oranges.red.test" you should end up with std::vector<std::string> { "two", "oranges", "red", "test" }. This is probably what you want (as it will simplify later use of subcomponents), especially if the language allows whitespace around the ".".

If you want to disallow whitespace around the "." and get it as a single string token, then yes, you will probably have to make full_id call lexeme. I don't know whether that will require extracting the inner part of id to a separate rule so that lexeme only ends up being called once or if you can "nest" uses of lexeme.

Michael Powell

10:40 p.m.

On Tue, Nov 6, 2018 at 5:01 PM Michael Powell <mwpowellhtx@gmail.com> wrote:

...

Hello,

I've got a couple of rules that are perplexing to me. First,

rule<It, std::string(), St> id %= lexeme[qi::alpha >> *char_("A-Za-z0-9_")];

In and of itself, id is working fine. Then I've got a "full id":

rule<It, full_id_t(), St> full_id %= id >> *(char_('.') >> id);

Where:

struct full_id_t { std::string val; };

full_id_t::val is quite intentional for reasons elsewhere in the grammar.

The perplexity comes in, it seems lexeme is only shaving off the first word as the val.

For instance, parsing "two.oranges.red.test", I receive back "two" in the AST.

Perhaps I should defer specifying the lexeme part of id until later?

I elaborated a little on the "simple" full id sub-grammar, but I cannot repro using the GCC compiler. I'm wondering if this has anything to do with the VS2017 fpos issue?

http://coliru.stacked-crooked.com/a/adeb42ce2f19b0fd

Or there may be insufficient context in the web compiler to adequately demo.

...

Thoughts? Suggestions?

Thank you!

Best regards,

Michael Powell

Michael Powell

11:03 p.m.

On Tue, Nov 6, 2018 at 5:40 PM Michael Powell <mwpowellhtx@gmail.com> wrote:

...

On Tue, Nov 6, 2018 at 5:01 PM Michael Powell <mwpowellhtx@gmail.com> wrote:

...
Hello,

I've got a couple of rules that are perplexing to me. First,

rule<It, std::string(), St> id %= lexeme[qi::alpha >> *char_("A-Za-z0-9_")];

In and of itself, id is working fine. Then I've got a "full id":

rule<It, full_id_t(), St> full_id %= id >> *(char_('.') >> id);

Where:

struct full_id_t { std::string val; };

full_id_t::val is quite intentional for reasons elsewhere in the grammar.

The perplexity comes in, it seems lexeme is only shaving off the first word as the val.

For instance, parsing "two.oranges.red.test", I receive back "two" in the AST.

Perhaps I should defer specifying the lexeme part of id until later?

I elaborated a little on the "simple" full id sub-grammar, but I cannot repro using the GCC compiler. I'm wondering if this has anything to do with the VS2017 fpos issue?

http://coliru.stacked-crooked.com/a/adeb42ce2f19b0fd

Or there may be insufficient context in the web compiler to adequately demo.

I got a repro:

http://coliru.stacked-crooked.com/a/069a44296240be7e

Although the reasons as to why I do not know.

It is a difference in attribute synthesis. When full_id synthesizes a std::string(), the conversion to full_id_t() "just works" magically. I'm guessing by happy accident based on the std::string val being the only member (adaptation, etc).

But when I change the synthesis to be its "true" type, that is, AST::full_id_t(), suddenly I see the same behavior.

Really and truly, I do not know why. Everything else being equal why would one approach be any different than the other?

Anyone with some Spirit, Fusion, AST, insights?

Thanks!

For now, I'll run with it as has been exposed here, but it's a bit troubling to me not knowing the difference.

...

...
Thoughts? Suggestions?

Thank you!

Best regards,

Michael Powell

rmawatson rmawatson

7 Nov

1:12 a.m.

It's been a long while since I've used spirit::qi, but what it looks like is happening in your setup is something like this.

When you have:

qi::rule<It, AST::full_id_t()> full_id;

the attribute is vector<string>.

When it matches

id >> *(char_('.') >> id)

this has an attribute of vector<string,vector<tuple<char,std::string>>> or something similar.

Spirit appears to compare your target attribute with the synthesised attribute of the parser, and any (trailing?) members of the synthesised attribute that have no match in your attribute are marked as unused_type and are not assigned.

You can see which overload of assign_to is used in your example if you breakpoint it -> boost\spirit\home\qi\detail\assign_to.hpp line 399.

It appears in boost\spirit\home\qi\operator\sequence_base.hpp line 74, where the predicate traits::attribute_not_unused<Context, Iterator> is passed to spirit::any_if (boost\spirit\home\support\algorithm\any_if.hpp line 186). It will basically discard attributes where the LHS sequence is not matched with the RHS.

You can see this in your example by adding an additional member to

struct full_id_t { std::string val; std::vector<std::string> others; };

BOOST_FUSION_ADAPT_STRUCT(AST::full_id_t, val, others)

Your missing bits will appear in this std::vector, as they are now not silently discarded. http://coliru.stacked-crooked.com/a/51f16c6deff45309

I think the fundamental problem is that attribute propagation is different when you have a string from when you have a vector<string>, as in your two examples. The first kicks in whatever logic exists to flatten the LHS attribute into a string; the second takes the first element, assigns it and marks the rest as unused.

One thing you can do is use qi::as<std::string>()[ id >> *(char_('.') >> id) ] to force conversion of the synthesised attribute to a string to happen before it is assigned to your attribute. http://coliru.stacked-crooked.com/a/6a060343a390f037

I've only had a quick look and this is a pretty half-hearted analysis. You'll really have to dig deep to find out exactly what is going on, but I suspect this is somewhat along the right lines.
Boost-users mailing list Boost-users@lists.boost.org https://lists.boost.org/mailman/listinfo.cgi/boost-users

Michael Powell

2:08 a.m.

On Tue, Nov 6, 2018 at 8:12 PM rmawatson rmawatson <rmawatson@hotmail.com> wrote:

...

It's been a long while since I've used spirit::qi, but what it looks like is happening in your setup is something like this.

When you have:

qi::rule<It, AST::full_id_t()> full_id;

the attribute is vector<string>

When it matches

id >> *(char_('.') >> id)

this has an attribute of vector<string,vector<tuple<char,std::string>>> or something similar.

Where are you getting that from? It makes no sense whatsoever given the struct full_id_t { std::string val; }, which is similarly mapped, and ruled, etc.

...

spirit appears to compare your target attribute with the synthesised attribute of the parser and for any (trailing?) members of the synthesised attribute that do not match in your attribute, it marks them as unused_type and they are not assigned.

Would I need to do some grouping or something to persuade Spirit to treat the struct as I've defined and adapted it?


Gavin Lambert

3:28 a.m.

On 7/11/2018 15:08, Michael Powell wrote:

...

...
When it matches

id >> *(char_('.') >> id)

this has an attribute of vector<string,vector<tuple<char,std::string>>> or something similar.

Where are you getting that from? It makes no sense whatsoever given the struct full_id_t { std::string val; }, which is similarly mapped, and ruled, etc.

This might be wrong, but it's how I read the docs:

The output of parsing is a Fusion sequence of the attributes that were parsed.

So the output of

id >> *(char_('.') >> id)

is something like (but not exactly)

tuple<string>
tuple<string, char, string>
tuple<string, char, string, char, string>
etc

string because that's the output attribute declared for id. char because you've used char_ instead of using '.' by itself (otherwise it would just disappear). And the latter two can be repeated zero or more times because you've used *.

When you assign this to a rule with %=, it tries to best-fit this against the rule's declared output attribute.

full_id_t contains a single string field, so the Fusion adaptation makes it equivalent to tuple<string>, and apparently this results in any additional values being discarded, not in concatenating as you expect.

You can probably use an explicit semantic action to build a single string instead of using %=.

Or you can make full_id_t contain vector<string> as rmawatson and I previously suggested, which should give you all the values.

Another possibility, which I can't test because coliru appears to be grumpy at present, is to try using:

full_id %= as_string[lexeme[id >> *(char_('.') >> id)]];

Michael Powell

4:46 a.m.

On Tue, Nov 6, 2018 at 10:28 PM Gavin Lambert via Boost-users <boost-users@lists.boost.org> wrote:

...

On 7/11/2018 15:08, Michael Powell wrote:

...
...
When it matches

id >> *(char_('.') >> id)

this has an attribute of vector<string,vector<tuple<char,std::string>>> or something similar.

Where are you getting that from? It makes no sense whatsoever given the struct full_id_t { std::string val; }, which is similarly mapped, and ruled, etc.

This might be wrong, but it's how I read the docs:

The output of parsing is a Fusion sequence of the attributes that were parsed.

So the output of

id >> *(char_('.') >> id)

is something like (but not exactly)

tuple<string> tuple<string, char, string> tuple<string, char, string, char, string> etc

string because that's the output attribute declared for id. char because you've used char_ instead of using '.' by itself (otherwise it would just disappear). And the latter two can be repeated zero or more times because you've used *.

When you assign this to a rule with %=, it tries to best-fit this against the rule's declared output attribute.

full_id_t contains a single string field, so the Fusion adaptation makes it equivalent to tuple<string>, and apparently this results in any additional values being discarded, not in concatenating as you expect.

You can probably use an explicit semantic action to build a single string instead of using %=.

Or you can make full_id_t contain vector<string> as rmawatson and I previously suggested, which should give you all the values.

Another possibility, which I can't test because coliru appears to be grumpy at present, is to try using:

full_id %= as_string[lexeme[id >> *(char_('.') >> id)]];

This approach works for me. And remains true to the AST. +1 Thanks!

...


Michael Powell

8 Nov

7:09 p.m.

On Tue, Nov 6, 2018 at 11:46 PM Michael Powell <mwpowellhtx@gmail.com> wrote:

...

On Tue, Nov 6, 2018 at 10:28 PM Gavin Lambert via Boost-users <boost-users@lists.boost.org> wrote:

...
On 7/11/2018 15:08, Michael Powell wrote:

...
...
When it matches

id >> *(char_('.') >> id)

this has an attribute of vector<string,vector<tuple<char,std::string>>> or something similar.

Where are you getting that from? It makes no sense whatsoever given the struct full_id_t { std::string val; }, which is similarly mapped, and ruled, etc.

This might be wrong, but it's how I read the docs:

The output of parsing is a Fusion sequence of the attributes that were parsed.

So the output of

id >> *(char_('.') >> id)

is something like (but not exactly)

tuple<string> tuple<string, char, string> tuple<string, char, string, char, string> etc

string because that's the output attribute declared for id. char because you've used char_ instead of using '.' by itself (otherwise it would just disappear). And the latter two can be repeated zero or more times because you've used *.

When you assign this to a rule with %=, it tries to best-fit this against the rule's declared output attribute.

full_id_t contains a single string field, so the Fusion adaptation makes it equivalent to tuple<string>, and apparently this results in any additional values being discarded, not in concatenating as you expect.

You can probably use an explicit semantic action to build a single string instead of using %=.

Or you can make full_id_t contain vector<string> as rmawatson and I previously suggested, which should give you all the values.

Another possibility, which I can't test because coliru appears to be grumpy at present, is to try using:

full_id %= as_string[lexeme[id >> *(char_('.') >> id)]];

This approach works for me. And remains true to the AST. +1 Thanks!

Boy, wow... I'll qualify that with this: in "this" case I was able to persuade Spirit/Fusion to produce what I wanted. In other cases, not so much. It really, I mean **REALLY**, wants to produce that std::vector<...>, doesn't it? It will take a bit of digesting to adjust the AST, etc, to that, but it's good (no, GREAT) to know about.

...

...

Gavin Lambert

7 Nov

6:02 a.m.

On 7/11/2018 16:28, I wrote:

...

Another possibility, which I can't test because coliru appears to be grumpy at present, is to try using:

full_id %= as_string[lexeme[id >> *(char_('.') >> id)]];

Actually, since you're consuming a consecutive sequence of input characters without skipping any whitespace, you could probably use this instead, which might be faster (though that's just a guess; measure it!):

full_id %= as_string[raw[id >> *('.' >> id)]];

(I was half expecting as_string to not be needed here, but apparently it still is.)

rmawatson rmawatson

8:25 a.m.

...

this has an attribute of vector<string,vector<tuple<char,std::string>>> or something similar.

Where are you getting that from? It makes no sense whatsoever given the struct full_id_t { std::string val; }, which is similarly mapped, and ruled,

I've just had a look and the synthesized attribute is actually

boost::fusion::vector<std::basic_string<char,std::char_traits<char>,std::allocator<char> >,std::vector<boost::fusion::vector<char,std::basic_string<char,std::char_traits<char>,std::allocator<char> > >,std::allocator<boost::fusion::vector<char,std::basic_string<char,std::char_traits<char>,std::allocator<char> > > > > >

Cleaned up, that is

boost::fusion::vector<std::string,std::vector<boost::fusion::vector<char,std::string>>>

So almost exactly what I said. You can get this using spirit::traits::attribute_of.

It makes perfect sense. This is the synthesized attribute of the RHS parser, not the attribute you have passed it.

Your struct full_id_t { std::string val; } appears to Spirit as

boost::fusion::vector<std::string>

This is precisely what BOOST_FUSION_ADAPT_STRUCT is for.

The RHS parser is the result of the various parsers you use. In your case...

Sequence Parser (a >> b)
Expression --> Attribute
a: A, b: B --> (a >> b): tuple<A, B>

Kleene Parser (*a)
Expression --> Attribute
a: A --> *a: vector<A>

Character Parser (char_, lit)
Expression --> Attribute
ns::char_ --> The character type of the Character Encoding Namespace, ns.

With your expression, where id's attribute is string, you have

id >> *(char_('.') >> id)

gives

tuple<string,vector<tuple<char,string>>>

This is then assigned to the target attribute of the LHS rule, which in your setup is

boost::fusion::vector<std::string>

Larry Evans

7:41 a.m.

On 11/6/18 7:12 PM, rmawatson rmawatson via Boost-users wrote:

...

It's been a long while since I've used spirit::qi, but what it looks like is happening in your setup is something like this.

When you have:

qi::rule<It, AST::full_id_t()> full_id;

the attribute is vector<string>

When it matches

id >> *(char_('.') >> id)

this has an attribute of vector<string,vector<tuple<char,std::string>>> or something similar. [snip]

...

One thing you can do is use qi::as<std::string>()[ id >> *(char_('.') >> id) ] to force conversion of synthesised attribute to a string to happen before it is assigned to your attribute. [snip]

rmawatson's as<std::string> suggestion works: https://coliru.stacked-crooked.com/a/a2c9435ee9e88bad Yeah rmawatson!

Larry Evans

6:42 a.m.

On 11/6/18 4:40 PM, Michael Powell via Boost-users wrote:

...

On Tue, Nov 6, 2018 at 5:01 PM Michael Powell <mwpowellhtx@gmail.com> wrote:

...
Hello,

I've got a couple of rules that are perplexing to me. First,

rule<It, std::string(), St> id %= lexeme[qi::alpha >> *char_("A-Za-z0-9_")];

In and of itself, id is working fine. Then I've got a "full id":

rule<It, full_id_t(), St> full_id %= id >> *(char_('.') >> id);

Where:

struct full_id_t { std::string val; };

full_id_t::val is quite intentional for reasons elsewhere in the grammar.

The perplexity comes in, it seems lexeme is only shaving off the first word as the val.

For instance, parsing "two.oranges.red.test", I receive back "two" in the AST.

Perhaps I should defer specifying the lexeme part of id until later?

[snip]

The following simplification:

https://coliru.stacked-crooked.com/a/1adacde1a472d7a7

shows that full_id_t has the full attributes; however, it does *not* join them with the '.' char. Instead, it's a vector<std::string>. Unfortunately, I don't know how to automatically combine them into a single string, but maybe this simplification will give you a starting point to figure that out.

-regards, Larry
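
If the vector<std::string> form is kept, joining the pieces afterwards does not need Spirit at all. The following is a plain standard-library sketch; the helper name join_ids and its default separator are assumptions made for illustration, not part of the linked simplification.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: joins identifier components with a separator.
std::string join_ids(const std::vector<std::string>& parts, char sep = '.') {
    std::string joined;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) joined += sep;  // separator only between components
        joined += parts[i];
    }
    return joined;
}
```

This keeps the AST in its component form and produces the dotted string only where it is actually needed.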
