Using boost-1.30.2, I built the list_parser.cpp example with MSVC++ 7.1. In the CSV list I added ",,2" to the end of the line to be parsed. The program ran, but neither of the added elements was included in the output vector. Having used the tokenizer before, I know it will generally gobble up adjacent delimiters without producing any output. You can get the empty token if you ask for the delimiters as well and then discard them (sort of). I expected that the empty list entry would not make the output vector, but I was surprised that the 2 didn't make it either. I tried adding ~anychar_p to the rule, but that didn't help. If I change the input list to add ",\"\",2" instead, I get "" and 2 in the output vector. Spreadsheet applications generally do not write missing values in numeric columns as "" items. Maybe I am missing something, because this certainly isn't the behavior I would expect (either from the parser or the tokenizer).
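To make the two behaviours described above concrete, here is a minimal sketch in plain C++11 with no Boost dependency. The helper names split_drop_empty and split_keep_empty are hypothetical and are not part of Boost.Tokenizer or Spirit; the sketch also ignores quoted fields, so it only illustrates how adjacent delimiters can either be swallowed or reported as empty entries.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: split on `delim` and drop empty fields. This is
// roughly the "gobble up adjacent delimiters" behaviour described above.
std::vector<std::string> split_drop_empty(const std::string& line, char delim) {
    std::vector<std::string> out;
    std::string field;
    for (char c : line) {
        if (c == delim) {
            if (!field.empty()) out.push_back(field);
            field.clear();
        } else {
            field += c;
        }
    }
    if (!field.empty()) out.push_back(field);
    return out;
}

// Hypothetical helper: split on `delim` and keep empty fields, so a
// missing column still occupies a slot in the result.
std::vector<std::string> split_keep_empty(const std::string& line, char delim) {
    std::vector<std::string> out;
    std::string field;
    for (char c : line) {
        if (c == delim) {
            out.push_back(field);  // pushed even when the field is empty
            field.clear();
        } else {
            field += c;
        }
    }
    out.push_back(field);  // last field, possibly empty
    return out;
}
```

For the unquoted tail of the test line, "12345,0.12345e4,,2", the first helper yields three entries and the second yields four, with an empty string in the third position.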
Larry wrote:
From boost-1.30.2 I built the list_parser.cpp example using MSVC++7.1. In the CSV list I added ",,2" to the end of the line to be parsed.
Could you please be more specific: where did you add the ",,2"?
Please post a small (minimal) sample that exposes the behaviour in question. This helps us to spot the problem.
Regards Hartmut
BTW: Spirit has its own mailing list:
Spirit-general@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/spirit-general
Please post further questions there. Thanks.
I should have said that this was a first attempt at using Spirit and I was just experimenting with some simple examples.
There are three examples in the code. The adjustment was made to the second example, which is for CSV. The string to be parsed was originally:
    char const *plist_csv =
        "\"string\",\"string with an embedded \\\"\","
        "12345,0.12345e4";
Changed to:
    char const *plist_csv =
        "\"string\",\"string with an embedded \\\"\","
        "12345,0.12345e4,,2";
These are at line 104.
The original parser line was:
    list_csv_item = confix_p('\"', *c_escape_ch_p, '\"') | longest_d[real_p | int_p] ;
Again, I expected the null item to be omitted, from my experience with the tokenizer code, but I didn't expect the 2 to be omitted. After the initial failure I changed the line to:
    list_csv_item = confix_p('\"', *c_escape_ch_p, '\"') | longest_d[real_p | int_p|~anychar_p] ;
It made no difference. Inserting \"\" between the two commas preceding the 2 did make it get the "null" token and the 2.
Info: http://www.boost.org Wiki: http://www.crystalclearsoftware.com/cgi-bin/boost_wiki/wiki.pl Unsubscribe: mailto:boost-users-unsubscribe@yahoogroups.com
Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/
Larry Knain wrote:
The original parser line was:
list_csv_item = confix_p('\"', *c_escape_ch_p, '\"') | longest_d[real_p | int_p]
Try this:
    list_csv_item = confix_p('\"', *c_escape_ch_p, '\"') | longest_d[real_p | int_p] | eps_p ;
    list_csv = ( list_csv_item[append(vec_item)] % ',' )[append(vec_list)] ;
I guess the list_p was not designed to work with null entries. For such simple tasks, the % operator will do just fine.
Regards,
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net
PS> If you have further questions, please post them to the Spirit-general mailing list:
Spirit-general@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/spirit-general
Thanks!
Larry Knain wrote:
I should have said that this was a first attempt at using Spirit and I was just experimenting with some simple examples.
There are three examples in the code. The adjustment was made to the second example which is for CSV. The string to be parsed was originally:
    char const *plist_csv = "\"string\",\"string with an embedded \\\"\"," "12345,0.12345e4";
Changed to:
    char const *plist_csv = "\"string\",\"string with an embedded \\\"\"," "12345,0.12345e4,,2";
These are at line 104.
If you look at the parser definition, you'll notice that this particular list parser is designed to parse escaped C strings, integers or reals separated by commas, i.e. no empty elements. If you would like your list_p to match empty elements as well, you have to make the list_csv_item optional:
    list_csv =
        list_p(
            !list_csv_item[append(vec_item)],
    //  ----^ note this exclamation mark!
            ','
        )[append(vec_list)]
        ;
The original parser line was:
list_csv_item = confix_p('\"', *c_escape_ch_p, '\"') | longest_d[real_p | int_p] ;
Again, I expected the null item to be omitted from my experience with the tokenizer code but I didn't expect the 2 to be omitted. After the initial failure I changed the line to:
list_csv_item = confix_p('\"', *c_escape_ch_p, '\"') | longest_d[real_p | int_p|~anychar_p] ;
It made no difference. Inserting \"\" between the two commas preceding the 2 did make it get the "null" token and the 2.
Adding ~anychar_p to the list item parser isn't the correct thing to do. It won't match any input at all.
HTH
Regards Hartmut
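The remark about ~anychar_p can be illustrated without Spirit. The sketch below models a character parser as a plain predicate; CharPred, any_char, negated and accepts_some are hypothetical names chosen for this illustration and are not Spirit's API. Under that assumption, negating a predicate that accepts every character leaves one that accepts none, so it cannot stand in for an empty list item.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical model: a single-character parser is just a predicate.
using CharPred = std::function<bool(char)>;

// Accepts every character (the role anychar_p plays in the thread).
CharPred any_char() {
    return [](char) { return true; };
}

// Complement of a predicate (the role of the ~ operator in the thread).
CharPred negated(CharPred p) {
    return [p](char c) { return !p(c); };
}

// True if `p` accepts at least one character of `s`.
bool accepts_some(const CharPred& p, const std::string& s) {
    for (char c : s) {
        if (p(c)) return true;
    }
    return false;
}
```

In this model, accepts_some(negated(any_char()), s) is false for every string s, including the empty one, which is consistent with the observation that the change made no difference.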
HartmutKaiser@t-online.de wrote:
If you look at the parser definition, you'll notice, that this particular list parser is designed to parse escaped C-strings, integers or reals separated by commas. I.e. no empty elements. If you would like to make your list_p to match even empty elements, you'd have to make the list_csv_item optional:
    list_csv =
        list_p(
            !list_csv_item[append(vec_item)],
    //  ----^ note this exclamation sign!
            ','
        )[append(vec_list)]
        ;
Ok, I stand corrected. Please disregard my previous comment that "I guess the list_p was not designed to work with null entries". I have a question though:
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net
Joel de Guzman wrote:
Ok, I stand corrected. Please disregard my previous comment that "I guess the list_p was not designed to work with null entries". I have a question though:
Darn! No more question, sir! <<Hit the wrong button>>
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net
Ok, I stand corrected. Please disregard my previous comment that "I guess the list_p was not designed to work with null entries". Hartmut is the list_p expert ;-)
--
Joel de Guzman
http://www.boost-consulting.com
http://spirit.sf.net
PS> Can we move the discussion to Spirit's mailing list now?
Thanks Hartmut. I inserted the code change you suggested and it picks up the 2 but there is no indication of the "missing" list element. I'll do some more reading and see what I can come up with. (I have a number of CSV files created by various spreadsheets and DB programs that I need to read that have "missing" list items and was trying for a more elegant solution in Spirit. I have done this with the tokenizer by keeping the delimiter tokens.)
Larry Knain wrote:
I inserted the code change you suggested and it picks up the 2 but there is no indication of the "missing" list element. I'll do some more reading and see what I can come up with. (I have a number of CSV files created by various spreadsheets and DB programs that I need to read that have "missing" list items and was trying for a more elegant solution in Spirit. I have done this with the tokenizer by keeping the delimiter tokens.)
Honestly, I expected that you would bring up this issue :-) It's simply a matter of operator precedence in C++. If you write:
    list_csv = list_p( !list_csv_item[append(vec_item)], ',' )[append(vec_list)] ;
you make the whole item construct optional: !(list_csv_item[append(vec_item)]), i.e. the attached semantic action is optional too and will be executed only if/when the item parser matches. But if you write:
    list_csv = list_p( (!list_csv_item)[append(vec_item)], ',' )[append(vec_list)] ;
(please note the additional pair of parentheses), you make only the parser part optional. The action is now attached to the optional parser, which succeeds even on an empty match, so it is always executed (even if the item parser does not match).
HTH
Regards Hartmut
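The precedence point can be checked in isolation. The following sketch uses a hypothetical stand-in type, Expr, that only records as text how C++ groups the operators applied to it; it is not one of Spirit's parser types, and the function names are made up for this illustration. The only language fact it relies on is that postfix [] binds more tightly than unary !.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a parser expression. It records, as text,
// the order in which the overloaded operators were applied.
struct Expr {
    std::string text;

    // Models "make this parser optional".
    Expr operator!() const {
        return Expr{"optional(" + text + ")"};
    }

    // Models "attach a semantic action to this parser".
    Expr operator[](const std::string& action) const {
        return Expr{"action(" + text + ", " + action + ")"};
    }
};

// !item[a] groups as !(item[a]): the action sits inside the optional.
inline std::string action_inside_optional() {
    Expr item{"item"};
    return (!item["append"]).text;
}

// (!item)[a]: the action is attached to the optional itself.
inline std::string action_outside_optional() {
    Expr item{"item"};
    return ((!item)["append"]).text;
}
```

The first function returns "optional(action(item, append))" and the second returns "action(optional(item), append)", which mirrors the two list_csv variants above.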
participants (4)
- HartmutKaiser@t-online.de
- Joel de Guzman
- Larry
- Larry Knain