regex_search (v4) raises exception for wregex
Hi! Searching the following string (35 matching characters, and at least one more) with this simple expression ("[_]+$") throws "memory exhausted" when using wchar_t strings. The same works with char strings, even ones three times as long. Matching doesn't throw if we use "^[_]+$", "[_]{1,}$", "[_][_]*$", etc., so this is a bug. It seems the expression string is short enough to cause some problem, but I was unable to track down the error.

    const wchar_t * wp = L"___________________________________x";
    wregex wre(L"[_]+$");
    try {
        regex_search(wp, wre);
    } catch (bad_expression & e) {
        cout << e.what() << endl;
    }

Another issue: I think "(one|two|three|)"-style alternation (notice the empty string on the right) should be accepted and handled as "(one|two|three)?". Right now it throws bad_expression.
Searching the following string (35 matching characters, and at least one more) with this simple expression ("[_]+$") throws "memory exhausted" when using wchar_t strings. The same works with char strings, even ones three times as long. Matching doesn't throw if we use "^[_]+$", "[_]{1,}$", "[_][_]*$", etc., so this is a bug.
Confirmed, it's a bug; I'm testing the fix now.
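For reference, the behaviour the report expects can be sketched as a small check. This is only an illustration under stated assumptions: it uses std::wregex from the C++ standard library rather than boost::wregex, and the helper names make_subject and search_wide are hypothetical, not part of either library. The subject ends in "x", so none of the four expressions should find a match, and none should throw.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: builds a subject of `count` underscores followed by
// an optional trailing character (L'\0' means "no trailing character").
std::wstring make_subject(std::size_t count, wchar_t tail) {
    std::wstring s(count, L'_');
    if (tail != L'\0') s.push_back(tail);
    return s;
}

// Hypothetical helper: runs an unanchored search with std::wregex
// (an assumption; the thread itself is about boost::wregex).
// A correct engine returns true or false here instead of throwing.
bool search_wide(const std::wstring& subject, const wchar_t* pattern) {
    std::wregex re(pattern);
    return std::regex_search(subject, re);
}
```

Calling search_wide(make_subject(35, L'x'), L"[_]+$") should return false, exactly as the three equivalent expressions from the report do.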
Another issue: I think "(one|two|three|)"-style alternation (notice the empty string on the right) should be accepted and handled as "(one|two|three)?". Right now it throws bad_expression.
I'm not sure about this one. Boost.regex has always deliberately rejected that, and although I realise that perl5 does accept this, I believe perl6 will regard it as an error.

Thanks,
John.
John Maddock wrote:
I think "(one|two|three|)"-style alternation (notice the empty string on the right) should be accepted and handled as "(one|two|three)?". Right now it throws bad_expression.
I'm not sure about this one. Boost.regex has always deliberately rejected that, and although I realise that perl5 does accept this, I believe perl6 will regard it as an error.
Indeed, this will be an error in perl 6. See "Null String Reform" at http://www.perl.com/pub/a/2002/06/04/apo5.html?page=10. You will have to write this as (one|two|three|<null>).

I think it would be a horrible mistake for (one|two|three|) to be handled as (one|two|three). Either accept it for what it is or reject it, but don't try to second-guess what the programmer really meant.

--
Eric Niebler
Boost Consulting
www.boost-consulting.com
On May 3, 2004, at 3:15 PM, Eric Niebler wrote:
John Maddock wrote:
I think "(one|two|three|)"-style alternation (notice the empty string on the right) should be accepted and handled as "(one|two|three)?". Right now it throws bad_expression.
I'm not sure about this one. Boost.regex has always deliberately rejected that, and although I realise that perl5 does accept this, I believe perl6 will regard it as an error.
Indeed, this will be an error in perl 6. See "Null String Reform" at http://www.perl.com/pub/a/2002/06/04/apo5.html?page=10. You will have to write this as (one|two|three|<null>).
I think it would be a horrible mistake for (one|two|three|) to be handled as (one|two|three). Either accept it for what it is or reject it, but don't try to second-guess what the programmer really meant.
Ugh, that'd be bad. I don't think anyone is looking for that; the "(one|two|three)?" is like "(one|two|three|<null>)". [*] I bet you just didn't notice the question mark? I didn't either on my first read.

I think one argument against <null> is that deviating from the normal basis case would needlessly complicate algorithms that generate regexes on the fly (requiring extra logic to avoid an arbitrarily imposed error). But as I can't think of a good example where it'd come up, maybe it's not so important.

[*] Except that I think "(one|two|three)?" generates no capture group on an empty match, but I assume he didn't mean the comparison to extend that far.

Scott
Scott Lamb wrote:
... I bet you just didn't notice the question mark? I didn't either on my first read.
You're right, I didn't see the question mark. But as you've noticed, (a|b|c|) is semantically different from (a|b|c)? because of the capture.

--
Eric Niebler
Boost Consulting
www.boost-consulting.com
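The capture difference discussed above can be made concrete with a small sketch. It rests on assumptions not taken from the thread: it uses std::regex with its default ECMAScript grammar, which accepts an empty alternative, rather than Boost.Regex, which according to this thread rejects it; GroupState and probe are hypothetical names introduced only for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical record of what capture group 1 did during a match.
struct GroupState {
    bool whole_match;   // did the whole pattern match the subject?
    bool participated;  // did capture group 1 take part in the match?
    std::string text;   // text captured by group 1 (empty if none)
};

// Hypothetical helper: matches `subject` against `pattern` using std::regex
// (an assumption; not Boost.Regex) and reports the state of group 1.
GroupState probe(const std::string& pattern, const std::string& subject) {
    std::regex re(pattern);
    std::smatch m;
    GroupState st{false, false, ""};
    if (std::regex_match(subject, m, re)) {
        st.whole_match = true;
        st.participated = m[1].matched;  // false when the group was skipped
        st.text = m[1].str();
    }
    return st;
}
```

With the subject "xy", probe("x(one|two|three|)y", "xy") should report a group that participated and captured the empty string, while probe("x(one|two|three)?y", "xy") should report a group that did not participate at all. Both forms capture "two" from "xtwoy".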
participants (4)
- Adam Molnar
- Eric Niebler
- John Maddock
- Scott Lamb