Nick wrote:
According to the Boost documentation for Back references [...]
I use regexes frequently in Perl, C++ and other languages. I use Boost Regex exclusively in C++, even though I probably should be using std::regex instead. I'm much happier using Boost Regex and std::string than I was with PCRE and char*. I only read the Boost regex docs for the API, not for regex syntax.
For example the expression:
^(a*).*\1$
Will match the string:
aaabbaaa
But not the string:
aaabba
This example doesn't make sense to me, either, for the same reasons that you mentioned: both strings begin and end with "a", and a* can match zero or more characters.
I tested this example on Perl 5.22, adding the string "unmatchable" to the test input:
$ perl -nle 'print "$_ ($1)" if /^(a*).*\1$/' <<HERE
> aaabbaaa
> aaabba
> unmatchable
> HERE
The output shows each input string followed by the submatch captured by $1, and indicates that the pattern will match any input string:
aaabbaaa (aaa)
aaabba (a)
unmatchable ()
I haven't tested this with Boost Regex or any other C++ library, but I believe that this example should be edited or replaced. I would suggest changing (a*) to (a+) and "aaabba" to "aaabbb". Alternatively, you could change (a*) to (a{2}) and keep the original input strings, but (a+) makes a simpler example.
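The same check could be sketched in C++ along the following lines. This is only a rough illustration: it uses std::regex with its default ECMAScript grammar rather than Boost Regex, so it says nothing about how Boost itself behaves, and the helper names matches() and captured() are made up for this example.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if `pattern` matches `input`.
// Uses std::regex (ECMAScript grammar), not Boost Regex. The patterns
// discussed here are anchored with ^ and $, so a successful search
// always covers the whole input string.
bool matches(const std::string& pattern, const std::string& input) {
    const std::regex re(pattern);
    return std::regex_search(input, re);
}

// Hypothetical helper: text captured by group 1, or an empty string
// when the pattern does not match (or the group captured nothing).
std::string captured(const std::string& pattern, const std::string& input) {
    const std::regex re(pattern);
    std::smatch m;
    return std::regex_search(input, m, re) ? m[1].str() : std::string();
}
```

With these helpers, matches("^(a*).*\\1$", "aaabba") should return true, because the group can shrink to a single "a", while matches("^(a+).*\\1$", "aaabbb") should return false, because the string no longer ends with whatever the group captured.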
2. The other issue is that when I try this example with the string which is posted to not match, the Boost regex engine runs for a while and ultimately crashes with a memory error. (seems like it might be an endless loop of some sort). Is that a bug?
Ouch! I'm glad nothing like that has happened to me.
It's generally a bad idea to use * too frequently in a regex, especially .*, because each one gives the engine more ways to backtrack. Adding a back reference to a* only aggravates the issue. I had an experience years ago where too many .*'s killed the performance of a Perl script, and probably consumed way too much memory.
|+| M a r k |+|
Mark Stallard
Engineering & Operations Application Development
Business Application Services
Global Business Services Information Technology
Raytheon Company
Billerica, MA (US)
-----Original Message-----
From: Boost-users [mailto:boost-users-bounces@lists.boost.org] On Behalf Of Nick via Boost-users
Sent: Thursday, July 13, 2017 10:04 AM
To: boost-users@lists.boost.org
Cc: Nick