On Thu, Apr 24, 2003 at 11:37:34PM -0000, Dean wrote:
Hi all,
... snip ...
\d{3}-\d{2}-\d{4}
As expected, that pattern was found in "123-12-1234" but not in "1234- 12-1234". However, it *was* found in "1234567-12-1234".
Is this behavior by design or is it a bug?
It is (probably) by design. An interval can specify both a min and a max, and {3} is just shorthand for {3,3}, so \d{3,3}-\d{2,2}-\d{4,4} and \d{3}-\d{2}-\d{4} behave identically: each will match "123-12-1234" but NOT "1234567-12-1234". Each will, however, yield a successful search in "1234567-12-1234", because the trailing "567-12-1234" fits the pattern (there is a difference between search and match). You didn't mention whether you were doing a regex_search or a regex_match?
Begin Stolen from the boost regex docs:
The algorithm regex_match determines whether a given regular expression matches a given sequence denoted by a pair of bidirectional-iterators, the algorithm is defined as follows, note that the result is true only if the expression matches the whole of the input sequence, the main use of this function is data input validation:
End Stolen from the boost regex docs
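To make the search/match difference concrete, here is a minimal sketch. It assumes std::regex (the standard-library descendant of Boost.Regex) as a stand-in rather than Boost itself, and the helper names found_anywhere and matches_whole are made up for illustration. With Boost you would expect boost::regex, boost::regex_search and boost::regex_match to fill the same roles.

```cpp
#include <regex>
#include <string>

// Pattern from the original question: three digits, two digits, four digits.
// Assumption: std::regex (default ECMAScript grammar) stands in for Boost.Regex.
static const std::regex kPattern(R"(\d{3}-\d{2}-\d{4})");

// Search semantics: true if the pattern occurs anywhere inside `text`.
bool found_anywhere(const std::string& text) {
    return std::regex_search(text, kPattern);
}

// Match semantics: true only if the pattern accounts for all of `text`.
bool matches_whole(const std::string& text) {
    return std::regex_match(text, kPattern);
}
```

With these helpers, "123-12-1234" passes both checks, while "1234567-12-1234" passes found_anywhere (the tail "567-12-1234" fits) but fails matches_whole. For input validation, the match form is the one you most likely want.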
If it's a bug, has it been fixed in a subsequent boost release? Also what is the correct behavior? Should "a{1}b" be found in "aab" (albeit starting at the second character)?
See above. Yes, a{1}b should be found in a search, but not a match.
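The "aab" case can be sketched the same way. Again this assumes std::regex in place of Boost.Regex, and first_hit_offset and whole_string_is_ab are hypothetical helper names, not part of either library.

```cpp
#include <regex>
#include <string>

// Offset of the first place a{1}b is found in `text`, or -1 if it is absent.
long first_hit_offset(const std::string& text) {
    static const std::regex re("a{1}b");
    std::smatch m;
    if (!std::regex_search(text, m, re)) {
        return -1;
    }
    return static_cast<long>(m.position(0));
}

// True only if the entire string is exactly what a{1}b describes.
bool whole_string_is_ab(const std::string& text) {
    static const std::regex re("a{1}b");
    return std::regex_match(text, re);
}
```

For "aab", first_hit_offset reports 1 (the search starts succeeding at the second character), while whole_string_is_ab is false because the leading "a" is left over.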
FWIW, it's easy enough for me to workaround the current behavior with a pattern like this:
(^|[^a])a{1}b
You could use this, but I wouldn't recommend it (but that's just me... regex construction is deeply personal :) ). HTH. -jbs