On 11/6/2010 4:58 AM, Alex Dubov wrote:
Eric Niebler
writes: On 11/5/2010 10:21 AM, Dominique Devienne wrote:
On Fri, Nov 5, 2010 at 3:31 AM, Alex Dubov
wrote: So the question is, how can one establish the size for this iterator adapter's internal storage so as to avoid match failures due to incomplete storage and avoid storing too much unnecessary data? I suppose, the storage can be reset when the match is reported. This, however, still leaves an open problem for situations where matches are separated by, possibly, gigabytes of non- data.
In the case of single-line matching, backtracking need only going back to the start of the current line, so I think the buffering only needs to be line based. For arbitrarily long multi-line matching, I don't know... --DD
In that case, the iterator may be backed up all the way to the beginning of the match, so if the iterator is caching the input, it would need to cache it all.
That's exactly the approach I would like to follow. Unfortunately, xpressive currently has no way to communicate the necessary information out.
What could be nice to have is a notification of some sort for three events: 1. First possibly matching character is encountered (at which point iterator adapter can start caching).
I thought about this and don't think it's practical. Xpressive's state machine is built from matcher components. What you're asking for is a way to signal that you are *exactly* one character into a match. That would require changing *all* matcher components so that (a) they know if they are the first non-zero-width matcher in the state machine, and (b) they signal when they successfully match the first character. It could be done, but I'm not going to maintain that. Sorry.
2. All possible matches abandoned (iterator cache can be truncated). 3. Match is found (well, it's already there:-).
The not-so-exciting alternative to having this functionality in xpressive is going back to C-style matching engines for stream filtering.
Yep. :-( But like I said, if your stream is like a file stream, then you can manipulate the underlying buffers to give you a sliding window, paging data in and out as needed. Good luck. -- Eric Niebler BoostPro Computing http://www.boostpro.com