regular expression too complex
Hello,
I'm generating a complex regular expression for removing words from
texts, but now I get this exception:
terminate called after throwing an instance of
'boost::exception_detail::clone_impl
Is there a solution for this kind of removal operation? I have a vector of strings and I must remove each element from the texts, so I build a regular expression with "or" alternation and case-insensitive matching, and use regex_replace to remove the words.
If you care about performance, write your own matching routine. I'd build a tree/forest of characters from your matching words. One pointer walks through the original string while up to N pointers follow the matching tree. The original text is copied character by character in a single pass; when one of the matching pointers reaches the end of a word, you have a match and move the output write pointer back by the length of that word. Details such as longest-match handling or encoding support are up to you. It should be much faster than a general regexp. -- Slava
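A minimal sketch of the tree idea above, using only the standard library. It is a simplification rather than the exact scheme described: instead of N concurrent matching pointers and a write pointer that moves back, it restarts one trie walk at each input position and skips the longest match found there. The names TrieNode, lower, build_trie and remove_words are made up for illustration. It assumes a single-byte encoding, case folding via std::tolower, and that a word may match anywhere in the text (no word-boundary check).

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Hypothetical helper type (not part of Boost): one node of a character trie.
struct TrieNode {
    std::map<char, std::unique_ptr<TrieNode>> next;
    bool terminal = false;  // true if one of the words ends at this node
};

// Lower-case one char; std::tolower must be given an unsigned char value.
inline char lower(char c) {
    return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
}

// Build the trie from the words to remove, folding case on insertion.
inline TrieNode build_trie(const std::vector<std::string>& words) {
    TrieNode root;
    for (const std::string& w : words) {
        if (w.empty()) continue;  // skip empty entries; they carry no word
        TrieNode* node = &root;
        for (char c : w) {
            std::unique_ptr<TrieNode>& child = node->next[lower(c)];
            if (!child) child.reset(new TrieNode);
            node = child.get();
        }
        node->terminal = true;
    }
    return root;
}

// Copy text to the output, dropping the longest word that starts at each position.
inline std::string remove_words(const std::string& text, const TrieNode& root) {
    std::string out;
    out.reserve(text.size());
    std::size_t i = 0;
    while (i < text.size()) {
        const TrieNode* node = &root;
        std::size_t longest = 0;  // length of the longest match starting at i
        for (std::size_t j = i; j < text.size(); ++j) {
            auto it = node->next.find(lower(text[j]));
            if (it == node->next.end()) break;
            node = it->second.get();
            if (node->terminal) longest = j - i + 1;
        }
        if (longest > 0) {
            i += longest;      // a word matched here: skip it
        } else {
            out += text[i++];  // no match: keep the character
        }
    }
    return out;
}
```

Usage would be along the lines of building the trie once with build_trie(words) and then calling remove_words(text, trie) for each text. Restarting the walk at every position costs more than the single-pass scheme in the worst case, but for a removal that runs only once that is unlikely to matter.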
On 24.05.2011 at 09:02, Viatcheslav.Sysoltsev@h-d-gmbh.de wrote:
Is there a solution for this kind of removal operation? I have a vector of strings and I must remove each element from the texts, so I build a regular expression with "or" alternation and case-insensitive matching, and use regex_replace to remove the words.
If you care about performance, write your own matching routine. I'd build a tree/forest of characters from your matching words. One pointer walks through the original string while up to N pointers follow the matching tree. The original text is copied character by character in a single pass; when one of the matching pointers reaches the end of a word, you have a match and move the output write pointer back by the length of that word. Details such as longest-match handling or encoding support are up to you. It should be much faster than a general regexp.
Performance is not my primary concern. I would like to use an existing component for this, because the removal runs only once. Is there anything in Boost that I can use, such as a state machine library or something similar? But the tree/forest idea is very nice. Thanks, Phil
On 5/24/2011 5:13 PM, Alexander Mingalev wrote:
On 24.05.2011 12:58, Kraus Philipp wrote:
Performance is not my primary concern. I would like to use an existing component for this, because the removal runs only once.
Maybe Boost.Xpressive will work for you.
Yes, it can. Static xpressive has "symbol tables" (a search trie). You
put all the strings into a std::map. It'd look something like this:
#include <string>
#include <iostream>
#include
participants (4)
- Alexander Mingalev
- Eric Niebler
- Kraus Philipp
- Viatcheslav.Sysoltsev@h-d-gmbh.de