Most efficient way to convert a run of Perl substitutions to boost::regex?
Hi!

I'm converting a Perl program that basically consists of a succession of regex transformations of the input string, e.g.:

    # Case 1: mutually exclusive endings
    s/erens$/\@r\@ns_/;
    s/ernes$/\@rn\@s_/;
    s/endes$/\@n\@s_/;
    s/estes$/\@st\@s_/;

    # Case 2: a load of manipulations done in sequence
    s/jou/sju/g;
    s/jau/sjo/g;
    s/nch/ngk/g;

Now the question is: how do I convert this to boost::regex code that runs as fast as possible? I expect to have to call the function implementing these transformations (there are /many/) about half a million times at runtime, so I'd like to use the most efficient method available.

I expect that Case 2 is simple, as I can use the regex_merge function and nest the calls, as in:

    static const boost::regex re1("jou");
    static const boost::regex re2("jau");
    static const boost::regex re3("nch");
    static const std::string fs1("sju");
    static const std::string fs2("sjo");
    static const std::string fs3("ngk");

    std::string res =
        regex_merge(
            regex_merge(
                regex_merge(input_string, re1, fs1),
                re2, fs2),
            re3, fs3);

As to Case 1, I guess I'd have to use a combination of the grep and format functions. In the case shown it'd be easy, as there is either one match or none for each line.

Any suggestions wrt. improving run-time performance?

Sincerely
Anders S. Johansen
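For Case 1, one possible approach is a single end-anchored alternation with one capture group per ending, so that at most one search is run per input string. The sketch below uses std::regex from the standard library rather than boost::regex, purely to keep it self-contained. The function name apply_case1 and the replacement table are hypothetical, and note that `$` here matches only at the very end of the string, whereas Perl's `$` also matches before a trailing newline.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): rewrite at most one of the
// mutually exclusive endings from Case 1.
std::string apply_case1(const std::string& input) {
    // One alternation anchored at the end of the string; each ending has
    // its own capture group so we can tell which one matched.
    static const std::regex re("(?:(erens)|(ernes)|(endes)|(estes))$");
    static const char* const repl[] = {"@r@ns_", "@rn@s_", "@n@s_", "@st@s_"};

    std::smatch m;
    if (!std::regex_search(input, m, re)) {
        return input;  // none of the endings is present
    }
    for (std::size_t i = 0; i < 4; ++i) {
        if (m[i + 1].matched) {
            // The match sits at the end, so prefix + replacement is the result.
            return m.prefix().str() + repl[i];
        }
    }
    return input;
}
```

Calling apply_case1 on a word that ends in none of the four endings returns it unchanged. Whether this is actually faster than four separate anchored substitutions depends on the regex engine and should be measured.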
> static const boost::regex re1("jou");
> static const boost::regex re2("jau");
> static const boost::regex re3("nch");
> static const std::string fs1("sju");
> static const std::string fs2("sjo");
> static const std::string fs3("ngk");
>
> std::string res = regex_merge( regex_merge( regex_merge(input_string, re1, fs1), re2, fs2), re3, fs3);
Have you thought about combining these into one big regex:

    static const boost::regex e("(jou)|(jau)|(nch)");
    static const std::string format_string("(?1sju)(?2sjo)(?3ngk)"); // conditional format string
    regex_merge(input_string, e, format_string); // search and replace everything at once

John.
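The single-pass idea can also be sketched without conditional format strings, which std::regex does not support. The version below is an approximation under that assumption: it walks the matches of the combined alternation once and picks the replacement by checking which capture group participated. The function name apply_case2_single_pass is hypothetical. With Boost.Regex itself, the conditional format string is, as far as I know, only honoured when the format_all flag is passed to regex_replace (of which regex_merge is an older name), so that is worth checking in the documentation.

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper (not from the thread): apply the three Case 2
// substitutions in one left-to-right pass over the input.
std::string apply_case2_single_pass(const std::string& input) {
    static const std::regex re("(jou)|(jau)|(nch)");
    static const char* const repl[] = {"sju", "sjo", "ngk"};

    std::string out;
    auto last = input.cbegin();  // end of the previous match
    for (std::sregex_iterator it(input.cbegin(), input.cend(), re), end;
         it != end; ++it) {
        const std::smatch& m = *it;
        out.append(last, m[0].first);  // copy the text between matches
        for (std::size_t i = 0; i < 3; ++i) {
            if (m[i + 1].matched) {  // group i+1 tells us which pattern hit
                out += repl[i];
                break;
            }
        }
        last = m[0].second;
    }
    out.append(last, input.cend());  // copy the tail after the last match
    return out;
}
```

One design caveat: a single pass never rescans replaced text, while the nested sequential calls do. The two give the same result only when no replacement can produce input for a later pattern, so machine-generated code should verify that before merging substitutions.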
John Maddock wrote:
> Have you thought about combining these into one big regex:
Err... Yes, but I didn't know that the syntax you describe was available. Where is this documented? I can't find any references to this in the accompanying documentation wrt. regex syntax.

As the code is machine-generated from the original Perl code, it's semi-trivial for me to generate the most efficient C++/regex code, so even outlandish syntax is OK by me ;)

Anders
> Err... Yes, but I didn't know that the syntax you describe was available. Where is this documented? I can't find any references to this in the accompanying documentation wrt. regex syntax.

Look in the section on the format string syntax.

John.
participants (2)
- Anders Johansen
- John Maddock