RegEx Split function dies.
Hi,

I am attempting to parse email headers using the Boost regex library. I would like each email address in the To field (delimited by commas) to be placed into a vector<string>. Currently I am able to obtain the entire To line (all the To addresses). I could use another regular expression to separate it out, but I'd rather do it all in one. However, if I try to add anything more to this expression I get an exception: "Max regex search depth exceeded." I am assuming that this is because I've created an infinite loop. I'm not sure how to extend this expression to place every email address in the vector separately.

Thanks. Here's my code snippet:

    const char * expression_text =
        //Any possible leading whitespace:
        "^[ ]*"
        //The "To" field name in any case:
        "TO"
        //Any more whitespace:
        "[ ]*"
        //Then a colon:
        ":"
        //More white space potentially
        "[ ]*"
        //Everything here is an address up to "\r\n";
        "([-@_\\.\"<>A-Za-z0-9]+),?";

    //Create the expression class to use (true = ignore case):
    RegEx expression(expression_text, true);

    //Attempt to grep the mail_buffer:
    int num_recv = 0;
    try {
        num_recv = expression.Split(to, mail_buffer);
    } catch (...) {
        //throw some exception
    }
> If I try to add anything more to this I get an exception:
> "Max regex search depth exceeded."
> I am assuming that this is because I've created an infinite loop. I'm not sure how to extend this expression to place every email address in the vector separately.
That usually happens if you create an "ambiguous" expression and the regex matcher starts thrashing while trying to find the best possible match. Try to ensure that each time the state machine has to make a choice, the result is unambiguous.

However, I can't think of an easy way to do the one-stop processing you want with regex_split; getting the To line first and then splitting that is probably the way to go.

John.
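As a rough illustration of the two-step approach suggested above (grab the To line, then split it), here is a minimal sketch. It makes several assumptions: it uses the standard library's std::regex rather than the Boost RegEx class from the original post, the helper name extract_to_addresses is hypothetical, and it only looks at the first To line, without handling folded (continued) header lines.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns the addresses
// found on the first "To:" header line of `mail`, or an empty vector.
std::vector<std::string> extract_to_addresses(const std::string& mail)
{
    std::vector<std::string> addresses;

    // Step 1: capture the rest of the To line. "To" must start a line
    // (start of input or right after '\n'), and [^\r\n]* cannot run past
    // the end of the line, so the matcher has no ambiguous choices to make.
    static const std::regex to_line("(?:^|\n)[ \t]*to[ \t]*:[ \t]*([^\r\n]*)",
                                    std::regex::icase);
    std::smatch m;
    if (!std::regex_search(mail, m, to_line))
        return addresses;
    const std::string line = m[1].str();

    // Step 2: split the captured text on commas and any blanks around them.
    // Submatch index -1 asks the iterator for the text between the matches.
    static const std::regex comma("[ \t]*,[ \t]*");
    for (std::sregex_token_iterator it(line.begin(), line.end(), comma, -1), end;
         it != end; ++it) {
        if (it->length() > 0)          // skip empty pieces, e.g. ",,"
            addresses.push_back(it->str());
    }
    return addresses;
}
```

Because each expression only has one way to match, neither step should give the matcher anything to thrash on. Trailing blanks at the end of the To line are not trimmed in this sketch.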
participants (2)
- John Maddock
- tfandango