boost::regex_replace problem
Greetings everyone! My name is Ilya, and I have the following problem with boost::regex_replace. I want to process text messages and replace the swear words in them with asterisks. I wrote regular expressions for this; here is an example of one of them:

"([[:punct:]]|[[:space:]]|[[:digit:]]|^)(?:a)(?:[[:punct:]]|[[:space:]]|[[:digit:]])*(?:b)(?:[[:punct:]]|[[:space:]]|[[:digit:]])*(?:c)([[:punct:]]|[[:space:]]|[[:digit:]]|$)"

The format string I used is "$1***$2". Then I run boost::regex_replace(), and it works almost correctly except in one case: when a message has abc's standing next to each other with a single separator between them, like "abc abc" or "--abc-abc--". So I need to run boost::regex_replace() twice to process such messages correctly. Is there any other way to solve this problem? Thank you for your answers.

Ilya Bindiug
AMDG

On 04/03/2014 09:18 AM, Илья wrote:
> So I need to run boost::regex_replace() twice to process such messages correctly. Is there any other way to solve this problem?
If I'm understanding correctly, you should be able to use lookahead (?=) and/or lookbehind (?<=). I've simplified it to use just \s as a separator:

"(?:^|(?<=\\s))(?:a)\\s*(?:b)\\s*(?:c)(?:(?=\\s)|$)"

and then just use "***" as the format. (Warning: untested.)

In Christ,
Steven Watanabe
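The difference between consuming a separator and merely asserting it can be sketched roughly as follows. This is only an illustration under some assumptions: it uses std::regex from the C++ standard library instead of Boost, and since the default ECMAScript grammar of std::regex supports lookahead but not lookbehind, it is a variant of the suggestion above rather than the same pattern. The leading separator is still captured and restored with "$1", and only the trailing separator becomes a lookahead. The names SEP, censor_naive, and censor_lookahead are hypothetical.

```cpp
#include <regex>
#include <string>

// Separator class from the original question: punctuation, whitespace, or a digit.
static const std::string SEP = "[[:punct:][:space:][:digit:]]";

// Original approach: both surrounding separators are consumed by the match
// and written back through $1 and $2. A separator shared by two adjacent
// words is used up by the first match, so the second word is missed.
std::string censor_naive(const std::string& text) {
    static const std::regex re(
        "(" + SEP + "|^)a" + SEP + "*b" + SEP + "*c(" + SEP + "|$)");
    return std::regex_replace(text, re, "$1***$2");
}

// Lookahead variant: the trailing separator is only asserted with (?=...),
// not consumed, so it stays available as the leading separator of the
// next match and one pass is enough.
std::string censor_lookahead(const std::string& text) {
    static const std::regex re(
        "(^|" + SEP + ")a" + SEP + "*b" + SEP + "*c(?=" + SEP + "|$)");
    return std::regex_replace(text, re, "$1***");
}
```

Under these assumptions, censor_naive("abc abc") leaves the second word untouched and returns "*** abc", while censor_lookahead("abc abc") returns "*** ***" in a single pass.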
Thanks a lot! That's exactly what I needed.

Ilya Bindiug

--- Original message ---
From: "Steven Watanabe" <watanabesj@gmail.com>
Date: 3 April 2014, 21:24:39

> If I'm understanding correctly, you should be able to use lookahead (?=) and/or lookbehind (?<=).

_______________________________________________
Boost-users mailing list
Boost-users@lists.boost.org
http://lists.boost.org/mailman/listinfo.cgi/boost-users