Hi,
here is a Unicode regex example that I want to get matched:
#include <clocale>
#include <iostream>
#include <string>
#include <boost/regex.hpp>
#include <boost/regex/icu.hpp>

int main()
{
    std::setlocale(LC_ALL, "");

    std::wstring test_word(L"Ü");

    // 1. Plain wregex with the \p{u} (upper case) property
    boost::wregex condition(L"\\p{u}");
    if (boost::regex_match(test_word, condition)) {
        std::wcout << L"Matches!" << std::endl;
    }

    // 2. Plain wregex with the POSIX [[:upper:]] class
    boost::wregex condition2(L"[[:upper:]]");
    if (boost::regex_match(test_word, condition2)) {
        std::wcout << L"Matches!" << std::endl;
    }

    // 3. ICU-backed u32regex with the \p{u} property
    boost::u32regex condition3 = boost::make_u32regex(L"\\p{u}");
    if (boost::u32regex_match(test_word, condition3)) {
        std::wcout << L"Matches using lib icu!" << std::endl;
    }

    return 0;
}
Compiled with -lboost_regex -licuuc.
Result: Only the last regex condition matches.
So I have a few questions:
1. Is u32regex + make_u32regex the *only* way to get my regex condition matched?
Not necessarily, but it's the only way to get *consistent* Unicode support.
2. Why does the [[:upper:]] class in the second regex condition not match? For example, when I use:
echo "Ü" | grep '[[:upper:]]'
on command line - it works properly :)
Thanks in advance + regards,
The "Ü" is treated as upper case if std::locale treats it as upper case - I would sort of expect that to be the case - but apparently not :-(

HTH, John.
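One way to check this directly, independent of Boost, is to ask a given std::locale how it classifies the character. The following is only a minimal sketch; the helper name is_upper_in is hypothetical and not part of Boost or the standard library.

```cpp
#include <locale>

// Hypothetical helper: reports whether the locale `loc` classifies
// the wide character `c` as upper case, using the std::isupper
// overload from <locale> (which consults the locale's ctype facet).
bool is_upper_in(wchar_t c, const std::locale& loc)
{
    return std::isupper(c, loc);
}
```

Comparing is_upper_in(L'\u00DC', std::locale()) with is_upper_in(L'\u00DC', std::locale("")) should show which of the two locales treats "Ü" as upper case. Note that std::locale("") throws std::runtime_error if the environment names a locale that is not installed.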
Hi John,

thanks for the std::locale hint! This piece of code now works:

std::setlocale(LC_ALL, "");
boost::wregex condition;
condition.imbue(std::locale(""));
condition.assign(L"\\p{u}");
std::wstring test_word(L"Ü");
if (boost::regex_match(test_word, condition)) {
    std::wcout << L"Matches!" << std::endl;
}

Regards,
Stefan
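A likely reason the imbue call makes the difference: std::setlocale only changes the C library locale, while the global C++ std::locale stays at the classic "C" locale until std::locale::global is called. Assuming the regex uses Boost's std::locale-based traits (the usual default outside Windows), a default-constructed boost::wregex would therefore still see the "C" locale. The std-only sketch below illustrates the distinction; both helper names are hypothetical.

```cpp
#include <clocale>
#include <locale>
#include <string>

// Hypothetical helper: name of the current global C++ locale.
// A default-constructed std::locale is a copy of the global one.
std::string global_cpp_locale_name()
{
    return std::locale().name();
}

// Hypothetical helper: name of the current C library locale.
// Passing nullptr only queries the locale; it does not change it.
std::string c_locale_name()
{
    const char* name = std::setlocale(LC_ALL, nullptr);
    return name ? name : "";
}
```

After std::setlocale(LC_ALL, ""), c_locale_name() reflects the environment, but global_cpp_locale_name() still returns "C" unless std::locale::global has been called. Imbuing the regex with std::locale(""), as above, or calling std::locale::global(std::locale("")) before constructing it, are two ways to close that gap.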
Participants (2):
- John Maddock
- Stefan Schweter