[regex] character classes when using wide chars
Hi
Documentation says that when using wide character strings with
boost::wregex a character class like [[:alpha:]] depends on the system's
implementation of iswalpha() function.
My system seems to have a working implementation of iswalpha() function,
but [[:alpha:]] still only seems to match ASCII alphabet characters.
For example the following code:
#define UNICODE
#include
Tomaž Šolc wrote:
Hi
Documentation says that when using wide character strings with boost::wregex a character class like [[:alpha:]] depends on the system's implementation of iswalpha() function.
My system seems to have a working implementation of iswalpha() function, but [[:alpha:]] still only seems to match ASCII alphabet characters.
I'm using Debian GNU/Linux with Boost 1.33.1. I also tried a similar program using boost::wregex and std::iswalpha() classes instead of the POSIX interface with the same results.
Can anyone give me some advice on what I'm doing wrong here?
Nothing: it looks like a bug. The current implementation was changed to use the C++ locale by default, but I forgot to change the POSIX APIs to explicitly use the C locale. Try setting std::locale::global to the required locale and that should then work (provided your C++ std library supports all the locales that setlocale does).
It's probably a bit late to fix this for 1.35, but will you please open a Trac issue at svn.boost.org so I don't forget about this?
Thanks, John Maddock.
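A minimal sketch of the suggested workaround, using only the C++ standard library. The helper names set_global_cpp_locale and is_alpha_global are hypothetical and are not part of Boost.Regex. Whether a locale name such as "en_US.utf8" is accepted depends on the locales installed on the system and on the C++ standard library, so the sketch reports failure instead of assuming the name is valid.

```cpp
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: install the named locale as the global C++ locale.
// std::locale's constructor throws std::runtime_error when the C++
// standard library does not recognise the name, so report that as false.
bool set_global_cpp_locale(const std::string& name) {
    try {
        std::locale::global(std::locale(name.c_str()));
        return true;
    } catch (const std::runtime_error&) {
        return false;
    }
}

// Hypothetical helper: classify a wide character with the current global
// C++ locale (a default-constructed std::locale is a copy of it).
bool is_alpha_global(wchar_t ch) {
    return std::isalpha(ch, std::locale());
}
```

Calling set_global_cpp_locale("en_US.utf8") before constructing or compiling any wide-character regex would correspond to the change suggested above. If it returns false, the global locale is left untouched and classification keeps using whatever locale was active before.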
Hi
Nothing: it looks like a bug. The current implementation was changed to use the C++ locale by default, but I forgot to change the POSIX APIs to explicitly use the C locale. Try setting std::locale::global to the required locale and that should then work (provided your C++ std library supports all the locales that setlocale does).
Changing:
setlocale(LC_ALL, "en_US.utf8");
to:
std::locale en("en_US.utf8");
std::locale::global(en);
fixed the problem. Thanks.
It's probably a bit late to fix this for 1.35, but will you please open a Trac issue at svn.boost.org so I don't forget about this?
I opened ticket #1446
Best regards
Tomaz Solc
Tomaz Solc wrote:
Changing:
setlocale(LC_ALL, "en_US.utf8");
to:
std::locale en("en_US.utf8");
std::locale::global(en);
fixed the problem. Thanks.
OK good.
It's probably a bit late to fix this for 1.35, but will you please open a Trac issue at svn.boost.org so I don't forget about this?
I opened ticket #1446
Thanks, John.
participants (2)
- John Maddock
- Tomaž Šolc