Yes, thank you. I had spent so long checking that the wstring I was passing to regex_replace was correct that I forgot to simply test the size of the returned wstring, which would have revealed that it too was fine. Careless on my part. So I'm very grateful, you've saved me from a lot of frustration. And thank you so much for the Xpressive library, it's such a joy to use.

Kind regards,
Eóin

Eric Niebler wrote:
Eoin wrote:
Hello, I've been playing around with Xpressive and have hit a stumbling block when working with Unicode. I'm really not sure whether what I am trying to do is possible or whether I'm just doing something silly wrong. Here is a very small example:
    wstring in = L"A ЏиϊсoδΣ Hello //World";
    wsregex comments = wsregex::compile(L"(//[^\\n]*|/\\*.*?\\*/)");
    wstring clear(L"");
    wstring out = regex_replace(in, comments, clear);
(Note: if the wstring 'in' gets scrambled in email, I have attached a tiny UTF-8 encoded file of what it should contain.) This code compiles, but after it runs the wstring 'out' contains only "A ". If there are no Unicode characters in 'in', the regex replacement works as expected.
I can't reproduce this problem, but I bet I can guess what is going on for you. I bet after the regex_replace, you're trying to write the string to std::wcout, like this:
std::wcout << out << std::endl;
That's the first thing I tried, and the output is "A ". That's on Windows after building with VC8. But that's just because the Windows console doesn't know what to do with these Unicode characters, so std::wcout enters an error state, and nothing further gets displayed. If you look at the result in a debugger, you can see the Unicode string is as it should be.
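If a debugger isn't handy, the same check can be done in code. What follows is only a rough sketch using the standard library, not anything from Xpressive: the helper names dump_code_units and write_checked are made up for illustration. The first renders the string's size and each wchar_t as hex, so the result doesn't depend on what the console can display. The second writes to a wide stream and reports whether the stream went into a failed state.

```cpp
#include <cstddef>
#include <iomanip>
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of Xpressive): returns the string's size and
// each wchar_t value in hex, using only narrow ASCII output, so it can be
// printed on any console.
std::string dump_code_units(const std::wstring& s)
{
    std::ostringstream os;
    os << "size=" << s.size() << ':';
    for (std::size_t i = 0; i < s.size(); ++i) {
        os << ' ' << std::hex << std::uppercase
           << std::setw(4) << std::setfill('0')
           << static_cast<unsigned long>(s[i]);
    }
    return os.str();
}

// Hypothetical helper: writes s to a wide stream and returns false if the
// stream is in a failed state afterwards. The error flags are cleared so
// that later writes are not silently dropped.
bool write_checked(std::wostream& os, const std::wstring& s)
{
    os << s;
    if (os.fail()) {
        os.clear();
        return false;
    }
    return true;
}
```

With these in place, something like std::cout << dump_code_units(out) shows whether 'out' really is truncated, and write_checked(std::wcout, out) returning false would suggest the problem is in the output stream rather than in regex_replace.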
If you don't think that's what's going on, please send me a *complete* program that reproduces the error, and let me know what compiler/platform you're on.
Thanks,