This is a very good question.
On Windows I need to convert, for obvious reasons.
Now the question is what I accept as valid, what I reject as invalid, and where I draw the line.
Now as you have seen there are many possible "non-standard" UTF-8 variants.
What should I accept?
I still strongly suggest you simply call RtlUTF8ToUnicodeN() (https://msdn.microsoft.com/en-us/library/windows/hardware/ff563018(v=vs.85)....) to do the UTF-8 conversion. Do **nothing** else.
Niall
Actually, I think you have pointed me in a good direction that I hadn't considered before.

RtlUTF8ToUnicodeN and its counterpart for the reverse direction do something very simple: they substitute invalid code points or encodings with U+FFFD (REPLACEMENT CHARACTER), which is the standard Unicode way of saying "I failed to convert this". It is similar to the current ANSI / wide conversions producing ? instead.

This looks like a better approach than failing to convert the entire string. If you pass an invalid string, the conversion will succeed, but you'll get special characters (usually shown as � in a UI) that tell you something was wrong. This way, for example, getenv on a valid key will not return NULL and create ambiguity about what happened. It is also the more common behavior on Windows.

I like it, and I think I'll change the conversion functions in Boost.Nowide to behave this way.

Thanks!

Artyom Beilis
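
To make the replacement policy discussed above concrete, here is a minimal portable sketch in C++. The function name utf8_to_utf16_lossy is hypothetical: it is not part of Boost.Nowide or the Windows API, and it is not claimed to match RtlUTF8ToUnicodeN byte for byte. In particular, it assumes one U+FFFD per offending byte, which is only one of several possible substitution policies.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (assumption, not a Boost.Nowide or Windows function).
// Decodes UTF-8 to UTF-16. A byte that does not start a valid sequence is
// replaced by U+FFFD and skipped, so the conversion as a whole never fails.
std::u16string utf8_to_utf16_lossy(const std::string& in)
{
    const char16_t replacement = static_cast<char16_t>(0xFFFD);
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char b0 = static_cast<unsigned char>(in[i]);
        std::size_t len = 0;   // total length of the sequence, 0 if b0 is not a lead byte
        char32_t cp = 0;       // code point decoded so far
        char32_t min = 0;      // smallest code point allowed for this length
        if (b0 < 0x80)                { len = 1; cp = b0;        min = 0; }
        else if ((b0 & 0xE0) == 0xC0) { len = 2; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min = 0x10000; }

        // The sequence must fit in the input and use proper continuation bytes.
        bool ok = len != 0 && i + len <= in.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(in[i + k]);
            if ((b & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (b & 0x3F);
        }
        // Reject overlong forms, surrogates, and values above U+10FFFF.
        if (ok && (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF)))
            ok = false;

        if (!ok) {
            out.push_back(replacement);  // mark the problem, keep going
            ++i;
            continue;
        }
        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Encode as a UTF-16 surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
        i += len;
    }
    return out;
}
```

With a policy like this, a caller always gets a string back; a damaged input shows up as U+FFFD in the result instead of as a failed call, which is the property that removes the NULL ambiguity mentioned for getenv.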