But maybe I missed something here. If there really is a good reason for enforcing valid UTF-8 in some situation, please let me know :)
OK, I want to summarize my response regarding WTF-8, especially from a security point of view.

There are a lot of dangers in this assumption: generating WTF-8 as if it were UTF-8 can lead to serious issues. It is almost never OK to generate invalid UTF-8, especially since most users (99% of them, most likely) will not understand what we are talking about with round-trip/WTF-8 and so on, and I'm talking from experience.

Just to make clear how serious it can be, this is a CVE regarding one bug in Boost.Locale:

http://people.canonical.com/~ubuntu-security/cve/2013/CVE-2013-0252.html

Let's show a trivial example of what WTF-8 can lead to.

Tool:

- A user monitoring system that monitors all files and sends a report with all changes as XML to some central server

Denial of Service attack example:

- User creates a file whose name is invalid UTF-16
- System monitors the file system and adds the file to the XML report in WTF-8 format
- The central server does not accept the XML since it fails UTF-8 validation
- User does whatever he wants without being monitored
- User removes the file
- No reports were generated during the period the user needed: a DoS attack

Bottom line:

(a) Since 99% of programmers are barely aware of the various Unicode issues, it is dangerous to assume that providing such a round trip in a trivial way is OK
(b) If you need to do complex manipulations of the file system you may consider Boost.Filesystem, which keeps its internal representation in the native format and converts to UTF-8 for other uses (nowide provides integration with Boost.Filesystem)
(c) It is possible to add WTF-8 support under the following restrictions:
(c.1) Only by adding wtf8_to_wide and wide_to_wtf8, for when the user wants it explicitly(!)
(c.2) No "C/C++"-like API should accept it, i.e. nowide::fopen/nowide::fstream must not do it.

Artyom Beilis
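
To make the validation failure in the example above concrete, here is a minimal sketch of the kind of strict UTF-8 check a receiving server might apply. It is not code from Boost.Nowide or Boost.Locale: the names is_strict_utf8 and wtf8_encode_lone_surrogate are hypothetical and exist only for this illustration. The helper produces the 3-byte form that a WTF-8 style encoder would emit for an unpaired UTF-16 surrogate, and the validator rejects it because surrogate code points are not allowed in well-formed UTF-8.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: encode one unpaired UTF-16 surrogate (0xD800..0xDFFF)
// using the generic 3-byte pattern, as a WTF-8 style encoder would.
std::string wtf8_encode_lone_surrogate(std::uint16_t u) {
    std::string out;
    out += static_cast<char>(0xE0 | (u >> 12));
    out += static_cast<char>(0x80 | ((u >> 6) & 0x3F));
    out += static_cast<char>(0x80 | (u & 0x3F));
    return out;
}

// Hypothetical strict validator: returns true only for well-formed UTF-8,
// i.e. no overlong forms, no surrogates, nothing above U+10FFFF.
bool is_strict_utf8(const std::string& s) {
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) { ++i; continue; }  // ASCII

        std::size_t len = 0;
        std::uint32_t cp = 0;
        std::uint32_t min_cp = 0;
        if ((b0 & 0xE0) == 0xC0)      { len = 2; cp = b0 & 0x1F; min_cp = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { len = 3; cp = b0 & 0x0F; min_cp = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { len = 4; cp = b0 & 0x07; min_cp = 0x10000; }
        else return false;  // stray continuation byte or invalid lead byte

        if (n - i < len) return false;  // truncated sequence
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(s[i + k]);
            if ((b & 0xC0) != 0x80) return false;  // not a continuation byte
            cp = (cp << 6) | (b & 0x3F);
        }
        if (cp < min_cp) return false;                    // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;   // surrogate (the WTF-8 case)
        if (cp > 0x10FFFF) return false;                  // outside Unicode range
        i += len;
    }
    return true;
}
```

Under these assumptions, is_strict_utf8(wtf8_encode_lone_surrogate(0xD800)) returns false, while ordinary UTF-8 text passes. A file name containing a single unpaired surrogate is therefore enough to make the whole report fail validation on the receiving side.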