To be honest, I don't know what the guys who designed <codecvt> were thinking in the first place.
It was done in the early-to-mid 1990s, with primary input coming from Asian national bodies and the now long-gone Unix vendors who had a big presence in that market.
I'm not talking about std::codecvt<> but about the new C++11 <codecvt> header that provides codecvt_utf8 - which is actually useless for char16_t, or for wchar_t on Windows, because there you need codecvt_utf8_utf16 instead. Very unintuitive, and likely to cause lots of trouble in the future. A major flaw of std::codecvt itself is mbstate_t, which isn't well defined, making it impossible to work with stateful encodings or do any composition/decomposition within the facet.
Header <codecvt> isn't what we need, as you point out below.
Boost.Locale provides one, but currently it is a deeply internal and complex part of the library.
The code I wrote for Boost.Nowide, or the one I suggest putting into the Boost.Locale header-only part, is a codecvt that converts between UTF-8 and UTF-16/32 according to the size of the character:
boost::(nowide|locale)::utf8_facet
- UTF-8 to UTF-16 (Windows)
- UTF-8 to UTF-32 (POSIX)
Don't forget UTF-8 to UTF-8 (some embedded systems).
IMO, a critical aspect of all of those, including UTF-8 to UTF-8, is that
they detect all UTF-8 errors, since ill-formed UTF-8 is used as an attack vector.
See Markus Kuhn's https://www.cl.cam.ac.uk/~mgk25/ucs/examples/UTF-8-test.txt
It should. Actually, if you want to validate/encode/decode UTF (8/16/32), there is boost::locale::utf::utf_traits that does it for you. It is also a good test to take a look at for Boost.Locale.

Artyom