On Tue, Sep 17, 2019 at 8:17 AM Peter Dimov via Boost wrote:
> Rainer Deyke wrote:
>> Or the user could be running a non-UTF-8 locale, but accessing a filesystem created by somebody who was using UTF-8 - in which case any filenames should be in UTF-8, even if the user's locale disagrees.
> It is because of this last possibility that I recommend treating all command-line arguments as UTF-8 on Unix systems, even if running a non-UTF-8 locale, for all cases where treating them as binary blobs is impractical. Unix filenames are binary blobs, but the de facto standard for interpreting these binary blobs as text is to use UTF-8. [...]
> How does any of this affect the library? It just gives you whatever you passed as `argv`, without needing to interpret it.
Windows is a different story.
Indeed, you can just use UTF-8 (as long as you document this!) everywhere except Windows. On Windows, you need to provide a wchar_t/UTF-16 overload for every char/UTF-8 overload in your lib.

If you want 100% correctness, you are not allowed to arbitrarily convert the wchar_t strings. In particular, you are not allowed to convert them to UTF-8, because one of them may be a filename, and on Windows it is possible to construct filenames that are not properly UTF-16-encoded. That makes the UTF-16 -> UTF-8 conversion lossy if you follow the Unicode guidelines for that conversion -- they say to produce a replacement character (U+FFFD) wherever you encounter the broken UTF-16.

Though such broken-UTF-16-named files are possible to create, they almost never come up in practice. So, if you don't care about this case that prevents 100% correctness, just provide wchar_t overloads, implement each one by converting to UTF-8 and calling your UTF-8 overload, and only define the wchar_t overloads when building on Windows.

Zach