[Boost-users] Re: Re: Feature request for boost::filesystem

6 Jul 2004

      Eddie Diener wrote:
...
...
Now, I have a number of library packages. Say no application uses
wide char interface at the moment, so only narrow libraries are
installed. Now, a single application decides to use wide interface,
so its package now depends on wide libraries. As a result, I have
to install, in addition to some narrow libraries, their wide
equivalents.
You only have to install the appropriate wide equivalents. There is
nothing to say that a wide character application uses all wide character
libraries.
But a few wide applications can span a lot of libraries.
...
...
After enough applications decide to use Unicode, most
libraries will have to be installed in two flavours.
How is this worse than having a single version which has both wide and
narrow character equivalents ? You are not saving anything in this latter
way, and you are definitely worse than if the libraries were separate and
you only used one or the other versions in your applications.
I never talked about "single version which has both wide and
narrow character equivalents". What I'm after is a single version which
supports both wide and narrow interface/operations. But internally, it's
just one code.
...
...
I don't think this is very likely for new character types to appear.
I do. I would be very surprised if C++ does not adopt new character types
in the years to come. Do you really think that if the programming world
settles on other standard character representations that C++ will
adamantly ignore it ?
Isn't there one representation already?
...
Even now a number of programmers would like to see 
C++ support one of the Unicode standards natively, most likely UTF-32.
If you recall the Unicode discussions we had on the main list (hmm.... we
probably should have this discussion there as well), there were two
problems:
1. wchar_t is 16 bit on some platforms
2. even if it's 32 bit, wchar_t represents only a code point, and a complete
character with all the accents and other marks might take several
code points.

The second problem is actually the most serious, and I'm really not sure that
the right solution would be yet another character type.
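
To illustrate the second problem, here is a minimal sketch. It assumes a C++11 compiler (char32_t postdates this thread), and the function names are made up for the example. Even with a 32-bit character type, the same user-perceived character can be stored as one code point or as two:

```cpp
#include <cassert>
#include <string>

// "e with acute accent" written as a base letter followed by a
// combining accent: U+0065 U+0301 (two code points).
inline std::u32string decomposed_e_acute() {
    return std::u32string(U"e\u0301");
}

// The same character as a single precomposed code point: U+00E9.
inline std::u32string precomposed_e_acute() {
    return std::u32string(U"\u00E9");
}

// Hypothetical self-check: element count is a count of code points,
// not of characters as the user sees them.
inline void code_point_example() {
    assert(decomposed_e_acute().size() == 2);
    assert(precomposed_e_acute().size() == 1);
    assert(decomposed_e_acute() != precomposed_e_acute());
}
```

So a wider character type fixes the first problem only; counting or comparing "characters" still needs normalization or grapheme handling on top.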
...
...
I'm actually worried that when using templates in a straight-forward
way, all libraries will have to come in two variants or be twice
as large, which is bad because of:
No. There is nothing saying that a library must support more than one
character type. But if it does, isolating each character type in its own
header files
I don't understand this. For a templated implementation, you surely can't have
the wide and narrow versions in different headers.
...
and libraries is the right design.
...
...
- code size reasons,
- configuration reasons (just one more configuration variant to worry
about)
- interoperability/convenience? (what if I use unicode paths and want
  to pass narrow string to one of the operators?)
None of your reasons holds much weight. Code size wouldn't be affected
since each implementation is in its own library.
Only if you don't buy my argument about system-wide code size.
...
There is nothing to 
configure since character types are part of the C++ standard.
And? You still need to build two library variants, test them separately,
make two packages. Current Boost build process creates a huge number of
library variants (debug/release, MT/ST, stldebug ...). Is there a need to
double that number for libraries which might need unicode?
...
If you need 
to pass a unicode path to a narrow string operator, you the programmer are
either doing something wrong or, if there is a valid conversion, you can
make it yourself ( like wcstombs ).
If I have basic_path<char> and want to convert it into basic_path<wchar_t>,
do I really have to use mbstowcs? So, I need to iterate over all elements
of a path, calling that function, and creating the path? Sorry, there
should be a simpler way. And that simpler way is a converting constructor.
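
A rough sketch of such a converting constructor follows. The basic_path template here is hypothetical (it is not an existing Boost class), and the element-wise cast is only a placeholder that is valid for ASCII; a real implementation would presumably call a locale-aware conversion such as mbstowcs/wcstombs in that one place.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical basic_path, for illustration only.
template <class Char>
class basic_path {
public:
    typedef std::basic_string<Char> string_type;

    basic_path() {}
    explicit basic_path(const string_type& s) : str_(s) {}

    // Converting constructor: build a path from a path over another
    // character type, so the caller never touches mbstowcs directly.
    template <class OtherChar>
    basic_path(const basic_path<OtherChar>& other)
        : str_(convert(other.string())) {}

    const string_type& string() const { return str_; }

private:
    // Placeholder conversion (assumption: ASCII-only input).
    template <class OtherChar>
    static string_type convert(const std::basic_string<OtherChar>& in) {
        string_type out;
        out.reserve(in.size());
        for (std::size_t i = 0; i < in.size(); ++i)
            out += static_cast<Char>(in[i]);
        return out;
    }

    string_type str_;
};

// Hypothetical usage: narrow path converted to wide in one step.
inline void converting_constructor_example() {
    std::string s = "usr/lib";
    basic_path<char> narrow(s);
    basic_path<wchar_t> wide(narrow);
    assert(wide.string() == L"usr/lib");
}
```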
...
...
With a bit of additional design, it's possible to make library use one
representation internally, and have either non-templated interface,
or a tiny templated facade. E.g:
boost::path p;
   p = p / L"foo" / "bar";
does not seem all that bad to me.
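
A minimal sketch of that idea, assuming a hypothetical non-templated path class (not the actual boost::filesystem interface) that stores everything as std::wstring and widens narrow elements on the way in. The widening shown is a plain per-character copy, so it is only correct for ASCII:

```cpp
#include <cassert>
#include <string>

// Hypothetical class: one internal representation, two thin overloads.
class path {
public:
    path() {}

    // Narrow element: widened character by character (ASCII assumed).
    path& operator/=(const std::string& element) {
        return append(std::wstring(element.begin(), element.end()));
    }

    // Wide element: stored as is.
    path& operator/=(const std::wstring& element) {
        return append(element);
    }

    const std::wstring& wide() const { return repr_; }

private:
    path& append(const std::wstring& element) {
        if (!repr_.empty()) repr_ += L'/';
        repr_ += element;
        return *this;
    }

    std::wstring repr_;  // the single internal representation
};

inline path operator/(path lhs, const std::string& rhs)  { lhs /= rhs; return lhs; }
inline path operator/(path lhs, const std::wstring& rhs) { lhs /= rhs; return lhs; }

// Hypothetical usage, mirroring the snippet above.
inline void mixed_append_example() {
    path p;
    p = p / L"foo" / "bar";
    assert(p.wide() == L"foo/bar");
}
```

All the real work lives in the private append(), which is compiled once; the narrow and wide entry points are only a few lines each.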
It is possible to do that if you can convert all character types into your
internal representation. Even here I am paying for conversions back and
forth I may not need.
If you want to append a narrow path element to a unicode string, you *need* to
convert. Besides, I'm not all that sure this conversion is a performance
bottleneck, given that boost::path needs to use OS services. A single 'stat'
that fs::exists does might make the performance of conversion unimportant.
...
I therefore would prefer separate templated 
libraries. Why make headaches for oneself ?
The templated library is a much bigger headache than it seems. Unless you're
willing to put template code in headers (which is bad for big libraries, and
is really bad for boost::fs which has to include system headers), you need to:

- declare templates in public headers
- define templates in private headers/sources
- explicitly instantiate the templates for char and wchar_t.

Not so nice.
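
For reference, the three steps above look roughly like this. It is a sketch collapsed into one listing; the file names in the comments and the basic_path class are hypothetical.

```cpp
#include <cassert>
#include <string>

// ---- public header (hypothetical path.hpp): declarations only ----
template <class Char>
class basic_path {
public:
    explicit basic_path(const std::basic_string<Char>& s);
    bool empty() const;
private:
    std::basic_string<Char> str_;
};

// ---- private source (hypothetical path.cpp): definitions ----
// System headers would be included here, out of sight of users.
template <class Char>
basic_path<Char>::basic_path(const std::basic_string<Char>& s) : str_(s) {}

template <class Char>
bool basic_path<Char>::empty() const { return str_.empty(); }

// Explicit instantiations: the only two variants the library provides.
template class basic_path<char>;
template class basic_path<wchar_t>;

// ---- user code: sees only the declarations, links to the library ----
inline void explicit_instantiation_example() {
    std::string s = "a";
    std::wstring ws;
    basic_path<char> n(s);
    basic_path<wchar_t> w(ws);
    assert(!n.empty());
    assert(w.empty());
}
```

With the files really split, a user who tries basic_path with any other character type gets a link error, since no definition was instantiated for it.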
...
I am always in favor of 
designs which are clear and understandable over all other considerations.
And what's so un-understandable about boost::path which has both narrow and
wide methods?

- Volodya

Vladimir Prus