On 11/06/2020 19:37, Glen Fernandes via Boost wrote:
> The library provides three layers:
> - The string layer, a set of types that constitute "a better std::string"
> - The Unicode layer, consisting of the Unicode algorithms and data
> - The text layer, a set of types like the string layer types, but
>   providing transparent Unicode support
Firstly, I'd like to say that proposing a new string implementation is
probably one of the most masochistic things that you can do in C++. Even
more than proposing a result type. So I bow to you, Mr.
Laine, and I salute your bravery.
I'll put aside the Unicode and Text layers for now, and just consider
the String layer. I have to admit, I'm not keen on the string layer.
Firstly, I hate the naming. Everything there ought to get more
descriptive names. But more importantly, there's just something about
how the string implementation has been broken up and designed which
doesn't sit right with me. You seem to me to have prioritised certain
use cases over others, and those would not be my own choices, i.e. I
don't think the balance of tradeoffs is right in there.
For example, I wouldn't have chosen an atomically reference counted rope
design the way you have at all: I'd have gone for a fusion of your
static string builder with constexpr-possible (i.e. non-atomic)
reference counted fragments, using expression templates to lazily
canonicalise the string depending on the sink (e.g. if the sink is cout,
use a gather I/O sequence instead of creating a new string). That sort of
thing.
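
To make that concrete, here is a minimal sketch of the expression-template
idea only. Every name in it (fragment, concat, is_expr, materialise, demo)
is hypothetical and is not Boost.Text API. It assumes C++17, uses
non-owning std::string_view leaves in place of reference counted fragments,
and uses an ostream as a stand-in for a true gather I/O sink.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <string_view>
#include <type_traits>

// Leaf node: a non-owning view of one fragment (ownership is left out here).
struct fragment {
    std::string_view sv;
    constexpr std::size_t size() const { return sv.size(); }
    template <class Sink> void for_each(Sink&& sink) const { sink(sv); }
};

// Interior node: the lazy concatenation of two sub-expressions.
template <class L, class R>
struct concat {
    L lhs;
    R rhs;
    constexpr std::size_t size() const { return lhs.size() + rhs.size(); }
    template <class Sink> void for_each(Sink&& sink) const {
        lhs.for_each(sink);  // visit fragments left to right
        rhs.for_each(sink);
    }
};

// Trait marking which types take part in the expression tree.
template <class T> struct is_expr : std::false_type {};
template <> struct is_expr<fragment> : std::true_type {};
template <class L, class R> struct is_expr<concat<L, R>> : std::true_type {};

// operator+ only records the operands; no characters are copied.
template <class L, class R,
          class = std::enable_if_t<is_expr<L>::value && is_expr<R>::value>>
constexpr concat<L, R> operator+(L lhs, R rhs) { return {lhs, rhs}; }

// Sink 1: a stream. Each fragment is written in turn; no joined string is built.
template <class E, class = std::enable_if_t<is_expr<E>::value>>
std::ostream& operator<<(std::ostream& os, const E& e) {
    e.for_each([&os](std::string_view sv) { os << sv; });
    return os;
}

// Sink 2: a contiguous string, built with a single up-front reservation.
template <class E, class = std::enable_if_t<is_expr<E>::value>>
std::string materialise(const E& e) {
    std::string out;
    out.reserve(e.size());
    e.for_each([&out](std::string_view sv) { out.append(sv); });
    return out;
}

// Usage sketch: the same expression feeds two different sinks.
inline std::string demo() {
    constexpr auto expr = fragment{"Hello"} + fragment{", "} + fragment{"world"};
    static_assert(expr.size() == 12, "length is known without joining");
    std::ostringstream os;
    os << expr;  // fragment by fragment, no temporary string
    assert(os.str() == materialise(expr));
    return os.str();
}
```

A real gather sink would presumably fill an array of buffer descriptors
from for_each and hand it to the operating system in one call; the stream
overload above only shows where that decision would be made.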
Zach, could you take this opportunity to compare your choice of string
design with the string designs implemented by each of the following
libraries please?
- LLVM strings, string refs, twines etc.
- Abseil's strings, string pieces.
- Folly's strings, string pieces and ranges.
- CopperSpice's CsString.
I feel like I am forgetting at least another two. But, point is, I'd
like to know why you chose a different design to each of the above, in
those situations where you did so.
I'll nail my own colours to the mast on this topic: I've thought about
this long and hard over many, many years, and I've personally arrived at
the opinion that C needs to gain an integral string object built into
the language, which builds on top of an integral variably sized array
object (NOT current C VLAs). Said same built-in string object would also
be available to C++, by definition.
I have arrived at this opinion because I don't think that ANY library
solution can have the right balance of tradeoffs between all the
competing factors. I think that only an object built into the language
itself can deliver the perfect string object, because only the compiler
can balance optimisability with developer convenience.
I won't go into any more detail, as this is a review of the Text C++
library. And I know I've already discussed my opinion on SG16 where you,
Zach, were present, so you've heard all my thoughts on this already.
However, if you were feeling keen, I'd like to know if you could think
of any areas where language changes would aid implementing better
strings in C++?
Niall