Re: [boost] [Boost-users] Interest in a Unicode library for Boost?
On Wed, Oct 30, 2019 at 4:51 AM Leon Mlakar wrote:
On 29.10.2019 17:11, Zach Laine wrote:
- for the sake of completeness the normalization type used at the text level ought to be a policy parameter; although I do understand your arguments against it, I think it should be there even at the cost of different text types not being interoperable without conversions
I disagree. Policy parameters are bad for reasoning. If I see a text::text, as things currently stand, I know that it is stored as a contiguous array of UTF-8, and that it is normalized FCC. If I add a template parameter to control the normalization, I change the invariants of the type. Types with different invariants should have different names. To do otherwise is a violation of the single responsibility principle.
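To make the trade-off concrete, here is a minimal sketch of what a normalization policy parameter might look like. Everything in it is hypothetical: `basic_text`, `nfd_tag`, `fcc_tag`, and `byte_length` are illustration names only and are not part of Boost.Text. It shows that each instantiation is already a distinct, non-convertible type, and that generic code written against the template can no longer assume one normalization form.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical normalization tags; these are not Boost.Text names.
struct nfd_tag {};
struct fcc_tag {};

// Hypothetical policy-based text type. The tag names the normalization
// invariant that the stored UTF-8 is assumed to satisfy; this sketch does
// not perform any normalization itself.
template <typename Normalization>
class basic_text {
public:
    explicit basic_text(std::string utf8) : storage_(std::move(utf8)) {}
    const std::string& storage() const { return storage_; }

private:
    std::string storage_;
};

using fcc_text = basic_text<fcc_tag>;
using nfd_text = basic_text<nfd_tag>;

// Each instantiation is its own type, and nothing converts implicitly
// between them: the "different names" exist either way.
static_assert(!std::is_same<fcc_text, nfd_text>::value,
              "different policies yield different types");
static_assert(!std::is_convertible<nfd_text, fcc_text>::value,
              "crossing normalization forms needs an explicit conversion");

// Generic code only sees basic_text<N>, so it cannot rely on any one
// normalization form being in effect.
template <typename N>
std::size_t byte_length(const basic_text<N>& t) {
    return t.storage().size();
}
```

Under these assumptions the policy parameter does not remove the need for conversions; it only moves the distinction from the type's name into its template argument.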
Okay, the policy or not the policy was not my point ... it was to allow for different underlying normalizations. Granted, it may only be important to (a few) corner cases where input and/or output normalizations are given, and your assessment that it may not be worth the effort is reasonable ... unless you are aiming towards adding to the standard. Then the completeness imho becomes more important.
Frankly, I'm not proficient enough in the meta-programming to make a strong case either for policy parameter or for explicit types/templates. I just happen to prefer the policy based approach.
Understood. FWIW, the algorithms provided by Boost.Text make it possible to use any normalization representation, though at times conversions may be necessary. Some of those conversions are mandated by the Unicode standard itself -- you cannot feed NFC, NFKC, or NFKD to the collation algorithm, for instance (though implementations are possible for NFD and FCC).
- at the text level I'm not sure I'm willing to cope with different fundamental text types; I just want to use boost::text::text, pretty much the same as I use std::string as an alias for a much more complex class template; heck, even at the string layer I'd probably prefer the rope/contiguous concept to be a policy parameter to the same type template.
That would be like adding a template parameter to std::vector that makes it act like a std::deque for certain values of that parameter. Changing the space and time complexity of a type by changing a template parameter is the wrong answer.
No, that is not making the std::vector act as a std::deque - the text would still remain the text and act as a text, with the same interface. It's more like a FIFO implementation using either std::vector or std::deque for its store - since in both cases the FIFO has the same interface and functionally behaves the same, I really don't want two distinct types. The type template with a parameter that chooses the underlying storage seems much more natural to me.
Your example highlights my point. For N inputs to your FIFO queue, a deque underlying implementation is worst-case O(N). A vector implementation is worst-case O(N*N). The invariants of the type matter, and they matter a lot. Saying a foo may be like a bar or like a baz only works when bar and baz are so similar that you cannot observe a difference in the behavior of foo.
Zach
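The complexity gap in the FIFO comparison can be made visible with a rough sketch. The `fifo` adaptor and `shifts_to_drain` helper below are hypothetical names for illustration, and the sketch assumes the vector-backed version dequeues by erasing its first element (a vector-backed queue that tracks a head index instead would behave differently). It counts how many elements must be shifted while draining a queue of N items.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Hypothetical FIFO adaptor: the interface is identical for both stores.
template <typename Store>
class fifo {
public:
    void push(int value) { store_.push_back(value); }

    // Removes the oldest element and returns how many of the remaining
    // elements had to be shifted to close the gap at the front.
    std::size_t pop() {
        assert(!store_.empty());
        return pop_front(store_);
    }

    std::size_t size() const { return store_.size(); }

private:
    static std::size_t pop_front(std::vector<int>& v) {
        std::size_t shifted = v.size() - 1; // every survivor moves down one slot
        v.erase(v.begin());
        return shifted;
    }

    static std::size_t pop_front(std::deque<int>& d) {
        d.pop_front(); // constant time: no element is moved
        return 0;
    }

    Store store_;
};

// Pushes n items, then pops them all, returning the total number of shifts.
template <typename Store>
std::size_t shifts_to_drain(std::size_t n) {
    fifo<Store> q;
    for (std::size_t i = 0; i < n; ++i) {
        q.push(static_cast<int>(i));
    }
    std::size_t total = 0;
    while (q.size() > 0) {
        total += q.pop();
    }
    return total;
}
```

With these assumptions, `shifts_to_drain<std::vector<int>>(n)` grows as n*(n-1)/2 while `shifts_to_drain<std::deque<int>>(n)` stays at zero, so the two instantiations share an interface but are observably different in cost.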