I have personally seen cases where sorting speed is a bottleneck.
However, there are implementations that are still quite faster than this library (I have one that is at least 3.5 faster than std::sort), and that fares better with small sizes.
I'd appreciate if you could point me to specific implementations and/or data, so I can run a direct comparison. Did you test their relative speeds? This library provides a significantly larger speedup on easy problems, such as evenly distributed random data or where the data range is not much larger than the number of elements being sorted. This library is optimized and speed-tested for the worst case, so it provides a consistent speedup over comparison-sorting on the vast majority of distributions, and provides a better worst-case guarantee than both conventional hybrid radix and pure comparison-based approaches. Many hybrid-radix approaches do poorly with data distributions like that in https://github.com/spreadsort/sort/blob/master/example/alrbreaker.cpp , often significantly slower (5X runtime) than pure comparison-based sorts. When integer_sort or float_sort run into a problem like that, they automatically revert to std::sort and end up having comparable speed to pure comparison-based sorting. To restate the case: for any specific data distribution, it is possible to construct a faster sorting implementation than this one. The goal of this library is to provide most of the speedup of specialized radix-based or hybrid-radix algorithms, without sacrificing significant speed on problems where such approaches do poorly, and without having to write specialized code each time.