On 15/06/2023 13:42, Marshall Clow wrote:
At Boostcon last month, there was a discussion of indirect sorting algorithms to boost::algorithm.
An “Indirect sort” (called 'argsort' in numpy - https://numpy.org/doc/stable/reference/generated/numpy.argsort.html) returns a sequence of indices into the collection describing how the collection can be rearranged in order to become sorted.
So I created https://github.com/boostorg/algorithm/pull/117 to start down this road. (There’s clearly more that could be done here - stable_sort, partial_sort, nth_element, etc)
Comments welcomed.
It invites an interesting question. A sequence of indexes is only particularly useful for a random-access collection. If it instead generated and then outputed a sorted sequence of iterators, it could potentially support other input collection types (such as std::list, or something else with forward-only iterators) as well. But the utility of this depends on the subsequent usage. A collection of iterators is better than a collection of indexes (and frequently the same size, when not using checked iterators) if your purpose is to then iterate the collection in sorted order, or otherwise perform read-only operations. A collection of iterators is less safe if you intend to perform mutations on the underlying list (such as trying to actually realize the sorted order), as this will sometimes invalidate them depending on the underlying collection type and the specific operations used. (Indexes can also be rendered similarly useless, but are somewhat easier to perform equivalent operations to keep in sync with less performance issues than trying to do the same with non-random-access iterators.)