Hi, I'm trying to improve insertion times for flat associative containers. In range insertions one optimization is to insert the whole range in the end of the container and sort the whole vector instead of trying a loop. An additional optimization proposed in the mailing list is to have an option to create externally the internal vector type of the flat map/set, fill it and then try to move it into the flat container. That requires sorting the moved vector (and removing duplicates if needed). I could try to use std::sort/std::stable_sort but there are some problems: -> Non-portable for C++03 compilers and Boost.Move, and Boost.Container fully supports those platforms. -> Additional O(N) temporary memory for stable sorting. Usually this means a temporary N element array, although there are algorithm to use only N/2 temporary elements. With big containers, this is a lot of memory. So I started analyzing several papers and implementations to see if a custom implementation was possible, but I felt Boost.Sort could be a good place to have that implementation and the Boost community a good place to discuss design alternatives. My initial requirements: 1) Stable sort. A non-stable sort is faster and useful for unique keys containers, but stable sort can be used initially in all flat containers until the non-stable version is implemented. 2) Externally supplied auxiliary memory with optionally no auxiliary memory at all. It seems that there are newer algorithms like Grailsort (https://github.com/Mrrl/GrailSort) or WikiSort (https://github.com/BonzaiThePenguin/WikiSort) that claim O(1) memory, even no auxiliary buffer, and key comparison and data exchanges with complexity O(n*log(n)). However I don't think those implementation correctly handle the basic exception guarantee or non-trivially copyable types. Reason: I want to avoid allocating short-lived temporary memory: if it's available I want to pass the extra capacity that flat containers' internal vector holds for future insertions. 3) Customization points (without ADL, which IMHO is not customizable enough). The sort algorithm should use a SortTraits template parameter for basic operations like "move", "swap", "initialized_move". E.g: template<class T> struct sort_traits { static? void move(T &from, T &to); static? void swap(T &l, T &r); static? void uninitialized_move(T &from, void *to); static? void destroy(T &from, void *to); }; I don't know if this trait should be stateful or not (static members). I could imagine the usefulness of a stateful implementation of sort_traits that could store a reference to an allocator so that some special "construct" version is used in some operations. Or just a version that counts the number of each operation to ease debugging. In any case, a stateless version would be enough for most cases. Reason: In Boost.Container, the idea is to use boost::move + boost::adl_move_swap in the initial implementation, maybe some destructive move implementation in a second iteration if Boost.Move implements it. In any case those operations should be customizable by the user, independently from the type to be sorted. This also avoids unnecessary dependencies in Boost.Sort. -> Dependencies: the implementation should only rely on Boost.Core, Boost.Config, maybe Boost.TypeTraits and no heavy STL dependencies (optionally). Currently Boost.Sort depends on Serialization for BOOST_STATIC_WARNING, but I think this is easily solvable, maybe this static assert utility should go to the core/utility library. Reason: Sort is a low-level algorithm that should avoid pulling unneeded dependencies to other boost libraries. * * * I don't know if other Boost libraries could benefit from this new stable_sort implementation. Maybe it's too special and I should try to write it myself in Boost.Container. I would love to hear comments and suggestions from sort experts, this might be a common need that could enrich the Boost.Sort library. Best, Ion