On 21/09/2015 20:53, Phil Endecott wrote:
Ion Gazta?aga wrote:
I'm trying to improve insertion times for flat associative containers. In range insertions one optimization is to insert the whole range in the end of the container and sort the whole vector instead of trying a loop.
Hi Ion,
We've discussed this before...
Why do you (still) want to sort the whole container, rather than just sorting the new items and then using inplace_merge?
Well, I actually want to have both merge and sort. And yes, if understand this correctly sort + merge would be faster. In any case the idea is to have both (sort and merge) as adaptable algorithms, so that the extra capacity of the vector can be used to speed up the both sort and merge. For already sorted inputs and multikey containers, when size()+std::distance(begin, end) > capacity(), then a usual O(N) merge can be done in the newly allocated memory. The latest version of flat containers (Boost 1.59) already does this.
The idea of using the allocated-but-unused part of the buffer for temporary storage is an interesting one, which would benefit inplace_merge as well as stable_sort. Have you tried to quantify the benefits?
Not yet, but if known algorithms are used, the difference between inplace_sort|merge and sort|merge is theoretically big. Merge becomes O(N) instead of O(NlogN). Sort becomes O(logN) instead of O(N(logN)^2). When mergingsorting a range bigger than the extra capacity, steps up the the extra capacity would be O(N) and the last merge step would be N*logN. There are some mergesort implementations that only require N/2 extra memory, which really makes the extra capacity more attractive: "Fast mergesort implementation based on half-copying merge algorithm" http://kicia.ift.uni.wroc.pl/algorytmy/mergesortpaper.pdf. When growing the vector, the extra capacity is sometimes quite big (depending on the growth factor). I think speedups will be noticeable. Obviously, nothing is real until it's measured. There are C++ implementations of stable sort algorithms which claim they beat std::stable_sort, only require O(1) extra memory (although I guess it's really O(logN) extra memory if recursion is used), and still guarantee O(N*log(N)) complexity instead of O(N*(logN)^2). Based on these papers: http://www.math.ntua.gr/~symvonis/publications/c_1994_S_Optimal%20Stable%20M... https://github.com/BonzaiThePenguin/WikiSort/blob/master/tamc2008.pdf http://comjnl.oxfordjournals.org/content/35/6/643.full.pdf In any case those implementations are mainly tested with integers, they use memcpy so work only with PODs and completely ignore exception safety. But fixing that seems quite easier than writing a new algorithm ;-) Best, Ion PS: Extra memory is also currently used by an undocumented function, called "insert_ordered_at" that calculates an array of positions where the new values should be inserted, and then only moves already inserted elements once which minimizes copies to O(N).