While working on the corrections suggested on my sorting project, I tried
to test the swapping optimization on timsort and compared it with other
sorting algorithms.
I used "benchmark_objects.cpp" file in the boost library for the testing
purpose.
I have attached the results in this email.
Summary -
When arrays comparison is done by calculating the sum of all the numbers of
the array, following was the result:
1) timsort with swapping optimization gives the best result, total time =
17.3, average time = 1.15
2) flat_stable_sort gave the second best result, total time = 33 seconds,
average time = 2.2 seconds
When arrays comparison is done between the first element of the array, as a
key, following was the result:
1) timsort with swapping optimization gives the best result, total time =
3.52 seconds, average time = 0.23 seconds
2) pdqsort gave the second best result, total time = 4.6 seconds, average
time = 0.31 seconds
I have not yet pushed the swapping function on my GitHub. Please let me
know if any queries.
Regards,
Fenil Mehta
On Tue, Feb 12, 2019 at 9:21 AM Steven Ross via Boost
On Sat, Feb 9, 2019 at 11:45 PM degski via Boost
wrote: On Sat, 9 Feb 2019 at 14:39, Francisco José Tapia via Boost < boost@lists.boost.org> wrote:
The results con ska_sort are in the attached file numbers_02.txt
Thanks for adding this [and it was worthwhile I would say], this is a
great
table and shows ska_sort is weak on sorted data [maybe that's fixable], but otherwise better than [Boost] Spreadsort and Fastest-Integer-Sort [sic] and overall fastest on random data.
I say you the same things as Fenil. This is an external test of the
algorithm. The internal test is other question very different ...
I don't understand why the "internal test" [unless you mean testing for bugs] has any relevance.
It looks like ska_sort is a radix sort designed mostly for the best-case, and that if you feed it the alrbreaker or binaryalrbreaker distributions it will perform poorly (though not as poorly as radix implementations with no fallback or that fallback to insertionsort). There might be some counting and swapping optimizations here to copy into spreadsort (I see block operations), though those might not work if the items being sorted are references or pointers.
Mostly-sorted data is a common case to take seriously, and it took only modest tweaking of spreadsort to do well there.
Based on Francisco's numbers, I should retune spreadsort to take advantage of the higher speed of pdqsort vs std::sort, at least if I want to optimize for 64-bit integer sorting on modern processors.
https://www.boost.org/doc/libs/1_67_0/boost/sort/spreadsort/detail/constants... A higher log_min_split_count (especially) and log_finishing_count is probably appropriate.
degski PS: Please don't top-post, it's just so messy. -- *“If something cannot go on forever, it will stop" - Herbert Stein*
_______________________________________________ Unsubscribe & other changes:
http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost