Vladimir,
On Thu Nov 13 2014 at 2:07:53 AM Vladimir Prus
If that's what you're looking for, I can create the equivalent graph for string_sort too. The crossover point of about 1000 elements is quite visible, and both std::sort and integer_sort have some minor parallelization inefficiency, but I see no surprises, and expect none for string_sort.
Steven,
what input data distribution is this? Could you run the tests with worst-case data distribution for your algorithm?
The input data distribution was evenly distributed 32-bit random data. The absolute worst-case distribution I've been able to design for spreadsort is binaryalrbreaker with an ALR_THRESHOLD of 3. With 32-bit integers, it has 512 elements, and takes 15 microseconds for integer_sort, and 14 microseconds for std::sort. With 64-bit integers, it has 131072 elements, and takes 2.8ms for integer_sort, and 1.6ms for std::sort. If you add or subtract any elements, it won't be the worst-case distribution! If I simply double the number of elements, 64-bit integer_sort takes 5.35ms, and std::sort takes 4.56ms. I'm not sure what kind of graph would make sense with this type of situation. I'd also like to note that this distribution is specifically designed to make integer_sort slow, and cannot be scaled up without improving the relative speed of integer_sort.