On 25 July 2013 20:15, Steven Ross
On Tue, Jul 23, 2013 at 11:19 AM, Jeremy Murphy < jeremy.william.murphy@gmail.com> wrote:
Yes, radix sorts work great when the data type is small (such as 8-bit or 16-bit integers). To get a better comparison against std::sort for the larger integers, I suggest you use data in a range roughly matching the size of the data type (what is RAND_MAX on your system? It's usually less than 1 << 16).
Actually, RAND_MAX = INT_MAX = (1 << 31) - 1 on my system (glibc-2.15). So how do I generate random 64-bit numbers... (rand() << 32) + rand()? Bah, I should just be using Boost.Random! As a note, MSD radix sort is extremely fast on the evenly-distributed
random data that it looks like you're using. I test Spreadsort with a bunch of uneven distributions that make the problem harder, in addition to a few tests with evenly distributed random data. LSD radix sort doesn't care much about distribution, but most other algorithms (including std::sort) do.
Yes, I understand that's the assumption that bucket sort works on. So I guess that's also an argument for having multiple, different implementations. I'm keen to do a comprehensive comparative performance test, all the algorithms across different distributions, but it will take some time. I've also been thinking that the LSD radix-sort can be optimized (especially for large-width integers) where k <= (k_max >> 1). That is, where some of the most-significant digits are not being used, the number of column/digit sort passes with counting-sort can potentially be reduced. Anyway, maybe in version 2.0. :) Jeremy