Re: [boost] radix sort

26 Jul 2013

      On 25 July 2013 20:15, Steven Ross <spreadsort@gmail.com> wrote:
...
On Tue, Jul 23, 2013 at 11:19 AM, Jeremy Murphy <
jeremy.william.murphy@gmail.com> wrote:
Yes, radix sorts work great when the data type is small (such as 8-bit or
16-bit integers).  To get a better comparison against std::sort for the
larger integers, I suggest you use data in a range roughly matching the
size of the data type (what is RAND_MAX on your system?  It's usually less
than 1 << 16).
Actually, RAND_MAX = INT_MAX = (1 << 31) - 1 on my system (glibc-2.15).  So
how do I generate random 64-bit numbers... (rand() << 32) + rand()?  Bah, I
should just be using Boost.Random!

As a note, MSD radix sort is extremely fast on the evenly-distributed
...
random data that it looks like you're using.  I test Spreadsort with a
bunch of uneven distributions that make the problem harder, in addition to
a few tests with evenly distributed random data.  LSD radix sort doesn't
care much about distribution, but most other algorithms (including
std::sort) do.
Yes, I understand that's the assumption that bucket sort works on.  So I
guess that's also an argument for having multiple, different
implementations.  I'm keen to do a comprehensive comparative performance
test, all the algorithms across different distributions, but it will take
some time.

I've also been thinking that the LSD radix-sort can be optimized
(especially for large-width integers) where k <= (k_max >> 1).  That is,
where some of the most-significant digits are not being used, the number of
column/digit sort passes with counting-sort can potentially be reduced.
 Anyway, maybe in version 2.0.  :)

Jeremy