Re: [boost] Integer sorting

9 Feb 2019

      The results con ska_sort are in the attached file numbers_02.txt

I say you the same things as Fenil. This is an external test of the
algorithm. The internal test is other question very different, and who have
more knowledge about these algorithms is, undoubtedly, Steven

The memory used by ska_sort to sort 100000000 uint64_t numbers is 784 MB.

To measure the memory used by a program in Linux  /usr/bin/time -f "%M KB
used"./benchmark_ska_sort_numbers

I didn't find any similar in Windows. I see the task manager, and take the
highest value in the memory used field. ( if someone knows how to do, it
will be appreciate)

To test the code, change the program benchmark_numbers of the library with
the provided in the message.

Modify the script runGCC_benchmark_numbers.sh and add the path to the
folder of your code.

The code is very easy to read and understand, modify it as you prefer.

The results of this program, on my machine are in the file numbers_02.txt
attached to this file.

My machine is

   - Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz
   - 6 cores, 12 threads
   - Caché L1d: 32K
   - Caché L1i: 32K
   - Caché L2: 256K
   - Caché L3: 15360K

Different machines implies different results

El dom., 3 feb. 2019 a las 14:24, Fenil Mehta via Boost (<
boost@lists.boost.org>) escribió:
...
Thank you for your detailed analysis.
I will review your feedback and take corrective actions as required.
If you don't mind, can you share the script or the C++ program you have
used to test the performance of all the algorithms for various input array
conditions so that I can do the same in future before any review requests.
Please tell how did you measure the memory usage and running time for each
algorithm.
Regards,
Fenil Mehta
On Thu, Jan 31, 2019 at 10:46 PM Francisco José Tapia via Boost <
boost@lists.boost.org> wrote:
...
Hi Fenil,
I examined and ran your code. About it we have two perspectives
- Internal, about the algorithm. The Steven’s opinion is the most
   qualified about the algorithm. He signed several lacks that must be
   resolved, before any serious consideration.
   - External. Test the algorithm and measure the speed, the memory used,
   The documentation and the code. I had seen a couple of questions
1.- The documentation is really poor. You need to explain questions about
the worst case, the memory used. The goal of any documentation is help to
the user to understand and run the code. If you write a documentation
where
explain something, and the final user don’t understand. It is not a good
documentation
2.- The organization of the code is messy. You must include in the code a
.cpp file. This can be a problem in a separate compilation where several
files use your code, and include the .cpp file
3.- In the command line to compile your code don’t use any optimization
parameter. It is strange, talk about the speed, and don’t use the
optimization
4.- I run your code with the other algorithms of the Boost Sort library
100 000 000 uint64_t random numbers
Fastest-Integer-Sort 3.79 secs  1565M used
pdq_sort             3.89 secs   785M used
spreadsort           4.79 secs   785M used
spin_sort            9.60 secs  1175M used
flat_stable_sort    10.68 secs   788M used
According this your code have a small difference with pdq_sort, but use
the
double of memory
5.- Many times the data are near sorted. Typically, adding data to the
begin or the end of the data sorted, or same element inside is changed.
It’s important detect and use this to reduce the time needed.
The results obtained was
************************************************************
**               B O O S T S O R T                        **
**           S I N G L E T H R E A D                      **
**      I N T E G E R   B E N C H M A R K                 **
**                                                        **
**     SORT OF 100 000 000 NUMBERS OF 64 BITS            **
************************************************************
[ 1 ] std::sort  [ 2 ] pdqsort          [ 3 ] std::stable_sort
[ 4 ] spin_sort  [ 5 ] flat_stable_sort [ 6 ] timsort
[ 7 ] spreadsort [ 8 ] Fastest-Integer-S
|      |      |      |      |      |      |      |
|
| [ 1 ]| [ 2 ]| [ 3 ]| [ 4 ]| [ 5 ]| [ 6 ]| [ 7 ]| [
8
]|
--------------------+------+------+------+------+------+------+------+------+
...
random              | 8.17 | 3.89 |10.17 | 9.60 |10.68 |12.46 | 4.79 |
3.79
...
|
|      |      |      |      |      |      |      |
|
sorted              | 1.94 | 0.14 | 2.95 | 0.07 | 0.07 | 0.07 | 0.07 |
0.54
|
sorted + 0.1% end   | 6.21 | 2.85 | 2.94 | 0.53 | 0.37 | 0.20 | 3.78 |
3.78
|
sorted + 1% end     |10.65 | 3.33 | 3.07 | 0.69 | 0.50 | 0.32 | 4.02 |
3.76
|
sorted + 10% end    | 7.64 | 4.12 | 3.91 | 1.47 | 1.40 | 1.35 | 4.79 |
4.12
|
|      |      |      |      |      |      |      |
|
sorted + 0.1% mid   | 4.70 | 3.20 | 3.14 | 1.97 | 2.54 | 2.25 | 3.78 |
3.79
|
sorted + 1% mid     | 4.37 | 3.56 | 3.29 | 2.27 | 3.18 | 2.84 | 4.28 |
3.86
|
sorted + 10% mid    | 6.36 | 4.70 | 5.17 | 4.13 | 5.65 | 6.27 | 5.79 |
4.69
|
|      |      |      |      |      |      |      |
|
reverse sorted      | 1.46 | 0.28 | 3.30 | 0.14 | 0.14 | 0.14 | 1.47 |
0.54
|
rv sorted + 0.1% end| 7.30 | 2.90 | 3.33 | 0.65 | 0.43 | 0.27 | 3.47 |
3.82
|
rv sorted + 1% end  | 5.09 | 3.32 | 3.38 | 0.79 | 0.56 | 0.39 | 3.78 |
3.82
|
rv sorted + 10% end | 4.68 | 4.15 | 4.30 | 1.58 | 1.46 | 1.44 | 4.82 |
4.19
|
|      |      |      |      |      |      |      |
|
rv sorted + 0.1% mid| 4.77 | 3.26 | 3.16 | 1.97 | 2.56 | 2.28 | 3.82 |
3.83
|
rv sorted + 1% mid  | 4.38 | 3.55 | 3.27 | 2.29 | 3.19 | 2.85 | 4.29 |
3.86
|
rv sorted + 10% mid | 6.37 | 4.62 | 5.08 | 4.11 | 5.53 | 6.19 | 5.64 |
4.57
|
--------------------+------+------+------+------+------+------+------+------+
...
jue ene 31 15:06:13 CET 2019
.
If you admit my guidance. You have the most important qualities, energy
and
...
initiative. But must learn some important questions. Be rigorous and
strict, check deeply your code. Is someone find a fail in your code, that
you didn’t detect, must be unacceptable for you.
If you are looking for something to work, consider that if many people is
working about it, except if you have a brilliant new idea, you will
discover, after a lot of work, that your results are similar. Study,
read,
learn, and when the adequate idea pass in front your eyes, catch it, and
work hard.
Yours
Francisco Tapia
El vie., 25 ene. 2019 a las 1:40, Steven Ross via Boost (<
boost@lists.boost.org>) escribió:
...
On Tue, Jan 22, 2019 at 4:09 AM Fenil Mehta via Boost
<boost@lists.boost.org> wrote:
...
I communicate with this E-mail account only: fenilgmehta@gmail.com
I have no other E-mail account.
Please forgive me for my mistyped time complexity, I have corrected
...
O(nk). I am not quite good at calculating the time complexity. Hence
...
may be an error in my time complexity but the sorting idea is good.
I have been working on my project for the past 6 months and it was
for
it
to
there
the
...
first time I heard about ska_sort. When I read its implementation, it
had
already implemented what I was planning to do in the near future.
I could not find ska_sort in the Boost library which I had recently
installed. If someone could throw some light on this, then it would
be
greatly appreciated.
I completely agree that ska_sort is highly optimized. For all arrays
of
primary data types(like int, float, char) or small objects(like
pair<int,int>) ska_sort performance is the best. However, for arrays
of
large objects, ir_sort is better.
The test cases are generated completely randomly. I do not know the
test
cases at all when I make the comparison. The source code for test
case
generation is there on the GitHub[1], you can scrutinize it and a
snippet
is as follows:
    // here I pass a vector for arr
    template<typename T>
    void fillRandArray(T &arr, const int64_t &low, const int64_t
&high,
const int64_t &randStart, const int64_t &randEnd) {
        std::random_device rd;
        std::mt19937_64 gen(rd());
        std::uniform_int_distribution<ArrayIndexType> dis(randStart,
randEnd);
for (int64_t i = low; i <= high; i++) arr[i] = dis(gen);
    }
As of my claims, let me give the clear details of the input.
ir_sort is faster when I tested it for an array of objects of size =
1000*64 bits, and array length = 2000 only.
I have to test it for other conditions.
I test using this random number generation approach (in addition to
the worst-case analysis):
https://github.com/boostorg/sort/blob/master/example/randomgen.cpp
So that I can vary the range in both the top and bottom 2 bytes of a
4-byte integer.  The same data can be cast to 8-byte if you wish.
I use this script to run through the various combinations of
distributions and different data types:
https://github.com/boostorg/sort/blob/master/tune.pl
tune.pl is also capable of tuning constants to the processor with the
-tune option.
The optimal sorting approach varies based on input array size, data
distribution, processor, and data type.  2000 randomly-distributed
elements should fit in L1 cache and be split evenly by a single radix
pass; the optimal approach for more complicated distributions and
millions of elements, sorted data, or for 256 elements may be quite
different.  spreadsort is optimized to provide good performance across
all these cases (for 256 elements it just uses pdqsort).
...
Steven, at present I have only implemented the swapping optimization
for
...
integers only. ir_sort is in the beginning phase. Once it finalized
for
integers, I will plan a solution for other data types.
I have not looked much at the worst case distribution. At present, I
am
testing it for random distributions. I will have a look at the worst
scenario and mostly sorted data.
For memory overhead, it will require two integer arrays of size "n",
and
probably it can be reduced to a single array of size "n".
Spreadsort uses just the input array and kilobytes of RAM overhead for
the bins.  For large amounts of data the RAM overhead can be
important.  Your swapping approach would probably need to change if
you sorted in-place.
...
I agree that these algorithms have been designed keeping generality
as
the
focus. However, generally, objects are indexed based on integers,
keeping
this in mind I decided to write ir_sort.
The reason it is way faster for large object arrays is that I only
make
"n"
swaps to sort the array. I use indexes to reduce the overhead of
swapping.
I believe the swapping optimization I have written is not limited to
ir_sort, it can be used along with other sorting algorithms to
improve
their running speed for large objects.
Francisco, thank you for your quick response. I wait for your
feedback.
[1]
https://github.com/fenilgmehta/Fastest-Integer-Sort/blob/b11ae9ddb3c9cc7e9d1...
...
...
...
[2] https://github.com/fenilgmehta/Fastest-Integer-Sort/
Regards,
Fenil Mehta
On Mon, Jan 21, 2019 at 8:18 PM Steven Ross via Boost <
boost@lists.boost.org>
wrote:
...
Fenil,
Based on the description, this looks like spreadsort without the
worst
case
analysis, and with a new swapping optimization (I know there is
room
for
improvement in the swapping).  I expect this algorithm will perform
significantly worse than std::sort (or pdqsort) in the worst-case
situations, without applying similar worst-case analysis.
spreadsort
is
comparable to std::sort in the worst case distribution for
spreadsort
because of careful analysis and threshold optimization.  If you
tweaked the
spreadsort constants
<
https://github.com/boostorg/sort/blob/master/include/boost/sort/spreadsort/d...
...
...
...
...
...
I
think you should be able to get comparable performance to your
algorithm,
minus the impact of your swapping optimization.
How does your swapping optimization perform on mostly-sorted data?
spreadsort does well in this common case relative to std::sort,
which
is
somewhat tricky with in-place radix sorting.
You should try seeing how your algorithm performs on the
distributions
in
https://github.com/boostorg/sort/blob/master/example/alrbreaker.cpp
and
...
https://github.com/boostorg/sort/blob/master/example/binaryalrbreaker.cpp,
...
...
...
...
tweaking those to hit your worst case.
What's the memory overhead?
What's the threshold size at which it starts becoming faster than
std::sort
and pdqsort?  comparison-based sorting is surprisingly fast on
small
lists
of integers that fit on L1 cache.
How well does it handle common prefixes in string sorting?
string_sort is
optimized to handle that case extremely quickly.  It's a serious
performance issue if you don't optimize for it.
I wish more people realized the generality of these algorithms; if
you
really want a fast sort, you can use spreadsort (or an equivalent
algorithm), you just need to define a way to index into the key.
This
was
published a while ago in Engineering Radix Sort, but most people
use
comparison sorts because they're easier to use (and for most
applications,
sorting compute time is minor).
On Mon, Jan 21, 2019, 3:29 AM Francisco José Tapia via Boost <
boost@lists.boost.org> wrote:
...
Hi Fenil,
I am Francisco Tapia, one of the maintainers of the Sort Library.
I had been watching your code, and find it interesting. But give
me a
week
to say you something. These days I am very busy, and I expect to
have
time
the next weekend, and examine and test your code.
Yours
Francisco Tapia
El lun., 21 ene. 2019 a las 8:41, Richard Hodges via Boost (<
boost@lists.boost.org>) escribió:
> On Sat, 29 Dec 2018 at 03:36, Fenil Mehta via Boost <
boost@lists.boost.org
> >
> wrote:
>
> > I have written a sorting algorithm which is way faster than
std::sort
for
> > all array size.
>
>
> "Extraordinary claims require extraordinary evidence" - Carl
Sagan
>
> https://en.wikipedia.org/wiki/Sagan_standard
>
>
>
>
> > The link is as follows:
> > https://github.com/fenilgmehta/Fastest-Integer-Sort
> >
> > Some guidance would be appreciated on how to improve the
project
> structure
> > and the steps to get it included in boost.
> >
> > Regards,
> > Fenil Mehta
> >
> > _______________________________________________
> > Unsubscribe & other changes:
> > http://lists.boost.org/mailman/listinfo.cgi/boost
> >
>
>
> --
> Richard Hodges
> hodges.r@gmail.com
> office: +442032898513
> home: +376841522
> mobile: +376380212 (this will be *expensive* outside Andorra!)
> skype: madmongo
> facebook: hodges.r
>
> _______________________________________________
> Unsubscribe & other changes:
> http://lists.boost.org/mailman/listinfo.cgi/boost
>
_______________________________________________
Unsubscribe & other changes:
http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________
Unsubscribe & other changes:
http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________
Unsubscribe & other changes:
http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________
Unsubscribe & other changes:
http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________
Unsubscribe & other changes:
http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________
Unsubscribe & other changes:
http://lists.boost.org/mailman/listinfo.cgi/boost