Thanks for the thorough review.

On 26-09-2017 at 23:43, Joaquin M López Muñoz via Boost wrote:
> 3. The question arises of whether segment access can gain us some speed.
> I've written a small test to measure the performance of a plain
> std::for_each loop over a batch_deque vs. an equivalent sequence of
> segment-level loops (attached, batch_deque_for_each.cpp). This is what
> I got for Visual C++ 2015 32-bit (x86) release mode on a Windows 7
> 64-bit box with an Intel Core i5-2520M @2.5GHz:
> [](int x){return x;}
> segment size: 32
> n    plain   segmented
> 10E3 25.5472 23.6305
> 10E4 24.5778 23.6907
> 10E5 24.5821 22.8076
> 10E6 25.5007 23.1037
> 10E7 27.1452 24.0339
> segment size: 512
> n    plain   segmented
> 10E3 23.8384 23.6638
> 10E4 23.0284 23.8705
> 10E5 22.8449 22.8187
> 10E6 23.8485 23.7454
> 10E7 24.1711 23.5404
> [](int x){return x%4?x:-x;}
> segment size: 32
> n    plain   segmented
> 10E3 33.9795 23.6662
> 10E4 32.4817 24.023
> 10E5 32.8731 23.3803
> 10E6 33.5396 22.9298
> 10E7 33.1034 23.0206
> segment size: 512
> n    plain   segmented
> 10E3 25.0623 23.3205
> 10E4 25.1048 23.5812
> 10E5 25.3343 21.7686
> 10E6 25.6961 22.4639
> 10E7 25.8664 22.9964
For 32-bit release mode on Windows 7 64-bit with an Intel i7-2700K:
[](int x){return x;}
segment size: 32
n plain segmented
10E3 21.4589 21.8351
10E4 19.9545 20.5133
10E5 19.4889 20.6197
10E6 19.2552 19.6976
10E7 19.2919 19.5425
segment size: 512
n plain segmented
10E3 20.2503 20.6372
10E4 19.0234 19.3367
10E5 18.5394 18.6171
10E6 18.555 18.5816
10E7 19.0918 19.1833
[](int x){return x%4?x:-x;}
segment size: 32
n plain segmented
10E3 28.743 19.7501
10E4 26.8371 19.0719
10E5 27.0304 18.7624
10E6 26.9561 18.2357
10E7 27.2985 18.6425
segment size: 512
n plain segmented
10E3 22.1073 20.0347
10E4 20.7825 19.5639
10E5 20.6122 18.0773
10E6 20.6039 18.4895
10E7 21.7964 19.1822
So basically the same as your results. The case for segment size 32 and
a non-trivial lambda does show some speedup, doesn't it?
For 64-bit release mode on Windows 7 64-bit with an Intel i7-2700K:
[](int x){return x;}
segment size: 32
n plain segmented
10E3 34.748 21.1357
10E4 32.8879 19.8592
10E5 32.6779 18.955
10E6 32.6255 19.3307
10E7 33.2282 19.3158
segment size: 512
n plain segmented
10E3 28.442 20.0265
10E4 26.5783 18.5851
10E5 26.4857 18.6023
10E6 26.4884 18.6571
10E7 27.0076 19.1338
[](int x){return x%4?x:-x;}
segment size: 32
n plain segmented
10E3 43.0149 18.8431
10E4 42.2736 18.5071
10E5 42.4035 18.7087
10E6 42.1964 18.3355
10E7 42.8113 18.7723
segment size: 512
n plain segmented
10E3 40.3695 19.0028
10E4 38.5371 18.2029
10E5 38.2163 17.85
10E6 38.2952 17.9199
10E7 38.7489 18.6342
I don't know why a 64-bit program would be slower, but there seems to be
a larger difference here.
I'm wondering how the results would be on 32/64 bit ARM.
Also, I expect a benchmark of serialization to show a much larger
benefit. I don't think one can do serialization optimally without access
to the segments.
Benedek, could you please write a test of serialization performance for
devector/batch_deque vs. boost::vector/boost::deque (release mode, full
speed optimization), perhaps using the same measuring technique as
employed by Joaquin? Then post the results and code so people can run it
on their favorite system. You should use the types char, int, and
something bigger, e.g. string or array.