On Mon, Oct 2, 2017 at 7:30 PM, Thorsten Ottosen
Hi,
Some more points to consider. I tried comparing push_back/unsafe_push_back a little. I don't see any performance speedup on my system if the loop body is just the push_back. If I make the loop a little more complicated than just a push_back, I get some differences:
Hi,
My concern with the above benchmark is that it does not compare the same thing: it calls reserve for some containers but not for others (std::deque and boost::container::deque have no reserve).

I tried a similar test using google benchmark, comparing devector unsafe_push_back against push_back on std::vector and boost::container::vector. See the code attached.

This is how I run it:

    $ uname -r
    3.13.0-24-generic

    $ g++ --version
    g++-5 (Ubuntu 5.4.1-2ubuntu1~14.04) 5.4.1 20160904

    $ g++ -Wall -Werror -Wextra -std=c++11 -O2 -march=native \
          -I ../include/ -I ../../google-benchmark/include/ -I $BOOST_BUILD_PATH \
          -L ../../google-benchmark/src/ google_push_back.cpp \
          -DNDEBUG -lbenchmark -lpthread

    $ sudo su
    # echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
    # nice -20 taskset -c 0 ./a.out
    Run on (8 X 3500 MHz CPU s)
    2017-10-02 23:55:35
    ***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.

(I consider this warning a false positive. CPU0 has no scaling, and this task is pinned there.)

    -----------------------------------------------------------------------
    Benchmark                              Time           CPU    Iterations
    -----------------------------------------------------------------------
    devector_unsafe_push_back/8          785 ns        782 ns        897711
    devector_unsafe_push_back/64         846 ns        844 ns        819441
    devector_unsafe_push_back/512       1219 ns       1216 ns        582811
    devector_unsafe_push_back/4096      4207 ns       4238 ns        164192
    devector_unsafe_push_back/32768    30885 ns      30876 ns         24689
    vector_push_back/8                   796 ns        793 ns        890232
    vector_push_back/64                  967 ns        965 ns        724603
    vector_push_back/512                2109 ns       2104 ns        335052
    vector_push_back/4096              11279 ns      11300 ns         61336
    vector_push_back/32768             84355 ns      84253 ns          8332
    cvector_push_back/8                  791 ns        788 ns        883267
    cvector_push_back/64                 861 ns        858 ns        809779
    cvector_push_back/512               1330 ns       1326 ns        527179
    cvector_push_back/4096              5081 ns       5109 ns        135001
    cvector_push_back/32768            34288 ns      34286 ns         20202

    # echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor

A re-run gives very similar results.

I think the numbers are convincing in favor of devector::unsafe_push_back. Please repeat the test.

Thanks,
Benedek
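
For readers without the attachment, the kind of comparison discussed above can be sketched roughly as follows. This is not the attached benchmark and does not use google benchmark: tiny_buffer, fill_checked, fill_unchecked and time_ns are hypothetical names invented for illustration, and tiny_buffer is only a stand-in for a container with an unchecked append, not the real devector. The point it tries to show is that both sides get their storage up front, so the only difference left in the loop is the capacity check inside push_back.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for a container offering an unchecked append.
// This is NOT the devector implementation, only an illustration.
class tiny_buffer {
public:
    explicit tiny_buffer(std::size_t capacity)
        : data_(new int[capacity]), size_(0), capacity_(capacity) {}

    // Unchecked append: the caller guarantees size() < capacity().
    void unsafe_push_back(int value) {
        assert(size_ < capacity_);  // compiled out under -DNDEBUG
        data_[size_++] = value;
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
    int operator[](std::size_t i) const { return data_[i]; }

private:
    std::unique_ptr<int[]> data_;
    std::size_t size_;
    std::size_t capacity_;
};

// Fill a std::vector that has reserved its storage up front, using the
// ordinary push_back, which still checks capacity on every call.
std::vector<int> fill_checked(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    for (std::size_t i = 0; i < n; ++i) v.push_back(static_cast<int>(i));
    return v;
}

// Fill a tiny_buffer of the same capacity through the unchecked append.
tiny_buffer fill_unchecked(std::size_t n) {
    tiny_buffer b(n);
    for (std::size_t i = 0; i < n; ++i) b.unsafe_push_back(static_cast<int>(i));
    return b;
}

// Wall-clock duration of a single call to f, in nanoseconds. A real
// benchmark would repeat the call many times and keep the optimizer
// from discarding the work; this helper does neither.
template <class F>
long long time_ns(F f) {
    auto start = std::chrono::steady_clock::now();
    f();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start).count();
}
```

Calling time_ns([] { fill_checked(4096); }) and time_ns([] { fill_unchecked(4096); }) gives a rough single-shot comparison. A single timing like this is noisy, so it is only meant to show the shape of the test, not to reproduce the numbers above.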