Hi,

Some more points to consider.

I tried comparing push_back/unsafe_push_back a little. I don't see any
speedup on my system if the loop body is just the push_back. If I make
the loop a little more complicated than just a push_back, I get some
differences:

boost::vector::push_back vs devector::unsafe_push_back

32 bit:
          boost::vector    devector
  10E3    29.0508          24.8578
  10E4    26.9987          22.7853
  10E5    26.9468          22.3659
  10E6    26.9766          22.3562
  10E7    27.1064          22.5215

64 bit:
          boost::vector    devector
  10E3    28.6106          24.2881
  10E4    27.9322          23.6613
  10E5    28.0182          23.3318
  10E6    28.0493          23.3803
  10E7    28.1458          23.5394

boost::deque::push_back vs batch_deque::unsafe_push_back

32 bit:
          boost::deque     batch_deque
  10E3    34.803           45.6774
  10E4    36.0686          46.3888
  10E5    36.7773          48.6228
  10E6    40.7754          49.6144
  10E7    43.1134          49.1949

64 bit:
          boost::deque     batch_deque
  10E3    33.5997          30.0433
  10E4    35.5992          32.8985
  10E5    34.4685          48.9641
  10E6    40.7974          48.6893
  10E7    43.2836          48.9001

I'm not 100% sure why batch_deque is so much slower in most of these
runs, but from a brief look at the generated assembly, it seems that
going from an iterator to a pointer may be expensive, because
batch_deque::iterator stores a segment pointer and an index. (For the
same reason, the normal push_back in batch_deque also appears somewhat
slower than in boost::deque.)

kind regards

Thorsten
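
P.S. To make the iterator-to-pointer point concrete, here is a minimal
sketch. It is not the batch_deque source: the type names, members and
the demo function are all hypothetical, and I am only assuming a
"segment pointer + index" layout as described above. It shows why the
conversion needs an extra load and an add, compared with an iterator
that caches the element address.

```cpp
#include <cstddef>

// Hypothetical layout, NOT the real batch_deque::iterator:
// a pointer into the segment map plus an index inside the segment.
struct seg_index_iterator {
    int** segment;      // slot in the map that holds the segment's base address
    std::size_t index;  // offset of the element inside that segment

    // Converting to a pointer loads *segment and then adds the index.
    int* to_pointer() const { return *segment + index; }
};

// Hypothetical alternative: the element address is stored directly.
struct seg_ptr_iterator {
    int** segment;  // still needed to step to the next segment
    int* current;   // address of the element itself

    // Converting to a pointer is just a read of a member.
    int* to_pointer() const { return current; }
};

// Builds two small segments and checks that both iterator layouts
// refer to the same element (the third element of the second segment).
int demo() {
    static int seg0[4] = {0, 1, 2, 3};
    static int seg1[4] = {4, 5, 6, 7};
    static int* map[2] = {seg0, seg1};

    seg_index_iterator a{map + 1, 2};
    seg_ptr_iterator b{map + 1, seg1 + 2};

    return (a.to_pointer() == b.to_pointer()) ? *a.to_pointer() : -1;
}
```

Whether this extra load really explains the numbers above would need
to be confirmed by profiling; the sketch only illustrates the
difference in the conversion itself.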