Hi Steven, This Christmas I had been working hard. I developed a new version of the merge sort about 10-15% faster than the previous, using an additional memory only ½ of the memory used by the data. The GCC and TBB algorithms use an additional memory as the size of the data. This version is exception safe. If an exception occurs due to the algorithm ( only throw exceptions in the allocation and deallocation of the additional memory), the integrity of the data is guarantee. You have all the original data (unordered in the allocation, and ordered in the deallocation) I examined the TBB version and the secret is the sample sort algorithm. The version of TBB is provisional and it's not exception safe. I am working in the design of a new version of this algorithm exception safe. The actual version of the parallel stable algorithm is about 20% slower than the TBB version in the worse case, but when the size of the elements grows, the indirect version is faster than all the others, with 64 bytes, is around 10% faster than the best TBB version, with 128 bytes , the best version of TBB is around a 90% slower, and with 256 is 240% slower. An additional advantage of the indirect sort is they use additional memory for the pointers, not for the data, and the additional memory is around a 10% of the used by the data. I hope to have the new algorithm in two weeks, more or less. Then I will put all in the name space boost::sort, and design all the test and benchmarks, as commented in the previous message. Yours Francisco