What would be interesting to see is a benchmark against TBB's (Threading Building Blocks) concurrent_vector (and maybe some of the other containers in that library), as it has a similar growth strategy (and other characteristics) while allowing for (some) concurrent access...
Actually, TBB's concurrent_vector doesn't allow erasures other than clear(), so it wouldn't be suitable for comparisons in the situations where you'd use a colony. Interesting structure though. A colony should be able to support more concurrent accesses than an equivalently-multithreaded vector, as the block-based approach means you can have mutexes on individual blocks rather than on the whole container. Some reads and writes can also occur at the same time, provided they touch different blocks.
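A rough sketch of the per-block mutex idea, for illustration only. This is not colony's actual code: `blocked_store`, `BlockSize` and the fixed block count are all hypothetical, and the block list never grows here, so only the blocks themselves need locking.

```cpp
#include <array>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Hypothetical illustration: a fixed set of blocks, each guarded by its own mutex.
template <typename T, std::size_t BlockSize = 4>
class blocked_store {
    struct block {
        std::mutex guard;                  // protects only this block's items
        std::array<T, BlockSize> items{};  // value-initialised elements
    };

    // Assumption: the list of blocks is built once and never resized,
    // so it can be read from several threads without a lock of its own.
    std::vector<std::unique_ptr<block>> blocks_;

public:
    explicit blocked_store(std::size_t block_count) {
        for (std::size_t i = 0; i < block_count; ++i)
            blocks_.emplace_back(new block());
    }

    // Locks only the block that holds `index`; other blocks stay available.
    void write(std::size_t index, const T& value) {
        block& b = *blocks_[index / BlockSize];
        std::lock_guard<std::mutex> lock(b.guard);
        b.items[index % BlockSize] = value;
    }

    T read(std::size_t index) {
        block& b = *blocks_[index / BlockSize];
        std::lock_guard<std::mutex> lock(b.guard);
        return b.items[index % BlockSize];
    }

    std::size_t capacity() const { return blocks_.size() * BlockSize; }
};
```

With something like `blocked_store<int> s(2);`, a thread calling `s.write(0, 7)` and another calling `s.read(5)` take different mutexes, so neither waits on the other. A vector behind a single mutex would serialise both calls.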
Benchmarking anything against std::deque on Windows is a rather futile exercise as the implementation is broken, the result of a maximum chunk size of 16 bytes (no, no typo) or the size of the (one) object if larger. Changing this is on the M$ to-do list, but it will not happen until a major version change (source: STL)...
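To put numbers on that, here is a simplified sketch of the rule as described above (16 bytes per chunk, or a single element if the element is bigger). The helper name is made up for illustration and the real library header may round differently for odd element sizes.

```cpp
#include <cstddef>
#include <cstdint>

// Assumption: chunk budget taken from the description above, not read from any header.
constexpr std::size_t chunk_bytes = 16;

// Hypothetical helper: how many elements of type T fit in one deque chunk
// under the "16 bytes, or one element if larger" rule.
template <typename T>
constexpr std::size_t elements_per_chunk() {
    return sizeof(T) >= chunk_bytes ? 1 : chunk_bytes / sizeof(T);
}

// A 64-byte payload, standing in for a typical small struct.
struct payload {
    char bytes[64];
};
```

So a deque of 32-bit integers gets 4 elements per chunk, a deque of 64-bit values gets 2, and anything of 16 bytes or more gets one allocation per element, which is roughly the memory behaviour of a linked list.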
Yes, I noticed the MSVC deque performance results were weak. That's a pretty extraordinarily bad implementation! Wow. That's why the main benchmarks are under GCC. I've noticed deque is actually better than vector under GCC in most circumstances, though I don't know how that holds up under vectorisation. M@