Hi,
Flat containers are not well suited for frequent modifications, so that is kind of expected.
Understood. The described use-case constructs the flat_set only once (but a big one and one insert at a time). So, on the surface all appeared by the book -- constructed once, traversed/searched often. Still, the observed construction overhead really stunned me.
I think the issue is that you use the wrong insert method. a single insert is O(N)in the size of the container (worst case: insert sorted from largest to smallest element), so inserting K element into an empty flat set has runtime O(K^2). You are essentially doing insertion sort on your data. flat_set has range-insert: template<typename InputIterator> void insert(InputIterator, InputIterator); template<typename InputIterator> void insert(ordered_unique_range_t, InputIterator, InputIterator); inserting a range of size K in a flat_set of size N should have complexity O(K log(K) + K+N) in the first case (implementable as: copy all new elements to the end, sort the new elements and perform an inplace-merge). in the second call, only a simple merge is needed, runtime O(K+N). Best, Oswin