On 03/26/17 06:02, Vladimir Batov via Boost wrote:
I've been using boost::container::flat_set with quite a success until recently when I profiled my app to see that 30-50% of the run-time was spent in insert() for a container of around 50,000 in size.
So, I had to replace boost::container::flat_set with a vector variation which inserts as a vector (no sorting). However, when it is the time to call begin(), end(), find(), it sorts once. That shaved the mentioned overhead off. Say, one run is down from 17s to 10s and the likes.
AFAIU, flat_set should not sort on insert. It should do lower_bound to find the insert position, which is O(log(N)). Sorting instead would give O(N*log(N)). What you're probably seeing is the overhead that comes from inserting in the middle of the vector that is used by flat_set internally. This especially matters if elements have inefficient move/copy. Flat containers are not well suited for frequent modifications, so that is kind of expected.
I wonder if that should be the boost::container::flat_set default behavior rather than sorting the underlying container with every insert(). After all, as a user I do not care if the underlying container is sorted... until I start using/traversing it, right?
No, this won't work in general because these methods are not supposed to modify the container. In particular, they don't invalidate iterators or references, and calling these methods concurrently is thread-safe.
Or maybe I am missing something in boost::container::flat_set which already allows me to do something of that sort?
You can try using the methods and constructors that accept the ordered_unique_range tag. If you can prepare a sequence of ordered and unique elements, you can call insert(ordered_unique_range, s.begin(), s.end()) or similarly construct the flat_set to avoid ordering the sorted elements.