[container] small_flat_set and small_flat_map
Given that Boost.Container has a small_vector now, would a piece of work that changed flat_set and flat_map to use small_vector<0> internally be feasible? I ask because I feel like if this was the case it would be almost trivial to expose flat_tree<N> through small_flat_set<N> and small_flat_map<N> types to provide the small size optimisation for these two containers.
On 15/10/2016 10:27, Sam Kellett wrote:
Given that Boost.Container has a small_vector now, would a piece of work that changed flat_set and flat_map to use small_vector<0> internally be feasible? I ask because I feel like if this was the case it would be almost trivial to expose flat_tree<N> through small_flat_set<N> and small_flat_map<N> types to provide the small size optimisation for these two containers.
It wouldn't be hard. Some have suggested that flat containers could be adaptors so that the underlying vector-like container could be customized. In any case, the idea of making flat containers compatible with small_vector seems interesting. Please fill a ticket with the request so that it is not forgotten. Best, Ion
done: https://svn.boost.org/trac/boost/ticket/12536
i'm happy to take a stab at implementing this if you're happy to review it?
On 16 October 2016 at 21:09, Ion Gaztañaga
On 15/10/2016 10:27, Sam Kellett wrote:
Given that Boost.Container has a small_vector now, would a piece of work that changed flat_set and flat_map to use small_vector<0> internally be feasible? I ask because I feel like if this was the case it would be almost trivial to expose flat_tree<N> through small_flat_set<N> and small_flat_map<N> types to provide the small size optimisation for these two containers.
It wouldn't be hard. Some have suggested that flat containers could be adaptors so that the underlying vector-like container could be customized. In any case, the idea of making flat containers compatible with small_vector seems interesting. Please fill a ticket with the request so that it is not forgotten.
Best,
Ion
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman /listinfo.cgi/boost
On Wed, 19 Oct 2016 at 13:21 Sam Kellett
done: https://svn.boost.org/trac/boost/ticket/12536
i'm happy to take a stab at implementing this if you're happy to review it?
While I hesitate to assume I know the best solution to anyone else's particular problem, I have to question the real usefulness of this in terms of flat_set and flat_map. Typically, such a small space optimisation is applicable when 90% of one's use cases fall within the small size. For strings and vectors, this makes sense because there could be anywhere from a few items, to many items. But, typically set or map is chosen precisely because there will be many items in the container, otherwise you would choose something else, like an unsorted array like structure. So, to me the case of optimising for the small size makes little sense. -- chris
On 10/19/16 20:51, Chris Glover wrote:
But, typically set or map is chosen precisely because there will be many items in the container, otherwise you would choose something else, like an unsorted array like structure. So, to me the case of optimising for the small size makes little sense.
I would say that it's likely the opposite - larger flat containers will have more expensive insert and erase. Flat containers are especially useful with small number of (small) elements because they are cache-friendly, and small size optimization makes perfect sense to improve that use case even further.
I would say that it's likely the opposite - larger flat containers will have more expensive insert and erase.
That's true, but often the data is relatively stable, in which case the flat containers are a good choice.
Flat containers are especially useful with small number of (small) elements because they are cache-friendly, and small size optimization makes perfect sense to improve that use case even further.
It is my experience that the flat containers *always* outperform the node based containers on searching. So when the data is stable, I always use a flat container regardless of how many elements are in the container. One of the problems with that approach is that, because insert is slow, I often have to build up a std::vector of items, sort it and then copy them into the flat container using the ordered_unique_t overload. If this change was done in a way that allowed me to pass ownership of that sorted vector to the flat_* container, that would be an additional optimisation I would find useful. -- chris
On 19 October 2016 at 19:42, Chris Glover
[snip]
Flat containers are especially
useful with small number of (small) elements because they are cache-friendly, and small size optimization makes perfect sense to improve that use case even further.
It is my experience that the flat containers *always* outperform the node based containers on searching. So when the data is stable, I always use a flat container regardless of how many elements are in the container.
One of the problems with that approach is that, because insert is slow, I often have to build up a std::vector of items, sort it and then copy them into the flat container using the ordered_unique_t overload. If this change was done in a way that allowed me to pass ownership of that sorted vector to the flat_* container, that would be an additional optimisation I would find useful.
yeah i agree, although that's a different piece of work, no? i've mentioned this before[1] and i'm pretty sure it's on the roadmap (i've seen it mentioned since). [1] http://lists.boost.org/Archives/boost/2015/09/225568.php
On 19-10-16 19:51, Chris Glover wrote:
But, typically set or map is chosen precisely because there will be many items in the container, otherwise you would choose something else, like an unsorted array like structure. So, to me the case of optimising for the small size makes little sense.
This is anecdotal only. You have to ask WHY is "set or map [typically]
chosen precisely because...". The main reason might well be "because the
cost of node-based data structures is too high for small volumes" not
because they're not a good match.
Quite often I need small collections of characters (say, characters to
be escaped) which are logically best expressed as a set.
It's just a lot more readably to use a set than to use e.g. a your
small_vector<char> and manually maintain sort order and deduplication.
The only reason NOT to use a set when logically it is a set was performance.
Tiny local maps are frequent occurrence too. I find myself writing
linear searches across local vector . Imagine how much better it would be with `small_flat_map Summarizing: I find myself focusing on expressive code and often find
myself hindered by the lack of low-cost small associative/ordered
containers.
My $0.02
Seth
On 16-10-16 22:09, Ion Gaztañaga wrote:
Some have suggested that flat containers could be adaptors so that the underlying vector-like container could be customized
Measurements often show that ordering keys for lookup doesn't yield benefit at very small n (because linear prefetch/branch reduction favours linear search). I'd love to see a flat_map/flat_set that - besides being configurable for their storage strategy, also can optionally be configured to not be sorted.
participants (5)
-
Andrey Semashev
-
Chris Glover
-
Ion Gaztañaga
-
Sam Kellett
-
Seth