[container] moving a pre-built vector into a flat_set
a common problem i encounter is that i have two stage's w/r/t to a large collection of data: building and querying. vector is obviously the king of building. flat_set is the king of querying. so what i often tend to do is build up a very large vector, sort it then create a flat_set using the flat_set(ordered_unique_t) constructor. however this still copies the entire vector into the flat_set, right? what i would like to be able to do is move the vector straight into the flat_set, something like: std::vector<T> data; flat_set data_set1(std::move(data)) // move vector in and sort it // or: std::sort(data.begin(), data.end()); // sort vector flat_set data_set2(ordered_unique_t, std::move(data)) // and move it in is this doable (am i just being blind?) or is there plans/interest for this?
<snip>
what i would like to be able to do is move the vector straight into the flat_set, something like:
std::vector<T> data;
flat_set data_set1(std::move(data)) // move vector in and sort it // or: std::sort(data.begin(), data.end()); // sort vector flat_set data_set2(ordered_unique_t, std::move(data)) // and move it in
is this doable (am i just being blind?) or is there plans/interest for this?
</snip> You'll probably never be able to move the whole allocation, but you might be able to move each element via std::make_move_iterator flat_set data_set2(ordered_unique_t, std::make_move_iterator(data.begin()), std::make_move_iterator(data.end())); // moves each element, but not the whole allocation, so still not free. I haven't tried this though. -- chris
On 16 September 2015 at 01:24, Chris Glover
<snip>
what i would like to be able to do is move the vector straight into the flat_set, something like:
std::vector<T> data;
flat_set data_set1(std::move(data)) // move vector in and sort it // or: std::sort(data.begin(), data.end()); // sort vector flat_set data_set2(ordered_unique_t, std::move(data)) // and move it in
is this doable (am i just being blind?) or is there plans/interest for this?
</snip>
You'll probably never be able to move the whole allocation, but you might be able to move each element via std::make_move_iterator
how come? i get that it's exposing the implementation, but the gains would be so so massive. and the name flat_ basically confirms the use of a sorted vector internally. flat_set data_set2(ordered_unique_t, std::make_move_iterator(data.begin()),
std::make_move_iterator(data.end())); // moves each element, but not the whole allocation, so still not free.
I haven't tried this though.
that's definitely a compromising solution in some cases, but when you've got a billion ints in a vector it doesn't help (or, like any standard layout struct really)
On Wed, Sep 16, 2015 at 4:19 AM, Sam Kellett
On 16 September 2015 at 01:24, Chris Glover
wrote: <snip>
what i would like to be able to do is move the vector straight into the flat_set, something like:
std::vector<T> data;
flat_set data_set1(std::move(data)) // move vector in and sort it // or: std::sort(data.begin(), data.end()); // sort vector flat_set data_set2(ordered_unique_t, std::move(data)) // and move it in
is this doable (am i just being blind?) or is there plans/interest for this?
</snip>
You'll probably never be able to move the whole allocation, but you might be able to move each element via std::make_move_iterator
how come? i get that it's exposing the implementation, but the gains would be so so massive. and the name flat_ basically confirms the use of a sorted vector internally.
Moving in a std::vector would force the flat_set to be implemented in terms of a std::vector. However, what about a type from flat_set, e.g. flat_set<T>::storage and document what concept this storage implements (e.g. Random Access Container), and then we could do what you want in a maybe cleaner way. flat_set<T>::storage data; // fill data flat_set<T> the_set{std::move(data)}; -- François
[snip]
how come? i get that it's exposing the implementation, but the gains would
be so so massive. and the name flat_ basically confirms the use of a sorted vector internally.
Moving in a std::vector would force the flat_set to be implemented in terms of a std::vector. However, what about a type from flat_set, e.g. flat_set<T>::storage and document what concept this storage implements (e.g. Random Access Container), and then we could do what you want in a maybe cleaner way.
flat_set<T>::storage data; // fill data flat_set<T> the_set{std::move(data)};
yeah that would be perfect!
On 16-09-2015 14:13, Sam Kellett wrote:
[snip]
how come? i get that it's exposing the implementation, but the gains would
be so so massive. and the name flat_ basically confirms the use of a sorted vector internally.
Moving in a std::vector would force the flat_set to be implemented in terms of a std::vector. However, what about a type from flat_set, e.g. flat_set<T>::storage and document what concept this storage implements (e.g. Random Access Container), and then we could do what you want in a maybe cleaner way.
flat_set<T>::storage data; // fill data flat_set<T> the_set{std::move(data)};
Well, almost. IMO, we need three things to make it perfect: A. flat_set/map should have an additional template argument specifying the internal vector type (it can default to the current choice). B. You should be able to move the whole container, as suggested. C. The constructor should contain an assertion with a call to is_sorted() kind regards -Thorsten
On Wed, Sep 16, 2015 at 9:23 AM, Thorsten Ottosen
On 16-09-2015 14:13, Sam Kellett wrote:
[snip]
how come? i get that it's exposing the implementation, but the gains would
be so so massive. and the name flat_ basically confirms the use of a
sorted
vector internally.
Moving in a std::vector would force the flat_set to be implemented in terms of a std::vector. However, what about a type from flat_set, e.g. flat_set<T>::storage and document what concept this storage implements (e.g. Random Access Container), and then we could do what you want in a maybe cleaner way.
flat_set<T>::storage data; // fill data flat_set<T> the_set{std::move(data)};
Well, almost.
IMO, we need three things to make it perfect:
A. flat_set/map should have an additional template argument specifying the internal vector type (it can default to the current choice).
B. You should be able to move the whole container, as suggested.
C. The constructor should contain an assertion with a call to is_sorted()
So essentially, flat_set/map would now be an adapter like a stack or queue. Makes sense. About C, I rather think that we should have two constructors: flat_set/map(storage raw) flat_set/map(ordered_unique_range_t, storage sorted) and then the is_sorted() assert check should be done in the second constructor, and the first one would call sort(). -- François
Well, almost.
IMO, we need three things to make it perfect:
A. flat_set/map should have an additional template argument specifying the internal vector type (it can default to the current choice).
B. You should be able to move the whole container, as suggested.
C. The constructor should contain an assertion with a call to is_sorted()
So essentially, flat_set/map would now be an adapter like a stack or queue. Makes sense. About C, I rather think that we should have two constructors:
flat_set/map(storage raw) flat_set/map(ordered_unique_range_t, storage sorted)
+1
and then the is_sorted() assert check should be done in the second constructor, and the first one would call sort().
i'd be scared that this check will nullify a lot of the benefit of this new constructor. as the is_sorted assertion i assume would have to go through the data to verify this which brings this constructor down to the same O(n) complexity as it's iterator pair counterpart. does the flat_set(ordered_unique_t, first, last) constructor do the is_sorted check? or is it undefined behaviour if the input data is out of order? i guess it does seeing as how it has to loop through the data anyway it might as well.. i would err on the side of this new constructor simply being undefined behaviour if non-sorted data is given to it
On Wed, Sep 16, 2015 at 11:10 AM, Sam Kellett
Well, almost.
IMO, we need three things to make it perfect:
A. flat_set/map should have an additional template argument specifying the internal vector type (it can default to the current choice).
B. You should be able to move the whole container, as suggested.
C. The constructor should contain an assertion with a call to is_sorted()
So essentially, flat_set/map would now be an adapter like a stack or queue. Makes sense. About C, I rather think that we should have two constructors:
flat_set/map(storage raw) flat_set/map(ordered_unique_range_t, storage sorted)
+1
and then the is_sorted() assert check should be done in the second constructor, and the first one would call sort().
i'd be scared that this check will nullify a lot of the benefit of this new constructor. as the is_sorted assertion i assume would have to go through the data to verify this which brings this constructor down to the same O(n) complexity as it's iterator pair counterpart.
does the flat_set(ordered_unique_t, first, last) constructor do the is_sorted check? or is it undefined behaviour if the input data is out of order? i guess it does seeing as how it has to loop through the data anyway it might as well..
i would err on the side of this new constructor simply being undefined behaviour if non-sorted data is given to it
The is_sorted check would be in an assert, so it would go away in release builds (this is just validation of precondition, much like we would assert in std::vector<T>::operator[] for out-of-range indices). Granted, this check is costly, and ideally we would like to have control on how expensive we allow validation of preconditions to be. Personally, I wouldn't mind if the check is not done at all. I was merely suggesting in which case we might add this check. -- François
On 16-09-2015 19:40, Francois Duranleau wrote:
The is_sorted check would be in an assert, so it would go away in release builds (this is just validation of precondition, much like we would assert in std::vector<T>::operator[] for out-of-range indices).
Yes, of course in an assertion.
Granted, this check is costly, and ideally we would like to have control on how expensive we allow validation of preconditions to be. Personally, I wouldn't mind if the check is not done at all. I was merely suggesting in which case we might add this check.
Well, you already paid at least O(n) time to build the container that you are moving from. You won't get a different complexity because of this check. On the other hand, you might catch a lot of accidental problems (say, not being sorted or being sorted with the wrong predicate). These operations allow the user to break the invariant of the container, an assertion would be the minimum to expect as a guard against this. kind regards Thorsten
On 16/09/2015 15:23, Thorsten Ottosen wrote:
Well, almost.
IMO, we need three things to make it perfect:
A. flat_set/map should have an additional template argument specifying the internal vector type (it can default to the current choice).
B. You should be able to move the whole container, as suggested.
C. The constructor should contain an assertion with a call to is_sorted()
kind regards
-Thorsten
Seems like a cool feature for several users. Create a ticket and I'll try to put it in the to-do list. Best, Ion
Well, almost.
IMO, we need three things to make it perfect:
A. flat_set/map should have an additional template argument specifying the internal vector type (it can default to the current choice).
B. You should be able to move the whole container, as suggested.
C. The constructor should contain an assertion with a call to is_sorted()
kind regards
-Thorsten
Seems like a cool feature for several users. Create a ticket and I'll try to put it in the to-do list.
re a: so i guess the Allocator template parameter that's currently in flat_set/map will be swapped out for a Container one ala std::queue, right?
On 16/09/2015 20:24, Sam Kellett wrote:
Well, almost.
IMO, we need three things to make it perfect:
A. flat_set/map should have an additional template argument specifying the internal vector type (it can default to the current choice).
B. You should be able to move the whole container, as suggested.
C. The constructor should contain an assertion with a call to is_sorted()
kind regards
-Thorsten
Seems like a cool feature for several users. Create a ticket and I'll try to put it in the to-do list.
re a: so i guess the Allocator template parameter that's currently in flat_set/map will be swapped out for a Container one ala std::queue, right?
Nope, we need to maintain backwards compatibility. I don't know if making it an adaptor is the most needed feature. vector will be the case in 99'9%. Ion
On 16 September 2015 at 17:46, Ion Gaztañaga
On 16/09/2015 15:23, Thorsten Ottosen wrote:
Well, almost.
IMO, we need three things to make it perfect:
A. flat_set/map should have an additional template argument specifying the internal vector type (it can default to the current choice).
B. You should be able to move the whole container, as suggested.
C. The constructor should contain an assertion with a call to is_sorted()
kind regards
-Thorsten
Seems like a cool feature for several users. Create a ticket and I'll try to put it in the to-do list.
also i see that the container used by flat_set is boost::container::vector, is this for encapsulation or is there advantages of boost::container::vector over std::vector in this capacity? cheers
On 16/09/2015 21:44, Sam Kellett wrote:
also i see that the container used by flat_set is boost::container::vector, is this for encapsulation or is there advantages of boost::container::vector over std::vector in this capacity?
boost::container::vector has some new functions (undocumented, as they are not ready to be public) to improve performance: merge, merge_unique, insert_ordered_at. Note that boost::container::vector implements many C++11 features in C++03 compilers. Without C++11, flat_xxx would be broken. Best, Ion
On 16 September 2015 at 20:52, Ion Gaztañaga
On 16/09/2015 21:44, Sam Kellett wrote:
also i see that the container used by flat_set is boost::container::vector,
is this for encapsulation or is there advantages of boost::container::vector over std::vector in this capacity?
boost::container::vector has some new functions (undocumented, as they are not ready to be public) to improve performance: merge, merge_unique, insert_ordered_at.
ah ok, so does that mean that if the user specified a std::vector to be used for flat_set via the Container template param then it would not work because std::vector is missing these methods or are they used behind the standard vector interface? if so i don't think the new Container param would be the best idea, maybe just have that as unconfigurable (as now), but accessible via a storage_type typedef.
Note that boost::container::vector implements many C++11 features in C++03 compilers. Without C++11, flat_xxx would be broken.
right, gotcha
It strikes me that more containers could enjoy this mechanism. Perhaps we need a concept for "MovableContiguousContainer" and "MovableContiguousStorage". Perhaps with sub concepts for the aligned versions too. Because I can totally see `basic_string<T>::basic_string(vector<T>&&)` and `std::vector<T>::vector(basic_string<T>&&)` being helpful just the same. So, if we think about this beyond the scope of flat_* then we might actually create something that could be proposed for future c++. Thinking of `flat_set<T>::flat_set(MovableContiguousStorage<T>&&)`¹ ¹ (I don't know concepts syntax) -- Seth On Tue, Sep 15, 2015, at 11:03 PM, Sam Kellett wrote:
a common problem i encounter is that i have two stage's w/r/t to a large collection of data: building and querying.
vector is obviously the king of building. flat_set is the king of querying. so what i often tend to do is build up a very large vector, sort it then create a flat_set using the flat_set(ordered_unique_t) constructor.
however this still copies the entire vector into the flat_set, right?
what i would like to be able to do is move the vector straight into the flat_set, something like:
std::vector<T> data;
flat_set data_set1(std::move(data)) // move vector in and sort it // or: std::sort(data.begin(), data.end()); // sort vector flat_set data_set2(ordered_unique_t, std::move(data)) // and move it in
is this doable (am i just being blind?) or is there plans/interest for this?
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
FWIW I needed a small_flat_set (that I later switched to a static_set), so
I went on to define a type alias for boost::flat_set just to discover that
the container type is not customizable...
So I just quickly reimplemented flat_set as a container adaptor, and now I
have type aliases for small_flat_set and static_set for free.
I've extracted my impl and put it in a gist on github, in case somebody
runs into the same issue and wants a workaround, it depends on some
internal assert macros and range-v3 for concept-checking, but it shouldn't
be hard to workaround those and adapt it to your needs:
https://gist.github.com/gnzlbg/3949636aee663927d1cb338f00ce6733
It's "good enough for me" but is not perfect (e.g. no methods with a
position hint, one can certainly optimize the range inserts further, its
not stable, and it does not model AssociativeContainer because it doesn't
provide a value_type).
Bests, Gonzalo.
On Thu, Sep 17, 2015 at 2:07 PM, Seth
It strikes me that more containers could enjoy this mechanism.
Perhaps we need a concept for "MovableContiguousContainer" and "MovableContiguousStorage". Perhaps with sub concepts for the aligned versions too.
Because I can totally see `basic_string<T>::basic_string(vector<T>&&)` and `std::vector<T>::vector(basic_string<T>&&)` being helpful just the same.
So, if we think about this beyond the scope of flat_* then we might actually create something that could be proposed for future c++.
Thinking of `flat_set<T>::flat_set(MovableContiguousStorage<T>&&)`¹
¹ (I don't know concepts syntax)
-- Seth
On Tue, Sep 15, 2015, at 11:03 PM, Sam Kellett wrote:
a common problem i encounter is that i have two stage's w/r/t to a large collection of data: building and querying.
vector is obviously the king of building. flat_set is the king of querying. so what i often tend to do is build up a very large vector, sort it then create a flat_set using the flat_set(ordered_unique_t) constructor.
however this still copies the entire vector into the flat_set, right?
what i would like to be able to do is move the vector straight into the flat_set, something like:
std::vector<T> data;
flat_set data_set1(std::move(data)) // move vector in and sort it // or: std::sort(data.begin(), data.end()); // sort vector flat_set data_set2(ordered_unique_t, std::move(data)) // and move it in
is this doable (am i just being blind?) or is there plans/interest for this?
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman /listinfo.cgi/boost
On Sun, Mar 12, 2017 at 7:22 AM, Gonzalo BG via Boost wrote:
FWIW I needed a small_flat_set (that I later switched to a static_set), so I went on to define a type alias for boost::flat_set just to discover that the container type is not customizable...
So I just quickly reimplemented flat_set as a container adaptor, and now I have type aliases for small_flat_set and static_set for free.
I've extracted my impl and put it in a gist on github, in case somebody runs into the same issue and wants a workaround, it depends on some internal assert macros and range-v3 for concept-checking, but it shouldn't be hard to workaround those and adapt it to your needs:
https://gist.github.com/gnzlbg/3949636aee663927d1cb338f00ce6733
It's "good enough for me" but is not perfect (e.g. no methods with a position hint, one can certainly optimize the range inserts further, its not stable, and it does not model AssociativeContainer because it doesn't provide a value_type).
Shouldn't leveraging the EBO for Compare should probably only happen conditionally: i.e. when !std::is_final_v<Compare> is true? On the other hand, maybe your small_flat_set could also be achieved with boost::container::flat_set with a custom allocator (instead of the default allocator). Glen
participants (8)
-
Chris Glover
-
Francois Duranleau
-
Glen Fernandes
-
Gonzalo BG
-
Ion Gaztañaga
-
Sam Kellett
-
Seth
-
Thorsten Ottosen