[review] The review of Boost.DoubleEnded starts today: September 21 - September 30 - Boost

newer
property_map/compose_property_map...

[review] The review of Boost.DoubleEnded starts today: September 21 - September 30

older
[boostbook] Building documentation...

Thorsten Ottosen

21 Sep 2017 21 Sep '17

5:38 p.m.

Dear users and members of Boost, I'm proud to announce that the formal review of Benedek Thaler's Boost.DoubleEnded library starts today, September 21, and runs through September 30. This library is the result of Benedek's participation in Google Summer of Code 2015. The library may be downloaded from https://github.com/erenon/double_ended and the documentation may be found here: http://erenon.hu/double_ended/ Anyone is welcome to post a review and/or take part in subsequent discussions (see below for review guidelines). Introduction ------------ This is a container library that provides two containers: A. boost::devector B. boost::batch_deque Both containers provides efficient push_front and push_back operations, hence the name DoubleEnded. The boost::devector class can be seen as an extension of std::vector with efficient support for push_front (so boost::devector provides contiguous storage). In addition, it provides a customizable small buffer optimization like boost::small_vector(*). It furthermore allows for customization of the growth factor such that the user can decide if a reallocation should allocate 1.5 or 2 or whatever more memory. And finally it provides new functionality that does not exist in normal vector implementations -- this functionality allow the class to excel in performance in certain situations like serialization or transfer of data from sources that require a preallocated buffer. The boost::batch_deque is similar to std::deque, but with two important twists. Like std::deque, it provides reference stability when calling push_front/push_back. Reference stability is crucial for programs that uses intrusive containers e.g. using Boost.Intrusive (**). boost::batch_deque has the following main advantages: 1. It allows the user to specify the segment size 2. It allows for external iteration over the segments Taken together, they provide a very efficient and flexible basis for intrusive containers. Access to the segments also increases performance for tasks such as serialization. Finally, boost::batch_deque may be seen as an efficient alternative to std::vector if you want to replace a percent-wise memory overhead with a constant memory overhead and have no need for contiguous storage. Review guidelines ----------------- Please provide in your review whatever information you think is valuable to understand your final choice of ACCEPT or REJECT including Fit as a Boost library. Please be explicit about your decision. Some other questions you might want to consider answering: - What is your evaluation of the design? - What is your evaluation of the implementation? - What is your evaluation of the documentation? - What is your evaluation of the potential usefulness of the library? - Did you try to use the library? With which compiler(s)? Did you have any problems? - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? - Are you knowledgeable about the problem domain? More information about the Boost Formal Review Process can be found here: http://www.boost.org/community/reviews.html Kind regards Thorsten, Review manager (*) boost::small_vector: https://tinyurl.com/ycslp4ot (**) http://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html

Show replies by date

Zach Laine

25 Sep 25 Sep

12:52 a.m.

On Thu, Sep 21, 2017 at 12:38 PM, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

Dear users and members of Boost,

I'm proud to announce that the formal review of Benedek Thaler's Boost.DoubleEnded library starts today, September 21, and runs through September 30. This library is the result of Benedek's participation in Google Summer of Code 2015.

[snip]

...

Please provide in your review whatever information you think is valuable to understand your final choice of ACCEPT or REJECT including Fit as a Boost library. Please be explicit about your decision.

I think you mean Boost.DoubleEnded, not Fit :). I vote to reject Boost.DoubleEnded.

...

Some other questions you might want to consider answering:

- What is your evaluation of the design?

I don't think it is up to Boost's standards. My first concern is that this library is centered around runtime performance, yet there appears to be no measurement done to determine what -- or even whether -- optimization occurs when using this library instead of the standard containers. My second concern is that the containers in this library are difficult to reason about. You'll find these two issues reinforced multiple times in the details of the review below. Here's the first snippet of code from the examples in the online docs: de::devector<int, de::small_buffer_size<16>> small_devector; BOOST_ASSERT(small_devector.capacity() == 16); // but no dynamic memory was allocated small_devector.push_back(2); small_devector.push_back(3); small_devector.push_back(4); small_devector.push_front(1); // allocates There are no standard containers for which capacity() == 16 implies that when there are three elements, an allocation will occur when I add a fourth. That's substantially odd. It means that as a user I must know about extrinsic state information (how many pushes were to the front or back) in order to predict memory usage and the number of allocations even roughly. The next example is this: de::devector<int> reserved_devector(32, 16, de::reserve_only_tag{}); for (int i = 0; i < 32; ++i) { reserved_devector.push_front(i); } for (int i = 0; i < 16; ++i) { reserved_devector.push_back(i); } That's not a compelling use case to me. Why can't I just do this: std::vector<int> reserved_vector; vector.reserve(48); // Fill in the values, etc. Why must there be an implicit zero-point, such that front-pushes allocate before the capacity is reached, and back-pushes allocate before the capacity is reached (but after a different number of pushes)? This is all pretty wonky. If I only care about growth in one direction, I just use a vector, and don't worry about the push-front case, because I have reverse iterators. Otherwise, I'm inclined just to use deque, because I don't have to consider these wonky semantics with that, and I have no measurements to recommend the use of this library instead. ( BTW, please create an object of your tag type: de::devector<int> reserved_devector(32, 16, de::reserve_only_tag{}); -> de::devector<int> reserved_devector(32, 16, de::reserve_only); ) For that matter, is the performance of devector better than std::deque when I don't know the relative frequency of front- and back-pushes? It seems in many cases that it would be quite a bit worse. But it seems that no one knows -- not me, not the author -- because no one has measured it. If I'm wrong about this, please share the measurements! A custom growth policy must have an integral growth factor, but a commonly used growth factor is 3/2. I don't understand why should_shrink() is part of GrowthPolicy. If I'm growing the container, why am I worried about shrinking it? Another example from the docs: de::devector<std::string> reversed_lines; reversed_lines.reserve_front(24); std::string line; while (std::getline(std::cin, line)) { reversed_lines.push_front(line); } std::cout << "Reversed lines:\n"; for (auto&& line : reversed_lines) { std::cout << line << "\n"; } How is this better than doing: std::deque<std::string> reversed_lines; std::string line; while (std::getline(std::cin, line)) { reversed_lines.push_front(line); } std::cout << "Reversed lines:\n"; for (auto it = reversed_lined.begin(), end = reversed_lines.rend()); it != end; ++it) { std::cout << line << "\n"; } ? If it's just to avoid the verbose for-loop, that's not enough for me. If it's an efficiency claim, I have no numbers to back that up, and the use of IO in the example makes it moot anyway. I need a justification for unsafe_push_front() besides "avoiding a check, thus sparing a few instructions". My intuition is that it does not matter, since branch prediction is quite good these days, and a single allocation will swallow all the gains if there are any. However, *measurement* is the only thing that can make this justification, pro or con. Again, from the example docs: // int sockfd = socket(...); connect(...); de::devector<char> buffer(256, de::unsafe_uninitialized_tag{}); long recvSize = recv(sockfd, buffer.data(), buffer.size(), 0); if (recvSize >= 0) { buffer.unsafe_uninitialized_resize_back(recvSize); // process contents of buffer } else { /* handle error */ } The unsafe resize does nothing in this case, since char has a trivial dtor. For nontrivially destructed types, we get RAII violations. I don't know how to use a type that does that.

...

- What is your evaluation of the implementation?

I did not look.

...

- What is your evaluation of the documentation?

There are a lot of aspects of the docs I found to be unclear. The Benchmarks section contains no benchmarks. There is this: " Reference stability is an important feature of the batch_deque. Inserting elements to the front or to the back of the sequence never invalidates any references. Moreover, a special insert operation is provided, stable_insert, which takes an iterator, as a hint, and inserts a sequence of elements (not necessary directly) before the element pointed by the hint, in a way that it doesn't invalidate references. " I find that very difficult to understand. It seems to be saying that stable_insert() does not always insert where you wanted in the sequence, because the insertion point is just a hint. I think what is intended is that the insertion will not necessarily be contiguous (or at least I hope that's what is intended). These are other examples like this, but until the high-level design issues are resolved, I don't want to take the time to get into all the specifics.

...

- What is your evaluation of the potential usefulness of the library?

It does not seem useful to me, except that the ability to specify the block-size for a deque would sometimes come in handy.

...

- Did you try to use the library? With which compiler(s)? Did you have any problems?

I did not.

...

- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

I spent 4-5 hours. - Are you knowledgeable about the problem domain?

...

Yes. Zach

Thorsten Ottosen

5:17 p.m.

Hi Zach, Thanks for spending a lot of time thinking about the library. Some minor points for discussion follow.

...

My first concern is that this library is centered around runtime performance, yet there appears to be no measurement done to determine what -- or even whether -- optimization occurs when using this library instead of the standard containers.

For devector: ------------- Is it it not obvious that an O(1) push_front performs better than an O(n) vec.insert( vec.begin(), x ) ? Is it not obvious that having a small buffer optimization can be useful? boost::small_vector does it. Nearly all std::string does it. stlsoft::auto_buffer has done it for a decade, described in detail in Matthew Wilson's book. Chandler Carruth had the following presentation last year https://www.youtube.com/watch?v=vElZc6zSIXM about small buffer optimizations use in Clang data structures. For batch_deque: ---------------- With std::deque you cannot control the segment size, and implementations differ widely. Is it not obvious that having a block size of 1 or 2 (as e.g. visual c++ does) is very fast. That iteration over a deque can be slow has been known for two decades, Matt Austern wrote an article about it.

...

My second concern is that the containers in this library are difficult to reason about.

Look at it this way: devector is like vector with O(1) front insertion; batch_deque is like deque with custom control over the segment size. Are you opposed to such containers or is it "all the extra" stuff (which you don't have to use btw) that makes you uneasy? Serialization: -------------- std types are over encapsulated which means that they can't avoid double initialization, nor process elements in batches. Is it not obvious that double initialization is more expensive than single initialization?

...

For that matter, is the performance of devector better than std::deque when I don't know the relative frequency of front- and back-pushes? It seems in many cases that it would be quite a bit worse.

You are comparing apples and oranges. push_back on vector may be worse than on deque (just try to store array<T,1024>). It will depend on the situation, how expensive move operations are, how expensive allocation is etc. But if you need or want a contiguous layout, you can't use deque.

...

Another example from the docs:

de::devector<std::string> reversed_lines; reversed_lines.reserve_front(24);

[snip]

...

? If it's just to avoid the verbose for-loop, that's not enough for me. If it's an efficiency claim, I have no numbers to back that up, and the use of IO in the example makes it moot anyway.

It's not about rewriting a loop. It about having a different layout for your container's sequence because that layout is either more natural or convenient or simply needed. The first use case for devector is actually in the implementation of batch_deque. One could use some deque implementation, but that would not be optimal for performance. Being able to use devector there has made the code so much simpler.

...

I need a justification for unsafe_push_front() besides "avoiding a check, thus sparing a few instructions". My intuition is that it does not matter, since branch prediction is quite good these days, and a single allocation will swallow all the gains if there are any. However, *measurement* is the only thing that can make this justification, pro or con.

Yes, and what do you do when you actually measure and find out push_back is the bottleneck? vector has both operator[] and at(), and I'm pretty sure people would go mad if they were forced to use an operator[] that did what at() does. So why is push_back any different? You have basically the same situation: two functions that differ only slightly in their precondition.

...

Again, from the example docs:

// int sockfd = socket(...); connect(...); de::devector<char> buffer(256, de::unsafe_uninitialized_tag{}); long recvSize = recv(sockfd, buffer.data(), buffer.size(), 0);

if (recvSize >= 0) { buffer.unsafe_uninitialized_resize_back(recvSize); // process contents of buffer } else { /* handle error */ }

The unsafe resize does nothing in this case, since char has a trivial dtor.

It changes the size of the container so you can't read uninitialized values. kind regards -Thorsten

Zach Laine

26 Sep 26 Sep

1:39 a.m.

On Mon, Sep 25, 2017 at 12:17 PM, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

Hi Zach,

Thanks for spending a lot of time thinking about the library. Some minor points for discussion follow.

My first concern is that this library is centered around runtime

...
performance, yet there appears to be no measurement done to determine what -- or even whether -- optimization occurs when using this library instead of the standard containers.

For devector: -------------

Is it it not obvious that an O(1) push_front performs better than an O(n) vec.insert( vec.begin(), x ) ?

I find the use of the word "obvious" when applied to optimization to be a little suspect. Also, that statement is not quite true. push_front() is *amortized* O(1), not O(1). A particular call to push_front() is actually O(N). But the larger point to me is that I can use vector when I care about amortized O(1) growth in either direction. I only care about devector or deque when I need amortized O(1) growth in *both* directions. Then, it is a question of whether the relative performance recommends devector or deque. As of this writing, I have to write benchmarks or do a lot of custom profiling to figure out which works better. My strong preference would be that the author provide sufficient measurements of a few use cases of both of these data structures at various values of N, so that I can make a reasonable educated guess when first selecting the data structure to use. It won't allow me to elide profiling altogether, but it will get me going in the right direction more often, with less effort for me as a user.

...

Is it not obvious that having a small buffer optimization can be useful? boost::small_vector does it. Nearly all std::string does it. stlsoft::auto_buffer has done it for a decade, described in detail in Matthew Wilson's book. Chandler Carruth had the following presentation last year

https://www.youtube.com/watch?v=vElZc6zSIXM

about small buffer optimizations use in Clang data structures.

You don't need to justify SBO to me; I'm all for it. We already have boost::small_vector though, as you point out. Furthermore, that particular optimization only pays off when N is small. When N is small, amortization doesn't really pay off. So, again, since I would only use devector/deque when I know I need growth in *both* directions, I would want to see the relative strengths and weaknesses of deque vs. devector to make an informed decision.

...

For batch_deque: ----------------

With std::deque you cannot control the segment size, and implementations differ widely. Is it not obvious that having a block size of 1 or 2 (as e.g. visual c++ does) is very fast.

Right. Control of the segment size is the one thing I called out in my review as an unambiguous positive.

...

That iteration over a deque can be slow has been known for two decades, Matt Austern wrote an article about it.

... and this should have been the other. I'm aware of that work, and I'd like to see a deque that features configurable segment size and segmented iteration support. As an aside, my experiments with segmented iteration have been of mixed utility. The standard algorithms useful on segments is limited, since they cannot see across segment boundaries. For instance, adjacent_difference() does not work without some gymnastics.

...

My second concern is that the containers in this library are difficult to

...
reason about.

Look at it this way: devector is like vector with O(1) front insertion; batch_deque is like deque with custom control over the segment size.

Are you opposed to such containers or is it "all the extra" stuff (which you don't have to use btw) that makes you uneasy?

Up until now, in this post all my objections have been related to runtime efficiency. But now you've brought back up a major complaint that I have, which is the unsafety of the interface. To say that I don't have to use it is a bit disingenuous to me. Adding an unsafe API weakens the invariants of the type, making it more difficult to reason about and use. The idea that I can add elements that will be manually destructed but that are not yet constructed is a serious defect in my opinion. Any throwing operation I do between the creation of such elements and their construction is now unsafe.

...

Serialization: --------------

std types are over encapsulated which means that they can't avoid double initialization, nor process elements in batches. Is it not obvious that double initialization is more expensive than single initialization?

In the abstract, of course you cannot argue with that. I don't find it to be a useful question, though. The questions I have are: 1) In my specific case, do I care (that is, is it measurable)? In many cases, a type will be expensive to copy or non-default-construct, but trivial to default construct. 2) If the answer to #1 is yes, is reserve() not sufficient? I would expect the answer to one of those two questions to be "no" in almost all cases. It's not that you'd *never* answer "yes" to both, it's just that this represents a low-frequency corner.

...

For that matter, is the performance of devector better than std::deque when

...
I don't know the relative frequency of front- and back-pushes? It seems in many cases that it would be quite a bit worse.

You are comparing apples and oranges. push_back on vector may be worse than on deque (just try to store array<T,1024>).

Re-reading that. I think I didn't state my point very clearly. What I was trying to say is that the zero-point implied in having separate front and back capacities means that I need to guess the relative frequency of front and back pushes if I want the benefits of reserving space. When you combine that with the fact that growing a devector requires copying every element, it may be a pessimization to use one over a deque. To me, that's not an apples/oranges comparison, because my use of the entire library boils down to the case where I know I need growth in *both* directions, and now I must determine whether devector or deque is a better fit. It will depend on the situation, how expensive move operations are, how

...

expensive allocation is etc. But if you need or want a contiguous layout, you can't use deque.

Another example from the docs:

...
de::devector<std::string> reversed_lines; reversed_lines.reserve_front(24);

[snip]

? If it's just to avoid the verbose for-loop, that's not enough for me.

...
If it's an efficiency claim, I have no numbers to back that up, and the use of IO in the example makes it moot anyway.

It's not about rewriting a loop. It about having a different layout for your container's sequence because that layout is either more natural or convenient or simply needed.

Fair enough. Sometimes contiguity is very important.

...

The first use case for devector is actually in the implementation of batch_deque. One could use some deque implementation, but that would not be optimal for performance.

For small objects or large? For what values of N? For what pattern of calls? Only for double-ended growth, or for single-ended growth as well? I don't necessarily doubt your claim, I just don't have the relative superiority of devector quantified in any way that I can use.

...

Being able to use devector there has made the code so much simpler.

It's not obvious to me why. Could you elaborate?

...

I need a justification for unsafe_push_front() besides "avoiding a check,

...
thus sparing a few instructions". My intuition is that it does not matter, since branch prediction is quite good these days, and a single allocation will swallow all the gains if there are any. However, *measurement* is the only thing that can make this justification, pro or con.

Yes, and what do you do when you actually measure and find out push_back is the bottleneck?

vector has both operator[] and at(), and I'm pretty sure people would go mad if they were forced to use an operator[] that did what at() does. So why is push_back any different? You have basically the same situation: two functions that differ only slightly in their precondition.

First, please demonstrate that I should care about those few instructions. Second, I agree that the difference is slight, but I disagree that it's ok to make that slight change. One of the chief invariants of a standard container is that it manages its storage. This breaks that. Dangerous interfaces are not necessarily evil, but they certainly need to be justified. I would need significant use cases with significant performance benefits (not night-and-day, but significant) to tolerate this unsafety.

...

...
Again, from the example docs:

// int sockfd = socket(...); connect(...); de::devector<char> buffer(256, de::unsafe_uninitialized_tag{}); long recvSize = recv(sockfd, buffer.data(), buffer.size(), 0);

if (recvSize >= 0) { buffer.unsafe_uninitialized_resize_back(recvSize); // process contents of buffer } else { /* handle error */ }

The unsafe resize does nothing in this case, since char has a trivial dtor.

It changes the size of the container so you can't read uninitialized values.

That's true, though some sort of span would accomplish the same thing with no loss of safety. In the case that the unsafe uninitialized resize grows the container, leaving storage around that is not yet constructed, but that the dtor will destruct, is very dangerous. If I distill all my objections down it comes to these three points, I think: 1) This library seems to cover small corner cases. 2) The API is unsafe. 3) I don't have any reason to expect that the efficiency gains in the corner cases are substantial enough to justify the unsafety, or even that there are positive gains in my particular corner case. Zach

Thorsten Ottosen

3:59 p.m.

Den 26-09-2017 kl. 03:39 skrev Zach Laine via Boost:

...

On Mon, Sep 25, 2017 at 12:17 PM, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

...
Is it it not obvious that an O(1) push_front performs better than an O(n) vec.insert( vec.begin(), x ) ?

I find the use of the word "obvious" when applied to optimization to be a little suspect.

[snip]

...

But the larger point to me is that I can use vector when I care about amortized O(1) growth in either direction. I only care about devector or deque when I need amortized O(1) growth in *both* directions.

Can you explain how that works for vector when growing the front?

...

Then, it is a question of whether the relative performance recommends devector or deque. As of this writing, I have to write benchmarks or do a lot of custom profiling to figure out which works better.

My strong preference would be that the author provide sufficient measurements of a few use cases of both of these data structures at various values of N, so that I can make a reasonable educated guess when first selecting the data structure to use. It won't allow me to elide profiling altogether, but it will get me going in the right direction more often, with less effort for me as a user.

I agree that is not a bad thing to add to the documentation.

...

As an aside, my experiments with segmented iteration have been of mixed utility. The standard algorithms useful on segments is limited, since they cannot see across segment boundaries. For instance, adjacent_difference() does not work without some gymnastics.

We recently got Boost.PolyCollection. It has specialized algorithms: https://tinyurl.com/yaffvebf I'm wondering if its possible to make these more general. Basically, a poly-collection stores a vector for each dynamic type, and this is a bit like a segment (although segment size may differ).

...

...
Are you opposed to such containers or is it "all the extra" stuff (which you don't have to use btw) that makes you uneasy?

Up until now, in this post all my objections have been related to runtime efficiency. But now you've brought back up a major complaint that I have, which is the unsafety of the interface. To say that I don't have to use it is a bit disingenuous to me. Adding an unsafe API weakens the invariants of the type, making it more difficult to reason about and use. The idea that I can add elements that will be manually destructed but that are not yet constructed is a serious defect in my opinion. Any throwing operation I do between the creation of such elements and their construction is now unsafe.

For the record, I'm not trying to be disingeneous. Let me explain. The invariant of the container remains the same no matter you do. I dislike the word "unsafe" used in the naming of such functions, but that is what Benedek has chosen. For example, I don't think a function should be labeled unsafe just because it has one extra precondition compared to a similar function. If we did that, quite a few "unsafe" labels would appear in a normal high-level C++ program. Now, the functions that allow you to avoid double initialization can be misused. They have a strict contract if you will. Does that mean no library can ever contain such functionality? My personal view is that if containers over-encapsulate in order to be "safe" (or "safer"), but sacrifice near-optimal performance, some users will chose not to use them. And what they fall back on is going to be so much worse than using a container that is built to help them (worse in terms of exception-safety, abstraction, low-levelness of code). So what has been added to devector is the minimal thing needed to accomplish the goal of giving the user the choice of getting rid of double initialization. So if we chose not to add such functionality, it will make the container less useful to some (not all) people. And that is perhaps why I don't understand your argument. Yes, some of the function are more difficult to use, but only if actually use them. And those very few who do need to use them will find them much easier to use than the alternatives.

...

...
Serialization: --------------

std types are over encapsulated which means that they can't avoid double initialization, nor process elements in batches. Is it not obvious that double initialization is more expensive than single initialization?

In the abstract, of course you cannot argue with that. I don't find it to be a useful question, though. The questions I have are:

1) In my specific case, do I care (that is, is it measurable)? In many cases, a type will be expensive to copy or non-default-construct, but trivial to default construct. 2) If the answer to #1 is yes, is reserve() not sufficient?

I would expect the answer to one of those two questions to be "no" in almost all cases. It's not that you'd *never* answer "yes" to both, it's just that this represents a low-frequency corner.

No matter how we program, I think it's safe to say that container<int> or container<char> is not going away any time soon. So the answer to your question is that "it depends". If you high-level objects are build upon low-level things, making those low-level things faster can certainly impact the high-level things.

...

...
For that matter, is the performance of devector better than std::deque when

...
I don't know the relative frequency of front- and back-pushes? It seems in many cases that it would be quite a bit worse.

You are comparing apples and oranges. push_back on vector may be worse than on deque (just try to store array<T,1024>).

Re-reading that. I think I didn't state my point very clearly. What I was trying to say is that the zero-point implied in having separate front and back capacities means that I need to guess the relative frequency of front and back pushes if I want the benefits of reserving space.

Well, there is nothing that requires you to use reserve_front/reserve_back. Just like there is no one that requires you to use reserve in vector.

...

When you combine that with the fact that growing a devector requires copying every element, it may be a pessimization to use one over a deque. To me, that's not an apples/oranges comparison, because my use of the entire library boils down to the case where I know I need growth in *both* directions, and now I must determine whether devector or deque is a better fit.

We are not in disagreement here. But std::deque may be faster than std::vector for push_back. So I don't see how that can be a criticism of devector. It's not better than vector in this respect. And I agree some form of performance tests to give a rough idea of how the containers compare would be good.

...

...
The first use case for devector is actually in the implementation of batch_deque. One could use some deque implementation, but that would not be optimal for performance.

For small objects or large? For what values of N? For what pattern of calls? Only for double-ended growth, or for single-ended growth as well? I don't necessarily doubt your claim, I just don't have the relative superiority of devector quantified in any way that I can use.

If you write a deque like container, you need some data-structure to hold the segments. For some reason, implementations of deque tends to call this a map. Anyway, this "map" needs to support amortized O(1) growth at both ends in order for the deque class to guarantee amortized O(1) growth at both ends. What all the implementations I know of do is then to create this map as an internal data-structure. This clutters the code considerably compared to what Benedek has done. The reason that you don't want to use deque as a building block for batch_deque is that it destroys "map" locality.

...

...
I need a justification for unsafe_push_front() besides "avoiding a check,

...
thus sparing a few instructions". My intuition is that it does not matter, since branch prediction is quite good these days, and a single allocation will swallow all the gains if there are any.

[snip]

...

First, please demonstrate that I should care about those few instructions. Second, I agree that the difference is slight, but I disagree that it's ok to make that slight change. One of the chief invariants of a standard container is that it manages its storage. This breaks that.

It's not an invariant of vector that push_back allocates memory. unsafe_push_back does not change anything wrt. what the container manages.

...

Dangerous interfaces are not necessarily evil, but they certainly need to be justified. I would need significant use cases with significant performance benefits (not night-and-day, but significant) to tolerate this unsafety.

It might be hard to make you personally care. But even if you don't personally care, rest assured that some people will care. And I do think the documentation should be improved. But the simplicity (or not) of a function affects a lot of things: - exception specification: unsafe_push_back can be made conditionally noexcept, push_back cannot (it is not right now, but it should be). - inlineability: the simpler the function, the higher the chance it get's inlined - register allocation: the simpler the function, the better for the optimizer - branch prediction: on some platforms this works well, on other not so much which may depending on situation/platform be a big deal. But seriously, we should not call functions with one extra precondition "dangerous" or "unsafe". It's still much better than the alternative, and as I explained above, we should strive to make the inherently low-level and unsafe code obsolete. And the only way to do that is to present an alternative that is just as per formant (or better) yet offers a superior high-level approach. We don't want people to use local arrays and manage the size of it manually.

...

...
...
buffer.unsafe_uninitialized_resize_back(recvSize);

...

...
It changes the size of the container so you can't read uninitialized values.

That's true, though some sort of span would accomplish the same thing with no loss of safety.

But then you just postponed the double initialization to when the span needs to be copied into a proper container! The whole point is that we don't want to forego the great benefit of a container just because we are interfacing with a low-level API. kind regards Thorsten

Zach Laine

5:22 p.m.

On Tue, Sep 26, 2017 at 10:59 AM, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

Den 26-09-2017 kl. 03:39 skrev Zach Laine via Boost:

...
On Mon, Sep 25, 2017 at 12:17 PM, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

[snip]

...

But the larger point to me is that I can use vector when I care

...
about amortized O(1) growth in either direction. I only care about devector or deque when I need amortized O(1) growth in *both* directions.

Can you explain how that works for vector when growing the front?

Sure: std::vector<int> v; v.push_back(1); v.push_back(2); v.push_back(3); for (auto it = v.rbegin(), end = rend(); it != end; ++it) { // whatever } Just as you don't care whether your stack grow up or down, you don't care whether you're pushing to the "front" or "back" if you only push/pop to one side -- because you can always view the resulting sequence in reverse. As an aside, my experiments with segmented iteration

...

...
have been of mixed utility. The standard algorithms useful on segments is limited, since they cannot see across segment boundaries. For instance, adjacent_difference() does not work without some gymnastics.

We recently got Boost.PolyCollection. It has specialized algorithms:

https://tinyurl.com/yaffvebf

I'm wondering if its possible to make these more general.

I think that would be interesting work, perhaps something we could even standardize.

...

Are you opposed to such containers or is it "all the extra" stuff (which

...
...
you don't have to use btw) that makes you uneasy?

Up until now, in this post all my objections have been related to runtime efficiency. But now you've brought back up a major complaint that I have, which is the unsafety of the interface. To say that I don't have to use it is a bit disingenuous to me. Adding an unsafe API weakens the invariants of the type, making it more difficult to reason about and use. The idea that I can add elements that will be manually destructed but that are not yet constructed is a serious defect in my opinion. Any throwing operation I do between the creation of such elements and their construction is now unsafe.

For the record, I'm not trying to be disingeneous. Let me explain.

The invariant of the container remains the same no matter you do.

They do not. The most basic container invariant is RAII. That is, elements that have been constructed in the container will be destructed properly when the container is destructed, and -- importantly -- nothing else. I should not have to worry about destruction of not-yet-constructed elements for which there is storage allocated (as with unsafe_uninitialized* APIs). It does not matter that I do not use these APIs. If they exist, then later under maintenance I or a coworker may add use of them. I therefore have to write code that operates under the invariants of the type, not my particular use of it. But again, don't get me wrong -- I'm not opposed to doing these kinds of low-level hacks to get performance. For instance, move operations are unsafe in a similar fashion (though not as unsafe, IMO), and I use them all the time. The issue to me is that these are high-level container types, and yet they have invariants that are less safe than std::array. I feel they are not sufficiently justified in such a container. By the way, there are other ways to get similar performance gains without sacrificing safety: template <typename T> struct devector_guts { // low level unsafe stuff goes here, "here be dragons", etc. T * elements_; size_t size_; size_t capacity_; }; template <typename T> struct devector { // only safe operations here devector_guts steal_guts () &&; // returns guts; empties *this private: devector_guts guts_; }; [snip] For that matter, is the performance of devector better than std::deque when

...

...
...
...
I don't know the relative frequency of front- and back-pushes? It seems in many cases that it would be quite a bit worse.

You are comparing apples and oranges. push_back on vector may be worse than on deque (just try to store array<T,1024>).

Re-reading that. I think I didn't state my point very clearly. What I was trying to say is that the zero-point implied in having separate front and back capacities means that I need to guess the relative frequency of front and back pushes if I want the benefits of reserving space.

Well, there is nothing that requires you to use reserve_front/reserve_back. Just like there is no one that requires you to use reserve in vector.

Sure. I'm trying to point out that when I *do* want to use it, the current separate-and-fixed scheme is hard to use, because you must guess something close to the relative ratio of front and back pushes.

...

When you

...
combine that with the fact that growing a devector requires copying every element, it may be a pessimization to use one over a deque. To me, that's not an apples/oranges comparison, because my use of the entire library boils down to the case where I know I need growth in *both* directions, and now I must determine whether devector or deque is a better fit.

We are not in disagreement here. But std::deque may be faster than std::vector for push_back. So I don't see how that can be a criticism of devector. It's not better than vector in this respect.

It's not a criticism of devector at all. It's an open question for which I have no answer, meaning I have to 1) do a lot of homework (profiling, etc.) before knowing if I should even consider devector, and 2) weigh any possible gains in performance against loss of ability to reason about my code. It is, however, a criticism of the library, which should do this profiling homework and proper encapsulation for me.

...

And I agree some form of performance tests to give a rough idea of how the containers compare would be good.

The first use case for devector is actually in the implementation of

...
...
batch_deque. One could use some deque implementation, but that would not be optimal for performance.

For small objects or large? For what values of N? For what pattern of calls? Only for double-ended growth, or for single-ended growth as well? I don't necessarily doubt your claim, I just don't have the relative superiority of devector quantified in any way that I can use.

If you write a deque like container, you need some data-structure to hold the segments. For some reason, implementations of deque tends to call this a map. Anyway, this "map" needs to support amortized O(1) growth at both ends in order for the deque class to guarantee amortized O(1) growth at both ends. What all the implementations I know of do is then to create this map as an internal data-structure. This clutters the code considerably compared to what Benedek has done.

The reason that you don't want to use deque as a building block for batch_deque is that it destroys "map" locality.

Thanks for the explanation. That makes sense.

...

I need a justification for unsafe_push_front() besides "avoiding a check,

...
...
...
thus sparing a few instructions". My intuition is that it does not matter, since branch prediction is quite good these days, and a single allocation will swallow all the gains if there are any.

[snip]

First, please demonstrate that I should care about those few instructions.

...
Second, I agree that the difference is slight, but I disagree that it's ok to make that slight change. One of the chief invariants of a standard container is that it manages its storage. This breaks that.

It's not an invariant of vector that push_back allocates memory.

Sure it is. 26.3.11.5/1 says: template reference emplace_back(Args&&... args); template iterator emplace(const_iterator position, Args&&... args); void push_back(const T& x); void push_back(T&& x); Remarks: Causes reallocation if the new size is greater than the old capacity. Reallocation invalidates all the references...

...

Dangerous

...
interfaces are not necessarily evil, but they certainly need to be justified. I would need significant use cases with significant performance benefits (not night-and-day, but significant) to tolerate this unsafety.

It might be hard to make you personally care. But even if you don't personally care, rest assured that some people will care.

I *do* care about maximizing efficiency. I just want any unsafe code I have to write to be encapsulated so that I can unit-test it, and not care about the unsafe details. To the extent that the unsafe details leak out of the interface, I need *justification for* -- not complete absence of -- the unsafe bits.

...

And I do think the documentation should be improved. But the simplicity (or not) of a function affects a lot of things:

- exception specification: unsafe_push_back can be made conditionally noexcept, push_back cannot (it is not right now, but it should be).

If it can be, we should do it in Boost. If nothing else, that will serve as an existence proof of successful industrial use of that approach, making it a lot easier to sell to the committee.

...

- inlineability: the simpler the function, the higher the chance it get's inlined - register allocation: the simpler the function, the better for the optimizer - branch prediction: on some platforms this works well, on other not so much

which may depending on situation/platform be a big deal.

I agree with all that.

...

But seriously, we should not call functions with one extra precondition "dangerous" or "unsafe". It's still much better than the alternative, and as I explained above, we should strive to make the inherently low-level and unsafe code obsolete. And the only way to do that is to present an alternative that is just as per formant (or better) yet offers a superior high-level approach. We don't want people to use local arrays and manage the size of it manually.

I think that the unsafety comes from changing the invariants in such a way that I can no longer reason about lifetimes, etc. And going straight to handwritten code over arrays is IMO a red herring. A second unsafe tier can be provided with the desired operations, that everyone knows is unsafe, is truly opt0in, etc.

...

buffer.unsafe_uninitialized_resize_back(recvSize);

...
...
...
It changes the size of the container so you can't read uninitialized

...
values.

That's true, though some sort of span would accomplish the same thing with no loss of safety.

But then you just postponed the double initialization to when the span needs to be copied into a proper container!

Well, in the limited case where I copy it, sure. But that's also true if I copy buffer after calling buffer.unsafe_uninitialized_resize_back(recvSize); above. The point is you want to use the [0, recvSize) subsequence of buffer, and you don't need to have a unsafe_uninitialized_resize_back() member to do that.

...

The whole point is that we don't want to forego the great benefit of a container just because we are interfacing with a low-level API.

I think we can actually have both. Zach

Thorsten Ottosen

27 Sep 27 Sep

2:37 p.m.

Den 26-09-2017 kl. 19:22 skrev Zach Laine via Boost:

...

On Tue, Sep 26, 2017 at 10:59 AM, Thorsten Ottosen via Boost <

...

...
Can you explain how that works for vector when growing the front?

Sure:

std::vector<int> v;

v.push_back(1); v.push_back(2); v.push_back(3);

for (auto it = v.rbegin(), end = rend(); it != end; ++it) { // whatever }

Just as you don't care whether your stack grow up or down, you don't care whether you're pushing to the "front" or "back" if you only push/pop to one side -- because you can always view the resulting sequence in reverse.

Got it, thanks.

...

...
Are you opposed to such containers or is it "all the extra" stuff (which

...
...
you don't have to use btw) that makes you uneasy?

Up until now, in this post all my objections have been related to runtime efficiency. But now you've brought back up a major complaint that I have, which is the unsafety of the interface. To say that I don't have to use it is a bit disingenuous to me. Adding an unsafe API weakens the invariants of the type, making it more difficult to reason about and use. The idea that I can add elements that will be manually destructed but that are not yet constructed is a serious defect in my opinion. Any throwing operation I do between the creation of such elements and their construction is now unsafe.

For the record, I'm not trying to be disingeneous. Let me explain.

The invariant of the container remains the same no matter you do.

They do not. The most basic container invariant is RAII. That is, elements that have been constructed in the container will be destructed properly when the container is destructed, and -- importantly -- nothing else.

I could have been clearer: the invariant of the type is something static. An object of that type should maintain the invariant. And some of these special functions require you to maintain the invariant of the object momentarily.

...

But again, don't get me wrong -- I'm not opposed to doing these kinds of low-level hacks to get performance. For instance, move operations are unsafe in a similar fashion (though not as unsafe, IMO), and I use them all the time.

The issue to me is that these are high-level container types, and yet they have invariants that are less safe than std::array. I feel they are not sufficiently justified in such a container.

By the way, there are other ways to get similar performance gains without sacrificing safety:

template <typename T> struct devector_guts { // low level unsafe stuff goes here, "here be dragons", etc.

T * elements_; size_t size_; size_t capacity_; };

template <typename T> struct devector { // only safe operations here

devector_guts steal_guts () &&; // returns guts; empties *this

private: devector_guts guts_; };

maybe a slightly more encapsulated approach would be: devector<T> devec; ... devector_access<T> devec_access( devec ); // a non-copyable class that stores reference to devector<T> devec_access.nongrowing_push_back( x ); devec_access.uninitialized_resize( k ); // etc So whenever you have to do something a bit out of the ordinary, you do it via devector_access. That type is then in seperate header and documented under advanced usage. ?

...

...
Well, there is nothing that requires you to use reserve_front/reserve_back. Just like there is no one that requires you to use reserve in vector.

Sure. I'm trying to point out that when I *do* want to use it, the current separate-and-fixed scheme is hard to use, because you must guess something close to the relative ratio of front and back pushes.

I don't get this. Why do one want to use it if one has no clue about what to expect?

...

It's not a criticism of devector at all. It's an open question for which I have no answer, meaning I have to 1) do a lot of homework (profiling, etc.) before knowing if I should even consider devector, and 2) weigh any possible gains in performance against loss of ability to reason about my code.

It is, however, a criticism of the library, which should do this profiling homework and proper encapsulation for me.

I agree that it would be good with some basic tests. Its hard to cover everything because there are so many situations. Its not something every Boost library is doing, take Boost.Container where I don't see any benchmarks.

...

...
[snip]

First, please demonstrate that I should care about those few instructions.

...
Second, I agree that the difference is slight, but I disagree that it's ok to make that slight change. One of the chief invariants of a standard container is that it manages its storage. This breaks that.

It's not an invariant of vector that push_back allocates memory.

Sure it is. 26.3.11.5/1 says:

[snip]

...

Remarks: Causes reallocation if the new size is greater than the old capacity. Reallocation invalidates all the references...

I think we operate with different definitions of (informal) invariants. In my world something is invariant if it always holds. push_back may allocate, but it certainly doesn't always. I would rather say that the postcondition for push_back is so and so. Anyway, this is not important ...

...

...
buffer.unsafe_uninitialized_resize_back(recvSize);

...
...
...
It changes the size of the container so you can't read uninitialized

...
values.

That's true, though some sort of span would accomplish the same thing with no loss of safety.

But then you just postponed the double initialization to when the span needs to be copied into a proper container!

Well, in the limited case where I copy it, sure. But that's also true if I copy buffer after calling

buffer.unsafe_uninitialized_resize_back(recvSize);

above. The point is you want to use the [0, recvSize) subsequence of buffer, and you don't need to have a unsafe_uninitialized_resize_back() member to do that.

Maybe you can make your span-based snippet a little more concrete? In particular, where does the memory come from? While doing so, assume that the memory has to outlive the function that calls recv() and that the correctly initialized span has to be iterated over at some point after the function that calls recv(). kind regards Thorsten

Hans Dembinski

26 Sep 26 Sep

4:09 p.m.

...

On 26. Sep 2017, at 03:39, Zach Laine via Boost <boost@lists.boost.org> wrote:

On Mon, Sep 25, 2017 at 12:17 PM, Thorsten Ottosen via Boost < boost@lists.boost.org <mailto:boost@lists.boost.org>> wrote:

...
For devector: -------------

Is it it not obvious that an O(1) push_front performs better than an O(n) vec.insert( vec.begin(), x ) ?

I find the use of the word "obvious" when applied to optimization to be a little suspect.

I wanted to say the same. I got my fair share of surprises when I benchmarked code that I expected to run faster than some other code. I can recommend Google Benchmark for micro-benchmarks, it is quite neat. https://github.com/google/benchmark Best regards, Hans

Soul Studios

27 Sep 27 Sep

2:18 a.m.

...

With std::deque you cannot control the segment size, and implementations differ widely. Is it not obvious that having a block size of 1 or 2 (as e.g. visual c++ does) is very fast.

Visual studio's implementation is almost brokenly slow compared to others.

...

That iteration over a deque can be slow has been known for two decades, Matt Austern wrote an article about it.

Not true. Libstdc++'s implementation iterates almost as fast as vector except for large structs (due to a fixed memory-based block size). http://www.plflib.org/benchmarks_haswell_gcc.htm#rawperformance M

Thorsten Ottosen

2:47 p.m.

Den 27-09-2017 kl. 04:18 skrev Soul Studios via Boost:

...

...
With std::deque you cannot control the segment size, and implementations differ widely. Is it not obvious that having a block size of 1 or 2 (as e.g. visual c++ does) is very fast.

Visual studio's implementation is almost brokenly slow compared to others.

I totally agree.

...

...
That iteration over a deque can be slow has been known for two decades, Matt Austern wrote an article about it.

Not true. Libstdc++'s implementation iterates almost as fast as vector except for large structs (due to a fixed memory-based block size). http://www.plflib.org/benchmarks_haswell_gcc.htm#rawperformance

Nice work, btw. Note I said "can" not "must". FWIW, batch_deque also uses a compile-time block size. kind regards Thorsten

Soul Studios

28 Sep 28 Sep

12:38 a.m.

...

...
...
With std::deque you cannot control the segment size, and implementations differ widely. Is it not obvious that having a block size of 1 or 2 (as e.g. visual c++ does) is very fast.

Visual studio's implementation is almost brokenly slow compared to others.

I totally agree.

Okay, cool - my interpretation of your prev writing was that you believed it was obvious that having a block size of 1 or 2 was fast - but obviously I'm missing something here.

...

...
...
That iteration over a deque can be slow has been known for two decades, Matt Austern wrote an article about it.

Not true. Libstdc++'s implementation iterates almost as fast as vector except for large structs (due to a fixed memory-based block size). http://www.plflib.org/benchmarks_haswell_gcc.htm#rawperformance

Nice work, btw.

Note I said "can" not "must". FWIW, batch_deque also uses a compile-time block size.

Ah - fair enough. And thanks.

Benedek Thaler

25 Sep 25 Sep

9:13 p.m.

On Mon, Sep 25, 2017 at 2:52 AM, Zach Laine via Boost <boost@lists.boost.org

...

wrote:

...

I vote to reject Boost.DoubleEnded.

Thanks for your review, I really appreciate your time.

...

My first concern is that this library is centered around runtime performance, yet there appears to be no measurement done to determine what -- or even whether -- optimization occurs when using this library instead of the standard containers.

I'm sure you've seen the `benchmark` documentation page, and the linked ML thread: https://lists.boost.org/Archives/boost/2016/02/227743.php I wasn't able to create a convincing micro-benchmark for this container: It is really hard for the general case. (esp. comparing to vector, which offers only a subset of devectors features) On the other hand, a macro-benchmark would require a use-case, which makes it subjective again. I don't think there's a point in comparing performance of a small vector vs standard vector, given that the developer got the `small` size right.

...

There are no standard containers for which capacity() == 16 implies that when there are three elements, an allocation will occur when I add a fourth. That's substantially odd. It means that as a user I must know about extrinsic state information (how many pushes were to the front or back) in order to predict memory usage and the number of allocations even roughly.

No need for extrinsic state info, there are the front_free_capacity() and back_free_capacity() members.

...

de::devector<int> reserved_devector(32, 16, de::reserve_only_tag{});

[snip]

That's not a compelling use case to me. Why can't I just do this:

std::vector<int> reserved_vector; vector.reserve(48);

The reserve_only_tag is a convenient shorthand I tend to miss.

...

For that matter, is the performance of devector better than std::deque when I don't know the relative frequency of front- and back-pushes? It seems in many cases that it would be quite a bit worse.

I do agree. If you have no clue, use a deque - except, in cases when you are not supposed to use anything more complicated (see: the internal map of segments of batch_deque is a devector), or contiguous storage is required.

...

A custom growth policy must have an integral growth factor, but a commonly used growth factor is 3/2.

The growth policy takes the old capacity, and returns the new. new = uint(3/2*old) is a possible implementation. I don't think float capacity makes sense.

...

I don't understand why should_shrink() is part of GrowthPolicy. If I'm growing the container, why am I worried about shrinking it?

Because next time you can avoid growing it.

...

Again, from the example docs:

// int sockfd = socket(...); connect(...); de::devector<char> buffer(256, de::unsafe_uninitialized_tag{}); long recvSize = recv(sockfd, buffer.data(), buffer.size(), 0);

if (recvSize >= 0) { buffer.unsafe_uninitialized_resize_back(recvSize); // process contents of buffer } else { /* handle error */ }

The unsafe resize does nothing in this case, since char has a trivial dtor. For nontrivially destructed types, we get RAII violations. I don't know how to use a type that does that.

One must explicitly call unsafe_uninitialized_resize_back again, after the exact size is known, avoiding the UB.

...

...
- What is your evaluation of the implementation?

I did not look.

...
- What is your evaluation of the documentation?

There are a lot of aspects of the docs I found to be unclear.

The Benchmarks section contains no benchmarks.

There is this:

" Reference stability is an important feature of the batch_deque. Inserting elements to the front or to the back of the sequence never invalidates any references. Moreover, a special insert operation is provided, stable_insert, which takes an iterator, as a hint, and inserts a sequence of elements (not necessary directly) before the element pointed by the hint, in a way that it doesn't invalidate references. "

I find that very difficult to understand. It seems to be saying that stable_insert() does not always insert where you wanted in the sequence, because the insertion point is just a hint. I think what is intended is that the insertion will not necessarily be contiguous (or at least I hope that's what is intended).

Perhaps the reference helps? http://erenon.hu/double_ended/boost/double_ended/batch_deque.html#idp4037601... Thanks, Benedek

Thorsten Ottosen

26 Sep 26 Sep

4:32 p.m.

Den 25-09-2017 kl. 23:13 skrev Benedek Thaler via Boost:

...

On Mon, Sep 25, 2017 at 2:52 AM, Zach Laine via Boost <boost@lists.boost.org

...
wrote:

...

...
There are no standard containers for which capacity() == 16 implies that when there are three elements, an allocation will occur when I add a fourth. That's substantially odd. It means that as a user I must know about extrinsic state information (how many pushes were to the front or back) in order to predict memory usage and the number of allocations even roughly.

No need for extrinsic state info, there are the front_free_capacity() and back_free_capacity() members.

1. vector's capacity says nothing about whether the next push_back will allocate 2. there are containers that allocate on every push_back/push_front, so why is this odd? 3. we have different options here. I don't suppose you object to the fact when a small buffer is specified, then the small buffer size is also the initial capacity? So the choices we have is where to distribute the capacity between front_free_capacity and back_free_capacity: A. divide it as evenly as possible B. maximize front_free_capacity C. maximize back_free_capacity Currently, devector does (C) and provide means for the user to choose the capacity at both ends. What would you prefer instead?

...

...
I don't understand why should_shrink() is part of GrowthPolicy. If I'm growing the container, why am I worried about shrinking it?

Because next time you can avoid growing it.

To add to that: shrinking involves allocating a new buffer and copying or moving elements to the new buffer, which may be very expensive. Say you feel this is not worth doing if the amount of reclaimed memory is less than 5 percent (i.e. the vector is at +95% capacity). kind regards -Thorsten

Zach Laine

5:29 p.m.

On Tue, Sep 26, 2017 at 11:32 AM, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

Den 25-09-2017 kl. 23:13 skrev Benedek Thaler via Boost:

...
On Mon, Sep 25, 2017 at 2:52 AM, Zach Laine via Boost < boost@lists.boost.org

...
wrote:

There are no standard containers for which capacity() == 16 implies that

...
...
when there are three elements, an allocation will occur when I add a fourth. That's substantially odd. It means that as a user I must know about extrinsic state information (how many pushes were to the front or back) in order to predict memory usage and the number of allocations even roughly.

No need for extrinsic state info, there are the front_free_capacity() and back_free_capacity() members.

1. vector's capacity says nothing about whether the next push_back will allocate

That's exactly what capacity says. See my earlier post, plus 26.3.11.3/1: size_type capacity() const noexcept; Returns: The total number of elements that the vector can hold without requiring reallocation.

...

2. there are containers that allocate on every push_back/push_front, so why is this odd?

Because capacity() indicates that it does not do this, in relation to the current value of size(), when I call push_back(). Removing capacity() and not calling devector a container would alleviate this particular concern.

...

3. we have different options here. I don't suppose you object to the fact when a small buffer is specified, then the small buffer size is also the initial capacity?

No.

...

So the choices we have is where to distribute the capacity between front_free_capacity and back_free_capacity:

A. divide it as evenly as possible B. maximize front_free_capacity C. maximize back_free_capacity

Currently, devector does (C) and provide means for the user to choose the capacity at both ends. What would you prefer instead?

I don't have a better alternative. I do find the current choice clunky.

...

I don't understand why should_shrink() is part of GrowthPolicy. If I'm

...
...
growing the container, why am I worried about shrinking it?

Because next time you can avoid growing it.

To add to that: shrinking involves allocating a new buffer and copying or moving elements to the new buffer, which may be very expensive.

Say you feel this is not worth doing if the amount of reclaimed memory is less than 5 percent (i.e. the vector is at +95% capacity).

Ah, thanks. That makes sense. Zach

Thorsten Ottosen

27 Sep 27 Sep

3:26 p.m.

Den 26-09-2017 kl. 19:29 skrev Zach Laine via Boost:

...

On Tue, Sep 26, 2017 at 11:32 AM, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...
Den 25-09-2017 kl. 23:13 skrev Benedek Thaler via Boost:

...
On Mon, Sep 25, 2017 at 2:52 AM, Zach Laine via Boost < boost@lists.boost.org

...
wrote:

There are no standard containers for which capacity() == 16 implies that

...
...
when there are three elements, an allocation will occur when I add a fourth. That's substantially odd. It means that as a user I must know about extrinsic state information (how many pushes were to the front or back) in order to predict memory usage and the number of allocations even roughly.

No need for extrinsic state info, there are the front_free_capacity() and back_free_capacity() members.

1. vector's capacity says nothing about whether the next push_back will allocate

That's exactly what capacity says. See my earlier post, plus 26.3.11.3/1:

size_type capacity() const noexcept; Returns: The total number of elements that the vector can hold without requiring reallocation.

It is only capacity() - size() > 0 that will tell you if the /next/ push_back needs to allocate. As Benedek said, front_free_capacity/back_free_capacity tells you explictly how many pushes you can perform before a reallocation. And these are the very functions that are used in the precondition of unsafe_push_xxx. That said, I don't think capacity is particular useful compared to front_free_capacity/back_free_capacity. OTOH, it may seem strange that you can't tell what actual contiguous buffer size actually is. At least, if capacity is going away, so should reserve. But then you have a less generic container to use with algorithms that expects those exact functions (say, a generic algorithm that expects an ContigiousBackInsertionSequence and which can be called with vector, devector etc). Do we really want to prohibit such well-defined usage?

...

...
3. we have different options here. I don't suppose you object to the fact when a small buffer is specified, then the small buffer size is also the initial capacity?

No.

...
So the choices we have is where to distribute the capacity between front_free_capacity and back_free_capacity:

A. divide it as evenly as possible B. maximize front_free_capacity C. maximize back_free_capacity

Currently, devector does (C) and provide means for the user to choose the capacity at both ends. What would you prefer instead?

I don't have a better alternative. I do find the current choice clunky.

Benedek, please correct me if I'm wrong, but does the implementation not precisely support the following: A. push_front on empty container: devector<int> dev; assert( dev.capacity() == 0 ); dev.push_front( 42 ); assert( dev.capacity() > 0 && dev.front_free_capacity() == dev.capacity() - 1 && dev.back_free_capacity() == 0 ); A. push_front on empty container: devector<int> dev; assert( dev.capacity() == 0 ); dev.push_back( 42 ); assert( dev.capacity() > 0 && dev.back_free_capacity() == dev.capacity() - 1 && dev.front_free_capacity() == 0 ); ? If so, do you (Zach) still feels this is clunky? Would it not be weirder if the container decided that after a push_back it would allocate space in the front? Maybe the confusion is in the constructors: devector<int> cont( back_cap, reserve_only_tag{} ); devector( front_cap, back_cap, reserve_only_tag[} ); They could be specified as: devector<int> cont( reserve_back_cap{n} ); devector<int> cont( reserve_front_cap{n} ); devector<int> cont( reserve_front_cap{n}, reserve_back_cap{m} ); ? As for the capacity when there is a small buffer present, it needs to have a default. I don't think its a good idea (in general) to start moving elements around instead of growing the buffer in the direction where the free capacity is empty. (I think I can justify that based based on examples that make the amortized O(1) complexity impossible to uphold if we do so). However, one may say, as long as you are below the capacity of the small buffer, please move objects around until the small buffer is full. Does that make sense? kind regards Thorsten

Fletcher, John P

25 Sep 25 Sep

9:26 p.m.

Thorsten I am starting to look at a review of Double Ended. I have succeeded to build the devector example using Clang 3.6 and 4.0 with their corresponding libc++. When I run the example I get only one line of output: Reversed lines: This does not tell me much at all and I have not found out what it is expected to do. It seems to read a file, which is not supplied or described. I have not found anything in the documentation. On another topic, I have not seen in the documentation any discussion of models for what is going on. I am aware of Chris Okasaki's book "Purely Function Data Structures" (1998) and a lot of subsequent literature. Does this work relate at all to previous work? Answers will help me to put together a review. Best wishes John Fletcher

Benedek Thaler

9:48 p.m.

On Sep 25, 2017 11:26 PM, "Fletcher, John P via Boost" < boost@lists.boost.org> wrote: When I run the example I get only one line of output: Reversed lines: This does not tell me much at all and I have not found out what it is expected to do. It seems to read a file, which is not supplied or described. I have not found anything in the documentation. Hi, It reads the standard input. Try piping in some lines. HTH, Benedek

Fletcher, John P

26 Sep 26 Sep

5:59 p.m.

________________________________ From: Benedek Thaler [thalerbenedek@gmail.com] Sent: 25 September 2017 22:48 To: boost@lists.boost.org Cc: boost-announce@lists.boost.org; Thorsten Ottosen; Fletcher, John P Subject: Re: [boost] [review] The review of Boost.DoubleEnded starts today: September 21 - September 30 On Sep 25, 2017 11:26 PM, "Fletcher, John P via Boost" <boost@lists.boost.org<mailto:boost@lists.boost.org>> wrote: When I run the example I get only one line of output: Reversed lines: This does not tell me much at all and I have not found out what it is expected to do. It seems to read a file, which is not supplied or described. I have not found anything in the documentation.

...

Hi, It reads the standard input. Try piping in some lines.

...

HTH, Benedek

Benedek Would you give me an example of what you mean? I have tried various things none of which make any sense. You are supposed to supply clear information. A review is not supposed to be a guessing game. It would help if the examples had clear information about expected outcomes. John

Benedek Thaler

9:15 p.m.

On Tue, Sep 26, 2017 at 7:59 PM, Fletcher, John P <j.p.fletcher@aston.ac.uk> wrote:

...

Benedek

Would you give me an example of what you mean? I have tried various things none of which make any sense. You are supposed to supply clear information. A review is not supposed to be a guessing game.

It would help if the examples had clear information about expected outcomes.

John

Hi John, Sorry for the confusion. The example/ directory hosts pieces of code suitable for documentation purposes, but the actual compiled example binaries has no meaningful output - it closes stdin before reading to keep testing simple. One need to take the code snippet shown in the docs and create a stand alone binary (or remove line 82 from doc_devector.cpp). Then it prints: echo -e "first\nsecond\nthird" | doc_devector Reversed lines: third second first Benedek

Tim Song

12:43 a.m.

Just some comments on devector for now, based on a quick read of the documentation and implementation. Design: I don't find the unsafe functions sufficiently motivated. Introducing footguns should require a compelling performance justification, but there are no actual benchmarks. The unsafe_push_* functions encourage heavy uses of reserve(), which, if done incorrectly, can easily prevent the growth policy from working. Moreover, the design is inconsistent. Why unsafe_push_* but not unsafe_emplace_*? What's the motivating use case for the unsafe_uninitialized functions that can't handled by a non-invariant-violating design like default_init? The sole example in the documentation is a devector<char> whose content will be filled by recv later, which can be handled equally well. Also, assuming that this container meets the allocator-aware container requirements, the user would have to construct and destroy the elements via allocator_traits<>::construct/destroy, not just placement new. Otherwise, your container's destructor would be asking its allocator to destroy something it didn't construct. I'm somewhat surprised that dropping the move_if_noexcept/strong exception safety dance doesn't seem to have been considered for a container that's supposed to be performance-oriented.

...

typedef pointer iterator; typedef const_pointer const_iterator;

Why are pointers being used as iterator types directly? Implementation:

...

class devector : Allocator

Even private derivation can interfere with overload resolution in unexpected and undesirable ways. You have enough data members around to leverage EBO without having to make devector itself derive from the allocator. https://github.com/erenon/double_ended/blob/master/include/boost/double_ende... incorrectly calls push_back rather than emplace_back. Insofar as I can determine, your clear() doesn't deallocate storage, so https://github.com/erenon/double_ended/blob/master/include/boost/double_ende... just overwrites the only allocator capable of deallocating the storage you are currently holding - without deallocating it first. More generally, implementation of allocator support requires substantial improvement. An allocator that doesn't propagate on X is not required to support X at all, but there's no handling for that in your code. Another example: construct takes raw pointers, not possibly fancy `pointer` like what you did here: https://github.com/erenon/double_ended/blob/master/include/boost/double_ende... The fast path for range-insert-at-front (https://github.com/erenon/double_ended/blob/master/include/boost/double_ende...) is broken based on a cursory inspection: vector<int> x = {1, 2, 3}; devector<int> y = {4, 5, 6}; y.reserve_front(10); y.insert(y.begin(), x.begin(), x.end()); // {3, 2, 1, 4, 5, 6} (!) Documentation: Does your push_back etc. really have "amortized linear" complexity? And doesn't that depend on the growth policy? How can clear() have the postcondition "empty() && front_free_capacity() == 0 && back_free_capacity() == small buffer size" yet not deallocate memory? The default growth policy is:

...

static size_type new_capacity(size_type capacity); Returns: 4 times the old capacity or 16 if it's 0.

Why those numbers? Were they arbitrarily chosen or based on some sort of benchmark? If so, what?

Thorsten Ottosen

5:14 p.m.

Hi Tim,

...

Just some comments on devector for now, based on a quick read of the documentation and implementation.

Design:

I don't find the unsafe functions sufficiently motivated. Introducing footguns should require a compelling performance justification, but there are no actual benchmarks. The unsafe_push_* functions encourage heavy uses of reserve(), which, if done incorrectly, can easily prevent the growth policy from working. Moreover, the design is inconsistent. Why unsafe_push_* but not unsafe_emplace_*?

The missing _emplace functions is surely an oversight. I the thread with Zach's review I tried to view this from a more philosophical viewpoint. What is safe and unsafe is highly subjective (and as I wrote in the other thread, it is a mistake to use the word "unsafe" in the function name"). But C++ is not memory safe like Java or C#, and there are tons of preconditions in C++ code that are only checked with debug assertions. Yet I don't think in general that C++ programs have more bugs than Java or C# programs. Take a look at various Boost libraries and they offer plenty of opportunity to shoot yourself in the foot. But we use them anyway because they offer functionality and performance that cannot be found in any other way. Said differently: all the alternatives are worse. As just one example: I can use Boost.Intrusive and it is surely harder to use than normal containers, but the performance is awesome and compared with all the hand-rolled, ad hoc alternatives, it's vastly superior. So it's easy to say something is hard and unsafe to use. But what about the alternative? I have personally been optimizing stuff that fell back on vector's push_back was not inlined. So I had to rewrite the code, effectively making an ad hoc container just so I could add elements to it efficiently in the loop that controlled the general performance of the program.

...

What's the motivating use case for the unsafe_uninitialized functions that can't handled by a non-invariant-violating design like default_init? The sole example in the documentation is a devector<char> whose content will be filled by recv later, which can be handled equally well.

Can you elaborate on what this design is exactly? Thanks.

...

Also, assuming that this container meets the allocator-aware container requirements, the user would have to construct and destroy the elements via allocator_traits<>::construct/destroy, not just placement new. Otherwise, your container's destructor would be asking its allocator to destroy something it didn't construct.

Indeed. If the documentation doesn't say so, it needs to be updated.

...

I'm somewhat surprised that dropping the move_if_noexcept/strong exception safety dance doesn't seem to have been considered for a container that's supposed to be performance-oriented.

For which container operation are you thinking about?

...

The default growth policy is:

...
static size_type new_capacity(size_type capacity); Returns: 4 times the old capacity or 16 if it's 0.

Why those numbers? Were they arbitrarily chosen or based on some sort of benchmark? If so, what?

In this connection, https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md may be relevant. It argues for a 1.5 growth factor. kind regards Thorsten

Tim Song

7:36 p.m.

On Tue, Sep 26, 2017 at 1:14 PM, Thorsten Ottosen via Boost <boost@lists.boost.org> wrote:

...

The missing _emplace functions is surely an oversight.

Fair enough.

...

I the thread with Zach's review I tried to view this from a more philosophical viewpoint. What is safe and unsafe is highly subjective (and as I wrote in the other thread, it is a mistake to use the word "unsafe" in the function name"). But C++ is not memory safe like Java or C#, and there are tons of preconditions in C++ code that are only checked with debug assertions. Yet I don't think in general that C++ programs have more bugs than Java or C# programs. Take a look at various Boost libraries and they offer plenty of opportunity to shoot yourself in the foot. But we use them anyway because they offer functionality and performance that cannot be found in any other way. Said differently: all the alternatives are worse. As just one example: I can use Boost.Intrusive and it is surely harder to use than normal containers, but the performance is awesome and compared with all the hand-rolled, ad hoc alternatives, it's vastly superior.

So it's easy to say something is hard and unsafe to use. But what about the alternative?

The problem is that the new hard/unsafe parts isn't justified by any hard performance numbers. We are talking about adding functions that make it more likely for users to do things wrong, possibly in a manner that's completely well-defined but massively inefficient (inappropriate reserves). The benefit from it is skipping a branch that the predictor probably gets right most of the time. On its face, this doesn't appear to me to be a substantial enough benefit to justify the cost. It is of course quite possible that I'm underestimating the benefit, but if so I'd like to see actual numbers proving it.

...

...
What's the motivating use case for the unsafe_uninitialized functions that can't handled by a non-invariant-violating design like default_init? The sole example in the documentation is a devector<char> whose content will be filled by recv later, which can be handled equally well.

Can you elaborate on what this design is exactly? Thanks.

The same design used in boost::container. You take a tag type that indicates that the elements should be default-initialized rather than value-initialized. For trivially default constructible types, this means that no initialization is actually performed. Of course if the type isn't trivially default constructible, it would still call the constructor. The question is: how often do such cases actually arise?

...

...
I'm somewhat surprised that dropping the move_if_noexcept/strong exception safety dance doesn't seem to have been considered for a container that's supposed to be performance-oriented.

For which container operation are you thinking about?

Anything that reallocates, really. Basically, unconditionally move during reallocation instead of move_if_noexcept, and if the move throws you only get basic exception safety.

...

...
The default growth policy is:

...
static size_type new_capacity(size_type capacity); Returns: 4 times the old capacity or 16 if it's 0.

Why those numbers? Were they arbitrarily chosen or based on some sort of benchmark? If so, what?

In this connection,

https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md

may be relevant. It argues for a 1.5 growth factor.

Indeed. I've seen 1.5. I've seen 2. 4, on the other hand, I haven't seen before, so I'm curious if there's anything behind that choice.

Thorsten Ottosen

27 Sep 27 Sep

6:20 p.m.

Den 26-09-2017 kl. 21:36 skrev Tim Song via Boost:

...

On Tue, Sep 26, 2017 at 1:14 PM, Thorsten Ottosen via Boost

...

...
So it's easy to say something is hard and unsafe to use. But what about the alternative?

The problem is that the new hard/unsafe parts isn't justified by any hard performance numbers. We are talking about adding functions that make it more likely for users to do things wrong, possibly in a manner that's completely well-defined but massively inefficient (inappropriate reserves). The benefit from it is skipping a branch that the predictor probably gets right most of the time. On its face, this doesn't appear to me to be a substantial enough benefit to justify the cost. It is of course quite possible that I'm underestimating the benefit, but if so I'd like to see actual numbers proving it.

No disagreement.

...

...
...
What's the motivating use case for the unsafe_uninitialized functions that can't handled by a non-invariant-violating design like default_init? The sole example in the documentation is a devector<char> whose content will be filled by recv later, which can be handled equally well.

Can you elaborate on what this design is exactly? Thanks.

The same design used in boost::container. You take a tag type that indicates that the elements should be default-initialized rather than value-initialized. For trivially default constructible types, this means that no initialization is actually performed.

Ok. So it would only work with trivially default constructible types, right? That is, if the type is not trivially default constructible, you get double initialization. In a serialization library you might rely on constructors that take an archive object and initializes the object.

...

Of course if the type isn't trivially default constructible, it would still call the constructor. The question is: how often do such cases actually arise?

Hard to say, but not often.

...

...
...
I'm somewhat surprised that dropping the move_if_noexcept/strong exception safety dance doesn't seem to have been considered for a container that's supposed to be performance-oriented.

For which container operation are you thinking about?

Anything that reallocates, really. Basically, unconditionally move during reallocation instead of move_if_noexcept, and if the move throws you only get basic exception safety.

Does Boost.Container do this? Would it not make generic code harder to reason about?

...

...
...
The default growth policy is:

...
static size_type new_capacity(size_type capacity); Returns: 4 times the old capacity or 16 if it's 0.

Why those numbers? Were they arbitrarily chosen or based on some sort of benchmark? If so, what?

I think we might have thought about the case where the class is used as a local buffer. There you may want to minimize reallocations and not worry about excessive memory use because the memory is reclaimed soon enough. kind regards Thorsten

Tim Song

7:39 p.m.

On Wed, Sep 27, 2017 at 2:20 PM, Thorsten Ottosen via Boost <boost@lists.boost.org> wrote:

...

...
...
...
I'm somewhat surprised that dropping the move_if_noexcept/strong exception safety dance doesn't seem to have been considered for a container that's supposed to be performance-oriented.

For which container operation are you thinking about?

Anything that reallocates, really. Basically, unconditionally move during reallocation instead of move_if_noexcept, and if the move throws you only get basic exception safety.

Does Boost.Container do this? Would it not make generic code harder to reason about?

I haven't checked, but I'd be surprised if boost::container::vector offers less exception safety than std::vector, even if the stronger exception safety is arguably a design error that had to be retained for backward compatibility. There is, however, no standard container named "devector", so there might be some room for change there. Anyway, my main point, which I perhaps didn't express too well, is that "introduce unsafe functions to save a branch" and "have no way to avoid expensive copies if the move constructor can potentially throw" are inconsistent.

Thorsten Ottosen

28 Sep 28 Sep

5:14 p.m.

Den 27-09-2017 kl. 21:39 skrev Tim Song via Boost:

...

On Wed, Sep 27, 2017 at 2:20 PM, Thorsten Ottosen via Boost <boost@lists.boost.org> wrote:

...

Anyway, my main point, which I perhaps didn't express too well, is that "introduce unsafe functions to save a branch" and "have no way to avoid expensive copies if the move constructor can potentially throw" are inconsistent.

That's true. We should do both. kind regards -T

Ion Gaztañaga

27 Sep 27 Sep

7:54 p.m.

On 27/09/2017 20:20, Thorsten Ottosen via Boost wrote:

...

...
Anything that reallocates, really. Basically, unconditionally move during reallocation instead of move_if_noexcept, and if the move throws you only get basic exception safety.

Does Boost.Container do this? Would it not make generic code harder to reason about?

Yes. See: http://www.boost.org/doc/libs/1_65_1/doc/html/container/Cpp11_conformance.ht... Best, Ion

degski

5:03 a.m.

On 26 September 2017 at 20:14, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

In this connection,

https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md

may be relevant. It argues for a 1.5 growth factor.

Actually, no need to look at something like FBVector, the VS2017 STL std::vector implementation uses the following growth policy: capacity += capacity / 2; I'm not sure how that used to be, on the MSDN blog (not that long ago) it was noted that an overhaul of std::vector was implemented. I think I remember it used to duplicate before (but cannot verify that anymore). degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Soul Studios

5:12 a.m.

...

...
In this connection,

https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md

may be relevant. It argues for a 1.5 growth factor.

That has pretty much been refuted by third party testing as far as I'm aware. In my personal experience I found no performance difference between fbvector and libstc++'s.

...

Actually, no need to look at something like FBVector, the VS2017 STL std::vector implementation uses the following growth policy:

capacity += capacity / 2;

I'm not sure how that used to be, on the MSDN blog (not that long ago) it was noted that an overhaul of std::vector was implemented. I think I remember it used to duplicate before (but cannot verify that anymore).

It used to be closer to 1.5 - I'm glad they changed it. https://stackoverflow.com/questions/12271017/initial-capacity-of-vector-in-c...

Thorsten Ottosen

6:23 p.m.

Den 27-09-2017 kl. 07:12 skrev Soul Studios via Boost:

...

...
...
In this connection,

https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md

may be relevant. It argues for a 1.5 growth factor.

That has pretty much been refuted by third party testing as far as I'm aware. In my personal experience I found no performance difference between fbvector and libstc++'s.

Any source on that?

...

...
Actually, no need to look at something like FBVector, the VS2017 STL std::vector implementation uses the following growth policy:

capacity += capacity / 2;

I'm not sure how that used to be, on the MSDN blog (not that long ago) it was noted that an overhaul of std::vector was implemented. I think I remember it used to duplicate before (but cannot verify that anymore).

It used to be closer to 1.5 - I'm glad they changed it.

Seems like its still 1.5 to me. kind regards Thorsten

Soul Studios

28 Sep 28 Sep

7:10 a.m.

...

...
That has pretty much been refuted by third party testing as far as I'm aware. In my personal experience I found no performance difference between fbvector and libstc++'s.

Any source on that?

The first, yes: this is worth reading through, including the links: https://www.reddit.com/r/cpp/comments/2ezwee/stdvector_optimization_by_faceb... The second, just early testing on my part. I didn't publish any of it. But certainly there was no significant difference in my testing.

...

...
...
Actually, no need to look at something like FBVector, the VS2017 STL std::vector implementation uses the following growth policy:

capacity += capacity / 2;

I'm not sure how that used to be, on the MSDN blog (not that long ago) it was noted that an overhaul of std::vector was implemented. I think I remember it used to duplicate before (but cannot verify that anymore).

It used to be closer to 1.5 - I'm glad they changed it.

Seems like its still 1.5 to me.

Sorry, I misread degski's message-

degski

8:12 a.m.

This: boost::double_ended::devector<std::uint64_t> v ( 16, 123 ); does not work on both Clang/LLVM-6.0 and VS2017 on Windows. One has to explicitely cast 123 to std::uint64_t, i.e. boost::double_ended::devector<std::uint64_t> v ( 16, ( std::uint64_t ) 123 ); does work. This is inconsistent with std::vector, which quite happily implicitely converts the int to std::uint64_t. Clang has this to say about it: 1>z:\vc\x64\include\boost/double_ended/devector.hpp(414): error : no matching function for call to 'distance' 1>pfr.cpp(70): note: in instantiation of function template specialization 'boost::double_ended::devector<unsigned long long, boost::double_ended::small_buffer_size<0>, boost::double_ended::devector_growth_policy, std::allocator<unsigned long long> >::devector<int, 0>' requested here 1>C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.11.25503\include\xutility(1104): note: candidate template ignored: substitution failure [with _InIt = int]: no type named 'difference_type' in 'std::iterator_traits<int>' 1>In file included from pfr.cpp:44: 1>z:\vc\x64\include\boost/double_ended/devector.hpp(417): error : no matching function for call to 'distance' 1>C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Tools\MSVC\14.11.25503\include\xutility(1104): note: candidate template ignored: substitution failure [with _InIt = int]: no type named 'difference_type' in 'std::iterator_traits<int>' 1>In file included from pfr.cpp:44: 1>z:\vc\x64\include\boost/double_ended/devector.hpp(2188): error : indirection requires pointer operand ('int' invalid) 1>z:\vc\x64\include\boost/double_ended/devector.hpp(2178): note: in instantiation of function template specialization 'boost::double_ended::devector<unsigned long long, boost::double_ended::small_buffer_size<0>, boost::double_ended::devector_growth_policy, std::allocator<unsigned long long> >::opt_copy<int, boost::double_ended::detail::null_construction_guard>' requested here 1>z:\vc\x64\include\boost/double_ended/devector.hpp(2655): note: in instantiation of function template specialization 'boost::double_ended::devector<unsigned long long, boost::double_ended::small_buffer_size<0>, boost::double_ended::devector_growth_policy, std::allocator<unsigned long long> >::opt_copy<int>' requested here 1>z:\vc\x64\include\boost/double_ended/devector.hpp(419): note: in instantiation of function template specialization 'boost::double_ended::devector<unsigned long long, boost::double_ended::small_buffer_size<0>, boost::double_ended::devector_growth_policy, std::allocator<unsigned long long> >::construct_from_range<int>' requested here 1>pfr.cpp(70): note: in instantiation of function template specialization 'boost::double_ended::devector<unsigned long long, boost::double_ended::small_buffer_size<0>, boost::double_ended::devector_growth_policy, std::allocator<unsigned long long> >::devector<int, 0>' requested here degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Benedek Thaler

5:04 p.m.

On Thu, Sep 28, 2017 at 10:12 AM, degski via Boost <boost@lists.boost.org> wrote:

...

This:

boost::double_ended::devector<std::uint64_t> v ( 16, 123 );

does not work on both Clang/LLVM-6.0 and VS2017 on Windows. One has to explicitely cast 123 to std::uint64_t, i.e.

boost::double_ended::devector<std::uint64_t> v ( 16, ( std::uint64_t ) 123 );

does work.

Noted, thanks. Benedek

Benedek Thaler

26 Sep 26 Sep

9:06 p.m.

On Tue, Sep 26, 2017 at 2:43 AM, Tim Song via Boost <boost@lists.boost.org> wrote:

...

Just some comments on devector for now, based on a quick read of the documentation and implementation.

Thanks for your time, really good points below.

...

[snip]

...
typedef pointer iterator; typedef const_pointer const_iterator;

Why are pointers being used as iterator types directly?

To keep simple things simple. What's wrong with pointers?

...

Implementation:

...
class devector : Allocator

Even private derivation can interfere with overload resolution in unexpected and undesirable ways. You have enough data members around to leverage EBO without having to make devector itself derive from the allocator.

Noted.

...

https://github.com/erenon/double_ended/blob/master/ include/boost/double_ended/devector.hpp#L403 incorrectly calls push_back rather than emplace_back.

Bug, noted.

...

Insofar as I can determine, your clear() doesn't deallocate storage, so https://github.com/erenon/double_ended/blob/master/ include/boost/double_ended/devector.hpp#L570 just overwrites the only allocator capable of deallocating the storage you are currently holding - without deallocating it first.

Bug again, noted.

...

More generally, implementation of allocator support requires substantial improvement. An allocator that doesn't propagate on X is not required to support X at all, but there's no handling for that in your code. Another example: construct takes raw pointers, not possibly fancy `pointer` like what you did here: https://github.com/erenon/double_ended/blob/master/ include/boost/double_ended/devector.hpp#L2086

`pointer` is defined by allocator_traits. Couldn't that be a fancy pointer, if the Allocator defines it so?

...

The fast path for range-insert-at-front (https://github.com/erenon/double_ended/blob/master/ include/boost/double_ended/devector.hpp#L2496) is broken based on a cursory inspection:

vector<int> x = {1, 2, 3}; devector<int> y = {4, 5, 6}; y.reserve_front(10); y.insert(y.begin(), x.begin(), x.end()); // {3, 2, 1, 4, 5, 6} (!)

Good catch, ugly bug. Noted.

...

Documentation:

Does your push_back etc. really have "amortized linear" complexity? And doesn't that depend on the growth policy?

It should be amortized constant, and it does depend on the growth policy.

...

How can clear() have the postcondition "empty() && front_free_capacity() == 0 && back_free_capacity() == small buffer size" yet not deallocate memory?

This is a typo. should be `back_free_capacity() == old capacity.`

...

The default growth policy is:

...
static size_type new_capacity(size_type capacity); Returns: 4 times the old capacity or 16 if it's 0.

Why those numbers? Were they arbitrarily chosen or based on some sort of benchmark? If so, what?

The numbers are arbitrarily chosen - probably need to be changed. Thanks for all the good spots. Benedek

Tim Song

9:20 p.m.

On Tue, Sep 26, 2017 at 5:06 PM, Benedek Thaler via Boost <boost@lists.boost.org> wrote:

...

On Tue, Sep 26, 2017 at 2:43 AM, Tim Song via Boost <boost@lists.boost.org> wrote:

...
[snip]

...
typedef pointer iterator; typedef const_pointer const_iterator;

Why are pointers being used as iterator types directly?

To keep simple things simple. What's wrong with pointers?

Using pointers directly makes it easier to write buggy code. A custom iterator type, even if just a thin wrapper, provides more type safety.

...

...
More generally, implementation of allocator support requires substantial improvement. An allocator that doesn't propagate on X is not required to support X at all, but there's no handling for that in your code. Another example: construct takes raw pointers, not possibly fancy `pointer` like what you did here: https://github.com/erenon/double_ended/blob/master/ include/boost/double_ended/devector.hpp#L2086

`pointer` is defined by allocator_traits. Couldn't that be a fancy pointer, if the Allocator defines it so?

allocate/deallocate use (possibly fancy) `pointer`. construct/destroy always use raw pointers, never fancy pointers.

Daniel James

11:08 p.m.

On 26 September 2017 at 22:20, Tim Song via Boost <boost@lists.boost.org> wrote:

...

On Tue, Sep 26, 2017 at 5:06 PM, Benedek Thaler via Boost <boost@lists.boost.org> wrote:

...
On Tue, Sep 26, 2017 at 2:43 AM, Tim Song via Boost <boost@lists.boost.org> wrote:;

...
Why are pointers being used as iterator types directly?

To keep simple things simple. What's wrong with pointers?

Using pointers directly makes it easier to write buggy code. A custom iterator type, even if just a thin wrapper, provides more type safety.

That's a pretty slim advantage, I'd value simplicity more. It's very much a judgement call.

...

...
...
More generally, implementation of allocator support requires substantial improvement. An allocator that doesn't propagate on X is not required to support X at all, but there's no handling for that in your code. Another example: construct takes raw pointers, not possibly fancy `pointer` like what you did here: https://github.com/erenon/double_ended/blob/master/ include/boost/double_ended/devector.hpp#L2086

`pointer` is defined by allocator_traits. Couldn't that be a fancy pointer, if the Allocator defines it so?

allocate/deallocate use (possibly fancy) `pointer`. construct/destroy always use raw pointers, never fancy pointers.

The allocator support is pretty bad, but that's fixable and doesn't reflect on the design of the containers. It's specialist knowledge, and it's not that long since most (or even all?) standard libraries made a pig's ear of it. I can help improve the allocator support if wanted.<div id="DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2"><br /> <table style="border-top: 1px solid #D3D4DE;"> <tr> <td style="width: 55px; padding-top: 13px;"><a href="https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail" target="_blank"><img src="https://ipmcdn.avast.com/images/icons/icon-envelope-tick-round-orange-animat..." alt="" width="46" height="29" style="width: 46px; height: 29px;" /></a></td> <td style="width: 470px; padding-top: 12px; color: #41424e; font-size: 13px; font-family: Arial, Helvetica, sans-serif; line-height: 18px;">Virus-free. <a href="https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail" target="_blank" style="color: #4453ea;">www.avast.com</a> </td> </tr> </table><a href="#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2" width="1" height="1"></a></div>

Thorsten Ottosen

27 Sep 27 Sep

6:25 p.m.

Den 27-09-2017 kl. 01:08 skrev Daniel James via Boost:

...

On 26 September 2017 at 22:20, Tim Song via Boost <boost@lists.boost.org> wrote:

...
On Tue, Sep 26, 2017 at 5:06 PM, Benedek Thaler via Boost <boost@lists.boost.org> wrote:

...
On Tue, Sep 26, 2017 at 2:43 AM, Tim Song via Boost <boost@lists.boost.org> wrote:;

...
Why are pointers being used as iterator types directly?

To keep simple things simple. What's wrong with pointers?

Using pointers directly makes it easier to write buggy code. A custom iterator type, even if just a thin wrapper, provides more type safety.

That's a pretty slim advantage, I'd value simplicity more. It's very much a judgement call.

I think Boost.Container uses a simple wrapper around T*, but at the same time has a macro that can be defined to turn it into pointers again. I would expect that some code does things differently if they are passed two pointers instead of two iterators. kind regards Thorsten

degski

26 Sep 26 Sep

6:33 a.m.

Playing around with devector, I find the default growth strategy (x4) rather (too) agressive. It also has some in my view un-intended effects. Starting from let's say initial front and back capacities of each 4, now adding 5 elements in front and 5 elements in the back (so 10), now gives me a devector with capacity 128 (wow, why bothr with SBO at all). Then looking how the free_capacity is split between front and back we get 95 and 23 (or the other way around, depending on whether we first push_back or first push_front, even when front- and back-pushes are interleaved). As we are multiplying by 4, this will become more and more skewed. I think capacity() should go, and replaced by (next to the free versions) front_capacity() and back_capacity(). I think the template allocator parameter should be second, not last. Some bikeshedding (tm). I don't like any of the names much. The batch_deque *is* a deque, it should be called that way. devector is a hybrid vector/deque, I think veque would capture that. Then as to double_ended, I think it's too long and describes badly what we are talking about. I haven't got any (much) better proposal, but was thinking of Boost.Que, nice and short. i.e. boost::que::veque and boost::que::deque. degski

degski

7:50 a.m.

On 26 September 2017 at 09:33, degski <degski@gmail.com> wrote:

...

Then looking how the free_capacity is split between front and back we get 95 and 23 (or the other way around, depending on whether we first push_back or first push_front, even when front- and back-pushes are interleaved).

Maybe a rebalance ( size_type hint ) member function could address this issue and relocate the "zero-point" to the hinted at position (possible safe and unsafe). A clear ( size_type hint ) could do the same on clearing. degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

27 Sep 27 Sep

6:28 p.m.

Den 26-09-2017 kl. 09:50 skrev degski via Boost:

...

On 26 September 2017 at 09:33, degski <degski@gmail.com> wrote:

...
Then looking how the free_capacity is split between front and back we get 95 and 23 (or the other way around, depending on whether we first push_back or first push_front, even when front- and back-pushes are interleaved).

Maybe a rebalance ( size_type hint ) member function could address this issue and relocate the "zero-point" to the hinted at position (possible safe and unsafe). A clear ( size_type hint ) could do the same on clearing.

It's definitely something to ponder. At the moment there is an asymmetry between clear and the expectation of where to push things next. kind regards Thorsten

Thorsten Ottosen

26 Sep 26 Sep

5:30 p.m.

Hi Degski, Thanks for your time.

...

Playing around with devector, I find the default growth strategy (x4) > rather (too) agressive. It also has some in my view un-intended effects. Starting from let's say initial front and back capacities of each 4, now adding 5 elements in front and 5 elements in the back (so 10), now gives me a devector with capacity 128 (wow, why bothr with SBO at all).

There is room for improvement here. https://github.com/facebook/folly/blob/master/folly/docs/FBVector.md recommends 1.5. But if I used it as a local buffer, a bigger value might be warranted (so at least the user can choose).

...

Then looking how the free_capacity is split between front and back we get 95 and 23 (or the other way around, depending on whether we first push_back or first push_front, even when front- and back-pushes are interleaved). As we are multiplying by 4, this will become more and more skewed.

Yeah, but would it not become more and more skewed no matter what number

...

1 we multiply with?

Is there any way to avoid that? Is it desirable to avoid that? The analog for vector is that it has (with a growth factor of 2) on average 25% unused objects. As the container grows, so does the wasted space in absolute terms.

...

Some bikeshedding (tm). I don't like any of the names much. The batch_deque *is* a deque, it should be called that way. devector is a hybrid vector/deque, I think veque would capture that. Then as to double_ended, I think it's too long and describes badly what we are talking about. I haven't got any (much) better proposal, but was thinking of Boost.Que, nice and short.

i.e. boost::que::veque and boost::que::deque.

I think it's great to do some bikeshedding. Keep doing that. kind regards Thorsten

Thorsten Ottosen

27 Sep 27 Sep

6:50 p.m.

Den 26-09-2017 kl. 19:30 skrev Thorsten Ottosen via Boost:

...

Hi Degski,

...

...
we are multiplying by 4, this will become more and more skewed.

Yeah, but would it not become more and more skewed no matter what number

...
1 we multiply with?

Is there any way to avoid that? Is it desirable to avoid that? The analog for vector is that it has (with a growth factor of 2) on average 25% unused objects. As the container grows, so does the wasted space in absolute terms.

Let's try something different. Let's say the growth factor is 2. Your example was: "Starting from let's say initial front and back capacities of each 4, now adding 5 elements in front and 5 elements in the back (so 10), now gives me a devector with capacity 128". With 2 as the growth factor we get 8 * 2 = 16 16 * 2 = 32; the front_free_capacity is 15 the back_free_capacity is 7 Now, this is when the new capacity is defined as 2 * old capacity. What if we use the following instead? Capacity = old capacity + size; We get 8 + 4 = 12 12 + 9 = 21 the front_free_capacity is 8 the back_free_capacity is 3 so doesn't change much in terms of skewness. We could start moving things around instead of allocating, but I think that strategy can be manipulated to destroy the amortized O(1) property. kind regards Thorsten

Steven Watanabe

7 Oct 7 Oct

9:18 p.m.

AMDG On 09/26/2017 12:33 AM, degski via Boost wrote:

...

Some bikeshedding (tm). I don't like any of the names much. The batch_deque *is* a deque, it should be called that way. devector is a hybrid vector/deque, I think veque would capture that.

That doesn't make any sense to me. deque = double ended queue devector = double ended vector veque = vector queue? In Christ, Steven Watanabe

degski

8 Oct 8 Oct

6:40 a.m.

On 8 October 2017 at 00:18, Steven Watanabe via Boost <boost@lists.boost.org

...

wrote:

...

deque = double ended queue devector = double ended vector veque = vector queue?

More Sunday-morning bikeshedding (TM) then :-], on special request. The 'double-ended' in double-ended-queue (deque) in my view is already problematic. Any finite 2-dimensional object has 2 ends (including a std::vector). 'double ended' adds no information. I know it's got two ends, because I know it's 2-dimensional and finite. A std::deque *is* just a queue, and doesn't specify its' use cases. std::queue should be a std::fifo_queue, and then a std::stack would be a std::lifo_queue. std::priority_queue already fits in nicely in this scheme, where the bit in the name before the underscore stipulates the 'mechanics' of that particular queue. If one coud only change the world! :-) That's how I get to veque, de::devector is a queue (in the sense of std::deque, a queue in my view, see above) with most properties of a vector (contiguity/relocation etc...). So yes, a veque is a vector-queue, to answer your question. degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Louis Dionne

26 Sep 26 Sep

5:46 p.m.

...

Dear users and members of Boost,

I'm proud to announce that the formal review of Benedek Thaler's Boost.DoubleEnded library starts today, September 21, and runs through September 30. This library is the result of Benedek's participation in Google Summer of Code 2015.

This week is the CppCon conference, and many C++ minded individuals that may have wanted to participate to the review (including myself) will be absolutely incapable of doing so. In fact, they may even miss the fact that a review is ongoing (such as what happened to me until now). I would like to request that the review be extended for some time after the conference ends to give others the chance of participating if they want to. Does that seem reasonable? Thanks, Louis -- Sent from: http://boost.2283326.n4.nabble.com/Boost-Dev-f2600599.html

Ion Gaztañaga

27 Sep 27 Sep

8:41 p.m.

On 26/09/2017 19:46, Louis Dionne via Boost wrote:

...

This week is the CppCon conference, and many C++ minded individuals that may have wanted to participate to the review (including myself) will be absolutely incapable of doing so. In fact, they may even miss the fact that a review is ongoing (such as what happened to me until now).

I would like to request that the review be extended for some time after the conference ends to give others the chance of participating if they want to.

Does that seem reasonable?

I also would like to request an extension so that this weekend I can review the library. Best, Ion

Joaquin M López Muñoz

26 Sep 26 Sep

8:03 p.m.

El 21/09/2017 a las 19:38, Thorsten Ottosen via Boost escribió:

...

Dear users and members of Boost,

I'm proud to announce that the formal review of Benedek Thaler's Boost.DoubleEnded library starts today, September 21, and runs through September 30. This library is the result of Benedek's participation in Google Summer of Code 2015.

Hi, this is my review of Boost.DoubleEnded. As my review approach is heavily influenced by my acceptance vote, let me begin with that: I vote for conditional acceptance of both containers after they have been presented as separate components of Boost.Container and undergone significant changes, mostly along the line of simplifying their interface. A bit of elaboration: * A library for a familiy of containers (or any other classes) is justified when its members are related either by shared concepts, constructs and/or implementation. This is not the case of Boost.DoubleEnded, whose only general theme is the very lax notion of "double endianness", and for which the amount of common implementation code is small and mostly utility-level. We already have a bestiary of containers in Boost, namely Boost.Container, so I think both devector and batch_deque are better served if proposed as components of this latter libray. Code in boost::double_ended::detail can go to / merge into boost::container[::detail] --for instance, boost::double_ended::detail::allocator_traits seems to be redundant with boost::containter::allocator_traits. * The main characteristics of both containers are obscured by many "extras" that feel insufficiently justified to me or even detrimental (I'll discuss in more detail below), so my vote is for this fat to be removed prior to acceptance in Boost. I can see much work and love have gone into adding this many niceties, and the author could understansably resist to simplification, but libraries, in my opinion, should start with a minimal and well grounded design and add more stuff as demand proves their need. So, since I see this library as a pair of two unrelated classes, let me continue with the review for each separately. REVIEW FOR DEVECTOR DESIGN 1. The container is presented as a "hybrid of the standard vector and deque containers, as well as the small_vector of Boost.Container.". It's hard to see what the selling point is. To me, the main feature (and the only one worth retaining) of devector is amortized constant insertion at both ends with memory contiguity. So, I'd present devector as a "contiguous memory container with amortized constant insertion at both ends of the sequence". Note that this characterizes the container to the point of nearly fixing its implementation and (assuming STL style is followed) its interface. 2. SBO is orthogonal to devector's raison d'être (I mean, SBO can be added to any container) and violates the "don't pay for what you don't use" principle. I'd remove it. If someone asks for it, provide a separate small_devector, but I doubt there'll be huge demand for this (devector, to begin with, is likely to be used only in very specialized scenarios, so the probability that SFO comes in handy as well is, IMHO, vanishingly small). 3. Unsafe methods are poorly justified: if std::vector does not have them, why add them here? Such degree of unsafety should be backed up with hard data on its superior performance. I'd remove this bit. 4. Same goes for the growing policy. Not even boost::container::vector features a growing policy, which seems an indication that this is not a heavily demanded customization point. I'd remove it. 5. Defining size_type as unsigned int for the sake of class size seems to me overzealous. 6. reserve_only_tag construction looks like too much hassle to avoid default construction followed by proper reserving calls, which is what everybody already does with std::vector. Again, I'd remove this one. 7. Modulo the points discussed above, the interface of devector follows closely that of std::vector (with the obvious additions, e.g. [push|pop_emplace]_front), which is very good. The only interface decisions I'm not fully convinced about is capacity/ resize/reserve/shrink_to_fit without "front" or "back" qualification: - capacity() seems to be (from what I can infer reading the docs only) the sum of front capacity and back capacity. I can't find any sensible use of this info, and, besides, the careless reader could assume capacity() works as std::vector::capacity(), which raises usability concerns. My suggestion is to remove this. FWIW, we had a similar discussion about capacity() in Boost.PolyCollection, and that member function finally was dropped. - front_free_capacity/back_free_capacity could be more aptly named front_capacity/back_capacity. - resize and reserve as aliases to resize_back and reserve_back have usability problems as discussed with capacity. I'd remove them. - For symmetry, shrink_to_fit could be complemented with [front|back]_shrink_to_fit member functions --note I'm not proposing that shrink_to_fit be removed, as its semantics are perfectly clear. [SERIALIZATION] IMPLEMENTATION 1. I've taken just cursory look at the code, but everything looks very good in general. 2. I take that lines 2367-8 (and similar) in devector.hpp: // avoid invalidating any reference in args later T tmp(std::forward<Args>(args)...); are somehow related to LWG Issue 526, right? 3. I have issues with growing and capacity handling. If I'm reading the code right, growth factor is x4 (!!) and whether the new space is given to the front or back side depends on whether the insertion that triggered the expansion is closer to one or the other. x4 is much larger than any growing factor I've seen so far for std::vector (typically x1.5 or x2), I suggest this should be brought down to what's customary. As for the policy for giving the space to front or back, I think it makes sense but needs to be documented officially. I'm assuming that erasure behaves analogously (i.e. space is given back to the side that's closer to the erasure point) --in particular, inserting an element and erasing it immediately after should return front and back capacities to their original values. 4. I'm not comfortable with the fact that shifting on insertion turns to the opposite side when the favored side (the one closer to the insertion point) happens to be full. This makes it very hard to reason about post-insertion front and back capacities. Also, the reference statement that insertion is "linear in half the size of *this [...]" is currently not true (when the favored side is full insertion takes longer than that). I'd rather drop this optimization and document that shifting always happen to the direction where the least operations are needed. 5. I wonder why serialization code is so complicated: doesn't it suffice to rely on boost::serialization::stl::collection_load_impl|save_collection? 6. Tests look very good, with corner cases and all. DOCUMENTATION 1. If the extra features are removed, the design rationale would consequently be much shorter. 2. I confess I don't get what "devector implements the NAD (Not A Defect) resolution of LWG Issue 526" is about. 3. The examples section is 90% devoted to the extra features I advocate to suppress. 4. The benchmark section should go (IMHO). The display of assembly outputs is anecdotal at best, since it is quite plain to guess that devector::push_back should be basically the same code as std::vector::push_back. I don't think devector needs any benchmark backup to be bought by potential users. 5. The reference section is very good. Capacity handling and which side shifting goes on insertion/erasure should IMHO be documented exhaustively (see my points above on this matter). POTENTIAL USEFULNESS OF DEVECTOR I can only use my intuition here, but memory contiguity plus constant insertion at both ends looks a general enough proposition to meet potential users. FIFO queues jump to mind. Time will tell. OTHER 1. I didn't try to use the library. 2. I spent aroud four hours reviewing devector. 3. I have some experience writing STL-style containers. REVIEW FOR BATCH_DEQUE DESIGN 1. In the design rationale, batch_deque is presented as a deque with configurable segment size, which is true, but not the end of the story: segment access (segment_begin) is not mentioned, neither is stable insertion. 2. As for segment access, I think the chosen interface is very odd. From what I can gather after looking at the code and testing (because the reference does not document this), a segment_iterator dereferences to the first element of the pointed to segment and has additional data() and data_size() member functions to access the full segment. This is a sort of hybrid entity I fail to see the utility of (for instance, what can I do with referencing?). I have some experience with segmented structures and think a better approach would be for segment_iterator to dereference to a segment descriptor with begin() and end() member functions ranging over the segment elements. This way, one can write stuff like: for(auto it=d.segment_begin(),end=d.segment_end();it!=end;++it){ for(auto& element:*it){ // do something with element } } The thing can be further stylized by dropping segment_[begin|end] and having a segments() member functon returning an object with equivalent begin() and end() member functions, i.e., one would write d.segments().[begin|end]() rather than d.segment_[begin|end](), which allows us to use a range for in: for(auto segment:d.segments()){ for(auto& element:segment){ // do something with element } } 3. The question arises of whether segmented access can really gain us some performance I wrote a small test to measure the performance of plain std::for_each vs. an equivalent sequence of segment-level loops, and this is what I got in a

...

The library may be downloaded from

https://github.com/erenon/double_ended

and the documentation may be found here:

http://erenon.hu/double_ended/

Anyone is welcome to post a review and/or take part in subsequent discussions (see below for review guidelines).

Introduction ------------

This is a container library that provides two containers:

A. boost::devector

B. boost::batch_deque

Both containers provides efficient push_front and push_back operations, hence the name DoubleEnded.

The boost::devector class can be seen as an extension of std::vector with efficient support for push_front (so boost::devector provides contiguous storage). In addition, it provides a customizable small buffer optimization like boost::small_vector(*). It furthermore allows for customization of the growth factor such that the user can decide if a reallocation should allocate 1.5 or 2 or whatever more memory. And finally it provides new functionality that does not exist in normal vector implementations -- this functionality allow the class to excel in performance in certain situations like serialization or transfer of data from sources that require a preallocated buffer.

The boost::batch_deque is similar to std::deque, but with two important twists. Like std::deque, it provides reference stability when calling push_front/push_back. Reference stability is crucial for programs that uses intrusive containers e.g. using Boost.Intrusive (**). boost::batch_deque has the following main advantages:

1. It allows the user to specify the segment size

2. It allows for external iteration over the segments

Taken together, they provide a very efficient and flexible basis for intrusive containers. Access to the segments also increases performance for tasks such as serialization. Finally, boost::batch_deque may be seen as an efficient alternative to std::vector if you want to replace a percent-wise memory overhead with a constant memory overhead and have no need for contiguous storage.

Review guidelines -----------------

Please provide in your review whatever information you think is valuable to understand your final choice of ACCEPT or REJECT including Fit as a Boost library. Please be explicit about your decision.

Some other questions you might want to consider answering:

- What is your evaluation of the design? - What is your evaluation of the implementation? - What is your evaluation of the documentation? - What is your evaluation of the potential usefulness of the library? - Did you try to use the library? With which compiler(s)? Did you have any problems? - How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? - Are you knowledgeable about the problem domain?

More information about the Boost Formal Review Process can be found here: http://www.boost.org/community/reviews.html

Kind regards

Thorsten, Review manager

(*) boost::small_vector: https://tinyurl.com/ycslp4ot

(**) http://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html

_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost

Ion Gaztañaga

27 Sep 27 Sep

9:06 p.m.

On 26/09/2017 22:03, Joaquin M López Muñoz via Boost wrote:

...

4. Same goes for the growing policy. Not even boost::container::vector features a growing policy, which seems an indication that this is not a heavily demanded customization point.

True. But is on my todo list I was recently trying to add customization options to Boost.Container containers (I need to offer something different to my fans that the standard containers can't offer ;-). Usually the choice is usually between a factor of 2 and 1.5. The theoretical limit value for the growth factor to be able to reuse previously allocated memory is roughly 1.61. Howard Hinnant shared with my a nice paper about this more than ten years ago showing how to grow using a factor of 1.6 and the math behind this. I just reviewed that brief paper recently, and since it's not longer online, Howard kindly has given me permission to share it with the Boost community. Best, Ion

degski

28 Sep 28 Sep

6:02 a.m.

On 28 September 2017 at 00:06, Ion Gaztañaga via Boost < boost@lists.boost.org> wrote:

...

The theoretical limit value for the growth factor to be able to reuse previously allocated memory is roughly 1.61.

That figure smells suspiciously like the Golden Ratio <https://en.wikipedia.org/wiki/Golden_ratio>: 1.618033988749895 (unsurprisingly even, maybe, if one considers what the GR entails). degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

8:49 a.m.

Den 27-09-2017 kl. 23:06 skrev Ion Gaztañaga via Boost:

...

On 26/09/2017 22:03, Joaquin M López Muñoz via Boost wrote:

...
4. Same goes for the growing policy. Not even boost::container::vector features a growing policy, which seems an indication that this is not a heavily demanded customization point.

True. But is on my todo list I was recently trying to add customization options to Boost.Container containers (I need to offer something different to my fans that the standard containers can't offer ;-).

We do need some general agreement about what a boost library can do and not do. There is a large difference between saying "I'm not going to use this" or "I don't see the benefit" and saying "I want to prohibit legitimate use cases for other people".

...

Usually the choice is usually between a factor of 2 and 1.5. The theoretical limit value for the growth factor to be able to reuse previously allocated memory is roughly 1.61. Howard Hinnant shared with my a nice paper about this more than ten years ago showing how to grow using a factor of 1.6 and the math behind this. I just reviewed that brief paper recently, and since it's not longer online, Howard kindly has given me permission to share it with the Boost community.

Thanks for sharing this. There is another argument for 1.5 and that is that the casual users ends up with only about 16% wasted space on average compared to 25%. Reading through the argument, there are a few things I don't get. Is seems that the proof sort of assumes one container instance. And will the heap manager actually coalesce anything? kind regards -Thorsten

Howard Hinnant

12:06 p.m.

...

On Sep 28, 2017, at 4:49 AM, Thorsten Ottosen via Boost <boost@lists.boost.org> wrote:

Den 27-09-2017 kl. 23:06 skrev Ion Gaztañaga via Boost:

...
On 26/09/2017 22:03, Joaquin M López Muñoz via Boost wrote:

...
4. Same goes for the growing policy. Not even boost::container::vector features a growing policy, which seems an indication that this is not a heavily demanded customization point. True. But is on my todo list I was recently trying to add customization options to Boost.Container containers (I need to offer something different to my fans that the standard containers can't offer ;-).

We do need some general agreement about what a boost library can do and not do. There is a large difference between saying "I'm not going to use this" or "I don't see the benefit" and saying "I want to prohibit legitimate use cases for other people".

...
Usually the choice is usually between a factor of 2 and 1.5. The theoretical limit value for the growth factor to be able to reuse previously allocated memory is roughly 1.61. Howard Hinnant shared with my a nice paper about this more than ten years ago showing how to grow using a factor of 1.6 and the math behind this. I just reviewed that brief paper recently, and since it's not longer online, Howard kindly has given me permission to share it with the Boost community.

Thanks for sharing this.

There is another argument for 1.5 and that is that the casual users ends up with only about 16% wasted space on average compared to 25%.

Reading through the argument, there are a few things I don't get. Is seems that the proof sort of assumes one container instance. And will the heap manager actually coalesce anything?

Good questions, and to put this into perspective: When I wrote the CodeWarrior vector I used this argument to use a growth factor of 1.6. But several years later, when I wrote libc++, I went back to 2. Howard

Ion Gaztañaga

7:35 p.m.

On 28/09/2017 10:49, Thorsten Ottosen via Boost wrote:

...

Reading through the argument, there are a few things I don't get. Is seems that the proof sort of assumes one container instance. And will the heap manager actually coalesce anything?

DLMalloc coalesces. Several heap managers coalesce when the allocation is bigger than a limit. Boost.Interprocess allocators coalesce. I'm not sure what jemalloc and similar allocators do. Ion

Thorsten Ottosen

29 Sep 29 Sep

7:30 a.m.

Den 28-09-2017 kl. 21:35 skrev Ion Gaztañaga via Boost:

...

On 28/09/2017 10:49, Thorsten Ottosen via Boost wrote:

...
Reading through the argument, there are a few things I don't get. Is seems that the proof sort of assumes one container instance. And will the heap manager actually coalesce anything?

DLMalloc coalesces. Several heap managers coalesce when the allocation is bigger than a limit. Boost.Interprocess allocators coalesce.

I'm not sure what jemalloc and similar allocators do.

Yes, there seem to be many takes on this. Stuff like http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3536.html seem to suggest that a larger chunk is independent of smaller chunks. The argument says we are looking for \sum_{j=o}^{i-1} S[j] >= S[i+1] so the coalescing is supposed to happen between the i'th and (i+1)'th allocation. Now, if there are two vectors in play, or other data structures that uses memory, surely those leftovers can be coalesced into something that satisfy a larger growth factor. kind regards -Thorsten

Paul A. Bristow

28 Sep 28 Sep

11:31 a.m.

...

-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of Ion Gaztañaga via Boost Sent: 27 September 2017 22:06 To: Joaquin M López Muñoz via Boost Cc: Ion Gaztañaga Subject: Re: [boost] [review] The review of Boost.DoubleEnded starts today: September 21 - September 30

On 26/09/2017 22:03, Joaquin M López Muñoz via Boost wrote:

...
4. Same goes for the growing policy. Not even boost::container::vector features a growing policy, which seems an indication that this is not a heavily demanded customization point.

True. But is on my todo list I was recently trying to add customization options to Boost.Container containers (I need to offer something different to my fans that the standard containers can't offer ;-).

Usually the choice is usually between a factor of 2 and 1.5. The theoretical limit value for the growth factor to be able to reuse previously allocated memory is roughly 1.61. Howard Hinnant shared with my a nice paper about this more than ten years ago showing how to grow using a factor of 1.6 and the math behind this. I just reviewed that brief paper recently, and since it's not longer online, Howard kindly has given me permission to share it with the Boost community.

I can't read the attached fully, but it seems an significant calculation to document properly (or reference fully) in any software that is using a growth factor. Boost has the floating-point constant phi http://www.boost.org/doc/libs/1_65_1/libs/math/doc/html/math_toolkit/constan... but perhaps it needs to be integral-ish like 1.5? But the assumptions made by need to be documented and examined closely too? "In theory, there is no difference between theory and practice. But in practice, there's no similarity." Jan L A van de Snepscheut "It doesn't matter how beautiful your theory is, it doesn't matter how smart you are. If it doesn't agree with experiment, it's wrong." Richard P. Feynman Paul --- Paul A. Bristow Prizet Farmhouse Kendal UK LA8 8AB +44 (0) 1539 561830

degski

8 Oct 8 Oct

12:14 p.m.

On 28 September 2017 at 14:31, Paul A. Bristow via Boost < boost@lists.boost.org> wrote:

...

Boost has the floating-point constant phi

http://www.boost.org/doc/libs/1_65_1/libs/math/doc/html/ math_toolkit/constants.html

Expressed as a ratio: // The Golden Ratio, the ratio of the 2 largest consecutive Fibonacci numbers // representable in std::intmax_t... #ifdef _WIN64 using golden_ratio = std::ratio<7540113804746346429, 4660046610375530309>; #else using golden_ratio = std::ratio<1836311903, 1134903170>; #endif Which makes the definition of the Golden Ratio independent of the underlying (system defined) representation of reals, but the best possible. degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Paul A. Bristow

9 Oct 9 Oct

9:58 a.m.

...

-----Original Message----- From: Boost [mailto:boost-bounces@lists.boost.org] On Behalf Of degski via Boost Sent: 08 October 2017 13:14 To: boost Cc: degski Subject: Re: [boost] [review] The review of Boost.DoubleEnded starts today: September 21 - September 30

On 28 September 2017 at 14:31, Paul A. Bristow via Boost < boost@lists.boost.org> wrote:

...
Boost has the floating-point constant phi

http://www.boost.org/doc/libs/1_65_1/libs/math/doc/html/ math_toolkit/constants.html

Expressed as a ratio:

// The Golden Ratio, the ratio of the 2 largest consecutive Fibonacci numbers // representable in std::intmax_t...

#ifdef _WIN64 using golden_ratio = std::ratio<7540113804746346429, 4660046610375530309>; #else using golden_ratio = std::ratio<1836311903, 1134903170>; #endif

Which makes the definition of the Golden Ratio independent of the underlying (system defined) representation of reals, but the best possible.

Boost.Math constants also all provide the best possible representation of reals, but do it * portably over all platforms known to Boost, * at compile time, * constexpr, * and for all Boost.Multiprecision Real types from float up to thousands or decimal digits. But all this is massive overkill when people are uncertain whether 1.5 or 2 is best? When computing a change in capacity I would have thought something integral-ish like 2 or 3/2 would be an advantage? Paul --- Paul A. Bristow Prizet Farmhouse Kendal UK LA8 8AB +44 (0) 1539 561830

degski

3:31 p.m.

On 9 October 2017 at 12:58, Paul A. Bristow via Boost <boost@lists.boost.org

...

wrote:

...

Boost.Math constants also all provide the best possible representation of reals,

You don't like that definition? Cool, no problem, ignore it... Why *not* use boost if you can? degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Bjorn Reese

29 Sep 29 Sep

8:17 a.m.

On 09/27/2017 11:06 PM, Ion Gaztañaga via Boost wrote:

...

Usually the choice is usually between a factor of 2 and 1.5. The theoretical limit value for the growth factor to be able to reuse previously allocated memory is roughly 1.61. Howard Hinnant shared with

There are several factors in play, so we may need a slightly more sophisticated growth policy. Besides memory coalescing, there is also the performance concern about the average number of memory reallocation per added byte. When the capacity is low we may want to grow more aggressively, because the reallocation-per-byte concern outweighs the coalescing concern. As the capacity increases the coalescing concern will become more important and we would want to adjust the growth factor below the golden ratio.

Niall Douglas

1:25 p.m.

On 29/09/2017 09:17, Bjorn Reese via Boost wrote:

...

On 09/27/2017 11:06 PM, Ion Gaztañaga via Boost wrote:

...
Usually the choice is usually between a factor of 2 and 1.5. The theoretical limit value for the growth factor to be able to reuse previously allocated memory is roughly 1.61. Howard Hinnant shared with

There are several factors in play, so we may need a slightly more sophisticated growth policy.

Me personally, my biggest complaint with the STL containers is that you don't get direct control over when they call malloc or free. My great hope remains that for STL2 we get new STL2 containers which never reallocate without being asked to by the programmer. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

Glen Fernandes

2:34 p.m.

On Friday, September 29, 2017, Niall Douglas wrote:

...

Me personally, my biggest complaint with the STL containers is that you don't get direct control over when they call malloc or free.

Which standard library containers are you concerned about doing allocations beyond those allocations where they use the given Allocator instance and invoke its allocate() member function? Glen

Niall Douglas

5:04 p.m.

...

Me personally, my biggest complaint with the STL containers is that you don't get direct control over when they call malloc or free.

Which standard library containers are you concerned about doing allocations beyond those allocations where they use the given Allocator instance and invoke its allocate() member function?

My biggest peeve is std::vector. I absolutely hate that it can ever allocate new storage on its own. In my preferred STL2 design, a vector would only ever occupy its capacity which is given to it at the time of construction. It cannot change its capacity at all. You the programmer can, of course, create a new vector of larger capacity and have all the contents from the old vector moved/copied into the new vector. All the remaining STL2 containers are then built in layers atop this STL2 vector, muxed in with Ranges and Views. BTW STL1 containers don't go away. They're still available for when you don't care and just need something quick and dirty. And for legacy code of course. You can see some of my STL2 container ideas in AFIO v2 where memory mapped files are "malloc" and `afio::algorithm::mapped_view<T>` is a typed view of some mapped file storage. I haven't implemented `afio::algorithm::vector<T>` yet, but it's on the plan. One of the super cool things with this design is constant time vector extensions when the type has a trivial move and copy constructor because we simply map the underlying file storage to a new address. No memory copying needed. The other super cool thing is sparse storage, so you can allocate trillion element vectors without issue on any system. Niall -- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/

Soul Studios

4 Oct 4 Oct

9:47 p.m.

...

My biggest peeve is std::vector. I absolutely hate that it can ever allocate new storage on its own.

In my preferred STL2 design, a vector would only ever occupy its capacity which is given to it at the time of construction. It cannot change its capacity at all. You the programmer can, of course, create a new vector of larger capacity and have all the contents from the old vector moved/copied into the new vector.

How is this not avoided using reserve()?

Steven Watanabe

7 Oct 7 Oct

9:29 p.m.

AMDG On 09/26/2017 02:03 PM, Joaquin M López Muñoz via Boost wrote:

...

Also, the reference statement that insertion is "linear in half the size of *this [...]" is currently not true (when the favored side is full insertion takes longer than that).

Technically it is true, since "linear in half the size" is equivalent to "linear in the size."

...

I'd rather drop this optimization and document that shifting always happen to the direction where the least operations are needed.

In Christ, Steven Watanabe

Joaquin M López Muñoz

26 Sep 26 Sep

9:43 p.m.

El 21/09/2017 a las 19:38, Thorsten Ottosen via Boost escribió:

...

Dear users and members of Boost,

I'm proud to announce that the formal review of Benedek Thaler's Boost.DoubleEnded library starts today, September 21, and runs through September 30. This library is the result of Benedek's participation in Google Summer of Code 2015.

Hi, this is my review of Boost.DoubleEnded. As my review approach is heavily influenced by my acceptance vote, let me begin with that: I vote for conditional acceptance of both containers after they have been presented as separate components of Boost.Container and undergone significant changes, mostly along the line of simplifying their interface. A bit of elaboration: * A library for a familiy of containers (or any other classes) is justified when its members are related either by shared concepts, constructs and/or implementation. This is not the case of Boost.DoubleEnded, whose only general theme is the very lax notion of "double endianness", and for which the amount of common implementation code is small and mostly utility-level. We already have a bestiary of containers in Boost, namely Boost.Container, so I think both devector and batch_deque are better served if proposed as components of this latter libray. Code in boost::double_ended::detail can go to / merge into boost::container[::detail] --for instance, boost::double_ended::detail::allocator_traits seems to be redundant with boost::containter::allocator_traits. * The main characteristics of both containers are obscured by many "extras" that feel insufficiently justified to me or even detrimental (I'll discuss in more detail below), so my vote is for this fat to be removed prior to acceptance in Boost. I can see much work and love have gone into adding this many niceties, and the author could understansably resist to simplification, but libraries, in my opinion, should start with a minimal and well grounded design and add more stuff as demand proves their need. So, since I see this library as a pair of two unrelated classes, let me continue with the review for each separately. REVIEW FOR DEVECTOR DESIGN 1. The container is presented as a "hybrid of the standard vector and deque containers, as well as the small_vector of Boost.Container.". It's hard to see what the selling point is. To me, the main feature (and the only one worth retaining) of devector is amortized constant insertion at both ends with memory contiguity. So, I'd present devector as a "contiguous memory container with amortized constant insertion at both ends of the sequence". Note that this characterizes the container to the point of nearly fixing its implementation and (assuming STL style is followed) its interface. 2. SBO is orthogonal to devector's raison d'être (I mean, SBO can be added to any container) and violates the "don't pay for what you don't use" principle. I'd remove it. If someone asks for it, provide a separate small_devector, but I doubt there'll be huge demand for this (devector, to begin with, is likely to be used only in very specialized scenarios, so the probability that SFO comes in handy as well is, IMHO, vanishingly small). 3. Unsafe methods are poorly justified: if std::vector does not have them, why add them here? Such degree of unsafety should be backed up with hard data on its superior performance. I'd remove this bit. 4. Same goes for the growing policy. Not even boost::container::vector features a growing policy, which seems an indication that this is not a heavily demanded customization point. I'd remove it. 5. Defining size_type as unsigned int for the sake of class size seems to me overzealous. 6. reserve_only_tag construction looks like too much hassle to avoid default construction followed by proper reserving calls, which is what everybody already does with std::vector. Again, I'd remove this one. 7. Modulo the points discussed above, the interface of devector follows closely that of std::vector (with the obvious additions, e.g. [push|pop_emplace]_front), which is very good. The only interface decisions I'm not fully convinced about is capacity/ resize/reserve/shrink_to_fit without "front" or "back" qualification: - capacity() seems to be (from what I can infer reading the docs only) the sum of front capacity and back capacity. I can't find any sensible use of this info, and, besides, the careless reader could assume capacity() works as std::vector::capacity(), which raises usability concerns. My suggestion is to remove this. FWIW, we had a similar discussion about capacity() in Boost.PolyCollection, and that member function finally was dropped. - front_free_capacity/back_free_capacity could be more aptly named front_capacity/back_capacity. - reserve as an alias to reserve_back has usability problems as discussed with capacity. I'd remove them. - For symmetry, shrink_to_fit could be complemented with [front|back]_shrink_to_fit member functions --note I'm not proposing that shrink_to_fit be removed, as its semantics are perfectly clear. IMPLEMENTATION 1. I've taken just cursory look at the code, but everything looks very good in general. 2. I take that lines 2367-8 (and similar) in devector.hpp: // avoid invalidating any reference in args later T tmp(std::forward<Args>(args)...); are somehow related to LWG Issue 526, right? 3. I have issues with growing and capacity handling. If I'm reading the code right, growth factor is x4 (!!) and whether the new space is given to the front or back side depends on whether the insertion that triggered the expansion is closer to one or the other. x4 is much larger than any growing factor I've seen so far for std::vector (typically x1.5 or x2), I suggest this should be brought down to what's customary. As for the policy for giving the space to front or back, I think it makes sense but needs to be documented officially. I'm assuming that erasure behaves analogously (i.e. space is given back to the side that's closer to the erasure point) --in particular, inserting an element and erasing it immediately after should return front and back capacities to their original values. 4. I'm not comfortable with the fact that shifting on insertion turns to the opposite side when the favored side (the one closer to the insertion point) happens to be full. This makes it very hard to reason about post-insertion front and back capacities. Also, the reference statement that insertion is "linear in half the size of *this [...]" is currently not true (when the favored side is full insertion takes longer than that). I'd rather drop this optimization and document that shifting always happen to the direction where the least operations are needed. 5. I wonder why serialization code is so complicated: doesn't it suffice to rely on boost::serialization::stl::collection_load_impl|save_collection? 6. Tests look very good, with corner cases and all. DOCUMENTATION 1. If the extra features are removed, the design rationale would consequently be much shorter. 2. I confess I don't get what "devector implements the NAD (Not A Defect) resolution of LWG Issue 526" is about. 3. The examples section is 90% devoted to the extra features I advocate to suppress. 4. The benchmark section should go (IMHO). The display of assembly outputs is anecdotal at best, since it is quite plain to guess that devector::push_back should be basically the same code as std::vector::push_back. I don't think devector needs any benchmark backup to be bought by potential users. 5. The reference section is very good. Capacity handling and which side shifting goes on insertion/erasure should IMHO be documented exhaustively (see my points above on this matter). POTENTIAL USEFULNESS OF DEVECTOR I can only use my intuition here, but memory contiguity plus constant insertion at both ends looks a general enough proposition to meet potential users. FIFO queues jump to mind. Time will tell. OTHER 1. I didn't try to use the library. 2. I spent aroud four hours reviewing devector. 3. I have some experience writing STL-style containers. REVIEW FOR BATCH_DEQUE DESIGN 1. In the design rationale, batch_deque is presented as a deque with configurable segment size, which is true, but not the end of the story: segment access (segment_[begin|end]) is not mentioned, neither is stable insertion. 2. As for segment access, I think the chosen interface is very odd. From what I can gather after looking at the code and testing (because the reference does not document this), a segment_iterator dereferences to the first element of the pointed to segment and has additional data() and data_size() member functions to access the full segment. This is a sort of hybrid entity I fail to see the utility of (for instance, what can I do with referencing?). I have some experience with segmented structures and think a better approach would be for segment_iterator to dereference to a segment descriptor with begin() and end() member functions ranging over the segment elements. This way, one can write stuff like: for(auto it=d.segment_begin(),end=d.segment_end();it!=end;++it){ for(auto& element:*it){ // do something with element } } The thing can be further stylized by dropping segment_[begin|end] and having a segments() member functon returning an object with equivalent begin() and end() member functions, i.e., one would write d.segments().[begin|end]() rather than d.segment_[begin|end](), which allows us to use a range for in: for(auto segment:d.segments){ for(auto& element:segment){ // do something with element } } 3. The question arises of whether segment access can gain us some speed. I've written a small test to measure the performance of a plain std::for_each loop over a batch_deque vs. an equivalent sequence of segment-level loops (attached, batch_deque_for_each.cpp), and this is what I got for Visual C++ 2015 32-bit (x86) release mode in a Windows 7 64-bit box with an Intel Core i5-2520M @2.5GHz: [](int x){return x;} segment size: 32 n plain segmented 10E3 25.5472 23.6305 10E4 24.5778 23.6907 10E5 24.5821 22.8076 10E6 25.5007 23.1037 10E7 27.1452 24.0339 segment size: 512 n plain segmented 10E3 23.8384 23.6638 10E4 23.0284 23.8705 10E5 22.8449 22.8187 10E6 23.8485 23.7454 10E7 24.1711 23.5404 [](int x){return x%4?x:-x;} segment size: 32 n plain segmented 10E3 33.9795 23.6662 10E4 32.4817 24.023 10E5 32.8731 23.3803 10E6 33.5396 22.9298 10E7 33.1034 23.0206 segment size: 512 n plain segmented 10E3 25.0623 23.3205 10E4 25.1048 23.5812 10E5 25.3343 21.7686 10E6 25.6961 22.4639 10E7 25.8664 22.9964 (Time is nanoseconds, the lambda functions encapsulate the operation done on each element before accumulating the result.) So, there is some performance gain, though nothing spectacular. Whether this justifies providing segment access is not clear to me. 4. Stable insertion looks to me a nice feature without a real-world use case... Is there some more or less obvious application I'm missing? Otherwise, I'd drop it. 5. A default segment size of 512 looks too big: typical numbers for std::queue range in 8-16. Any reason for this? 6. Why a batch_deque_policy wrapper rather than directly providing the segment size to batch_queue as in boost::double_ended::batch_deque<int,512> ? 7. BTW, batch_deque_policy::segment_size should be defined as constexpr so as to fully honor the promise that "using a power of two increases performance because of the cheaper division and modulo operations". 8. As for devector, unsafe operations do not seem justified. 9. resize_[front|back] are IMHO a nice addition over std::deque in full accordance with its double-ended nature. I'm not so sure about reserve, capacity and shrink_to_fit, more so when the number of segments is expected to be much lower than the number of elements and segment index (map_t) reallocation should not affect overall performance significatively. Either way, the same concerns over naming and capacity handling apply here. 10. A bit of bikeshedding: what's the rationale for "batch_deque"? Not that I have a better name for this, though. IMPLEMENTATION 1. From a cursory look, everything looks clean and beatiful. 2. As with devector, why is serialization code is so complicated and does not simply rely on boost::serialization::stl::collection_load_impl|save_collection? 3. Tests look good. DOCUMENTATION 1. The design rationale, as commented above, does not really specify what a batch_deque is all about. 2. The reference is in general quite complete, but: 3. [const_]segment_iterator is not documented at all. 4. Postconditions on front and back capacity are not documented. 5. Maybe a performance section could be added showing how tuning the segment size can affect performance. POTENTIAL USEFULNESS OF BATCH_DEQUE Can't tell. Controlling segment size seems a potentially wanted feature for users of std::deque. Others with greater practical experience with this type of structures can have a more informed opinion. OTHER 1. I briefly used the batch_deque sub-library with MSVC 2015 for performance testing and some basic exploration of segment_[begin|end]. No problem foud. 2. I spent aroud three hours reviewing batch_deque. 3. I have some experience writing STL-style containers. SECTION ON BUILD AND TEST 1. I don't think this is really needed if the library finally makes it into Boost. Thanks to Benedek Thaler for his effort in writing and submitting this library, and to Thorsten in his dual role as mentor / review manager. Best, Joaquín M López Muñoz

degski

27 Sep 27 Sep

6:03 a.m.

On 27 September 2017 at 00:43, Joaquin M López Muñoz via Boost < boost@lists.boost.org> wrote:

...

... but libraries, in my opinion, should start with a minimal and well grounded design and add more stuff as demand proves their need.

This sounds good in theory, but doesn't work like that in practice. Libraries tend to get "stuck" in their inital state. As an example of this: I recently proposed to add variadic move construction for nodes to boost::pool (a trivial change). And even though this has obvious advantages (and would get rid for C++11 capable compilers of the use of M4 to generate the necessary code), this was received with: file a PR, including tests (?) and change to documentation. Enough to put me off, so I just changed it locally. Another example is the VS std::deque, where chunk size is the greater of 16 bytes and the size of one object and makes std::deque have std::list performance for anything other than chars, shorts and ints (luckily there's boost!!! It's the greater of 512 bytes and the size of one object b.t.w.). 4. Same goes for the growing policy. Not even boost::container::vector

...

features a growing policy, which seems an indication that this is not a heavily demanded customization point.

The fact that it's not there, does not proof anything and might be related to the point I raised in the above. This implementation of a vector <https://github.com/aguinet/pector>, which shares ideas (and has some other ones) with devector has a growth policy (although not as flexible and neat as devector). That implementation also provides, next to a multiplicative growth strategy, an additive growth strategy. The current devector design makes implementing this a trivial exercise.

...

5. Defining size_type as unsigned int for the sake of class size seems to me overzealous.

I dis-agree, the above pt::pector goes further in this respect and (can) reduce(s) it's footprint to 16 bytes (64 bits app.) if required (at the cost of a small performance penalty). The use case I referred to above is a DAG with adjacency vectors! i.e. every node (many in my use case) has 2 vectors (in and out), shaving of 8 bytes makes a difference (in improved locality and overall memory use). Why not make this decision a template parameter (as in pt::pector) and be done with it? degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Edward Diener

12:42 p.m.

On 9/27/2017 2:03 AM, degski via Boost wrote:

...

On 27 September 2017 at 00:43, Joaquin M López Muñoz via Boost < boost@lists.boost.org> wrote:

...
... but libraries, in my opinion, should start with a minimal and well grounded design and add more stuff as demand proves their need.

This sounds good in theory, but doesn't work like that in practice. Libraries tend to get "stuck" in their inital state.

As an example of this: I recently proposed to add variadic move construction for nodes to boost::pool (a trivial change). And even though this has obvious advantages (and would get rid for C++11 capable compilers of the use of M4 to generate the necessary code), this was received with: file a PR, including tests (?) and change to documentation. Enough to put me off, so I just changed it locally.

The pool library has no official maintainer so it had been added to the CMT libraries. You would probably have gotten more response to your suggestion if pool had an official maintainer. I do not think that is much proof that libraries tend to get "stuck" in their initial state, although it happens some of the time. C++ inheritance often means that you can add functionality to a library via another library or your own one-off class(es). But I agree that the initial notion that a library should have a well grounded design, although "minimal" is highly subjective, is a good one. snipped...

degski

2:12 p.m.

On 27 September 2017 at 15:42, Edward Diener via Boost < boost@lists.boost.org> wrote:

...

The pool library has no official maintainer so it had been added to the CMT libraries. You would probably have gotten more response to your suggestion if pool had an official maintainer.

It's un-fortunately a sad state of affairs. What bugs me the most is (nobody here to blame), that there are (on this list) on a regular basis so many people stating they would like to "contribute" to Boost... and then, there's silence.

...

I do not think that is much proof that libraries tend to get "stuck" in their initial state, although it happens some of the time.

No proof intended, just examples... For some reason, lack of maintainers, backward (binary) compatibility, things don't progress... And one ends up, like you suggested below, doing ones' own thing.

...

C++ inheritance often means that you can add functionality to a library via another library or your own one-off class(es).

Reason number one is simple, it's this rule: "Thou shall not inherit from class xxx publicly in the absence of a virtual destructor as it effectively prevents you from polymorphic use of descendants.", and this rule is repeated over and over again. boost::pool does not have a virtual destructor (trivial change, I agree, but that was what I was doing in the first place!).

...

But I agree that the initial notion that a library should have a well grounded design, although "minimal" is highly subjective, is a good one.

Getting back on the thread we are actually discussing, it seems to me that there is a discrepancy between what Benedek proposes and how this library is being reviewed (by some), that's the source of the subjectiveness you are referring to, IMO. The idea (the way I see it) is not necessarilly that the library will/or should be minimal and do the right thing all the time, but that it should provide maximum control and potentially (when well used) optimal performance. We are looking at control C-style. Yes, one can "shoot ones' leg of". For me, that's understood from the outset. It's most probably not a candidate for standardization, but, on the other hand, might become quite popular with game- or HFT-app-developpers. degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Gavin Lambert

10:27 p.m.

On 28/09/2017 03:12, degski wrote:

...

No proof intended, just examples... For some reason, lack of maintainers, backward (binary) compatibility, things don't progress... And one ends up, like you suggested below, doing ones' own thing.

Boost does not and has never (to my knowledge) guaranteed backwards binary compatibility in general, although it does try to maintain backwards source compatibility (but even that has been broken on occasion when sufficiently justified or just by accident). There are a few exceptions, eg. Boost.Serialization preserves backwards binary compatibility of its serialization archives (but still not necessarily of the code itself). *Some* Boost libraries try to provide a certain amount of ABI protection, where you get link errors instead of erroneous code if you link incompatible compilation units together (eg. where a library is used with different options in each case). Some others try to allow both versions to coexist. Still others don't have any protection and things will just misbehave at runtime in that case. But these are all different from backwards compatibility.

...

Reason number one is simple, it's this rule: "Thou shall not inherit from class xxx publicly in the absence of a virtual destructor as it effectively prevents you from polymorphic use of descendants.", and this rule is repeated over and over again. boost::pool does not have a virtual destructor (trivial change, I agree, but that was what I was doing in the first place!).

While that's a good rule in general, it only really applies to classes that are actually going to be used polymorphically -- ie. you're passing around pointers to the base class. That's not really likely to be the case for classes in Boost.Pool.

Soul Studios

28 Sep 28 Sep

12:47 a.m.

...

I dis-agree, the above pt::pector goes further in this respect and (can) reduce(s) it's footprint to 16 bytes (64 bits app.) if required (at the cost of a small performance penalty). The use case I referred to above is a DAG with adjacency vectors! i.e. every node (many in my use case) has 2 vectors (in and out), shaving of 8 bytes makes a difference (in improved locality and overall memory use).

Why not make this decision a template parameter (as in pt::pector) and be done with it?

There's a couple of problems - main one is that two objects from the same template and with the same T cannot interact (=, ==, swap etc) easily if the size_type is a template parameter. The second is that unless the type being declared in template parameter is being used a -lot-, the reduction in performance cost and memory usage is very low. I think a better alternative is to make the default case as generic as possible, then make it easy for people to modify the code if they have specific cases that require or benefit from doing so- I use a template parameter for the skipfield type in colony, but that's used as much as T so it makes sense.

Steven Watanabe

7 Oct 7 Oct

10:49 p.m.

AMDG On 09/27/2017 12:03 AM, degski via Boost wrote:

...

On 27 September 2017 at 00:43, Joaquin M López Muñoz via Boost < boost@lists.boost.org> wrote:

...
... but libraries, in my opinion, should start with a minimal and well grounded design and add more stuff as demand proves their need.

This sounds good in theory, but doesn't work like that in practice. Libraries tend to get "stuck" in their inital state.

As an example of this: I recently proposed to add variadic move construction for nodes to boost::pool (a trivial change). And even though this has obvious advantages (and would get rid for C++11 capable compilers of the use of M4 to generate the necessary code), this was received with: file a PR, including tests (?) and change to documentation. Enough to put me off, so I just changed it locally.

That sounds pretty normal to me. Documentation and tests are the basic requirements for any change. (Naturally, if the interface doesn't change, the documentation shouldn't need to be updated.) In Christ, Steven Watanabe

Thorsten Ottosen

27 Sep 27 Sep

4:23 p.m.

Den 26-09-2017 kl. 23:43 skrev Joaquin M López Muñoz via Boost: Thanks for the thorough review.

...

3. The question arises of whether segment access can gain us some speed. I've written a small test to measure the performance of a plain std::for_each loop over a batch_deque vs. an equivalent sequence of segment-level loops (attached, batch_deque_for_each.cpp), and this is what I got for Visual C++ 2015 32-bit (x86) release mode in a Windows 7 64-bit box with an Intel Core i5-2520M @2.5GHz:

[](int x){return x;} segment size: 32 n plain segmented 10E3 25.5472 23.6305 10E4 24.5778 23.6907 10E5 24.5821 22.8076 10E6 25.5007 23.1037 10E7 27.1452 24.0339 segment size: 512 n plain segmented 10E3 23.8384 23.6638 10E4 23.0284 23.8705 10E5 22.8449 22.8187 10E6 23.8485 23.7454 10E7 24.1711 23.5404

[](int x){return x%4?x:-x;} segment size: 32 n plain segmented 10E3 33.9795 23.6662 10E4 32.4817 24.023 10E5 32.8731 23.3803 10E6 33.5396 22.9298 10E7 33.1034 23.0206 segment size: 512 n plain segmented 10E3 25.0623 23.3205 10E4 25.1048 23.5812 10E5 25.3343 21.7686 10E6 25.6961 22.4639 10E7 25.8664 22.9964

For 32-bit release mode on windows 7 64 bit with intel i7-2700K: [](int x){return x;} segment size: 32 n plain segmented 10E3 21.4589 21.8351 10E4 19.9545 20.5133 10E5 19.4889 20.6197 10E6 19.2552 19.6976 10E7 19.2919 19.5425 segment size: 512 n plain segmented 10E3 20.2503 20.6372 10E4 19.0234 19.3367 10E5 18.5394 18.6171 10E6 18.555 18.5816 10E7 19.0918 19.1833 [](int x){return x%4?x:-x;} segment size: 32 n plain segmented 10E3 28.743 19.7501 10E4 26.8371 19.0719 10E5 27.0304 18.7624 10E6 26.9561 18.2357 10E7 27.2985 18.6425 segment size: 512 n plain segmented 10E3 22.1073 20.0347 10E4 20.7825 19.5639 10E5 20.6122 18.0773 10E6 20.6039 18.4895 10E7 21.7964 19.1822 So basically the same as your results. The case for segment size 32 and a non-trivial lambda does show some speedup, doesn't it? For 64-bit release mode on windows 7 64 bit with intel i7-2700K: [](int x){return x;} segment size: 32 n plain segmented 10E3 34.748 21.1357 10E4 32.8879 19.8592 10E5 32.6779 18.955 10E6 32.6255 19.3307 10E7 33.2282 19.3158 segment size: 512 n plain segmented 10E3 28.442 20.0265 10E4 26.5783 18.5851 10E5 26.4857 18.6023 10E6 26.4884 18.6571 10E7 27.0076 19.1338 [](int x){return x%4?x:-x;} segment size: 32 n plain segmented 10E3 43.0149 18.8431 10E4 42.2736 18.5071 10E5 42.4035 18.7087 10E6 42.1964 18.3355 10E7 42.8113 18.7723 segment size: 512 n plain segmented 10E3 40.3695 19.0028 10E4 38.5371 18.2029 10E5 38.2163 17.85 10E6 38.2952 17.9199 10E7 38.7489 18.6342 I don't know why a 64-bit program would be slower, but there seems to be a larger difference here. I'm wondering how the results would be on 32/64 bit ARM. Also, I do expect a benchmark of serialization to be much better. I don't think one do that optimally without access to the segments. Benedek, could you please make a test of the performance of serialization for both devector/batch_deque vs boost::vector/boost::deque (release mode, full speed optimization), perhaps using the same measuring technique as employed by Joaquin. And then post results and code so people can run it on their favorite system. You should use types char, int, and something bigger, e.g. string or array<int,32>. kind regards Thorsten

Joaquin M López Muñoz

7:10 p.m.

El 27/09/2017 a las 18:23, Thorsten Ottosen via Boost escribió:

...

For 32-bit release mode on windows 7 64 bit with intel i7-2700K:

[...]

So basically the same as your results. The case for segment size 32 and a non-trivial lambda does show some speedup, doesn't it?

For 64-bit release mode on windows 7 64 bit with intel i7-2700K:

[...]

I don't know why a 64-bit program would be slower, but there seems to be a larger difference here.

I have no idea: ints are the same size in 64-bit mode, so the resulting structures are basically the same, I'd say. Anyway, there does seem to be cases where the difference is substantial, which could merit having specialized versions of std algorithms for batch_deque (or more generally, as discussed also in this thread, for segmented structures). I can try to help with that since I did something similar for Boost.PolyCollection (with an additional feature, namely type restitution, which is not applicable here). Joaquín M López Muñoz

Thorsten Ottosen

2 Oct 2 Oct

5:30 p.m.

Hi, Some more points to consider. I tried comparing push_back/unsafe_push_back a little. I don't see any performance speedup on my system if the loop body is just the push_back. If I make the loop a little more complicated than just a push_back, I get some differences: 32 bit: boost::vector::push_back vs devector::unsafe_push_back 10E3 29.0508 24.8578 10E4 26.9987 22.7853 10E5 26.9468 22.3659 10E6 26.9766 22.3562 10E7 27.1064 22.5215 64 bit: 10E3 28.6106 24.2881 10E4 27.9322 23.6613 10E5 28.0182 23.3318 10E6 28.0493 23.3803 10E7 28.1458 23.5394 32 bit: boost::deque::push_back vs batch_deque::unsafe_push_back 10E3 34.803 45.6774 10E4 36.0686 46.3888 10E5 36.7773 48.6228 10E6 40.7754 49.6144 10E7 43.1134 49.1949 64 bit: 10E3 33.5997 30.0433 10E4 35.5992 32.8985 10E5 34.4685 48.9641 10E6 40.7974 48.6893 10E7 43.2836 48.9001 I'm not 100% sure why batch_deque is so much slower, but looking a little at the assembler, it seems that going from an iterator to a pointer may be expensive because the batch_deque::iterator uses a segment pointer and an index. (normal push_back in batch_deque therefore also appears somewhat slower than in boost::deque). kind regards Thorsten

Benedek Thaler

10:02 p.m.

On Mon, Oct 2, 2017 at 7:30 PM, Thorsten Ottosen <tottosen@dezide.com> wrote:

...

Hi,

Some more points to consider. I tried comparing push_back/unsafe_push_back a little. I don't see any performance speedup on my system if the loop body is just the push_back. If I make the loop a little more complicated than just a push_back, I get some differences:

Hi,

My concern with the above benchmark, is that it does not compare the same thing: it calls reserve here but not there (std::deque and boost::container::deque have no reserve). I tried a similar test, comparing devector/vector/container.vector unsafe_push_back/push_back, using google benchmark. See the code attached. This is how I run it: $ uname -r 3.13.0-24-generic $ g++ ---version g++-5 (Ubuntu 5.4.1-2ubuntu1~14.04) 5.4.1 20160904 $ g++ -Wall -Werror -Wextra -std=c++11 -O2 -march=native -I ../include/ -I ../../google-benchmark/include/ -I $BOOST_BUILD_PATH -L ../../google-benchmark/src/ google_push_back.cpp -DNDEBUG -lbenchmark -lpthread $ sudo su # echo performance > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor # nice -20 taskset -c 0 ./a.out Run on (8 X 3500 MHz CPU s) 2017-10-02 23:55:35 ***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead. (I consider this warning a false positive. CPU0 has no scaling, and this task is pinned there) ----------------------------------------------------------------------- Benchmark Time CPU Iterations ----------------------------------------------------------------------- devector_unsafe_push_back/8 785 ns 782 ns 897711 devector_unsafe_push_back/64 846 ns 844 ns 819441 devector_unsafe_push_back/512 1219 ns 1216 ns 582811 devector_unsafe_push_back/4096 4207 ns 4238 ns 164192 devector_unsafe_push_back/32768 30885 ns 30876 ns 24689 vector_push_back/8 796 ns 793 ns 890232 vector_push_back/64 967 ns 965 ns 724603 vector_push_back/512 2109 ns 2104 ns 335052 vector_push_back/4096 11279 ns 11300 ns 61336 vector_push_back/32768 84355 ns 84253 ns 8332 cvector_push_back/8 791 ns 788 ns 883267 cvector_push_back/64 861 ns 858 ns 809779 cvector_push_back/512 1330 ns 1326 ns 527179 cvector_push_back/4096 5081 ns 5109 ns 135001 cvector_push_back/32768 34288 ns 34286 ns 20202 # echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor Re-run results in very similar results. I think the numbers are convincing in favor of devector::unsafe_push_back. Please repeat the test. Thanks, Benedek

Thorsten Ottosen

3 Oct 3 Oct

7:58 a.m.

Den 03-10-2017 kl. 00:02 skrev Benedek Thaler via Boost:

...

On Mon, Oct 2, 2017 at 7:30 PM, Thorsten Ottosen <tottosen@dezide.com> wrote:

...
Hi,

Some more points to consider. I tried comparing push_back/unsafe_push_back a little. I don't see any performance speedup on my system if the loop body is just the push_back. If I make the loop a little more complicated than just a push_back, I get some differences:

Hi,

My concern with the above benchmark, is that it does not compare the same thing: it calls reserve here but not there (std::deque and boost::container::deque have no reserve).

Well, true. But since the unsafe_push_back on batch_deque needs to be preceded by a reserve, it would not be fair to exclude that.

...

I tried a similar test, comparing devector/vector/container.vector unsafe_push_back/push_back, using google benchmark. See the code attached. This is how I run it:

what is container.vector? Also, the file you attached uses batch_deque and std::vector, you test mentions 3 types .. kind regards Thorsten

Benedek Thaler

4:27 p.m.

On Tue, Oct 3, 2017 at 9:58 AM, Thorsten Ottosen <tottosen@dezide.com> wrote:

...

what is container.vector?

It is Boost.Containers vector.

...

Also, the file you attached uses batch_deque and std::vector, you test mentions 3 types ..

Wrong file was attached. Let me try again. Benedek

Thorsten Ottosen

13 Oct 13 Oct

5:13 p.m.

Den 03-10-2017 kl. 00:02 skrev Benedek Thaler via Boost:

...

Re-run results in very similar results. I think the numbers are convincing in favor of devector::unsafe_push_back.

There is not a huge difference between devector and vector. If I take my test where I try to confuse the branch predictor and alter just 1 flag: "favor small code" instead of "favor fast code", then I get: 32 bit: 10E3 63.8728 42.2238 10E4 62.2564 40.5588 10E5 62.0731 40.2956 10E6 62.2532 40.2451 10E7 62.3553 43.5931 64 bit: 10E3 29.2422 24.6654 10E4 28.3354 24.246 10E5 28.2262 23.7662 10E6 28.2777 23.8788 10E7 28.3777 23.8641 I think that is a good example of how the compiler can play tricks on you. And this is major reason we should always strive after providing abstractions with zero overhead. kind regards Thorsten

Benedek Thaler

3 Oct 3 Oct

5:32 p.m.

On Wed, Sep 27, 2017 at 6:23 PM, Thorsten Ottosen <tottosen@dezide.com> wrote:

...

Benedek, could you please make a test of the performance of serialization for both devector/batch_deque vs boost::vector/boost::deque (release mode, full speed optimization), perhaps using the same measuring technique as employed by Joaquin. And then post results and code so people can run it on their favorite system. You should use types char, int, and something bigger, e.g. string or array<int,32>.

Hi, Please find the benchmark code and results attached. It compares serialization of devector/batch_deque/std::vector/std::deque. (I couldn't find a ready serializer for Boost.Container). The comparison is not exactly fair, because it seems boost/serialization/vector.hpp also adds version information. Measurement was done as above. Brief preview of the full set of attached results: ------------------------------------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations ------------------------------------------------------------------------------------------------------------------------- boost::double_ended::devector<char>/32768 94197 ns 94039 ns 7430 std::vector<char>/32768 94351 ns 94235 ns 7457 boost::double_ended::batch_deque<std::array<char, 32>>>/32768 2975327 ns 2970627 ns 234 std::deque<std::array<char, 32>>>/32768 5237764 ns 5228441 ns 133 Benedek

Thorsten Ottosen

4 Oct 4 Oct

4:24 p.m.

Hi Benedek,

...

Please find the benchmark code and results attached. It compares serialization of devector/batch_deque/std::vector/std::deque. (I couldn't find a ready serializer for Boost.Container). The comparison is not exactly fair, because it seems boost/serialization/vector.hpp also adds version information. Measurement was done as above.

Brief preview of the full set of attached results:

I was a bit surprised that std::vector performed almost as good as devector. Then I realized that you are testing serialization "save" and not serialization "load". Given that, I'm now surprised that devector/batch_deque is faster than std::vector for e.g. std::array<char, 32>! I did not expect that. However, the "load" test should be even more interesting as that is the one that involves potential double initialization. Could you post those numbers too? Thanks in advance Thorsten

Benedek Thaler

8:48 p.m.

On Wed, Oct 4, 2017 at 6:24 PM, Thorsten Ottosen <tottosen@dezide.com> wrote:

...

However, the "load" test should be even more interesting as that is the one that involves potential double initialization.

Could you post those numbers too?

Hi, Please find the code and results attached. As expected, both devector and batch_deque highly outperforms std::vector and queue in deserialization. Benedek

degski

5 Oct 5 Oct

2:19 a.m.

On 4 October 2017 at 23:48, Benedek Thaler via Boost <boost@lists.boost.org> wrote:

...

As expected, both devector and batch_deque highly outperforms std::vector and queue in deserialization.

This sounds (to me) counterintuitive. Why is this the case? degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

8:20 a.m.

Den 05-10-2017 kl. 04:19 skrev degski via Boost:

...

On 4 October 2017 at 23:48, Benedek Thaler via Boost <boost@lists.boost.org> wrote:

...
As expected, both devector and batch_deque highly outperforms std::vector and queue in deserialization.

This sounds (to me) counterintuitive. Why is this the case?

Because they can avoid double initialization for some types. When using std::vector, you have to either A. reserve + load item to local object + push_back B. resize + delegate to bulk operation in Boost.Serialization The bulk operation is to prefer, but then requires redundant initialization in resize. kind regards Thorsten

Thorsten Ottosen

3:37 p.m.

Den 05-10-2017 kl. 10:20 skrev Thorsten Ottosen via Boost:

...

Den 05-10-2017 kl. 04:19 skrev degski via Boost:

...
On 4 October 2017 at 23:48, Benedek Thaler via Boost <boost@lists.boost.org> wrote:

...
As expected, both devector and batch_deque highly outperforms std::vector and queue in deserialization.

This sounds (to me) counterintuitive. Why is this the case?

Because they can avoid double initialization for some types. When using std::vector, you have to either

A. reserve + load item to local object + push_back B. resize + delegate to bulk operation in Boost.Serialization

The bulk operation is to prefer, but then requires redundant initialization in resize.

I gave this a little thought, and must admit that the numbers cannot be explained by double initialization alone. Looking at the serialization code for DoubleEnded, it think may not be quite in line with the Boost.Serialization approach. For example, it seems to be me that it is not working correctly with e.g. xml archives. Basically, the implementation for trivial types avoids all the Boost.Serialization machinery by calling load_binary/save_binary directly. Now, what I don't understand is how that machinery can be so costly for the binary archive case to get the huge difference we see with std::vector. This is a real mystery (In the same way the save test got much slower when the type became array<int,N>). Secondly, I think we can see the current numbers as a best case of what is possible with the extra functionality provided by DoubleEnded. There are other serialization libs out there, and they should be able to benefit even if there is no way to specialize for the binary case in Boost.Serialization. kind regards Thorsten

degski

4:03 p.m.

On 5 October 2017 at 18:37, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

Secondly, I think we can see the current numbers as a best case of what is possible with the extra functionality provided by DoubleEnded. There are other serialization libs out there, and they should be able to benefit even if there is no way to specialize for the binary case in Boost.Serialization.

I added devector to cereal, the file is attached and should be placed in the types sub-dir (of cereal). It's just a copy of the std::vector code, with the std::vector<bool> specialisation taken out. Not tested. degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

5:23 p.m.

Den 05-10-2017 kl. 18:03 skrev degski via Boost:

...

On 5 October 2017 at 18:37, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...
Secondly, I think we can see the current numbers as a best case of what is possible with the extra functionality provided by DoubleEnded. There are other serialization libs out there, and they should be able to benefit even if there is no way to specialize for the binary case in Boost.Serialization.

I added devector to cereal, the file is attached and should be placed in the types sub-dir (of cereal). It's just a copy of the std::vector code, with the std::vector<bool> specialisation taken out. Not tested.

Cute. But would one not use std::is_trivially_copyable instead of std::is_arithmetic? And instead of resize, should you not call the version that avoids double initialization? kind regards Thorsten

degski

6 Oct 6 Oct

2:25 a.m.

On 5 October 2017 at 20:23, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

Cute.

I'll tell the wife! But would one not use std::is_trivially_copyable instead of

...

std::is_arithmetic?

The std::vector code is done this way. I assume(d) the library writers are well aware of std::is_trivially_copyable.

...

And instead of resize, should you not call the version that avoids double initialization?

Let me ask that question the other way around. If based on type_traits it could be worked out (at compile-time) that double initialization can be avoided, why doesn't resize() do the "right" thing on it's own? resize() should do what it says it does, and do that in the optimal way, given T. degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Vinnie Falco

2:44 a.m.

On Thu, Oct 5, 2017 at 7:25 PM, degski via Boost <boost@lists.boost.org> wrote:

...

I'll tell the wife!

Has anyone done a search on Google for "DoubleEnded?" To see if there's another library with the same name. Thanks

degski

2:47 a.m.

On 6 October 2017 at 05:25, degski <degski@gmail.com> wrote:

...

Let me ask that question the other way around.

Drop that question, I've got it now, seem to have a slow start today... degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

27 Sep 27 Sep

7:27 p.m.

Den 26-09-2017 kl. 23:43 skrev Joaquin M López Muñoz via Boost:

...

El 21/09/2017 a las 19:38, Thorsten Ottosen via Boost escribió:

...
Dear users and members of Boost,

[snip]

...

and mostly utility-level. We already have a bestiary of containers in Boost, namely Boost.Container, so I think both devector and batch_deque are better served if proposed as components of this latter libray. Code in boost::double_ended::detail can go to / merge into boost::container[::detail] --for instance, boost::double_ended::detail::allocator_traits seems to be redundant with boost::containter::allocator_traits.

This makes sense, but it does require that the author of Boost.Container is willing to entertain that idea, doesn't it?

...

REVIEW FOR DEVECTOR

...

and violates the "don't pay for what you don't use" principle. I'd remove it.

How does it violate that?

...

7. Modulo the points discussed above, the interface of devector follows closely that of std::vector (with the obvious additions, e.g. [push|pop_emplace]_front), which is very good. The only interface decisions I'm not fully convinced about is capacity/ resize/reserve/shrink_to_fit without "front" or "back" qualification:

What about generic code that can be used with either a vector or devector?

...

- capacity() seems to be (from what I can infer reading the docs only) the sum of front capacity and back capacity. I can't find any sensible use of this info, and, besides, the careless reader could assume capacity() works as std::vector::capacity(), which raises usability concerns. My suggestion is to remove this. FWIW, we had a similar discussion about capacity() in Boost.PolyCollection, and that member function finally was dropped.

The capacity from Boost.PolyCollection was quite another beast. Here capacity is just like vector's capacity(), namely the sice of the contiguous chunk of memory.

...

- front_free_capacity/back_free_capacity could be more aptly named front_capacity/back_capacity.

But unlike capacity() in vector, these actually tells you how many push_back/push_front you can do without reallocation. There is no definition of "back capacity" because there is no "middle".

...

- reserve as an alias to reserve_back has usability problems as discussed with capacity.

Again, what about generic code?

...

I'd remove them. - For symmetry, shrink_to_fit could be complemented with [front|back]_shrink_to_fit

Is back_shrink_to_fit then an alias for shink_to_fit or does it actually consider both ends?

...

be documented officially. I'm assuming that erasure behaves analogously (i.e. space is given back to the side that's closer to the erasure point) --in particular, inserting an element and erasing it immediately after should return front and back capacities to their original values.

Anything else would be very inefficient.

...

REVIEW FOR BATCH_DEQUE

...

d.segment_[begin|end](), which allows us to use a range for in:

for(auto segment:d.segments){ for(auto& element:segment){ // do something with element } }

This is a good idea IMO.

...

2. As with devector, why is serialization code is so complicated and does not simply rely on boost::serialization::stl::collection_load_impl|save_collection?

Let's wait for the performance test.

...

3. Tests look good.

4. Postconditions on front and back capacity are not documented.

You mean front_free_capacity etc? What condition did you have in mind? The funcions are const. kind regards Thorsten

Ion Gaztañaga

8:50 p.m.

On 27/09/2017 21:27, Thorsten Ottosen via Boost wrote:

...

Den 26-09-2017 kl. 23:43 skrev Joaquin M López Muñoz via Boost:

...
El 21/09/2017 a las 19:38, Thorsten Ottosen via Boost escribió:

...
Dear users and members of Boost,

[snip]

...
and mostly utility-level. We already have a bestiary of containers in Boost, namely Boost.Container, so I think both devector and batch_deque are better served if proposed as components of this latter libray. Code in boost::double_ended::detail can go to / merge into boost::container[::detail] --for instance, boost::double_ended::detail::allocator_traits seems to be redundant with boost::containter::allocator_traits.

This makes sense, but it does require that the author of Boost.Container is willing to entertain that idea, doesn't it?

I haven't reviewed the library yet, and I can't tell yow feasible is this. Something similar happened wrt static_vector and stable_vector. However, Boost.Container still supports C++03 and other features, so the implementation should be updated to match the rest of the library (if tests are good enough that should not be a problem). devector definitely looks interesting, I haven't investigated yet the essential differences between boost::container::deque and batch_deque. Best, Ion

Joaquin M López Muñoz

28 Sep 28 Sep

1:02 p.m.

El 27/09/2017 a las 21:27, Thorsten Ottosen via Boost escribió:

...

Den 26-09-2017 kl. 23:43 skrev Joaquin M López Muñoz via Boost:

...
REVIEW FOR DEVECTOR

...
and violates the "don't pay for what you don't use" principle. I'd remove it.

How does it violate that?

All the _storage handling is done even if there's the small buffer is 0.

...

...
7. Modulo the points discussed above, the interface of devector follows closely that of std::vector (with the obvious additions, e.g. [push|pop_emplace]_front), which is very good. The only interface decisions I'm not fully convinced about is capacity/ resize/reserve/shrink_to_fit without "front" or "back" qualification:

What about generic code that can be used with either a vector or devector?

Problem is: suppose d.size()==5, d.capacity()==10,d.back_free_capacity()==2; doing for(int i=0;i<5;++i)d.push_back(i) will reallocate, unlike std::vector.

...

[...]

...
- front_free_capacity/back_free_capacity could be more aptly named front_capacity/back_capacity.

But unlike capacity() in vector, these actually tells you how many push_back/push_front you can do without reallocation. There is no definition of "back capacity" because there is no "middle".

You're right, my bad. the _free_ infix is very adequate.

...

...
- reserve as an alias to reserve_back has usability problems as discussed with capacity.

Again, what about generic code?

Similar consistency problem with std::vector. d.size()==5, d.capacity()==10, d.back_free_capacity()==2: d.reserve(d.capacity()) reallocates, whereas it is a no-op for std::vector.

...

...
I'd remove them. - For symmetry, shrink_to_fit could be complemented with [front|back]_shrink_to_fit

Is back_shrink_to_fit then an alias for shink_to_fit or does it actually consider both ends?

The semantics that seems to me the cleanest is * shrink_to_fit --> front_free_capacity()==0, back_free_capacity()==0 * front_shrink_to_fit --> front_free_capacity()==0, back_free_capacity() untouched * back_shrink_to_fit --> front_free_capacity() untouched, back_free_capacity()==0 Strictly speaking, standard says shrink_to_fit is an *indication*, so implement all three as no-ops would comply :-)

...

[...]

...
3. Tests look good.

4. Postconditions on front and back capacity are not documented.

You mean front_free_capacity etc? What condition did you have in mind? The funcions are const.

Sorry, I meant there is no indication of what happens to [front|back]_free_capacity after insertions and erasures. Best Joaquín M López Muñoz

Benedek Thaler

5:09 p.m.

FYI, with the intent of helping the review, I fixed some trivial bugs 18 hours ago: devector: fix clear() doc devector: fix complexity docs (amortized linear -> amortized constant) devector: fix insert bug, improve tests devector: free storage using old allocator in op= if needed devector: use emplace_back instead of push_back in range ops batch_deque: make policy segment_size constexpr and filed more complicated bugs as tickets [0]. These also fix some typos in the docs. To avoid confusion, I did not touch more substantial or controversial parts, therefore *no* change was made to the overall documentation, to the default growth policy or to the shifting behavior of insert(). Thanks, Benedek [0]: https://github.com/erenon/double_ended/issues

Thorsten Ottosen

5:33 p.m.

Den 28-09-2017 kl. 15:02 skrev Joaquin M López Muñoz via Boost:

...

El 27/09/2017 a las 21:27, Thorsten Ottosen via Boost escribió:

...
Den 26-09-2017 kl. 23:43 skrev Joaquin M López Muñoz via Boost:

...
REVIEW FOR DEVECTOR

...
and violates the "don't pay for what you don't use" principle. I'd remove it.

How does it violate that?

All the _storage handling is done even if there's the small buffer is 0.

Is that not optimized away?

...

...
...
7. Modulo the points discussed above, the interface of devector follows closely that of std::vector (with the obvious additions, e.g. [push|pop_emplace]_front), which is very good. The only interface decisions I'm not fully convinced about is capacity/ resize/reserve/shrink_to_fit without "front" or "back" qualification:

What about generic code that can be used with either a vector or devector?

Problem is: suppose d.size()==5, d.capacity()==10,d.back_free_capacity()==2; doing

for(int i=0;i<5;++i)d.push_back(i)

will reallocate, unlike std::vector.

Sure, but is that really a problem? The same can happen with two vector instances that have different capacity.

...

...
...
- reserve as an alias to reserve_back has usability problems as discussed with capacity.

Again, what about generic code?

Similar consistency problem with std::vector. d.size()==5, d.capacity()==10, d.back_free_capacity()==2:

d.reserve(d.capacity())

reallocates, whereas it is a no-op for std::vector.

Who writes code like that? :-) Let me try to understand: your problem is that a generic algorithm may exhibit a different allocation sequence depending on what type that it is given? And you are OK with that happening for the same type?

...

...
[...]

...
3. Tests look good.

4. Postconditions on front and back capacity are not documented.

You mean front_free_capacity etc? What condition did you have in mind? The funcions are const.

Sorry, I meant there is no indication of what happens to [front|back]_free_capacity after insertions and erasures.

You are right, there seems to be missing some postconditions. Also for push_front/push_back. -Thorsten

Benedek Thaler

27 Sep 27 Sep

9:07 p.m.

On Tue, Sep 26, 2017 at 11:43 PM, Joaquin M López Muñoz via Boost < boost@lists.boost.org> wrote:

...

Hi, this is my review of Boost.DoubleEnded. As my review approach is heavily influenced by my acceptance vote, let me begin with that:

I vote for conditional acceptance of both containers after they have been presented as separate components of Boost.Container and undergone significant changes, mostly along the line of simplifying their interface.

Thanks for the really elaborate review.

...

* A library for a familiy of containers (or any other classes) is justified when its members are related either by shared concepts, constructs and/or implementation. This is not the case of Boost.DoubleEnded, whose only general theme is the very lax notion of "double endianness", and for which the amount of common implementation code is small and mostly utility-level. We already have a bestiary of containers in Boost, namely Boost.Container, so I think both devector and batch_deque are better served if proposed as components of this latter libray. Code in boost::double_ended::detail can go to / merge into boost::container[::detail] --for instance, boost::double_ended::detail::allocator_traits seems to be redundant with boost::containter::allocator_traits.

Originally, a new library wasn't planned, but Boost.Container was targeted. Ion, however, considered that the design (or the execution) is not aligned with Container (I suppose), so we were left with this option.

...

* The main characteristics of both containers are obscured by many "extras" that feel insufficiently justified to me or even detrimental (I'll discuss in more detail below), so my vote is for this fat to be removed prior to acceptance in Boost. I can see much work and love have gone into adding this many niceties, and the author could understansably resist to simplification, but libraries, in my opinion, should start with a minimal and well grounded design and add more stuff as demand proves their need.

On the contrary. Removing unnecessary code is always a pleasure.

...

So, since I see this library as a pair of two unrelated classes, let me continue with the review for each separately.

REVIEW FOR DEVECTOR

[snip]

IMPLEMENTATION

1. I've taken just cursory look at the code, but everything looks very good in general. 2. I take that lines 2367-8 (and similar) in devector.hpp:

// avoid invalidating any reference in args later T tmp(std::forward<Args>(args)...);

are somehow related to LWG Issue 526, right?

Yes. As noted below, a more aggressive approach might be taken here. [snip]

...

4. I'm not comfortable with the fact that shifting on insertion turns to

...

the opposite side when the favored side (the one closer to the insertion point) happens to be full. This makes it very hard to reason about post-insertion front and back capacities. Also, the reference statement that insertion is "linear in half the size of *this [...]" is currently not true (when the favored side is full insertion takes longer than that). I'd rather drop this optimization and document that shifting always happen to the direction where the least operations are needed.

Point taken.

...

DOCUMENTATION

2. I confess I don't get what "devector implements the NAD (Not A Defect) resolution of LWG Issue 526" is about.

It says that the implementation ignores corner cases when an already inserted element is to be inserted again, e.g: v.insert(v.begin(), v[2]);

...

5. The reference section is very good. Capacity handling and which side shifting goes on insertion/erasure should IMHO be documented exhaustively (see my points above on this matter).

Noted.

...

REVIEW FOR BATCH_DEQUE

DESIGN

1. In the design rationale, batch_deque is presented as a deque with configurable segment size, which is true, but not the end of the story: segment access (segment_[begin|end]) is not mentioned, neither is stable insertion.

Noted. [snip - better interface of segment iterator]

...

3. The question arises of whether segment access can gain us some speed. I've written a small test to measure the performance of a plain std::for_each loop over a batch_deque vs. an equivalent sequence of segment-level loops (attached, batch_deque_for_each.cpp), and this is what I got for Visual C++ 2015 32-bit (x86) release mode in a Windows 7 64-bit box with an Intel Core i5-2520M @2.5GHz:

[snip]

A better use-case is when elements are processed in a batch (hence the name): http://erenon.hu/double_ended/double_ended/examples.html#double_ended.exampl... It's especially useful when doing scatter/gather IO.

...

6. Why a batch_deque_policy wrapper rather than directly providing the segment size to batch_queue as in

boost::double_ended::batch_deque<int,512>

This is to allow further customization points without an additional policy argument. One issue with the approach above: It isn't clear whether 512 is a number of elements or bytes. de::batch_deque<int, de::elements_per_segment<512>> is conceivable tough.

...

7. BTW, batch_deque_policy::segment_size should be defined as constexpr so as to fully honor the promise that "using a power of two increases performance because of the cheaper division and modulo operations".

Noted.

...

10. A bit of bikeshedding: what's the rationale for "batch_deque"? Not that I have a better name for this, though.

It is because of the segment iteration. Any better name would do.

...

Thanks to Benedek Thaler for his effort in writing and submitting this library, and to Thorsten in his dual role as mentor / review manager.

Thanks for your time and precise points, Benedek

degski

28 Sep 28 Sep

6:48 a.m.

On 28 September 2017 at 00:07, Benedek Thaler via Boost < boost@lists.boost.org> wrote:

...

This is to allow further customization points without an additional policy argument. One issue with the approach above: It isn't clear whether 512 is a number of elements or bytes. de::batch_deque<int, de::elements_per_segment<512>> is conceivable tough.

In a different but related application, I considered the underlying allocator (all Windows here), VirtualAlloc <https://msdn.microsoft.com/en-us/library/windows/desktop/aa366887(v=vs.85).aspx>. Requesting memory from the OS (bar my understanding) results in at least one memory page (4096 Kb) to be allocated (or a multiple of that). Not wanting to let any memory go to waste (it's already allocated anyway), I defined a template parameter that stipulates the *minimum* number of objects to be allocated in one segment. One can then calculate the number of pages required to make up that segment and following that, the number of objects that are actually being reserved in that segment. All this can obviously be static constexpr C++14 member functions. Please comment (anyone), as I might be totally deluded (as I'm still not really sure on this) and don't mind being put straight (preferably). degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

8:16 a.m.

Den 27-09-2017 kl. 23:07 skrev Benedek Thaler via Boost:

...

On Tue, Sep 26, 2017 at 11:43 PM, Joaquin M López Muñoz via Boost <

...

...
6. Why a batch_deque_policy wrapper rather than directly providing the segment size to batch_queue as in

boost::double_ended::batch_deque<int,512>

This is to allow further customization points without an additional policy argument. One issue with the approach above: It isn't clear whether 512 is a number of elements or bytes. de::batch_deque<int, de::elements_per_segment<512>> is conceivable tough.

Also note that the most general thing would be to allow both de::batch_deque<int, de::elements_per_segment<512>> and de::batch_deque<int, de::bytes_per_segment<512>> -T

degski

8:49 a.m.

On 28 September 2017 at 11:16, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

Also note that the most general thing would be to allow both

de::batch_deque<int, de::elements_per_segment<512>>

and

de::batch_deque<int, de::bytes_per_segment<512>>

One can do better and have: de::segment_policy<std::size_t bytes, std::size_t count = 1> The policy expressed is: "the greater of bytes and count * sizeof ( value_type )". So boost:container::deque would, as implemented now, have: de::segment_policy<512>; VS's would be: de::segment_policy<16>; degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

10:52 a.m.

Den 28-09-2017 kl. 10:49 skrev degski via Boost:

...

On 28 September 2017 at 11:16, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...
Also note that the most general thing would be to allow both

de::batch_deque<int, de::elements_per_segment<512>>

and

de::batch_deque<int, de::bytes_per_segment<512>>

One can do better and have:

de::segment_policy<std::size_t bytes, std::size_t count = 1>

The policy expressed is: "the greater of bytes and count * sizeof ( value_type )".

So boost:container::deque would, as implemented now, have:

de::segment_policy<512>;

VS's would be:

de::segment_policy<16>;

How is that better? Then I have to look up the documentation to understand how big the segment is... kind regards -Thorsten

degski

11:56 a.m.

On 28 September 2017 at 13:52, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

How is that better? Then I have to look up the documentation to understand how big the segment is...

As, in my mind it's all constexpr, we can have member function segment_size () at zero run-time cost, no need to look things up. It (the policy) also simply generalises what's already current practice. degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

12:02 p.m.

Den 28-09-2017 kl. 13:56 skrev degski via Boost:

...

On 28 September 2017 at 13:52, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...
How is that better? Then I have to look up the documentation to understand how big the segment is...

As, in my mind it's all constexpr, we can have member function segment_size () at zero run-time cost, no need to look things up. It (the policy) also simply generalises what's already current practice.

I'm talking about reading the actual code and having that code being self-documenting wrt. the size of the segment. kind regards -Thorsten

degski

12:26 p.m.

On 28 September 2017 at 15:02, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

I'm talking about reading the actual code and having that code being self-documenting wrt. the size of the segment.

Current practice is, that one is agnostic about what's actually happening (the segment size is an implementation detail). The policy should be defaulted to the current practice. If one wants more control, one *does* need to read the docs and understand what things mean, yes, that's true. No free lunch and such. There are thougher nuts to crack learning (modern) C++, I would think. de::batch_deque<int, de::bytes_per_segment<512>> does (due to possible alignment issues) still not give you the number of elements and the other way around de::batch_deque<int, de::elements_per_segment<512>> does not give you the segment size, for the same reason. It could be easily integrated in boost::container::deque without breaking anything. degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

5:36 p.m.

Den 28-09-2017 kl. 14:26 skrev degski via Boost:

...

On 28 September 2017 at 15:02, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...
I'm talking about reading the actual code and having that code being self-documenting wrt. the size of the segment.

...

de::batch_deque<int, de::bytes_per_segment<512>> does (due to possible alignment issues) still not give you the number of elements and the other way around de::batch_deque<int, de::elements_per_segment<512>> does not give you the segment size, for the same reason.

Well, surely the latter gives you the actual number of elements in a full segment?!? If you want to control that number, why should you care about the bytes and vice versa? -T

Thorsten Ottosen

5:55 p.m.

New subject: When is an optimization justified. Was Re: [review] The review of Boost.DoubleEnded starts today: September 21 - September 30

Den 26-09-2017 kl. 23:43 skrev Joaquin M López Muñoz via Boost:

...

Hi, this is my review of Boost.DoubleEnded.

[snip]

...

* The main characteristics of both containers are obscured by many "extras" that feel insufficiently justified to me or even detrimental (I'll discuss in more detail below), so my vote is for this fat to be removed prior to acceptance in Boost. I can see much work and love have gone into adding this many niceties, and the author could understansably resist to simplification, but libraries, in my opinion, should start with a minimal and well grounded design and add more stuff as demand proves their need.

Joaquin & others, Since many of the reviewers have had strong opinions about safety and optimizations, I think we need to establish when it makes sense. As an example, consider this made up story. Assume that boost was around in 1992 and then Stepanov came around and wanted his new STL library to be reviewed by Boost. Boost members anno 1992 would immediately see that his vector had an unsafe operator[](size_t) which did not throw, but had a precondition, when violated, would lead to UB. Why would you want to save one branch or generate slightly smaller code? The reviewers then demanded that operator[](size_t) was removed from vector. So ... have we as a community lost the strive for zero penalty, code that doesn't throw, code that may be well-received in constrained environments? If not, could we at least entertain the idea of secondary class: devector<T> devec; ... devector_access<T> devec_access( devec ); // a non-copyable class that stores reference to devector<T> devec_access.nongrowing_push_back( x ); devec_access.uninitialized_resize( k ); // etc So whenever you have to do something a bit out of the ordinary, you do it via devector_access. That type is then in seperate header and documented under advanced usage.? -Thorsten

Ion Gaztañaga

3 Oct 3 Oct

11:30 p.m.

On 21/09/2017 19:38, Thorsten Ottosen via Boost wrote:

...

Dear users and members of Boost,

I'm proud to announce that the formal review of Benedek Thaler's Boost.DoubleEnded library starts today, September 21, and runs through September 30. This library is the result of Benedek's participation in Google Summer of Code 2015.

Hi, this is my review of Boost.DoubleEnded. ------------------------------------- ------------------------------------- What is your evaluation of the design? ------------------------------------- ------------------------------------- The library presents two containers, devector and batch_deque. I find the first one as a relatively simple but interesting vector extension supporting front insertions. In the future it could maybe take advantage of the backwards memory expansion (taking the previously adjacent free memory segment so that new elements could be added to front without moving the rest of the elements) supported by Boost.Interprocess and the Boost.Container’s modified DLMalloc. ---------- devector -------- 1) From a design point of view, I would prefer not to include the small buffer optimization with the data structure, although reviewing code, it seems many parts are to be optimized. It's like just having small_vector without having vector, most libraries have two classes for these two use cases. Different classes also ease that a type erasured base class can be defined independent from the static storage size (Boost.Container calls it small_vector_base, LLVM calls it SmallVectorBase) I think we should have a devector, and add small_devector and static_devector containers if there is demand. 2) Defaulting size_type to unsigned int is arbitrary and against existing practice. According to the usual STL rules, size_type shall come from the allocator, which knows what type can represent the number of elements that can be allocated. I find interesting the idea of expressing a different size_type to save space, but I am not sure if that should come from the allocator or from an option. 3) reserve_only_tag: I would rename it as “reserve_only_t” (similarly to std::allocator_arg_t and Boost.Container’s “default_init_t”). The front and back capacity version should be enough as I think devector should not show preference for back insertion (if a user wants only a back capacity, then it would use a vector and not a devector). Strictly speaking, capacity constructors are a bit redundant, but the same could be said about default constructing constructors: //it could be implemented as default constructor + resize explicit vector(size_type n, const Allocator & allocator = Allocator()); A reserve-only constructor plus an “unsafe” emplacement could make a container similar to an array, where no redundant checks are done when filling the array. But I don’t expect noticeable performance gains with this reserve-only constructor. 4) unsafe_uninitialized_tag: I don’t like these non-safe constructors, which are the cases?. With reserve-only constructors + unsafe emplace back a user can achieve nearly the same performance. For POD types (like the socket example mentioned in the documentation), I would suggest something like Boost.Container’s “default_init_t”. Instead of “unsafe_uninitialized_resize_front” I suggest resize_front(size_type, default_init_t) 5) reserve() / resize(): I think they have no sensible meaning in the container (capacity() could be useful, depending on the reallocation strategy). We should not treat back insertion as the “default” insertion. IMHO the API of devector should be symmetrical. 6) unsafe_push_back/front: I think they can be useful, many times we iterate until a known upper bound and capacity checks might be redundant in some code, just like we do with arrays. I suggest replacing “unsafe” with “unchecked” and “push” with “emplace” (which is much more general): “unchecked_emplace_back/front(Args&&….)”. 7) growth_policy: This is an option that could be interesting not only for devector but also for vector. However, I would prefer passing options as a single argument where options are combined, so that the class declaration is not modified each time a new option is added. Something like: template<class T, class Allocator, class Options= void> class devector; using myoptions = make_options_t<growth_factor<G>, size_type<unsigned short>, new_option<arg>, … >; usinge mydevector = devector<int, allocator<int>, myoptions>; a similar approach is used in Boost.Intrusive and will be added to Boost.Container. 8) Regarding "move_if_noexcept", Boost.Container does not use it as it consider a pessimized design for a situation (throwing moves) that will occur nearly never and few times will be handled using the provided strong guarantees. On the other hand, most people will notice the big slowdown that will receive because their move constructors cannot be marked as noexcept (pimpl idiom, small buffer optimization, sentinel-node based lists, devector itself would be copied if small_buffer_size is used, etc.). I think “move_if_noexcept” was just added for backwards compatibility with C++03’s strong guarantee, but devector does not need to follow the same rule. I understand many (most?) people don’t agree here, but don’t think we should sacrifice devector performance just to support rare throwing cases. 9) Minor comment 2: IMHO I think iterators would be better as classes instead of pointers, a user can always use data() to make sure pointers are returned. Many times unwanted overload resolutions can happen if pointers are used, so I prefer the type safety offered by a class-based iterator. It can also offer checked iterators in the future. 10) Default growth factor of 4. Sounds like overkill, 2 or 1.5 sounds more reasonable and in line with user expectations based on existing practice in STL and Boost. I’m not sure if shrinking should be part of the trait. ---------- Batch_deque -------- Many of the comments from devector are applicable here, so I will not repeat them here. 1) In general, I don’t see the need of batch_deque as a different container, because it’s basically a deque with some options (segment size and additional operations). Those could be added to boost::container::deque without breaking anythying. 2) Segment_iterators: Raw pointers are stored in segment iterators. That means that those could not be placed in the memory that the allocator allocates, if the allocator does not defined ::pointer as T*. Also, it seems strange that there is not segment_type/segment_info class. The iterator itself has members that return the address and the size of a segment. I think that’s unexpected, the expected segment iterator would return a reference to a class (e.g. segment_type) that represents the segment, which has the required operations, and also internal begin()/end() operations to obtain segment-local iterators (a la unordered containers). This could allow some optimized segment-based algorithms (see lafstern.org/matt/segmented.pdf, for more information) 3) stable_insert: I can’t see the usefulness of this operation. A batch_deque is a sequence, but the user cares about the insertion point but does not care if additional elements are created somewhere? I fail to see a real use case. ------------------------------------- ------------------------------------- What is your evaluation of the implementation? ------------------------------------- ------------------------------------- I think it’s solid. Tests look good. -------------------------- devector -------------------------- emplace_slow_path: The forced temporary creation should be avoided, it is overkill for containers of containers, devector<static_vector> or containers with sentinel nodes. IMHO supporting references to already included elements is a bad feature of std::vector (a corner case supported in single element insertion, ignored for ranges, etc.) Boost.Container chose not to support it and no one has demanded that feature: http://www.boost.org/doc/libs/1_65_1/doc/html/container/Cpp11_conformance.ht... Reallocation strategy: First, I considered this strategyunnatural, but now I am not sure. For a push_front, if no free front capacity is found, but there is free back capacity, then all data is shifted to insert the element. If interleaved front and back insertions are performed, I don’t know if this will lead to too many data movements. When memory is reallocated to make room for a front insertions, the free back capacity is unchanged and all new free capacity goes to front. This means that the last operation (front or back) determines where all the new capacity goes. I find a bit strange that N == front_back_capacity() does not mean that a reallocation will happen if we push_back N times, as repeated data movement will happen if front_free_capacity() is available. Another strategy is to treat both ends independently: if front insertion has no free space, it could always reallocate and make room at front, so further push_fronts don’t provoke data movement. This would be similar to batch_deque, where memory is allocated if there is no free front capacityregardless of the free back capacity (no data movement),f and free segment is recycled from back capacity to insert elements at front. Obviously, this approach consumes more memory. Yet another strategy would be distribute any new free capacity (after reallocation) between front and back, indepently from which operation (back or front) has provoked the allocation. emplace_slow_path: // construct at back + 1 from back (no guard) alloc_construct(end(), std::move(back())); // move back half right std::move_backward(position, end() - 1, end()); ++_back_index; This does not seem exception safe “++_back-index” should be done just after alloc_construct, same for front insertion. insert_range_slow_path_near_front/back: I see that construction + rotation is done, but I think this is not optimal, the implementation might be simplier, but more than necessary data movement is done. Range insert: A temporary temp_devector is used, which means copying data two times. The inline storage for 32 elements can blow the stack if T is something big (like static_vector<T, N>) -------------------------- batch_deque -------------------------- 1) I like that deque_[const_]iterator is lightweight, I don’t know if iteration/access it’s more efficient than more traditional approaches. boost::container::deque iterators are really fat and should be improved to something similar to batch_deque iterator. On the other hand deque_segment_[const_]iterators are heavier than needed, and I think they could be just like vector iterators if used to iterate a single segment. The problem is that deque_segment_iterator works both as segment information class and as a local iterator. I think a similar approach to PolyCollection’s “segment_traversal_info” as segment information could work better, as it allows really lightweight local iterators that can speed up specialized segmented algorithms (local iterators could be just vector-like iterators inside a segment). 2) The data structure o batch_deque is slightly bigger than the traditional one, just asking if it’s intended or it was developed like this to reuse code. Traditional deque (from SGI) used four elements (map pointer, map size, iterator to begin, iterator to end). Your implementation defines the map as a devector, which stores size and capacity for the map, but this allows you handle the excess capacity of the map, I’m not sure if it adds performance benefits. 3) I haven’t thoroughly reviewed all the code, but many devector comments are applicable: rotation, move_if_noexcept, etc. ------------------------------------- ------------------------------------- What is your evaluation of the documentation? ------------------------------------- ------------------------------------- Documentation is correct, but I expected more detail. If one of the goals of the library is efficiency, benchmarks should be there (maybe against Boost and Std vector and deque). ------------------------------------- ------------------------------------- What is your evaluation of the potential usefulness of the library? ------------------------------------- ------------------------------------- devector might be useful, I have doubts with batch_deque. Deque in general is not the most used STL container, so I don’t expect a lot of users, maybe for specialized cases. ------------------------------------- ------------------------------------- Did you try to use the library? With which compiler(s)? Did you have any problems? ------------------------------------- ------------------------------------- Just executed tests and examples and debugged to help code review. MSVC 2017 triggers quite a lot of warnings, mainly about constant conditional expressions. ------------------------------------- ------------------------------------- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study? ------------------------------------- ------------------------------------- Several hours analyzing the design and reading the code. An hour playing with the library with MSVC. ------------------------------------- ------------------------------------- Are you knowledgeable about the problem domain? ------------------------------------- ------------------------------------- I do think so. I’ve written a lot of containers for Boost and outside Boost. Deque is not my favorite container, though ;-) ------------------------------------- ------------------------------------- ¿Should we accept the library? ------------------------------------- ------------------------------------- I think the library (see my design comments) and the documentation still need improvements, but it's not the main issue. I’m reluctant to accept the library because I don’t see substantial benefits of batch_deque against deque (configurable segment size could be added to Boost.Container if that feature is considered useful), and I don’t know if the use cases where devector is supposed to shine needs a whole new library (we have no clear rules about how tiny a library can be, but in this case it would be a new container and reimplementation of another). Joaquín has proposed to merge this proposal to Boost.Container, but supported compilers and goals are quite different and a direct merge would be problematic as Boost.Container would loose coherency, common goals and minimum requirements between all containers. A middle solutions is to: 1) Start “porting” devector, after all issues have been agreed, (without small vector optimization, at least in the first version) to Boost.Container’s requirements (including some pre C++11 compilers) 2) Improve Boost.Container’s deque to support features (and maybe code) from batch_deque (configurable segment size, local iterators, etc.). That’s a lot of work, discards a lot of batch_queue code and I don’t know if Benedek would agree to help on this task after I spoke against this option when it was proposed and after doing so much work to write the library. Best, Ion

Thorsten Ottosen

4 Oct 4 Oct

9:45 a.m.

Hi Ion, Thanks for the detailed review. Some minor comments follow.

...

The library presents two containers, devector and batch_deque. I find the first one as a relatively simple but interesting vector extension supporting front insertions. In the future it could maybe take advantage of the backwards memory expansion (taking the previously adjacent free memory segment so that new elements could be added to front without moving the rest of the elements) supported by Boost.Interprocess and the Boost.Container’s modified DLMalloc.

Interesting idea.

...

---------- devector --------

1) From a design point of view, I would prefer not to include the small buffer optimization with the data structure, although reviewing code, it seems many parts are to be optimized.

Everything should be removed by the optimizer; if not, then that section of code needs to be amended with a constant expression.

...

It's like just having small_vector without having vector, most libraries have two classes for these two use cases. Different classes also ease that a type erasured base class can be defined independent from the static storage size (Boost.Container calls it small_vector_base, LLVM calls it SmallVectorBase) I think we should have a devector, and add small_devector and static_devector containers if there is demand.

Can they share some of the implementation, or are we left with a double (or triple) implementation effort?

...

2) Defaulting size_type to unsigned int is arbitrary and against existing practice.

We didn't do that as an arbitrary decision.

...

According to the usual STL rules, size_type shall come from the allocator, which knows what type can represent the number of elements that can be allocated.

I have no problems with STL rules, but it as you point out, even Boost.Container deviates from several of those.

...

3) reserve_only_tag: I would rename it as “reserve_only_t” (similarly to std::allocator_arg_t and Boost.Container’s “default_init_t”). The front and back capacity version should be enough as I think devector should not show preference for back insertion (if a user wants only a back capacity, then it would use a vector and not a devector).

So you would prefer it to divide the buffer in two approximately equal chucks? Or that one has to specify each chunk size explicitly each time (so full symmetry, and the one-sided constructor gone)?

...

Strictly speaking, capacity constructors are a bit redundant, but the same could be said about default constructing constructors: //it could be implemented as default constructor + resize explicit vector(size_type n, const Allocator & allocator = Allocator());

Yes, just like push_back is redundant when we have insert( iter, x ) :-)

...

A reserve-only constructor plus an “unsafe” emplacement could make a container similar to an array, where no redundant checks are done when filling the array. But I don’t expect noticeable performance gains with this reserve-only constructor.

No, it is for convenience.

...

4) unsafe_uninitialized_tag: I don’t like these non-safe constructors, which are the cases?. With reserve-only constructors + unsafe emplace back a user can achieve nearly the same performance. For POD types (like the socket example mentioned in the documentation), I would suggest something like Boost.Container’s “default_init_t”. Instead of “unsafe_uninitialized_resize_front” I suggest resize_front(size_type, default_init_t)

I didn't know about these default_init_t when we started in 2015. They have a good design IMO. But unfortunately they don't get rid of double initialization for non-trivial types. So maybe both resize_front( size_type, default_init_t ) resize_front( size_type, uninitialized_t ) would be an idea.

...

5) reserve() / resize(): I think they have no sensible meaning in the container (capacity() could be useful, depending on the reallocation strategy). We should not treat back insertion as the “default” insertion. IMHO the API of devector should be symmetrical.

Symmetry is good, but what about generic code?

...

8) Regarding "move_if_noexcept", Boost.Container does not use it as it consider a pessimized design for a situation (throwing moves) that will occur nearly never [snip] devector itself would be copied if small_buffer_size is used, etc.).

only when it stores elements with a non-noexcept move though.

...

9) Minor comment 2: IMHO I think iterators would be better as classes instead of pointers, a user can always use data() to make sure pointers are returned. Many times unwanted overload resolutions can happen if pointers are used, so I prefer the type safety offered by a class-based iterator. It can also offer checked iterators in the future.

I think you have a way to turn these into pointers again in Boost.Container, By defining some preprocessor symbol, right? Having pointers matters when interfacing with other types, because this is the only case where the iterators are seen as contiguous. I would much rather see that in debug mode the iterator type is a class and in release mode it is a pointer compared to having extra preprocessor symbols.

...

2) Segment_iterators: Raw pointers are stored in segment iterators. That means that those could not be placed in the memory that the allocator allocates, if the allocator does not defined ::pointer as T*.

There is definitely some work relating to allocators and such.

...

Also, it seems strange that there is not segment_type/segment_info class. The iterator itself has members that return the address and the size of a segment. I think that’s unexpected, the expected segment iterator would return a reference to a class (e.g. segment_type) that represents the segment,

I think we discussed that design at some point. Benedek, can you remember why we went in the other direction?

...

3) stable_insert: I can’t see the usefulness of this operation. A batch_deque is a sequence, but the user cares about the insertion point but does not care if additional elements are created somewhere? I fail to see a real use case.

A deque is a sequence, sure, but also a container with reference stability. I'm curious, when you use Boost.Intrusive, where do store the elements?

...

-------------------------- devector --------------------------

emplace_slow_path: The forced temporary creation should be avoided, it is overkill for containers of containers, devector<static_vector> or containers with sentinel nodes. IMHO supporting references to already included elements is a bad feature of std::vector (a corner case supported in single element insertion, ignored for ranges, etc.) Boost.Container chose not to support it and no one has demanded that feature: http://www.boost.org/doc/libs/1_65_1/doc/html/container/Cpp11_conformance.ht...

Yeah, I remember heated discussion about this in past. I'm not a fan of that feature of the STL containers.

...

Reallocation strategy: First, I considered this strategyunnatural, but now I am not sure. For a push_front, if no free front capacity is found, but there is free back capacity, then all data is shifted to insert the element. If interleaved front and back insertions are performed, I don’t know if this will lead to too many data movements. When memory is reallocated to make room for a front insertions, the free back capacity is unchanged and all new free capacity goes to front. This means that the last operation (front or back) determines where all the new capacity goes. I find a bit strange that N == front_back_capacity() does not mean that a reallocation will happen if we push_back N times, as repeated data movement will happen if front_free_capacity() is available.

Another strategy is to treat both ends independently: if front insertion has no free space, it could always reallocate and make room at front, so further push_fronts don’t provoke data movement. This would be similar to batch_deque, where memory is allocated if there is no free front capacityregardless of the free back capacity (no data movement),f and free segment is recycled from back capacity to insert elements at front. Obviously, this approach consumes more memory. Yet another strategy would be distribute any new free capacity (after reallocation) between front and back, indepently from which operation (back or front) has provoked the allocation.

This is tricky. I think the best is to treat both ends independently. If you start moving things around, you can break the O(1) amortized guarantee by interleaving push_front/push_back.

...

2) The data structure o batch_deque is slightly bigger than the traditional one, just asking if it’s intended or it was developed like this to reuse code. Traditional deque (from SGI) used four elements (map pointer, map size, iterator to begin, iterator to end). Your implementation defines the map as a devector, which stores size and capacity for the map, but this allows you handle the excess capacity of the map, I’m not sure if it adds performance benefits.

Reuse & simplicity would be the answer. If the iterators are fat, I bet you get the same size as as a batch_deque. Having a secondary ad hoc data structure with amortized O(1) double ended growth doesn't exactly make the code easier to reason about. kind regards Thorsten

Chris Glover

1:32 p.m.

...

I think you have a way to turn these into pointers again in Boost.Container, By defining some preprocessor symbol, right? Having pointers matters when interfacing with other types, because this is the only case where the iterators are seen as contiguous.

I would much rather see that in debug mode the iterator type is a class and in release mode it is a pointer compared to having extra preprocessor symbols.

This is a bad idea. Early std library implementations did this and it caused several problems, including code that compiled in release, but not debug, or worse, different overload resolution between debug and release builds. There should be a way to get at the pointers, of course, but it should require specific action by the programmer and not create different types automatically between debug and release builds, as that is surprising. -- chris

Thorsten Ottosen

1:38 p.m.

Den 04-10-2017 kl. 15:32 skrev Chris Glover via Boost:

...

...
I think you have a way to turn these into pointers again in Boost.Container, By defining some preprocessor symbol, right? Having pointers matters when interfacing with other types, because this is the only case where the iterators are seen as contiguous.

I would much rather see that in debug mode the iterator type is a class and in release mode it is a pointer compared to having extra preprocessor symbols.

This is a bad idea. Early std library implementations did this and it caused several problems, including code that compiled in release, but not debug, or worse, different overload resolution between debug and release builds.

Sure, the whole point of having a class type in debug mode is to catch things at compile time. And don't you get the same when we start defining BOOST_CONTAINER_ITERATOR_AS_POINTER? -T

Ion Gaztañaga

2:08 p.m.

On 04/10/2017 15:38, Thorsten Ottosen via Boost wrote:

...

And don't you get the same when we start defining BOOST_CONTAINER_ITERATOR_AS_POINTER?

Which was recently removed from the develop branch. AFAIK, there was no demand and maintenance and testing for both iterator modes was a cost I preferred to eliminate. Best, Ion

Thorsten Ottosen

3:23 p.m.

Den 04-10-2017 kl. 16:08 skrev Ion Gaztañaga via Boost:

...

On 04/10/2017 15:38, Thorsten Ottosen via Boost wrote:

...
And don't you get the same when we start defining BOOST_CONTAINER_ITERATOR_AS_POINTER?

Which was recently removed from the develop branch. AFAIK, there was no demand and maintenance and testing for both iterator modes was a cost I preferred to eliminate.

Ok. Given the lack of a contiguous iterator tag, this is not just a switch from pointer to class iterator. The internal code now needs to be careful when it's passed its own iterator type and detect that it actually has a contiguous range. kind regards -Thorsten

Ion Gaztañaga

8:21 p.m.

On 04/10/2017 17:23, Thorsten Ottosen via Boost wrote:

...

Den 04-10-2017 kl. 16:08 skrev Ion Gaztañaga via Boost:

...
On 04/10/2017 15:38, Thorsten Ottosen via Boost wrote:

...
And don't you get the same when we start defining BOOST_CONTAINER_ITERATOR_AS_POINTER?

Which was recently removed from the develop branch. AFAIK, there was no demand and maintenance and testing for both iterator modes was a cost I preferred to eliminate.

Ok.

Given the lack of a contiguous iterator tag, this is not just a switch from pointer to class iterator. The internal code now needs to be careful when it's passed its own iterator type and detect that it actually has a contiguous range.

Right. Internally used algorithms query "are_elements_contiguous<T>" traits which is specialized to true for vector_iterator. Ion

Thorsten Ottosen

5 Oct 5 Oct

3:46 p.m.

Den 04-10-2017 kl. 22:21 skrev Ion Gaztañaga via Boost:

...

On 04/10/2017 17:23, Thorsten Ottosen via Boost wrote:

...

...
Given the lack of a contiguous iterator tag, this is not just a switch from pointer to class iterator. The internal code now needs to be careful when it's passed its own iterator type and detect that it actually has a contiguous range.

Right. Internally used algorithms query "are_elements_contiguous<T>" traits which is specialized to true for vector_iterator.

Makes sense. And it works great when applied within one library, say Boost.Container. But then what about all the other container libraries in boost: BiMap CircularBuffer Gil MultiArray MultiIndex Unordered etc. We would have to have a Boost-wide trait for all interactions to be optimal. And then we can't do anything about interactions with the standard library. Anyway, I don't feel strongly about it either way. kind regards Thorsten

Ion Gaztañaga

4 Oct 4 Oct

9:28 p.m.

On 04/10/2017 11:45, Thorsten Ottosen via Boost wrote:

...

...
It's like just having small_vector without having vector, most libraries have two classes for these two use cases. Different classes also ease that a type erasured base class can be defined independent from the static storage size (Boost.Container calls it small_vector_base, LLVM calls it SmallVectorBase) I think we should have a devector, and add small_devector and static_devector containers if there is demand.

Can they share some of the implementation, or are we left with a double (or triple) implementation effort?

They can obviously share implementation. That's the case with vector/small_vector/static_vector.

...

...
2) Defaulting size_type to unsigned int is arbitrary and against existing practice.

We didn't do that as an arbitrary decision.

Yes, sorry, "arbitrary" sounds too strong.

...

...
According to the usual STL rules, size_type shall come from the allocator, which knows what type can represent the number of elements that can be allocated.

I have no problems with STL rules, but it as you point out, even Boost.Container deviates from several of those.

Yes, but only when they are "clearly bad rules (TM)" ;-)

...

...
3) reserve_only_tag: I would rename it as “reserve_only_t” (similarly to std::allocator_arg_t and Boost.Container’s “default_init_t”). The front and back capacity version should be enough as I think devector should not show preference for back insertion (if a user wants only a back capacity, then it would use a vector and not a devector).

So you would prefer it to divide the buffer in two approximately equal chucks? Or that one has to specify each chunk size explicitly each time (so full symmetry, and the one-sided constructor gone)?

Good questions. Maybe the first, but that would apply also to "reserve".

...

...
4) unsafe_uninitialized_tag: I don’t like these non-safe constructors, which are the cases?. With reserve-only constructors + unsafe emplace back a user can achieve nearly the same performance. For POD types (like the socket example mentioned in the documentation), I would suggest something like Boost.Container’s “default_init_t”. Instead of “unsafe_uninitialized_resize_front” I suggest resize_front(size_type, default_init_t)

I didn't know about these default_init_t when we started in 2015. They have a good design IMO. But unfortunately they don't get rid of double initialization for non-trivial types. So maybe both

resize_front( size_type, default_init_t ) resize_front( size_type, uninitialized_t )

would be an idea.

I understand the idea to avoid zero initialization for buffers, but what's the use case for types with non-trivial constructors? The documentation shows a socket example, and that's covered ith default_init_t.

...

...
5) reserve() / resize(): I think they have no sensible meaning in the container (capacity() could be useful, depending on the reallocation strategy). We should not treat back insertion as the “default” insertion. IMHO the API of devector should be symmetrical.

Symmetry is good, but what about generic code?

Generic code can't rely on reserve, only vector implements it in the STL. But if reserve() and resize() are unbiased to back insertion maybe they could make sense. reserve() allocates a new buffer if capacity() is not enough and moves existing elements to the middle. Resize could do the same and fill with elements on both ends. I somewhat have the idea that in a devector both front and back insertions should be common, so a back-biased approach should be avoided. Not sure if real usage follows this pattern.

...

...
8) Regarding "move_if_noexcept", Boost.Container does not use it as it consider a pessimized design for a situation (throwing moves) that will occur nearly never [snip] devector itself would be copied if small_buffer_size is used, etc.).

only when it stores elements with a non-noexcept move though.

Right.

...

...
9) Minor comment 2: IMHO I think iterators would be better as classes instead of pointers, a user can always use data() to make sure pointers are returned. Many times unwanted overload resolutions can happen if pointers are used, so I prefer the type safety offered by a class-based iterator. It can also offer checked iterators in the future.

I think you have a way to turn these into pointers again in Boost.Container, By defining some preprocessor symbol, right? Having pointers matters when interfacing with other types, because this is the only case where the iterators are seen as contiguous.

I would much rather see that in debug mode the iterator type is a class and in release mode it is a pointer compared to having extra preprocessor symbols.

That could have surprising compilation errors when switching modes. For debug iterators, I find interesting the libc++ approach (but I don't know how it performs) to maintain binary compatibility between checked and non-checked iterators using a global database store containers and iterators.

...

...
3) stable_insert: I can’t see the usefulness of this operation. A batch_deque is a sequence, but the user cares about the insertion point but does not care if additional elements are created somewhere? I fail to see a real use case.

A deque is a sequence, sure, but also a container with reference stability. I'm curious, when you use Boost.Intrusive, where do store the elements?

vector if object count or maximum object count is known and all elements are going to be destroyed at the same time. Deque if the number of elements are not known (but they are destroyed at the same time). If elements are destroyed in any order, then you need a pool, a list or some kind of free list backed with a deque or similar. stable_insert can maintain stability with a middle insertion but there is no equivalent operation for erasures. I just can't imagine a use case that tan take advantage of stable_insert (it needs more or less a positional insertion, but not absolute, instead of front/back insertion) but it will surely exist. Additional note: stable_insert also constructs a temporary batch_deque which is not passed the stored allocator so that objects are created in the same memory type of already stored objects. The temporary container should be avoided, maybe new segments could be back/front allocated and initialized then then just reinserted in their final positions in the map.

...

This is tricky. I think the best is to treat both ends independently. If you start moving things around, you can break the O(1) amortized guarantee by interleaving push_front/push_back.

What do you mean by "treat both ends independtly"? Best, Ion

Thorsten Ottosen

5 Oct 5 Oct

5:02 p.m.

Den 04-10-2017 kl. 23:28 skrev Ion Gaztañaga via Boost:

...

On 04/10/2017 11:45, Thorsten Ottosen via Boost wrote:

...
Can they share some of the implementation, or are we left with a double (or triple) implementation effort?

They can obviously share implementation. That's the case with vector/small_vector/static_vector.

Ok, great.

...

...
...
According to the usual STL rules, size_type shall come from the allocator, which knows what type can represent the number of elements that can be allocated.

I have no problems with STL rules, but it as you point out, even Boost.Container deviates from several of those.

Yes, but only when they are "clearly bad rules (TM)" ;-)

I don't have any strong position about the size_type but thought it nice to have a small size of the container on 64 system. I do sometimes wonder if breaking the strong guarantee is a smart move. It seems to me that we can end up with non-trivial objects where the invariant is broken. So in principle we can't even reliably destroy them if the destructor relies on the invariant. I guess I really dislike that we are allowed to write move operations that are not noexcept (like swap).

...

...
...
4) unsafe_uninitialized_tag: I don’t like these non-safe constructors, which are the cases?. With reserve-only constructors + unsafe emplace back a user can achieve nearly the same performance. For POD types (like the socket example mentioned in the documentation), I would suggest something like Boost.Container’s “default_init_t”. Instead of “unsafe_uninitialized_resize_front” I suggest resize_front(size_type, default_init_t)

I didn't know about these default_init_t when we started in 2015. They have a good design IMO. But unfortunately they don't get rid of double initialization for non-trivial types. So maybe both

resize_front( size_type, default_init_t ) resize_front( size_type, uninitialized_t )

would be an idea.

I understand the idea to avoid zero initialization for buffers, but what's the use case for types with non-trivial constructors? The documentation shows a socket example, and that's covered ith default_init_t.

Well, I don't know if my compiler is wrong, but vc 2017 says: class IntClass { int i; public: IntClass() : i(0) {} IntClass( int i ) : i( i ) {} }; is trivially copyable (hence we can copy arrays of it as fast as arrays of ints). But it will be initialized with zero if I use default_init_t, right? If so, I'm just saying that it may be nice/beneficial to be able to avoid double initialization also for IntClass (perhaps not relevant to sockets, but certainly to serialization). I do agree that reserve + unchecked_emplace_back() should be working well for types that are /not/ trivially copyable (also better in terms of exception-safety). The unitialized_resize idea is probably before we had emplace and so at that time there was no efficient/easy way to construct an object inplace. About the naming of unsafe_emplace_back: I prefer your suggestion with the unchecked_ prefix. Other ideas would be nonreallocating_emplace_back nonallocating_emplace_back nonexpanding_emplace_back nongrowing_emplace_back // seems wrong, since the size does change

...

...
...
5) reserve() / resize(): I think they have no sensible meaning in the container (capacity() could be useful, depending on the reallocation strategy). We should not treat back insertion as the “default” insertion. IMHO the API of devector should be symmetrical.

Symmetry is good, but what about generic code?

Generic code can't rely on reserve, only vector implements it in the STL. But if reserve() and resize() are unbiased to back insertion maybe they could make sense. reserve() allocates a new buffer if capacity() is not enough and moves existing elements to the middle. Resize could do the same and fill with elements on both ends.

If reserve is not an alias for reserve_back, the I think we must remove it. I was thinking along generic code along the line template< class BacKInsertionContiguousContainer > void foo( BacKInsertionContiguousContainer& cont ) { cont.reserve( source.size() ); for( ... ) cont.push_back( ... ); } why should that not work with both devector and vector?

...

stable_insert can maintain stability with a middle insertion but there is no equivalent operation for erasures. I just can't imagine a use case that tan take advantage of stable_insert (it needs more or less a positional insertion, but not absolute, instead of front/back insertion) but it will surely exist.

Yeah, I don't have a use case at hand either. Somehow you would want to insert a segments somewhere among the other segments. Then why not allow the segments themselves to be rearranged? (potentially almost trivial to implement). So I guess we could expose a complete container interface to the segments, but it is perhaps premature until we have a good use case.

...

...
This is tricky. I think the best is to treat both ends independently. If you start moving things around, you can break the O(1) amortized guarantee by interleaving push_front/push_back.

What do you mean by "treat both ends independtly"?

I mean that buffer extensions at either end should never affect each other. That is, a push_back can never remove free capacity from the front, a push_front can never remove free capacity from the back. kind regards Thorsten

Ion Gaztañaga

8:42 p.m.

On 05/10/2017 19:02, Thorsten Ottosen via Boost wrote:

...

...
I understand the idea to avoid zero initialization for buffers, but what's the use case for types with non-trivial constructors? The documentation shows a socket example, and that's covered ith default_init_t.

Well, I don't know if my compiler is wrong, but vc 2017 says:

class IntClass { int i; public: IntClass() : i(0) {} IntClass( int i ) : i( i ) {} };

is trivially copyable (hence we can copy arrays of it as fast as arrays of ints). But it will be initialized with zero if I use default_init_t, right?

Right.

...

I do agree that reserve + unchecked_emplace_back() should be working well for types that are /not/ trivially copyable (also better in terms of exception-safety). The unitialized_resize idea is probably before we had emplace and so at that time there was no efficient/easy way to construct an object inplace.

Ok, understood

...

If reserve is not an alias for reserve_back, the I think we must remove it.

reserve() means that a container won't reallocate if size() < capacity(). Where memory is reserved is irrelevant for the user.

...

I was thinking along generic code along the line

template< class BacKInsertionContiguousContainer > void foo( BacKInsertionContiguousContainer& cont ) { cont.reserve( source.size() ); for( ... ) cont.push_back( ... );

}

why should that not work with both devector and vector?

It would work, but it would not be as efficient as vector.

...

...
...
This is tricky. I think the best is to treat both ends independently. If you start moving things around, you can break the O(1) amortized guarantee by interleaving push_front/push_back.

What do you mean by "treat both ends independtly"?

I mean that buffer extensions at either end should never affect each other. That is, a push_back can never remove free capacity from the front, a push_front can never remove free capacity from the back.

Ok. I still have no clear idea which allocation strategy would be the "natural" one. I've found this blog with a similar discussion and alternatives: http://larshagencpp.github.io/blog/2016/05/22/devector also linked from the blog post: https://github.com/orlp/devector Moving elements just one step when one side is full and the other side has capacity has bad quadratic behavior while current capacity is being exhausted and we continue inserting in the same side. Moving elements to the middle of the buffer seems a simple operation that reduces this to something closer to O(NlogN). In any case, I will always bad insertion sequence that hurts our strategy, so optimizing the common case seems the way to go. Best, Ion

Thorsten Ottosen

6 Oct 6 Oct

9:53 a.m.

Den 05-10-2017 kl. 22:42 skrev Ion Gaztañaga via Boost:

...

On 05/10/2017 19:02, Thorsten Ottosen via Boost wrote:

...
...
I understand the idea to avoid zero initialization for buffers, but what's the use case for types with non-trivial constructors? The documentation shows a socket example, and that's covered ith default_init_t.

Well, I don't know if my compiler is wrong, but vc 2017 says:

class IntClass { int i; public: IntClass() : i(0) {} IntClass( int i ) : i( i ) {} };

is trivially copyable (hence we can copy arrays of it as fast as arrays of ints). But it will be initialized with zero if I use default_init_t, right?

Right.

But I agree we don't need two overloads, the existing one just has to work in terms of trivially copyable. We can also put a static assertion in there to lock it down to such types for now.

...

...
If reserve is not an alias for reserve_back, the I think we must remove it.

reserve() means that a container won't reallocate if size() < capacity(). Where memory is reserved is irrelevant for the user.

...
I was thinking along generic code along the line

template< class BacKInsertionContiguousContainer > void foo( BacKInsertionContiguousContainer& cont ) { cont.reserve( source.size() ); for( ... ) cont.push_back( ... );

}

why should that not work with both devector and vector?

It would work, but it would not be as efficient as vector.

It may be. But I guess my concern was to make it compile. Sure, if a call to reserve is conditioned on d.capacity() - d.size(), it can behave differently. Reserve maybe easier to do something about. We may need to tweak the definition of reserve_back/reserve_front a little. Let's say we have x x 1 1 x so front_free_capacity is 2, back free capacity is 1 ans size is 2. with d.reserve( d.size() + 3 ); devector uses the expression n = new_capacity - size() so we get n = 5 - 2 = 3 and we get x x 1 1 x x x . A vector 1 1 x does the same. So is your concern that users use capacity to determine what/if to reserve?

...

...
...
...
This is tricky. I think the best is to treat both ends independently. If you start moving things around, you can break the O(1) amortized guarantee by interleaving push_front/push_back.

What do you mean by "treat both ends independtly"?

I mean that buffer extensions at either end should never affect each other. That is, a push_back can never remove free capacity from the front, a push_front can never remove free capacity from the back.

Ok. I still have no clear idea which allocation strategy would be the "natural" one. I've found this blog with a similar discussion and alternatives:

http://larshagencpp.github.io/blog/2016/05/22/devector

also linked from the blog post:

https://github.com/orlp/devector

Certainly a good analysis.

...

Moving elements just one step when one side is full and the other side has capacity has bad quadratic behavior while current capacity is being exhausted and we continue inserting in the same side. Moving elements to the middle of the buffer seems a simple operation that reduces this to something closer to O(NlogN). In any case, I will always bad insertion sequence that hurts our strategy, so optimizing the common case seems the way to go.

I'm ok with inserts in the middle to move stuff around since it already has O(n) worst case behavior. For push_back/push_front I don't know. It would have to be really simple to be worth the effort IMO. Say, only when other_size_free_capacity >= size(). The simple thing would be to grow independently, and then the user calls shrink to fit when done if he cares about it. BTW: Before the review some people complained that the container could not work as deque when you repeatedly pop from one end and push another. Something to ponder ... kind regards Thorsten

Thorsten Ottosen

9 Oct 9 Oct

8:30 a.m.

Den 06-10-2017 kl. 11:53 skrev Thorsten Ottosen via Boost:

...

Den 05-10-2017 kl. 22:42 skrev Ion Gaztañaga via Boost:

...
On 05/10/2017 19:02, Thorsten Ottosen via Boost wrote:

...

...
...
I was thinking along generic code along the line

template< class BacKInsertionContiguousContainer > void foo( BacKInsertionContiguousContainer& cont ) { cont.reserve( source.size() ); for( ... ) cont.push_back( ... );

}

why should that not work with both devector and vector?

It would work, but it would not be as efficient as vector.

It may be. But I guess my concern was to make it compile. Sure, if a call to reserve is conditioned on d.capacity() - d.size(), it can behave differently. Reserve maybe easier to do something about.

What if we did this: capacity() -> alias for back_capacity back_capacity() -> return size() + back_free_capacity() front_capacity() -> return size() + front_free_capacity() ? kind regards Thorsten

Ion Gaztañaga

10:12 p.m.

On 09/10/2017 10:30, Thorsten Ottosen via Boost wrote:

...

What if we did this:

capacity() -> alias for back_capacity back_capacity() -> return size() + back_free_capacity() front_capacity() -> return size() + front_free_capacity()

?

I think it's very confusing. If we move elements while inserting on one end and the free capacity of that end is zero, then capacity() has sense, it's the limit where memory will be reallocated. Best, Ion

Thorsten Ottosen

10 Oct 10 Oct

8:08 a.m.

Den 10-10-2017 kl. 00:12 skrev Ion Gaztañaga via Boost:

...

On 09/10/2017 10:30, Thorsten Ottosen via Boost wrote:

...
What if we did this:

capacity() -> alias for back_capacity back_capacity() -> return size() + back_free_capacity() front_capacity() -> return size() + front_free_capacity()

?

I think it's very confusing. If we move elements while inserting on one end and the free capacity of that end is zero, then capacity() has sense, it's the limit where memory will be reallocated.

Let's not mix moving in push_back/push_front with reallocation (*). The above definition solve exactly what you and Joaquin saw as a problem while at the same time /generalizing/ the vector behavior for both ends. If we have - as free space and x as elements, then -----xxxxxxxxxxxxx--------- ^^^^^ front free capacity ^^^^^^^^^^^^^^^^^^ front capacity ^^^^^^^^^^^^^ size ^^^^^^^^^ back free capacity ^^^^^^^^^^^^^^^^^^^^^^ back capacity (a.k.a capacity) The limit where new memory is acquired for insert can be found easily enough. Basically we can't both generalize vector behavior and get the segment size in one function. So pick one or none. Two of them allow generic code to compile. kind regards Thorsten (*) If push_back/push_front steals memory from the other end, we should be cautious about when this is done (potentially high constant factors as mentioned in the analysis you found). But it is more than that: it may mean that push_back/push_front cannot give the strong exception-safety guarantee. Are you really given that away for Boost.Container's vector? (I hope not).

Ion Gaztañaga

9:23 p.m.

On 10/10/2017 10:08, Thorsten Ottosen via Boost wrote:

...

If we have - as free space and x as elements, then

-----xxxxxxxxxxxxx---------

^^^^^ front free capacity ^^^^^^^^^^^^^^^^^^ front capacity ^^^^^^^^^^^^^ size ^^^^^^^^^ back free capacity ^^^^^^^^^^^^^^^^^^^^^^ back capacity (a.k.a capacity)

Ok, I see it. So capacity() is an alias for back_capacity(). That might work for if a user only calls push_back in generic code but any generic code that checks for capacity() == size() would be broken.

...

The limit where new memory is acquired for insert can be found easily enough.

Basically we can't both generalize vector behavior and get the segment size in one function. So pick one or none. Two of them allow generic code to compile.

I think the following is coherent, maybe not optimal in moves if only push_back is used: -----xxxxxxxxxxxxx--------- ^^^^^---------------------- front free capacity ^^^^^^^^^^^^^^^^^^--------- front capacity -----^^^^^^^^^^^^^--------- size ------------------^^^^^^^^^ back free capacity -----^^^^^^^^^^^^^^^^^^^^^^ back capacity ^^^^^^^^^^^^^^^^^^^^^^^^^^^ capacity front/back_free_capacity indicates the number of push_front/backs until data movement (and maybe allocation) is needed. front/back_capacity is front/back_free_capacity + size. If size == front/back_capacity, then data movement (and maybe allocation) will happen capacity == size means that any insertion will lead to reallocation.

...

(*) If push_back/push_front steals memory from the other end, we should be cautious about when this is done (potentially high constant factors as mentioned in the analysis you found). But it is more than that: it may mean that push_back/push_front cannot give the strong exception-safety guarantee. Are you really given that away for Boost.Container's vector? (I hope not).

Sorry about the bad news, push_back has no strong safety in boost::container::vector, as by design, move_if_noexcept is not used. It's mentioned in the documentation. A programmer that can tell the difference with std::vector needs to use throwing moves, catch the exception (if uncatched, the program will terminate and any guarantee is bogus) and do something with it that is based on strong exception safety. And vectors with potentially throwing moves will be really slow. I think vector's strong safety was a consequence from the C++03's copy semantics, and backwards compatibility had to be maintained in C++11 for types that define move constructors and assignments. If designed today, I doubt std::vector would offer strong exception safety for push_back.

degski

11 Oct 11 Oct

2:36 a.m.

On 11 October 2017 at 00:23, Ion Gaztañaga via Boost <boost@lists.boost.org> wrote:

...

I think the following is coherent, maybe not optimal in moves if only push_back is used:

-----xxxxxxxxxxxxx---------

^^^^^---------------------- front free capacity ^^^^^^^^^^^^^^^^^^--------- front capacity -----^^^^^^^^^^^^^--------- size ------------------^^^^^^^^^ back free capacity -----^^^^^^^^^^^^^^^^^^^^^^ back capacity ^^^^^^^^^^^^^^^^^^^^^^^^^^^ capacity

front/back_free_capacity indicates the number of push_front/backs until data movement (and maybe allocation) is needed.

front/back_capacity is front/back_free_capacity + size. If size == front/back_capacity, then data movement (and maybe allocation) will happen

capacity == size means that any insertion will lead to reallocation.

I've tested a toy implementation doing exactly this. template<typename T, typename P = allocation_policy<1, 1>, typename A = std::allocator<T>, typename G = vs_growth_policy<2>> class veque : private A { // Stuff... inline size_type size ( ) const noexcept { return ( size_type ) ( m_end - m_begin ); } inline size_type capacity ( ) const noexcept { return m_free_capacity_begin + size ( ) + m_free_capacity_end; } size_type m_free_capacity_begin = 0, m_free_capacity_end = 0; pointer m_begin = nullptr, m_end = nullptr; }; With a size_type std::uint32_t, the foot print is 24 bytes on 64 bit (like de::devector). The idea with the allocation policy is to re-balance the total free space on relocation between front and back free space (1 to 1 in the default case, but could obviously be anything). This leads to lower memory usage on random growth (push_front/push_back) over a std::vector of veque(s) (de::devector). This approach is speedwise on par with de:devector for push_front/push_back, and doesn't give surprises in the the free capacity department. degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

11:20 a.m.

Den 10-10-2017 kl. 23:23 skrev Ion Gaztañaga via Boost:

...

On 10/10/2017 10:08, Thorsten Ottosen via Boost wrote:

...
If we have - as free space and x as elements, then

-----xxxxxxxxxxxxx---------

^^^^^ front free capacity ^^^^^^^^^^^^^^^^^^ front capacity ^^^^^^^^^^^^^ size ^^^^^^^^^ back free capacity ^^^^^^^^^^^^^^^^^^^^^^ back capacity (a.k.a capacity)

Ok, I see it. So capacity() is an alias for back_capacity(). That might work for if a user only calls push_back in generic code but any generic code that checks for capacity() == size() would be broken.

why? If the vector is full, capacity() == size(). If the devector's right hand size is full, size() == back_capacity().

...

...
The limit where new memory is acquired for insert can be found easily enough.

Basically we can't both generalize vector behavior and get the segment size in one function. So pick one or none. Two of them allow generic code to compile.

I think the following is coherent, maybe not optimal in moves if only push_back is used:

-----xxxxxxxxxxxxx---------

^^^^^---------------------- front free capacity ^^^^^^^^^^^^^^^^^^--------- front capacity -----^^^^^^^^^^^^^--------- size ------------------^^^^^^^^^ back free capacity -----^^^^^^^^^^^^^^^^^^^^^^ back capacity ^^^^^^^^^^^^^^^^^^^^^^^^^^^ capacity

front/back_free_capacity indicates the number of push_front/backs until data movement (and maybe allocation) is needed.

front/back_capacity is front/back_free_capacity + size. If size == front/back_capacity, then data movement (and maybe allocation) will happen

capacity == size means that any insertion will lead to reallocation.

That's the alternative, but that is the one that breaks generic code, isn't it?. With my scheme above, if you absolutely must know if insertion reallocates, you can use front_free_capacity() + back_free_capacity() == 0. I don't know if that needs a special function though (it could be called full() ).

...

...
(*) If push_back/push_front steals memory from the other end, we should be cautious about when this is done (potentially high constant factors as mentioned in the analysis you found). But it is more than that: it may mean that push_back/push_front cannot give the strong exception-safety guarantee. Are you really given that away for Boost.Container's vector? (I hope not).

Sorry about the bad news, push_back has no strong safety in boost::container::vector, as by design, move_if_noexcept is not used. It's mentioned in the documentation.

Ah, yes. Only when the type is well-behaved (non-movable, only copyable or no-throw movable) we guarantee that, right? kind regards Thorsten

Ion Gaztañaga

7 p.m.

On 11/10/2017 13:20, Thorsten Ottosen via Boost wrote:

...

why? If the vector is full, capacity() == size(). If the devector's right hand size is full, size() == back_capacity().

But if capacity() is an alias for back_capacity() and size() == capacity(), an insertion in the middle would not trigger a reallocation. capacity() > size() means to me "is there room for a new insertion without reallocation"?

...

...
capacity == size means that any insertion will lead to reallocation.

That's the alternative, but that is the one that breaks generic code, isn't it?.

Which generic code? I think it's the opposite.

...

With my scheme above, if you absolutely must know if insertion reallocates, you can use front_free_capacity() + back_free_capacity() == 0. I don't know if that needs a special function though (it could be called full() ).

For any container with reserve/capacity (vector, string, flat_map, circular_buffer, poly_collection, etc.) if capacity() == size() that means a reallocation will happen for *any* insertion, not just push_back. If you want guarantees for back insertion only, then back_capacity() should give you the answer. Just my 2 cents. Best, Ion

Thorsten Ottosen

12 Oct 12 Oct

8:37 a.m.

Den 11-10-2017 kl. 21:00 skrev Ion Gaztañaga via Boost:

...

On 11/10/2017 13:20, Thorsten Ottosen via Boost wrote:

...
why? If the vector is full, capacity() == size(). If the devector's right hand size is full, size() == back_capacity().

But if capacity() is an alias for back_capacity() and size() == capacity(), an insertion in the middle would not trigger a reallocation. capacity() > size() means to me "is there room for a new insertion without reallocation"?

...
...
capacity == size means that any insertion will lead to reallocation.

That's the alternative, but that is the one that breaks generic code, isn't it?.

Which generic code? I think it's the opposite.

We started with code like (*) if( container.capacity() > container.size() + new_elements ) container.reserve( container.size() + new_elements ); ... for( ... ) container.push_back(...); and we found that this would sometimes not reallocate for devector. So I proposed a design that did. But now ...

...

For any container with reserve/capacity (vector, string, flat_map, circular_buffer, poly_collection, etc.) if capacity() == size() that means a reallocation will happen for *any* insertion, not just push_back. If you want guarantees for back insertion only, then back_capacity() should give you the answer. Just my 2 cents.

... we broke capacity() == size() means that any insertion will reallocate. What I was trying to convey is that there is no definition of capacity/back_capacity etc that can simultaneously satisfy both. I think the guiding principle should be to support idiomatic code. (*) is not idiomatic: there is absolutely no reason to guard a call to reserve when you can just write: container.reserve( container.size() + new_elements ); That means your design where capacity is the the full buffer length can satisfy both situations. So I agree with you. But it also calls into question the need for anything else than capacity (i.e., there seem to be vanishingly little use for back_capacity/front_capacity and they probably confuse more than they help). I do think reserve is harder to use than it should be. I would have preferred it was called container.anticipate_back( new_elements ); which works regardless of the size of the container. kind regards Thorsten

Ion Gaztañaga

5:17 p.m.

On 12/10/2017 10:37, Thorsten Ottosen via Boost wrote:

...

That means your design where capacity is the the full buffer length can satisfy both situations. So I agree with you. But it also calls into question the need for anything else than capacity (i.e., there seem to be vanishingly little use for back_capacity/front_capacity and they probably confuse more than they help).

I agree. If capacity is the full buffer length, only back/front_capacity make sense. Best, Ion

Nevin Liber

6:34 p.m.

On Thu, Oct 12, 2017 at 12:17 PM, Ion Gaztañaga via Boost < boost@lists.boost.org> wrote:

...

On 12/10/2017 10:37, Thorsten Ottosen via Boost wrote:

That means your design where capacity is the the full buffer length can

...
satisfy both situations. So I agree with you. But it also calls into question the need for anything else than capacity (i.e., there seem to be vanishingly little use for back_capacity/front_capacity and they probably confuse more than they help).

I agree. If capacity is the full buffer length, only back/front_capacity make sense.

Normally, my check against capacity() is usually about iterators, pointers and/or references possibly being invalidated more than knowing that an actual allocation will take place. Have we considered capacity() == min(front_capacity(), back_capacity())? Just a thought, Nevin :-) -- Nevin ":-)" Liber <mailto:nevin@cplusplusguy.com <nevin@eviloverlord.com>> +1-847-691-1404

Ion Gaztañaga

6:42 p.m.

On 12/10/2017 20:34, Nevin Liber via Boost wrote:

...

On Thu, Oct 12, 2017 at 12:17 PM, Ion Gaztañaga via Boost < boost@lists.boost.org> wrote:

...
On 12/10/2017 10:37, Thorsten Ottosen via Boost wrote:

That means your design where capacity is the the full buffer length can

...
satisfy both situations. So I agree with you. But it also calls into question the need for anything else than capacity (i.e., there seem to be vanishingly little use for back_capacity/front_capacity and they probably confuse more than they help).

I agree. If capacity is the full buffer length, only back/front_capacity make sense.

Normally, my check against capacity() is usually about iterators, pointers and/or references possibly being invalidated more than knowing that an actual allocation will take place.

Might be. But the name capacity() might not be a good name to check for invalidation. Best, Ion

Thorsten Ottosen

13 Oct 13 Oct

10:49 a.m.

Den 12-10-2017 kl. 20:34 skrev Nevin Liber via Boost:

...

Normally, my check against capacity() is usually about iterators, pointers and/or references possibly being invalidated more than knowing that an actual allocation will take place.

What is the difference be an allocation and an invalidation? I assume your context is "what happens when I call push_back?".

...

Have we considered capacity() == min(front_capacity(), back_capacity())?

Or is your context "what happens when I call either push_back or push_front?" Maybe you can tell us what your program does when it detects that things are invalidated? Do you then recompute the iterators and references? kind regards -Thorsten

Thorsten Ottosen

26 Oct 26 Oct

6:10 p.m.

Hi Nevin,

...

On Thu, Oct 12, 2017 at 12:17 PM, Ion Gaztañaga via Boost < boost@lists.boost.org> wrote:

...
On 12/10/2017 10:37, Thorsten Ottosen via Boost wrote:

That means your design where capacity is the the full buffer length can

...
satisfy both situations. So I agree with you. But it also calls into question the need for anything else than capacity (i.e., there seem to be vanishingly little use for back_capacity/front_capacity and they probably confuse more than they help).

I agree. If capacity is the full buffer length, only back/front_capacity make sense.

Normally, my check against capacity() is usually about iterators, pointers and/or references possibly being invalidated more than knowing that an actual allocation will take place.

Have we considered capacity() == min(front_capacity(), back_capacity())?

Just a thought,

I'm just rereading all the messages when I came to this one. That definition seems to satisfy: A. size() == capacity() implies any insertion /can/ cause reallocation. B. if( size() + new_elements > capacity ), then reserve() style does not lead to any invalidation in subsequent push_back (addressing Joaquin's concern). Ion was using the definition "any insertion /will/ cause reallocation. This is all very subtle since it depends on the definition of reserve() too. But I think it's the best definition we have seen. kind regards Thorsten

Ion Gaztañaga

27 Oct 27 Oct

12:47 p.m.

On 26/10/2017 20:10, Thorsten Ottosen via Boost wrote:

...

I'm just rereading all the messages when I came to this one.

That definition seems to satisfy:

A. size() == capacity() implies any insertion /can/ cause reallocation.

B. if( size() + new_elements > capacity ), then reserve() style does not lead to any invalidation in subsequent push_back (addressing Joaquin's concern).

Ion was using the definition "any insertion /will/ cause reallocation.

This is all very subtle since it depends on the definition of reserve() too.

But I think it's the best definition we have seen.

Yes, not only that. These days I was thinking about a modified capacity() definition. If size() == capacity() an insertion can provoke more than optimal data movement (iterator invalidation) and (optionally) reallocation. That could be implemented if capacity() is defined as: max(size(), min(back_capacity(), front_capacity()); If an insertion is done in the side that has free capacity, no data movement and memory allocation is done, but for the user is similar to a in-place buffer expansion. Capacity has grown to accommodate the insertion without moving anything more that needed. If a user want's to be sure if an insertion in one side will be supported without more than optimal data movement or iterator invalidation, then he/she can check back/front_capacity() against size(). I've haven't thought how reserve() is affected. Best, Ion

Thorsten Ottosen

30 Oct 30 Oct

6:27 p.m.

Den 27-10-2017 kl. 14:47 skrev Ion Gaztañaga via Boost:

...

On 26/10/2017 20:10, Thorsten Ottosen via Boost wrote:

...

...
But I think it's the best definition we have seen.

Yes, not only that. These days I was thinking about a modified capacity() definition.

If size() == capacity() an insertion can provoke more than optimal data movement (iterator invalidation) and (optionally) reallocation. That could be implemented if capacity() is defined as:

max(size(), min(back_capacity(), front_capacity());

Is this not just equivalent to min(back_capacity(), front_capacity()) given that back/front_capacity >= size()?

...

If an insertion is done in the side that has free capacity, no data movement and memory allocation is done, but for the user is similar to a in-place buffer expansion. Capacity has grown to accommodate the insertion without moving anything more that needed.

If a user want's to be sure if an insertion in one side will be supported without more than optimal data movement or iterator invalidation, then he/she can check back/front_capacity() against size().

Sure, we can make new semantics. I think Joaquin's concern was that if we provide both capacity() and reserve(), we better do it 100% compatible with vector. In our source code, I found just one mention of capacity(). In the dependencies, less than fourty. And only one involved a comparison with size(). That one in 800-900k lines of code. Other code bases may be different. And given that we do provide means for exact queries (this code will reallocate unless you call reserve_xxx), I don't know if it makes any sense to use time on capacity. My only point was: Nevin's definition would solve Joaquin's concern if reserve() was an alias for reserve_back(), but since we may not want to do that (e.g., depending on policy, reserve() could allocate space in both ends), I think the best option is to drop capacity() for devector and dream about a standard that had used the names back_capacity()/reserve_back()/resize_back() for what it now calls capacity()/reserve()/resize(). kind regards Thorsten

Ion Gaztañaga

11:25 p.m.

On 30/10/2017 19:27, Thorsten Ottosen via Boost wrote:

...

...
That could be implemented if capacity() is defined as:

max(size(), min(back_capacity(), front_capacity());

Is this not just equivalent to

min(back_capacity(), front_capacity())

given that back/front_capacity >= size()?

Right. Good catch.

Gavin Lambert

31 Oct 31 Oct

1:48 a.m.

On 31/10/2017 07:27, Thorsten Ottosen wrote:

...

Sure, we can make new semantics. I think Joaquin's concern was that if we provide both capacity() and reserve(), we better do it 100% compatible with vector.

In our source code, I found just one mention of capacity(). In the dependencies, less than fourty. And only one involved a comparison with size(). That one in 800-900k lines of code. Other code bases may be different.

At least for vector, I think most codebases make the assumption that capacity() will always return a value >= the value passed to reserve(), and reason that they're therefore allowed to emplace/push_back() up to or equal (reserve-argument minus size()) times without causing further allocation or iterator invalidation, and without needing to explicitly check the capacity. It is likely that they would be able to call it even more times (as reserve() typically rounds up), but most code would not care about this or attempt to do so. (It's also probably most common that size() == 0 when reserve() is called, but not guaranteed. In cases where it is not, it is then probably most common to use reserve(size() + N), where N is the number of push_backs you want to make without reallocation/iterator invalidation.) Unit tests might be more picky. I don't know whether devector should try to emulate these patterns (which might cause over-allocation of the back end) or just require people to use reserve_back()/size_back() explicitly instead.

Ion Gaztañaga

12 Oct 12 Oct

6:43 p.m.

On 12/10/2017 19:17, Ion Gaztañaga via Boost wrote:

...

On 12/10/2017 10:37, Thorsten Ottosen via Boost wrote:

...
That means your design where capacity is the the full buffer length can satisfy both situations. So I agree with you. But it also calls into question the need for anything else than capacity (i.e., there seem to be vanishingly little use for back_capacity/front_capacity and they probably confuse more than they help).

I agree. If capacity is the full buffer length, only back/front_capacity make sense.

I wanted to say only back/front_FREE-capacity makes sense if capacity is the full buffer length. Ion

Benedek Thaler

5 Oct 5 Oct

7:04 a.m.

On Wed, Oct 4, 2017 at 1:30 AM, Ion Gaztañaga via Boost < boost@lists.boost.org> wrote:

...

On 21/09/2017 19:38, Thorsten Ottosen via Boost wrote:

...
Dear users and members of Boost,

I'm proud to announce that the formal review of Benedek Thaler's Boost.DoubleEnded library starts today, September 21, and runs through September 30. This library is the result of Benedek's participation in Google Summer of Code 2015.

Hi, this is my review of Boost.DoubleEnded.

Thank you for your time and effort!

...

2) Defaulting size_type to unsigned int is arbitrary and against existing practice. [snip]

To satisfy sizeof(devector) == sizeof(std::vector) It was an initial goal to make devector resemble vector, as close as possible, to ease transition. The familiarity makes it easy use, but also produces some imbalance, you also note below. Unfortunate indeed.

...

5) reserve() / resize(): I think they have no sensible meaning in the container (capacity() could be useful, depending on the reallocation strategy). We should not treat back insertion as the “default” insertion. IMHO the API of devector should be symmetrical.

Again, compatibility.

...

6) unsafe_push_back/front: I think they can be useful, many times we iterate until a known upper bound and capacity checks might be redundant in some code, just like we do with arrays. I suggest replacing “unsafe” with “unchecked” and “push” with “emplace” (which is much more general): “unchecked_emplace_back/front(Args&&….)”.

s/unsafe/unchecked, agreed. Also, we need the emplace methods as well.

...

8) Regarding "move_if_noexcept", Boost.Container does not use it as it consider a pessimized design for a situation (throwing moves) that will occur nearly never and few times will be handled using the provided strong guarantees.

Right, this should be fixed in devector, at the expense of compatibility.

...

2) Segment_iterators: Raw pointers are stored in segment iterators. That means that those could not be placed in the memory that the allocator allocates, if the allocator does not defined ::pointer as T*. Also, it seems strange that there is not segment_type/segment_info class. The iterator itself has members that return the address and the size of a segment. I think that’s unexpected, the expected segment iterator would return a reference to a class (e.g. segment_type) that represents the segment, which has the required operations, and also internal begin()/end() operations to obtain segment-local iterators (a la unordered containers). This could allow some optimized segment-based algorithms (see lafstern.org/matt/segmented.pdf, for more information)

Thanks, this can be improved.

...

-------------------------- devector --------------------------

emplace_slow_path: The forced temporary creation should be avoided, it is overkill for containers of containers, devector<static_vector> or containers with sentinel nodes. IMHO supporting references to already included elements is a bad feature of std::vector (a corner case supported in single element insertion, ignored for ranges, etc.) Boost.Container chose not to support it and no one has demanded that feature: http://www.boost.org/doc/libs/1_65_1/doc/html/container/Cpp1 1_conformance.html#container.Cpp11_conformance.container_ const_reference_parameters

True. Reallocation strategy: First, I considered this strategyunnatural, but now

...

I am not sure. For a push_front, if no free front capacity is found, but there is free back capacity, then all data is shifted to insert the element. If interleaved front and back insertions are performed, I don’t know if this will lead to too many data movements. [snip]

If it does not shift, it has to reallocate - and then copy/move everything over. I don't see how is that any better.

...

emplace_slow_path: // construct at back + 1 from back (no guard) alloc_construct(end(), std::move(back())); // move back half right std::move_backward(position, end() - 1, end()); ++_back_index; This does not seem exception safe “++_back-index” should be done just after alloc_construct, same for front insertion.

I'll take a look.

...

insert_range_slow_path_near_front/back: I see that construction + rotation is done, but I think this is not optimal, the implementation might be simplier, but more than necessary data movement is done.

I'll take a look.

...

Documentation is correct, but I expected more detail. If one of the goals of the library is efficiency, benchmarks should be there (maybe against Boost and Std vector and deque).

I posted some micro-benchmarks on the list recently. I think that's a start, but those should be reviewed/reproduced as well. I kind of don't trust such benchmarks.

...

------------------------------------- ------------------------------------- Did you try to use the library? With which compiler(s)? Did you have any problems? ------------------------------------- ------------------------------------- Just executed tests and examples and debugged to help code review. MSVC 2017 triggers quite a lot of warnings, mainly about constant conditional expressions.

That wasn't available back then. Otherwise, it is a definite goal to make it warning free (as it is on some other compilers).

...

------------------------------------- ------------------------------------- ¿Should we accept the library? ------------------------------------- ------------------------------------- I think the library (see my design comments) and the documentation still need improvements, but it's not the main issue. I’m reluctant to accept the library because I don’t see substantial benefits of batch_deque against deque (configurable segment size could be added to Boost.Container if that feature is considered useful), and I don’t know if the use cases where devector is supposed to shine needs a whole new library (we have no clear rules about how tiny a library can be, but in this case it would be a new container and reimplementation of another).

Joaquín has proposed to merge this proposal to Boost.Container, but supported compilers and goals are quite different and a direct merge would be problematic as Boost.Container would loose coherency, common goals and minimum requirements between all containers.

A middle solutions is to:

1) Start “porting” devector, after all issues have been agreed, (without small vector optimization, at least in the first version) to Boost.Container’s requirements (including some pre C++11 compilers)

2) Improve Boost.Container’s deque to support features (and maybe code) from batch_deque (configurable segment size, local iterators, etc.).

That’s a lot of work, discards a lot of batch_queue code and I don’t know if Benedek would agree to help on this task after I spoke against this option when it was proposed and after doing so much work to write the library.

The interest of the community should be put first. If we think adding pieces to Container would improve situations, we should find a way. On the other hand, I wouldn't love to port code to C++03, for two reasons: It is a soul crushing work, and, we should not encourage writing new code using old standards (I know, there's no other option in certain cases.). Thanks, Benedek

Ion Gaztañaga

8:58 p.m.

On 05/10/2017 9:04, Benedek Thaler via Boost wrote:

...

...
5) reserve() / resize(): I think they have no sensible meaning in the container (capacity() could be useful, depending on the reallocation strategy). We should not treat back insertion as the “default” insertion. IMHO the API of devector should be symmetrical.

Again, compatibility.

Understood.

...

s/unsafe/unchecked, agreed. Also, we need the emplace methods as well.

Why both? Isn't emplace enough?

...

Reallocation strategy: First, I considered this strategyunnatural, but now

...
I am not sure. For a push_front, if no free front capacity is found, but there is free back capacity, then all data is shifted to insert the element. If interleaved front and back insertions are performed, I don’t know if this will lead to too many data movements. [snip]

If it does not shift, it has to reallocate - and then copy/move everything over. I don't see how is that any better.

Yes, this is tricky, as I pointed out in other reply to Thorsten, we could maybe move elements in bigger steps to avoid quadratic behavior with repeated inserts in one end while buffer is being reallocated.

...

The interest of the community should be put first. If we think adding pieces to Container would improve situations, we should find a way. On the other hand, I wouldn't love to port code to C++03, for two reasons: It is a soul crushing work, and, we should not encourage writing new code using old standards (I know, there's no other option in certain cases.).

Yes. I know will be the last warrior supporting C++03 in Boost ;-) Best, Ion

Thorsten Ottosen

6 Oct 6 Oct

8:39 a.m.

Den 05-10-2017 kl. 22:58 skrev Ion Gaztañaga via Boost:

...

On 05/10/2017 9:04, Benedek Thaler via Boost wrote:

...

...
s/unsafe/unchecked, agreed. Also, we need the emplace methods as well.

Why both? Isn't emplace enough?

It should be. I guess the only reason we have both in the standard is for historic reasons. kind regards Thorsten

Thorsten Ottosen

5 Oct 5 Oct

3:52 p.m.

Den 04-10-2017 kl. 01:30 skrev Ion Gaztañaga via Boost:

...

-------------------------- batch_deque -------------------------- 1) I like that deque_[const_]iterator is lightweight, I don’t know if iteration/access it’s more efficient than more traditional approaches. boost::container::deque iterators are really fat and should be improved to something similar to batch_deque iterator.

I guess the simplest would be to replace the end iterator with a function that computes it when needed (I think that can be done easily since you now the segment size at compile time). kind regards Thorsten

Thorsten Ottosen

4:11 p.m.

Den 05-10-2017 kl. 17:52 skrev Thorsten Ottosen via Boost:

...

Den 04-10-2017 kl. 01:30 skrev Ion Gaztañaga via Boost:

...
-------------------------- batch_deque -------------------------- 1) I like that deque_[const_]iterator is lightweight, I don’t know if iteration/access it’s more efficient than more traditional approaches. boost::container::deque iterators are really fat and should be improved to something similar to batch_deque iterator.

I guess the simplest would be to replace the end iterator with a function that computes it when needed (I think that can be done easily since you now the segment size at compile time).

Note also that AFAICT, having the iterator store the pointer to the segment pointer was the reason batch_deque's push_back was slower than boost::deque (what could have been just the return of a pointer became in indirection + an offset calculation). As for the overall fatness of deque, then I would not care too much about this. Many times you only need one. kind regards -Thorsten

Thorsten Ottosen

10 Oct 10 Oct

8:50 a.m.

Den 04-10-2017 kl. 01:30 skrev Ion Gaztañaga via Boost:

...

and I don’t know if the use cases where devector is supposed to shine needs a whole new library It's hard to guess about all the use-cases. We never had this as a type in our toolbox, so we never gave it any thought. Maybe systems working with RTL and LTR text would be natural.

Given that boost::flat_set/flat_map now has support for a custom container, I can easily see myself plugging in devector in there. Also with a small buffer optimization. And having deque backed by devector allows for example to use the small buffer for, say, one or two of the first segment pointers. So what is the benefit of plugging devector into flat_set/flat_map? Assume a balanced initial capacity (same front free capacity and back free capacity). Then compared with vector when we insert N elements - if all go to the back end, both containers are optimal O(N \lg N) - the worst case for vector is when all go to the front O(N^2), this is O(N \lg N) for devector - unfortunately the average case is O(N^2) for both containers (so we only use it with relatively small sets) /However, the average case for devector uses about half as many moves as vector./ Based on that, I would pick devector as the container in flat_set/flat_map in my code. kind regards Thorsten

Ion Gaztañaga

9:26 p.m.

On 10/10/2017 10:50, Thorsten Ottosen via Boost wrote:

...

So what is the benefit of plugging devector into flat_set/flat_map? Assume a balanced initial capacity (same front free capacity and back free capacity). Then compared with vector when we insert N elements

- if all go to the back end, both containers are optimal O(N \lg N) - the worst case for vector is when all go to the front O(N^2), this is O(N \lg N) for devector - unfortunately the average case is O(N^2) for both containers (so we only use it with relatively small sets)

/However, the average case for devector uses about half as many moves as vector./ Based on that, I would pick devector as the container in flat_set/flat_map in my code.

Interesting points. Anyone willing to do some benchmark? ;-) Ion

Thorsten Ottosen

11 Oct 11 Oct

7:36 a.m.

Den 10-10-2017 kl. 23:26 skrev Ion Gaztañaga via Boost:

...

On 10/10/2017 10:50, Thorsten Ottosen via Boost wrote:

...

...
/However, the average case for devector uses about half as many moves as vector./ Based on that, I would pick devector as the container in flat_set/flat_map in my code.

Interesting points. Anyone willing to do some benchmark? ;-)

I'll try to do one later. But, in theory, if there is sufficient room to maneuver at both ends: - on average, insert takes about half the moves that a vector requires Independently of having space at both ends: - on average, erase takes about half the moves that a vector requires It must be so. Apologies for not realizing this sooner ... kind regards -Thorsten

Thorsten Ottosen

4:39 p.m.

Den 10-10-2017 kl. 23:26 skrev Ion Gaztañaga via Boost:

...

On 10/10/2017 10:50, Thorsten Ottosen via Boost wrote:

...

Interesting points. Anyone willing to do some benchmark? ;-)

Here is one example comparing boost::vector<int> with devector<int>. 32 bit, insert: 10E3 3.89232 3.4131 10E4 2.51283 2.21077 10E5 5.60785 5.48309 10E6 28.7817 26.3436 10E7 340.473 271.399 32 bit, erase 10E3 3.57414 3.87152 10E4 2.79556 2.33145 10E5 11.4395 8.43574 10E6 80.8497 68.3668 10E7 912.091 667.667 64 bit, insert: 10E3 3.11854 2.52416 10E4 2.14097 1.97018 10E5 4.72984 3.56871 10E6 27.8463 15.3933 10E7 335.868 165.534 64 bit, erase: 10E3 2.63694 3.2763 10E4 1.80957 2.77551 10E5 4.66725 9.4179 10E6 30.4746 70.3418 10E7 360.72 666.874 The 32 bit results are in line with expectations. For 64 bit I'm puzzled as to why erase differs from the 32 bit version. Looking at the code for devector, I can see erase is defined as follows: iterator erase(const_iterator position) { return erase(position, position + 1); } Digging deeper, devector is using template <typename Iterator> void move_if_noexcept(Iterator first, Iterator last, Iterator dst) { while (first != last) { *dst++ = std::move_if_noexcept(*first++); } } and template <typename Iterator> void move_if_noexcept_backward(Iterator first, Iterator last, Iterator dst_last) { while (first != last) { *(--dst_last) = std::move_if_noexcept(*(--last)); } } so there is no optimization done for trivially copyable types. Checking boost::container's copy_move_algo.hpp, the optimization applies to boost::vector. I then plugged in boost::container's code for move-forwards and move-backwards: 32-bit, erase: 10E3 3.39647 4.09578 10E4 2.85362 2.61519 10E5 11.4961 6.07559 10E6 80.8788 27.0292 10E7 913.022 272.85 64-bit erase: 10E3 2.47863 3.61393 10E4 1.75806 2.47981 10E5 4.77791 3.98943 10E6 30.6545 15.9707 10E7 361.743 160.078 Is this awesome or what? :-) kind regards Thorsten

Ion Gaztañaga

8:27 p.m.

On 11/10/2017 18:39, Thorsten Ottosen via Boost wrote:

...

I then plugged in boost::container's code for move-forwards and move-backwards:

32-bit, erase:

10E3 3.39647 4.09578 10E4 2.85362 2.61519 10E5 11.4961 6.07559 10E6 80.8788 27.0292 10E7 913.022 272.85

64-bit erase:

10E3 2.47863 3.61393 10E4 1.75806 2.47981 10E5 4.77791 3.98943 10E6 30.6545 15.9707 10E7 361.743 160.078

Is this awesome or what? :-)

Sure ;-) Differences are not very noticeable for small containers, numbers are better than expected for big containers. I can't deduce why, maybe the initial reserve and index modulo are adding noise to the benchmark. In any case, for flat_xxx family, devector would be a better choice than vector, maintaining contiguous storage, so we already have the use case where devector must shine ;-) Best, Ion

Thorsten Ottosen

12 Oct 12 Oct

8:48 a.m.

Den 11-10-2017 kl. 22:27 skrev Ion Gaztañaga via Boost:

...

On 11/10/2017 18:39, Thorsten Ottosen via Boost wrote:

...
I then plugged in boost::container's code for move-forwards and move-backwards:

32-bit, erase:

10E3 3.39647 4.09578 10E4 2.85362 2.61519 10E5 11.4961 6.07559 10E6 80.8788 27.0292 10E7 913.022 272.85

64-bit erase:

10E3 2.47863 3.61393 10E4 1.75806 2.47981 10E5 4.77791 3.98943 10E6 30.6545 15.9707 10E7 361.743 160.078

Is this awesome or what? :-)

Sure ;-) Differences are not very noticeable for small containers, numbers are better than expected for big containers. I can't deduce why, maybe the initial reserve and index modulo are adding noise to the benchmark.

Yes, I think first fast running ones are drowning in allocation cost. The container size is only 10 & 100 for the first two. I think it would be better to write a test that did not rely on random shuffle, but just inserted one element at each position (resetting the container before moving to next position). That requires a framework where you can stop the time, so I asked Benedek to write one based on Google's framework. The large numbers give IMO a much more reliable estimate of what one can expect, also for small containers.

...

In any case, for flat_xxx family, devector would be a better choice than vector, maintaining contiguous storage, so we already have the use case where devector must shine ;-) Yeah, it sort of means that devector is far more versatile than we originally anticipated and moving into more areas of vector's turf.

-Thorsten

Benedek Thaler

8:18 p.m.

On Wed, Oct 11, 2017 at 6:39 PM, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

Den 10-10-2017 kl. 23:26 skrev Ion Gaztañaga via Boost:

...
On 10/10/2017 10:50, Thorsten Ottosen via Boost wrote:

Interesting points. Anyone willing to do some benchmark? ;-)

...
Here is one example comparing boost::vector<int> with devector<int>.

Here are my results of a similar test: Run on (8 X 3500 MHz CPU s) 2017-10-12 22:16:02 ------------------------------------------------------------- Benchmark Time CPU Iterations ------------------------------------------------------------- devector_insert/128 3837 ns 3818 ns 176825 devector_insert/512 23179 ns 23168 ns 31408 devector_insert/4096 980113 ns 978796 ns 708 devector_insert/32768 60061674 ns 59983258 ns 11 devector_insert/65536 237795405 ns 237520129 ns 3 cvector_insert/8 853 ns 855 ns 807388 cvector_insert/64 1988 ns 1989 ns 350874 cvector_insert/512 26493 ns 26494 ns 26234 cvector_insert/4096 1752987 ns 1750600 ns 399 cvector_insert/32768 108519299 ns 108385834 ns 6 cvector_insert/65536 436277805 ns 435846258 ns 2 Conditions of the measurement as posted earlier. Thanks, Benedek

Thorsten Ottosen

13 Oct 13 Oct

5:28 p.m.

Den 12-10-2017 kl. 22:18 skrev Benedek Thaler via Boost:

...

On Wed, Oct 11, 2017 at 6:39 PM, Thorsten Ottosen via Boost <

...

Here are my results of a similar test:

Hi Benedek, Many thanks! I'm just curious: why are the N's different for the two classes. For example, devector has 128, but not 8 or 64? There is one final test I would like to see. Instead of relying on shuffled positions, we should keep the container size fixed and insert exactly one time at each position. So instead of for (std::size_t p : positions) { c.insert(c.begin() + p, p); } we should before the loop fill the container to 90% capacity. For devector, make sure there is space in both ends. Then do the following: for( std::size_t p = 0; p < c.size(); ++p ) { state.ResumeTiming(); c.insert( c.begin() + p, p ); state.PauseTiming(); c.erase( c.begin() + p ); } That would be great to see even for small values 8, 16, 32, 64, 128, etc. Thanks in advance Thorsten

Benedek Thaler

5:45 p.m.

On Fri, Oct 13, 2017 at 7:28 PM, Thorsten Ottosen <tottosen@dezide.com> wrote:

...

Den 12-10-2017 kl. 22:18 skrev Benedek Thaler via Boost:

...
On Wed, Oct 11, 2017 at 6:39 PM, Thorsten Ottosen via Boost <

Here are my results of a similar test:

...
Hi Benedek,

Many thanks!

I'm just curious: why are the N's different for the two classes. For example, devector has 128, but not 8 or 64?

Forgot to change a constant.

...

There is one final test I would like to see. Instead of relying on shuffled positions, we should keep the container size fixed and insert exactly one time at each position. So instead of

for (std::size_t p : positions) { c.insert(c.begin() + p, p); }

we should before the loop fill the container to 90% capacity. For devector, make sure there is space in both ends. Then do the following:

for( std::size_t p = 0; p < c.size(); ++p ) { state.ResumeTiming(); c.insert( c.begin() + p, p ); state.PauseTiming(); c.erase( c.begin() + p ); }

That would be great to see even for small values 8, 16, 32, 64, 128, etc.

Here you are: Run on (8 X 3500 MHz CPU s) 2017-10-13 19:42:26 ***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead. ------------------------------------------------------------- Benchmark Time CPU Iterations ------------------------------------------------------------- devector_insert/8 6079 ns 6091 ns 114176 devector_insert/64 49051 ns 49162 ns 14195 devector_insert/512 409902 ns 410626 ns 1703 devector_insert/4096 4734707 ns 4740961 ns 147 devector_insert/32768 123148029 ns 123081746 ns 6 devector_insert/65536 442823644 ns 442517234 ns 2 cvector_insert/8 6133 ns 6138 ns 115351 cvector_insert/64 49637 ns 49692 ns 13908 cvector_insert/512 430861 ns 431102 ns 1626 cvector_insert/4096 6275393 ns 6277401 ns 111 cvector_insert/32768 218750948 ns 218571840 ns 3 cvector_insert/65536 825168625 ns 824351333 ns 1 (It'd be great if someone could re-run these tests to verify my results) Thanks, Benedek

Thorsten Ottosen

6:32 p.m.

Den 13-10-2017 kl. 19:45 skrev Benedek Thaler via Boost:

...

(It'd be great if someone could re-run these tests to verify my results)

I gave it a try. My compile options are /GS- /GL /analyze- /W3 /Gy /Zc:wchar_t /I"C:\dezide_cpp_projects\CDependencies\include" /Zi /Gm- /O2 /Ob2 /Fd"Release\vc141.pdb" /Zc:inline /fp:precise /D "WIN32" /D "NDEBUG" /D "_CONSOLE" /D "_UNICODE" /D "UNICODE" /errorReport:prompt /WX- /Zc:forScope /arch:SSE2 /Gd /Oy- /Oi /MD /Fa"Release\" /EHsc /nologo /Fo"Release\" /FAs /Ot /Fp"Release\TestDoubleEnded.pch" /diagnostics:classic /D _SECURE_SCL=0 /D _CRT_SECURE_NO_DEPRECATE=1 /D _SCL_SECURE_NO_DEPRECATE=1 /D _CRT_NONSTDC_NO_DEPRECATE=1 /D BOOST_DISABLE_ASSERTS /Qpar /DHAVE_STD_REGEX 32 bit: Run on (8 X 3492 MHz CPU s) 10/13/17 20:12:05 ------------------------------------------------------------- Benchmark Time CPU Iterations ------------------------------------------------------------- devector_insert/8 3266 ns 3105 ns 236166 devector_insert/64 26578 ns 28473 ns 34517 devector_insert/512 227531 ns 255922 ns 2804 devector_insert/4096 2505054 ns 2836382 ns 374 devector_insert/32768 68911138 ns 74880480 ns 10 devector_insert/65536 259531320 ns 254801633 ns 3 cvector_insert/8 3290 ns 3598 ns 195094 cvector_insert/64 26778 ns 26700 ns 28045 cvector_insert/512 227676 ns 219034 ns 3205 cvector_insert/4096 2592786 ns 2192785 ns 249 cvector_insert/32768 79741257 ns 71067122 ns 9 cvector_insert/65536 298786812 ns 280801800 ns 2 64 bit: Run on (8 X 3492 MHz CPU s) 10/13/17 20:08:24 ------------------------------------------------------------- Benchmark Time CPU Iterations ------------------------------------------------------------- devector_insert/8 1609 ns 1577 ns 524336 devector_insert/64 12828 ns 14185 ns 112179 devector_insert/512 117994 ns 156453 ns 4487 devector_insert/4096 1668282 ns 1813238 ns 499 devector_insert/32768 70929677 ns 80229086 ns 7 devector_insert/65536 278686251 ns 312002000 ns 2 cvector_insert/8 1603 ns 1596 ns 498572 cvector_insert/64 13468 ns 12724 ns 74786 cvector_insert/512 133092 ns 126901 ns 8974 cvector_insert/4096 2624790 ns 2561781 ns 408 cvector_insert/32768 140670228 ns 146640940 ns 5 cvector_insert/65536 624126003 ns 702004500 ns 1 I wonder why there is so little difference when the container is small. I tried to replace int with std::array<std::string, 32>: 32 bit: Run on (8 X 3492 MHz CPU s) 10/13/17 20:27:33 ------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------ devector_insert/8 14473 ns 16062 ns 40792 devector_insert/64 223566 ns 205850 ns 2804 devector_insert/512 8662095 ns 9750063 ns 64 devector_insert/4096 516563332 ns 475803050 ns 2 cvector_insert/8 12911 ns 10326 ns 49857 cvector_insert/64 319786 ns 344166 ns 2493 cvector_insert/512 16773834 ns 18973095 ns 37 cvector_insert/4096 1056974335 ns 920405900 ns 1 64 bit: Run on (8 X 3492 MHz CPU s) 10/13/17 20:28:54 ------------------------------------------------------------ Benchmark Time CPU Iterations ------------------------------------------------------------ devector_insert/8 10183 ns 11056 ns 74786 devector_insert/64 178753 ns 187744 ns 4487 devector_insert/512 7788603 ns 8666722 ns 90 devector_insert/4096 493624150 ns 475803050 ns 2 cvector_insert/8 9110 ns 9804 ns 74786 cvector_insert/64 275540 ns 294866 ns 2804 cvector_insert/512 15341661 ns 18373451 ns 45 cvector_insert/4096 978790847 ns 904805800 ns 1 kind regards Thorsten

Bjorn Reese

4 Oct 4 Oct

12:29 p.m.

On 10/04/2017 01:30 AM, Ion Gaztañaga via Boost wrote:

...

2) Defaulting size_type to unsigned int is arbitrary and against existing practice. According to the usual STL rules, size_type shall come from the allocator, which knows what type can represent the number of elements that can be allocated. I find interesting the idea of expressing a different size_type to save space, but I am not sure if that should come from the allocator or from an option.

I cannot connect the dots between size_type and the allocator. The Container requirements [1] only states that size_type should be an unsigned integer type large enough to contain any non-negative value of difference_type. And difference_type comes from the iterator. The AllocatorAwareContainer requirements do not mention size_type. [1] Section [container.requirements.general] in the C++ standard.

Ion Gaztañaga

2:25 p.m.

On 04/10/2017 14:29, Bjorn Reese via Boost wrote:

...

On 10/04/2017 01:30 AM, Ion Gaztañaga via Boost wrote:

...
2) Defaulting size_type to unsigned int is arbitrary and against existing practice. According to the usual STL rules, size_type shall come from the allocator, which knows what type can represent the number of elements that can be allocated. I find interesting the idea of expressing a different size_type to save space, but I am not sure if that should come from the allocator or from an option.

I cannot connect the dots between size_type and the allocator.

The Container requirements [1] only states that size_type should be an unsigned integer type large enough to contain any non-negative value of difference_type. And difference_type comes from the iterator.

The AllocatorAwareContainer requirements do not mention size_type.

[1] Section [container.requirements.general] in the C++ standard.

True. I revised with surprise the standard. It's only required for basic_string. This is a contradiction and IMHO a bug introduced in C++11. libc++ and msvc define it from allocator_traits (although libstdc++ defines it as size_t). For C++03 it was typedef to Allocator::size_type; so it does not make sense to define it in another way, as it would break code. The original C++03 definition should have been updated to allocator_traits<T>::size_type. Best, Ion

Louis Dionne

6 Oct 6 Oct

11:54 p.m.

New subject: [SPAM] Re: [review] The review of Boost.DoubleEnded starts today: September 21 - September 30

Hi, This is my review of the proposed Boost.DoubleEnded library. I did my review and read the threads on the ML only after (to avoid bias), so I know some of the points below have already been raised by other reviewers. I vote to REJECT the library as currently proposed. However, I could see parts of the functionality provided by this library be added to Boost.Container.

...

- What is your evaluation of the design?

Major points: - The first thing I thought when I saw the library was: why is this not an addition to Boost.Container? - Regarding `batch_deque`, is there any reason why it can't simply be unified with `boost::container::deque` by adding more customization points to the latter? - I don't see the use for the uninitialized constructors and `unsafe_push_back`/`unsafe_push_front`. I mean, I do understand the use _in theory_, but I'd really like to see a good motivation for them with a more concrete use case. This is pretty dangerous and tricky to work with, so I think this deserves a good justification. I can't see myself using this container or recommending anyone to use the container without this. Minor points: - `de::reserve_only_tag{}` could be just `de::reserve_only` by providing the following definition: static constexpr de::reserve_only_tag reserve_only{}; - `de::batch_deque_policy<256>` should really be `de::segment_size<256>` or something like that.

...

- What is your evaluation of the implementation?

Looks okay with a quick study. However, I found a few red flags that made me uneasy. For example, https://git.io/vdu3x: // TODO should move elems-by-elems if the two allocators differ Given the fact that it's a container library with support for custom allocators, this makes me wonder whether there are other oversights like this and whether the library is Boost quality as of now. I'm not sure, as I may be misunderstanding what the TODO means. Also, the fact that the library has not seen significant work in 1.5 years makes me think that this TODO might not go away.

...

- What is your evaluation of the documentation?

Not bad, but it could use some more work. - I found it quite difficult to get something significant out of the tutorial documentation without having read the reference documentation before. This is because the documentation makes several references to particularities of the containers that are only explained in the reference documentation. - The design rationale for `devector` at the beginning of the documentation is slightly misleading. `devector` can't be a drop-in replacement for `std::vector`, because of iterator invalidation guarantees of move operations (see https://stackoverflow.com/a/41384937/627587). - In section "devector usage": > Here, while push_back calls make use of the small buffer, push_front > results in an allocation: Push operations do not shift elements to > avoid breaking the promise of reserve. The "promise of reserve" needs to be explained; what does this mean? - How do those containers play with custom Allocators (I'm specifically thinking about those that provide fancy pointers)? The documentation should mention this. - A simple diagram would be very helpful in quickly understanding what the `devector` is. - I would really like to see actual benchmarks, even if this needs coming up with more realistic use cases. IMO, it is not acceptable for a library tailored for performance (and having unsafe operations for that purpose) to provide only a listing of assembly at `-O2`. - Several minor things like typos and weird formulations. I submitted a PR for some of those, but the situation could still be improved. This is not a big deal, though, as it would be trivial to fix. - I must say I appreciate the attention that was given to exception guarantees in the documentation. This is quite important for containers, and it seems like the author was careful about this.

...

- What is your evaluation of the potential usefulness of the library?

This is where it falls apart for me. I feel like the documentation fails to make a case for the library, and as a result, I don't see myself using it or recommending someone else to use it. I do understand the particularities of the provided containers, I just fail to see why I should care enough about them to justify a Boost library. As I said before, I also think the unsafe operations should be better justified, because otherwise one of the purposes of the library seems to be to aid C++ programmers to shoot themselves in the foot.

...

- Did you try to use the library? With which compiler(s)? Did you have any problems?

I built the unit tests on my Mac with Clang without problem. I did have problems when I tried building without specifying a toolset, but this may be a misunderstanding of how Boost.Build works on my part.

...

- How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

About 3 hours of reading all the documentation and looking at the implementation.

...

- Are you knowledgeable about the problem domain?

I know stuff about containers, but I'm not an expert. I'm not exactly sure what the "problem domain" of the library is, which is one of the issues I have with it. Also, I'd like to point out that formal reviews are usually carried out by review managers without direct involvement in the library. I do not want to imply that the review outcome will actually be biased, and I'm sure the review manager is capable of impartiality, but I thought this was worth pointing out. Like I said at the beginning, I don't think this should be a new Boost library. However, I think it would be worthwhile to consider adding `devector` to Boost.Container and augmenting Boost.Container's `deque`, and I would encourage the author of the library to pursue this path. Thank you, Louis Dionne -- Sent from: http://boost.2283326.n4.nabble.com/Boost-Dev-f2600599.html

Benedek Thaler

7 Oct 7 Oct

10:35 a.m.

New subject: [SPAM] Re: [review] The review of Boost.DoubleEnded starts today: September 21 - September 30

On Sat, Oct 7, 2017 at 1:54 AM, Louis Dionne via Boost < boost@lists.boost.org> wrote:

...

I vote to REJECT the library as currently proposed. However, I could see parts of the functionality provided by this library be added to Boost.Container.

Thanks for your review! (And the PR fixing typos)

...

...
- What is your evaluation of the design?

Major points:

- The first thing I thought when I saw the library was: why is this not an addition to Boost.Container?

I suppose this is now explained above. - I don't see the use for the uninitialized constructors and

...

`unsafe_push_back`/`unsafe_push_front`. I mean, I do understand the use _in theory_, but I'd really like to see a good motivation for them with a more concrete use case. This is pretty dangerous and tricky to work with, so I think this deserves a good justification. I can't see myself using this container or recommending anyone to use the container without this.

Again, benchmarks above - they show some things are faster this way. On the other hand, I absolutely get your point.

...

- The design rationale for `devector` at the beginning of the documentation is slightly misleading. `devector` can't be a drop-in replacement for `std::vector`, because of iterator invalidation guarantees of move operations (see https://stackoverflow.com/a/41384937/627587).

I suppose this happens only with a small buffer in effect. - In section "devector usage":

...

> Here, while push_back calls make use of the small buffer, push_front > results in an allocation: Push operations do not shift elements to > avoid breaking the promise of reserve.

The "promise of reserve" needs to be explained; what does this mean?

d.reserve(d.size() + 10); // next 10 push_back will not allocate

...

- How do those containers play with custom Allocators (I'm specifically thinking about those that provide fancy pointers)? The documentation should mention this.

Probably not supported. https://github.com/erenon/double_ended/issues/2

...

- I would really like to see actual benchmarks, even if this needs coming up with more realistic use cases. IMO, it is not acceptable for a library tailored for performance (and having unsafe operations for that purpose) to provide only a listing of assembly at `-O2`.

See above - tough I'm skeptic regarding micro-benchmarks. Too easy to get them wrong, without knowing it, no way to prove one got them right.

...

- Several minor things like typos and weird formulations. I submitted a PR for some of those, but the situation could still be improved. This is not a big deal, though, as it would be trivial to fix.

Thanks, merged. Benedek

Thorsten Ottosen

10 Oct 10 Oct

8:28 a.m.

New subject: [SPAM] Re: [review] The review of Boost.DoubleEnded starts today: September 21 - September 30

Hi Louis, Thanks for the review. I forgot to give some feedback. Comments below.

...

Hi, - I don't see the use for the uninitialized constructors and `unsafe_push_back`/`unsafe_push_front`. I mean, I do understand the use _in theory_, but I'd really like to see a good motivation for them with a more concrete use case. This is pretty dangerous and tricky to work with, so I think this deserves a good justification. I can't see myself using this container or recommending anyone to use the container without this.

Did you see the benchmarks? Did you know Boost.Container has similar constructs? Would you then now not use Boost.Container?

...

...
- What is your evaluation of the documentation?

...

- The design rationale for `devector` at the beginning of the documentation is slightly misleading. `devector` can't be a drop-in replacement for `std::vector`, because of iterator invalidation guarantees of move operations (see https://stackoverflow.com/a/41384937/627587).

Can you explain why ? ... it is not obvious that devector differs in this respect.

...

- A simple diagram would be very helpful in quickly understanding what the `devector` is.

Agreed.

...

- I would really like to see actual benchmarks, even if this needs coming up with more realistic use cases. IMO, it is not acceptable for a library tailored for performance (and having unsafe operations for that purpose) to provide only a listing of assembly at `-O2`.

Some have been posted. What is your take on them?

...

- I must say I appreciate the attention that was given to exception guarantees in the documentation. This is quite important for containers, and it seems like the author was careful about this.

I'm sure he was. Notice that reserve functions in batch_queue are not only to satisfy the precondition of unchecked_push_back/front, but also give the possibility to have a subsequent section of code that does not throw.

...

Also, I'd like to point out that formal reviews are usually carried out by review managers without direct involvement in the library. I do not want to imply that the review outcome will actually be biased, and I'm sure the review manager is capable of impartiality, but I thought this was worth pointing out.

Yes, I'm certainly biased. I don't know what is best: to have a review manager that doesn't care about the proposed library or one that does? That said, we had GSOC in 2015 and should not do that if no one is willing to review the good end results. So Benedek asked for a review a long time ago, and no one came forward as review manager. A year or so later, I felt I had to do something about that. That's the background.

...

Like I said at the beginning, I don't think this should be a new Boost library. However, I think it would be worthwhile to consider adding `devector` to Boost.Container and augmenting Boost.Container's `deque`, and I would encourage the author of the library to pursue this path.

Thanks. -Thorsten

Louis Dionne

11:05 p.m.

New subject: [SPAM] Re: [SPAM] Re: [review] The review of Boost.DoubleEnded starts today: September 21 - September 30

...

Hi Louis,

...
Hi, - I don't see the use for the uninitialized constructors and `unsafe_push_back`/`unsafe_push_front`. I mean, I do understand the use _in theory_, but I'd really like to see a good motivation for

Thanks for the review. I forgot to give some feedback. Comments below. them

...
with a more concrete use case. This is pretty dangerous and tricky to work with, so I think this deserves a good justification. I can't see myself using this container or recommending anyone to use the container without this.

Did you see the benchmarks?

I did. See below.

...

Did you know Boost.Container has similar constructs? Would you then now not use Boost.Container?

I did not know Boost.Container had similar constructs, and I could not find them in the documentation. Could you please share a link? If it does have them, I certainly hope they were justified properly when they were introduced or when the library was reviewed.

...

...
...
- What is your evaluation of the documentation?

...
- The design rationale for `devector` at the beginning of the documentation is slightly misleading. `devector` can't be a drop-in replacement for `std::vector`, because of iterator invalidation guarantees of move operations (see https://stackoverflow.com/a/41384937/627587).

Can you explain why ? ... it is not obvious that devector differs in this respect.

With the small buffer optimization, I think moving containers will invalidate iterators.

...

...
- A simple diagram would be very helpful in quickly understanding what the `devector` is.

Agreed.

...
- I would really like to see actual benchmarks, even if this needs coming up with more realistic use cases. IMO, it is not acceptable for a library tailored for performance (and having unsafe operations for that purpose) to provide only a listing of assembly at `-O2`.

Some have been posted. What is your take on them?

It seems like the benchmarks (if we're talking about the serialization and deserialization ones) are mostly about the facilities allowing to skip initialization. This is nice, but would it be possible to construct benchmarks that do not use Boost.Serialization for demonstrating the same thing? Doing otherwise opens the door to speedups and slowdowns caused by things unrelated to DoubleEnded itself. Also, my understanding is that these benchmarks only showcase the benefit of not double-initializing containers, but not the actual main purpose of the library, which is to provide double-ended containers. Benchmarks showing why one cares about that (beyond std::vector and std::deque) would be welcome. Finally, those should be added to the library documentation. Regards, Louis -- Sent from: http://boost.2283326.n4.nabble.com/Boost-Dev-f2600599.html

Thorsten Ottosen

11 Oct 11 Oct

9:54 a.m.

Den 11-10-2017 kl. 01:05 skrev Louis Dionne via Boost:

...

...
Hi Louis,

...

...
Did you know Boost.Container has similar constructs? Would you then now not use Boost.Container?

I did not know Boost.Container had similar constructs, and I could not find them in the documentation. Could you please share a link? If it does have them, I certainly hope they were justified properly when they were introduced or when the library was reviewed.

http://www.boost.org/doc/libs/1_65_1/doc/html/boost/container/vector.html

...

...
...
...
- What is your evaluation of the documentation?

...
- The design rationale for `devector` at the beginning of the documentation is slightly misleading. `devector` can't be a drop-in replacement for `std::vector`, because of iterator invalidation guarantees of move operations (see https://stackoverflow.com/a/41384937/627587).

Can you explain why ? ... it is not obvious that devector differs in this respect.

With the small buffer optimization, I think moving containers will invalidate iterators.

Sure. But by default the is no small buffer. But the documentation should stress those differences when one starts to use the small buffer or a version of decvetor that supports a small buffer.

...

...
...
- I would really like to see actual benchmarks, even if this needs coming up with more realistic use cases. IMO, it is not acceptable for a library tailored for performance (and having unsafe operations for that purpose) to provide only a listing of assembly at `-O2`.

Some have been posted. What is your take on them?

It seems like the benchmarks (if we're talking about the serialization and deserialization ones) are mostly about the facilities allowing to skip initialization. This is nice, but would it be possible to construct benchmarks that do not use Boost.Serialization for demonstrating the same thing? Doing otherwise opens the door to speedups and slowdowns caused by things unrelated to DoubleEnded itself.

Basically, whenever you encounter double initialization of trivially copyable types, you can avoid it. That's what you save, nothing less, nothing more. I can't understand your last sentence.

...

Also, my understanding is that these benchmarks only showcase the benefit of not double-initializing containers, but not the actual main purpose of the library, which is to provide double-ended containers. Benchmarks showing why one cares about that (beyond std::vector and std::deque) would be welcome.

Well, if you take devector, I think it's very much pointless to show push_front is faster than vector front insertion. It simply is. If you take batch_deque, it also very much pointless to compare it with std::deque. No fixed segment size can be best for all possible use cases. It cannot. For example, when I work with an A* algorithm for decrete optimization problems, I need one batch_deque with a fairly large segments size. Other programs that have a need for many small deques, would not want a very large segment size. There are some benchmarks done here: http://larshagencpp.github.io/blog/2016/05/22/devector If you walk through the messages in this thread there are benchmarks on batch_deque iteration and unsafe_push_back. Hopefully I will also post one showing random insertion/erase is very good for devector. kind regards -Thorsten

degski

10:25 a.m.

On 11 October 2017 at 12:54, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

There are some benchmarks done here:

http://larshagencpp.github.io/blog/2016/05/22/devector

What I take away from that link is that the devector discussed and bench-marked in that blog, is conceptually different from the boost::de::devector we are discussing in this thread. degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

11:28 a.m.

Den 11-10-2017 kl. 12:25 skrev degski via Boost:

...

On 11 October 2017 at 12:54, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...
There are some benchmarks done here:

http://larshagencpp.github.io/blog/2016/05/22/devector

What I take away from that link is that the devector discussed and bench-marked in that blog, is conceptually different from the boost::de::devector we are discussing in this thread.

How so? The main concept of a devector is a contiguous container that can grow efficiently in both directions. Balancing strategies, resize strategies etc is something we can discuss within that framework (which we have). -T

degski

12:24 p.m.

On 11 October 2017 at 14:28, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

The main concept of a devector is a contiguous container that can grow efficiently in both directions. Balancing strategies, resize strategies etc is something we can discuss within that framework (which we have).

Yes, agreed as to the main concept. I don't think there are many people that feel that "the main concept" is wrong. "WE" like it! But, we are discussing implementation here, so we *are* discussing re-balancing, re-sizing and reserving.... The blog you're quoting discusses the same main concept (as defined by you), but it discusses a (conceptually, in regard of implemention) different beast all-together... Nothing wrong with that, it mostly shows that there are other (and not the only) ideas floating around. One could f.e. imagine adaptive re-balancing (as proposed in another blog, previously quoted in this thread) as well, but the foot print of *that* bugger won't be 24 bytes (on 64 bit). So "horses for courses", let's have plenty... I re-frained from writing a review, mainly because I don't think a common opinion has (yet) been built around what the concept (beyond "main concept") should be. As it stands, the relocation-policy can be qualified as extreme, which I could adhere to, if that particular concept gave use a quantifiable benefit. My own experiments say it doesn't and one can have a "more balanced result", while still achieve the same performance (I wrote about that earlier). On the other hand, I think that batch_deque *should* be rejected, as the main attraction of batch_deque, could be integrated in boost::container::deque. I have no clue why iterating over the segments is usefull, but I live to learn, so convince me. But to conclude this post on a positive note, I do think there is a space for a collection of "special interest" containers [1]. They should be in a separate library, so they have the "dangerous" marking (Boost.Dangerous could be a thing). Yes, they might not give the same guarantees as standard containers, or work with a statefull allocator, or deliver any other guarantee the STL provides (but those guarantees come at a cost. Obviously, something has to give!). I don't see *that* as a problem. It's like with your washing machine, you need to read the manual and stick to the rules... degski [1] f.e. I wrote a 16 byte footprint vector (limited to size_type = std::uint32_t). It is, on push_back, as fast as std::vector on VS2017. I don't have the capacity to write a boost::proof thing, but that doesn't take away that *it* can be done, I would be a taker... -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

3:07 p.m.

Den 11-10-2017 kl. 14:24 skrev degski via Boost:

...

On 11 October 2017 at 14:28, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

As it stands, the relocation-policy can be qualified as extreme, which I could adhere to, if that particular concept gave use a quantifiable benefit. My own experiments say it doesn't and one can have a "more balanced result", while still achieve the same performance (I wrote about that earlier).

Sure, and we have certainly floated various ideas. I leaning towards the following principles: A. insert never allocates if it can avoid it B. insert, when allocation is needed, balances the free capacity C. a push at either end by default does not balance the free capacity D. in certain situations push at either end may choose to balance the free capacity, say of the free capacity in the opposite end is larger than size() or perhaps larger or equal to size(). I would hate to see a push artificially create capacity in the opposite end, only to reclaim it later by an expensive operation.

...

[1] f.e. I wrote a 16 byte footprint vector (limited to size_type = std::uint32_t). It is, on push_back, as fast as std::vector on VS2017. I don't have the capacity to write a boost::proof thing, but that doesn't take away that *it* can be done, I would be a taker...

You can always store extra stuff in the allocated buffer instead of directly in the class. I don't if that is what you did. But you don't actually same memory that way, you just moved the location. kind regards -Thorsten

degski

12 Oct 12 Oct

9:42 a.m.

On 11 October 2017 at 18:07, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...

Sure, and we have certainly floated various ideas. I leaning towards the following principles:

A. insert never allocates if it can avoid it B. insert, when allocation is needed, balances the free capacity C. a push at either end by default does not balance the free capacity

That is how I would do it as well, relocation should re-balance, though. D. in certain situations push at either end may choose to balance the free

...

capacity, say of the free capacity in the opposite end is larger than size() or perhaps larger or equal to size().

resize and reserve are also candidates for having such a consideration. One could imagine resize_back/resize_front/reserve_back/reserve_front and let resize and reserve always re-balance (when appropriate).

...

You can always store extra stuff in the allocated buffer instead of directly in the class. I don't if that is what you did. But you don't actually same memory that way, you just moved the location.

Not necessary, the relevant bits for this point are below: template<typename T, typename A = std::allocator<T>, typename G = golden_ratio_growth_policy<2>> class pod_collection : private A { // Other stuff, the rest... inline iterator begin ( ) noexcept { return iterator { begin_pointer ( ) }; } inline iterator end ( ) noexcept { return iterator { m_end }; } inline bool empty ( ) const noexcept { return not ( m_size ); } inline bool full ( ) const noexcept { return m_size == m_capacity; } inline void clear ( ) noexcept { m_end = begin_pointer ( ); m_size = 0; } inline reference operator [ ] ( const size_type i_ ) noexcept { return *( begin_pointer ( ) + i_ ); } private: __forceinline pointer begin_pointer ( ) const noexcept { return m_end - m_size; } __forceinline pointer back_pointer ( ) const noexcept { return m_end - 1; } size_type m_size, m_capacity; // size_type = std::uint32_t's pointer m_end; }; Needless to say that the whole exercise only makes sense on 64-bit platforms. The only pointer is the end pointer, so push_back can be fast. Since one doesn't need the pointer to the allocated memory all the time, it can be calculated on the fly, when needed. For iteration there is hardly a penalty (one subtraction, for begin()), but for indexed access there *is* a penalty (of one subtraction per access), so to be avoided (with iterators). degski -- "*Ihre sogenannte Religion wirkt bloß wie ein Opiat reizend, betäubend, Schmerzen aus Schwäche stillend.*" - Novalis 1798

Thorsten Ottosen

12:49 p.m.

Den 12-10-2017 kl. 11:42 skrev degski via Boost:

...

On 11 October 2017 at 18:07, Thorsten Ottosen via Boost < boost@lists.boost.org> wrote:

...
Sure, and we have certainly floated various ideas. I leaning towards the following principles:

A. insert never allocates if it can avoid it B. insert, when allocation is needed, balances the free capacity C. a push at either end by default does not balance the free capacity

That is how I would do it as well, relocation should re-balance, though.

That would mean a series of push_back or push_front creates more and more free space in the end that is not used, wouldn't it?

...

D. in certain situations push at either end may choose to balance the free

...
capacity, say of the free capacity in the opposite end is larger than size() or perhaps larger or equal to size().

resize and reserve are also candidates for having such a consideration. One could imagine resize_back/resize_front/reserve_back/reserve_front and let resize and reserve always re-balance (when appropriate).

Yeah. There is a better design screaming to get out here. The whole discussion of reserve/capacity/clear has made me wonder if we can actually do it all with one class. Said differently, we need a policy. I can see two obvious use cases of devector: A. a better vector (many push_backs, but also erase and inserts) B. the default container for flat_set/flat_map The differences belong to GrowthPolicy. (I think I like the name ResizingPolicy a little better.) Say we had this: struct balanced_resizing_policy { static balancing_strategy = balancing_strategy::balance_middle; static size_type new_capacity(size_type); static bool should_shrink(size_type, size_type, size_type); }; struct vector_resizing_policy { static balancing_strategy = balancing_strategy::balance_left; static size_type new_capacity(size_type); static bool should_shrink(size_type, size_type, size_type); }; Then we could have template< class T, class A > using balanced_devector = devector<T,A,balanced_resizing_policy>; template< class T, class A > using unbalanced_devector = devector<T,A,balanced_resizing_policy>; The balancing_strategy will then affect - reserve - clear - resize (perhaps)

...

The only pointer is the end pointer, so push_back can be fast. Since one doesn't need the pointer to the allocated memory all the time, it can be calculated on the fly, when needed. For iteration there is hardly a penalty (one subtraction, for begin()), but for indexed access there *is* a penalty (of one subtraction per access), so to be avoided (with iterators).

I see. Clever. kind regards Thorsten

2811

Age (days ago)

2851

Last active (days ago)

List overview

Download

159 comments

24 participants

participants (24)

Benedek Thaler
Bjorn Reese
Chris Glover
Daniel James
degski
Edward Diener
Fletcher, John P
Gavin Lambert
Glen Fernandes
Hans Dembinski
Howard Hinnant
Ion Gaztañaga
Joaquin M López Muñoz
Louis Dionne
Nevin Liber
Niall Douglas
Paul A. Bristow
Soul Studios
Steven Watanabe
Thorsten Ottosen
Thorsten Ottosen
Tim Song
Vinnie Falco
Zach Laine