Hi Zach,

Thanks for spending a lot of time thinking about the library. Some minor points for discussion follow.
My first concern is that this library is centered around runtime performance, yet there appears to be no measurement done to determine what -- or even whether -- optimization occurs when using this library instead of the standard containers.
For devector:
-------------
Is it not obvious that an O(1) push_front performs better than an O(n) vec.insert( vec.begin(), x )? Is it not obvious that having a small buffer optimization can be useful? boost::small_vector does it. Nearly all std::string implementations do it. stlsoft::auto_buffer has done it for a decade, described in detail in Matthew Wilson's book. Chandler Carruth gave a presentation last year about the use of small buffer optimizations in Clang's data structures: https://www.youtube.com/watch?v=vElZc6zSIXM

For batch_deque:
----------------
With std::deque you cannot control the segment size, and implementations differ widely. Is it not obvious that having a block size of 1 or 2 (as e.g. Visual C++ does) is bad for performance? That iteration over a deque can be slow has been known for two decades; Matt Austern wrote an article about it.
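To make the asymptotic claim concrete without a benchmark, here is a sketch using only the standard containers (devector itself is not used): an instrumented element type counts move operations, which makes the O(n) shifting of vector front insertion visible next to deque's O(1) push_front.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Instrumented element type: counts move-constructions and move-assignments,
// so the cost of shifting elements is observable rather than timed.
struct Counted {
    static inline std::size_t moves = 0;
    int value = 0;
    Counted(int v) : value(v) {}
    Counted(const Counted& o) : value(o.value) {}
    Counted(Counted&& o) noexcept : value(o.value) { ++moves; }
    Counted& operator=(const Counted& o) = default;
    Counted& operator=(Counted&& o) noexcept { value = o.value; ++moves; return *this; }
};

inline std::size_t moves_for_vector_front_inserts(int n) {
    Counted::moves = 0;
    std::vector<Counted> v;
    for (int i = 0; i < n; ++i)
        v.insert(v.begin(), Counted(i));   // shifts every existing element
    return Counted::moves;
}

inline std::size_t moves_for_deque_push_front(int n) {
    Counted::moves = 0;
    std::deque<Counted> d;
    for (int i = 0; i < n; ++i)
        d.push_front(Counted(i));          // never relocates existing elements
    return Counted::moves;
}
```

For deque, each push_front performs exactly one move (of the temporary); for vector, every front insertion additionally moves all elements already present, so the totals diverge quickly as n grows.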
My second concern is that the containers in this library are difficult to reason about.
Look at it this way: devector is like vector with O(1) front insertion; batch_deque is like deque with custom control over the segment size. Are you opposed to such containers, or is it "all the extra" stuff (which you don't have to use, btw) that makes you uneasy?

Serialization:
--------------
The std types are over-encapsulated, which means they can neither avoid double initialization nor process elements in batches. Is it not obvious that double initialization is more expensive than single initialization?
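The double-initialization point can be demonstrated with the standard containers alone (this is a sketch of the general problem, not the library's serialization code): deserializing n elements through resize() default-constructs every element and then assigns it, i.e. two writes per element, whereas reserve() plus emplace_back constructs each element exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Instrumented type: counts default constructions so the two
// deserialization patterns can be compared by construction count.
struct Cell {
    static inline std::size_t default_ctors = 0;
    int v;
    Cell() : v(0) { ++default_ctors; }
    Cell(int x) : v(x) {}
};

// resize() first: every slot is default-constructed, then overwritten.
inline void deserialize_via_resize(std::vector<Cell>& out, const int* src, std::size_t n) {
    out.resize(n);                    // n default constructions
    for (std::size_t i = 0; i < n; ++i)
        out[i] = Cell(src[i]);        // second write per element
}

// reserve() first: raw storage only, each element constructed once in place.
inline void deserialize_via_reserve(std::vector<Cell>& out, const int* src, std::size_t n) {
    out.reserve(n);                   // no constructions yet
    for (std::size_t i = 0; i < n; ++i)
        out.emplace_back(src[i]);     // exactly one construction per element
}
```

The second pattern only works when the element type is cheaply constructible from the source data; the complaint above is that the std containers give a serialization library no way to skip the first initialization when filling from raw storage.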
For that matter, is the performance of devector better than std::deque when I don't know the relative frequency of front- and back-pushes? It seems in many cases that it would be quite a bit worse.
You are comparing apples and oranges. push_back on vector may be worse than on deque (just try to store large arrays as elements).
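A sketch of why large elements change the comparison, using standard containers and addresses rather than timing: vector's push_back periodically reallocates and must move every element (expensive when each element is 4 KiB), while deque never relocates existing elements, which the standard guarantees via reference stability.

```cpp
#include <array>
#include <cassert>
#include <deque>
#include <vector>

// A deliberately large element: each vector reallocation must move
// every one of these.
using Big = std::array<char, 4096>;

// Count how many times vector's buffer moved during n push_backs,
// detected by watching data() rather than measuring time.
inline int vector_relocations(int n) {
    std::vector<Big> v;
    int relocations = 0;
    const Big* prev = nullptr;
    for (int i = 0; i < n; ++i) {
        v.push_back(Big{});
        if (prev && v.data() != prev) ++relocations;  // buffer moved
        prev = v.data();
    }
    return relocations;
}

// deque never invalidates references on push_back: the first element
// stays at the same address no matter how much is appended.
inline bool deque_front_element_stable(int n) {
    std::deque<Big> d;
    d.push_back(Big{});
    const Big* first = &d.front();
    for (int i = 1; i < n; ++i) d.push_back(Big{});
    return &d.front() == first;
}
```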
Another example from the docs:
de::devector<std::string> reversed_lines;
reversed_lines.reserve_front(24);
[snip]
? If it's just to avoid the verbose for-loop, that's not enough for me. If it's an efficiency claim, I have no numbers to back that up, and the use of IO in the example makes it moot anyway.
It's not about rewriting a loop. It's about having a different layout for your container's sequence, because that layout is either more natural, more convenient, or simply needed. The first use case for devector is actually in the implementation of batch_deque. One could use some deque implementation, but that would not be optimal for performance. Being able to use devector there has made the code so much simpler.
I need a justification for unsafe_push_front() besides "avoiding a check, thus sparing a few instructions". My intuition is that it does not matter, since branch prediction is quite good these days, and a single allocation will swallow all the gains if there are any. However, *measurement* is the only thing that can make this justification, pro or con.
Yes, and what do you do when you actually measure and find out push_back is the bottleneck? vector has both operator[] and at(), and I'm pretty sure people would go mad if they were forced to use an operator[] that did what at() does. So why is push_back any different? You have basically the same situation: two functions that differ only slightly in their precondition.
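The "two functions that differ only in their precondition" point can be sketched in a few lines (this is a toy fixed-capacity buffer, not the library's implementation): the checked push_back performs the capacity test and then delegates to the unchecked one, whose documented precondition is that room exists.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

// Toy sketch: a fixed-capacity buffer with a checked push_back and an
// unchecked unsafe_push_back. The only difference is the precondition,
// mirroring the at() vs operator[] relationship described above.
template <class T, std::size_t N>
struct tiny_buffer {
    alignas(T) unsigned char storage[N * sizeof(T)];
    std::size_t count = 0;

    std::size_t size() const { return count; }
    static constexpr std::size_t capacity() { return N; }

    void push_back(const T& v) {
        if (count == N) throw std::length_error("tiny_buffer full");
        unsafe_push_back(v);               // delegate once the check passed
    }
    // Precondition: size() < capacity(). Identical body, no branch.
    void unsafe_push_back(const T& v) {
        ::new (storage + count * sizeof(T)) T(v);
        ++count;
    }
    T& operator[](std::size_t i) {
        return *std::launder(reinterpret_cast<T*>(storage + i * sizeof(T)));
    }
    ~tiny_buffer() {
        for (std::size_t i = 0; i < count; ++i) (*this)[i].~T();
    }
};
```

Whether skipping the branch is worth it in any given loop is exactly the kind of question measurement should settle; the sketch only shows that the two entry points share everything except the check.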
Again, from the example docs:
// int sockfd = socket(...); connect(...);
de::devector<char> buffer(256, de::unsafe_uninitialized_tag{});
long recvSize = recv(sockfd, buffer.data(), buffer.size(), 0);

if (recvSize >= 0)
{
    buffer.unsafe_uninitialized_resize_back(recvSize);
    // process contents of buffer
}
else
{
    /* handle error */
}
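For comparison, the same receive pattern approximated with only standard containers (fake_recv is a stand-in for the real recv(), so this compiles without sockets): here the 256 bytes are value-initialized up front, which is exactly the cost the unsafe_uninitialized_tag constructor is designed to avoid, and the buffer is then trimmed to the bytes actually received so nothing past the end can be read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for recv(): pretends 5 bytes arrived on the socket.
inline long fake_recv(char* dst, std::size_t cap) {
    const char msg[] = "hello";
    std::size_t n = sizeof(msg) - 1 < cap ? sizeof(msg) - 1 : cap;
    std::memcpy(dst, msg, n);
    return static_cast<long>(n);
}

inline std::vector<char> receive_into_buffer() {
    std::vector<char> buffer(256);       // zero-initializes all 256 bytes
    long n = fake_recv(buffer.data(), buffer.size());
    if (n >= 0)
        buffer.resize(static_cast<std::size_t>(n));  // shrink to bytes read
    else
        buffer.clear();                  // handle error
    return buffer;
}
```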
The unsafe resize does nothing in this case, since char has a trivial dtor.
It changes the size of the container so that you can't read uninitialized values.

kind regards

-Thorsten