Hi Zach,

Thanks for spending a lot of time thinking about the library. Some minor points for discussion follow.
My first concern is that this library is centered around runtime performance, yet there appears to be no measurement done to determine what -- or even whether -- optimization occurs when using this library instead of the standard containers.
For devector:
-------------
Is it not obvious that an O(1) push_front performs better than an O(n) vec.insert( vec.begin(), x )? Is it not obvious that having a small buffer optimization can be useful? boost::small_vector does it. Nearly all std::string implementations do it. stlsoft::auto_buffer has done it for a decade, described in detail in Matthew Wilson's book. Chandler Carruth gave a presentation last year about the use of small buffer optimizations in Clang's data structures: https://www.youtube.com/watch?v=vElZc6zSIXM

For batch_deque:
----------------
With std::deque you cannot control the segment size, and implementations differ widely. Is it not obvious that having a block size of 1 or 2 (as e.g. Visual C++ does) is bad for performance? That iteration over a deque can be slow has been known for two decades; Matt Austern wrote an article about it.
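To make the asymptotic claim concrete without a benchmark, here is a sketch using only the standard containers (devector itself is not used): an instrumented element type counts move operations, which makes the O(n) shifting of vector front insertion visible next to deque's O(1) push_front.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Instrumented element type: counts move-constructions and move-assignments,
// so the cost of shifting elements is observable rather than timed.
struct Counted {
    static inline std::size_t moves = 0;
    int value = 0;
    Counted(int v) : value(v) {}
    Counted(const Counted& o) : value(o.value) {}
    Counted(Counted&& o) noexcept : value(o.value) { ++moves; }
    Counted& operator=(const Counted& o) = default;
    Counted& operator=(Counted&& o) noexcept { value = o.value; ++moves; return *this; }
};

inline std::size_t moves_for_vector_front_inserts(int n) {
    Counted::moves = 0;
    std::vector<Counted> v;
    for (int i = 0; i < n; ++i)
        v.insert(v.begin(), Counted(i));   // shifts every existing element
    return Counted::moves;
}

inline std::size_t moves_for_deque_push_front(int n) {
    Counted::moves = 0;
    std::deque<Counted> d;
    for (int i = 0; i < n; ++i)
        d.push_front(Counted(i));          // never relocates existing elements
    return Counted::moves;
}
```

For deque, each push_front performs exactly one move (of the temporary); for vector, every front insertion additionally moves all elements already present, so the totals diverge quickly as n grows.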
My second concern is that the containers in this library are difficult to reason about.
Look at it this way: devector is like vector with O(1) front insertion; batch_deque is like deque with custom control over the segment size. Are you opposed to such containers, or is it "all the extra" stuff (which you don't have to use, btw) that makes you uneasy?

Serialization:
--------------
The std types are over-encapsulated, which means they can neither avoid double initialization nor process elements in batches. Is it not obvious that double initialization is more expensive than single initialization?
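The double-initialization point can be demonstrated with the standard containers alone (this is a sketch of the general problem, not the library's serialization code): deserializing n elements through resize() default-constructs every element and then assigns it, i.e. two writes per element, whereas reserve() plus emplace_back constructs each element exactly once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Instrumented type: counts default constructions so the two
// deserialization patterns can be compared by construction count.
struct Cell {
    static inline std::size_t default_ctors = 0;
    int v;
    Cell() : v(0) { ++default_ctors; }
    Cell(int x) : v(x) {}
};

// resize() first: every slot is default-constructed, then overwritten.
inline void deserialize_via_resize(std::vector<Cell>& out, const int* src, std::size_t n) {
    out.resize(n);                    // n default constructions
    for (std::size_t i = 0; i < n; ++i)
        out[i] = Cell(src[i]);        // second write per element
}

// reserve() first: raw storage only, each element constructed once in place.
inline void deserialize_via_reserve(std::vector<Cell>& out, const int* src, std::size_t n) {
    out.reserve(n);                   // no constructions yet
    for (std::size_t i = 0; i < n; ++i)
        out.emplace_back(src[i]);     // exactly one construction per element
}
```

The second pattern only works when the element type is cheaply constructible from the source data; the complaint above is that the std containers give a serialization library no way to skip the first initialization when filling from raw storage.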
For that matter, is the performance of devector better than std::deque when I don't know the relative frequency of front- and back-pushes? It seems in many cases that it would be quite a bit worse.
You are comparing apples and oranges. push_back on vector may be worse than on deque (just try to store large arrays as elements).
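A sketch of why large elements change the comparison, using standard containers and addresses rather than timing: vector's push_back periodically reallocates and must move every element (expensive when each element is 4 KiB), while deque never relocates existing elements, which the standard guarantees via reference stability.

```cpp
#include <array>
#include <cassert>
#include <deque>
#include <vector>

// A deliberately large element: each vector reallocation must move
// every one of these.
using Big = std::array<char, 4096>;

// Count how many times vector's buffer moved during n push_backs,
// detected by watching data() rather than measuring time.
inline int vector_relocations(int n) {
    std::vector<Big> v;
    int relocations = 0;
    const Big* prev = nullptr;
    for (int i = 0; i < n; ++i) {
        v.push_back(Big{});
        if (prev && v.data() != prev) ++relocations;  // buffer moved
        prev = v.data();
    }
    return relocations;
}

// deque never invalidates references on push_back: the first element
// stays at the same address no matter how much is appended.
inline bool deque_front_element_stable(int n) {
    std::deque<Big> d;
    d.push_back(Big{});
    const Big* first = &d.front();
    for (int i = 1; i < n; ++i) d.push_back(Big{});
    return &d.front() == first;
}
```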
Another example from the docs:
de::devector<std::string> reversed_lines;
reversed_lines.reserve_front(24);
[snip]
? If it's just to avoid the verbose for-loop, that's not enough for me. If it's an efficiency claim, I have no numbers to back that up, and the use of IO in the example makes it moot anyway.
It's not about rewriting a loop. It's about having a different layout for your container's sequence, because that layout is either more natural, more convenient, or simply needed. The first use case for devector is actually in the implementation of batch_deque. One could use some deque implementation, but that would not be optimal for performance. Being able to use devector there has made the code so much simpler.
I need a justification for unsafe_push_front() besides "avoiding a check, thus sparing a few instructions". My intuition is that it does not matter, since branch prediction is quite good these days, and a single allocation will swallow all the gains if there are any. However, *measurement* is the only thing that can make this justification, pro or con.
Yes, and what do you do when you actually measure and find out push_back is the bottleneck? vector has both operator[] and at(), and I'm pretty sure people would go mad if they were forced to use an operator[] that did what at() does. So why is push_back any different? You have basically the same situation: two functions that differ only slightly in their precondition.
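The "two functions that differ only in their precondition" point can be sketched in a few lines (this is a toy fixed-capacity buffer, not the library's implementation): the checked push_back performs the capacity test and then delegates to the unchecked one, whose documented precondition is that room exists.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <stdexcept>

// Toy sketch: a fixed-capacity buffer with a checked push_back and an
// unchecked unsafe_push_back. The only difference is the precondition,
// mirroring the at() vs operator[] relationship described above.
template <class T, std::size_t N>
struct tiny_buffer {
    alignas(T) unsigned char storage[N * sizeof(T)];
    std::size_t count = 0;

    std::size_t size() const { return count; }
    static constexpr std::size_t capacity() { return N; }

    void push_back(const T& v) {
        if (count == N) throw std::length_error("tiny_buffer full");
        unsafe_push_back(v);               // delegate once the check passed
    }
    // Precondition: size() < capacity(). Identical body, no branch.
    void unsafe_push_back(const T& v) {
        ::new (storage + count * sizeof(T)) T(v);
        ++count;
    }
    T& operator[](std::size_t i) {
        return *std::launder(reinterpret_cast<T*>(storage + i * sizeof(T)));
    }
    ~tiny_buffer() {
        for (std::size_t i = 0; i < count; ++i) (*this)[i].~T();
    }
};
```

Whether skipping the branch is worth it in any given loop is exactly the kind of question measurement should settle; the sketch only shows that the two entry points share everything except the check.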
Again, from the example docs:
// int sockfd = socket(...); connect(...);
de::devector<char> buffer(256, de::unsafe_uninitialized_tag{});
long recvSize = recv(sockfd, buffer.data(), buffer.size(), 0);

if (recvSize >= 0)
{
    buffer.unsafe_uninitialized_resize_back(recvSize);
    // process contents of buffer
}
else
{
    /* handle error */
}
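For comparison, the same receive pattern approximated with only standard containers (fake_recv is a stand-in for the real recv(), so this compiles without sockets): here the 256 bytes are value-initialized up front, which is exactly the cost the unsafe_uninitialized_tag constructor is designed to avoid, and the buffer is then trimmed to the bytes actually received so nothing past the end can be read.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Stand-in for recv(): pretends 5 bytes arrived on the socket.
inline long fake_recv(char* dst, std::size_t cap) {
    const char msg[] = "hello";
    std::size_t n = sizeof(msg) - 1 < cap ? sizeof(msg) - 1 : cap;
    std::memcpy(dst, msg, n);
    return static_cast<long>(n);
}

inline std::vector<char> receive_into_buffer() {
    std::vector<char> buffer(256);       // zero-initializes all 256 bytes
    long n = fake_recv(buffer.data(), buffer.size());
    if (n >= 0)
        buffer.resize(static_cast<std::size_t>(n));  // shrink to bytes read
    else
        buffer.clear();                  // handle error
    return buffer;
}
```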
The unsafe resize does nothing in this case, since char has a trivial dtor.
It changes the size of the container so that you can't read uninitialized values.

kind regards

-Thorsten