Re: [boost] [review] The review of Boost.DoubleEnded starts today: September 21 - September 30

26 Sep 2017

      El 21/09/2017 a las 19:38, Thorsten Ottosen via Boost escribió:
...
Dear users and members of Boost,
I'm proud to announce that the formal review of Benedek Thaler's 
Boost.DoubleEnded
library starts today, September 21, and runs through September 30. 
This library is the
result of Benedek's participation in Google Summer of Code 2015.
Hi, this is my review of Boost.DoubleEnded. As my review approach is heavily
influenced by my acceptance vote, let me begin with that:

I vote for conditional acceptance of both containers after they have 
been presented
as separate components of Boost.Container and undergone significant 
changes, mostly
along the line of simplifying their interface. A bit of elaboration:

* A library for a familiy of containers (or any other classes) is 
justified when its members
are related either by shared concepts, constructs and/or implementation. 
This is not
the case of Boost.DoubleEnded, whose only general theme is the very lax 
notion of
"double endianness", and for which the amount of common implementation 
code is small
and mostly utility-level. We already have a bestiary of containers in 
Boost, namely
Boost.Container, so I think both devector and batch_deque are better 
served if proposed
as components of this latter libray. Code in boost::double_ended::detail 
can go to / merge
into boost::container[::detail] --for instance, 
boost::double_ended::detail::allocator_traits
seems to be redundant with boost::containter::allocator_traits.
* The main characteristics of both containers are obscured by many 
"extras" that feel
insufficiently justified to me or even detrimental (I'll discuss in more 
detail below), so my
vote is for this fat to be removed prior to acceptance in Boost. I can 
see much work
and love have gone into adding this many niceties, and the author could 
understansably
resist to simplification, but libraries, in my opinion, should start 
with a minimal and
well grounded design and add more stuff as demand proves their need.

So, since I see this library as a pair of two unrelated classes, let me 
continue with the
review for each separately.

REVIEW FOR DEVECTOR

DESIGN

1. The container is presented as a "hybrid of the standard vector and 
deque containers,
as well as the small_vector of Boost.Container.". It's hard to see what 
the selling point
is. To me, the main feature (and the only one worth retaining) of 
devector is
amortized constant insertion at both ends with memory contiguity. So, 
I'd present
devector as a "contiguous memory container with amortized constant 
insertion at both
ends of the sequence". Note that this characterizes the container to the 
point of
nearly fixing its implementation and (assuming STL style is followed) 
its interface.
2. SBO is orthogonal to devector's raison d'être (I mean, SBO can be 
added to any container)
and violates the "don't pay for what you don't use" principle. I'd 
remove it. If someone
asks for it, provide a separate small_devector, but I doubt there'll be 
huge demand for
this (devector, to begin with, is likely to be used only in very 
specialized scenarios, so
the probability that SFO comes in handy as well is, IMHO, vanishingly 
small).
3. Unsafe methods are poorly justified: if std::vector does not have 
them, why add them
here? Such degree of unsafety should be backed up with hard data on its 
superior
performance. I'd remove this bit.
4. Same goes for the growing policy. Not even boost::container::vector 
features a growing
policy, which seems an indication that this is not a heavily demanded 
customization point.
I'd remove it.
5.  Defining size_type as unsigned int for the sake of class size seems 
to me overzealous.
6. reserve_only_tag construction looks like too much hassle to avoid 
default construction
followed by proper reserving calls, which is what everybody already does 
with
std::vector. Again, I'd remove this one.
7. Modulo the points discussed above, the interface of devector follows 
closely that
of std::vector (with the obvious additions, e.g. 
[push|pop_emplace]_front), which is
very good. The only interface decisions I'm not fully convinced about is 
capacity/
resize/reserve/shrink_to_fit without "front" or "back" qualification:
   - capacity() seems to be (from what I can infer reading the docs 
only) the sum of
   front capacity and back capacity. I can't find any sensible use of 
this info, and, besides,
   the careless reader could assume capacity() works as 
std::vector::capacity(), which
   raises usability concerns. My suggestion is to remove this. FWIW, we 
had a similar
   discussion about capacity() in Boost.PolyCollection, and that member 
function finally
   was dropped.
   - front_free_capacity/back_free_capacity could be more aptly named
   front_capacity/back_capacity.
   - resize and reserve as aliases to resize_back and reserve_back have 
usability  problems as
   discussed with capacity. I'd remove them.
   - For symmetry, shrink_to_fit could be complemented with 
[front|back]_shrink_to_fit
   member functions --note I'm not proposing that shrink_to_fit be 
removed, as its
   semantics are perfectly clear.

[SERIALIZATION]

IMPLEMENTATION

1. I've taken just  cursory look at the code, but everything looks very 
good in general.
2. I take that lines 2367-8 (and similar) in devector.hpp:

   // avoid invalidating any reference in args later
   T tmp(std::forward<Args>(args)...);

are somehow related to LWG Issue 526, right?
3. I have issues with growing and capacity handling. If I'm reading the 
code right,
growth factor is x4 (!!) and whether the new space is given to the front 
or back side
depends on whether the insertion that triggered the expansion is closer 
to one
or the other. x4 is much larger than any growing factor I've seen so far 
for std::vector
(typically x1.5 or x2), I suggest this should be brought down to what's 
customary. As
for the policy for giving the space to front or back, I think it makes 
sense but needs to
be documented officially. I'm assuming that erasure behaves analogously 
(i.e. space
is given back to the side that's closer to the erasure point) --in 
particular, inserting
an element and erasing it immediately after should return front and back 
capacities
to their original values.
4. I'm not comfortable with the fact that shifting on insertion turns to 
the opposite
side when the favored side (the one closer to the insertion point) 
happens to be
full. This makes it very hard to reason about post-insertion front and 
back capacities.
Also, the reference statement that insertion is "linear in half the size 
of *this [...]"
is currently not true (when the favored side is full insertion takes 
longer than that).
I'd rather drop this optimization and document that shifting always 
happen to the
direction where the least operations are needed.
5. I wonder why serialization code is so complicated: doesn't it suffice 
to rely on
boost::serialization::stl::collection_load_impl|save_collection?
6. Tests look very good, with corner cases and all.

DOCUMENTATION

1. If the extra features are removed, the design rationale would 
consequently be much
shorter.
2. I confess I don't get what "devector implements the NAD (Not A 
Defect) resolution
of LWG Issue 526" is about.
3. The examples section is 90% devoted to the extra features I advocate 
to suppress.
4. The benchmark section should go (IMHO). The display of assembly 
outputs is anecdotal
at best, since it is quite plain to guess that devector::push_back 
should be basically the same
code as std::vector::push_back. I don't think devector needs any 
benchmark backup to be
bought by potential users.
5. The reference section is very good. Capacity handling and which side 
shifting goes
on insertion/erasure should IMHO be documented exhaustively (see my 
points above on
this matter).

POTENTIAL USEFULNESS OF DEVECTOR

I can only use my intuition here, but memory contiguity plus constant 
insertion at both
ends looks a general enough proposition to meet potential users. FIFO 
queues jump to mind.
Time will tell.

OTHER

1. I didn't try to use the library.
2. I spent aroud four hours reviewing devector.
3. I have some experience writing STL-style containers.

REVIEW FOR BATCH_DEQUE

DESIGN

1. In the design rationale, batch_deque is presented as a deque with 
configurable segment
size, which is true, but not the end of the story: segment access 
(segment_begin) is not
mentioned, neither is stable insertion.
2. As for segment access, I think the chosen interface is very odd. From 
what I can gather after
looking at the code and testing (because the reference does not document 
this), a
segment_iterator dereferences to the first element of the pointed to 
segment and has
additional data() and data_size() member functions to access the full 
segment. This is a sort
of hybrid entity I fail to see the utility of (for instance, what can I 
do with referencing?). I have
some experience with segmented structures and think a better approach 
would be for
segment_iterator to dereference to a segment descriptor with begin() and 
end() member
functions ranging over the segment elements. This way, one can write 
stuff like:

   for(auto it=d.segment_begin(),end=d.segment_end();it!=end;++it){
     for(auto& element:*it){
       // do something with element
     }
   }

The thing can be further stylized by dropping segment_[begin|end] and 
having a
segments() member functon returning an object with equivalent begin() 
and end()
member functions, i.e., one would write d.segments().[begin|end]() 
rather than
d.segment_[begin|end](), which allows us to use a range for in:

   for(auto segment:d.segments()){
     for(auto& element:segment){
       // do something with element
     }
   }

3. The question arises of whether segmented access can really gain us some
performance I wrote a small test to measure the performance of plain 
std::for_each
vs. an equivalent sequence of segment-level loops, and this is what I 
got in a
...
The library may be downloaded from
https://github.com/erenon/double_ended
and the documentation may be found here:
http://erenon.hu/double_ended/
Anyone is welcome to post a review and/or take part in subsequent 
discussions (see below for review guidelines).
Introduction
------------
This is a container library that provides two containers:
A. boost::devector
B. boost::batch_deque
Both containers provides efficient push_front and push_back 
operations, hence the name DoubleEnded.
The boost::devector class can be seen as an extension of std::vector 
with efficient support for push_front (so boost::devector provides 
contiguous storage). In addition, it provides a customizable small 
buffer optimization like boost::small_vector(*).
It furthermore allows for customization of the growth factor such that 
the user can decide if a reallocation should allocate 1.5 or 2 or 
whatever more memory. And finally it provides new functionality that 
does not exist in normal vector implementations -- this functionality 
allow the class to excel in performance in certain situations like 
serialization or transfer of data from sources that require a 
preallocated buffer.
The boost::batch_deque is similar to std::deque, but with two 
important twists. Like std::deque, it provides reference stability 
when calling push_front/push_back. Reference stability is crucial for 
programs that
uses intrusive containers e.g. using Boost.Intrusive (**). 
boost::batch_deque has the following main advantages:
1. It allows the user to specify the segment size
2. It allows for external iteration over the segments
Taken together, they provide a very efficient and flexible basis for
intrusive containers. Access to the segments also increases 
performance for tasks such as serialization. Finally, 
boost::batch_deque may be seen as an efficient alternative to 
std::vector if you want to replace
a percent-wise memory overhead with a constant memory overhead and 
have no need for contiguous storage.
Review guidelines
-----------------
Please provide in your review whatever information you think is
valuable to understand your final choice of ACCEPT or REJECT including
Fit as a Boost library. Please be explicit about your decision.
Some other questions you might want to consider answering:
  - What is your evaluation of the design?
  - What is your evaluation of the implementation?
  - What is your evaluation of the documentation?
  - What is your evaluation of the potential usefulness of the library?
  - Did you try to use the library? With which compiler(s)? Did you
    have any problems?
  - How much effort did you put into your evaluation? A glance? A quick
    reading? In-depth study?
  - Are you knowledgeable about the problem domain?
More information about the Boost Formal Review Process can be found 
here: http://www.boost.org/community/reviews.html
Kind regards
Thorsten, Review manager
(*) boost::small_vector: https://tinyurl.com/ycslp4ot
(**) http://www.boost.org/doc/libs/1_64_0/doc/html/intrusive.html
_______________________________________________
Unsubscribe & other changes: 
http://lists.boost.org/mailman/listinfo.cgi/boost

Re: [boost] [review] The review of Boost.DoubleEnded starts today: September 21 - September 30

Joaquin M López Muñoz