Re: [boost] [devector] optimum growing/insertion policy

16 Oct 2017

      Den 16-10-2017 kl. 01:45 skrev Ion Gaztañaga via Boost:
...
On 15/10/2017 17:59, Joaquin M López Muñoz via Boost wrote:
...
The assumption is of an ever growing sequence. That said, I recorgnize 
there are
scenarios not falling under this, most notably FIFO queues. I've given 
the policy
a little more thought and think we can twist it to also accommodate 
non-growing
scenarios, please keep on reading.
Thougths?
I think that the default (we could have other policies) should be simple 
and similar to vector/string/stable_vector/etc....
Yes, let's keep it simple.
...
I propose a simple strategy:
Definitions:
---------
capacity() = the size of the buffer
free_capacity() = capacity() - size()
xxx_free_capacity() = the number of unconstructed items in the xxx (xxx 
= left or right) side of the buffer.
N = the number of items to be inserted
GF = growth factor
Strategy:
---------
1) devector is initialized with equal free capacity for both ends.
I assume that implies reserve() does the very same?
...
2) If a xxx (xxx = left or right) insertion is needed and 
xxx_free_capacity >= the number of items to be inserted, a xxx insertion 
is done (minimizes moves).
3) Otherwise if a xxx (xxx = left or right) insertion is needed and 
xxx_free_capacity < N, the sequence containing new plus old elements 
should be placed in the middle of the buffer. (Note: This step should be 
limited if amortized constant time is desired for push_xxx insertions)
There might be some corner cases here. Suppose we have 
front_free_capacity() == 3, back_free_capacity() == 0 and is about to 
insert to the right half. Does it then make sense to place elements in 
the middle?

If we shift by one element position we pay (on average)

  size * 3/4 moves

If we move to the middle, we have to move everything and pay

  size * moves

We now have two (average) insertions to gain the lost moves before we 
allocate. One will be in the left side, so has no impact. The other 
insertion will now take

  size * 1/4 moves

if we placed in the middle and

  size * 3/4 moves

if we shifted. So shifting leads to

  size * 6/4 moves

and moving to the middle leads to

  size * 5/4 moves.

So I agree, we should probably always move to the middle if the free 
space >= 3 (at the other end). This applies to situations where 
reallocating is not desired.
...
Worst case: for a repeated one-side insertion, when N/2 insertions are 
performed, the first relocation step will take place. In general, 
log(N/2) relocation steps (each one with a increasing number of moves) 
will be performed before reallocating a new buffer. I don't think it's 
amortized constant time, but amortized logarithmic time (but see below).
Extra moves can be reduced by imposing a capacity() that is a factor 
(say reallocation factor, RF) of the buffer size, forcing to 
reallocation when the load factor is high. This means that if size() is 
capacity() = RF*internal_buffer_size, even if there is room for the 
left/right insertions a reallocation will be performed. This means that 
capacity() = RF*internal buffer is known and means something to the 
user. Memory usage will grow, but data movement will be minimized. RF 
could be a policy-based parameter.
I think the RF factor obtains an amortized constant time for one side 
insertion if that property is considered useful.
I think that is a must.
...
A fast choice would be to allow just two relocations, a RF of 7/8 = 
(0,87) which would mean aprox. 2,6*N moves to push_back N elements. A RF 
of 3/4 (0.75, single relocation step) would mean 1,33*N moves to 
push_back N elements.
I like low numbers here.
...
For a FIFO, I think a deque that reuses free segment capacity on one 
side and transfers it to the other side before allocating new memory 
would be the best choice. Currently boost::container::deque does not do 
this, but pull requests are welcome ;-)
Or maybe just boost::circular_buffer (is this not the optimal way to 
handle that case?). No one is saying that devector has to solve every 
problem.

I do want to focus on simple properties from a user's perspective. Some 
cases to consider:

# case 1: push_back from empty state
------------------------------------

   vector<int> v;
   v.reserve( N );

This guarantees N pushes without exceptions and without reallocation.

# case 2: push_back from unknown state
--------------------------------------

   vector<int> v;
   ...
   v.reserve( v.size() + N );

This guarantees N pushes without exceptions and without reallocation.

# case 3: insert from empty state
---------------------------------

   vector<int> v;
   v.reserve( N );

This guarantees N inserts without exceptions and without reallocation.

# case 4: insert from unknown state
-----------------------------------

   vector<int> v;
   ...
   v.reserve( v.size() + N );

This guarantees N inserts without exceptions and without reallocation.

How do these properties work for a devector? If I understand your 
strategy, we get the following:

# case 1: push_back from empty state
------------------------------------

   devector<int> v;
   v.reserve( N );

This does not guarantee N pushes without exceptions or without 
reallocation. To get that guarantee we have to write

   devector<int> v;
   v.reserve_back( N );

# case 2: push_back from unknown state
--------------------------------------

Same as for case 1: no guarantees unless reserve_back is called.

# case 3: insert from empty state
---------------------------------

   devector<int> v;
   v.reserve( N );

This does not guarantees N inserts without exceptions or without 
reallocation. Using reserve_back or reserve_front doesn't help either. 
The only thing that works is if insert defers all reallocation till it 
is absolutely needed.

# case 4: insert from unknown state
-----------------------------------

Same as for case 3: no guarantees unless insert exhaust memory.

There are various ways to fix that: either the user replaces

    v.reserve( N )
    v.reserve( v.size() + N )

with

    v.reserve( N * 2 )
    v.reserve( v.size() + N * 2 )

or reserve() automatically acquires double the extra memory requested.
The latter will also make generic code work as expected. Using the 
double amount of requested memory automatically guarantees the optimal 
insertion performance: no move to the middle is ever required and on 
average devector performs half the moves of vector.

If the user wants less memory waste for push_front/push_back, he can use
reserve_front/reserve_back.

If the user wants full capacity usage for insert, he can only get that 
if we exhaust memory for insert by moving to the middle and make reserve
allocate exactly what he asks for. Even so, devector should perform no 
worse than vector, and likely somewhat better.

Unfortunately, the is no scheme that is 100% compatible with vector if 
reserve() for devector produces front_free_capacity() == 
back_free_capacity(). For that to work, reserve() must act as 
reserve_back().

Would that be undesirable for insert of reserve() is modeled after the 
vector? What would happen is that the first time an element falls in the 
front half, the elements are moved to the middle. This is probably 
acceptable compared to removing reserve() altogether because its 
semantics are ambiguous.

How do we get the optimal performance for flat containers? We can still
call reserve( 2 * N ) and then only pay one move to the middle 
operation. This is not 100% optimal, granted. For that to happen flat 
containers would need to forward additional arguments for reserve().
(and possibly clear() as well)

How bad is it to waste 50% memory for optimal flat containers. If the 
container is local in a function, it is probably of no concern at all.
Otherwise one may use trim_to_fit() if that applies. At any rate, it 
would be the user that decides.

kind regards

Thorsten

Re: [boost] [devector] optimum growing/insertion policy

Thorsten Ottosen