On 15/10/2017 17:59, Joaquin M López Muñoz via Boost wrote:
> The assumption is of an ever growing sequence. That said, I recognize
> there are scenarios not falling under this, most notably FIFO queues.
> I've given the policy a little more thought and think we can twist it
> to also accommodate non-growing scenarios, please keep on reading.
>
> Thoughts?
I think that the default (we could have other policies) should be simple
and similar to vector/string/stable_vector/etc. In this sense, capacity()
and reserve(), if provided, should make sense. If we reallocate depending
on the free capacity of one side, then capacity() and reserve() make no
sense to the user. I think the user expects capacity() to be the maximum
number of insertions, in any position, that will be accepted before
reallocating. If what really limits insertions is, say,
back_free_capacity(), which is less than free_capacity() = capacity() -
size(), that will be really hard for users to understand.

I propose a simple strategy:

Definitions:
---------

capacity() = the size of the buffer
free_capacity() = capacity() - size()
xxx_free_capacity() = the number of unconstructed items on the xxx
(xxx = left or right) side of the buffer
N = the number of items to be inserted
GF = growth factor

Strategy:
---------

1) The devector is initialized with equal free capacity on both ends.

2) If a xxx (xxx = left or right) insertion is needed and
xxx_free_capacity >= N, a xxx insertion is done (this minimizes moves).

3) Otherwise, if a xxx insertion is needed, xxx_free_capacity < N and
free_capacity() >= N, the sequence containing new plus old elements is
placed in the middle of the buffer. (Note: this step should be limited
if amortized constant time is desired for push_xxx insertions.)

4) Otherwise, if the allocator supports in-place expansion, the buffer
is expanded to max(size()+N, capacity()*GF) and, if successful, step 2
or 3 is performed depending on the new xxx_free_capacity value.

5) Otherwise, a new buffer is allocated and the new sequence containing
new plus old elements is placed in the middle of that buffer.
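To make the decision sequence concrete, here is a minimal sketch of
steps 1-5 for a single-element push_back (N == 1). This is illustrative
code only, not Boost.Container: the storage is simplified to a
std::vector<int> used as a raw buffer holding the live range
[first, last), and recenter() / try_expand_in_place() are hypothetical
helpers (a plain std::vector cannot expand in place, so step 4 always
fails here; a real implementation would use the allocator's expansion
support and manage uninitialized storage).

#include <algorithm>
#include <cstddef>
#include <vector>

class devector_sketch
{
   std::vector<int> buf = std::vector<int>(8); // step 1: start centered
   std::size_t first = 4;  // live range is [first, last)
   std::size_t last  = 4;

   std::size_t size() const               { return last - first; }
   std::size_t capacity() const           { return buf.size(); }
   std::size_t back_free_capacity() const { return capacity() - last; }

   // Step 3: move the live sequence to the middle of the current buffer.
   void recenter()
   {
      const std::size_t sz = size();
      const std::size_t new_first = (capacity() - sz) / 2;
      if (new_first < first)
         std::copy(buf.begin() + first, buf.begin() + last,
                   buf.begin() + new_first);
      else
         std::copy_backward(buf.begin() + first, buf.begin() + last,
                            buf.begin() + new_first + sz);
      first = new_first;
      last  = new_first + sz;
   }

   // Step 4: stand-in for an allocator with in-place expansion. A plain
   // std::vector can't expand in place, so this always reports failure.
   bool try_expand_in_place(std::size_t /*new_cap*/) { return false; }

public:
   void push_back(int x)
   {
      if (back_free_capacity() >= 1) {    // step 2: room on the right
         buf[last++] = x;
         return;
      }
      if (first > 0) {                    // step 3: free_capacity() >= 1
         recenter();                      // but only on the left side:
         buf[last++] = x;                 // recenter, then insert
         return;
      }
      const std::size_t new_cap = std::max(size() + 1, capacity() * 2); // GF = 2
      if (try_expand_in_place(new_cap)) { // step 4: grow without moving
         recenter();
         buf[last++] = x;
         return;
      }
      // Step 5: allocate a new buffer, place the sequence in its middle.
      const std::size_t sz = size();
      std::vector<int> bigger(new_cap);
      const std::size_t new_first = (new_cap - sz) / 2;
      std::move(buf.begin() + first, buf.begin() + last,
                bigger.begin() + new_first);
      buf.swap(bigger);
      first = new_first;
      last  = new_first + sz;
      buf[last++] = x;
   }
};

push_front would mirror the same steps on the left side, and a range
insert would use N instead of 1 in the capacity checks.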
For uniformly distributed random insertions on both ends, very few extra
moves will be needed. This means insertion is amortized constant time,
just like vector.

Worst case: for repeated one-side insertion, the first relocation step
will take place after C/2 insertions (where C is the buffer capacity).
In general, log(C/2) relocation steps (each one with an increasing
number of moves) will be performed before a new buffer is allocated. I
don't think that's amortized constant time, but amortized logarithmic
time (but see below).

Extra moves can be reduced by imposing a capacity() that is a factor
(say, a reallocation factor, RF) of the buffer size, forcing a
reallocation when the load factor is high. This means that when size()
reaches capacity() = RF*internal_buffer_size, a reallocation will be
performed even if there is room for left/right insertions.
capacity() = RF*internal_buffer_size is therefore known and means
something to the user. Memory usage will grow, but data movement will be
minimized. RF could be a policy-based parameter.

I think the RF factor achieves amortized constant time for one-side
insertion, if that property is considered useful. It's equivalent to
reducing the number of relocations to a fixed number of steps, so the
average number of moves needed to insert N elements into a devector of
capacity C = N/RF is proportional to N.

For a quick example (the numbers might be wrong, but you get the idea)
of repeated push_back: if relocation steps are limited to 3, which
happens with RF = 15/16, we need N (= 15/16*C) moves to insert the new
elements, plus C/2 + 3/4*C + 7/8*C moves to relocate already inserted
elements. This means something like N + 17/8*C = N + (17/8)*(N/RF) =
49/15*N = 3.26*N moves. A fast choice would be to allow just two
relocations, an RF of 7/8 (0.87), which would mean approx. 2.6*N moves
to push_back N elements. An RF of 3/4 (0.75, a single relocation step)
would mean 1.33*N moves to push_back N elements.

Statistics-based policies seem more complex, and I don't know how well
they will adapt to pattern changes throughout the lifetime of the
container; they also impose a size cost on the container. I wouldn't
make them the default choice, as the average user should easily
understand the default policy.

For a FIFO, I think a deque that reuses free segment capacity on one
side and transfers it to the other side before allocating new memory
would be the best choice. Currently boost::container::deque does not do
this, but pull requests are welcome ;-)

Best,

Ion