El 15/10/2017 a las 15:50, thorsten.ottosen via Boost escribió:
hi Joaquin,
Thanks for this. Some comments below.
Assumptions
* The cost of allocating new memory is negligible when compared to the cost of moving elements around (either intra-buffer or when migrating to a new buffer); so, the growing/insertion policy should focus on minimizing number of elements moved. This implies fully occupying free space before reallocation is *not* the primary goal of the growing/insertion policy. — But in the real world allocations (+ the implied deallocation ) is not that cheap, especially not with non trivial destructors. For small containers, typical for flat containers, I’m not sure this is a good assumption.
The assumption is of an ever growing sequence. That said, I recorgnize there are scenarios not falling under this, most notably FIFO queues. I've given the policy a little more thought and think we can twist it to also accommodate non-growing scenarios, please keep on reading.
Facts
[...] * If a right insertion at position N arrives and there's no free back capacity (but there is free front capacity), we have only two options: A. Reallocate to get back free capacity. B. Shift elements to the left. The key observation here is that A is cheaper in the long run: as the scenario is one of an ever-growing sequence, reallocation will happen eventually and we can just amortize it in our analysis, so reallocating early saves us the extra movements incurred when left shifting on a right insertion, namely N - (size()-N) = 2*N - size()
— How do you get that number ? I would say we save Incur 1/2 * size movements on average for shifting to the end with free space.
The number follows from: assume right insertion is at position N, that means that under normal conditions (there's back free capacity) we need to make size()-N movements to the right. If instead of that we shift to the left the resulting number of movements is N, so the penalty for taking the "wrong" decision is the difference N - (size()-N) = 2*N - size(). Did I explain myself now? The worst case is when N is right at the end of the sequence (N=size()) and there's no free space at the back, the penalty is then N - (size()-N) = size(), that is, we have to shift the entire sequence to the left (in which case it's clear that reallocating would have been a better decision).
Description of resulting policy
[...]
—
Interesting analysis. Thanks. I agree if the sequence is ever growing, it’s better to allocate when one end is full.
The idea of keeping track of left right insertions also interesting. It does assume additionally that the insert pattern in the future is the same as in the past, right?
I think in the case where the user has called reserve, it’s a little problematic that insert may allocate anyway. That is, for many uses there is no infinitely growing sequence .
The right_ member can become larger than size if elects are added to the right and removed from the left, not sure what to do about that.
Yes, you're right. right_ becoming greater thatn size() can't happen under the ever- growing assumtion, but it certainly can in the real world. Let's slightly adjust the policy as follows. I think this copes with the ever-growing and fixed-size scenarios equally well. Let us remember that the goal when reserving new free space at both front and back is to make it so that both ends get exhausted at the same time --if they ever are. * Fix the growing factor to G. * The right balancing factor B is defined as the ratio between back free capacity and total free capacity. Let's say this is 50% at the start, that is, we have front_free_capacity()= back_free_capacity(). * Right insertions and left insertions shift to the right and left, respectively (as before). * When reallocation is needed, calculate the new balance factor as a function of front and back space consumption just prior to reallocation. Pseudocode follows: struct free_space { std::size_t back,front }; free_space new_space(free_space original,free_space remaining) { float balance=(float)std::max(original.back-remaining.back,0)/ (original.back+original.front-remaining.back-remaining.front); // denominator can't be 0 std::size_t new_space=std::max(size()*G,size()+remaining.back+remaining.front), new_free_space=new_space-size(); return {balance*new_free_space,(1-balance)*new_free_space}; } This has the effect that if no insertions happened at one end (more precisely, more erasures than insertions took place at that end) then the new reserved space for that end will be zero. Safe guards may be applied. * When reallocation results in the same total space being reserved (this happens when the size at reallocation time is equal or less than the size() at the last reallocation), there is no need to request memory and we can simply shift the sequence so that back and front free capacity are adjusted according to new_space calculations. This nicely handles the FIFO scenario, for instance: when the growing side meets its end, reallocaton resolves to shifting the entire sequence within the current buffer so that space is kept at that end: this keeps cycling forever with no memory allocation and minimal element shifting. Thougths? Joaquín M López Muñoz