Peter Dimov wrote:
Ion Gazta?aga wrote:
El 13/03/2015 a las 14:38, Phil Endecott escribi?:
(Personally, I have a flat_map that does batched insertions i.e. the new items are appended individually and then merged at the end of the batch. This will have significant benefits for some workloads.)
Yes, this should be added to flat_map. The interface won't be very pretty, but I would like to hear suggestions in this direction.
I think we discussed this before. My preferred approach is flat_set S; { flat_set::batch b(S); b.insert(xxx); b.insert(yyy); } // b's dtor does the merge Or, if you don't like doing things in destructors, give batch a "commit" or "merge" method. One question is: should lookups while a batch is in progress (a) look only at pre-batch items, or (b) crash, or (c) work correctly (with a linear scan of the new items)? I prefer (a).
It doesn't have to have an interface, at least in principle. flat_map can just append the new elements to the end, keeping them sorted, until it hits a threshold, then merge. Lookups will first look at the new element range (because those are likely to be in cache), if not found, at the old element range.
I once tried to work out the complexity of operations on this structure, and I don't think the results were very impressive. Say you have N elements in the "main" container, and M elements in the "recently added" portion. Insertion will be O(M) due to moving the older "new" elements, plus some amortised amount for the eventual merge into the main container. Lookup will be O(log N + log M). Traversal is O(N + M) but with a significant constant factor as you have to compare main and new elements as you go. I think a useful maximum M is sqrt(N). If M is smaller than sqrt(N), then you are doing the merge often enough that the amortised merge complexity exceeds the insertion complexity. It is also possible to make this a recursive data structure, i.e. the "recently added" portion is the same sort of container with its own "very recently added" portion, ad infinitum. I don't think I managed to work out the complexity of that structure. I suspect that there is a provable lower bound on what can be achieved with "flat" data structures, though. Regards, Phil.