On 11/09/2015 10:14 AM, Phil Endecott wrote:
I refer to the feedback that I gave last time you posted:
Thanks for your feedback Phil. Sorry I did not properly respond in the original thread.
I think this could be most useful if it were decomposed into several layers:
- An in-memory B+-Tree - An augmented tree - An order-statistics tree
That way, we could have in-memory B+-trees the provide a std::map-like interface, which would surely be useful, and we could have O(log N) random access insert/erase on a binary tree where that is appropriate, and we could have other kinds of augmented trees such as interval trees.
Working out exactly what these layers should look like is not easy, but I think it's worth doing in order to broaden the applicability.
My library implements a counted B+Tree. I see this as the basis to which other indexes can be added. In particular I would like to add support for maps and sets as this would provide a high performance alternative to the standard containers for trivially-copyable keys. I see no reason to make the count index optional. The memory overhead of storing the count index for a set where the key is 64 bits is less than a tenth of a percent. This is because the counts stored in level 1 of the tree are used to encode the length of segments, which maps/sets need to know as well as sequences. I have also benchmarked the CPU overhead of maintaining the count index and found it to be negligible. On top of this, it would dramtically complicate implementation for the count index to be optional as I would still need to store the segment lengths somewhere and implement iterator operations less efficiently when it was not present. Keeping the index around would also mean that maps/sets would recieve efficient access by position for practically free. That being said, I think the data structure is useful today and should be a part of Boost. Pouring over the mailing list I see that other people have proposed and even implemented sequences with logarithmic random access operations many times, so clearly there is a need for such a thing in the community. The reaon I am posting again to the list is because I have worked on reference documentation, boost documentation, testing, and polished the implementation. This addresses some of the comments from my last post so it's worth taking another look at. Regards, Chris.