On Sat, Sep 12, 2015 at 8:46 PM, Ion GaztaƱaga
On 12/09/2015 1:10, Glen Fernandes wrote:
Useful links:
- Source code: https://github.com/det/segmented_tree_seq - Documentation: http://det.github.io/segmented_tree_seq/ - Benchmarks: https://github.com/det/segmented_tree_seq/tree/benchmarks
Any feedback is greatly appreciated.
The benchmarks are impressive. I hope to take a deeper look this weekend, but from my cursory glance: the implementation looks very clean too.
Ion: Is this something you would be interested in reviewing?
I'd like to separate the proposal to put that data structure into Boost and the proposal to put it into Boost.Container.
The implementation looks good although I have no experience with B+Trees or augmented data structures. I think we should have more innovative container in Boost and there have been more proposals to similar data structures that were not.
Regarding Boost.Container, one of the biggest problems with adding new stuff into Boost.Container is that it should be consistent with the rest of the library. The library supports C++03, supports Interprocess requirements, recursive types, SCARY iterators, etc... it must be maintained and documented as the rest of the library. I wouldn't like to maintain data structures I really don't fully understand or just left them unmaintained. We added stable_vector and static_vector but those were easier data structures.
Just like we have Boost.CircularBuffer or Boost.Unordered I'd like to propose adding adding new containers in its own library. Maybe an (optionally augmented) B+Tree library would be nice, and it could also contain a sequence container based on that data structure.
Just my 2 cents, does it make sense?
All of your considerations make sense. I understand technical problems that can be created by adding other (potentially quite different) types of containers to Boost libraries. But I am sure that there will be new proposals and new requests for advanced data structures. Sequences are everywhere in real life applications, not only texts and strings. In addition to this, there is an interesting lesson that I have learned since written "STL extensions" library. These days big interest and selling point for augmented data structures is not so much single augmenting by count of container elements, but multiple augmenting, which supports fast summation algorithms (more generally fast computations using binary operations). These data structure can be used to compute important statistical parameters, such as mean value and standard deviation, in logarithmic running time on cheap single processor systems. The algorithms are general and can be applied to both ordered and unordered data sets. Alternative parallel computations require supercomputers to achieve the same performance. The application area - big data and data science. Regards, Vadim Stadnik