Dear Phil, FYI,
I would be interested to see a comparison with Beman Dawes' B-Tree library from 2010:
https://github.com/Beman/btree seems to depend on an older version of
Boost.Endian,
It doesn't compile with the recent versions of boost. I can't find a proper
version of Boost.Endian which is compatible with.
To compare with Google's 2011 B-Tree implementation, (
https://code.google.com/archive/p/cpp-btree/)
I saw the following results:
https://github.com/frozenca/BTree/tree/295a9ec4b5495a83c546fd3e300d54d4c78d8...
Regards,
Jooseong
2022년 7월 26일 (화) 오전 12:03, Jooseong Lee
I see that you have O(log N) n-th element access, that it is an interesting feature. Does this mean that every non-leaf node has a count of its descendants?
Yes, every non-leaf (and leaf) node has a count of keys in the subtree where that node is the root.
I don't see how erase(a,b) can be O(log N), is that a typo?
No, it's not. In order to explain this, I have to explain split() first: split(Tree&& tree, key a, key b) splits a tree to two trees, say tree_l and tree_r, so that tree_l contains all keys less than a, tree_r contains all keys greater than b. By B-Tree's nature, this is O(log N).
erase(a, b) works like this: 1. compute lb = lower_bound(a), ub = upper_bound(b) pair. These are iterators. O(log N) 2. compute order(lb), order(ub). Therefore, order(ub) - order(lb) is the number of elements that will be erased. O(log N) 3-1. When order(ub) - order(lb) < 30, the tree erases element one by one 3-2. When it exceeds 30, the algorithm first splits the tree to two trees by calling spit(std::move(*this), lb, ub) and joins the two split trees. split() and join() is O(log N), and there is one split() call and join() call, so the whole complexity is O(log N)
I plan to benchmark mine with abseil's B-Tree soon. Thank you for pointing out Beman's implementation and sqlite, I'll list them too.
I'll add Boost license too, thanks for that
Best, Jooseong
2022년 7월 25일 (월) 오후 11:29, Phil Endecott via Boost
님이 작성: Jooseong Lee wrote:
I recently wrote a general-purpose STL-like ordered associative B-Tree based container in C++. . Its API and usage is very similar to std::set, std::map, std::multiset, std::multimap, and when the number of elements is relatively large, my benchmark shows that it is 2.5~3 times faster than std::set and its friends.
Repo link: https://github.com/frozenca/BTree
I would be interested to see a comparison with Beman Dawes' B-Tree library from 2010:
https://lists.boost.org/Archives/boost//2010/09/170995.php https://github.com/Beman/btree
Personally, I often use flat sets/maps. Of course these are best suited when lookups dominate over modifications. It would be interesting to know how your library compares for lookups.
I see that you have O(log N) n-th element access, that it is an interesting feature. Does this mean that every non-leaf node has a count of its descendants?
I don't see how erase(a,b) can be O(log N), is that a typo?
In the past I have used a lot of memory-mapped, mostly read-only, binary data structures but these days I'm more likely to use sqlite. Have you tried to benchmark against that?
I note it is not Boost licensed.
Regards, Phil.
_______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost