What is not stored contiguously is the child array, which is a std::vector of std::unique_ptrs. But when N keys are erased, the number of children erased is not O(N), because a node can hold #(fanout) keys.
I have to correct this sentence: the number of children erased is O(N). But the constant factor is small, because in practice B-Trees use large fanouts.
More precisely, a call to erase_range(a, b) is O(log N) + O(K), where N is the total number of keys and K is the number of keys being erased.
I've updated the repo description, thanks for the catch.
Best,
Jooseong
On Wed, Jul 27, 2022 at 12:06 AM, Jooseong Lee wrote:
If the element type has a trivial destructor and they are stored contiguously, then you can try to argue that it is better than O(N). This will depend on the complexity guarantees of your memory allocator. But your elements are not stored contiguously; destroying N elements will require O(N) "pages" to be released, if your pages are fixed-size. So it is still O(N).
Actually, the keys are stored contiguously:
https://github.com/frozenca/BTree/blob/bc62015a3789e5004ec98be0b82587de4edfd...
For trivially copyable types, the key array is a std::array; for other types, the key array is a std::vector (in that case, your argument would be right).
What is not stored contiguously is the child array, which is a std::vector of std::unique_ptrs. But when N keys are erased, the number of children erased is not O(N), because a node can hold #(fanout) keys.
Best, Jooseong
On Tue, Jul 26, 2022 at 11:40 PM, Phil Endecott via Boost wrote:

I don't see how erase(a,b) can be O(log N), is that a typo?
No, it's not. To explain this, I have to explain split() first: split(Tree&& tree, key a, key b) splits a tree into two trees, say tree_l and tree_r, such that tree_l contains all keys less than a and tree_r contains all keys greater than b. By the B-Tree's nature, this is O(log N).
erase(a, b) works like this:
1. Compute the pair lb = lower_bound(a), ub = upper_bound(b). These are iterators. O(log N)
2. Compute order(lb) and order(ub). order(ub) - order(lb) is then the number of elements that will be erased. O(log N)
3-1. When order(ub) - order(lb) < 30, the tree erases the elements one by one.
3-2. When it exceeds 30, the algorithm first splits the tree into two by calling split(std::move(*this), lb, ub), then joins the two split trees. split() and join() are each O(log N), and there is one split() call and one join() call, so the whole complexity is O(log N).
You have to call the destructors on the elements that you have erased, so it must be O(N) in general. Your description skips the part where you drop the portion of the tree between the two splits.
If the element type has a trivial destructor and they are stored contiguously, then you can try to argue that it is better than O(N). This will depend on the complexity guarantees of your memory allocator. But your elements are not stored contiguously; destroying N elements will require O(N) "pages" to be released, if your pages are fixed-size. So it is still O(N).
Regards, Phil.