Re: [boost] Proposal: Vector-like container, which takes O(log(N)) for insert/erase (Glen Fernandes)
Hi, again, I've implemented a container which should replace vector and have O(log(N)) for insert/delete. Please review. https://github.com/alexkupri/array/blob/master/description/description.txt The code is small, the description is verbose. Best regards, A.K.
On Thu, Apr 3, 2014 at 4:42 PM, Alexander Kuprijanov wrote:
Hi, again, I've implemented a container which should replace vector and have O(log(N)) for insert/delete. Please review. https://github.com/alexkupri/array/blob/master/description/description.txt The code is small, the description is verbose. Best regards, A.K.
I wasn't sure at first why my name was in parenthesis in the e-mail subject. :-) I'll take a look. Glen
Hi, Alexander Kuprijanov wrote:
Hi, again, I've implemented a container which should replace vector and have O(log(N)) for insert/delete. Please review. https://github.com/alexkupri/array/blob/master/description/description.txt
The code is small, the description is verbose.
AFAIU just like a B-tree can be seen as a generalization of a binary search tree in that a node can have more than two children, your container can be seen as a similar generalization of a Rope. Or it can be seen as an augumented B-tree (not really because the values are no longer stored in internal nodes and the tree isn't sorted, correct me if I'm wrong). Your proposal is very similar to Counted B-tree (http://www.chiark.greenend.org.uk/~sgtatham/algorithms/cbtree.html), which is a B-tree with random access. The difference is that your container can't be used to search for a stored value in O(logN) without sorting and binary search. This may be ok if someone require only random access, the size of a node is smaller which probably improves the cacheing. In general, B-tree was introduced to improve the performance of the access to data stored in persistent memory (lower performance). In this case it was better to load the whole page into memory (and cache it) than load many small, scatered parts. But in the case of RAM-only-access, a binary search tree should be faster. As far as we don't consider spatial indexing, interval trees, etc. where values/nodes may "overlap" somehow and we must search in more than one subtree. Therefore I'm wondering if a classic Rope would be faster than your container in the use-case you're testing - simple insert and random access to a in-memory container. Have you test it? I'd perform 2 tests: 1. Very big number of values + truly random insert/access to prevent cacheing 2. insert/access in the same area IMHO the benefits related to the use of this container should be more noticable if the persistent storage were used. The times measured in benchmarks are very small ~e-04. You can't rely on such small values. I'd keep them above 1s. As for the name, it's definietly not an array. I'd call it B-rope. Regards, Adam
On Fri, Apr 04, 2014 at 02:42:51AM +0300, Alexander Kuprijanov wrote:
Hi, again, I've implemented a container which should replace vector and have O(log(N)) for insert/delete.
If it doesn't have contiguous storage, it's not a replacement for vector, however much you want it to be the universal container that everyone uses for everything. It may be a better general-purpose container compared to the usual default containers, deque and vector, but it isn't a replacement for either of them. You use the term 'intrusive' in the documentation, but it doesn't seem to carry the usual meaning that it requires the types involved to participate in the container structure. Your tests doesn't seem to exercise much else than container<int> either. I'm not quite sure I like the explicit L/M non-type template parameters either, considering that it leaks implementation detail into every single user of the type. As for the Copy-on-Write semantics, isn't the general consensus that CoW is a horrible horrible thing and is very hard to get right+fast in the modern world with concurrency? -- Lars Viklund | zao@acc.umu.se
On 04/04/2014 02:10 PM, Lars Viklund wrote:
As for the Copy-on-Write semantics, isn't the general consensus that CoW is a horrible horrible thing and is very hard to get right+fast in the modern world with concurrency?
On Fri, Apr 4, 2014 at 9:42 AM, Alexander Kuprijanov
Hi, again, I've implemented a container which should replace vector and have O(log(N)) for insert/delete. Please review. https://github.com/alexkupri/array/blob/master/description/description.txt The code is small, the description is verbose.
Could you please provide a figure that shows data association with nodes in your data structure? In the simplest form, it can be done as in counted B-trees already mentioned by Adam. For better representation you can have a look at the description of augmented red-black trees [1] and binary trees [2]. Regards, Vadim Stadnik [1] T.H. Cormen, C.E. Leiserson, R.L. Rivest and C. Stein. Introduction to Algorithms. 2nd Edition, MIT Press and McGraw-Hill, 2001. [2] R. Sedgewick and K. Wayne. Algorithms. 4th Edition, Addison-Wesley, 2011.
participants (6)
-
Adam Wulkiewicz
-
Alexander Kuprijanov
-
Bjorn Reese
-
Glen Fernandes
-
Lars Viklund
-
Vadim Stadnik