Funny, I just came across this. Nearly two years ago I mentioned to Ion that I had flat unordered set/map/multiset/multimap containers and was under the impression he wasn't interested. I have a working implementation (albeit a bit old; looking at it now there are a few spots I could clean up, but it works) here: https://sourceforge.net/projects/unorderedflatse/files/?source=navbar, which I've used for some time now. I've found these containers to be wonderful, and they are among my most-used ones. I used open addressing with linear probing, so if you need an implementation to compare Treb Connell's against, perhaps this will come in handy.
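For readers unfamiliar with the technique: the sketch below shows the general idea of open addressing with linear probing, not the actual sourceforge implementation above. All names (`flat_set_sketch`, `slots_`) are illustrative, the slot layout via `std::optional` is an assumption for brevity, and erase is omitted (a real container needs tombstones or backward-shift deletion, plus rehashing on load).

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <vector>

// Sketch only: a flat set using open addressing with linear probing.
// An empty optional marks an empty slot; no erase, no rehash.
template <class Key>
class flat_set_sketch {
    std::vector<std::optional<Key>> slots_;
    std::size_t size_ = 0;

    std::size_t index_of(const Key& k) const {
        return std::hash<Key>{}(k) % slots_.size();
    }

public:
    explicit flat_set_sketch(std::size_t capacity = 16) : slots_(capacity) {}

    bool insert(const Key& k) {
        // Probe linearly from the hashed slot until we find the key or a hole.
        for (std::size_t i = 0, pos = index_of(k); i < slots_.size();
             ++i, pos = (pos + 1) % slots_.size()) {
            if (!slots_[pos]) { slots_[pos] = k; ++size_; return true; }
            if (*slots_[pos] == k) return false;  // already present
        }
        return false;  // table full (a real container would rehash here)
    }

    bool contains(const Key& k) const {
        for (std::size_t i = 0, pos = index_of(k); i < slots_.size();
             ++i, pos = (pos + 1) % slots_.size()) {
            if (!slots_[pos]) return false;  // hole: key cannot be further on
            if (*slots_[pos] == k) return true;
        }
        return false;
    }

    std::size_t size() const { return size_; }
};
```

Because all slots live in one contiguous vector, probing walks adjacent memory, which is where the cache friendliness discussed below comes from.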
Date: Sat, 19 Sep 2015 03:31:12 +0300 From: andrey.semashev@gmail.com To: boost@lists.boost.org Subject: Re: [boost] [Container] interest in a flat_unordered_map
On Fri, Sep 18, 2015 at 11:09 PM, Treb Connell wrote:

I'm hoping to port a flat hash map that I've written to boost.
*Performance (primary reason for use):* It uses memory and cache very effectively and ends up being *very fast*. It's hard to benchmark real-world situations, but in inserting, finding, erasing, and iterating over a dictionary it was up to six orders of magnitude faster than VS2013's unordered_map. This isn't a totally fair test, because it's one where the cache efficiency really shines.
I believe much of the benefit depends on the nature of the keys and values (e.g. ints vs. strings)? Is there a more detailed description of the tests and results, at different container sizes?
The most efficient implementation of iterators for this container *does not have constant-time traversal*. The implementation is a vector with elements scattered across it. To increment an iterator, I keep incrementing the internal vector iterator until I hit the next occupied element. In practice elements should be scattered evenly and will be close to one another, but in the worst case an increment is N-1. Additional memory or logic could be added to make this O(1), but I don't think that trade-off is worth it. In my applications of unordered_map I rarely iterate, and I would not want to penalize insert, find, or erase to make iteration faster.
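The skip-ahead increment described above can be sketched as follows. This is a guess at the shape of such an iterator, not the author's code: it assumes the backing vector stores slots as `std::optional` values, with an empty optional marking an empty slot, and the helper names (`skip_iterator`, `begin_of`) are invented for illustration.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Sketch: an iterator over a sparse vector that skips empty slots.
// Increment is O(1) amortized when elements are spread evenly, but
// O(capacity) in the worst case, hence the non-constant complexity.
template <class Key>
struct skip_iterator {
    const std::vector<std::optional<Key>>* slots;
    std::size_t pos;

    skip_iterator& operator++() {
        // Walk forward past empty slots until the next occupant or the end.
        do { ++pos; } while (pos < slots->size() && !(*slots)[pos]);
        return *this;
    }
    const Key& operator*() const { return *(*slots)[pos]; }
    bool operator!=(const skip_iterator& o) const { return pos != o.pos; }
};

template <class Key>
skip_iterator<Key> begin_of(const std::vector<std::optional<Key>>& v) {
    skip_iterator<Key> it{&v, 0};
    if (!v.empty() && !v[0]) ++it;  // land on the first occupied slot
    return it;
}
```

A decrement would mirror this by walking backwards, which is why the bidirectional category discussed below is a natural fit.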
Can the iterator tradeoff be made a compile time option?
One frequently used operation that generally amounts to iterating through elements is clear(). Do I understand correctly that its complexity can be O(N) where N > size()? I believe the same applies to range erase().
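To make the concern concrete: with elements scattered across the backing vector, clear() has to visit every slot to find (and destroy) the occupants, so its cost tracks the capacity rather than size(). A minimal sketch, again assuming the hypothetical `std::optional`-based slot layout from above:

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Sketch: clearing a sparse slot array touches every slot, occupied or not,
// so the work is proportional to capacity, not to the element count.
template <class Key>
std::size_t clear_slots(std::vector<std::optional<Key>>& slots) {
    std::size_t visited = 0;
    for (auto& s : slots) {
        s.reset();  // destroys the occupant if present, no-op otherwise
        ++visited;
    }
    return visited;  // == slots.size(), regardless of how many were occupied
}
```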
Also, is size() constant time?
I could make a new concept for this, or pay the cost of making it truly O(1), or, *since the common case is not O(N), would it be okay to consider these bidirectional iterators*?
I think the iterators should have bidirectional traversal category regardless of the choice on the complexity. It's the closest standard category that fits anyway.