On Fri, Sep 18, 2015 at 4:09 PM, Treb Connell
I'm hoping to port a flat hash map that I've written to boost. Since boost has a flat_map I think it'd be good to have a flat_unordered_map as well, especially since there's an opportunity to be much more efficient. The one I would like to use is based on the hash map used in Lua, which is a mix of Coalesced and Robin Hood hashes. I don't think there's a name that describes it yet unfortunately, but I describe it near the end of this post.
I would welcome such a data structure into boost. I just wonder if flat_unordererd_map is a good name.
*Performance (primary reason for use):* It uses memory and cache very effectively and ends up being *very fast*. It's hard to benchmark real world situations, but in inserting, finding, erasing, and iterating over a dictionary it was up to six orders of magnitude faster than VS2013's unordered_map. This isn't a totally fair test because the cache efficiency really shines.
*Standards compliance:* Differences from std::unordered_map are widely the same differences as flat_map listed here http://www.boost.org/doc/libs/1_59_0/doc/html/container/non_standard_contain.... Also bucket related methods no longer apply, so will be removed. *That said there's one major standards compliance issue that I'm struggling with:*
The most optimal implementation of iterators for this container *does not have constant time traversal*. The implementation is a vector with elements scattered across it. To increment an iterator I continue to increment the internal vector iterator until I hit a new element. In practice elements should be scattered evenly, and will be close to one another, but in the worst case this is N-1. Additional memory or logic could be added to make this O(1), but I don't think that trade is worth it. In my applications of unordered_map I rarely iterate, and would not want to penalize insert, find, or erase to make iteration faster.
I could make a new concept for this, pay the cost for making it truly O(1), or *since the common case is not O(N) would it be okay to consider these bidirectional iterators*?
How about letting the user decide and making the iterator behavior choice a template boolean argument, and select the appropriate implementation based on that (e.g like many container options are done in Boost.Intrusive)? <snip> -- François