Hey everyone, I figured I'd take some time to talk about Unordered and where it's at. At the time of writing, Unordered's develop branch has been updated to include a new container: `boost::concurrent_flat_map`. A high-level description of the container and its algorithms can be found here: https://www.boost.org/doc/libs/develop/libs/unordered/doc/html/unordered.htm... It's been an honor and a privilege to develop alongside our own Joaquín M López Muñoz and Peter Dimov. While the advancements done to the algorithms themselves are note-worthy, what's not often talked about is taming implementation complexity. After all, algorithms don't mean much until they're translated to code. Boost.Unordered offers several different containers, all each a similar flavor of one another. The broad categorization is closed-addressing containerss, open-addressing containers and finally the concurrent containers, which are in actuality a subset of the open-addressing ones. The closed-adressing (CA) containers are the most straight-forward to abstract over as an `unordered_map` and `unordered_multimap` only differ in terms of how insertion is performed. Internally, the CA containers simply wrap a `table` struct that's templated over a Types policy-like which is used for pulling relevant typedefs from. Though I must confess, the `table` itself does contain myriad member functions such as `equals_unique` and `equal_equiv` which are used when comparing the tables themselves via `operator==`. The open-addressing (OA) containers are spiritually similar to the CA containers in that they wrap a simple common table type with a type policy template parameter used for pulling typedefs from in the relevant spots. But what makes the OA containers a good bit more nuanced is that there's both a `unordered_flat_map` and a `unordered_node_map`. The implementation complexity increases a good amount here. The CA containers all operate in terms of a simple `node`-like type that contains the appropriate `value_type` but the OA containers must abstract away a good bit more. The flat OA containers store their `value_type`s directly in a contiguous allocation so that lookup operations benefit from spatial locality in addition to being much more simple to manage from an implementation perspective. To support `unordered_node_map`, we have to update the type that's stored to be a `unique_ptr`-like. This essentially further extends the type policies already used by the containers to abstract over the storage type, the actual value type and how those values are accessed from a pointer to the storage type. This includes construction, destruction and dereferencing. The concurrent map adds yet another layer of abstraction but this time, in a much different spot and required a much larger holistic refactor of the codebase. The fundamental differences in container types require abstracting over how the table reads and inserts elements itself so we introduce an extra table layer. Previously there was but a single table type that supported all of the OA containers but now there's two table types that instead wrap a common `table_core` which contains common table primitives. The `concurrent_table` is responsible for managing its own locking schemes and then dispatches internally to its `table_core` and similarly for the `table` used by the non-concurrent containers. The OA containers are essentially implemented in 3 layers with the `table_core` being the lowest layer and the user-facing containers as the highest, with the specific middle layer being responsible for managing atomicity for the concurrent map and for providing things like iterators for the flat/node containers (the concurrent map offers no iterators). On paper it sounds relatively complex but in practice, navigating the code with this kind of foresight is easy and straight-forward, even more so now that GitHub's UI allows you to view and jump to struct definitions. Taming such implementation complexity into a neat and organized set of traits was definitely something beyond my skilset so I'm happy Joaquin was able to navigate such waters and I have a humbled understanding of policy-based design and how powerful it can be when applied properly. There's a couple of other things to note about the implementation that Boost authors may like. One is that we do allocator-aware constrution of a stack-local in emplace(). This is so that we can avoid a heap-allocation when doing lookup in the node-based OA containers. Originally, we actually didn't allocator-construct the stack-local but this turned out to be wholly incorrect because we'd move the stack-local to the heap when the key wasn't present in the container. This could lead to subtle bugs like not passing the memory resource when a user is using `std::pmr`. Another interesting aspect is that we recently updated the `erase(const_iterator)` method of the OA containers to return a proxy type. For performance reasons, the signature was originally `void erase(const_iterator)` but users found this cumbersome and annoying to work with as they'd often times swap containers entirely via a typedef. We now return a proxy type that lazily increments the internal iterator via implicit conversion to the container's iterator type. To this end, we're able to essentially have our cake and eat it too at the cost of making always-auto subtly confusing. I'm more than happy to talk about any other implementation strategies or answer questions if anyone is interested in the nitty gritty and can offer potential optimizations. - Christian
participants (1)
-
Christian Mazakas