[poly_collection] User-defined polymorphism model
Hi Joaquín and all reviewers,
Before we reach the end of the review period, I wanted to comment an
(IMHO) interesting feature used in the internal design of
PolyCollection. The library offers three of polymorphism models:
base-derived, boost::any and signature-sharing callable types.
Under the hood, the library abstracts those polymorphic models into a
generic class in the detail namespace:
template
Em 14 may 2017, às 1:32, Ion Gaztañaga
escreveu: Hi Joaquín and all reviewers,
Before we reach the end of the review period, I wanted to comment an (IMHO) interesting feature used in the internal design of PolyCollection. The library offers three of polymorphism models: base-derived, boost::any and signature-sharing callable types.
Under the hood, the library abstracts those polymorphic models into a generic class in the detail namespace:
[...]
I have several questions on this polymorphism model.
1) Wouldn't be a good idea to make this model-based polymorphism (or a refined one) public? Maybe the current model needs more work to support additional polymorphism models. Of course requirements for each operation in a model should be detailed. A correct abstraction to define user-provided polymorphism models is hard but this seems really a great feature that makes this library more valuable for users.
I'd be really conservative here and withhold such a big move until real use cases are found. The problem is once this is made public the design of the lib becomes effectively frozen.
2) If understand this model abstraction correctly, each Model chooses the segment type to store values (it could be any sequence type, I guess). The segment_backend abstraction does not only abstract ConcreteTypes from value_types (using void pointers that will be casted to real types in the implementation of the virtual function), but also the segment type (the sequence holding the concrete values).
Current implementation uses two segment types: split_segment (stores Concrete types and generic value_type's in two separate vectors) and packed_segment (stores ConcreteTypes which already contain value_type's as a subobject). function_collection and any_collection use the first segment, base_collection uses the second.
Why are different segment types needed? Can't we store value_types and ConcreteTypes in a struct and use only one type of segment? If this is possible, I guess iteration performance would be improved for function_collection and any_collection. Model definition could be simplified.
I tried using packed_segment with function_collection and any_collection, profiled and found split_segment is faster: basically, it gets you more values per cache line.
On the other hand, if the sequence type can be left out of the model definition, the underlying sequence type (change vector with deque, small vector or any other) could be made configurable. Would this be useful for users?
This was also proposed by Thorsten, please see my answer there.
3) I understand that caching a base_sentinel in each segment is an optimization to avoid a virtual function call when iterating with base_iterators. Is this assumption correct?
Yes, this is it.
Why is a different base_sentinel type needed to mark the end of each segment? Wouldn't a "base_iterator" pointing to that on-past-the end position suffice? Is an optimization (maybe base_sentinel is cheaper) or a must have?
An optimization. base_iterator would do. Joaquín M López Muñoz
On 14/05/2017 9:51, Joaquín M López Muñoz via Boost wrote:
I have several questions on this polymorphism model.
1) Wouldn't be a good idea to make this model-based polymorphism (or a refined one) public? Maybe the current model needs more work to support additional polymorphism models. Of course requirements for each operation in a model should be detailed. A correct abstraction to define user-provided polymorphism models is hard but this seems really a great feature that makes this library more valuable for users.
I'd be really conservative here and withhold such a big move until real use cases are found. The problem is once this is made public the design of the lib becomes effectively frozen.
Right. We'd need more to test more polymorphism models in order to deduce a more generic model. Noted for a future improvement.
Current implementation uses two segment types: split_segment (stores Concrete types and generic value_type's in two separate vectors) and packed_segment (stores ConcreteTypes which already contain value_type's as a subobject). function_collection and any_collection use the first segment, base_collection uses the second.
Why are different segment types needed? Can't we store value_types and ConcreteTypes in a struct and use only one type of segment? If this is possible, I guess iteration performance would be improved for function_collection and any_collection. Model definition could be simplified.
I tried using packed_segment with function_collection and any_collection, profiled and found split_segment is faster: basically, it gets you more values per cache line.
Ok, now I see it. Since poly_collection optimizes processing and not space or other operations, split_segment seems a good choice as objects are tightly packed and all additional data is stored in another segment. packed_segment is an optimization when value_type is already stored inside the object (like a base class). Makes sense.
On the other hand, if the sequence type can be left out of the model definition, the underlying sequence type (change vector with deque, small vector or any other) could be made configurable. Would this be useful for users?
This was also proposed by Thorsten, please see my answer there.
Right. Maybe the least intrusive approach is to require element contiguity so something like small-vector could be used. But I agree this is not an essential feature. Best, Ion
participants (2)
-
Ion Gaztañaga
-
Joaquín M López Muñoz