[optional] Specializing optional to save space
Andrzej posted another thread about creating a new compact_optional class.
The goal of this class is to make it easy to create a new optional type
with a special sentinel value. This allows easy customization per instance
of an optional value.
This post is about the opposite approach: making it easier to specialize
optional to provide special behavior for all instances of a given optional
type.
When I bring this up, people often compare it to the vector<bool>
specialization, but this is a completely different issue. The problem of
vector<bool> is that it does not actually store a bool anywhere, so it
cannot return a reference to one. This would still store a T somewhere.
For the rest of the discussion, I will be assuming a typical standard
library implementation on a 64-bit system. Whenever I talk about specific
sizes of objects, mentally insert "on most common systems". The principles
apply generally.
Consider the case of optional<std::string>. Using the naive implementation,
this object is 8 bytes larger than std::string, due to alignment
requirements. However, there are certain representations of std::string
that are not actually possible. For instance, for common implementations of
std::string you cannot have std::string::size() ==
std::numeric_limits<std::string::size_type>::max(). We can take advantage
of this if our std::string implementation stores the capacity as an integer
value, rather than a pointer to the end of the storage.
Therefore, we could have implementations that look something like this:
class string {
    // Grants the specialization access to the capacity field
    friend optional<string>;
};
template<>
class optional<std::string> {
public:
    optional(none_t) {
        // Use friendship to just set capacity to sentinel value
    }
    explicit operator bool() const {
        // Engaged exactly when the capacity is not the sentinel
        return m_data.capacity() !=
            std::numeric_limits<std::string::size_type>::max();
    }
};
optional<std::string> still stores a std::string in it, so operator*() works
just fine. The initialized check is still just comparing a field against a
compile-time constant. This makes this purely a space optimization that
cannot be detected by the user, unless they use
sizeof(optional<std::string>).
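To make the idea above concrete, here is a minimal sketch. The names toy_string, toy_optional and none_t are hypothetical stand-ins, not the real std::string or boost::optional, and the sketch assumes a string that stores its capacity as an integer. It also keeps a toy_string object alive at all times, which is only one possible reading of the proposal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <limits>
#include <utility>

struct none_t {};

template<typename T> class toy_optional;  // primary template omitted in this sketch

// Hypothetical string that stores its capacity as an integer.
class toy_string {
public:
    toy_string() = default;
    explicit toy_string(char const * text) {
        m_size = std::strlen(text);
        m_capacity = m_size;
        m_data = new char[m_size + 1];
        std::memcpy(m_data, text, m_size + 1);
    }
    toy_string(toy_string && other) noexcept:
        m_data(std::exchange(other.m_data, nullptr)),
        m_capacity(std::exchange(other.m_capacity, 0)),
        m_size(std::exchange(other.m_size, 0))
    {}
    ~toy_string() { delete[] m_data; }
    std::size_t size() const { return m_size; }
private:
    // The specialization below needs direct access to m_capacity.
    friend class toy_optional<toy_string>;
    char * m_data = nullptr;
    std::size_t m_capacity = 0;
    std::size_t m_size = 0;
};

template<>
class toy_optional<toy_string> {
public:
    // Disengaged: mark the state with a capacity no real string can have.
    toy_optional(none_t = {}) noexcept { m_value.m_capacity = sentinel; }
    toy_optional(toy_string value) noexcept: m_value(std::move(value)) {}
    // Engaged exactly when the capacity is not the sentinel.
    explicit operator bool() const noexcept { return m_value.m_capacity != sentinel; }
    toy_string & operator*() { assert(*this); return m_value; }
private:
    static constexpr std::size_t sentinel = std::numeric_limits<std::size_t>::max();
    toy_string m_value;
};

// The only observable difference: no extra bytes for an engaged flag.
static_assert(sizeof(toy_optional<toy_string>) == sizeof(toy_string), "no separate flag");
```

Because operator*() returns a reference to a real toy_string, the vector<bool> problem of having nothing to refer to does not arise in this sketch.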
I believe all of this is possible today with the current specification of
boost::optional (and would definitely be allowed for the proposed
std::experimental::optional). However, this requires the creator of each
specialization to implement the full optional interface. What's more, it is
especially tricky to specialize optional for a class template when you only
want to specialize on certain specializations of the class template.
My use case for this is my bounded::integer library. This library lets you
specify the bounds of your integer as template parameters. If your
bounded::integer has a narrower range than its underlying integer, you can
take advantage of this to make a more space-efficient optional. If,
however, the bounded::integer min and max are equal to that of the
underlying type, there is no extra space and we want the default optional
implementation. With the current specialization approach, it is the
responsibility of the specializer to reimplement all of optional.
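As a rough sketch of the selection problem described above, the following picks a compact layout only when the bounds leave a spare value. bounded_int, flag_storage, sentinel_storage and bounded_storage are hypothetical names for illustration, not part of bounded::integer or boost::optional.

```cpp
#include <cassert>
#include <limits>
#include <type_traits>

// Hypothetical stand-in for bounded::integer: an int restricted to [Min, Max].
template<int Min, int Max>
struct bounded_int {
    static_assert(Min <= Max, "range must not be empty");
    int value;
};

constexpr int int_min = std::numeric_limits<int>::min();
constexpr int int_max = std::numeric_limits<int>::max();

// True when at least one int value lies outside [Min, Max].
template<int Min, int Max>
struct has_spare_value:
    std::integral_constant<bool, (Min > int_min) || (Max < int_max)> {};

// Default layout: the value plus a separate engaged flag.
template<typename T>
struct flag_storage {
    T value{};
    bool engaged = false;
    bool is_initialized() const { return engaged; }
};

// Compact layout: one out-of-range int marks the empty state.
template<int Min, int Max>
struct sentinel_storage {
    static constexpr int sentinel = (Min > int_min) ? Min - 1 : Max + 1;
    int raw = sentinel;
    bool is_initialized() const { return raw != sentinel; }
};

// Choose the layout from the bounds, so only some specializations are compact.
template<int Min, int Max>
using bounded_storage = typename std::conditional<
    has_spare_value<Min, Max>::value,
    sentinel_storage<Min, Max>,
    flag_storage<bounded_int<Min, Max>>
>::type;

static_assert(sizeof(bounded_storage<0, 100>) == sizeof(int), "compact layout");
static_assert(sizeof(bounded_storage<int_min, int_max>) > sizeof(int), "flag layout");
```

The full-range case falls back to the flag layout automatically, which is the behavior that is awkward to get when specializing optional directly.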
A better approach to easily support this space optimization is as follows:
optional contains an instance of optional_storage
On 27 September 2015 at 15:03, David Stone wrote:
Andrzej posted another thread about creating a new compact_optional class. The goal of this class is to make it easy to create a new optional type with a special sentinel value. This allows easy customization per instance of an optional value.
This post is about the opposite approach: making it easier to specialize optional to provide special behavior for all instances of a given optional type.
When I bring this up, people often compare it to the vector<bool> specialization, but this is a completely different issue.
No, it's the same issue. It'll have subtly different behavior. Generic programming counts on the behavior being the same.
The only reason vector<bool> is bad is because of its spelling. Had it been a different name, most people would be happy.
optional<T> controls the lifetime of T. If you have a sentinel instead, then the lifetime of the engaged T is exactly the same as the lifetime of optional<T>.
If you want a type with different behavior, give it a different name.
-- Nevin ":-)" Liber mailto:nevin@eviloverlord.com (847) 691-1404
On 27.09.2015 23:38, Nevin Liber wrote:
On 27 September 2015 at 15:03, David Stone wrote:
Andrzej posted another thread about creating a new compact_optional class. The goal of this class is to make it easy to create a new optional type with a special sentinel value. This allows easy customization per instance of an optional value.
This post is about the opposite approach: making it easier to specialize optional to provide special behavior for all instances of a given optional type.
When I bring this up, people often compare it to the vector<bool> specialization, but this is a completely different issue.
No, it's the same issue. It'll have subtly different behavior. Generic programming counts on the behavior being the same.
The only reason vector<bool> is bad is because of its spelling. Had it been a different name, most people would be happy.
optional<T> controls the lifetime of T. If you have a sentinel instead, then the lifetime of the engaged T is exactly the same as the lifetime of optional<T>.
I got the impression that the OP intended that the specialized optional<> still controlled the lifetime of the adopted object the way it does now. The proposed change basically offloads engagement checking to a user-specializable trait but really nothing more than that.
On 27 September 2015 at 16:19, Andrey Semashev wrote:
On 27.09.2015 23:38, Nevin Liber wrote:
On 27 September 2015 at 15:03, David Stone wrote:
Andrzej posted another thread about creating a new compact_optional class.
The goal of this class is to make it easy to create a new optional type with a special sentinel value. This allows easy customization per instance of an optional value.
This post is about the opposite approach: making it easier to specialize optional to provide special behavior for all instances of a given optional type.
When I bring this up, people often compare it to the vector<bool> specialization, but this is a completely different issue.
No, it's the same issue. It'll have subtly different behavior. Generic programming counts on the behavior being the same.
The only reason vector<bool> is bad is because of its spelling. Had it been a different name, most people would be happy.
optional<T> controls the lifetime of T. If you have a sentinel instead, then the lifetime of the engaged T is exactly the same as the lifetime of optional<T>.
I got the impression that the OP intended that the specialized optional<> still controlled the lifetime of the adopted object the way it does now. The proposed change basically offloads engagement checking to a user-specializable trait but really nothing more than that.
His example was to use a specific value of std::string::size() as a sentinel. How does that work with his specialized optional if the specialization is orthogonal to lifetime control? Feel free to assume string and optional<string> are friends, as well as a specific implementation of std::string.
-- Nevin ":-)" Liber mailto:nevin@eviloverlord.com (847) 691-1404
On 28.09.2015 00:47, Nevin Liber wrote:
On 27 September 2015 at 16:19, Andrey Semashev wrote:
I got the impression that the OP intended that the specialized optional<> still controlled the lifetime of the adopted object the way it does now. The proposed change basically offloads engagement checking to a user-specializable trait but really nothing more than that.
His example was to use a specific value of std::string::size() as a sentinel. How does that work with his specialized optional if the specialization is orthogonal to lifetime control?
Excerpt from the OP:
For instance, for common implementations of std::string you cannot have std::string::size() == std::numeric_limits<std::string::size_type>::max(). We can take advantage of this if our std::string implementation stores the capacity as an integer value, rather than a pointer to the end of the storage.
From this I gather that the intention is to use the storage of the capacity member of std::string to hold the discriminator. I assume the std::string object must not be constructed in that state, since there is no way to create such a string through its interface. But maybe I'm seeing too much here - I'll let David clarify this himself.
class string {
public:
    string() = default;
    string & operator=(string && other) noexcept {
        // Assumes m_data was allocated with new[]
        delete[] m_data;
        m_data = other.m_data;
        other.m_data = nullptr;
        m_capacity = other.m_capacity;
        other.m_capacity = 0;
        m_size = other.m_size;
        other.m_size = 0;
        return *this;
    }
private:
    friend optional<string>;
    char * m_data = nullptr;
    size_t m_capacity = 0;
    size_t m_size = 0;
};
template<>
class optional_storage<string> {
    optional_storage() noexcept {
        reset();
    }
template
On 9/27/2015 11:47 PM, David Stone wrote:
emplace can call the move assignment operator after constructing a temporary, because that is identical to destruct + construct.
Emplace doesn't move assign. Emplace can't have temporaries. Move assigning is not at all identical to destruct + construct. All emplace can assume is that there exists a constructor that takes the given arguments via direct-non-list-initialization. This is fundamental, as emplacing allows putting into places stuff that is not even move assignable.
Regards,
-- Agustín K-ballo Bergé.- http://talesofcpp.fusionfenix.com
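The point about emplace can be illustrated with a short sketch. It uses C++17 std::optional as a stand-in for boost::optional, and pinned is a hypothetical type that is neither copyable nor movable; the counters exist only to show that re-emplacing destroys the old object and constructs a new one in place.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>

// Hypothetical type that can be neither copied nor moved.
struct pinned {
    static int constructed;
    static int destroyed;
    explicit pinned(int v): value(v) { ++constructed; }
    ~pinned() { ++destroyed; }
    pinned(pinned const &) = delete;
    pinned & operator=(pinned const &) = delete;
    int value;
};
int pinned::constructed = 0;
int pinned::destroyed = 0;

static_assert(!std::is_move_assignable<pinned>::value, "cannot be move assigned");
static_assert(!std::is_move_constructible<pinned>::value, "cannot be move constructed");

// emplace only needs a constructor matching the arguments.
int emplace_twice() {
    std::optional<pinned> slot;
    slot.emplace(1);   // constructs in place from the argument
    slot.emplace(2);   // destroys the first object, then constructs a new one
    return slot->value;
}
```

A generic optional therefore cannot implement emplace through move assignment; whether a specialization for one particular type may do so is the question the thread goes on to debate.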
That is true in general, but for the string class I outlined they are identical. That is why the specialization can exist.
On 28.09.2015 06:32, David Stone wrote:
That is true in general, but for the string class I outlined they are identical. That is why the specialization can exist.
David, please do keep the quote that you're replying to. It's difficult to follow the discussion without it. Please read the discussion policy: http://www.boost.org/community/policy.html#effective
On Sun, Sep 27, 2015 at 1:03 PM, David Stone wrote:
Andrzej posted another thread about creating a new compact_optional class. The goal of this class is to make it easy to create a new optional type with a special sentinel value. This allows easy customization per instance of an optional value.
This post is about the opposite approach: making it easier to specialize optional to provide special behavior for all instances of a given optional type.
I think this is a great idea; the fact that generic code will get the optimized version automatically is a strength, not a weakness, of this approach. It is critical, though, that the behavior is indeed identical in all respects to the unspecialized version, so that it is a pure optimization. It does seem though that much of the benefit from this comes from standard library integration, which of course won't be available if it is just part of Boost, so it may be hard to get significant user experience in support of standardization.
Andrzej posted another thread about creating a new compact_optional class. The goal of this class is to make it easy to create a new optional type with a special sentinel value. This allows easy customization per instance of an optional value.
This post is about the opposite approach: making it easier to specialize optional to provide special behavior for all instances of a given optional type.
I think this is a great idea; the fact that generic code will get the optimized version automatically is a strength, not a weakness, of this approach. It is critical, though, that the behavior is indeed identical in all respects to the unspecialized version, so that it is a pure optimization.
surely this just isn't possible?
before: i have an optional<int>, a, and i set its value to -1. assert(a) passes. after: i have an optional<int>, b, which uses the space-optimized enhancement, which internally uses the value -1 as its sentinel value: assert(b) fails.
this applies to any possible value of int -- so therefore somebody, somewhere will have a valid int that is an optional and some purely genetic code will get it wrong
On Mon, Sep 28, 2015 at 4:02 PM, Sam Kellett wrote:
this applies to any possible value of int -- so therefore somebody, somewhere will have a valid int that is an optional and some purely genetic code will get it wrong
er.. generic.. doh
This is more in line with the compact_optional proposal in the other thread. The specialization is only created for a value that cannot otherwise exist for your type. Any magic value of int is a valid int value, so there would not be a specialization of optional<int> that makes, say, -1 invalid. In this proposal, there is absolutely nothing the user can do other than sizeof that lets them tell the difference (and they won't run out of memory as quickly from allocating a bunch of them).
On Mon, Sep 28, 2015 at 2:50 PM, Sam Kellett wrote:
surely this just isn't possible?
before: i have an optional<int>, a, and i set its value to -1. assert(a) passes. after: i have an optional<int>, b, which uses the space-optimized enhancement, which internally uses the value -1 as its sentinel value: assert(b) fails.
this applies to any possible value of int -- so therefore somebody, somewhere will have a valid int that is an optional and some purely genetic code will get it wrong
Indeed, you wouldn't be able to define a space-saving specialization for builtin types that have no invalid representations. You could, however, define a custom type that wraps int and conveys the "not -1 semantics", which might be a good thing to do anyway, and then specialize optional on that. It would in most cases be possible (based on implementation details) to define specializations for standard library types like string, vector, etc., as David demonstrates.
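The wrapper suggestion can be sketched as follows. not_minus_one and optional_not_minus_one are hypothetical names; the second class stands in for what a specialization of optional on the wrapper might look like, written as a separate class here to keep the sketch self-contained.

```cpp
#include <cassert>

// Hypothetical wrapper: holds any int except -1, so -1 is never a valid value.
class not_minus_one {
public:
    explicit not_minus_one(int v): m_value(v) { assert(v != -1); }
    int get() const { return m_value; }
private:
    int m_value;
};

// Stand-in for a space-saving optional over the wrapper.
class optional_not_minus_one {
public:
    optional_not_minus_one() = default;
    optional_not_minus_one(not_minus_one v): m_raw(v.get()) {}
    // -1 can only mean "empty", because the wrapper forbids it as a value.
    explicit operator bool() const { return m_raw != -1; }
    not_minus_one operator*() const { assert(*this); return not_minus_one(m_raw); }
private:
    int m_raw = -1;
};

static_assert(sizeof(optional_not_minus_one) == sizeof(int), "no separate flag");
```

A plain optional<int> keeps its separate flag, so an int holding -1 stays engaged there; only the wrapper type, whose invariant excludes -1, gets the compact form.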
I have a new version that was inspired by Agustín K-ballo Bergé in the compact_optional thread. My new interface involves a special tag type used to 'unlock' access to particular functions. I have creatively named this type optional_tag. Only optional can construct an optional_tag. For a type to opt-in to a space-efficient representation, it needs the following member functions:
* T(optional_tag) constructs an uninitialized value
* initialize(optional_tag, T && ...) constructs an object when there may be one in existence already
* uninitialize(optional_tag) destroys the contained object
* is_initialized(optional_tag) checks whether the object is currently in an initialized state
By always requiring the optional_tag parameter, we do not limit any function signatures. This is why, for instance, we cannot use operator bool() as the test, because the type may want that operator for other reasons.
An advantage of this over some other possible methods of implementing it is that you can make it work with any type that can naturally support such a state. It does not add any requirements such as having a move constructor.
You can see a full code implementation of the idea at https://bitbucket.org/davidstone/bounded_integer/src/8c5e7567f0d8b3a04cc98142060a020b58b2a00f/bounded_integer/detail/optional/optional.hpp?at=default&fileviewer=file-view-default and for a class using the specialization: https://bitbucket.org/davidstone/bounded_integer/src/8c5e7567f0d8b3a04cc98142060a020b58b2a00f/bounded_integer/detail/class.hpp?at=default&fileviewer=file-view-default (lines 220 through 242)
The problem with my previous approach is that it is simply more work for the user. Rather than adding four member functions, the user must go into a new namespace and specialize a template. In practice, all specializations would have an in_place_t constructor that forwards all arguments to the underlying type. The optional_tag approach, on the other hand, can just use the underlying type's constructors directly. In my old approach, the user also has the responsibility of adding proper reference-qualified overloads of a value function. In the optional_tag approach, we already have the value so we do not have to pull it out. My old approach also required standardizing as part of the interface of optional two helper classes, only one of which the user is supposed to specialize (and sometimes delegate their specialization to the other).
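The four-function interface described above might look roughly like this. It is a sketch written for this discussion, not the code behind the links: tagged_optional and small_id are hypothetical names, and the fallback for types that do not opt in (a value plus a flag) is omitted.

```cpp
#include <cassert>
#include <utility>

template<typename T> class tagged_optional;

// Only tagged_optional can construct the tag, so only it can call the hooks.
class optional_tag {
    template<typename T> friend class tagged_optional;
    constexpr optional_tag() noexcept {}
};

// Hypothetical opt-in type: valid ids are 0..999, so -1 is free as a sentinel.
class small_id {
public:
    explicit small_id(int v): m_value(v) { assert(0 <= v && v < 1000); }
    int get() const { return m_value; }

    // The four hooks; each takes optional_tag so no ordinary signature is used up.
    explicit small_id(optional_tag) noexcept: m_value(sentinel) {}
    template<typename... Args>
    void initialize(optional_tag, Args && ... args) {
        *this = small_id(std::forward<Args>(args)...);
    }
    void uninitialize(optional_tag) noexcept { m_value = sentinel; }
    bool is_initialized(optional_tag) const noexcept { return m_value != sentinel; }
private:
    static constexpr int sentinel = -1;
    int m_value;
};

// Sketch of the optional side for opt-in types: it stores T directly.
template<typename T>
class tagged_optional {
public:
    tagged_optional() noexcept: m_value(optional_tag{}) {}
    template<typename... Args>
    void emplace(Args && ... args) {
        m_value.initialize(optional_tag{}, std::forward<Args>(args)...);
    }
    void reset() noexcept { m_value.uninitialize(optional_tag{}); }
    explicit operator bool() const noexcept { return m_value.is_initialized(optional_tag{}); }
    T & operator*() { assert(*this); return m_value; }
private:
    T m_value;
};

static_assert(sizeof(tagged_optional<small_id>) == sizeof(small_id), "no separate flag");
```

Since operator*() hands back the stored object itself, there is no separate value accessor for the opting-in type to write, which matches the stated advantage over the earlier specialization-based design.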
participants (6)
- Agustín K-ballo Bergé
- Andrey Semashev
- David Stone
- Jeremy Maitin-Shepard
- Nevin Liber
- Sam Kellett