
On Thu, Feb 20, 2014 at 5:35 PM, Peter Dimov wrote:
DUPUIS Etienne wrote:
I second Andrey: whatever the type T is, it would be nice to have an easy way of specifying a stronger alignment constraint. This is particularly useful for allocating, for example, buffers of uint8_t on 16-byte or 64-byte boundaries for performance reasons.
You're perhaps missing a certain subtlety here and it is that the container/algorithm taking the allocator<T> may allocate objects other than T. It does this via rebind<>. And the question then becomes, do you want this 64 byte boundary to also apply to these additional allocations? The answer is often 'no' - you don't want deque<T>'s bookkeeping structures to be overaligned - but sometimes it's 'yes', if you passed allocator<float> but the function actually uses allocator<char>. And sometimes, as with list<T>, the answer is non-binary.
Yes, that is true. However, when I specify alignment in the allocator, what I explicitly care about is the alignment of the elements. For the most part I don't care what alignment the container uses for its internal structures, although I realize that this probably affects memory overhead. This can be perceived as a shortcoming of the current container interface: you can only specify one allocator, the one that is "supposed" to be used for the elements, and the container uses it for other purposes as well behind your back. Luckily, most containers only allocate structures that embed elements, so the alignment is justified. One notable exception is the unordered containers: these also have to allocate the bucket list, which need not be aligned as strictly as the elements.
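To make the rebind point concrete, here is a minimal sketch (not from the thread; logging_allocator, allocation_record and allocation_log are made-up names for illustration). It records the size and alignment of every type the container actually asks the allocator for. With std::list<char, ...> the requests are for the list's internal node type, not for char.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <new>
#include <vector>

// Hypothetical record of what a container asked the allocator for.
struct allocation_record {
    std::size_t object_size;
    std::size_t object_alignment;
};

// Global log; it uses std::allocator itself, so logging does not recurse.
inline std::vector<allocation_record>& allocation_log() {
    static std::vector<allocation_record> log;
    return log;
}

// Hypothetical minimal allocator that logs sizeof(T) and alignof(T) on each allocation.
template <class T>
struct logging_allocator {
    typedef T value_type;

    logging_allocator() noexcept {}
    template <class U>
    logging_allocator(const logging_allocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        allocation_log().push_back(allocation_record{sizeof(T), alignof(T)});
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const logging_allocator<T>&, const logging_allocator<U>&) noexcept { return true; }
template <class T, class U>
bool operator!=(const logging_allocator<T>&, const logging_allocator<U>&) noexcept { return false; }

// Returns true if the list requested storage for something larger than a
// single char, i.e. it went through a rebound allocator for its node type.
bool list_allocates_nodes_not_elements() {
    allocation_log().clear();
    std::list<char, logging_allocator<char> > chars;
    chars.push_back('x');
    for (const allocation_record& r : allocation_log()) {
        if (r.object_size > sizeof(char)) return true;
    }
    return false;
}
```

The exact node layout is an implementation detail of the standard library, so the logged sizes will differ between implementations; the only thing the sketch relies on is that a list node is bigger than one char.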
If the required alignment is equal to alignof(T), it all works - structures having T as a member will automatically receive an alignment at least as strict, and functions using allocator<char> instead of allocator<T> will take the necessary steps to std::align the resulting pointer at alignof(T).
Yes.
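As a rough illustration of the allocator<char> case, the sketch below carves suitably aligned storage for T out of a byte buffer with std::align. The helper names carve_aligned and carve_demo are hypothetical, and the buffer is deliberately offset by one byte so that the adjustment actually has something to do.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical helper: find storage for `count` objects of T inside a byte
// buffer. Returns nullptr if the buffer is too small after alignment.
// No objects are constructed; the result is raw, suitably aligned storage.
template <class T>
T* carve_aligned(char* buffer, std::size_t buffer_size, std::size_t count) {
    void* p = buffer;
    std::size_t space = buffer_size;
    if (std::align(alignof(T), count * sizeof(T), p, space) == nullptr) {
        return nullptr;
    }
    return static_cast<T*>(p);
}

bool carve_demo() {
    std::allocator<char> bytes;
    const std::size_t count = 8;
    // Payload, up to alignof(double) - 1 bytes of padding, plus the one byte
    // skipped below to misalign the start of the buffer on purpose.
    const std::size_t size = count * sizeof(double) + alignof(double);
    char* raw = bytes.allocate(size);

    double* storage = carve_aligned<double>(raw + 1, size - 1, count);
    const bool ok = storage != nullptr &&
        reinterpret_cast<std::uintptr_t>(storage) % alignof(double) == 0;

    bytes.deallocate(raw, size);
    return ok;
}
```

Note that this only works because the code knows to align at alignof(T); it has no way of learning about a stricter boundary that exists only as a property of the allocator.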
So if you allocate T = struct alignas(64) { char data[64]; }, it would avoid all these complications. This depends on the compiler providing proper support for overaligned types. But that's needed for __m128 and __m256, so I'd expect it to be there.
I think, current implementations (at least, those I have worked with) don't support this. I.e. if you do std::allocator< __m128 >::allocate(n), you're basically doing new __m128[n], and this does not align memory to 16 bytes. Well, it does on x86_64 Linux/Windows/OS X but simply because all memory allocations are 16-byte aligned on that architecture and not because alignof(__m128) == 16. I think, with C++03 this was justified by the fact that __m128 has non-standard alignment, which is not covered by the standard. Not sure if this changed in C++11 with introduction of alignas.
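For reference, C++11 calls an alignment greater than alignof(std::max_align_t) an extended alignment, and leaves it implementation-defined whether a plain new-expression supports over-aligned types. A small sketch of how one might check whether a type falls into that category (cache_line_block and has_extended_alignment are hypothetical names, not part of any library):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical over-aligned type, similar to the struct suggested above.
struct alignas(64) cache_line_block {
    char data[64];
};

// True if T needs more than the fundamental alignment, i.e. more than
// ordinary allocation functions are required to provide.
template <class T>
constexpr bool has_extended_alignment() {
    return alignof(T) > alignof(std::max_align_t);
}

static_assert(alignof(cache_line_block) == 64, "alignas(64) is reflected by alignof");
```

On platforms where alignof(std::max_align_t) is already 16, a 16-byte-aligned type would not count as over-aligned by this test, which matches the observation that it happens to work there.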
Not that aligned_allocator would not be useful; it would be, as long as it works. There's no guarantee that it will in all cases though.
Exactly. Basically, when you use aligned_allocator< __m128 > with a container, you have no guarantee that aligned_allocator< __m128 > will actually be used to allocate memory. I.e. if you want to allocate __m128 elements aligned to 64 bytes and specify:

    template< typename T >
    struct my_alignment_of : alignment_of< T > {};
    template< >
    struct my_alignment_of< __m128 > : mpl::int_< 64 > {};

    std::list< __m128, aligned_allocator< __m128, my_alignment_of > > d;

this will likely not work because std::list won't use my_alignment_of< __m128 > but instead some my_alignment_of< list_node< __m128 > >. To make this work you'd have to write some fake metafunction that just always returns 64, and this is equivalent to just specifying 64 in aligned_allocator template parameters, only more complicated.
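A minimal sketch of the "just specify 64" variant, to show why it survives rebind. This is not the proposed Boost interface; the aligned_allocator below is a hypothetical stand-in whose second parameter is a plain number, and it uses the classic over-allocate-and-store-the-raw-pointer technique so that it does not depend on any platform-specific aligned allocation function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <new>
#include <type_traits>
#include <vector>

// Hypothetical allocator: the alignment is a number, so every rebound
// instance carries the same value regardless of the type it allocates.
template <class T, std::size_t Alignment>
struct aligned_allocator {
    static_assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0,
                  "Alignment must be a power of two");
    static_assert(Alignment >= alignof(void*),
                  "the saved raw pointer must itself be aligned");

    typedef T value_type;

    // An explicit rebind is needed here: allocator_traits cannot synthesize
    // one for a template that has a non-type parameter.
    template <class U>
    struct rebind { typedef aligned_allocator<U, Alignment> other; };

    aligned_allocator() noexcept {}
    template <class U>
    aligned_allocator(const aligned_allocator<U, Alignment>&) noexcept {}

    T* allocate(std::size_t n) {
        // Never align less strictly than T itself requires.
        const std::size_t boundary = Alignment > alignof(T) ? Alignment : alignof(T);
        // Payload + worst-case padding + one slot for the raw pointer.
        const std::size_t bytes = n * sizeof(T) + boundary + sizeof(void*);
        char* raw = static_cast<char*>(::operator new(bytes));
        std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(raw) + sizeof(void*);
        addr = (addr + boundary - 1) & ~static_cast<std::uintptr_t>(boundary - 1);
        reinterpret_cast<void**>(addr)[-1] = raw;  // remembered for deallocate
        return reinterpret_cast<T*>(addr);
    }

    void deallocate(T* p, std::size_t) noexcept {
        ::operator delete(reinterpret_cast<void**>(p)[-1]);
    }
};

template <class T, class U, std::size_t A>
bool operator==(const aligned_allocator<T, A>&, const aligned_allocator<U, A>&) noexcept { return true; }
template <class T, class U, std::size_t A>
bool operator!=(const aligned_allocator<T, A>&, const aligned_allocator<U, A>&) noexcept { return false; }

// Hypothetical stand-in for a container's internal node type.
struct fake_list_node { void* prev; void* next; float value; };

static_assert(std::is_same<aligned_allocator<float, 64>::rebind<fake_list_node>::other,
                           aligned_allocator<fake_list_node, 64> >::value,
              "the alignment value survives rebind");

bool vector_storage_is_aligned() {
    std::vector<float, aligned_allocator<float, 64> > values(100, 1.0f);
    return reinterpret_cast<std::uintptr_t>(values.data()) % 64 == 0;
}
```

With std::vector the element storage itself lands on the 64-byte boundary. With std::list the same allocator would align each node, and the element sits at some offset inside the node, so the element is not necessarily on a 64-byte boundary; that is the "non-binary" case mentioned earlier in the thread.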