On 2019-12-04 23:50, Peter Koch Larsen wrote:
On Wed, Dec 4, 2019 at 7:10 PM Andrey Semashev via Boost wrote:
On 2019-12-04 19:05, Peter Dimov via Boost wrote:
The idea here is that you win one byte by reusing the last byte of the storage as the size, overlapping it with the null terminator in the size() == N case (because capacity - size becomes 0).
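A minimal sketch of the layout described above, for illustration only. The class name tail_size_string and its members are hypothetical and are not taken from the proposed fixed_string; the sketch assumes N <= 255 so that the remaining capacity fits in one byte.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical illustration, not the proposed fixed_string.
// Storage is N characters plus one trailing byte that holds the
// remaining capacity (N - size). When the string is full, that byte
// is 0 and therefore doubles as the null terminator.
template <std::size_t N>
class tail_size_string {
    static_assert(N <= 255, "remaining capacity must fit in one byte");
    char buf_[N + 1];

public:
    tail_size_string() {
        buf_[0] = '\0';
        buf_[N] = static_cast<char>(N);  // empty string: N bytes free
    }

    // size() has to read the last byte and subtract.
    std::size_t size() const {
        return N - static_cast<unsigned char>(buf_[N]);
    }

    const char* c_str() const { return buf_; }

    // Returns false and leaves the string unchanged if s does not fit.
    bool assign(const char* s) {
        const std::size_t len = std::strlen(s);
        if (len > N) return false;
        std::memcpy(buf_, s, len);
        buf_[len] = '\0';
        // For len == N this writes 0 over the terminator just stored,
        // so the same byte serves both purposes.
        buf_[N] = static_cast<char>(N - len);
        return true;
    }
};

// Small usage check for the full-capacity case.
inline void tail_size_demo() {
    tail_size_string<3> s;
    assert(s.assign("abc"));
    assert(s.size() == 3);
    assert(std::strlen(s.c_str()) == 3);
}
```

On common ABIs sizeof(tail_size_string<N>) is N + 1, which is the one-byte saving being discussed.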
I'm not sure this would actually be beneficial in terms of performance. Ignoring the fact that size() becomes more expensive, and this is a relatively frequently used function, you also have to access the tail of the storage, which is likely on a different cache line than the beginning of the string. It is more likely that the user will want to process the string in the forward direction, possibly not until the end (think comparison operators, copy/assignment, for instance). If the string is not close to full capacity, you would only fetch the tail cache line to get the string size.
It is for this reason that placing any auxiliary members like size before the storage array is preferable. Of course, if you prefer memory size over speed, placing size in the last byte is preferable.
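The cache-line argument can be sketched as follows. This is only an illustration under stated assumptions: a 64-byte cache line, an object that starts on a line boundary, and hypothetical names (size_first_string, line_of, same_contents) that do not come from the library under review.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumption: 64-byte cache lines, object aligned to a line boundary.
constexpr std::size_t kAssumedCacheLine = 64;

// Index of the cache line (relative to the object start) that holds
// the byte at the given offset.
constexpr std::size_t line_of(std::size_t offset) {
    return offset / kAssumedCacheLine;
}

// Hypothetical layout with the size stored before the characters,
// so size() and the first characters share a cache line.
template <std::size_t N>
struct size_first_string {
    static_assert(N <= 255, "size must fit in one byte in this sketch");
    unsigned char size_ = 0;
    char data_[N + 1] = {};

    std::size_t size() const { return size_; }

    // Returns false and leaves the string unchanged if s does not fit.
    bool assign(const char* s) {
        const std::size_t len = std::strlen(s);
        if (len > N) return false;
        std::memcpy(data_, s, len);
        data_[len] = '\0';
        size_ = static_cast<unsigned char>(len);
        return true;
    }
};

// Equality touches the size and then only the bytes that are in use;
// the tail of the storage is never read for short strings.
template <std::size_t N>
bool same_contents(const size_first_string<N>& a,
                   const size_first_string<N>& b) {
    return a.size_ == b.size_ &&
           std::memcmp(a.data_, b.data_, a.size_) == 0;
}

// With N = 200, a size byte stored at offset 200 would sit three
// lines away from the start, while offset 0 is on the first line.
inline void cache_line_demo() {
    assert(line_of(0) == 0);
    assert(line_of(200) == 3);
}
```

Whether the extra cache-line fetch is measurable depends on the workload; the sketch only shows where the bytes would live under the assumptions above.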
There is another reason to place the size (or rather the free size) at the end of the data: C compatibility. I have a similar class named fc_string where, for a fixed_string of N chars, I use one extra char for the free space. This character steps in and becomes the null terminator when the string is full, so I use no extra space in any case. If (for char-based arrays) more than 255 bytes are used, I store 255 at the end and the free size in the characters below the last one.
C compatibility beyond zero termination of strings is non-existent. You have that special convention, and that is fine, but that convention is not standard and only you know and follow it. No C function would be able to use that extra information without explicit support. Your special use case does not make an argument for designing a general utility like fixed_string.
I doubt that caching matters (much) for performance.
It all depends on the use case, of course, but memory bandwidth is the main bottleneck in modern systems. From the space standpoint, there is little difference between N and N+1 or even N+4 or N+8 bytes for a fixed_string<N> object. Given this, it is preferable to choose a data layout that is more efficient in terms of memory accesses and computational complexity in typical use.
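To put rough numbers on the space side of that trade-off, here is a small sketch that compares the per-object overhead of three candidate layouts. The struct names and the overhead() helper are hypothetical; actual sizes depend on the platform's alignment and padding rules.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical layouts for a string of up to N characters.
template <std::size_t N>
struct tail_layout {         // free size overlaps the terminator
    char buf[N + 1];
};

template <std::size_t N>
struct byte_prefix_layout {  // one-byte size before the storage
    unsigned char size;
    char buf[N + 1];
};

template <std::size_t N>
struct word_prefix_layout {  // std::size_t size before the storage
    std::size_t size;
    char buf[N + 1];
};

// Bytes used beyond the N characters themselves.
template <template <std::size_t> class Layout, std::size_t N>
constexpr std::size_t overhead() {
    return sizeof(Layout<N>) - N;
}

// The word-sized prefix costs at least sizeof(std::size_t) + 1 bytes,
// possibly more once padding is added.
inline void overhead_demo() {
    assert((overhead<word_prefix_layout, 64>() >=
            sizeof(std::size_t) + 1));
}
```

On a typical 64-bit ABI the three overheads for N = 64 would be 1, 2 and 16 bytes (the last one includes padding), which is small relative to the 64 characters of storage.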