Re: [boost] [atomic] Generalized read-modify-write operation

2 Sep 2015

On 02.09.2015 02:20, Gavin Lambert wrote:
...
On 1/09/2015 21:35, Andrey Semashev wrote:
...
I was wondering if there would be interest in a generalized
read-modify-write operation in Boost.Atomic. The idea is to have an
atomic<T>::read_modify_write method that would take a function that
would perform the modify on the atomic value. The interface could be
roughly this:
[...]
Does this look interesting to anyone? Comments?
It's not strictly necessary since it's equivalent to a do { r =
modify(n); } while (!var.compare_exchange_weak(n, r, order)) loop, which
isn't that much more typing. (Although I elided the initial load, so
it's a bit more typing than it appears here.)
Yes, correct. r and n also have to be declared outside the loop, which 
is kind of annoying.
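
For reference, here is a minimal sketch of that loop written out in full, with the initial load and both declarations included. It uses std::atomic rather than boost::atomic, and the names modify and update_with_cas_loop are hypothetical, chosen only for illustration:

```cpp
#include <atomic>

// Hypothetical modify step, used only for illustration.
inline unsigned modify(unsigned n) { return n * 2u + 1u; }

// Hypothetical wrapper around the hand-written CAS loop discussed above.
inline unsigned update_with_cas_loop(
    std::atomic<unsigned>& var,
    std::memory_order order = std::memory_order_seq_cst)
{
    unsigned n = var.load(std::memory_order_relaxed); // the initial load
    unsigned r;                                       // declared outside the loop
    do {
        r = modify(n);
        // On failure, compare_exchange_weak writes the current value into n,
        // so the next iteration recomputes r from fresh input.
    } while (!var.compare_exchange_weak(n, r, order));
    return r; // the value that was stored
}
```

Note that modify may run more than once if the exchange fails, so it should be free of side effects.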
...
(And the above is also supposed to use LL/SC on architectures where this
is cheaper than CAS, although I'm not sure if this is the case.)
I'm not sure there are architectures that implement both CAS and LL/SC 
instructions; at least I'm not aware of any. On the architectures that 
support LL/SC, the instructions will be used to implement 
compare_exchange_weak. The modify function in this CAS loop will not be 
executed within the LL/SC region, which is why the additional load 
before the loop is required. The CAS can also fail, in which case the 
modify function has to be executed again.

There is another point to consider. A compare_exchange_weak/strong 
operation on an LL/SC architecture is more complex than what is required 
to implement a simple RMW operation. For example, let's look at the 
Boost.Atomic code for ARM. Here is fetch_add, which can be used as a 
prototype of what could be done with a generic RMW operation:

   "1:\n"
   "ldrex   %[original], %[storage]\n"
   "add     %[result], %[original], %[value]\n" // modify
   "strex   %[tmp], %[result], %[storage]\n"
   "teq     %[tmp], #0\n"
   "bne     1b\n"

Frankly, I'd like to be able to generate code like this for operations 
other than those defined by the standard atomic<> interface.

And here is CAS (weak):

   "mov %[success], #0\n"
   "ldrex %[original], %[storage]\n"
   "cmp %[original], %[expected]\n"
   "itt eq\n"
   "strexeq %[success], %[desired], %[storage]\n"
   "eoreq %[success], %[success], #1\n"

To this we will also have to add the load, the modify and the loop.
...
But provided that the code generated is no worse than this (in
particular that the compiler can inline the function call in optimised
builds, at least where the function is relatively simple) then it would
still be useful, particularly for people who aren't used to using weak
exchange loops.
I certainly hope the compiler will be able to inline the modify 
function. If it doesn't, for relatively simple functions, then it 
probably isn't worth it.
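
To illustrate the inlining point, here is a rough sketch of such an operation emulated on top of compare_exchange_weak. This is not the interface proposed earlier in the thread: the name rmw_via_cas, its signature, and the choice to return the previous value (as fetch_add does) are all assumptions, and it uses std::atomic instead of boost::atomic. Because the modify function is a template parameter, the compiler sees its body at the call site and has a fair chance of inlining it:

```cpp
#include <atomic>

// Hypothetical helper, for illustration only. Applies modify to the stored
// value atomically and returns the value observed before the modification.
template <typename T, typename F>
T rmw_via_cas(std::atomic<T>& storage, F modify,
              std::memory_order order = std::memory_order_seq_cst)
{
    T original = storage.load(std::memory_order_relaxed);
    T result = modify(original);
    // A failed exchange reloads original, so the result is recomputed.
    while (!storage.compare_exchange_weak(original, result, order))
        result = modify(original);
    return original;
}
```

A call could look like rmw_via_cas(value, [](int x) { return x ^ 3; }). This is still a CAS loop underneath, so on an LL/SC architecture it would not produce the tight ldrex/strex sequence shown for fetch_add.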