Andrey Semashev wrote:
While we're on the subject, on what architectures would opaque_sub be more efficient than sub_and_test?
On x86 with gcc < 7, opaque_sub allows the use of "lock sub" or "lock dec" without setting a bool according to the zero flag, i.e. it saves a register and an instruction.
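
To make the difference concrete, here is a rough sketch of the two operations as GCC-style inline asm on x86. This is not the actual Boost.Atomic code: the free-function signatures, the operand names and the "returns true if the result is non-zero" convention are assumptions made for illustration. Without flag outputs the test needs an explicit `setnz` into a byte register; with `__GCC_ASM_FLAG_OUTPUTS__` defined, the flag can be returned from the asm statement directly. A builtin-based fallback is included only so the sketch also compiles on other targets.

```cpp
#include <cstdint>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))

// Hypothetical opaque_sub: subtracts v from storage, returns nothing.
inline void opaque_sub(std::uint32_t& storage, std::uint32_t v) noexcept
{
    __asm__ __volatile__(
        "lock; subl %[arg], %[dst]"
        : [dst] "+m" (storage)
        : [arg] "ir" (v)
        : "cc", "memory");
}

// Hypothetical sub_and_test: subtracts v from storage and returns true
// if the result is non-zero (convention assumed for this sketch).
inline bool sub_and_test(std::uint32_t& storage, std::uint32_t v) noexcept
{
    bool res;
#if defined(__GCC_ASM_FLAG_OUTPUTS__)
    // The "not zero" condition is returned straight from the flags.
    __asm__ __volatile__(
        "lock; subl %[arg], %[dst]"
        : [dst] "+m" (storage), [res] "=@ccnz" (res)
        : [arg] "ir" (v)
        : "memory");
#else
    // No flag outputs: one extra instruction and a byte register.
    __asm__ __volatile__(
        "lock; subl %[arg], %[dst]\n\t"
        "setnz %[res]"
        : [dst] "+m" (storage), [res] "=q" (res)
        : [arg] "ir" (v)
        : "cc", "memory");
#endif
    return res;
}

#else

// Fallback for other targets, using the GCC/Clang __atomic builtins.
inline void opaque_sub(std::uint32_t& storage, std::uint32_t v) noexcept
{
    (void)__atomic_fetch_sub(&storage, v, __ATOMIC_SEQ_CST);
}

inline bool sub_and_test(std::uint32_t& storage, std::uint32_t v) noexcept
{
    return __atomic_sub_fetch(&storage, v, __ATOMIC_SEQ_CST) != 0u;
}

#endif
```

The only difference between the two x86 paths is the `setnz` and the register it writes to, which is the cost being discussed.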
Right, thanks. I was thinking that testing for zero came for free, but it's not (entirely) free, for the reason you give. Does this actually matter in practice? I would expect the atomic to dominate the `set(n)z al`.
Gcc 7 introduced the ability to return flags from the asm statement, so the code can be written the same way. However, I noticed that the compiler tends to save the flag into a register early unless it is tested immediately, so in some cases opaque_sub might still be preferable.
Don't see how opaque_sub could be preferable if you need to test the flag later. :-) Presumably, if you just call the function and discard the return value - the equivalent of opaque_ - the compiler would be smart enough to not save the flag. I remember some compilers being smart enough to notice that you don't use the result of the atomic fetch_op intrinsic and generating the `lock op` themselves, without a separate opaque_op being needed. We can't do that on the library level, of course.
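
For illustration, the "discard the return value" case looks like this with plain `std::atomic`. The function names are made up for the example. Whether the first function compiles to a bare `lock sub`/`lock dec` rather than `lock xadd` depends on the compiler and its version, so the generated assembly needs to be checked rather than assumed.

```cpp
#include <atomic>
#include <cstdint>

// Result discarded: the equivalent of an opaque_ operation. The compiler
// is free to pick an instruction that does not return the old value.
inline void decrement_ignoring_result(std::atomic<std::uint32_t>& counter) noexcept
{
    counter.fetch_sub(1u, std::memory_order_acq_rel);
}

// Result used: returns true if this call brought the counter to zero,
// i.e. the value before the decrement was 1.
inline bool decrement_and_check_zero(std::atomic<std::uint32_t>& counter) noexcept
{
    return counter.fetch_sub(1u, std::memory_order_acq_rel) == 1u;
}
```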