[atomic] Support for specialized instructions
Hi,
I've been contemplating the idea of adding support for some specialized atomic instructions to Boost.Atomic. Since I'm only familiar with x86 architecture, my current candidates are:
* Increment/decrement with a check for zero result (lock inc/dec). The decrement is the most essential of the two because it would be useful for reference counters of various kinds. The increment is mostly beneficial in terms of code size, and it probably doesn't need the check for zero in most cases.
* In-place logical operations (lock and/or/xor). These are useful in cases when the code needs to operate on flags or bit sets but doesn't need the result of the operation. Optional variants can be provided with a test for zero result.
* Bit set/reset operations (lock bts/btr). The use cases are similar to the logical operations but the advantage is that the previous value of the altered bit can be returned. BTW, these operations are needed in Boost.Sync to implement mutexes on Windows.
All these operations can be implemented through the standard atomic<> interface, and that's what would be done on platforms without support for the specialized instructions.
For now the idea is to add special member functions to atomic<>, when used with arithmetic types. I was thinking about providing free functions as well, although I think that the complete set of the standard atomic functions would be required for that.
Comments? Opinions?
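For illustration only, the portable fallback mentioned in the message above (building the operations on the standard atomic<> interface) might look roughly like the sketch below. It uses std::atomic rather than boost::atomic, and the names decrement_and_test, or_in_place, bit_test_and_set and bit_test_and_reset are hypothetical; they are not taken from the thread or from any library.

```cpp
#include <atomic>
#include <cassert>

// All function names here are hypothetical, chosen only for this sketch.

// Decrement and report whether the result is zero
// (the information `lock dec` leaves in the zero flag on x86).
inline bool decrement_and_test(std::atomic<unsigned>& a,
                               std::memory_order order = std::memory_order_acq_rel)
{
    // fetch_sub returns the previous value, so the new value is zero
    // exactly when the previous value was 1.
    return a.fetch_sub(1u, order) == 1u;
}

// In-place OR for callers that do not need the result (`lock or`).
inline void or_in_place(std::atomic<unsigned>& a, unsigned mask,
                        std::memory_order order = std::memory_order_acq_rel)
{
    (void)a.fetch_or(mask, order);
}

// Set one bit and return its previous state (`lock bts`).
// Precondition: bit is smaller than the number of bits in unsigned.
inline bool bit_test_and_set(std::atomic<unsigned>& a, unsigned bit,
                             std::memory_order order = std::memory_order_acq_rel)
{
    const unsigned mask = 1u << bit;
    return (a.fetch_or(mask, order) & mask) != 0u;
}

// Clear one bit and return its previous state (`lock btr`).
// Precondition: bit is smaller than the number of bits in unsigned.
inline bool bit_test_and_reset(std::atomic<unsigned>& a, unsigned bit,
                               std::memory_order order = std::memory_order_acq_rel)
{
    const unsigned mask = 1u << bit;
    return (a.fetch_and(~mask, order) & mask) != 0u;
}

// Small usage example: a reference count dropping from 2 to 0.
inline void reference_count_example()
{
    std::atomic<unsigned> refs(2u);
    assert(!decrement_and_test(refs)); // 2 -> 1, not yet zero
    assert(decrement_and_test(refs));  // 1 -> 0, last reference released
}
```

Whether a compiler turns these calls into the specialized x86 instructions is not guaranteed; the sketch only shows that the semantics can be expressed with the standard interface.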
hi andrey,
hmmm, i see your points ... however in a way, i'd prefer if this functionality could be built upon boost.atomic, mainly to keep API compatibility with std::atomics.
with compiler-support for c++11 atomics, the first two instructions could probably be generated by a smart compiler ...
bts/btr may be useful, but i suppose they are rather specific to x86? i wonder, how would they map to arm?
but, yes ... if you think it is reasonable to add them, please go ahead!
cheers, tim
On Thu, Nov 21, 2013 at 1:04 PM, Tim Blechmann wrote:
hmmm, i see your points ... however in a way, i'd prefer if this functionality could be built upon boost.atomic, mainly to keep API compatibility with std::atomics.
I suppose, the extensions could be made just as functions. What I don't like about this approach is the need to cast pointers to atomic<> to pointers to the underlying integers. We have a few places in Boost.Sync with such code and it really bothers me. Do you think we could at least add a method to get the pointer or reference to the internal storage inside atomic<>?
with compiler-support for c++11 atomics, the first two instructions could probably be generated by a smart compiler ...
Although theoretically possible, I don't think that current compilers are able to transform a CAS loop with an operation into an in-place operation, especially if CAS is implemented in inline assembler. It's a little easier with inc/dec, but in practice I've never seen such a transformation done automatically.
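To make the point about CAS loops concrete, here is a minimal sketch of an AND operation emulated with a compare-exchange loop, written against std::atomic. The function name fetch_and_via_cas is hypothetical. A compiler would have to recognize this whole loop as a single read-modify-write before it could replace it with one in-place instruction such as lock and.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: emulates fetch_and with a CAS loop, the way a generic
// backend might when it has no dedicated instruction for the operation.
inline unsigned fetch_and_via_cas(std::atomic<unsigned>& a, unsigned mask)
{
    unsigned expected = a.load(std::memory_order_relaxed);
    while (!a.compare_exchange_weak(expected, expected & mask,
                                    std::memory_order_acq_rel,
                                    std::memory_order_relaxed))
    {
        // On failure, compare_exchange_weak stores the value it observed
        // into `expected`, so the next iteration retries with fresh data.
    }
    return expected; // value held before the AND was applied
}

// Small usage example comparing the loop with the built-in operation.
inline void cas_loop_example()
{
    std::atomic<unsigned> x(0xFFu);
    std::atomic<unsigned> y(0xFFu);
    const unsigned old_x = fetch_and_via_cas(x, 0x0Fu);
    const unsigned old_y = y.fetch_and(0x0Fu);
    assert(old_x == old_y);
    assert(x.load() == y.load());
}
```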
bts/btr may be useful, but i suppose they are rather specific to x86? i wonder, how would they map to arm?
AFAIK, ARM implements atomic ops with LL/SC instructions, so it should be flexible enough to implement it. I'm not very familiar with the architecture though.
but, yes ... if you think it is reasonable to add them, please go ahead!
Thanks.
Andrey Semashev
but, yes ... if you think it is reasonable to add them, please go ahead!
Thanks.
If you do add these using either gcc inline assembly or intrinsics, please make sure to fence the changes with a macro and provide implementations for compilers that don't support those features. Some compilers, Cray's CCE for instance, do set __GNUC__ so that they can use glibc, but don't support all gcc features.
Thanks.
Matt
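A minimal sketch of the kind of fencing requested above, assuming GCC-style __atomic builtins as the compiler-specific path and std::atomic as the fallback. The macros EXAMPLE_NO_GCC_ATOMIC_BUILTINS and EXAMPLE_USE_GCC_ATOMIC_BUILTINS and the class ref_counter are hypothetical names for this illustration, not Boost.Atomic configuration macros. A compiler that defines __GNUC__ without supporting the builtins could be excluded by defining the opt-out macro.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical configuration: use the builtins only when the compiler claims
// GCC compatibility AND the user (or a config header) has not opted out.
#if defined(__GNUC__) && !defined(EXAMPLE_NO_GCC_ATOMIC_BUILTINS)
#define EXAMPLE_USE_GCC_ATOMIC_BUILTINS 1
#endif

class ref_counter
{
public:
    explicit ref_counter(unsigned initial) noexcept : value_(initial) {}
    ref_counter(const ref_counter&) = delete;
    ref_counter& operator=(const ref_counter&) = delete;

    // Returns true when the counter reaches zero.
    bool decrement_and_test() noexcept
    {
#if defined(EXAMPLE_USE_GCC_ATOMIC_BUILTINS)
        // Compiler-specific path: the builtin returns the new value.
        return __atomic_sub_fetch(&value_, 1u, __ATOMIC_ACQ_REL) == 0u;
#else
        // Portable path: fetch_sub returns the previous value.
        return value_.fetch_sub(1u, std::memory_order_acq_rel) == 1u;
#endif
    }

    unsigned load() const noexcept
    {
#if defined(EXAMPLE_USE_GCC_ATOMIC_BUILTINS)
        return __atomic_load_n(&value_, __ATOMIC_ACQUIRE);
#else
        return value_.load(std::memory_order_acquire);
#endif
    }

private:
#if defined(EXAMPLE_USE_GCC_ATOMIC_BUILTINS)
    unsigned value_;
#else
    std::atomic<unsigned> value_;
#endif
};

// Small usage example; behaves the same on both code paths.
inline void fenced_counter_example()
{
    ref_counter c(2u);
    assert(!c.decrement_and_test());
    assert(c.decrement_and_test());
    assert(c.load() == 0u);
}
```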
hi andrey,
hmmm, i see your points ... however in a way, i'd prefer if this functionality could be built upon boost.atomic, mainly to keep API compatibility with std::atomics.
I suppose, the extensions could be made just as functions. What I don't like about this approach is the need to cast pointers to atomic<> to pointers to the underlying integers. We have a few places in Boost.Sync with such code and it really bothers me. Do you think we could at least add a method to get the pointer or reference to the internal storage inside atomic<>?
yes, this sounds reasonable ...
bts/btr may be useful, but i suppose they are rather specific to x86? i wonder, how would they map to arm?
AFAIK, ARM implements atomic ops with LL/SC instructions, so it should be flexible enough to implement it. I'm not very familiar with the architecture though.
true ... though i'm not sure how well it performs: it emulates cas with ll/sc and atomic ops with cas ... but true, this is another issue and one of the reasons why i always suggest to use std::atomic if possible ...
cheers, tim
On Saturday 23 November 2013 12:37:43 tim wrote:
bts/btr may be useful, but i suppose they are rather specific to x86? i wonder, how would they map to arm?
AFAIK, ARM implements atomic ops with LL/SC instructions, so it should be flexible enough to implement it. I'm not very familiar with the architecture though.
true ... though i'm not sure how well it performs: it emulates cas with ll/sc and atomic ops with cas ... but true, this is another issue and one of the reasons why i always suggest to use std::atomic if possible ...
Yes, that's worth fixing too. PowerPC also implements the LL/SC model, and it has a complete implementation in Boost.Atomic. We'll need to do the same for ARM.
On Saturday 23 November 2013 12:37:43 tim wrote:
hi andrey,
hmmm, i see your points ... however in a way, i'd prefer if this functionality could be built upon boost.atomic, mainly to keep API compatibility with std::atomics.
I suppose, the extensions could be made just as functions. What I don't like about this approach is the need to cast pointers to atomic<> to pointers to the underlying integers. We have a few places in Boost.Sync with such code and it really bothers me. Do you think we could at least add a method to get the pointer or reference to the internal storage inside atomic<>?
yes, this sounds reasonable ...
Resurrecting this discussion...
I gave some thought to this, and implementing the operations just as functions and not as atomic<> members seems clunky. Users will reasonably ask why they are not part of the atomic<> interface. Using atomic<> members in some cases and free functions in others looks odd and inconvenient. So far I can see 3 alternatives:
1. Just add new members to atomic<> and document them as extensions. Personally, I'd go with this one.
2. Same as #1 but with an option to disable these extensions at compile time (e.g. through a config macro). This will have negative consequences for Boost.Sync because it will require these extensions and won't compile without them.
3. Leave atomic<> with the standard interface and add its equivalent but with extensions. My favorite name is nuclear<>. :) With the new Boost.Atomic design it shouldn't cause much code duplication, although it'll be difficult to document why there are two similar components in the library. But it's my second preference.
What do you think?
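As a rough illustration of alternative 3, the sketch below layers extension members on top of std::atomic in a separate template. Only the name nuclear<> comes from the message above. The member names, the use of inheritance, and the restriction to integral types are assumptions made for this example and do not reflect how Boost.Atomic would actually be structured.

```cpp
#include <atomic>
#include <cassert>
#include <type_traits>

// Sketch only: the standard interface is inherited unchanged, and the
// extensions are added as extra (hypothetically named) members.
template <typename T>
class nuclear : public std::atomic<T>
{
    static_assert(std::is_integral<T>::value,
                  "this sketch covers integral types only");

public:
    nuclear() noexcept = default;
    constexpr nuclear(T value) noexcept : std::atomic<T>(value) {}

    // Extension: decrement and report whether the result is zero.
    bool decrement_and_test(
        std::memory_order order = std::memory_order_acq_rel) noexcept
    {
        return this->fetch_sub(T(1), order) == T(1);
    }

    // Extension: set one bit and return its previous state.
    // Precondition: bit is smaller than the number of bits in T.
    bool bit_test_and_set(
        unsigned bit,
        std::memory_order order = std::memory_order_acq_rel) noexcept
    {
        const T mask = static_cast<T>(T(1) << bit);
        return (this->fetch_or(mask, order) & mask) != T(0);
    }
};

// Small usage example mixing standard and extension members.
inline void nuclear_example()
{
    nuclear<unsigned> n(2u);
    assert(n.fetch_add(1u) == 2u);    // standard interface still available
    assert(!n.decrement_and_test());  // 3 -> 2
    assert(!n.bit_test_and_set(0u));  // bit 0 was clear in 2, now set
    assert(n.load() == 3u);
}
```

Alternative 1 would put the same members directly into atomic<> instead of a second template.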
participants (4)
- Andrey Semashev
- Matthew Markland
- tim
- Tim Blechmann