Hi All, Niall,

1) I was reading this Intel paper [0], and this section caught my attention:

"One common mistake made by developers developing their own spin-wait loops is attempting to spin on an atomic instruction instead of spinning on a volatile read. Spinning on a dirty read instead of attempting to acquire a lock consumes less time and resources. This allows an application to only attempt to acquire a lock only when it is free."

As far as I can tell from the source code, spinlock spins on an atomic consume. I wonder whether spinning on a volatile read would give better performance characteristics?

2) AFAIK spinlocking is not necessarily fair on a NUMA architecture. Is there anything already implemented or planned in Boost.Spinlock to ensure fairness? I'm thinking of something like this: [1]

Thanks,
Benedek

[0]: https://software.intel.com/en-us/articles/implementing-scalable-atomic-locks...
[1]: http://www.google.com/patents/US7334102
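
P.S. To make question 1 concrete, here is a minimal sketch of the two spinning styles the Intel quote contrasts. It is not taken from Boost.Spinlock: the names tas_lock, ttas_lock and count_with are hypothetical, and I am assuming that a relaxed std::atomic load is an acceptable portable stand-in for the "volatile read" the paper mentions. tas_lock retries the atomic exchange on every iteration, while ttas_lock spins on a plain load and attempts the exchange only when the lock looks free.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical illustration only; not Boost.Spinlock's implementation.

// Spins directly on the atomic read-modify-write instruction.
class tas_lock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        // Every failed iteration still performs an atomic exchange.
        while (locked_.exchange(true, std::memory_order_acquire)) {
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }
};

// Spins on a plain read and only attempts the exchange when the lock looks free.
class ttas_lock {
    std::atomic<bool> locked_{false};
public:
    void lock() {
        for (;;) {
            // Try to take the lock; success means the previous value was false.
            if (!locked_.exchange(true, std::memory_order_acquire))
                return;
            // Assumption: a relaxed load stands in for the paper's "volatile read".
            while (locked_.load(std::memory_order_relaxed)) {
            }
        }
    }
    void unlock() { locked_.store(false, std::memory_order_release); }
};

// Small helper: several threads increment a shared counter under the lock.
template <class Lock>
long count_with(int threads, int iters) {
    assert(threads > 0 && iters >= 0);
    Lock lock;
    long counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;  // protected by the lock
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling count_with<tas_lock>(n, k) and count_with<ttas_lock>(n, k) under a timer would be one way to compare the two under contention. I have not measured either, so this only shows the shape of the change I am asking about.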