On 20 Sep 2013 at 20:02, Ion Gaztañaga wrote:
> Should we have different "optimized" versions of a semaphore depending on the memory order guarantees? I'd love to hear what memory-model experts think about this.
On Intel, relaxed memory order code is very tough to debug, because Intel does very little reordering relative to other CPUs, so ordering bugs rarely show up there. ARM is better, but I believe it takes an Alpha to really get the bugs out. There is surely some magic tool for LLVM somewhere which will generate every variant of a bit of code covering all possible reordering combinations. That would reveal bugs even on Intel.

BTW, I'd suspect you'll gain far more performance from sync primitives written using TM (transactional memory) than from relaxed memory ordering. If you're going to expend effort there, far better to spend it on TM implementations, despite the lack of TM-capable hardware.

Niall

--
Currently unemployed and looking for work.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
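
To make the Intel point concrete, here is a minimal sketch of the classic message-passing pattern in C++11 atomics. The names `Mailbox`, `produce`, `consume` and `run_once` are made up for illustration and do not come from Boost or any other library. The version shown uses release/acquire, which is correct. If both orderings were weakened to `std::memory_order_relaxed`, the plain write to `payload` would no longer be ordered before the flag and the code would contain a data race, yet on x86 it would most likely still appear to work, because the hardware keeps stores in order with other stores and loads in order with other loads (the compiler remains free to reorder them). On ARM or Alpha the same mistake is much more likely to surface.

```cpp
#include <atomic>
#include <thread>

// Hypothetical illustration types and functions, not part of any library.
struct Mailbox {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};   // flag that publishes the data
};

// Writer: fill in the payload, then publish it.
// The release store orders the payload write before the flag write.
// With memory_order_relaxed here, that ordering guarantee would be lost.
void produce(Mailbox& m, int value) {
    m.payload = value;
    m.ready.store(true, std::memory_order_release);
}

// Reader: spin until the flag is set, then read the payload.
// The acquire load pairs with the release store above, so the payload
// read is guaranteed to see the value written before the flag was set.
int consume(Mailbox& m) {
    while (!m.ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return m.payload;
}

// Run one writer thread against a reader on the calling thread.
int run_once(int value) {
    Mailbox m;
    std::thread writer([&m, value] { produce(m, value); });
    int seen = consume(m);
    writer.join();
    return seen;
}
```

Calling `run_once(42)` returns 42 with the orderings as written. A passing run of the relaxed variant on x86 would prove very little, which is the debugging problem described above.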