22 May
2020
4:31 p.m.
On Fri, 22 May 2020 at 12:54, Bjorn Reese via Boost wrote:
LEAF really only needs one thread-local pointer to the topmost context per thread, so it may be possible to replace the thread-local storage with a global lock-free hash table.
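To make the idea concrete, here is a rough sketch of what such a table could look like. It is not LEAF code: `context_table`, `find_or_claim` and the choice of a nonzero integer as the per-thread key are all made up for illustration, and the table has a fixed capacity and never releases slots, which a real replacement for thread-local storage would have to deal with.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity table mapping a nonzero per-thread key to one
// pointer slot (the "topmost context" pointer). Open addressing with linear
// probing; a slot is claimed with a single CAS and is never released.
template <std::size_t Capacity>
class context_table {
    static_assert((Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

    struct slot {
        std::atomic<std::uintptr_t> key{0};  // 0 means "unclaimed"
        std::atomic<void*> top{nullptr};     // per-thread pointer
    };
    std::array<slot, Capacity> slots_;

    static std::size_t hash(std::uintptr_t k) {
        // Multiplicative hashing; any reasonable mixer would do here.
        return static_cast<std::size_t>((k * 0x9E3779B97F4A7C15ull) >> 32);
    }

public:
    // Returns the slot owned by `key`, claiming a free one on first use.
    // Returns nullptr if every slot already belongs to another key.
    std::atomic<void*>* find_or_claim(std::uintptr_t key) {
        assert(key != 0 && "0 is reserved for unclaimed slots");
        std::size_t i = hash(key) & (Capacity - 1);
        for (std::size_t n = 0; n < Capacity; ++n, i = (i + 1) & (Capacity - 1)) {
            std::uintptr_t seen = slots_[i].key.load(std::memory_order_acquire);
            if (seen == 0 &&
                slots_[i].key.compare_exchange_strong(
                    seen, key,
                    std::memory_order_acq_rel, std::memory_order_acquire)) {
                return &slots_[i].top;  // we claimed this slot
            }
            // If the CAS failed, `seen` now holds the key that won the race.
            if (seen == key) return &slots_[i].top;
        }
        return nullptr;  // table full
    }
};
```

Each thread would look up its slot once with its own key and then read and write the pointer through the returned atomic. How the key is derived (a thread id on the host, a block and thread index on a GPU) is left open here.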
On Nvidia GPUs there is already a mechanism for this, known as local memory: in effect a global region that interleaves the threads' data so that all threads can access their own fields in parallel. This is what gets used whenever a register has to be spilled to memory. I do not know why Nvidia couldn't just implement TLS using this.
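As a rough host-side illustration of the interleaving, the sketch below stores word w of thread t at index w * threads + t, so that the same field of consecutive threads lands in consecutive words. The class name `interleaved_store` and the 32-bit word granularity are assumptions for the sake of the example; it models the layout only and says nothing about how the hardware actually addresses local memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a per-thread region interleaved word by word:
// all threads' word 0 come first, then all threads' word 1, and so on.
class interleaved_store {
    std::size_t threads_;
    std::size_t words_per_thread_;
    std::vector<std::uint32_t> words_;

public:
    interleaved_store(std::size_t threads, std::size_t words_per_thread)
        : threads_(threads),
          words_per_thread_(words_per_thread),
          words_(threads * words_per_thread, 0) {}

    // Position of `word` of `thread` inside the shared region.
    std::size_t index(std::size_t thread, std::size_t word) const {
        return word * threads_ + thread;
    }

    // Access one thread's private word.
    std::uint32_t& at(std::size_t thread, std::size_t word) {
        assert(thread < threads_ && word < words_per_thread_);
        return words_[index(thread, word)];
    }
};
```

With 32 threads, word 0 of threads 0 to 31 occupies indices 0 to 31 and word 1 starts at index 32, so when every thread touches the same field the accesses fall on adjacent words.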