On 9/05/2018 17:30, Tim Blechmann wrote:
On second thought, maybe I could even change the implementation of the future_queue to internally use e.g. the well-known boost::lockfree queues together with some semaphore to implement the blocking behaviour. I am not 100% sure yet, but I think all current features could be implemented that way. Do you think this would improve the implementation? It might be a bit simpler and maybe even more efficient in some cases.
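For what it's worth, a minimal sketch of that combination, assuming C++20's std::counting_semaphore (blocking_queue is purely an illustrative name, not an existing API, and T must satisfy boost::lockfree::queue's requirements, i.e. be trivially assignable and destructible):

    #include <boost/lockfree/queue.hpp>
    #include <semaphore>
    #include <cstddef>

    // Bounded blocking facade over boost::lockfree::queue: one semaphore
    // counts free slots, the other counts filled slots, so push blocks
    // while the queue is full and pop blocks while it is empty.
    template <typename T, std::ptrdiff_t Capacity>
    class blocking_queue {
        boost::lockfree::queue<T, boost::lockfree::fixed_sized<true>>
            queue_{static_cast<std::size_t>(Capacity)};
        std::counting_semaphore<Capacity> free_slots_{Capacity};
        std::counting_semaphore<Capacity> filled_slots_{0};
    public:
        void push(T value) {
            free_slots_.acquire();           // block while the queue is full
            while (!queue_.push(value)) {}   // a node was reserved above
            filled_slots_.release();         // wake one waiting popper
        }
        T pop() {
            filled_slots_.acquire();         // block while the queue is empty
            T value;
            while (!queue_.pop(value)) {}    // an element is guaranteed present
            free_slots_.release();           // wake one waiting pusher
            return value;
        }
    };

Whether this ends up more efficient depends on how the standard library's semaphore behaves on the uncontended path, which is part of what the discussion below gets at.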
I would love to see a portable eventcount (http://www.1024cores.net/home/lock-free-algorithms/eventcounts) implementation in Boost.Lockfree.
I'm not sure if Boost.Lockfree is a good place for it (is it even possible to implement one without relying on system calls, which may acquire locks in kernel space?)
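For concreteness, a minimal portable eventcount in the spirit of that 1024cores page might look like the following (illustrative only; note that the portable slow path does fall back to a mutex and condition variable, which speaks to the question above):

    #include <atomic>
    #include <condition_variable>
    #include <mutex>

    // Minimal eventcount sketch: a waiter samples a generation counter,
    // re-checks its own predicate, and sleeps only if no notification
    // arrived in between. The common path (prepare_wait, or a notify
    // racing no sleepers) is a single atomic operation; the mutex and
    // condition variable are slow-path only.
    class eventcount {
        std::atomic<unsigned> epoch_{0};
        std::mutex m_;
        std::condition_variable cv_;
    public:
        unsigned prepare_wait() {
            return epoch_.load(std::memory_order_acquire);
        }
        void wait(unsigned key) {
            std::unique_lock<std::mutex> lock(m_);
            while (epoch_.load(std::memory_order_acquire) == key)
                cv_.wait(lock);
        }
        void notify_all() {
            epoch_.fetch_add(1, std::memory_order_acq_rel);
            std::lock_guard<std::mutex> lock(m_);  // serialise with sleepers
            cv_.notify_all();
        }
    };

Typical use is `auto key = ec.prepare_wait(); if (queue_is_empty()) ec.wait(key);` -- a notify_all() that lands between those two statements bumps the epoch, so wait() returns immediately instead of sleeping through the lost wakeup.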
Even the Boost.Lockfree queue may not always be lock-free -- e.g. if you push more nodes than the queue's capacity (and it's not fixed_sized), it will allocate more, which will often take a Big System Lock (though modern allocators try to use smaller, less-contended locks to reduce the impact).

These things are usually implemented with a fast path that performs no kernel operations and a slow path that does (along with the hope that the slow path is not triggered too often). As soon as you start talking about wanting to block, you have to involve the kernel (or a userspace scheduler). That is a choice to gain some convenience while sacrificing some "lock-free-ness", just like the choice to use a non-fixed queue.

The basic "lock-free but blocking" expected behaviour is (sketched in code below):

1. If pushing but the queue is full, block until it isn't (blocking path).
2. If a popper is waiting, push and then wake them up (slow path).
3. Otherwise just push without kernel operations (fast path).

And similarly on the consuming end for an empty queue. (A non-fixed-size queue might omit step #1 entirely, though it can still end up blocking a little if it hits an allocator.)

The main trick is in the wakeups, and in reliably knowing whether someone is waiting. You can never really avoid making kernel calls to wake up another thread, but you want to do so only when actually needed, erring on the side of an occasional unnecessary wakeup rather than ever missing one. You also need to use constructs that, while perhaps not completely lock-free, are at least wait-free. Semaphores and Win32 events are usually good candidates, because the scheduler generally treats them as atomic operations as far as user code is concerned. Traditional mutexes (and condition variables, since they rely on them) are bad candidates, because a thread can be preempted while holding the lock, which prevents forward progress of other threads.

But you also usually have to be careful to use the "true" synchronisation primitives of the platform, not emulations defined in libraries, because the emulations might use locks or other constructs that are problematic. When you're implementing a platform-specific abstraction, it's one of the rare cases where it's not a good idea to depend on other abstractions.
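To make those three paths concrete, here is one illustrative shape for the wakeup bookkeeping (hypothetical names; std::counting_semaphore stands in for whatever "true" primitive the platform offers, and a real implementation needs more care around shutdown):

    #include <atomic>
    #include <semaphore>

    // Illustrative wakeup bookkeeping for steps 1-3 above: the producer
    // makes a kernel call only when a consumer has announced itself as
    // waiting. Extra wakeups are deliberately tolerated (better than a
    // missed one), so consumers must loop and re-check the queue.
    class wakeup_gate {
        std::atomic<int> waiters_{0};
        std::counting_semaphore<> sem_{0};
    public:
        void notify() {                      // producer, after a push
            if (waiters_.load() > 0)         // fast path: nobody waiting
                sem_.release();              // slow path: one kernel wakeup
        }
        template <typename StillEmpty>
        void wait(StillEmpty still_empty) {  // consumer, after a failed pop
            waiters_.fetch_add(1);           // announce *before* re-checking;
            if (still_empty())               // the default seq_cst ordering
                sem_.acquire();              // makes this a Dekker-style
            waiters_.fetch_sub(1);           // handshake with notify()
        }
    };

    // Consumer loop -- a stale semaphore token just causes a re-check:
    //   T item;
    //   while (!q.pop(item))
    //       gate.wait([&]{ return q.empty(); });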
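And on the "true primitives" point: on Linux, for example, that means going through futex(2) itself rather than a library mutex. A minimal (Linux-only, error handling omitted) sketch:

    #include <linux/futex.h>
    #include <sys/syscall.h>
    #include <unistd.h>
    #include <atomic>
    #include <climits>

    // FUTEX_WAIT sleeps only if the word still holds `expected`, closing
    // the lost-wakeup window without a mutex; FUTEX_WAKE is a single
    // kernel call on the wake side. The cast assumes std::atomic<int> is
    // layout-compatible with int, which mainstream ABIs guarantee.
    void futex_wait(std::atomic<int>& word, int expected) {
        syscall(SYS_futex, reinterpret_cast<int*>(&word),
                FUTEX_WAIT_PRIVATE, expected, nullptr, nullptr, 0);
    }
    void futex_wake_all(std::atomic<int>& word) {
        syscall(SYS_futex, reinterpret_cast<int*>(&word),
                FUTEX_WAKE_PRIVATE, INT_MAX, nullptr, nullptr, 0);
    }

Win32 events and semaphores play the same role on Windows; it's the portable wrappers layered on top of these that tend to reintroduce locks.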