On 2020-05-16 21:35, Phil Endecott via Boost wrote:
Phil Endecott wrote: Andrey Semashev wrote:
* PTHREAD_MUTEX_ROBUST might be part of the solution. That seems to require the non-crashed process to do the cleanup, i.e. we would need to record whether the crashed process was reading or writing and react appropriately.
You can't do that reliably because the crashed process could have crashed between locking the mutex and indicating its intentions.
I don't follow. Say I have a bool in the mutex called being_written. It's initially false, the read lock doesn't touch it, and the write lock does:
    lock() {
        m.lock();
        being_written = true;
        memory_barrier();
    }

    unlock() {
        memory_barrier();
        being_written = false;
        m.unlock();
    }
If the process crashes between locking and setting being_written, then the process doing the cleanup will see being_written = false, and that's OK because the crasher hadn't actually written anything.
What if the writer crashes in unlock(), between being_written = false and m.unlock()?
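To make the crash windows concrete, here is a minimal sketch of the scheme under discussion on top of a POSIX robust mutex. The names guarded_mutex, lock_result, init, read_lock, write_lock, read_unlock and write_unlock are invented for illustration, std::atomic<bool> stands in for the plain flag plus memory_barrier(), and the struct is assumed to be placed in memory shared by the participating processes.

```cpp
#include <pthread.h>
#include <atomic>
#include <cerrno>

// Hypothetical wrapper; not an existing Boost or libc type.
struct guarded_mutex {
    pthread_mutex_t m;
    std::atomic<bool> being_written{false};
};

enum class lock_result { ok, owner_died_clean, owner_died_writing, failed };

// Initialise a process-shared, robust mutex.
bool init(guarded_mutex& g) {
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) return false;
    bool ok = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) == 0
           && pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST) == 0
           && pthread_mutex_init(&g.m, &attr) == 0;
    pthread_mutexattr_destroy(&attr);
    return ok;
}

// The read lock does not touch the flag; it only reports what it found.
lock_result read_lock(guarded_mutex& g) {
    int rc = pthread_mutex_lock(&g.m);
    if (rc == 0) return lock_result::ok;
    if (rc != EOWNERDEAD) return lock_result::failed;
    // The previous owner died while holding the mutex. The flag tells us
    // whether it had announced a write before dying.
    bool dirty = g.being_written.load();
    // Make the mutex usable again; repairing the protected data (and then
    // clearing the flag) is left to the caller when dirty is true.
    pthread_mutex_consistent(&g.m);
    return dirty ? lock_result::owner_died_writing
                 : lock_result::owner_died_clean;
}

void read_unlock(guarded_mutex& g) { pthread_mutex_unlock(&g.m); }

lock_result write_lock(guarded_mutex& g) {
    lock_result r = read_lock(g);
    // Window 1: a crash here leaves the flag false, nothing written yet.
    if (r != lock_result::failed) g.being_written.store(true);
    return r;
}

void write_unlock(guarded_mutex& g) {
    g.being_written.store(false);
    // Window 2: a crash here leaves the flag false while the mutex is
    // still held; the next locker gets EOWNERDEAD with the flag clear.
    pthread_mutex_unlock(&g.m);
}
```

A caller that gets owner_died_writing back is expected to repair or discard the protected data before relying on it; how that is done is application specific and not shown.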
Regarding blocking signals, I agree this is not really something that should be part of the interprocess synchronisation primitives, but I do think that a modern wrapper around the ancient C signals API would be good to have.
I agree, although given that there are many different ways to handle signals, I have a hard time imagining what such a wrapper would look like.
If you need condition-variable-like behavior, in a robust way, I think your best bet is to use futexes directly.
Yes, that is the conclusion that I've also come to - but it is probably a very difficult problem. Note that robust mutexes use futexes rather differently from regular mutexes, and there is kernel involvement at process termination (see man get_robust_list). A robust condition variable would have to do something similar.
Yes, given that the robust list is an internal interface between the kernel and libc, you basically have to implement your own mechanism, which may not be absolutely equivalent to the real robust mutexes. For example, in a pseudo_robust_mutex::lock you could use a timed wait on the internal futex, and on timeout check whether the mutex owner pid still exists. You may have other ways of detecting and handling abandoned locks depending on your application architecture. A condition variable could be implemented without the internal mutex (i.e. it would only have one internal futex), if you have a guarantee that the associated external mutex is always locked when the condition variable methods are called.
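A rough sketch of the timed-wait idea, Linux only. Only the name pseudo_robust_mutex comes from the text above. The protocol (futex word holds 0 or the owner's pid), the 100 ms timeout, and the use of kill(pid, 0) as the liveness check are assumptions made for illustration. The object is assumed to live in shared memory.

```cpp
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <signal.h>
#include <time.h>
#include <atomic>
#include <cerrno>
#include <cstdint>

class pseudo_robust_mutex {
    // Futex word: 0 means unlocked, otherwise the pid of the owner.
    std::atomic<std::uint32_t> owner_{0};
    static_assert(sizeof(std::atomic<std::uint32_t>) == sizeof(std::uint32_t),
                  "futex word must be exactly 32 bits");

    // Thin wrapper over the raw futex syscall (no FUTEX_PRIVATE_FLAG, so
    // it works across processes).
    static long futex(std::atomic<std::uint32_t>* word, int op,
                      std::uint32_t val, const timespec* timeout) {
        return syscall(SYS_futex, reinterpret_cast<std::uint32_t*>(word),
                       op, val, timeout, nullptr, 0);
    }

    // Signal 0 performs only the existence/permission check.
    static bool pid_exists(std::uint32_t pid) {
        return kill(static_cast<pid_t>(pid), 0) == 0 || errno != ESRCH;
    }

public:
    // Returns true if the lock was taken over from an owner that no longer
    // exists, false on a normal acquisition.
    bool lock() {
        const std::uint32_t self = static_cast<std::uint32_t>(getpid());
        for (;;) {
            std::uint32_t cur = 0;
            if (owner_.compare_exchange_strong(cur, self,
                                               std::memory_order_acquire))
                return false;
            // cur now holds the pid observed in the futex word.
            timespec ts{};
            ts.tv_sec = 0;
            ts.tv_nsec = 100000000;  // relative timeout, 100 ms
            long rc = futex(&owner_, FUTEX_WAIT, cur, &ts);
            if (rc == -1 && errno == ETIMEDOUT && !pid_exists(cur)) {
                // Owner is gone; try to take over. Only one waiter wins.
                if (owner_.compare_exchange_strong(cur, self,
                                                   std::memory_order_acquire))
                    return true;
            }
        }
    }

    void unlock() {
        owner_.store(0, std::memory_order_release);
        futex(&owner_, FUTEX_WAKE, 1, nullptr);  // wake one waiter, if any
    }
};
```

This is not equivalent to a real robust mutex: pids can be reused, a zombie owner still counts as existing for kill(), and detection latency is bounded only by the timeout.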