Re: [boost] [interprocess] Mutex and condition at process termination

16 May 2020

Phil Endecott wrote:
...
Can we improve how interprocess mutexes and condition variables
behave on process termination?
Having given this some more thought:

I think it would be useful if Boost.Interprocess added
a robust mutex, as a straightforward wrapper around the
POSIX robust mutex and equivalents on other platforms if
they exist.  I note that there is a patch that does this
on the Interprocess issue tracker but it unconditionally
cleans up the mutex when it finds that the other process
died, which is wrong.  I believe that the lock() method
should fail in that case, and it should provide a
make_consistent method that the user can invoke if
appropriate before retrying.  Then read and write locks,
with appropriate clean-up behaviour, can be implemented
on top of that.
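
A minimal sketch of such a wrapper, assuming a POSIX platform with
pthread_mutexattr_setrobust and pthread_mutex_consistent.  The names
robust_mutex, lock_result, make_consistent and abandon_lock are
hypothetical and are not part of Boost.Interprocess.  The helper uses
a thread that exits while holding the lock as a stand-in for a
process that dies; on Linux/glibc both cases are reported to the next
locker as EOWNERDEAD.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <thread>

// Hypothetical wrapper, not an existing Boost.Interprocess class.
class robust_mutex {
public:
    enum class lock_result { acquired, owner_died, error };

    robust_mutex() {
        pthread_mutexattr_t attr;
        int rc = pthread_mutexattr_init(&attr);
        assert(rc == 0);
        rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
        assert(rc == 0);
        rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
        assert(rc == 0);
        rc = pthread_mutex_init(&m_, &attr);
        assert(rc == 0);
        pthread_mutexattr_destroy(&attr);
        (void)rc;  // silence unused warning when NDEBUG is defined
    }
    ~robust_mutex() { pthread_mutex_destroy(&m_); }
    robust_mutex(const robust_mutex&) = delete;
    robust_mutex& operator=(const robust_mutex&) = delete;

    // Reports a dead owner instead of cleaning up silently.
    // Note: with POSIX semantics the caller DOES hold the mutex when
    // owner_died is returned; only the protected state is suspect.
    lock_result lock() {
        switch (pthread_mutex_lock(&m_)) {
            case 0:          return lock_result::acquired;
            case EOWNERDEAD: return lock_result::owner_died;
            default:         return lock_result::error;  // e.g. ENOTRECOVERABLE
        }
    }

    // To be called by the user, once the protected data has been repaired.
    bool make_consistent() { return pthread_mutex_consistent(&m_) == 0; }

    void unlock() { pthread_mutex_unlock(&m_); }

private:
    pthread_mutex_t m_;
};

// Demo helper: a thread takes the lock and exits without unlocking,
// standing in for a process that dies while holding it.
inline void abandon_lock(robust_mutex& m) {
    std::thread owner([&m] { (void)m.lock(); });
    owner.join();
}
```

One design point this exposes: under POSIX there is no "retry" as
such, because the lock is already held when EOWNERDEAD is reported.
If the caller unlocks without calling make_consistent(), later lock
attempts fail with ENOTRECOVERABLE, so a wrapper has to decide how to
present that to the user.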

Vinicius dos Santos Oliveira <vini.ipsmaker@gmail.com> wrote:
...
After some more thought, here is another idea: PTHREAD_MUTEX_ROBUST 
is no longer a property of the mutex, but a property of the lock.
I don't see how that can be implemented on top of the
POSIX API, where robustness is a property of the mutex.

Andrey Semashev <andrey.semashev@gmail.com> wrote:
...
...
* PTHREAD_MUTEX_ROBUST might be part of the solution. That seems
to require the non-crashed process to do clean up, i.e. we would
need to record whether the crashed process was reading or writing
and react appropriately.
You can't do that reliably because the crashed process could have 
crashed between locking the mutex and indicating its intentions.
I don't follow.  Say I have a bool in the mutex called being_written.
It's initially false, the read lock doesn't touch it, and the write
lock does:

lock() { m.lock(); being_written = true; memory_barrier(); }
unlock() { memory_barrier(); being_written = false; m.unlock(); }

If the process crashes between locking and setting being_written,
then the process doing the cleanup will see being_written = false,
and that's OK because the crasher hadn't actually written anything.
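
The argument above can be sketched on top of a POSIX robust mutex.
This is only an illustration under stated assumptions: guarded_data,
recovery, write_lock, write_unlock and the two crash_* helpers are
hypothetical names, std::atomic<bool> is assumed to be lock-free and
usable in shared memory on the target platform, and a thread exiting
while holding the lock stands in for a crashed process (Linux/glibc
reports both as EOWNERDEAD).

```cpp
#include <atomic>
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <thread>

// Hypothetical layout; in real use this would live in shared memory.
struct guarded_data {
    pthread_mutex_t m;
    std::atomic<bool> being_written{false};
    int payload = 0;
};

enum class recovery { not_needed, data_intact, data_suspect, failed };

inline bool init(guarded_data& d) {
    pthread_mutexattr_t attr;
    if (pthread_mutexattr_init(&attr) != 0) return false;
    bool ok = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED) == 0
           && pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST) == 0
           && pthread_mutex_init(&d.m, &attr) == 0;
    pthread_mutexattr_destroy(&attr);
    return ok;
}

// Write lock: take the mutex, inspect the flag a dead owner may have
// left behind, then raise the flag for ourselves.
inline recovery write_lock(guarded_data& d) {
    int rc = pthread_mutex_lock(&d.m);
    recovery r = recovery::not_needed;
    if (rc == EOWNERDEAD) {
        // Flag still false: the crasher never started writing.
        r = d.being_written.load() ? recovery::data_suspect
                                   : recovery::data_intact;
        if (pthread_mutex_consistent(&d.m) != 0) {
            pthread_mutex_unlock(&d.m);
            return recovery::failed;
        }
    } else if (rc != 0) {
        return recovery::failed;
    }
    d.being_written.store(true);  // sequentially consistent store
    return r;
}

inline void write_unlock(guarded_data& d) {
    d.being_written.store(false);
    pthread_mutex_unlock(&d.m);
}

// Simulated crash between locking and setting being_written.
inline void crash_before_flag(guarded_data& d) {
    std::thread t([&d] { pthread_mutex_lock(&d.m); });
    t.join();
}

// Simulated crash after the flag was set and a write had begun.
inline void crash_mid_write(guarded_data& d) {
    std::thread t([&d] { (void)write_lock(d); d.payload = 42; });
    t.join();
}
```

In this sketch the process that next takes the lock sees data_intact
after crash_before_flag() and data_suspect after crash_mid_write();
what to do about suspect data is left to the caller.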

Regarding blocking signals, I agree this is not really something
that should be part of the interprocess synchronisation primitives,
but I do think that a modern wrapper around the ancient C signals
API would be good to have.
...
...
I'm less clear about what happens to condition variables, but it
does seem that perhaps terminating a process while it is waiting
on a condition will cause other processes to deadlock. Perhaps
the wait conceptually returns and the mutex is re-locked during
termination.
AFAIR, pthread_cond_t uses a non-robust mutex internally, which means 
that condition variables are basically useless when you need robust 
semantics.
Yes.
...
If you need a condition variable-like behavior, in a robust way, I think 
your best bet is to use futexes directly.
Yes, that is the conclusion that I've also come to - but it is
probably a very difficult problem.  Note that robust mutexes use
futexes rather differently from regular mutexes, and there is
kernel involvement at process termination (see man get_robust_list).
A robust condition variable would have to do something similar.

I find this all rather surprising, as interrupting a process that is
waiting on a condition variable is probably much more common than
interrupting one that holds a locked mutex.

Regards, Phil.