Re: [boost] Interprocess mutex & condition variable at process termination

15 Feb 2017

On 02/15/17 20:42, Phil Endecott via Boost wrote:
...
Dear Experts,
I've just been surprised by the behaviour of the interprocess
mutex and condition variable on abnormal process termination, i.e.
they are not automatically released.
Google tells me that I'm not the first to be surprised by this; there
have been previous posts here, stack overflow questions etc.
One often-valid observation is that if a process crashes - or
otherwise terminates without executing its destructors - while it
holds a lock on a shared data structure then the data is probably
now corrupt, so unlocking the mutex that protects it is not very
useful.  I think there is an important case where that does not
apply - when the process that crashes is only reading the shared
data.  In my case, I had written a "monitor" utility that loops
forever, waiting on a shared condition, taking the corresponding
mutex, and then dumping the shared data to stdout.  I had been
running this and stopping it by pressing ctrl-C and it had not
occurred to me that this might not work as I expected.  My
attempt at debugging using this utility was making my problems worse,
not better!  Modifying this code to run destructors on ctrl-C is
non-trivial.
I am aware that the SysV shared semaphore is able to undo on
process termination (see SEM_UNDO in man semop), and I had assumed
that Boost.Interprocess was using this or something like it.  I
now see that it is using pthreads, which I didn't even realise
could work between processes, and I don't think this API has
any way to specify process termination behaviour.

There is a way to handle this case (robust mutexes), but the API is not 
universally supported:

http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutexattr_...
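As an illustration of the robust mutex approach, here is a minimal sketch, assuming Linux with glibc (robust mutexes are not available everywhere). The names SharedState, create_shared_state, lock_robust and demo_owner_death are hypothetical and exist only for this example. A child process locks the mutex and exits without unlocking; the parent's next lock attempt returns EOWNERDEAD instead of blocking forever.

```cpp
#include <cassert>
#include <cerrno>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical shared region: a mutex plus the data it guards.
struct SharedState {
    pthread_mutex_t mutex;
    int value;
};

// Map anonymous shared memory and initialise a process-shared robust mutex in it.
SharedState* create_shared_state() {
    void* mem = mmap(nullptr, sizeof(SharedState), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    assert(mem != MAP_FAILED);
    SharedState* s = static_cast<SharedState*>(mem);
    s->value = 0;

    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    assert(rc == 0);
    rc = pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    assert(rc == 0);
    rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    assert(rc == 0);
    rc = pthread_mutex_init(&s->mutex, &attr);
    assert(rc == 0);
    (void)rc;
    pthread_mutexattr_destroy(&attr);
    return s;
}

// Lock the mutex. Returns true if the previous owner died while holding it.
bool lock_robust(SharedState* s) {
    int rc = pthread_mutex_lock(&s->mutex);
    if (rc == EOWNERDEAD) {
        // We now own the mutex. A real program would validate or repair the
        // protected data here before declaring the mutex consistent again.
        pthread_mutex_consistent(&s->mutex);
        return true;
    }
    assert(rc == 0);
    return false;
}

// Demonstration: the child locks and exits without unlocking or running
// destructors; the parent then reports whether it observed EOWNERDEAD.
bool demo_owner_death() {
    SharedState* s = create_shared_state();
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        pthread_mutex_lock(&s->mutex);
        _exit(0);
    }
    waitpid(pid, nullptr, 0);
    bool owner_died = lock_robust(s);
    pthread_mutex_unlock(&s->mutex);
    pthread_mutex_destroy(&s->mutex);
    munmap(s, sizeof(SharedState));
    return owner_died;
}
```

Calling demo_owner_death() from main() should return true on a platform with working robust mutexes. For a read-only monitor, recovery can be as simple as calling pthread_mutex_consistent(); a writer that dies mid-update needs real repair logic at that point.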

If that API is not supported on your platform, you may want to avoid 
locking the mutex without a timeout (i.e. failing to acquire a mutex for 
a given time should be considered an indication that the mutex has been 
abandoned in the locked state).
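The timeout workaround could look roughly like the sketch below, which uses pthread_mutex_timedlock on a plain (non-robust) process-shared mutex. LockResult, lock_with_timeout and demo_abandoned_mutex are hypothetical names, and the timeout value is arbitrary; Linux with glibc is assumed. Boost.Interprocess mutexes offer timed locking (timed_lock) that can be used in the same spirit.

```cpp
#include <cassert>
#include <cerrno>
#include <ctime>
#include <pthread.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

// Outcome of a bounded lock attempt.
enum class LockResult { acquired, timed_out, error };

// Try to lock `m`, giving up after `timeout_ms` milliseconds.
LockResult lock_with_timeout(pthread_mutex_t* m, long timeout_ms) {
    timespec deadline;
    // pthread_mutex_timedlock takes an absolute CLOCK_REALTIME deadline.
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec += 1;
        deadline.tv_nsec -= 1000000000L;
    }
    int rc = pthread_mutex_timedlock(m, &deadline);
    if (rc == 0) return LockResult::acquired;
    if (rc == ETIMEDOUT) return LockResult::timed_out;  // possibly abandoned
    return LockResult::error;
}

// Demonstration: a child locks a non-robust shared mutex and exits without
// unlocking; the parent's bounded lock attempt then times out.
LockResult demo_abandoned_mutex(long timeout_ms) {
    void* mem = mmap(nullptr, sizeof(pthread_mutex_t), PROT_READ | PROT_WRITE,
                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    assert(mem != MAP_FAILED);
    pthread_mutex_t* m = static_cast<pthread_mutex_t*>(mem);

    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
    pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);

    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        pthread_mutex_lock(m);
        _exit(0);  // the mutex stays locked
    }
    waitpid(pid, nullptr, 0);
    LockResult result = lock_with_timeout(m, timeout_ms);
    munmap(mem, sizeof(pthread_mutex_t));
    return result;
}
```

A timeout only suggests that the mutex was abandoned; a slow but live owner looks the same, so the timeout has to be chosen per application, and what to do afterwards (for example, recreating the shared region) is a separate design decision.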

In general, synchronization primitives that reside in shared memory 
(such as pthread mutexes or Boost.Interprocess mutexes) should be 
considered vulnerable to (a) corruption and (b) becoming unusable (e.g. 
indefinitely locked) because of a misbehaving user process. That is 
rather obvious considering that such primitives typically do not include 
any other resources, such as handles to kernel objects or file 
descriptors and as such "don't exist" for the kernel (consequently, the 
kernel cannot release them on process termination). Robust mutexes that 
I referenced above are an exception to that general rule.

Named primitives, such as SysV semaphores, are typically more protected 
because there is at least a file descriptor or something that 
corresponds to the name and there is usually a limited API to interact 
with the primitive (i.e. you usually don't have direct access to the 
primitive data).
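For comparison, the SEM_UNDO behaviour of SysV semaphores mentioned in the quoted message can be sketched as follows. This assumes Linux and a private semaphore set; semun_arg, create_binary_semaphore, sysv_lock, sysv_unlock and demo_sem_undo are hypothetical names for this example. Because the kernel tracks the adjustment, it is reverted when the process exits, however it exits.

```cpp
#include <cassert>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <unistd.h>

// On Linux the caller has to define the argument union for semctl() itself.
union semun_arg {
    int val;
    struct semid_ds* buf;
    unsigned short* array;
};

// Create a private set with one semaphore initialised to 1 ("unlocked").
int create_binary_semaphore() {
    int id = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);
    assert(id >= 0);
    semun_arg arg;
    arg.val = 1;
    int rc = semctl(id, 0, SETVAL, arg);
    assert(rc == 0);
    (void)rc;
    return id;
}

// Decrement with SEM_UNDO: the kernel records the adjustment and reverts it
// if the process terminates before unlocking.
bool sysv_lock(int id) {
    sembuf op;
    op.sem_num = 0;
    op.sem_op = -1;
    op.sem_flg = SEM_UNDO;
    return semop(id, &op, 1) == 0;
}

// Increment with SEM_UNDO so that the recorded adjustment is cancelled out.
bool sysv_unlock(int id) {
    sembuf op;
    op.sem_num = 0;
    op.sem_op = 1;
    op.sem_flg = SEM_UNDO;
    return semop(id, &op, 1) == 0;
}

// Demonstration: the child takes the semaphore and exits without releasing
// it. Returns the semaphore value the parent sees afterwards.
int demo_sem_undo() {
    int id = create_binary_semaphore();
    pid_t pid = fork();
    assert(pid >= 0);
    if (pid == 0) {
        sysv_lock(id);
        _exit(0);  // no unlock; the kernel applies the undo
    }
    waitpid(pid, nullptr, 0);
    int value = semctl(id, 0, GETVAL);
    semctl(id, 0, IPC_RMID);  // remove the semaphore set
    return value;
}
```

If the undo works as described in man semop, demo_sem_undo() returns 1, i.e. the semaphore is available again after the child has gone. Note that this only restores the count; it says nothing about the state of the data the semaphore was protecting.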

There are a number of named synchronization primitives in 
Boost.Interprocess, although I don't think they provide an "auto unlock on 
process termination" feature.
...
Anyway, I'd like to suggest that the interprocess docs should
make some mention of the behaviour of the synchronisation
primitives on process termination, e.g. somewhere near the
beginning of
http://www.boost.org/doc/libs/1_63_0/doc/html/interprocess/synchronization_m...
I may now try to implement some primitives that use semop() and
unlock automatically.  I haven't yet looked at what's involved to
implement a condition variable on top of a semaphore, so I may not
get very far!  Has anyone else ever tried this?

If you want (more or less) reliable interprocess synchronization, you 
will currently have to implement it yourself. There are a number of 
compromises to make along the way. For instance, the pthread robust mutex 
API does not quite fit into the traditional C++ mutex API, so one has to 
improvise. In the absence of robust mutexes, the timeout workaround is 
not universally applicable, and the timeout itself is, obviously, 
case-specific. Also, most of these APIs are not fully portable (not 
between Windows and POSIX-compatible systems, anyway), so you end up 
with OS-specific branches.

I did implement this in a few of my projects. One example is Boost.Log, 
where I opportunistically use robust mutexes:

https://github.com/boostorg/log/blob/develop/src/posix/ipc_sync_wrappers.hpp
https://github.com/boostorg/log/blob/develop/src/posix/ipc_reliable_message_...

You can see that the Windows implementation is quite different:

https://github.com/boostorg/log/blob/develop/src/windows/ipc_sync_wrappers.h...
https://github.com/boostorg/log/blob/develop/src/windows/ipc_sync_wrappers.c...
https://github.com/boostorg/log/blob/develop/src/windows/ipc_reliable_messag...

The best solution to these problems, however, is to avoid locks 
altogether and use lock-free algorithms in such a way that any data in 
the shared memory is valid and can be handled.
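One possible shape of such a lock-free design, for the "monitor" case of a single writer and read-only observers, is a sequence counter (seqlock). The sketch below is an assumption about how this could be done, not code from any of the libraries mentioned; SharedSample, Sample, publish and try_read are hypothetical names. It assumes the atomics are lock-free on the target platform, so that they can live in shared memory, and it uses the default sequentially consistent ordering for simplicity.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical record placed in shared memory: one writer, many readers.
struct SharedSample {
    std::atomic<std::uint32_t> seq{0};  // odd while a write is in progress
    std::atomic<std::int64_t> a{0};
    std::atomic<std::int64_t> b{0};
};

// Plain copy of the payload handed back to readers.
struct Sample {
    std::int64_t a;
    std::int64_t b;
};

// Writer side (single writer assumed): mark the record as being updated,
// store the new values, then mark it as stable again.
void publish(SharedSample& s, Sample v) {
    std::uint32_t seq = s.seq.load();
    s.seq.store(seq + 1);  // odd: update in progress
    s.a.store(v.a);
    s.b.store(v.b);
    s.seq.store(seq + 2);  // even: update complete
}

// Reader side: never blocks and never writes to shared memory, so a reader
// killed at any point leaves nothing behind. Returns false if the snapshot
// raced with a write; the caller simply retries later.
bool try_read(const SharedSample& s, Sample& out) {
    std::uint32_t before = s.seq.load();
    if (before & 1u) return false;  // write in progress
    out.a = s.a.load();
    out.b = s.b.load();
    return s.seq.load() == before;  // unchanged: snapshot is consistent
}
```

A monitor would call try_read() in a loop and print only successful snapshots, so pressing ctrl-C in it cannot block anyone. A writer that dies in the middle of publish() leaves the counter odd; readers then never succeed but also never hang, and detecting that situation still needs something like a timeout.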

Andrey Semashev