[interprocess] interprocess_condition blocks on notify_all if other process crashes
When two processes share memory, and one of them dies leaving a mutex in the locked state, the other process blocks on a notify using an interprocess_condition stored in shared memory.
This is described here: http://stackoverflow.com/questions/18240263/boost-interprocess-condition-blo...
The reason is that this condition generally uses an ipcdetail::spin_condition. Here is what it does in boost/interprocess/sync/interprocess_condition.hpp and boost/interprocess/sync/spin/condition.hpp:
    inline void spin_condition::notify(boost::uint32_t command)
    {
       //This mutex guarantees that no other thread can enter to the
       //do_timed_wait method logic, so that thread count will be
       //constant until the function writes a NOTIFY_ALL command.
       //It also guarantees that no other notification can be signaled
       //on this spin_condition before this one ends
       m_enter_mut.lock();
It locks an internal mutex, and if it is already locked and not unlocked by the crashed process, we are blocked.
A fix is to replace lock() by:
    if( ! m_enter_mut.timed_lock(delay) ) throw an_exception();
A cleaner fix would be to derive a subclass from spin_condition to implement something like:
    bool derived_spin_condition::timed_notify(boost::uint32_t, ptime delay)
What do you think? Thanks.
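The timed-lock idea proposed above can be sketched without Boost. This is only an illustration using the standard library, not the Boost.Interprocess implementation: spin_mutex_sketch, condition_sketch, notify_timeout and mark_locked_by_dead_owner are hypothetical names, and the counter merely stands in for writing the notify command.

```cpp
#include <atomic>
#include <chrono>
#include <stdexcept>
#include <thread>

// Hypothetical spin mutex with a timed lock; a stand-in for the internal
// mutex discussed above, NOT Boost code.
class spin_mutex_sketch {
public:
    // Try to take the lock until `timeout` has elapsed; false on timeout.
    bool timed_lock(std::chrono::milliseconds timeout) {
        const auto deadline = std::chrono::steady_clock::now() + timeout;
        bool expected = false;
        while (!locked_.compare_exchange_weak(expected, true,
                                              std::memory_order_acquire)) {
            expected = false;  // compare_exchange overwrote it on failure
            if (std::chrono::steady_clock::now() >= deadline) return false;
            std::this_thread::yield();
        }
        return true;
    }
    void unlock() { locked_.store(false, std::memory_order_release); }

    // Simulates an owner that died while holding the lock (illustration only).
    void mark_locked_by_dead_owner() {
        locked_.store(true, std::memory_order_release);
    }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical exception type for the throwing variant.
struct notify_timeout : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical condition-like class showing both proposed variants.
class condition_sketch {
public:
    // Variant 2 from the post: report the timeout through the return value.
    bool timed_notify(std::chrono::milliseconds timeout) {
        if (!enter_mut_.timed_lock(timeout)) return false;
        ++notify_count_;  // stand-in for writing the notify command
        enter_mut_.unlock();
        return true;
    }
    // Variant 1 from the post: throw instead of blocking forever.
    void notify(std::chrono::milliseconds timeout) {
        if (!timed_notify(timeout))
            throw notify_timeout("notify timed out: owner died holding the mutex?");
    }
    spin_mutex_sketch& enter_mutex() { return enter_mut_; }
    int notify_count() const { return notify_count_; }

private:
    spin_mutex_sketch enter_mut_;
    int notify_count_ = 0;
};
```

With the mutex free, timed_notify returns true; once the mutex is left locked by a simulated dead owner, timed_notify returns false after the timeout and notify throws, so the surviving process gets a chance to recover instead of hanging.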
On 21/10/2016 1:27, Remi Chateauneu wrote:
When two processes share memory, and one of them dies leaving a mutex in the locked state, the other process blocks on a notify using an interprocess_condition stored in shared memory.
This is described here: http://stackoverflow.com/questions/18240263/boost-interprocess-condition-blo...
The reason is that this condition generally uses an ipcdetail::spin_condition. Here is what it does in boost/interprocess/sync/interprocess_condition.hpp and boost/interprocess/sync/spin/condition.hpp:
    inline void spin_condition::notify(boost::uint32_t command)
    {
       //This mutex guarantees that no other thread can enter to the
       //do_timed_wait method logic, so that thread count will be
       //constant until the function writes a NOTIFY_ALL command.
       //It also guarantees that no other notification can be signaled
       //on this spin_condition before this one ends
       m_enter_mut.lock();
It locks an internal mutex, and if it is already locked and not unlocked by the crashed process, we are blocked.
A fix is to replace lock() by:
if( ! m_enter_mut.timed_lock(delay) ) throw an_exception();
A cleaner fix would be to derive a subclass from spin_condition to implement something like:
bool derived_spin_condition::timed_notify(boost::uint32_t, ptime delay)
What do you think? Thanks.
You can check the undocumented macro BOOST_INTERPROCESS_TIMEOUT_WHEN_LOCKING_DURATION_MS. If you define it to 10000, then after 10 s an exception should be thrown saying that a possible deadlock is happening.
Best,
Ion
participants (2)
- Ion Gaztañaga
- Remi Chateauneu