--- In Boost-Users@y..., "svanechel"
Hi,
I'm trying to create a mechanism that synchronizes access to shared data on a per-operation basis. The idea is that some operations (E.g. read operations) can proceed concurrently, while others (write operations) must block all other threads. I thought boost::condition would do the trick. Since I do not have that much experience in multithreaded programming it is entirely possible that I do not fully understand boost::condition or its allowable usage.
I tried to put operations in an "Operation Group". All operations belonging to the same group are allowed to execute concurrently, while all others are blocked (E.g. all read operations are in the same group). It is also possible that an operation doesn't specify an Operation Group, in which case a temporary group is assigned to the operation for each thread that calls it. Write operations are constructed this way.
First, note that this scheme is very prone to starvation problems. If an operation "type" acquires the "lock" and there are several threads that will also acquire this same "type" in a loop, it may very well be that the "lock" is never relinquished for the other "types" to acquire it. RW locks are a little tricky to implement because of this issue. Any way, to pinpoint your error:
class OperationLevelLockable { class AccessCounter; public: OperationLevelLockable(){}
typedef Internal::DefaultTMIntType IntType;
template<typename T> struct MakeSharedType { typedef T volatile Type; };
class Lock; friend class Lock;
class Lock { public: Lock( OperationLevelLockable& obj ) : mCounter( obj.mOpAccessCounter ) { mCounter.EnterOperation( Lock::TempOpGroup() ); }
Lock( OperationLevelLockable& obj, int32 opGroup ) : mCounter( obj.mOpAccessCounter ) { mCounter.EnterOperation( opGroup ); }
~Lock() { mCounter.LeaveOperation(); }
private: Lock( Lock const& ); Lock& operator=( Lock const& );
static IntType TempOpGroup() { static volatile IntType opGroup = Internal::kFIRST_AUTO_GROUP;
IntType newGroup = OperationLevelLockable::AtomicIncrement( opGroup );
if ( newGroup >= Internal::kLAST_AUTO_GROUP ) { OperationLevelLockable::AtomicAssign( opGroup, Internal::kFIRST_AUTO_GROUP ); }
return ( newGroup ); }
AccessCounter& mCounter; };
static inline IntType AtomicIncrement( volatile IntType& lval ); static inline IntType AtomicDecrement( volatile IntType& lval ); static inline void AtomicAssign( IntType& lval, IntType rVal ); static inline void AtomicAssign( volatile IntType& lval, IntType rVal );
private:
class AccessCounter : private boost::base_from_memberboost::recursive_mutex { public:
AccessCounter() : mLock( member ), mRefCnt( 0 ),
ScopedLock objects shouldn't be used in this way. They should really only be constructed in method implementations. By constructing it here in the constructor for AccessCounter you've given it a lifetime equal to that of AccessCounter, and this is actually the cause of the deadlock. The associated mutex is locked for the lifetime of the AccessCounter, which causes the operations below to not behave the way you think they do.
mOpGroup( Internal::kANY_GROUP ) {}
bool IsOpGroupActive( IntType opGroup ) const { return ( ( Internal::kANY_GROUP == mOpGroup ) || ( opGroup == mOpGroup ) ); }
void LeaveOperation() { if ( 0 == OperationLevelLockable::AtomicDecrement( mRefCnt ) ) { SetOpGroup( Internal::kANY_GROUP ); mCanOpProceed.notify_all(); } }
The problem here is that the lock is never released, so even though you've notified all other threads that we're leaving, they can't acquire the lock below!
void EnterOperation( IntType opGroup ) { mCanOpProceed.wait( mLock, CanOpProceed( *this, opGroup ) ); SetOpGroup( opGroup ); OperationLevelLockable::AtomicIncrement( mRefCnt ); }
Here, since the lock was never released, any threads that are notified can't return from the wait() operations, because they can't reacquire the mutex lock! Study the Monitor example in Boost.Threads and note how the scoped_lock<> templates are used. You need to construct a scoped_lock<> not in the constructor of the AccessCounter class, but in both the LeaveOperation and EnterOperation methods. BTW, the use of atomic operations is very suspect here. Unless I'm missing something there's no need for it. The member variables are private and will be modified only when the mutex is locked, so there's no need for atomic operations and they just add overhead.
private:
AccessCounter( AccessCounter const& ); AccessCounter& operator=( AccessCounter const& );
struct CanOpProceed { CanOpProceed( AccessCounter const& counter, IntType opGroup ) : mOpGroup( opGroup ), mCounter( counter ) {}
bool operator()() const { return ( mCounter.IsOpGroupActive( mOpGroup ) ); }
private: CanOpProceed& operator=( CanOpProceed const& );
int32 mOpGroup; AccessCounter const& mCounter; };
void SetOpGroup( IntType opGroup ) { OperationLevelLockable::AtomicAssign( mOpGroup, opGroup ); }
boost::recursive_mutex::scoped_lock mLock; boost::condition mCanOpProceed; volatile IntType mRefCnt; volatile IntType mOpGroup; };
OperationLevelLockable( OperationLevelLockable const& ); OperationLevelLockable& operator=( OperationLevelLockable const& );
AccessCounter mOpAccessCounter; };
The idea is that the first operation that is executed acquires the lock and blocks out all threads not belonging to the same Operation Group. The original blocking works fine. Only threads executing in the same Operation Group are allowed to continue. When the last thread of the group leaves the operation in OperationLevelLockable::AccessCounter::LeaveOperation it releases all waiting threads and a thread from the other group is allowed to run. However when this thread finishes no new threads are allowed to run anymore and the application hangs.
Am I using boost::condition incorrectly. Should I use something else? It the test code wrong? Are there any other errors in the code that prevent it from behaving as expected. Should I use another technique to achieve the same?
You didn't use boost::condition incorrectly, you used the scoped_lock<> incorrectly. Fixing this, though, still leaves you with a questionable synchronization scheme that's prone to starvation.
I hope there is someone out there who has the hart to read through the entire posting and help me. I hope I have given enough information.
I hope my response was helpful. Bill Kempf