Re: boost::condition question

22 Feb 2002


      --- In Boost-Users@y..., "svanechel" <svanechel@y...> wrote:
...
Hi,
I'm trying to create a mechanism that synchronizes access to
shared 
data on a per-operation basis. The idea is that some operations 
(E.g. 
read operations) can proceed concurrently, while others (write 
operations) must block all other threads. I thought 
boost::condition 
would do the trick. Since I do not have that much experience in 
multithreaded programming it is entirely possible that I do not 
fully 
understand boost::condition or its allowable usage.
I tried to put operations in an "Operation Group". All
operations 
belonging to the same group are allowed to execute concurrently, 
while all others are blocked (E.g. all read operations are in the 
same group). It is also possible that an operation doesn't
specify an 
Operation Group, in which case a temporary group is assigned to the 
operation for each thread that calls it. Write operations are 
constructed this way.
First, note that this scheme is very prone to starvation problems.  
If an operation "type" acquires the "lock" and there are several 
threads that will also acquire this same "type" in a loop, it may 
very well be that the "lock" is never relinquished for the 
other "types" to acquire it.  RW locks are a little tricky to 
implement because of this issue.

Any way, to pinpoint your error:
...
class OperationLevelLockable
{
  class AccessCounter;
public:
  OperationLevelLockable(){}
typedef	Internal::DefaultTMIntType			
      	IntType;
template<typename T>
  struct MakeSharedType
  {
      typedef	T volatile				
      				Type;
  };
class Lock;
  friend class Lock;
class Lock
  {
  public:
      Lock( OperationLevelLockable& obj ) :
      	mCounter( obj.mOpAccessCounter )
      {
         mCounter.EnterOperation( Lock::TempOpGroup() );
      }
Lock( OperationLevelLockable& obj, int32 opGroup ) :
      	mCounter( obj.mOpAccessCounter )
      {
         mCounter.EnterOperation( opGroup );
      }
~Lock()
      {
      	mCounter.LeaveOperation();
      }
private:
      Lock( Lock const& );
      Lock& operator=( Lock const& );
static IntType TempOpGroup()
      {
         static volatile IntType opGroup =
                                      Internal::kFIRST_AUTO_GROUP;
IntType newGroup = 
OperationLevelLockable::AtomicIncrement( opGroup );
if ( newGroup >= Internal::kLAST_AUTO_GROUP )
         {
           OperationLevelLockable::AtomicAssign( opGroup, 
Internal::kFIRST_AUTO_GROUP );
         }
return ( newGroup );
      }
AccessCounter&	mCounter;
  };
static inline IntType AtomicIncrement( volatile IntType& lval );
static inline IntType AtomicDecrement( volatile IntType& lval );
static inline void AtomicAssign( IntType& lval, IntType rVal );
static inline void AtomicAssign( volatile IntType& lval, IntType 
rVal );
private:
class AccessCounter :
      private 
boost::base_from_member<boost::recursive_mutex>
  {
  public:
AccessCounter() :
      	mLock( member ), mRefCnt( 0 ),
ScopedLock objects shouldn't be used in this way.  They should really 
only be constructed in method implementations.  By constructing it 
here in the constructor for AccessCounter you've given it a lifetime 
equal to that of AccessCounter, and this is actually the cause of the 
deadlock.  The associated mutex is locked for the lifetime of the 
AccessCounter, which causes the operations below to not behave the 
way you think they do.
...
mOpGroup( Internal::kANY_GROUP )  {}
bool IsOpGroupActive( IntType opGroup ) const
      {
         return ( ( Internal::kANY_GROUP == mOpGroup ) ||
                            ( opGroup == mOpGroup ) );
      }
void LeaveOperation()
      {
      	if ( 0 == 
OperationLevelLockable::AtomicDecrement( mRefCnt ) )
      	{
      	   SetOpGroup( Internal::kANY_GROUP );
      	   mCanOpProceed.notify_all();
      	}
      }
The problem here is that the lock is never released, so even though 
you've notified all other threads that we're leaving, they can't 
acquire the lock below!
...
void EnterOperation( IntType opGroup )
      {
         mCanOpProceed.wait( mLock,
                                       CanOpProceed( *this,
                                                     opGroup ) );
         SetOpGroup( opGroup );
         OperationLevelLockable::AtomicIncrement( mRefCnt );
      }
Here, since the lock was never released, any threads that are 
notified can't return from the wait() operations, because they can't 
reacquire the mutex lock!  Study the Monitor example in Boost.Threads 
and note how the scoped_lock<> templates are used.  You need to 
construct a scoped_lock<> not in the constructor of the AccessCounter 
class, but in both the LeaveOperation and EnterOperation methods.

BTW, the use of atomic operations is very suspect here.  Unless I'm 
missing something there's no need for it.  The member variables are 
private and will be modified only when the mutex is locked, so 
there's no need for atomic operations and they just add overhead.
...
private:
AccessCounter( AccessCounter const& );
      AccessCounter& operator=( AccessCounter const& );
struct CanOpProceed
      {
      	CanOpProceed( AccessCounter const& counter,
                                      IntType opGroup ) :
      		mOpGroup( opGroup ),
                                mCounter( counter ) {}
bool operator()() const
      	{
      	  return ( 
                             mCounter.IsOpGroupActive( mOpGroup )
                          );
      	}
private:		
      	CanOpProceed& operator=( CanOpProceed 
const& );
int32 mOpGroup;
      AccessCounter const& mCounter;
      };
void SetOpGroup( IntType opGroup )
      {
       OperationLevelLockable::AtomicAssign( mOpGroup,
                                                       opGroup );
      }
boost::recursive_mutex::scoped_lock  mLock;
  boost::condition		     mCanOpProceed;
  volatile IntType                     mRefCnt;
  volatile IntType                     mOpGroup;
  };
OperationLevelLockable( OperationLevelLockable const& );
  OperationLevelLockable& operator=( OperationLevelLockable 
const& );
AccessCounter  mOpAccessCounter;
};
The idea is that the first operation that is executed acquires the 
lock and blocks out all threads not belonging to the same Operation 
Group. The original blocking works fine. Only threads executing in 
the same Operation Group are allowed to continue. When the last 
thread of the group leaves the operation in 
OperationLevelLockable::AccessCounter::LeaveOperation it releases 
all 
waiting threads and a thread from the other group is allowed to 
run. 
However when this thread finishes no new threads are allowed to run 
anymore and the application hangs.
Am I using boost::condition incorrectly. Should I use something 
else? 
It the test code wrong? Are there any other errors in the code that 
prevent it from behaving as expected. Should I use another 
technique 
to achieve the same?
You didn't use boost::condition incorrectly, you used the 
scoped_lock<> incorrectly.  Fixing this, though, still leaves you 
with a questionable synchronization scheme that's prone to starvation.
...
I hope there is someone out there who has the hart to read through 
the entire posting and help me. I hope I have given enough 
information.
I hope my response was helpful.

Bill Kempf

Re: boost::condition question

bill_kempf