[thread] shared_spinlock_t idea or faster shared_lock
Hi,

This is just a refining of an idea that was once implemented and worked well. The idea is a shared spin lock. Here is some pseudo-code:

    class shared_spin_lock {
        atomic_max_uint_t state_;

        lock_shared() {
            do {    // CAS loop
                auto snapshot = state_.load();
                if (snapshot.has_writer_bit_set()) {
                    reader_wait();  // waiting for the writer to finish
                    continue;
                }
                auto new_state = snapshot + 1;
            } while (!CAS(state_, snapshot, new_state));
        }

        unlock_shared() {
            --state_;
        }

        lock() {
            do {    // CAS loop
                auto snapshot = state_.load();
                if (snapshot.has_writer_bit_set()) {
                    writer_wait();  // waiting for the other writer to finish
                    continue;
                }
                auto new_state = snapshot.set_writer_bit();
            } while (!CAS(state_, snapshot, new_state));

            // We have set the writer bit; now wait for readers to finish.
            while (state_.readers_count)
                ;   // busy loop, can we do better?

            writer_started();
        }

        unlock() {
            auto snapshot = state_.reset_writer_bit();
            writer_finished();
            ASSERT(snapshot.has_writer_bit_set());
        }
    };

When all the *_wait() and writer_*() functions do nothing, we get a shared spinlock.

Now, if we add a helper mutex and make all the *_wait() functions lock and then unlock that mutex, make writer_started() lock the mutex, and writer_finished() unlock it... we get a simplified shared_mutex that is (in theory) faster than the current implementation.

Is there any interest in such a shared_spinlock_t? Any comments, suggestions?

--
Best regards,
Antony Polukhin
On May 2, 2014 1:37:46 AM EDT, Antony Polukhin wrote:
This is just a refining of an idea that was once implemented and worked well. The idea is a shared spin lock. Here is some pseudo-code:

[snip]

    lock() {
        do {    // CAS loop
            auto snapshot = state_.load();
            if (snapshot.has_writer_bit_set()) {
                writer_wait();  // waiting for the other writer to finish
                continue;
            }
            auto new_state = snapshot.set_writer_bit();
        } while (!CAS(state_, snapshot, new_state));
Why wouldn't you set the writer bit once you own the mutex? As it is, you release the mutex and then use CAS to try to set it, but no other writer can set it while you hold the mutex.
    // We have set the writer bit; now wait for readers to finish.
    while (state_.readers_count)
        ;   // busy loop, can we do better?
Run the busy loop a limited number of times before adding a yield or sleep instruction, then repeat.
[snip]
Now, if we add a helper mutex and make all the *_wait() functions lock and then unlock that mutex, make writer_started() lock the mutex, and writer_finished() unlock it... we get a simplified shared_mutex that is (in theory) faster than the current implementation.
I haven't seen the code to which you're comparing this implementation, so I have no idea whether it is faster.

___
Rob
(Sent from my portable computation engine)
    // We have set the writer bit; now wait for readers to finish.
    while (state_.readers_count)
        ;   // busy loop, can we do better?
Run the busy loop a limited number of times before adding a yield or sleep instruction, then repeat.
On Intel CPUs one should use a PAUSE instruction, as in _mm_pause(), inside the busy loop.

tim
On Friday 02 May 2014 09:37:46 Antony Polukhin wrote:
[snip]
Is there any interest in such a shared_spinlock_t? Any comments, suggestions?
There is a shared_spin_mutex in Boost.Sync: boost/sync/mutexes/shared_spin_mutex.hpp
On Fri, May 2, 2014 at 6:37 AM, Antony Polukhin wrote:
Hi,
This is just a refining of an idea that was once implemented and worked well. The idea is a shared spin lock. Here is some pseudo-code:
When all the *_wait() and writer_*() functions do nothing we get a shared spinlock.
[snip]
Now, if we add a helper mutex and make all the *_wait() functions lock and then unlock that mutex, make writer_started() lock the mutex, and writer_finished() unlock it... we get a simplified shared_mutex that is (in theory) faster than the current implementation.
Is there any interest in such a shared_spinlock_t? Any comments, suggestions?
Not sure. You generally want to use a spinlock when you expect contention to be very low; if you expect contention high enough that a r/w lock could be a win, you probably do not want to spin anyway. There is also the general issue that in naive r/w locks all readers contend for an exclusive cache line.

--
gpd
2014-05-02 13:26 GMT+04:00 Giovanni Piero Deretta wrote:
[snip]
Not sure, you generally want to use a spinlock when you expect contention to be very low. If you expect contention to be high enough that a r/w lock could be a win, you probably do not want to spin anyway.
Read/write locks are not cheap (at least in Boost.Thread, because of interruption support). For cases where there is high read contention and the critical section is *very* small, shared spinlocks could show better performance.

--
Best regards,
Antony Polukhin
On 2 May 2014 at 9:37, Antony Polukhin wrote:
This is just a refining of an idea that was once implemented and worked well. The idea is a shared spin lock. Here is some pseudo-code:

[snip]
Is there any interest in such a shared_spinlock_t? Any comments, suggestions?
Two points:

1. I have a TSX-capable, policy-based, compile-time-configurable spin lock in AFIO which might be useful. It really ought to enter Boost.Detail. It should work in shared memory.

2. Later today, hopefully, I'll ask this list for review of a new Boost.Thread object, the permit. The underlying POSIX permit, written in C, works in shared memory (the C++ object not so).

Niall

--
Currently unemployed and looking for work in Ireland.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
participants (6)
- Andrey Semashev
- Antony Polukhin
- Giovanni Piero Deretta
- Niall Douglas
- Rob Stewart
- Tim Blechmann