On 3/04/2020 08:41, Jan Hafer wrote:
On 02.04.20 20:59, Mathias Gaunard wrote:
So you're saying that circular_buffer is slower on a given thread when other threads are accessing their own circular_buffer in parallel? That sounds unlikely to be circular buffer's fault.
Yes, and I don't quite know the reason for it. My threads know their id to access a file-global data structure containing their queue/circular buffer. They start another one afterwards in a thread-safe way and exit once their queue/circular buffer is empty.
That sounds like you're allocating a single array of circular_buffers and then accessing them from different threads. That's basically the worst possible thing to do; as Mathias was saying, that will end up sharing cache lines between different cores and your performance will tank.

At minimum, you should embed the circular_buffer into another struct that has sizeof() >= std::hardware_destructive_interference_size, and make an array of that. But better still, embed the circular_buffer into your processing classes and don't have arrays of them at all. (If you're pre-C++17 and don't have std::hardware_destructive_interference_size, then 64 works for most modern platforms.)

Ideally, the circular_buffer implementation itself should also separate all internal producer-thread members from consumer-thread members by std::hardware_destructive_interference_size and try very hard not to cross over. (Here the main thing that matters is write accesses.)

If you want to try a more modern circular buffer that gets this right, have a look at Boost.Lockfree's spsc_queue:
https://www.boost.org/doc/libs/1_72_0/doc/html/boost/lockfree/spsc_queue.htm...

(While you're there, there's also an MPMC queue. Note that a lockfree queue will tend to be slower than std::queue in uncontended benchmarks, but the avoidance of locks can be superior in highly contended or otherwise specialised scenarios.)
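As an illustration (not from the original thread; the struct name, constant name and capacity are made up), a minimal sketch of the padding approach might look like this, assuming C++17 and falling back to 64 where the interference-size constant isn't shipped:

    #include <boost/circular_buffer.hpp>
    #include <cstddef>
    #include <new>       // std::hardware_destructive_interference_size (C++17)
    #include <vector>

    // Fall back to 64 bytes on standard libraries that don't provide the constant.
    #if defined(__cpp_lib_hardware_interference_size)
    constexpr std::size_t kSlotAlign = std::hardware_destructive_interference_size;
    #else
    constexpr std::size_t kSlotAlign = 64;
    #endif

    // Hypothetical per-thread slot: aligning (and thereby padding) each slot keeps
    // two threads' circular_buffer control members out of the same cache line.
    struct alignas(kSlotAlign) PerThreadQueue {
        boost::circular_buffer<int> buffer = boost::circular_buffer<int>(1024);
    };

    int main() {
        const std::size_t num_threads = 4;                // illustrative thread count
        std::vector<PerThreadQueue> queues(num_threads);  // each slot starts on its own cache line
    }

(Note that circular_buffer's element storage is heap-allocated, so the padding only separates the buffers' control members in the array, which is what matters for the problem described above.)

And a sketch of the spsc_queue alternative, using its documented push/pop interface with an assumed capacity and element count:

    #include <boost/lockfree/spsc_queue.hpp>
    #include <thread>

    int main() {
        // Wait-free single-producer/single-consumer queue with a compile-time capacity.
        boost::lockfree::spsc_queue<int, boost::lockfree::capacity<1024>> queue;

        std::thread producer([&] {
            for (int i = 0; i < 100; ++i)
                while (!queue.push(i)) { }   // push returns false while the queue is full
        });

        std::thread consumer([&] {
            int value = 0;
            int received = 0;
            while (received < 100)
                if (queue.pop(value))        // pop returns false when the queue is empty
                    ++received;
        });

        producer.join();
        consumer.join();
    }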