On Fri, Mar 13, 2009 at 2:32 PM, Rutger ter Borg wrote:
Dear all,
I have been testing asio's io_service in a threadpool setup for job dispatching. However, it seems as if adding threads doesn't improve performance; perhaps even the opposite, with a single thread giving the best performance.
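For concreteness, I'm assuming a thread-pool setup roughly like the sketch below (my own reconstruction, not the code actually benchmarked; the job function, job count and thread count here are placeholders):

#include <boost/asio.hpp>
#include <boost/bind.hpp>
#include <boost/scoped_ptr.hpp>
#include <boost/thread.hpp>
#include <cmath>

// Hypothetical small job standing in for the dispatched work.
void job(int i)
{
    volatile double x = std::sin(i * 0.001);
    (void)x;
}

// Free function avoids the run()/run(error_code&) overload ambiguity in bind.
void run_service(boost::asio::io_service* io)
{
    io->run();
}

int main()
{
    boost::asio::io_service io;
    boost::scoped_ptr<boost::asio::io_service::work> work(
        new boost::asio::io_service::work(io));  // keep run() alive while posting

    // Thread pool: every thread calls run() and pulls handlers from the
    // io_service's single shared handler queue.
    const int num_threads = 4;
    boost::thread_group pool;
    for (int t = 0; t < num_threads; ++t)
        pool.create_thread(boost::bind(&run_service, &io));

    // Dispatch a large number of small jobs.
    for (int i = 0; i < 1000000; ++i)
        io.post(boost::bind(&job, i));

    work.reset();      // let run() return once the queue drains
    pool.join_all();
    return 0;
}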
Someone more familiar with the implementation could comment, but just from poking around the source it appears there is a single queue of handlers shared by all threads, so right there I'd expect a lot of lock contention between the threads on that one queue.

I tried translating this example to Intel's TBB library and start to see concurrency effects as I move up beyond 8 threads on my quad-core box (using parallel_for with a blocked_range that results in a single call to f per task; rough sketch in the P.S. below). Increasing the amount of work done per task (by increasing the grain size of the blocked_range passed to parallel_for) speeds up the run time greatly, presumably because of the reduced number of tasks and reduced context switching.

I'm guessing that asio's io_service isn't really geared towards making effective use of multi-core CPUs when you're trying to schedule a large number of small computational tasks; I'll go out on a limb and say that this *wasn't* the intent of the library (as the name somewhat implies).

Not sure if that was helpful, but it let me play around with TBB, which seems very nice.

Cheers,
Oliver
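P.S. For reference, the TBB version I tried was roughly along these lines (again a sketch, not the exact code; the work function f, problem size and grain size are placeholders). The grainsize argument to blocked_range is what controls how much work each task gets:

#include <cmath>
#include <cstddef>
#include <vector>
#include <tbb/blocked_range.h>
#include <tbb/parallel_for.h>
#include <tbb/task_scheduler_init.h>

// Hypothetical small unit of work standing in for the per-task function f.
inline double f(double x) { return std::sin(x) * std::cos(x); }

struct apply_f
{
    std::vector<double>& out;
    explicit apply_f(std::vector<double>& o) : out(o) {}
    void operator()(const tbb::blocked_range<std::size_t>& r) const
    {
        for (std::size_t i = r.begin(); i != r.end(); ++i)
            out[i] = f(static_cast<double>(i));
    }
};

int main()
{
    tbb::task_scheduler_init init;   // needed by older TBB; defaults to one worker per core
    const std::size_t n = 1000000;
    std::vector<double> results(n);

    // grainsize == 1 approximates "one call to f per task" (many tiny tasks,
    // lots of scheduling overhead); a larger grainsize batches calls to f
    // into each task and cuts the per-task overhead.
    const std::size_t grainsize = 1000;
    tbb::parallel_for(tbb::blocked_range<std::size_t>(0, n, grainsize),
                      apply_f(results));
    return 0;
}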