
I did a quick hack, and the code using fibers is 2-3 times faster than the code using threads. Boost.Fiber does not contain the suggested optimizations (like replacing STL containers).
I'd be disappointed if the overhead imposed by Boost.Fiber were only 2-3 times smaller than for kernel threads. I'd expect it to impose at least 10 times, if not 15-20 times, less overhead than kernel threads (at least those are the numbers we're seeing from HPX).
Like any C++ code, Boost.Fiber probably makes many malloc calls per context switch. It adds up.
I don't think that a context switch requires any memory allocation. All you do is flush the registers to the current stack, flip the stack pointer, and load the registers from the new stack.
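To make that concrete, here is a minimal sketch using the POSIX `ucontext` calls rather than Boost.Fiber or Boost.Context themselves. The names `run_switches` and `fiber_body` are made up for illustration. The only allocation is the fiber's stack, done once up front; each `swapcontext` call just saves one register set and loads another. Note that glibc's `swapcontext` also saves and restores the signal mask, so it is slower than a hand-written switch and should not be read as a benchmark of any particular library.

```cpp
#include <cassert>
#include <ucontext.h>
#include <vector>

// Hypothetical names for illustration only; not part of Boost.Fiber.
static ucontext_t main_ctx;
static ucontext_t fiber_ctx;
static int counter = 0;

// Body of the fiber: bump a counter, then switch back to the caller.
static void fiber_body() {
    for (;;) {
        ++counter;
        swapcontext(&fiber_ctx, &main_ctx);  // save our registers, load main's
    }
}

// Switch into the fiber n times and return how often it ran.
int run_switches(int n) {
    assert(n >= 0);
    std::vector<char> stack(64 * 1024);      // the one and only allocation

    getcontext(&fiber_ctx);
    fiber_ctx.uc_stack.ss_sp = stack.data();
    fiber_ctx.uc_stack.ss_size = stack.size();
    fiber_ctx.uc_link = &main_ctx;
    makecontext(&fiber_ctx, fiber_body, 0);

    counter = 0;
    for (int i = 0; i < n; ++i) {
        swapcontext(&main_ctx, &fiber_ctx);  // no allocation on this path
    }
    return counter;
}
```

Calling `run_switches(1000)` performs 2000 switches (1000 in, 1000 back) against a single stack allocation, which is the point of the argument above.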
If I ever had a willing employer, I could get clang to spit out far more malloc-optimal C++ at the cost of a new ABI, but I never could get an employer to bite.
Sorry for sidestepping, but are you sure compilers do memory allocation as part of conforming to ABIs? I always assumed memory allocation is done only when explicitly requested by user code.
I think coming within 50% of the performance of Windows Fibers would be more than plenty. After all Boost.Fiber "does more" than Windows Fibers.
It might be sufficient for you, but not for everybody else. It wouldn't be sufficient for us, for instance. If you build systems relying on fine-grained parallelism, then efficiently implemented fibers are the only way to go. If you need to create billions of threads (fibers), then every microsecond of overhead counts a billion-fold.

Regards
Hartmut
---------------
http://boost-spirit.com
http://stellar.cct.lsu.edu
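As a back-of-the-envelope illustration of the "billion-fold" point, the following sketch just multiplies a per-fiber overhead by a fiber count. The function name `total_overhead_seconds` and the figures in the usage note are assumptions chosen for the arithmetic, not measurements of Boost.Fiber, HPX, or kernel threads.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: total overhead in seconds for a given number of
// fibers, each paying the same fixed overhead in microseconds.
double total_overhead_seconds(std::uint64_t fibers, double overhead_us_per_fiber) {
    assert(overhead_us_per_fiber >= 0.0);
    return static_cast<double>(fibers) * overhead_us_per_fiber / 1e6;
}
```

With one billion fibers, 1 microsecond of overhead per fiber comes to 1000 seconds of accumulated overhead (summed over all cores), and each further microsecond adds another 1000 seconds.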