On Mon, Dec 22, 2014 at 4:32 AM, Mathias Gaunard wrote:
Hi,
I do not know if I'll be doing a full review, but I have some comments from having skimmed through the code and documentation.
From a quick glance, quite a few of the supported algorithms are not easy to implement efficiently with the OpenCL programming model.
I think the documentation should say more about the implementation strategies of each algorithm, its complexity and how much it is parallelized.
Yes, implementing some algorithms in parallel can be much more challenging than implementing their sequential counterparts. But I don't think this challenge is insurmountable, and I think we've had fairly good success in Boost.Compute (though, as always, there is never-ending work to be done on performance optimization).

I'll work on improving the documentation for the algorithms as it relates to performance characteristics. For now the best resource would be the performance page [1] or running the performance benchmarks [2] manually on your system.

Thanks for the feedback!

-kyle

[1] http://kylelutz.github.io/compute/boost_compute/performance.html
[2] https://github.com/kylelutz/compute/tree/master/perf
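
As a rough illustration of why a parallel version can be harder than its sequential counterpart, here is a minimal host-side sketch of an inclusive prefix sum written two ways. It is plain C++ with no OpenCL involved, the function names `scan_sequential` and `scan_stepwise` are made up for this example, and it is not how Boost.Compute implements its scan. The second version follows the well-known Hillis-Steele formulation, chosen here only because it is short.

```cpp
#include <cstddef>
#include <vector>

// Sequential inclusive scan: each output depends on the previous one,
// so the loop as written cannot be split across independent work-items.
std::vector<int> scan_sequential(const std::vector<int>& in) {
    std::vector<int> out(in.size());
    int running = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        running += in[i];
        out[i] = running;
    }
    return out;
}

// Hillis-Steele style inclusive scan, simulated on the host.
// Within one pass every index i is independent of the others, so on a
// device each i could be handled by its own work-item. The outer loop
// runs about log2(n) passes, giving O(n log n) additions in total,
// compared with O(n) for the sequential loop above.
std::vector<int> scan_stepwise(const std::vector<int>& in) {
    std::vector<int> cur = in;
    std::vector<int> next(in.size());
    for (std::size_t offset = 1; offset < in.size(); offset *= 2) {
        for (std::size_t i = 0; i < in.size(); ++i) {
            next[i] = (i >= offset) ? cur[i] + cur[i - offset] : cur[i];
        }
        cur.swap(next);  // results of this pass feed the next pass
    }
    return cur;
}
```

Both functions return the same result for the same input. The trade-off is that the step-wise version does more total work and needs a second buffer in exchange for exposing parallelism, which is the kind of per-algorithm detail (strategy, complexity, degree of parallelism) the documentation request is about.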