Hi,
This is not a proper review, just some comments from a library user, based on
studying the documentation and writing a couple of simple programs using
Boost.Compute on Intel's and NVIDIA's OpenCL implementations (an i7 CPU and a
GTX970 GPU respectively). I used MSVC2013 for the tests.
Some of these comments I've shared previously on this list, but I'm repeating
them here for completeness.
I have some background in heterogeneous development and OpenCL in particular,
but I'm no expert. I found Boost.Compute easy to pick up, and it provides a nice
C++ layer with a good mix of low-level and high-level interfaces.
Comments in no particular order:
1) As a library user and one who's not a C++ guru by any means, the error
messages thrown up when passing incorrect parameters can be quite confusing.
For example, the first time I tried to use Boost.Compute I managed to omit the
output iterator parameter to transform(), so that the lambda expression ended up
in that position instead. This resulted in several very confusing errors inside
result_of.hpp, with no errors anywhere near my code or the Boost.Compute source.
It would be nice if the compilation could fail in a more informative manner.
2) I missed async versions of the algorithms, which would make it possible to
chain together multiple calls. Even when all the data sits on the compute
device, the overhead of waiting for each operation to finish before queuing the
next can make the compute gains completely irrelevant.
3) I think the relevant calls should have a non-throwing form returning an error
code, a la Boost.Asio.
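As a rough sketch of the overload pair I mean in point 3, assuming
std::error_code as a stand-in for boost::system::error_code: allocate_buffer is
a hypothetical placeholder for any call that can fail, not an actual
Boost.Compute function.

```cpp
#include <cstddef>
#include <system_error>

// Hypothetical operation standing in for any library call that can fail.
// Non-throwing form: failure is reported through ec, nothing is thrown.
std::size_t allocate_buffer(std::size_t bytes, std::error_code& ec) noexcept
{
    if (bytes == 0) {
        ec = std::make_error_code(std::errc::invalid_argument);
        return 0;
    }
    ec.clear();
    return bytes; // pretend this is a handle to the new buffer
}

// Throwing form, implemented on top of the non-throwing one so that both
// share a single code path.
std::size_t allocate_buffer(std::size_t bytes)
{
    std::error_code ec;
    const std::size_t result = allocate_buffer(bytes, ec);
    if (ec)
        throw std::system_error(ec, "allocate_buffer");
    return result;
}
```

The caller then picks the form that suits the surrounding code: the two-argument
overload where exceptions are unwanted, the one-argument overload elsewhere.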
4) Threading has been mentioned on this list. At the very least the
documentation should be updated to clearly state the thread safety of each
call/class, a la Boost.Asio. Beyond that, I would like to be able to share a
context between several threads, each of which could have its own queue and
device.
5) If thread safety or similarly important features require a compiled part, I
wouldn't mind Boost.Compute being non-header-only. If possible, though, making
the compiled part optional, as other Boost libraries do, would be great.
6) The tests, and preferably the performance tests as well, should have some way
of testing all devices across all platforms. Currently they seem to test only
the default device on the default platform. This is insufficient IMHO.
7) I'd like some way of defining user functions, either as lambda expressions,
as raw OpenCL-C code via make_function_from_source, or both, which can then be
used in lambda expressions for, say, transform(). Something like
function