Hi,

Here is my review of Boost.Compute:

1. What is your evaluation of the design?

The library is based upon OpenCL, a cross-platform, cross-device open standard that abstracts access to and provides a programming model for many-core vector co-processors such as GPUs. These co-processors are usually referred to as "devices".

The library provides a wrapper layer around the OpenCL C interface. It skips the standard OpenCL C++ wrapper, which I don't consider a problem because, except for the destructors, there is no added value in using that wrapper. In my opinion, Khronos should adopt Boost.Compute as their C++ layer for OpenCL. Boost.Compute provides compatibility with the OpenCL C interface through conversion operators that decay Boost.Compute types to their OpenCL C equivalents. This can be quite useful.

On top of this wrapper, Boost.Compute exhibits three core components:

* types to interact with and issue commands to devices: these follow OpenCL concepts but are not necessary if defaults are used
* means of managing memory (allocating, copying) on devices: this component also contains asynchronous operations, which I consider essential in a library that deals with co-processors
* a collection of parallel primitives and meta-functions with an STL interface: this component contains powerful iterators to combine containers and algorithms to implement more complex algorithms in an efficient manner

One thing I'm not clear about is how asynchrony is handled. Command queues are exposed, and issuing commands to different queues is one way to express concurrency. At the same time, copy_async returns a future, which is another way of exposing concurrency.
It is out of the scope of Boost.Compute to solve the challenges of asynchronous/concurrent operations; that is a different and difficult topic, not yet solved for C++ in general either. But at the least, the documentation should be more explicit about when commands are executed, which commands are synchronous, which are asynchronous, and what role the command_queue plays in this regard.

2. What is your evaluation of the implementation?

I did not evaluate the implementation in detail but looked at a few of the tricks Boost.Compute uses to generate kernels. The implementation of this part of the library is good and instructive.

3. What is your evaluation of the documentation?

The Boost.Compute documentation is of excellent quality. The recent addition of performance data is helpful. I could not find any documentation about fancy iterators; this should probably be added. Also, it would be great if my questions regarding asynchrony/concurrency could be addressed in the documentation.

4. What is your evaluation of the potential usefulness of the library?

Boost.Compute is extremely useful. With this library, a developer familiar with the STL can utilize the processing power of GPUs without any knowledge of vector co-processor programming. The documentation shows that for large vector sizes, some Boost.Compute algorithms outperform the STL by an order of magnitude.

5. Did you try to use the library? With what compiler? Did you have any problems?

I tried the unit tests on an 8x GeForce Titan system without any problems, and on an ARM Mali GPU, where some unit tests failed. I'll be working with the library author to fix the problems in these unit tests. I used gcc 4.8.2 for the tests on both the GeForce and the Mali.

6. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?

I reviewed the library in depth a few months ago and, for this review, reread the documentation as well as ran some unit tests.

7.
Are you knowledgeable about the problem domain?

My job involves working with both CUDA and OpenCL. Furthermore, I am the author of the Aura library [0], a similar, albeit lower-level, library for accelerator programming.

8. Do you think the library should be accepted as a Boost library?

I think the library should be accepted into Boost. The interface is simple and easy to understand for non-experts, and the benefits of using this library can be significant.

I'd like to add that Boost.Compute represents one level of abstraction for accelerator programming. I'd like the Boost community to keep an open mind when it comes to different levels of abstraction, either lower (e.g. my Aura library) or higher (e.g. VexCL). Libraries with different levels of abstraction can coexist, be compatible with one another, or even build upon one another.

Best Regards,
Sebastian Schaetz

[0] https://github.com/sschaetz/aura