Re: [boost] Going forward with Boost.SIMD

24 Apr 2013


Mathias Gaunard <mathias.gaunard@ens-lyon.org> writes:
...
Automatic parallelization will never beat code optimized by
experts. Experts program each type of parallelism by taking into
account its specificities.
That is hyperbole.  "Never" is a strong word.
...
An interesting point in favor of a library is also memory layout. A
C++ compiler cannot change the memory layout on its own to make it
more friendly to vectorize. By providing the right types and
primitives to the user, he is made aware of the issues at hand and
empowered with the ability to explicitly state how a given algorithm
is to be vectorized.
I agree that libraries to make data shaping easier are useful!
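To make the memory-layout point concrete, here is a minimal sketch of the usual array-of-structures versus structure-of-arrays reshaping. The names PointAoS, PointsSoA, to_soa and squared_norms are made up for illustration and are not taken from Boost.SIMD or any other library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Array-of-structures: x, y and z of one point sit next to each other,
// so a load of consecutive floats mixes the three fields.
struct PointAoS { float x, y, z; };

// Structure-of-arrays: each field is contiguous, so a load of
// consecutive floats picks up the same field of neighbouring points.
struct PointsSoA {
    std::vector<float> x, y, z;
};

// Hypothetical helper: reshape AoS data into SoA form.
PointsSoA to_soa(const std::vector<PointAoS>& in) {
    PointsSoA out;
    out.x.reserve(in.size());
    out.y.reserve(in.size());
    out.z.reserve(in.size());
    for (const PointAoS& p : in) {
        out.x.push_back(p.x);
        out.y.push_back(p.y);
        out.z.push_back(p.z);
    }
    return out;
}

// Unit-stride loop over the SoA form: every access is contiguous,
// which is the shape a vectorizer (or a SIMD library) wants to see.
std::vector<float> squared_norms(const PointsSoA& p) {
    assert(p.x.size() == p.y.size() && p.y.size() == p.z.size());
    std::vector<float> r(p.x.size());
    for (std::size_t i = 0; i < r.size(); ++i)
        r[i] = p.x[i] * p.x[i] + p.y[i] * p.y[i] + p.z[i] * p.z[i];
    return r;
}
```

The reshaping is a decision the programmer has to make in the source; the compiler is not free to change a struct's layout behind the user's back.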
...
...
For specialized operations like horizontal add, saturating arithmetic,
etc. we will need intrinsics or functions that will be necessarily
target-dependent.
The proposal suggests providing vectorized variants of all
mathematical functions in the C++ standard (the Boost.SIMD library
covers C99, TR1 and more). That's quite a lot of functions.
But not the special ones I mentioned.
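For readers unfamiliar with the two "special" operations named above, the following is a portable scalar reference sketch of their semantics. It uses std::array as a stand-in for a vector register; the function names are illustrative only and do not correspond to any particular library or intrinsic set.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Saturating add on one unsigned 8-bit lane: the result clamps at 255
// instead of wrapping around modulo 256.
std::uint8_t saturating_add_u8(std::uint8_t a, std::uint8_t b) {
    const unsigned sum = static_cast<unsigned>(a) + static_cast<unsigned>(b);
    return static_cast<std::uint8_t>(sum > 255u ? 255u : sum);
}

// Lane-wise saturating add over N lanes (a "vertical" operation).
template <std::size_t N>
std::array<std::uint8_t, N> saturating_add(const std::array<std::uint8_t, N>& a,
                                           const std::array<std::uint8_t, N>& b) {
    std::array<std::uint8_t, N> r{};
    for (std::size_t k = 0; k < N; ++k)
        r[k] = saturating_add_u8(a[k], b[k]);
    return r;
}

// Horizontal add: reduces across the lanes of a single vector, unlike
// the usual vertical operations that combine matching lanes of two vectors.
template <std::size_t N>
float horizontal_add(const std::array<float, N>& v) {
    float sum = 0.0f;
    for (std::size_t k = 0; k < N; ++k)
        sum += v[k];
    return sum;
}
```

Whether a given target has a single instruction for either operation, and for which lane types, varies from one instruction set to the next.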
...
Should all these functions be made compiler built-ins? That doesn't
sound like a very scalable and extensible approach.
I dunno, we do a lot of that here.
...
...
Vector masks fundamentally change the model.  They drastically affect
control flow.
Some processors have had predication at the scalar level for quite
some time. It hasn't drastically changed the way people program.
Scalar predication hasn't changed the way people program because
compilers do the if-conversion.  As it should be with vectors.
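As a small illustration of what if-conversion means, the sketch below shows a branchy scalar loop next to a hand-converted form in which the condition becomes a per-element mask and the result is chosen with a select. The function names are made up; a compiler would perform this transformation internally rather than in the source.

```cpp
#include <cstddef>
#include <vector>

// Branchy form: control flow depends on the data in each iteration.
void clamp_negative_branch(std::vector<float>& a) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] < 0.0f)
            a[i] = 0.0f;
    }
}

// If-converted form: the branch is replaced by a mask and a select.
// Every iteration now executes the same straight-line code, which is
// the shape that maps onto predicated or blended vector instructions.
void clamp_negative_select(std::vector<float>& a) {
    for (std::size_t i = 0; i < a.size(); ++i) {
        const bool mask = a[i] < 0.0f;
        a[i] = mask ? 0.0f : a[i];
    }
}
```

Both functions compute the same result; only the second has a loop body free of control flow.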
...
It is similar to doing two instructions in one (any instruction can
also do a blend for free), and optimizing those instructions done
separately into one is something that a compiler should be able to do
pretty well. It doesn't sound very unlike what a compiler must do for
VLIW codegen to me, but then I have little knowledge of compilers.
I have trouble seeing how one would use the SIMD library to make it
easier to write predicated vector code.  Can you sketch it out?
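One possible shape for such code, offered purely as a hypothetical sketch: the pack and mask types and the load, splat and masked_store functions below are invented for illustration and are NOT the Boost.SIMD interface. The scalar loop being expressed is "if (x[i] > t) y[i] = x[i] - t;".

```cpp
#include <array>
#include <cstddef>

// Hypothetical fixed-width vector and mask types, emulated with arrays.
template <typename T, std::size_t N>
struct pack { std::array<T, N> lane; };

template <std::size_t N>
struct mask { std::array<bool, N> lane; };

// Load N contiguous elements into a pack.
template <typename T, std::size_t N>
pack<T, N> load(const T* src) {
    pack<T, N> r{};
    for (std::size_t k = 0; k < N; ++k) r.lane[k] = src[k];
    return r;
}

// Broadcast one scalar to every lane.
template <typename T, std::size_t N>
pack<T, N> splat(T value) {
    pack<T, N> r{};
    for (std::size_t k = 0; k < N; ++k) r.lane[k] = value;
    return r;
}

template <typename T, std::size_t N>
pack<T, N> operator-(const pack<T, N>& a, const pack<T, N>& b) {
    pack<T, N> r{};
    for (std::size_t k = 0; k < N; ++k) r.lane[k] = a.lane[k] - b.lane[k];
    return r;
}

// Lane-wise comparison produces a mask instead of a single bool.
template <typename T, std::size_t N>
mask<N> operator>(const pack<T, N>& a, const pack<T, N>& b) {
    mask<N> m{};
    for (std::size_t k = 0; k < N; ++k) m.lane[k] = a.lane[k] > b.lane[k];
    return m;
}

// Masked store: only lanes whose mask bit is set are written to memory.
template <typename T, std::size_t N>
void masked_store(const mask<N>& m, const pack<T, N>& v, T* dst) {
    for (std::size_t k = 0; k < N; ++k)
        if (m.lane[k]) dst[k] = v.lane[k];
}

// One N-wide step of: if (x[i] > t) y[i] = x[i] - t;
template <std::size_t N>
void subtract_where_greater(const float* x, float t, float* y) {
    const pack<float, N> vx = load<float, N>(x);
    const pack<float, N> vt = splat<float, N>(t);
    masked_store(vx > vt, vx - vt, y);
}
```

In this style the user writes the mask and the masked store by hand, which is exactly the work that if-conversion would otherwise leave to the compiler.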
...
The fact that it is the library doesn't mean that the compiler
shouldn't perform on vector types the same optimizations that it does
on scalar ones.
Of course it will.  But the library user has already made the choice of
what to vectorize.  Many times it will be the right choice, but not
always.
...
While I can see the benefit of this feature for a compiler that wants
to generate SIMD for arbitrary code, dedicated SIMD code will not
depend on it so heavily that it cannot be covered by a couple of
additional functions.
Predication allows much more efficient vectorization of many common
idioms.  A SIMD library without support for it will miss those idioms
and the compiler auto-vectorizer will get better performance.
...
...
Longer vectors can also dramatically change the generated code.  It is
*not* simply a matter of using larger strips for stripmined loops.  One
often will want to vectorize different loops in a nest based on the
hardware's maximum vector length.
I don't see what the problem is here.
This is C++. You can write generic code for arbitrary vector
lengths. It is up to the user to use generative programming techniques
to make his code depend on this parameter and be portable. The library
tries to make this as easy as possible.
So the user has to write multiple versions of loop nests, potentially
one for each target architecture?  I don't see the advantage of this
approach.
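A minimal sketch of what "generic code for arbitrary vector lengths" can look like, assuming the width is exposed as a compile-time template parameter N (a hypothetical parameter chosen for this example, not a name from the library). The inner k-loop stands in for a single N-wide vector operation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// y += a * x, strip-mined by a compile-time width N.
template <std::size_t N>
void axpy(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const std::size_t n = x.size();
    const std::size_t main_end = n - n % N;  // last index covered by full strips

    // Full strips: each pass of the outer loop models one N-wide operation.
    for (std::size_t i = 0; i < main_end; i += N)
        for (std::size_t k = 0; k < N; ++k)
            y[i + k] += a * x[i + k];

    // Scalar remainder for the elements that do not fill a whole strip.
    for (std::size_t i = main_end; i < n; ++i)
        y[i] += a * x[i];
}
```

Instantiating axpy<4> or axpy<8> changes only the strip length. Choosing a different loop of a nest to vectorize depending on the maximum vector length, the concern raised above, is a structural change that a width parameter alone does not express.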
...
...
A library-based short vector model like the SIMD library is very
non-portable from a performance perspective.
From my experience, it is still fairly reliable. There are differences
in performance, but they're mostly due to differences in the hardware
capabilities at solving a particular application domain well.
Well yes, that's one of the main issues.

                            -David

dag＠cray.com