Can Boost.SIMD please be added to the review queue? Michael Caisse has volunteered to act as the review manager.

Boost.SIMD is an efficient modern C++ wrapper for SIMD instruction sets and computations that aims at providing a standard way to write code taking advantage of SIMD-enabled hardware.

Boost.SIMD repository https://github.com/NumScale/boost.simd
Documentation https://developer.numscale.com/boost.simd/documentation/develop/

Thanks in advance
Joel Falcou - CTO @ NumScale
On 2017-02-09 19:24, Joel FALCOU via Boost wrote:
Can Boost.SIMD please be added to the review queue? Michael Caisse has volunteered to act as the review manager.
Boost.SIMD is an efficient modern C++ wrapper for SIMD instruction sets and computations that aims at providing a standard way to write code taking advantage of SIMD-enabled hardware.
Boost.SIMD repository https://github.com/NumScale/boost.simd
Documentation https://developer.numscale.com/boost.simd/documentation/develop/
Thanks in advance
Joel Falcou - CTO @ NumScale
Hi,

nice work! I have been waiting to see this go into review. I would be happy to try it out on my code that currently uses compiler vectorization.

The documentation looks good. May I ask one or two things?

1. Is simd::pack a POD? I.e., can aligned memory be cast into a pack?

    float* aligned_memory = ...;
    simd::pack<float>* packed_memory = (simd::pack<float>*) aligned_memory;

(this would make my experiment a two-line change)

2. Is there a difference between the allocators in boost/align and what simd offers?

Best, Oswin
2. Is there a difference between the allocators in boost/align and what simd offers?
Boost.SIMD already makes use of Boost.Align for the alignment-aware allocator and pointer alignment functions (and a few other things, and Joel and Charly are adding to Boost.Align as necessary).

Also: Great. :-) I'm very excited to see Boost.SIMD enter the review queue.

Glen
On 09/02/2017 21:54, Oswin Krause via Boost wrote:
On 2017-02-09 19:24, Joel FALCOU via Boost wrote:
Can Boost.SIMD please be added to the review queue? Michael Caisse has volunteered to act as the review manager.
Boost.SIMD is an efficient modern C++ wrapper for SIMD instruction sets and computations that aims at providing a standard way to write code taking advantage of SIMD-enabled hardware.
Boost.SIMD repository https://github.com/NumScale/boost.simd
Documentation https://developer.numscale.com/boost.simd/documentation/develop/
Thanks in advance
Joel Falcou - CTO @ NumScale
Hi,
nice work! I have been waiting to see this go into review. I would be happy to try it out on my code that currently uses compiler vectorization.
Thanks for your interest.
The documentation looks good. May I ask one or two things?
1. Is simd::pack a POD? i.e. can aligned memory be cast into a pack?
    float* aligned_memory = ...;
    simd::pack<float>* packed_memory = (simd::pack<float>*) aligned_memory;
(this would make my experiment a two-line change)
It is. We advise using our load/aligned_load functions, however, so you can access more fine-tuned operations and we can guarantee that the proper intrinsics are emitted. We have had cases where such a reinterpret_cast led to subtle bugs or sub-optimal performance on some compiler/architecture combinations.
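For illustration, a minimal sketch of the recommended load-based approach, as opposed to the cast (the header paths and exact function spellings are assumptions based on the documentation linked above):

    #include <boost/simd/pack.hpp>
    #include <boost/simd/function/aligned_load.hpp>

    namespace bs = boost::simd;

    void consume(float const* aligned_memory)
    {
      using pack_t = bs::pack<float>;
      // Rather than (pack_t*) aligned_memory, let the library pick
      // the proper intrinsic for the target architecture:
      pack_t p = bs::aligned_load<pack_t>(aligned_memory);
      // ... compute with p ...
    }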
2. Is there a difference between the allocators in boost/align and what simd offers?
The ones we provide are just wrappers around Boost.Align so that their defaults match the SIMD alignment constraints of the current architecture, i.e. a boost::simd::allocator<float> will automatically align on the proper boundary for your CPU without you having to remember it. We tried to move as much of our alignment handling as we could to Boost.Align, where it belongs. The same goes for Boost.Predef, which contains the SIMD macro detection.
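As a sketch of what that looks like in practice (the allocator header path is my assumption):

    #include <vector>
    #include <boost/simd/memory/allocator.hpp>

    namespace bs = boost::simd;

    // Storage aligned for the current architecture's SIMD constraints,
    // without spelling out the boundary (16, 32, ...) by hand:
    std::vector<float, bs::allocator<float>> data(1024);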
On 02/10/2017 02:45 AM, Joel FALCOU via Boost wrote:
Thanks for your interest.
The documentation looks good. May I ask one or two things?
1. Is simd::pack a POD? i.e. can aligned memory be cast into a pack?
    float* aligned_memory = ...;
    simd::pack<float>* packed_memory = (simd::pack<float>*) aligned_memory;
(this would make my experiment a two-line change)
It is. We advise using our load/aligned_load functions, however, so you can access more fine-tuned operations and we can guarantee that the proper intrinsics are emitted. We have had cases where such a reinterpret_cast led to subtle bugs or sub-optimal performance on some compiler/architecture combinations.
Does Boost.SIMD support non-aligned loads/stores? Aligned data is ideal, but having the ability to load/store from addresses that aren't aligned (and emitting the appropriate instructions to do so safely, e.g. _mm_loadu_ps() instead of _mm_load_ps() with SSE) can be important for some applications as well. That would allow the use of SIMD without having to do an explicit realignment first. On some architectures (e.g. post-Nehalem x86 CPUs with SSE), the aligned/unaligned penalty is also very small.

The library looks very good in my cursory glance over the documentation; I'm looking forward to seeing it get a review.

Jason
On 10/02/2017 16:25, Jason Roehm via Boost wrote:
Does Boost.SIMD support non-aligned loads/stores? Aligned data is ideal, but having the ability to load/store from addresses that aren't aligned (and emitting the appropriate instructions to do so safely, e.g. _mm_loadu_ps() instead of _mm_load_ps() with SSE) can be important for some applications as well. That would allow the use of SIMD without having to do an explicit realignment first. On some architectures (e.g. post-Nehalem x86 CPUs with SSE), the aligned/unaligned penalty is also very small.
Yes, we have both aligned_load/aligned_store and load/store to handle all cases.
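A small sketch of the two families side by side (function names per the documentation; the header paths and the SSE instructions in the comments are assumptions):

    #include <boost/simd/pack.hpp>
    #include <boost/simd/function/aligned_load.hpp>
    #include <boost/simd/function/store.hpp>

    namespace bs = boost::simd;

    void copy_pack(float const* aligned_src, float* unaligned_dst)
    {
      using pack_t = bs::pack<float>;
      // Source known to satisfy the SIMD alignment: aligned variant
      // (e.g. movaps on SSE)
      pack_t a = bs::aligned_load<pack_t>(aligned_src);
      // Arbitrary destination: plain store emits the unaligned
      // instruction (e.g. movups on SSE)
      bs::store(a, unaligned_dst);
    }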
The library looks very good in my cursory glance over the documentation; I'm looking forward to seeing it get a review.
Thanks
2017-02-10 2:24 GMT+08:00 Joel FALCOU via Boost
Can Boost.SIMD please be added to the review queue? Michael Caisse has volunteered to act as the review manager.
Boost.SIMD is an efficient modern C++ wrapper for SIMD instruction sets and computations that aims at providing a standard way to write code taking advantage of SIMD-enabled hardware.
Boost.SIMD repository https://github.com/NumScale/boost.simd
Documentation https://developer.numscale.com/boost.simd/documentation/develop/
Question: why not provide load/store as member functions (like std::atomic)?

I just tried some simple examples (e.g. SAXPY) with Boost.SIMD, but I was a bit disappointed since it performs slower than the normal scalar code (/arch:AVX2, with BOOST_SIMD_ASSUME_SSE4_2) on VS2015. The assembly shows that VC already vectorizes the scalar code, and does better than the handwritten SIMD code. That said, I believe it's a useful library.
On 10/02/2017 11:18, TONGARI J via Boost wrote:
Question: why not provide load/store as member functions (like std::atomic)?
I'm fond of having everything in the API as free functions. If you construct a pack from a pointer, however, it assumes the pointer is aligned.
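In other words (a sketch; the pointer-constructor semantics are as described above, the header paths are assumptions):

    #include <boost/simd/pack.hpp>
    #include <boost/simd/function/load.hpp>

    namespace bs = boost::simd;

    void example(float const* aligned_ptr, float const* any_ptr)
    {
      bs::pack<float> p(aligned_ptr);               // constructor assumes alignment
      auto q = bs::load<bs::pack<float>>(any_ptr);  // free function, no alignment assumed
    }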
I just tried some simple examples (e.g. SAXPY) with Boost.SIMD, but I was a bit disappointed since it performs slower than the normal scalar code (/arch:AVX2, with BOOST_SIMD_ASSUME_SSE4_2) on VS2015. The assembly shows that VC already vectorizes the scalar code, and does better than the handwritten SIMD code.
SAXPY is a trivial case where the autovectorizer works very well, so I'm not surprised. I would suggest trying some more complex functions (like exp2, cosh or whatever) or revising the Boost.SIMD code you wrote :) Did you use raw loops or loop over one of our range adaptors? MSVC is also one compiler we fight hard to get good codegen from, because sometimes it is a bit icky :/

Best regards
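For reference, roughly what a hand-written pack loop for SAXPY looks like (a sketch assuming x and y are aligned, n is a multiple of the pack width, and that pack_t::static_size and the scalar-times-pack operator exist as named here):

    #include <cstddef>
    #include <boost/simd/pack.hpp>
    #include <boost/simd/function/aligned_load.hpp>
    #include <boost/simd/function/aligned_store.hpp>

    namespace bs = boost::simd;

    // y = a*x + y, processing one pack per iteration
    void saxpy(float a, float const* x, float* y, std::size_t n)
    {
      using pack_t = bs::pack<float>;
      for (std::size_t i = 0; i != n; i += pack_t::static_size)
      {
        pack_t vx = bs::aligned_load<pack_t>(x + i);
        pack_t vy = bs::aligned_load<pack_t>(y + i);
        bs::aligned_store(a * vx + vy, y + i);
      }
    }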
On 10 February 2017 at 10:18, TONGARI J via Boost
Question: why not provide load/store as member functions (like std::atomic)?
There is a similar library being worked on by people involved with the C++ standards committee, currently being reviewed by LEWG. Its interface is yet again different, but the lack of consistency with std::atomic was noted and will probably be addressed soon. I suggest following those efforts and aligning with them.
participants (6)

- Glen Fernandes
- Jason Roehm
- Joel FALCOU
- Mathias Gaunard
- Oswin Krause
- TONGARI J