On 10/02/2017 11:18, TONGARI J via Boost wrote:
Question: why not provide load/store as member functions (like std::atomic)?
I'm fond of having everything in the API as free functions. Note, however, that if you construct a pack from a pointer, it assumes the pointer is aligned.
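To make the free-function style concrete, here is a minimal sketch. `toy_pack`, `load` and `store` below are hypothetical names invented for illustration; they are not Boost.SIMD's types or functions, and the real library's signatures may differ. This toy version copies with `std::memcpy`, so unlike the pointer constructor mentioned above it makes no alignment assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a SIMD register type; NOT Boost.SIMD's pack.
struct toy_pack {
    static constexpr std::size_t size = 4;
    float lanes[size];
};

// Free-function load: reads toy_pack::size floats starting at p.
// memcpy is used, so p does not need any particular alignment.
inline toy_pack load(const float* p) {
    assert(p != nullptr);
    toy_pack r;
    std::memcpy(r.lanes, p, sizeof r.lanes);
    return r;
}

// Free-function store: writes the lanes of v back to memory at p.
inline void store(const toy_pack& v, float* p) {
    assert(p != nullptr);
    std::memcpy(p, v.lanes, sizeof v.lanes);
}
```

With this shape a call site reads `store(load(in), out);`, whereas a member-function design in the style of std::atomic would read something like `v.store(out)`.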
I just tried some simple examples (e.g. saxpy) with Boost.SIMD, but I was a bit disappointed: it runs slower than the plain scalar code (/arch:AVX2, with BOOST_SIMD_ASSUME_SSE4_2) on VS2015. The assembly shows that VC already vectorizes the scalar code, and does a better job than the handwritten SIMD code.
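For context, a scalar SAXPY of the kind being compared might look like the sketch below. The function name `saxpy_scalar` and the use of `std::vector` are assumptions for illustration; the original test code was not posted. A simple loop over contiguous floats like this is the sort of code an autovectorizer is generally able to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar SAXPY: y[i] = a * x[i] + y[i] for every element.
// Hypothetical baseline, not the code from the original message.
inline void saxpy_scalar(float a, const std::vector<float>& x, std::vector<float>& y) {
    assert(x.size() == y.size());
    const float* xp = x.data();
    float* yp = y.data();
    const std::size_t n = x.size();
    for (std::size_t i = 0; i < n; ++i) {
        yp[i] = a * xp[i] + yp[i];
    }
}
```

Timing this baseline against the handwritten SIMD version, built with the same compiler flags, is one way to reproduce the comparison described here.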
SAXPY is a trivial case where the autovectorizer works very well, so I'm not surprised. I would suggest trying some more complex functions (like exp2, cosh or whatever) or revising the Boost.SIMD code you wrote :) Did you use raw loops, or did you loop over one of our range adaptors? MSVC is also one of the compilers we have to fight hard to get good codegen from, because sometimes it is a bit icky :/

Best regards