Hi,

I'm implementing the safe_float library (in C++11) proposed for GSoC. I was planning to access FENV to detect unsafe operations on the native floating-point types, and to include custom pre- and post-conditions for checking the safety conditions of other floating-point types. The conditions come from policies passed as template parameters, so users can choose their own set of safety conditions to enforce.

One example of a condition that can be requested is that overflow to infinity never happens. An overflow to infinity can be detected either through the FE overflow flag or by observing that the addition of two finite numbers produced an infinite result.

I just found out today that clang++ does not support the pragma needed to use FENV yet:

warning: pragma STDC FENV_ACCESS ON is not supported

Googling a little, I found that gcc does not support it either.

How are C++11 Boost libraries dealing with these compiler issues? I have some options in mind:

- Implement everything using custom detection. This will work with every compiler, but it will also run slower for everyone.
- Find a compiler that has the support, implement the library using FENV, and assume compilers will support it by the time the library is finished and reviewed.
- Somehow detect that the FENV pragma was not processed and provide two implementations. I have no idea how to detect something like this.

Has anyone dealt with a similar problem already? Is there a common approach?

Best regards,
Damian Vicino
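For readers following the thread, the two detection routes described above (the FE overflow flag, and "two finite operands gave an infinite result") can be sketched roughly as follows. This is only an illustration, not the safe_float implementation: the function names are hypothetical, and the `volatile` temporaries are an assumed workaround to discourage constant folding on compilers that ignore `#pragma STDC FENV_ACCESS ON`. The standard gives no guarantee that the flag is observable in that situation, which is why the last function probes it at run time.

```cpp
#include <cfenv>
#include <cmath>
#include <limits>

// Hypothetical helper: adds two doubles and reports whether the FE_OVERFLOW
// flag was set by the addition. The volatile temporaries are an assumption
// meant to keep the addition at run time, between the two <cfenv> calls.
inline bool add_raises_overflow(double a, double b, double& result) {
    volatile double va = a;
    volatile double vb = b;
    std::feclearexcept(FE_OVERFLOW);
    volatile double sum = va + vb;
    const bool flagged = std::fetestexcept(FE_OVERFLOW) != 0;
    result = sum;
    return flagged;
}

// Flag-independent check: finite + finite produced a non-finite value.
inline bool add_overflows_to_infinity(double a, double b) {
    return std::isfinite(a) && std::isfinite(b) && !std::isfinite(a + b);
}

// Hypothetical run-time probe for the "two implementations" option: the flag
// path is treated as usable only if it reports a known overflow and stays
// silent for a harmless addition.
inline bool fenv_overflow_flag_usable() {
    double r = 0.0;
    const double big = std::numeric_limits<double>::max();
    return add_raises_overflow(big, big, r) && !add_raises_overflow(1.0, 2.0, r);
}
```

A library could call something like `fenv_overflow_flag_usable()` once and fall back to the `std::isfinite` comparison when it returns false. Whether such a probe is trustworthy under aggressive optimisation settings is an open question rather than a settled answer.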
On 6 May 2015 at 4:25, Damian Vicino wrote:
I was planning to access FENV for detecting unsafe operations on the native floating point types and include some custom pre-post conditions for testing other floating types safety conditions.
The only compilers I've ever known fenv to work right on are MSVC and ICC. But MSVC's FP support is extremely conservative by default, and a lot of people needing FP performance use /fp:fast, which completely eliminates fenv. I just looked up the docs, and even the MSVC default doesn't enable all of fenv. For that you need /fp:strict or even /fp:except.
I just found today that clang++ does not support the pragma to use FENV yet. warning: pragma STDC FENV_ACCESS ON is not supported Googling a little I found out that neither does gcc.
This doesn't surprise me. fenv is very hard to optimise well. Most FP users would prefer undefined behaviour to very slow performance. I doubt that either clang or GCC will ever fully implement fenv, due to insufficient demand.
How are c++11 boost libraries dealing with these compiler issues? I got some options in mind: — Implement everything using custom detections, this will work in every compiler, but it will run slower for everyone too. — Finding a compiler with the support, implementing the library using the FENV, and assuming compilers will support it when the library is finished and reviewed. — Detect somehow the FENV pragma was not processed and have 2 implementations. No idea how to detect something like this.
Had someone dealt with similar problem already? Is there a common approach?
Your mentor will surely advise, however I would recommend that your library perform bit checks of the floating point values for every operation. This will turn each single fp opcode into about eight opcodes, all of which stall the CPU for a good 30 cycles each. But correctness is more important than speed, and if the programmer is employing safe_float they surely really need correctness over performance.

I would also examine how ubsan implements its safe float checks to see if you can lift some ideas.

One x86 specific trick is to reinterpret SSE2 registers as integers for the bit checks, that way you don't force FP values back into memory and reload into GP registers every single operation. Performance might actually be tolerable. I would suspect you'll need to drop into assembler for that though, and MSVC doesn't permit inline assembler in x64.

I'd also loop David Bellot into this and ask what he thinks.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
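As an illustration of the per-operation bit check suggested above, here is a minimal portable sketch. It assumes `double` is IEEE 754 binary64 (enforced with a `static_assert`); the names `is_inf_or_nan_bits` and `checked_add` are hypothetical and are not taken from safe_float or ubsan.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(std::numeric_limits<double>::is_iec559 && sizeof(double) == 8,
              "sketch assumes IEEE 754 binary64 doubles");

// True when the 11-bit exponent field is all ones, i.e. the value is an
// infinity or a NaN. std::memcpy is used to read the bits without violating
// aliasing rules.
inline bool is_inf_or_nan_bits(double x) {
    std::uint64_t bits;
    std::memcpy(&bits, &x, sizeof bits);
    const std::uint64_t exponent_mask = 0x7FF0000000000000ULL;
    return (bits & exponent_mask) == exponent_mask;
}

// Hypothetical checked addition: stores the sum in 'out' and returns false
// when two finite operands produced an infinity or NaN.
inline bool checked_add(double a, double b, double& out) {
    out = a + b;
    const bool operands_finite = !is_inf_or_nan_bits(a) && !is_inf_or_nan_bits(b);
    return !(operands_finite && is_inf_or_nan_bits(out));
}
```

Whether the compiler keeps the value in a register or spills it for the `memcpy` is up to the optimiser, so this says nothing about the cycle counts mentioned above.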
On 05/06/2015 05:48 AM, Niall Douglas wrote:
One x86 specific trick is to reinterpret SSE2 registers as integers for the bit checks, that way you don't force FP values back into memory and reload into GP registers every single operation. Performance might actually be tolerable. I would suspect you'll need to drop into assembler for that though, and MSVC doesn't permit inline assembler in x64. I'd also loop David Bellot into this and ask what he thinks. Niall
You should be able to implement all of the SSE operations you need using intrinsics, which are well-supported on all recent x86 compilers. Granted, you don't get direct control over whether values get spilled from registers back to memory (as the compiler still maintains control over that), but it's a lot easier to implement than inline assembly (especially with MSVC as a requirement).

Intel has a great online reference of all intrinsics here:

https://software.intel.com/sites/landingpage/IntrinsicsGuide/

While it says that the list is for Intel C++, in practice, gcc/clang/MSVC are almost fully compatible with Intel's set of SSE/AVX/AVX2 intrinsics (and probably AVX-512, which is coming soon to real hardware).

Jason
On 6 May 2015 at 7:45, Jason Roehm wrote:
You should be able to implement all of the SSE operations you need using intrinsics, which are well-supported on all recent x86 compilers.
I had thought that MSVC does not permit reinterpret casting from FP to integer without a register store and reload. However, this Stack Overflow question

https://stackoverflow.com/questions/13631951/bitwise-cast-from-m128-to-m128i-on-msvc

says you can tell MSVC not to error out during the cast by using the magic _mm_castpd_si128 intrinsic. Useful to know. Thanks.

Niall
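A rough sketch of the intrinsic-based variant discussed here, for reference. It assumes an x86 target with SSE2 and falls back to a plain integer copy elsewhere; the function name and the `SKETCH_HAS_SSE2` macro are hypothetical. The intrinsics used (`_mm_set_sd`, `_mm_add_sd`, `_mm_castpd_si128`, `_mm_srli_epi64`, `_mm_cvtsi128_si32`) are all from `<emmintrin.h>`. As noted earlier in the thread, the compiler still decides whether anything is spilled to memory.

```cpp
#include <cstdint>
#include <limits>
#if defined(__SSE2__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#include <emmintrin.h>
#define SKETCH_HAS_SSE2 1
#else
#include <cstring>
#define SKETCH_HAS_SSE2 0
#endif

static_assert(std::numeric_limits<double>::is_iec559,
              "sketch assumes IEEE 754 doubles");

// Hypothetical helper: adds a and b and reports whether the exponent field
// of the sum is all ones (infinity or NaN).
inline bool sum_exponent_all_ones(double a, double b) {
#if SKETCH_HAS_SSE2
    const __m128d sum = _mm_add_sd(_mm_set_sd(a), _mm_set_sd(b));
    // Reinterpret the register as integers; no conversion is performed.
    const __m128i bits = _mm_castpd_si128(sum);
    // Shift each 64-bit lane right by 52 so sign and exponent sit in the low bits.
    const __m128i shifted = _mm_srli_epi64(bits, 52);
    const int low = _mm_cvtsi128_si32(shifted);
    return (low & 0x7FF) == 0x7FF;
#else
    // Portable fallback for non-SSE2 targets.
    const double sum = a + b;
    std::uint64_t bits;
    std::memcpy(&bits, &sum, sizeof bits);
    return ((bits >> 52) & 0x7FF) == 0x7FF;
#endif
}
```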
How are c++11 boost libraries dealing with these compiler issues? I got some options in mind: — Implement everything using custom detections, this will work in every compiler, but it will run slower for everyone too. — Finding a compiler with the support, implementing the library using the FENV, and assuming compilers will support it when the library is finished and reviewed. — Detect somehow the FENV pragma was not processed and have 2 implementations. No idea how to detect something like this.
Had someone dealt with similar problem already? Is there a common approach?
No, but you've had some good suggestions, and I would just add my +1
that FENV is unlikely to be very widely available due to the issues
outlined.
Some other resources that may help you:
* If you're targeting C++11 then you can reasonably rely on
std::isfinite and friends.
* If as suggested you want to try bit-fiddling to get the status of an
FP value, then that code is already in Boost.Math (see fpclassify.hpp
and includes thereof) and you can steal as required ;-) Obviously any
software solution will be *much* slower than getting the hardware to
raise an exception for you.
* When it comes to testing, libs/multiprecision/test/test_arithmetic.hpp
has a template function "test_arithmetic" that's designed to test every
conceivable arithmetic operator (and permutations thereof).
* In fact it occurs to me that you could implement what you had in mind
in about 10 minutes using Boost.Multiprecision - albeit in a more
heavyweight and less flexible manner than a dedicated solution.... in
fact I've attached sample code below. Given that we know that the
"abstraction overhead" of boost::multiprecision::number with
arithmetic_backend is very small, this might make a quick/easy/dirty way
to test various FP-testing methods?
HTH, John.
Here's the test code:
#include
Some other resources that may help you:
* If you're targeting C++11 then you can reasonably rely on std::isfinite and friends.
This is how I implemented the first draft version. The idea was to have something better for native types and to fall back to this implementation when using multiprecision or other floating-point implementations.
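To make the policy idea concrete, a generic `isfinite`-based fallback might look roughly like the sketch below. All names (`safe_float_sketch`, `throw_on_overflow_to_infinity`, `no_check`, `check_addition`) are hypothetical and are not the actual safe_float interface; the unqualified `isfinite` call is an assumption that lets non-native types supply their own overload through argument-dependent lookup.

```cpp
#include <cmath>
#include <stdexcept>

// Hypothetical policy: reject additions where finite operands give a
// non-finite result.
struct throw_on_overflow_to_infinity {
    template <class T>
    static void check_addition(const T& lhs, const T& rhs, const T& result) {
        using std::isfinite;  // native types; other types are found via ADL
        if (isfinite(lhs) && isfinite(rhs) && !isfinite(result))
            throw std::overflow_error("addition overflowed to infinity");
    }
};

// Hypothetical policy that performs no checking at all.
struct no_check {
    template <class T>
    static void check_addition(const T&, const T&, const T&) {}
};

// Hypothetical wrapper: the policy is a template parameter, as described
// in the original post.
template <class Float, class Policy = throw_on_overflow_to_infinity>
class safe_float_sketch {
public:
    explicit safe_float_sketch(Float v) : value_(v) {}
    Float value() const { return value_; }

    friend safe_float_sketch operator+(const safe_float_sketch& a,
                                       const safe_float_sketch& b) {
        const Float r = a.value_ + b.value_;
        Policy::check_addition(a.value_, b.value_, r);  // post-condition
        return safe_float_sketch(r);
    }

private:
    Float value_;
};
```

A faster check for native types could then be provided as another policy or as a specialisation, leaving this generic version as the fallback.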
* If as suggested you want to try bit-fiddling to get the status of an FP value, then that code is already in Boost.Math (see fpclassify.hpp and includes thereof) and you can steal as required ;-) Obviously any software solution will be *much* slower than getting the hardware to raise an exception for you.
Good to know.
You should be able to implement all of the SSE operations you need using intrinsics, which are well-supported on all recent x86 compilers.
I never worked with intrinsics. They look cool, I will give them a read and see what I can use.
participants (4)
- Damian Vicino
- Jason Roehm
- John Maddock
- Niall Douglas