Matt Hurd wrote:
IEEE 16-bit (fp16) and bfloat16 are both around, but bfloat16 seems to be the new leader in modern implementations thanks to ML use. I haven't seen both used together, but I wouldn't rule it out, given that bfloat16 may be accelerator-specific. Google and Intel have support for bfloat16 in some hardware. bfloat16 makes it easy to move to fp32, as they have the same exponent size.
Refs: https://en.wikipedia.org/wiki/Bfloat16_floating-point_format https://nickhigham.wordpress.com/2018/12/03/half-precision-arithmetic-fp16-v...
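On the point about sharing fp32's exponent size: widening a bfloat16 to a float really is just a 16-bit shift, since the sign and the 8-bit exponent land in the right places. Roughly like this (just a sketch; the function name is mine):

    #include <cstdint>
    #include <cstring>

    // Widen a bfloat16 bit pattern to an IEEE binary32 value.
    // bfloat16 is the top 16 bits of a binary32 (same sign and
    // 8-bit exponent fields), so widening is a shift plus a bit copy.
    float bfloat16_to_float(std::uint16_t b)
    {
        std::uint32_t bits = static_cast<std::uint32_t>(b) << 16;
        float f;
        std::memcpy(&f, &bits, sizeof f);
        return f;
    }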
According to section 4.1.2 of this ARM document: http://infocenter.arm.com/help/topic/com.arm.doc.ihi0053d/IHI0053D_acle_2_1.... implementations support both the IEEE format (1 sign bit, 5 exponent bits and 10 mantissa bits) and an alternative format, which is similar except that it drops Inf and NaN in exchange for slightly more range. Apparently the bfloat16 format is supported in ARMv8.6-A, but I don't believe that is deployed anywhere yet.

The other place where I've used 16-bit floats is in OpenGL textures (https://www.khronos.org/registry/OpenGL/extensions/OES/OES_texture_float.txt), which use the 1-5-10 format. I was a bit surprised by the 1-5-10 choice; the maximum representable value is only (2 - 2^-10) x 2^15 = 65504, i.e. less than the maximum value of an unsigned integer of the same size (65535).

bfloat16 can be trivially implemented (as a storage-only type) simply by truncating a 32-bit float; perhaps support for that would be useful too?

Regards,
Phil.
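P.S. Here's roughly what I mean by a storage-only truncation; again just a sketch (the name is mine, and a real conversion would probably want to round rather than truncate):

    #include <cstdint>
    #include <cstring>

    // Narrow an IEEE binary32 value to a bfloat16 bit pattern by
    // truncation: keep the sign, the full 8-bit exponent and the top
    // 7 mantissa bits, and drop the rest. Truncation is the cheapest
    // option for a storage-only type; converting back is the 16-bit
    // shift shown earlier.
    std::uint16_t float_to_bfloat16_trunc(float f)
    {
        std::uint32_t bits;
        std::memcpy(&bits, &f, sizeof bits);
        return static_cast<std::uint16_t>(bits >> 16);
    }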