I would like to further investigate your sqrt implementation.
It seems the link to the whitepaper for sqrt has broken over the years.
I found it again here:>> http://www.realitypixels.com/turk/computergraphics/FixedSqrt.pdf OK. Thank you. I will try to bench the iterative sqrt algorithm for,> let's say, 32-bit signed fixed-point on a 32-bit microcontroller.> If I get any sensible results compared with our polynomial> approximation, I will forward them to the thread. I have benchmarked the iterative shift-and-add sqrt functioncompared with the polynomial approximation used inthe proposed Boost.Fixed_point. The benchmark usedbare-metal embedded microcontrollers.
I used a 32-bit microcontroller at 24MHz and also an 8-bitmicrocontroller at 16MHz. I used an argument of 6/10 forthe sqrt function. The fixed_point type used was:boost::fixed_point::negatable<7, -24>. Timing measurementshave been done with a digital ascilloscope. GCC 5.2 and 5.3have been used.
On 32-bit microcontroller:
* polynomial approx : 6 micro-sec, sqrt(6/10) = 0.7745966
* Shift-and-add :15 micro-sec, sqrt(6/10) = 0.7745967
On 8-bit microcontroller:
* polynomial approx. : 180us* Shift-and-add : 450us
Here, the unit [us] means microseconds.
The benchmark code is available at develop branch here:
https://github.com/BoostGSoC15/fixed_point/blob/develop/example/fixed_point_...
Best regards, Chris.
On Sunday, October 16, 2016 9:20 PM, Michael Marcin
I would like to further investigate your sqrt implementation.
It seems the link to the whitepaper for sqrt has broken over the years. I swear archive.org is the only thing keeping us from another dark age.
I found it again here: OK. Thank you. I will try to bench the iterative sqrt algorithm for,let's say, 32-bit signed fixed-point on a 32-bit microcontroller.If I get any sensible results compared with our polynomialapproximation, I will forward them to the thread. One disadvantage of polynomial approximation is digit lossdue to argument reduction. Iterative and CORDIC seem tonot suffer from this, and cam be good options. Thanks for the suggestions.
Two things of note: I'm pretty sure that sqrt is missing a bit of code to handle an odd number of fractional bits (I never used any types that had these but is probably important for a fully generic lib). I'm pretty sure there is a low hanging fruit optimization opportunity to use CLZ to skip a bunch of iterations before starting the loop. _______________________________________________ Unsubscribe & other changes: http://lists.boost.org/mailman/listinfo.cgi/boost