Hi,
My code, when run as a 2048-point FFT, agrees with the MIT FFTW one to a max error margin of 6 parts per million.
This might be of interest: I've ported FFTW's arbitrary-precision FFT to boost::multiprecision https://github.com/neapel/vexcl-fft-accuracy/blob/master/multi_precision_fft..., in case you want to compare the numerical accuracy of your implementation with FFTW's, as in http://www.fftw.org/accuracy/Pentium4-3.60GHz-icc/. Comparing against a high-precision reference is more meaningful than comparing two low-precision results.

Generally, I think it would be better if a boost::fft library were primarily a wrapper around existing FFT libraries, with the C++ implementation used only as a fallback for multiprecision types or where licensing is an issue. It's unlikely that a template implementation would catch up with the years of optimization work that went into single- and double-precision libraries like MKL and FFTW, especially FFTW's kernel generator and planner.

But a reasonably fast (and open) multidimensional multiprecision FFT implementation doesn't seem to exist yet.

Cheers,
-- 
Pascal Germroth
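
P.S. To illustrate what comparing against a higher-precision reference could look like, here is a minimal, self-contained sketch. It is not the code from the repository linked above. Everything in it is an assumption made for illustration: `long double` merely stands in for a true multiprecision type, the reference is a naive O(n^2) DFT, the transform under test is a plain recursive radix-2 FFT in `float`, and the metric is the relative L2 error. The names `reference_dft`, `fft_radix2`, `relative_l2_error` and `float_fft_error` are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Naive O(n^2) forward DFT evaluated in precision T. Slow, but simple
// enough to act as a reference when T is wider than the type under test.
template <typename T>
std::vector<std::complex<T>> reference_dft(const std::vector<std::complex<T>>& in) {
    const std::size_t n = in.size();
    const T two_pi = 2 * std::acos(T(-1));
    std::vector<std::complex<T>> out(n);
    for (std::size_t k = 0; k < n; ++k) {
        std::complex<T> sum(0, 0);
        for (std::size_t j = 0; j < n; ++j) {
            // Reduce j*k modulo n so the angle stays small and accurate.
            const T angle = -two_pi * T((j * k) % n) / T(n);
            sum += in[j] * std::complex<T>(std::cos(angle), std::sin(angle));
        }
        out[k] = sum;
    }
    return out;
}

// Recursive radix-2 decimation-in-time FFT in precision T.
// The input length must be a power of two.
template <typename T>
std::vector<std::complex<T>> fft_radix2(const std::vector<std::complex<T>>& in) {
    const std::size_t n = in.size();
    assert(n != 0 && (n & (n - 1)) == 0);
    if (n == 1) return in;
    std::vector<std::complex<T>> even(n / 2), odd(n / 2);
    for (std::size_t i = 0; i < n / 2; ++i) {
        even[i] = in[2 * i];
        odd[i] = in[2 * i + 1];
    }
    even = fft_radix2(even);
    odd = fft_radix2(odd);
    const T two_pi = 2 * std::acos(T(-1));
    std::vector<std::complex<T>> out(n);
    for (std::size_t k = 0; k < n / 2; ++k) {
        const T angle = -two_pi * T(k) / T(n);
        const std::complex<T> t =
            std::complex<T>(std::cos(angle), std::sin(angle)) * odd[k];
        out[k] = even[k] + t;
        out[k + n / 2] = even[k] - t;
    }
    return out;
}

// Relative L2 error ||test - ref||_2 / ||ref||_2, accumulated in the wide type.
template <typename Lo, typename Hi>
Hi relative_l2_error(const std::vector<std::complex<Lo>>& test,
                     const std::vector<std::complex<Hi>>& ref) {
    assert(test.size() == ref.size());
    Hi num = 0, den = 0;
    for (std::size_t i = 0; i < ref.size(); ++i) {
        const std::complex<Hi> t(test[i].real(), test[i].imag());
        num += std::norm(t - ref[i]);
        den += std::norm(ref[i]);
    }
    return std::sqrt(num / den);
}

// Feed the same deterministic input to both transforms and return the error
// of the float FFT relative to the long double reference.
inline long double float_fft_error(std::size_t n) {
    std::vector<std::complex<float>> lo(n);
    std::vector<std::complex<long double>> hi(n);
    unsigned long long state = 12345;
    // Small linear congruential generator producing floats in [-0.5, 0.5).
    auto next = [&state]() -> float {
        state = state * 6364136223846793005ULL + 1442695040888963407ULL;
        return static_cast<float>(static_cast<double>(state >> 40) / 16777216.0 - 0.5);
    };
    for (std::size_t i = 0; i < n; ++i) {
        // Values are rounded to float first, so both inputs are identical.
        const float re = next();
        const float im = next();
        lo[i] = std::complex<float>(re, im);
        hi[i] = std::complex<long double>(re, im);
    }
    return relative_l2_error(fft_radix2(lo), reference_dft(hi));
}
```

Calling `float_fft_error(2048)` would give a single accuracy figure for a 2048-point single-precision transform. Swapping the reference type for a real multiprecision type would make the reference error negligible by comparison, which is the point of measuring against it rather than against another low-precision result.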