But not on ARM or PowerPC, says the embedded guy. Both GCC and Clang do support __builtin_popcount() across all processors, though, which is what we are using these days. I did notice that one library calls out to OpenCL for popcount(). I was honestly thinking about the same sort of approach, though entirely compile-time based. But I wasn't sure what you would do with a signed value, as the builtin treats everything as unsigned. Do you mask out the sign bit, just cast it, or consider it an error to take a bit count of a signed value?
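To make the "just cast it" option concrete, here is a minimal sketch. It assumes a hypothetical helper named popcount_any (not an existing Boost or compiler API) and uses a portable loop where the builtin could be swapped in. Converting a negative value to the same-width unsigned type is well defined and keeps the two's-complement bit pattern, so the sign bit is simply counted like any other bit.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical helper, for illustration only: counts set bits of any
// integer type by converting to the same-width unsigned type first.
template <typename T>
int popcount_any(T value)
{
    static_assert(std::is_integral<T>::value, "integer types only");
    typedef typename std::make_unsigned<T>::type U;

    // The cast to U happens at the original width, so an int8_t of -1
    // becomes 0xFF (8 bits set), not a sign-extended 64-bit value.
    std::uint64_t bits = static_cast<U>(value);

    int count = 0;
    while (bits != 0) {
        bits &= bits - 1;   // clear the lowest set bit
        ++count;
    }
    return count;
}
```

With this choice popcount_any<std::int8_t>(-1) gives 8 and popcount_any<std::int32_t>(-1) gives 32. Masking the sign bit or rejecting signed types with a static_assert would be equally easy to express; which one is right is a policy decision.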
On Aug 22, 2018, at 10:08 PM, Gavin Lambert via Boost wrote:
On 23/08/2018 09:16, Andrey Semashev wrote:
I think such an optimization would be useful. Note that MSVC also has intrinsics for popcount[1], although I don't think those are supported when the target CPU doesn't implement the corresponding instructions. You would have to check at compile time whether the target CPU supports it (e.g. by checking if __AVX__ is defined).
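The compile-time check described above might look roughly like the following. This is only a sketch: the function name popcount64_ct is made up for illustration, and using __AVX__ as a stand-in for POPCNT on MSVC is the heuristic suggested in the message, not something the compiler guarantees.

```cpp
#include <cstdint>

#if defined(__GNUC__) || defined(__clang__)
// GCC and Clang: the builtin is always available. The compiler emits the
// hardware instruction when the target is known to have one (for example
// with -mpopcnt on x86) and a software sequence otherwise.
inline int popcount64_ct(std::uint64_t v)
{
    return __builtin_popcountll(v);
}
#elif defined(_MSC_VER) && defined(_M_X64) && defined(__AVX__)
#include <intrin.h>
// MSVC x64 built with /arch:AVX or higher: assume POPCNT is present.
inline int popcount64_ct(std::uint64_t v)
{
    return static_cast<int>(__popcnt64(v));
}
#else
// Anything else: portable fallback, one loop iteration per set bit.
inline int popcount64_ct(std::uint64_t v)
{
    int count = 0;
    for (; v != 0; v &= v - 1) {
        ++count;
    }
    return count;
}
#endif
```

Because the choice is made by the preprocessor, every branch is a plain inline function and the call can be folded into the caller.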
Compile-time detection is better if you can do it (because it lets the call be completely inlined), but if compile-time detection fails you can still do runtime detection, e.g. by defining something like:
// header file
extern int (*popcnt64)(uint64_t);

// source file
static bool is_popcnt_supported()
{
    int info[4] = { 0 };
    __cpuid(info, 1);                     // MSVC intrinsic from <intrin.h>
    return (info[2] & 0x00800000) != 0;   // ECX bit 23 = POPCNT
}

static int popcnt64_intrinsic(uint64_t v)
{
    return /* _mm_popcnt_u64(v) or __builtin_popcountll(v) */;
}

static int popcnt64_emulation(uint64_t v)
{
    // code that calculates it with bit twiddling
}

// First call: pick the implementation, overwrite the pointer, then forward.
static int popcnt64_auto(uint64_t v)
{
    popcnt64 = is_popcnt_supported() ? &popcnt64_intrinsic : &popcnt64_emulation;
    return popcnt64(v);
}

int (*popcnt64)(uint64_t) = &popcnt64_auto;
Repeat for other argument sizes as needed. You could probably do something fancier with C++11 guaranteed static initialisation, but this will work on all compilers.
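One possible reading of the "C++11 guaranteed static initialisation" idea is a function-local static, which C++11 initialises exactly once even with concurrent callers. The sketch below is an assumption about what was meant, not code from the thread. The GCC/Clang branch uses __get_cpuid from <cpuid.h> and the target("popcnt") attribute, the MSVC branch uses __cpuid and __popcnt64, and everything else falls back to emulation.

```cpp
#include <cstdint>

// Portable SWAR bit-twiddling fallback.
static int popcnt64_emulation(std::uint64_t v)
{
    v = v - ((v >> 1) & 0x5555555555555555ULL);
    v = (v & 0x3333333333333333ULL) + ((v >> 2) & 0x3333333333333333ULL);
    v = (v + (v >> 4)) & 0x0F0F0F0F0F0F0F0FULL;
    return static_cast<int>((v * 0x0101010101010101ULL) >> 56);
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

static bool is_popcnt_supported()
{
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        return false;
    }
    return (ecx & (1u << 23)) != 0;   // ECX bit 23 = POPCNT
}

// The target attribute lets this one function use POPCNT even when the
// rest of the translation unit is built without -mpopcnt.
__attribute__((target("popcnt")))
static int popcnt64_intrinsic(std::uint64_t v)
{
    return __builtin_popcountll(v);
}
#elif defined(_MSC_VER) && defined(_M_X64)
#include <intrin.h>

static bool is_popcnt_supported()
{
    int info[4] = { 0 };
    __cpuid(info, 1);
    return (info[2] & 0x00800000) != 0;
}

static int popcnt64_intrinsic(std::uint64_t v)
{
    return static_cast<int>(__popcnt64(v));
}
#else
// No runtime detection on this target in this sketch: always emulate.
static bool is_popcnt_supported() { return false; }
static int popcnt64_intrinsic(std::uint64_t v) { return popcnt64_emulation(v); }
#endif

typedef int (*popcnt64_fn)(std::uint64_t);

int popcnt64(std::uint64_t v)
{
    // Initialised on the first call only; C++11 makes this thread-safe.
    static const popcnt64_fn impl =
        is_popcnt_supported() ? &popcnt64_intrinsic : &popcnt64_emulation;
    return impl(v);
}
```

The trade-off against the self-replacing pointer is that each call goes through the compiler's initialisation guard check, in exchange for not having a writable global function pointer.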