Can SIMD in Boost.Unordered be disabled without disabling SIMD for the full compile?
Hi,

I have some small maps where I am pretty sure the SIMD tricks that are amazing in general will just slow down lookups, since I never look up items that are not in the map, and the maps are so small that the benefit of not reaching into the big array and thrashing the cache is minuscule.

I know I can force the compiler to not use SIMD with -march, but I want to keep the SIMD optimizations for the rest of the code, including large Boost.Unordered objects.

If you do not believe me I can write a custom benchmark; from what I see, all benchmarks in the docs start at 1E+4.

P.S. I know I can just use an array, but I am in that "size range" where iteration becomes slower than a hash map.

regards,
Ivan
On 10/03/2024 at 22:40, Ivan Matek via Boost-users wrote:
Hi, I have some small maps where I am pretty sure the SIMD tricks that are amazing in general will just slow down lookups, since I never look up items that are not in the map, and the maps are so small that the benefit of not reaching into the big array and thrashing the cache is minuscule. I know I can force the compiler to not use SIMD with -march, but I want to keep the SIMD optimizations for the rest of the code, including large Boost.Unordered objects.
Hi Ivan,

SIMD support in Boost.Unordered can be disabled by globally defining the macro BOOST_UNORDERED_DISABLE_SSE2, but this is probably not a solution for you because:

* It disables SSE2 for *all* flat containers in Boost.Unordered.
* The alternative implementation to SSE2 is likely to be slower anyway.

May I suggest that you use boost::unordered_map instead of boost::unordered_flat_map for those small maps and see if this improves the performance of your program?

Best,
Joaquín M López Muñoz
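One way to try the suggestion above without touching every call site is to route the small and large maps through two type aliases, so that switching the container for the small maps is a one-line change. The following is only a sketch: the alias names small_map and large_map, the helper lookup_present and the macro SKETCH_HAVE_BOOST_UNORDERED are made up for illustration, and the fallback to std::unordered_map exists only so that the snippet still compiles where Boost is not installed.

```cpp
#include <cassert>
#include <string>

// Detect Boost.Unordered's flat containers; fall back to the standard
// library so that this sketch also compiles without Boost.
#if defined(__has_include)
#  if __has_include(<boost/unordered/unordered_flat_map.hpp>)
#    define SKETCH_HAVE_BOOST_UNORDERED 1
#  endif
#endif

#if defined(SKETCH_HAVE_BOOST_UNORDERED)
#include <boost/unordered/unordered_flat_map.hpp>
#include <boost/unordered/unordered_map.hpp>

// Large tables keep the open-addressing flat container.
template <class K, class V>
using large_map = boost::unordered_flat_map<K, V>;

// Small tables use the node-based container, as suggested in the thread.
// Change this single alias to compare alternatives in a benchmark.
template <class K, class V>
using small_map = boost::unordered_map<K, V>;
#else
#include <unordered_map>

template <class K, class V>
using large_map = std::unordered_map<K, V>;

template <class K, class V>
using small_map = std::unordered_map<K, V>;
#endif

// Hypothetical helper for the "key is always present" use case.
template <class Map, class Key>
const typename Map::mapped_type& lookup_present(const Map& m, const Key& k) {
    auto it = m.find(k);
    assert(it != m.end() && "key must already be in the map");
    return it->second;
}
```

Whether this actually helps for a given size range can only be settled by measuring; the aliases just make that comparison cheap to set up.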
On Mon, Mar 11, 2024 at 9:12 AM Joaquin M López Muñoz <joaquinlopezmunoz@gmail.com> wrote:
May I suggest that you use boost::unordered_map instead of boost::unordered_flat_map for those small maps and see if this improves the performance of your program?

Thank you for the suggestion. I was hoping to keep all the elements sequential in memory (a general preference to make efficient use of the cache, hard to microbenchmark), so I will probably just stick to linear search or try my luck with flat_map (which in my experience has quite bad performance, but might work fine for this case).

regards,
Ivan
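For reference, the linear-search option mentioned above could look roughly like the sketch below. It is not taken from Boost or from the code discussed in this thread: small_linear_map and at_present are hypothetical names, and the class only assumes that keys are equality-comparable. All pairs live in one std::vector, so the elements stay contiguous in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, not part of Boost: a tiny map that stores its
// key/value pairs contiguously and finds keys with a linear scan.
template <class Key, class Value>
class small_linear_map {
public:
    // Inserts the pair if the key is absent; returns true when inserted.
    // Note: inserting may reallocate and invalidate earlier pointers.
    bool insert(const Key& key, const Value& value) {
        if (find(key) != nullptr) return false;
        items_.emplace_back(key, value);
        return true;
    }

    // Returns a pointer to the mapped value, or nullptr if the key is absent.
    Value* find(const Key& key) {
        for (auto& item : items_) {
            if (item.first == key) return &item.second;
        }
        return nullptr;
    }

    // Lookup for keys that are known to be present.
    Value& at_present(const Key& key) {
        Value* p = find(key);
        assert(p != nullptr && "key must already be in the map");
        return *p;
    }

    std::size_t size() const { return items_.size(); }

private:
    std::vector<std::pair<Key, Value>> items_;
};
```

A sorted-vector container such as flat_map replaces the linear scan with a binary search over the same kind of contiguous storage; which of the two wins at a given size is something only a benchmark on the real keys can tell.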
Participants (2): Ivan Matek, Joaquin M López Muñoz