On 16 October 2016 at 21:45, Michael Marcin wrote:
You can get a decent win purely through improved cache utilization by keeping this cold data out of the way.
And stick all the cold data in an AoS?
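To make sure we are talking about the same layout, here is a minimal sketch of what I understand by that: the hot fields (position, velocity) go into an SoA, and the cold fields go into a parallel AoS that shares the same index. This is not the benchmark code from this thread. The cold fields `name` and `spawn_frame`, and the names `ParticleAoS`, `ColdData`, `ParticlesSplit` and `integrate`, are all hypothetical and only there for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Plain AoS: hot and cold fields share one struct, so every
// update step drags the cold bytes through the cache as well.
struct ParticleAoS {
    float x, y, z;      // hot: read and written every step
    float vx, vy, vz;   // hot: read every step
    std::string name;   // cold (hypothetical): rarely touched
    int spawn_frame;    // cold (hypothetical): rarely touched
};

// Cold fields kept together per particle, i.e. an AoS of cold data.
struct ColdData {
    std::string name;
    int spawn_frame;
};

// Hot fields as SoA, cold fields as a parallel AoS.
// Element i of every vector belongs to the same particle.
struct ParticlesSplit {
    std::vector<float> x, y, z;
    std::vector<float> vx, vy, vz;
    std::vector<ColdData> cold;

    std::size_t size() const { return x.size(); }

    void push(float px, float py, float pz,
              float pvx, float pvy, float pvz, ColdData c) {
        x.push_back(px);   y.push_back(py);   z.push_back(pz);
        vx.push_back(pvx); vy.push_back(pvy); vz.push_back(pvz);
        cold.push_back(std::move(c));
    }
};

// Euler step over the AoS layout.
void integrate(std::vector<ParticleAoS>& ps, float dt) {
    for (auto& p : ps) {
        p.x += p.vx * dt;
        p.y += p.vy * dt;
        p.z += p.vz * dt;
    }
}

// Same step over the split layout: only the six hot arrays are read,
// the cold vector is never touched here.
void integrate(ParticlesSplit& ps, float dt) {
    const std::size_t n = ps.size();
    assert(ps.vx.size() == n && ps.vy.size() == n && ps.vz.size() == n);
    for (std::size_t i = 0; i < n; ++i) {
        ps.x[i] += ps.vx[i] * dt;
        ps.y[i] += ps.vy[i] * dt;
        ps.z[i] += ps.vz[i] * dt;
    }
}
```

The contiguous float arrays are also the shape a compiler or hand-written SSE would want, but whether that pays off depends on the particle count and the hardware.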
This is still a toy example but it's closer to something real.
Yes, but is 1M particles common?

AoS in 6.54421 seconds
SoA in 5.91915 seconds
SoA SSE in 3.58603 seconds
1M particles on my Ci3 5005U 2.0 GHz / AVX2 / 4 GB laptop / Win10 / Clang/LLVM 4.0:

AoS in 14.7198 seconds
SoA in 13.5969 seconds
SoA SSE in 8.78095 seconds

I've run this with a count of 25'000 and it shows something interesting:

AoS in 0.274145 seconds
SoA in 0.312875 seconds
SoA SSE in 0.0768812 seconds

1. SoA is slower than AoS.
2. SoA SSE is way faster (relatively) than either SoA or AoS.

You've definitely made your case, when using SSE. I'll have a rethink.

degski