On 2/19/2016 12:48 AM, Benedek Thaler wrote:
On Thu, Feb 18, 2016 at 8:51 AM, Michael Marcin wrote:
I think there is definitely something not quite right.
One thing I notice is that the line sample.push_back(get_clock() - base_clock); is itself being timed, and I think it could potentially touch a new memory page.
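To illustrate the concern, here is a minimal sketch of one way to keep sample storage out of the timed window. It is not the benchmark from this thread: get_clock_sketch and collect_samples are hypothetical names, and std::chrono::steady_clock is only a stand-in for whatever get_clock() actually reads. The idea is to size the sample vector up front, so its pages are written before any measurement starts, and to store each result with a plain assignment after the second clock read.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for get_clock(); assumption: any monotonic tick source will do.
inline std::uint64_t get_clock_sketch() {
    using clock = std::chrono::steady_clock;
    return static_cast<std::uint64_t>(
        std::chrono::duration_cast<std::chrono::nanoseconds>(
            clock::now().time_since_epoch()).count());
}

// Runs op() n times and returns one elapsed-time sample per run.
template <class F>
std::vector<std::uint64_t> collect_samples(std::size_t n, F&& op) {
    assert(n > 0);
    // Value-initialising writes every element, so the storage is touched
    // before the first measurement rather than during it.
    std::vector<std::uint64_t> samples(n);
    for (std::size_t i = 0; i < n; ++i) {
        const std::uint64_t start = get_clock_sketch();
        op();
        const std::uint64_t stop = get_clock_sketch();
        samples[i] = stop - start;  // plain store, after the timed window has closed
    }
    return samples;
}
```

Calling collect_samples(1000, [&] { v.push_back(x); }) would then time only the lambda body, assuming the clock reads themselves are cheap relative to it.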
In pin_thread you have _MSVC_VER instead of _MSC_VER.
Good catch, fixed this.
Hacking up std::vector and adding a push_back_reserved that does no bounds checks, I get consistently very strange timings. (Code included at the bottom to make it obvious that the latter should be no slower.)
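For readers without the original code at hand, the following is a rough sketch of what an unchecked append next to a checked one can look like. It is not the hacked std::vector from this thread: reserved_buffer is a hypothetical, deliberately minimal container (it assumes a default-constructible, copy-assignable T), and only the name push_back_reserved is taken from the message above. The point is that the unchecked path does strictly less work than the checked one.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical minimal buffer; assumes T is default-constructible and copy-assignable.
template <class T>
class reserved_buffer {
public:
    explicit reserved_buffer(std::size_t capacity)
        : data_(new T[capacity]), size_(0), capacity_(capacity) {}

    // Checked append: grows the storage when it is full.
    void push_back(const T& value) {
        if (size_ == capacity_) grow();
        data_[size_++] = value;
    }

    // Unchecked append: the caller guarantees size() < capacity().
    // The assert disappears when NDEBUG is defined, leaving a bare store.
    void push_back_reserved(const T& value) {
        assert(size_ < capacity_);
        data_[size_++] = value;
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }
    const T& operator[](std::size_t i) const { return data_[i]; }

private:
    // Doubles the capacity and copies the existing elements across.
    void grow() {
        const std::size_t new_capacity = capacity_ ? capacity_ * 2 : 1;
        std::unique_ptr<T[]> bigger(new T[new_capacity]);
        for (std::size_t i = 0; i < size_; ++i) bigger[i] = data_[i];
        data_ = std::move(bigger);
        capacity_ = new_capacity;
    }

    std::unique_ptr<T[]> data_;
    std::size_t size_;
    std::size_t capacity_;
};
```

With storage reserved up front, push_back_reserved skips the capacity comparison and the growth branch, so a benchmark that reports it as slower than push_back is more likely measuring something else.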
My simpler, and I'm sure in many ways inferior, benchmark shows much more believable results.
Still, there are huge differences between essentially the same programs.
http://www.intel.com/content/dam/www/public/us/en/documents/white-papers/ia-...
I gave rdtscp a spin, with no success, but I'll keep looking at this doc.
Thanks, Benedek
FWIW (probably not much this time) I took a shot at porting my test to the google/benchmark library. Mostly just to learn the library but also to see if there were any glaring discrepancies. Not sure I'm using it right but the relative results seem in line with my expectations.
results: http://codepad.org/U66O82X8
source: http://codepad.org/rii3BHly