On Fri, Mar 20, 2015 at 5:19 AM, Giovanni Piero Deretta wrote:
On 19 Mar 2015 19:51, "Niall Douglas" wrote:
On 19 Mar 2015 at 18:05, Giovanni Piero Deretta wrote:
Your future still allocates memory, and is therefore costing about 1000 CPU cycles.
1000 clock cycles seems excessive with a good malloc implementation.
Going to main memory due to a cache line miss costs 250 clock cycles, so no, it isn't. Obviously, slower processors spin fewer cycles for a cache line miss.
Why would a memory allocation necessarily imply a cache miss? Eh, you are even assuming an L3 miss; that must be a poor allocator!
Anyway, the plan is to add support for a custom allocator. I do not think you can realistically have a non-allocating future *in the general case* (you might optimise some cases, of course).
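As a point of reference for the custom-allocator plan, the standard library already lets a promise take its shared state from a user-supplied allocator through the `std::allocator_arg` constructor. The sketch below is only an illustration of that standard facility, not the design under discussion: `CountingAllocator`, `allocation_count` and `fulfil_with_custom_allocator` are hypothetical names, and how many allocations a given standard library actually performs is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <future>
#include <memory>
#include <new>

// Hypothetical global counter, for illustration only (not thread-safe).
inline std::size_t& allocation_count() {
    static std::size_t n = 0;
    return n;
}

// Hypothetical minimal allocator that counts calls to allocate().
// The converting constructor lets the library rebind it to its
// internal shared-state type.
template <class T>
struct CountingAllocator {
    using value_type = T;

    CountingAllocator() = default;
    template <class U>
    CountingAllocator(const CountingAllocator<U>&) noexcept {}

    T* allocate(std::size_t n) {
        ++allocation_count();
        return static_cast<T*>(::operator new(n * sizeof(T)));
    }
    void deallocate(T* p, std::size_t) noexcept { ::operator delete(p); }
};

template <class T, class U>
bool operator==(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return true;
}
template <class T, class U>
bool operator!=(const CountingAllocator<T>&, const CountingAllocator<U>&) noexcept {
    return false;
}

// Creates a promise whose shared state comes from CountingAllocator,
// fulfils it, and reads the value back through the future.
inline int fulfil_with_custom_allocator(int value) {
    std::promise<int> p(std::allocator_arg, CountingAllocator<int>{});
    std::future<int> f = p.get_future();
    p.set_value(value);
    return f.get();
}
```

Calling `fulfil_with_custom_allocator(42)` returns 42, and `allocation_count()` can then be inspected to see how often the library went to the allocator. A pooling allocator could be dropped in the same way to make the allocation cheaper without removing it.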
We disagree. They are not just feasible, but straightforward, though if you try doing a composed wait on them then yes, they will need to be converted to shared state. Tony van Eerd did a presentation a few C++Nows ago on non-allocating futures. I did not steal his idea subconsciously one little bit! :)
I am aware of that solution. My issue with that design is that it requires an expensive RMW for every move. Do a few moves and the cost will quickly dwarf that of an allocation, especially considering that an OoO core will happily overlap computation with a cache miss, while the required membar will stall the pipeline on current CPUs (I'm thinking of x86, of course). That might change in the near future, though.
Just to be clear, my non-allocating promise/future was meant to be more of a proof of concept, not necessarily optimally efficient. I would need to recheck, but I think most of it requires only acquire or release, not full sequential consistency, if that helps. I rarely write anything that requires full sequential consistency.

Anyhow, carry on. I like that there are people looking into this stuff. Looking forward to the outcomes.

Tony
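To make the acquire/release point concrete, here is a minimal sketch of handing one value from a producer to a consumer using only a release store and an acquire load. It is not the non-allocating promise/future from the presentation: `OneShotSlot` and `pass_through_slot` are hypothetical names, and the slot assumes exactly one writer that calls `set` once.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical single-producer, single-consumer, write-once slot.
template <class T>
class OneShotSlot {
public:
    // Producer side: write the value, then publish it.
    // The release store orders the write to value_ before the flag.
    void set(const T& v) {
        value_ = v;
        ready_.store(true, std::memory_order_release);
    }

    // Consumer side: the acquire load pairs with the release store,
    // so once the flag reads true the write to value_ is visible.
    bool try_get(T& out) const {
        if (!ready_.load(std::memory_order_acquire)) {
            return false;
        }
        out = value_;
        return true;
    }

private:
    T value_{};
    std::atomic<bool> ready_{false};
};

// Sends v through the slot from a second thread and returns what
// the calling thread observes.
inline int pass_through_slot(int v) {
    OneShotSlot<int> slot;
    std::thread producer([&slot, v] { slot.set(v); });

    int out = 0;
    while (!slot.try_get(out)) {
        std::this_thread::yield();  // simple polling, for illustration
    }
    producer.join();
    return out;
}
```

On x86 a release store and an acquire load both compile to ordinary loads and stores, whereas a sequentially consistent store needs a locked instruction or a fence, which is where the pipeline cost mentioned above comes from. Whether a full promise/future with move support can stay within acquire/release everywhere is a separate question that this sketch does not answer.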