It can be important for O_DIRECT AIO operations. I agree that for buffered I/O, the filesystem overhead will dominate (and, on Linux, you don't have a way to implement futures over buffered I/O without resorting to threads, which will slow things down further).
Actually, on recent Linux kernels with ext4, KAIO reads of page-cached data are now wait-free. It makes a big difference for a warm-cache filesystem when you're doing lots of small reads. Writes unfortunately still lock, and moreover exclude readers. Linux has a very long way to go to reach BSD and especially Windows for async I/O.
Optimizing away one allocation to create a promise/future pair (or just a future for make_ready_future) will have no measurable impact in the context of any I/O, be it wait-free or asynchronous or both.

In general, all I'm hearing on this thread is 'it could be helpful', 'it should be faster', 'it can be important', or 'makes a big difference', etc. I was hoping that we as a Boost community can do better! Nobody so far has shown measurements of the impact of this optimization technique on real-world applications, or at least measurement results from artificial benchmarks under heavy concurrency conditions (using decent multi-threaded allocators like jemalloc or tcmalloc). I'd venture to say that there will be no measurable speedup (unless proven otherwise).

Regards Hartmut
---------------
http://boost-spirit.com
http://stellar.cct.lsu.edu
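
As a starting point for the kind of artificial benchmark asked for above, here is a minimal single-threaded sketch that times the creation and consumption of a promise/future pair. It uses std::promise and std::future from the standard library rather than the Boost types discussed in this thread, and the names FuturePairTiming and time_future_pairs are hypothetical, introduced only for illustration. It does not exercise heavy concurrency or an alternative allocator; those would have to be added (for example by running the loop on several threads and linking against jemalloc or tcmalloc) before drawing any conclusion. Some toolchains need -pthread to build it.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>

// Hypothetical result type: average cost per pair plus a checksum that
// keeps the compiler from discarding the loop body.
struct FuturePairTiming {
    double ns_per_pair;
    std::size_t checksum;
};

// Hypothetical helper: create `iterations` promise/future pairs, fulfil
// each one immediately, and report the average wall-clock time per pair.
// Each std::promise allocates a shared state, which is the allocation
// under discussion.
FuturePairTiming time_future_pairs(std::size_t iterations) {
    assert(iterations > 0);
    using clock = std::chrono::steady_clock;

    std::size_t checksum = 0;
    const auto start = clock::now();
    for (std::size_t i = 0; i < iterations; ++i) {
        std::promise<int> p;                  // allocates the shared state
        std::future<int> f = p.get_future();
        p.set_value(1);                       // make the future ready
        checksum += static_cast<std::size_t>(f.get());
    }
    const auto stop = clock::now();

    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return FuturePairTiming{elapsed.count() / static_cast<double>(iterations),
                            checksum};
}
```

Calling time_future_pairs with a large iteration count and comparing ns_per_pair against the latency of a single I/O operation on the target system would give one data point; the number is only meaningful relative to that I/O cost and to the allocator in use.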