
On 24 Jan 2014 at 13:13, Bjorn Reese wrote:
I'm having trouble understanding this. A chained operation must by definition be one operation being called as some other operation completes, and can never possibly refer to operations running in parallel.
Think of the execution of chained operations as analogous to the execution of CPU instructions.
Niall has already explained the situation where all chained operations should be passed to the scheduler to avoid latency. This is analogous to avoiding a flush of the CPU pipeline.
That's a good analogy, but there are significant differences in orders of scaling. Where a pipeline stall in a CPU may cost you 10x, and a main memory cache line miss may cost you 200x, you're talking a 50,000x cost for a warm filing system cache miss. Queue depth scaling also differs greatly: for example, the SATA AHCI driver on Windows gets exponentially slow if you queue more than a few hundred ops to it simultaneously, whereas the Windows FS cache layer will happily scale to tens of thousands of simultaneous ops without blinking. How many FS cache layer ops turn into how many SATA AHCI driver ops is very non-trivial, and essentially it becomes a statistical analysis of black box behaviour which I would assume is not even static across OS releases.
You can also have chained operations that are commutative, so the scheduler can reorder them for better performance. This is analogous to out-of-order CPU execution.
Indeed that is the very point of chaining: you can tell AFIO that this group A of operations here can complete in any order and I don't care, but that this group B of operations here must not occur until the very last operation in group A completes. This affords maximum scope to the OS kernel to reorder operations to complete as fast as possible without losing data integrity or causing races. It's this sort of metadata that the ASIO callback model simply doesn't specify.

It's actually really unfortunate that more of this stuff isn't documented explicitly in OS documentation. If you're into filing systems, then you know it, but otherwise people just assume that reading and writing persistent data is just like any other kind of i/o. The Unix abstraction of making fds identical for any kind of i/o, when there are very significant differences underneath in semantics, is mainly to blame, I assume.

Niall

--
Currently unemployed and looking for work in Ireland.
Work Portfolio: http://careers.stackoverflow.com/nialldouglas/
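To make the group A / group B idea above concrete, here is a minimal sketch in plain C++ using only std::async and std::future; write_block is a made-up stand-in for an asynchronous file operation, and none of the names below are AFIO's actual API:

#include <cstdio>
#include <future>
#include <vector>

// Hypothetical asynchronous write; stands in for a real async file op.
std::future<void> write_block(int id)
{
    return std::async(std::launch::async, [id] {
        std::printf("writing block %d\n", id);
    });
}

int main()
{
    // Group A: independent writes, free to complete in any order.
    std::vector<std::future<void>> group_a;
    for (int i = 0; i < 4; ++i)
        group_a.push_back(write_block(i));

    // Barrier: group B must not begin until every op in group A is done.
    for (auto &op : group_a)
        op.wait();

    // Group B: work that depends on all of group A, e.g. a metadata
    // update that must not become visible before the data blocks land.
    write_block(99).wait();
}

Note that this sketch enforces the ordering by blocking in the caller at the barrier; a chaining API of the kind described in the reply would instead hand the whole dependency graph to the dispatcher up front, so the issuing thread never stalls waiting on group A.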