On 21 Jul 2015 at 20:07, Glen Fernandes wrote:
I hope you're not making this harder on yourself than it needs to be. Perhaps I need to understand better: you must have felt the library was functionally stable, and maybe even fit for production code, at some point (because it was in the review queue for a long time), right?
Firstly, thank you for your comments. From the end of summer 2013 after GSoC until the start of 2015, the API barely changed. I did rewrite the internal engine twice and kept adding more testing, but externally all you would have seen was a lot more performance and much fewer faults (zero eventually) in the thread sanitiser.
At some point in the last year you decided on a major update (i.e. involving the "lightweight future promises" you mentioned) that alters the interface dramatically and requires updating the tutorial and documentation.
Strictly speaking, if you did a find and replace across all your files, replacing "async_io_op" with "future<>" and every use of async_io_op::get() with get_handle(), your previous source code would now work with the "majorly updated" API. That is exactly what was done to produce the present tutorial code examples.
Without going into too much detail about that change: Does it significantly make using AFIO easier?
Quite a few people have disliked (a) the choice of future continuations over ASIO's async_result pattern and (b) the batch API. Those two observations have come up repeatedly - Bjorn and Robert on this list have both publicly found issue there, and neither was alone in his opinion.

I'm not dropping futures in favour of async_result. I don't think that helps ease of use, because in file i/o you really do want strong i/o ordering, whereas for network i/o you usually don't care much about ordering. The forthcoming C++1z coroutines are also futures based, and that decision is not going to be reversed now. Futures are our future, as it were. However, I could do something about the performance penalty that futures carry over async_result. I believe I have eliminated it (untested claim), and you can emulate async_result easily in lightweight futures with continuations consuming the value by const lvalue ref (of which you can attach any number). That should make those preferring async_result happy.

That leaves the batch API. I personally quite like it, but I can see it confuses people, so the new API presents a more traditional, unsurprising interface which I think Robert, amongst others, should prefer.
Does it significantly improve AFIO performance?
In no meaningful way, no. The cost of the i/o is many orders of magnitude higher than anything I could do in AFIO - I could stick a for loop counting to a million in there and I doubt anyone would notice. Where there was a performance problem was in the continuations. If you were scheduling non-i/o continuations, the overhead of the continuations machinery was onerous, because we were vectoring through the stable ABI layer, so everything had to be type erased and reconstituted at least three times. Lightweight future-promise is, I believe, very close to optimally lightweight now, as it all happens in the TU at the point of compilation with no ABI traversals.
All that said, apart from the tutorial any early observations about anything within https://boostgsoc13.github.io/boost.afio/ are welcome.
Some notes from a brief first glance at the code yesterday:
1. I might be mistaken, but are you using undocumented NT APIs for the Windows specific implementation?
Exclusively, yes. As of v1.3 I avoid Win32 entirely. There were bugs when I tried to use both together, and since I dropped Win32 completely things are much better.
I was under the impression that you wanted AFIO to be used in production code, i.e. that this is intended to be a practical library rather than an experimental one. I'm surprised the use of undocumented APIs has not backfired yet in your testing.
The NT kernel API is exceptionally stable, as any change to it costs Microsoft and anyone who writes device drivers dearly. Before I use any NT kernel API I examine when it entered NT and whether it has ever changed in any release since, with the Windows XP kernel being my minimum supported kernel. To my knowledge, apart from a few small, easily removed places, AFIO should even work perfectly on NT 4.0 - I have been very conservative in my choices of which kernel APIs to use.

The only backfire found to date is asynchronous directory enumeration via the WOW64 layer, where Microsoft has a bug in their WOW64 syscall parameter repacking code, so it only affects x86 binaries on an x64 kernel. I had a buddy in Microsoft ask around about it, and it turns out that precisely nowhere in any Microsoft code is anyone ever doing an asynchronous directory enumeration, and hence that bug (which is confirmed) was never noticed till now. They were actually quite surprised that asynchronous directory enumeration works at all. It's a wontfix unfortunately, but it's very easy for me to work around: the segfault is caused by ASIO expecting a pointer to come back from IOCP and instead getting a truncated, invalid pointer which it tries to dereference. If I don't hand off the async operation to ASIO, I avoid the problem.
2. Examples are a little alarming: if an example in the documentation contains #ifdef WIN32 or #ifdef __FreeBSD__, it makes someone wonder how portable AFIO really is. Your examples should not make using the library look complicated by being longer than they need to be: if they contain #if 0 blocks, they are just that much harder to read.
This is a very thought-provoking observation which has changed what I'm going to do about the tutorial. Thank you.

Unlike some other C++ libraries, this is a platform abstraction library, the same way ASIO is, rather than a C++ abstraction library. Where it is possible without too much performance loss, I hide platform specific quirks as ASIO does. However, where those quirks are unavoidable, it's really best to document them and push the problem onto the end user (also the same as ASIO). For example, NTFS has lazy metadata flushing across handles, so some of the code examples you'll see in the docs quite literally sleep for five seconds to let the NTFS lazy flusher synchronise metadata across multiple file handles. Another example is that FreeBSD cannot track file renames, only directory renames, so you may see me use an ifdef to interpose an otherwise unnecessary shim directory to make the code example work properly. Why not refactor the code example to remove file tracking? Because the FreeBSD kernel folk are still actively considering fixing BSD, so for all I know it could be fixed in the next BSD release.

I personally think that if you are a person who needs something as niche as asynchronous file i/o, then you are *very* interested in what platform specific quirks you'll need to know about when writing high performance filing system code. Hence I chose to leave the ifdef'd quirk workarounds in the tutorial and code examples; for people with that use case they are very valuable to know about. The same rationale applies to the choice of filesystem traversal algorithm, which is why I left in #if 0 alternatives for people to experiment with themselves (and I know they do - I've had email from people who switch on the other ifdef branches and then ask me why I think they are so much slower; I tell them my best speculations, but in the end who really knows?).
All that said, it makes the tutorial much less of a tutorial and much more of a cheatsheet on filing system quirks. I am kind of assuming the reader is only here because they absolutely need async file i/o, and are therefore highly skilled in that topic, much as someone approaching uBLAS probably has a maths degree. What your comment has made me realise is that it would make much more sense if there were a "normal person's tutorial" and an "advanced users' tutorial", where the former is a nice hand-holding, all-portable, baby-steps thing, and the latter is stuff like writing distributed mutual exclusion algorithms solely via atomic appends onto the filesystem, as in the current tutorial. How does that plan sound?

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/