On 22 Jul 2015 at 7:58, glenfernandes wrote:
Really, with async file i/o it's always more a question of control than of performance. In the naïve case performance will usually be higher by going synchronous.
I might be misunderstanding you, so I will try harder to get on the same page. Even though English is the only language I know, I know it poorly. :-)
No problem.
Why would anyone interested in reviewing AFIO care about getting more performance by going synchronous? The reason they're interested in an asynchronous file I/O library is because they need asynchronous file I/O, right?
Most people who think they need async file i/o don't need it and shouldn't use it. Meanwhile, most people who write code which uses a filesystem path don't realise they've just written buggy, racy code which could destroy other people's data and open security holes.
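To illustrate what I mean, here is the classic time-of-check-to-time-of-use pattern as a minimal POSIX sketch (the racy_overwrite helper is hypothetical, purely for illustration, not AFIO code):

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Between the stat() and the open(), another process can swap
// the path for a symlink to a file we never intended to touch,
// so the "check" guarantees nothing about what we then open.
bool racy_overwrite(const char *path) {
  struct stat st;
  if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
    return false;                            // checked: a regular file...
  int fd = open(path, O_WRONLY | O_TRUNC);   // ...but maybe not any more
  if (fd < 0)
    return false;
  (void)write(fd, "new contents\n", 13);
  close(fd);
  return true;
}

Code like this is everywhere, and it is exactly the sort of thing which destroys data or opens a security hole when the filesystem changes underneath you.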
Control over performance sounds great, but it's not control over performance if it comes at a cost of performance, right? [Example: I see the support of the C++ allocator model as something which can sometimes offer control over performance, but in no way does it make things any slower at runtime when the allocator supplied is std::allocator than if the code just used 'new' and 'delete'. --end example]
You've got it exactly right: you sacrifice some average-case performance in exchange for control over worst-case performance. The same goes for the race free filesystem code: it comes with a performance cost because POSIX provides no race free API for opening a sibling file, so AFIO must iterate parent directory opens with inode lookups, backing off and retrying until success. Windows doesn't have this problem. The omission has been raised with the Austin Working Group (the body which maintains POSIX), and there was some sympathy there for addressing it.
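The general shape of that verify-and-retry dance on POSIX looks something like this (a simplified sketch of the idea, with a hypothetical open_checked helper; this is not AFIO's actual algorithm):

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cerrno>

// Open the parent directory, confirm via device+inode that the
// handle still refers to the path we believe it does, then open
// the leaf relative to that handle. On mismatch, back off and retry.
int open_checked(const char *dir, const char *leaf, int flags) {
  for (int attempt = 0; attempt < 10; ++attempt) {
    int dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0)
      return -1;
    struct stat by_fd, by_path;
    if (fstat(dirfd, &by_fd) == 0 && lstat(dir, &by_path) == 0 &&
        by_fd.st_dev == by_path.st_dev && by_fd.st_ino == by_path.st_ino) {
      int fd = openat(dirfd, leaf, flags | O_NOFOLLOW);
      close(dirfd);
      return fd;
    }
    close(dirfd);
    usleep(1000 << attempt);  // back off, then try again
  }
  errno = EAGAIN;
  return -1;
}

Each retry narrows the window, but every one of those extra syscalls costs you, which is where the average-case performance goes.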
I thought motivation to use your library would be one or more of:
- Simplicity (makes it easier to write maintainable file I/O code)
- Portability (saves me time from writing platform specific code)
- Performance (it is faster than code I would write by hand)
I would say all three of these yes.
On simplicity: If someone does not care about portability, can they write smaller/cleaner/more-maintainable code if they choose to use AFIO versus using overlapped I/O with IOCPs or KAIO? Does it sacrifice any simplicity for portability?
If you didn't care about portability and you wrote your code for WinRT, then all your i/o would be async, and that is probably the nicest way of writing 100% async i/o code in a mainstream language that I am aware of. Once C++ 1z coroutines arrive, I believe AFIO will let you come close to that clarity and simplicity of async coding on WinRT, except portably. My long term goal here is that C++ becomes like Erlang: your apparently synchronous C++ code magically coroutinises at any point it could block, because under the bonnet it's using ASIO for networking and AFIO for file and filesystem, so whenever your legacy C++ codebase "blocks" it is actually off executing other work, and it resumes correctly when the i/o completes. That's a long way away though.
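To give a flavour of that coding style, a minimal coroutines sketch (written against the std::coroutine interface as eventually standardised, for concreteness; the task type and fake_async_read awaitable are hypothetical stand-ins, not AFIO or WinRT APIs):

#include <coroutine>
#include <cstdio>
#include <string>

// A trivial fire-and-forget coroutine task type.
struct task {
  struct promise_type {
    task get_return_object() { return {}; }
    std::suspend_never initial_suspend() { return {}; }
    std::suspend_never final_suspend() noexcept { return {}; }
    void return_void() {}
    void unhandled_exception() {}
  };
};

// Stand-in for an async read. Here it completes immediately; in a
// real program await_suspend() would park the coroutine until the
// i/o completed, and the completion would resume it.
struct fake_async_read {
  std::string result;
  bool await_ready() const noexcept { return true; }
  void await_suspend(std::coroutine_handle<>) noexcept {}
  std::string await_resume() { return result; }
};

task example() {
  // Reads like synchronous code, yet every co_await is a point at
  // which the coroutine could suspend while i/o is in flight.
  std::string contents = co_await fake_async_read{"hello"};
  std::printf("%s\n", contents.c_str());
}

int main() { example(); }

The point is that the calling code stays straight-line and readable while the blocking disappears under the bonnet.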
On performance: Is it faster or at least no slower than any other libraries? (e.g. libuv) Does it sacrifice any performance for portability?
I haven't compared it to libuv, but libuv does nothing about filesystem races and ought therefore to be quicker. That said, Rust started out with libuv as its i/o layer and recently ended up dropping it due to poor i/o performance, which is why Rust's i/o library is so immature relative to its other standard libraries. Note that you can ask AFIO to disable the race freedom code for a specific case, in which case that overhead disappears.
On portability: Does it entirely abstract away any platform specific issues?
As much as it can.
(e.g. Do you believe a user of AFIO will be required to write platform-specific code as in your examples?)
I think for any serious use of the filesystem some platform specific code is inevitable. For example, concurrent atomic appends to a file are extremely quick on NTFS and ext4 but exceptionally slow on ZFS. Conversely, fsyncing, and especially O_DIRECT, on ZFS is lightning quick compared to NTFS or ext4. I can't abstract away those sorts of differences because I can't know whether they matter to the end user.

In a future AFIO I'll provide a high level abstracted API for locking ranges in many files at once, and you won't need to care how it is implemented under the bonnet, where it'll use very different solutions depending on your situation. If you look at https://boostgsoc13.github.io/boost.afio/doc/html/afio/quickstart/atomic_logging.html you'll see lots of benchmarks for various multi-file locking solutions on many platforms; this was me testing the waters for a future high level solution. I'm not against similar high level abstractions in the future, but I suspect I won't be writing any that I won't be using myself. This stuff is very hard to write, much harder than acquire-release atomics for memory race freedom, which I once thought were hard and tricky. They aren't, relative to filesystem based algorithms.
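For reference, a single-file byte-range lock at the raw POSIX layer looks like this (a hypothetical lock_range helper for illustration; it locks one file, whereas the hard problem the future API must solve is locking ranges across many files together):

#include <fcntl.h>
#include <unistd.h>

// Take an advisory write lock over [offset, offset + len) on one
// open file. F_SETLKW blocks until the lock is granted.
bool lock_range(int fd, off_t offset, off_t len) {
  struct flock lk{};
  lk.l_type = F_WRLCK;
  lk.l_whence = SEEK_SET;
  lk.l_start = offset;
  lk.l_len = len;
  return fcntl(fd, F_SETLKW, &lk) == 0;
}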
Now with the race free filesystem extensions [...] things have changed. If you want to write portable code capable of working under a changing filesystem, you have no choice but AFIO right now, in any language that I am aware of.
Does the documentation show (with examples) how AFIO helps here?
A good point. It's why I submitted that topic to CppCon.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/