On 22 Jul 2015 at 7:58, glenfernandes wrote:
Really, with async file i/o it's always more a question of control than of performance. In the naïve case performance will usually be higher by going synchronous.
I might be misunderstanding you, so I will try harder to get on the same page. Even though English is the only language I know, I know it poorly. :-)
No problem.
Why would anyone interested in reviewing AFIO care about getting more performance by going synchronous? The reason they're interested in an asynchronous file I/O library is because they need asynchronous file I/O, right?
Most people who think they need async file i/o don't need it and shouldn't use it. Meanwhile, most people who write code which uses a filesystem path don't realise they've just written buggy, racy code which could destroy other people's data and open security holes.
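To illustrate what I mean, here is the classic time-of-check-to-time-of-use pattern as a minimal POSIX sketch (the racy_overwrite helper is hypothetical, purely for illustration, not AFIO code):

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

// Between the stat() and the open(), another process can swap
// the path for a symlink to a file we never intended to touch,
// so the "check" guarantees nothing about what we then open.
bool racy_overwrite(const char *path) {
  struct stat st;
  if (stat(path, &st) != 0 || !S_ISREG(st.st_mode))
    return false;                            // checked: a regular file...
  int fd = open(path, O_WRONLY | O_TRUNC);   // ...but maybe not any more
  if (fd < 0)
    return false;
  (void)write(fd, "new contents\n", 13);
  close(fd);
  return true;
}

Code like this is everywhere, and it is exactly the sort of thing which destroys data or opens a security hole when the filesystem changes underneath you.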
Control over performance sounds great, but it's not control over performance if it comes at a cost of performance, right? [Example: I see the support of the C++ allocator model as something which can sometimes offer control over performance, but in no way does it make things any slower at runtime when the allocator supplied is std::allocator than if the code just used 'new' and 'delete'. --end example]
You've got it exactly right: you sacrifice some average-case performance in exchange for control over worst-case performance. The same goes for the race free filesystem code: it comes with a performance cost because POSIX provides no race free API for opening a sibling file, so AFIO must iterate parent directory opens with inode lookups, backing off and retrying until success. Windows doesn't have this problem. The omission has been raised with the Austin Working Group (the body which maintains POSIX), and there was some sympathy there for addressing it.
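The general shape of that verify-and-retry dance on POSIX looks something like this (a simplified sketch of the idea, with a hypothetical open_checked helper; this is not AFIO's actual algorithm):

#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>
#include <cerrno>

// Open the parent directory, confirm via device+inode that the
// handle still refers to the path we believe it does, then open
// the leaf relative to that handle. On mismatch, back off and retry.
int open_checked(const char *dir, const char *leaf, int flags) {
  for (int attempt = 0; attempt < 10; ++attempt) {
    int dirfd = open(dir, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0)
      return -1;
    struct stat by_fd, by_path;
    if (fstat(dirfd, &by_fd) == 0 && lstat(dir, &by_path) == 0 &&
        by_fd.st_dev == by_path.st_dev && by_fd.st_ino == by_path.st_ino) {
      int fd = openat(dirfd, leaf, flags | O_NOFOLLOW);
      close(dirfd);
      return fd;
    }
    close(dirfd);
    usleep(1000 << attempt);  // back off, then try again
  }
  errno = EAGAIN;
  return -1;
}

Each retry narrows the window, but every one of those extra syscalls costs you, which is where the average-case performance goes.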
I thought motivation to use your library would be one or more of:
- Simplicity (makes it easier to write maintainable file I/O code)
- Portability (saves me time from writing platform specific code)
- Performance (it is faster than code I would write by hand)
I would say all three of these yes.
On simplicity: If someone does not care about portability, can they write smaller/cleaner/more-maintainable code if they choose to use AFIO versus using overlapped I/O with IOCPs or KAIO? Does it sacrifice any simplicity for portability?
If you didn't care about portability and you wrote your code for WinRT, then all your i/o would be async, and that is probably the nicest way of writing 100% async i/o code in a mainstream language that I am aware of. Once C++ 1z coroutines arrive, I believe AFIO will let you come close to that clarity and simplicity of async coding on WinRT, except portably. My long term goal here is that C++ becomes like Erlang: your apparently synchronous C++ code magically coroutinises at any point it could block, because under the bonnet it's using ASIO for networking and AFIO for file and filesystem, so whenever your legacy C++ codebase "blocks" it is actually off executing other work, and it resumes correctly when the i/o completes. That's a long way away though.
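To give a flavour of that coding style, a minimal coroutines sketch (written against the std::coroutine interface as eventually standardised, for concreteness; the task type and fake_async_read awaitable are hypothetical stand-ins, not AFIO or WinRT APIs):

#include <coroutine>
#include <cstdio>
#include <string>

// A trivial fire-and-forget coroutine task type.
struct task {
  struct promise_type {
    task get_return_object() { return {}; }
    std::suspend_never initial_suspend() { return {}; }
    std::suspend_never final_suspend() noexcept { return {}; }
    void return_void() {}
    void unhandled_exception() {}
  };
};

// Stand-in for an async read. Here it completes immediately; in a
// real program await_suspend() would park the coroutine until the
// i/o completed, and the completion would resume it.
struct fake_async_read {
  std::string result;
  bool await_ready() const noexcept { return true; }
  void await_suspend(std::coroutine_handle<>) noexcept {}
  std::string await_resume() { return result; }
};

task example() {
  // Reads like synchronous code, yet every co_await is a point at
  // which the coroutine could suspend while i/o is in flight.
  std::string contents = co_await fake_async_read{"hello"};
  std::printf("%s\n", contents.c_str());
}

int main() { example(); }

The point is that the calling code stays straight-line and readable while the blocking disappears under the bonnet.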
On performance: Is it faster or at least no slower than any other libraries? (e.g. libuv) Does it sacrifice any performance for portability?
I haven't compared it to libuv, but libuv does nothing about filesystem races and ought therefore to be quicker. That said, Rust started out with libuv as its i/o layer and recently ended up dropping it due to poor i/o performance, which is why Rust's i/o library is so immature relative to its other standard libraries. Note that you can ask AFIO to disable the race freedom code for a specific case, in which case that overhead disappears.
On portability: Does it entirely abstract away any platform specific issues?
As much as it can.
(e.g. Do you believe a user of AFIO will be required to write platform-specific code as in your examples?)
I think for any serious use of the filesystem some platform specific code is inevitable. For example, concurrent atomic appends to a file are extremely quick on NTFS and ext4 but exceptionally slow on ZFS. Conversely, fsyncing, and especially O_DIRECT, on ZFS is lightning quick compared to NTFS or ext4. I can't abstract away those sorts of differences because I can't know whether they matter to the end user.

In a future AFIO I'll provide a high level abstracted API for locking ranges in many files at once, and you won't need to care how it is implemented under the bonnet, where it'll use very different solutions depending on your situation. If you look at https://boostgsoc13.github.io/boost.afio/doc/html/afio/quickstart/atomic_logging.html you'll see lots of benchmarks for various multi-file locking solutions on many platforms; this was me testing the waters for a future high level solution. I'm not against similar high level abstractions in the future, but I suspect I won't be writing any that I won't be using myself. This stuff is very hard to write, much harder than acquire-release atomics for memory race freedom, which I once thought were hard and tricky. They aren't, relative to filesystem based algorithms.
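For reference, a single-file byte-range lock at the raw POSIX layer looks like this (a hypothetical lock_range helper for illustration; it locks one file, whereas the hard problem the future API must solve is locking ranges across many files together):

#include <fcntl.h>
#include <unistd.h>

// Take an advisory write lock over [offset, offset + len) on one
// open file. F_SETLKW blocks until the lock is granted.
bool lock_range(int fd, off_t offset, off_t len) {
  struct flock lk{};
  lk.l_type = F_WRLCK;
  lk.l_whence = SEEK_SET;
  lk.l_start = offset;
  lk.l_len = len;
  return fcntl(fd, F_SETLKW, &lk) == 0;
}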
Now with the race free filesystem extensions [...] things have changed. If you want to write portable code capable of working under a changing filesystem, you have no choice but AFIO right now, in any language that I am aware of.
Does the documentation show (with examples) how AFIO helps here?
A good point. It's why I submitted that topic to CppCon.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/