Re: [boost] [AFIO] Formal review

2 Sep 2015

      On 02.09.2015 01:42, Niall Douglas wrote:
...
On 31 Aug 2015 at 20:39, Andrey Semashev wrote:
...
...
Before people go "aha!", remember that afio::handle has proper error
handling and has to configure a number of other items, including
asking the OS for what its "real" path is and other housekeeping.
Why does the library do that when not asked? Why not collect this data
only when the user requests?
The problem is the race free filesystem support which isn't as
simple, unfortunately, as simply using the POSIX *at() functions.
You need a minimum of the true canonical path of the fd, and its
stat_t struct (specifically the inode, created and modified
timestamps). That's at least two additional syscalls just for those,
and AFIO may do as many as seven syscalls per handle open depending.
But if my program doesn't need these properties, why do I have to pay to 
obtain them from the system? The argument that opening a file is already 
expensive doesn't work as you're just making it yet more expensive for 
no benefit for the user.

Also, I don't think I understand how obtaining this additional data 
helps to achieve race free semantics. But that's probably because I'm 
not an expert in the field.
...
You can disable the race free semantics, which are slow, using
afio::file_flags::no_race_protection.
Will this avoid obtaining any unnecessary data? I.e. will opening a file 
then be equivalent to a single syscall, which is to open the file? If 
so, I think this should be the default and any additional data to obtain 
should be an opt-in.
...
If you have a need to do a lot of file opens and closes and don't
care about races on the file system, you are better off not using
AFIO. STL iostreams is perfectly good for this as the only OSs which
allow async file opens and closes is QNX and the Hurd.
It's all about the total total effect of the optimization. I could use 
std::iostreams to process a 1000 files in serial, or I could spawn 1000 
threads and do the same somewhat faster, or I could use Boost.AFIO, 
hoping to save some development time and avoid CPU overcommit for 
roughly the same amount of benefit in terms of performance as my naive 
1000-thread code. IMO, Boost.AFIO should live up to this expectation, 
otherwise I don't see much use for it and it should probably not be 
called AFIO.

I think you target two very different goals - provide an async API for 
file system operations and provide a race-free API to file system. One 
of the key benefits of making things async is to make it more 
responsive, if not plain faster. If that is not achieved then there is 
no point in being async as it just complicates the code. I think many 
reviewers who took a look at Boost.AFIO were looking for this kind of 
benefit and seeing you saying that Boost.AFIO should not be used for 
that is surprising and discouraging. (This is why I questioned the name 
AFIO above.)

Race-free API may be interesting on its own, but apparently being 
race-free hampers performance, which is essential for the first goal. 
Additionally, race-free does not mean async, so the race-free filesystem 
library could be synchronous and as a consequence - simpler to implement 
and use.

Combining these two goals in one library you're making compromises that 
result in neither a good async library nor a simple race-free one. 
Perhaps, you should decide what you want to achieve in your library.
...
...
...
...
- I'd much prefer native_handle() to return the actual appropriate native handle
    type for the platform, e.g. int on POSIX, rather than a void *.  I'm guessing
    this is done in order to support alternative dispatchers, that aren't
    necessarily based on OS files at all, in the future, but perhaps a better way
    of doing this that doesn't require casting the void* to an int, can be found.
It's more ABI stability.
Stability with respect to what? Do you expect native handles to be
somehow different on the same platform?
AFIO's design allows many backends e.g. ZIP archives, HTTP etc. So
yes, native handles can vary on the same platform and I needed to
choose one for the ABI.
I wouldn't call that 'native' handles. In my understanding native means 
the underlying OS primitive.

If you intend to emulate a file system API on top of non-file system 
entities, such as archives and remote services, then you probably need 
to design the library very differently. Something closer to ASIO comes 
to mind - each backend should define its own set of types, including the 
underlying primitives, and those types should be propagated up to the 
user's interface. The whole library should be much more flexible and 
concept-oriented.
...
...
...
Casting void * to int is fine.
No, it's not. It's a hack that you impose on the library users.
According to [expr.reinterpret.cast]/4 it's not even required to
compile, as int may not be large enough to hold a pointer.
The native handle on POSIX is stored internally as an int. In
native_handle(), it is cast to a size_t, and then to a void *.
Please do correct me if I am wrong, but I had thought that this is
defined behaviour:
int a=5;
void *b=(void *)(size_t) a;
int c=(int)(size_t) b;
assert(c==a);
This is certainly a very common pattern in C.
The standard defines conversion of a pointer to an integer of sufficient 
size (which is (u)intptr_t, and not necessarilly size_t) and back and 
that this roundtrip looses no information. The bit mapping function 
between pointers and integers is not defined and hence there is no 
guarantee that any integer can be represented in a pointer. The other 
way is true though, as long as there exists (u)intptr_t, which is also 
not guaranteed (i.e. pointers are allowed to be something completely 
different from integers).

Besides all that, requiring your users to do tricks like that is not 
acceptable, IMO. We're in C++, use the type system it offers.
...
...
...
It's more that pwrite off the end of the file is inherently racy to
concurrent read/writers. If you want safe automatic file length
extension on writes, open a handle with the append flag and use that.
AFIO fatal exits the process if you try to read or write past the
current file size, it's a logic error and your program is inherently
broken.
This is unacceptable behavior IMO. I would expect an exception in such case.
AFIO assumes the only reason you are reading and writing past the
maximum file size is either a logic error or memory corruption. It
cannot tell which is which quickly, so we assume it's memory
corruption as your program should never have this logic error. And as
I explained before, if AFIO thinks memory corruption has occurred it
can no longer know if what it has been told to do is correct or
wrong, and therefore not terminating the process could cause damage
to other people's data.
I'll repeat myself and say this is being too smart. This situation is 
perfectly recoverable and does not mean memory corruption (at least, not 
100%, by far). It should not result in program termination.
...
I do have a very big concern that it is a denial of service attack on
anything using AFIO by simply shrinking a file it is using from
another process to cause the AFIO using process to fatal exit, so I
will be trying to improve on the current situation in the engine
rewrite. It's not as easy as it looks if you want to keep reads and
writes fast, believe me, but again the ASIO reactor is in my way
because that is code I can't control. Once I've written a replacement
reactor I am hoping it can track scatter-gather writes a lot better
than ASIO can and therefore handle this situation better without
adding much overhead for tracking the individual operations.
I don't think I understood much of this, but if this makes the library 
not terminate the process in this situation then go for it.