1. Should Boost.AFIO be accepted into Boost? Please state all conditions for acceptance explicitly.
In its current state, Boost.AFIO should NOT be accepted into Boost. However, I am confident that Niall will be able to deliver a future version of AFIO that should be accepted into Boost after another review. There is a great need for more powerful and efficient file handling functionality in C++ than is provided by iostreams and Boost.Filesystem/Filesystem TS. I appreciate the enormous complexity of trying to define a multi-platform filesystem interface, and it is clear that Niall has done a ton of very valuable research on this topic.
2. What is your evaluation of the design?
A key strength of the design is that it provides a common interface for asynchronous operations regardless of whether they are internally implemented using Windows IOCP support, async IO on FreeBSD, or regular POSIX operations via a thread pool.
Another key strength is that it exposes handle-based (i.e. *at) rather than path-based APIs where possible, in order to minimize race conditions, which tend to be difficult or impossible to avoid completely with file system operations.
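For illustration, the difference on POSIX looks like this (a minimal sketch; the paths are hypothetical):

    #include <fcntl.h>

    int main() {
        // Path-based open: racy if any component of "/data" is renamed or
        // replaced between operations.
        int fd_racy = open("/data/config.txt", O_RDONLY);

        // Handle-based open: "config.txt" is resolved relative to an
        // already-open directory handle, so later renames of the
        // directory's own path cannot redirect the lookup.
        int dirfd = open("/data", O_RDONLY | O_DIRECTORY);
        int fd = openat(dirfd, "config.txt", O_RDONLY);
        return (fd_racy >= 0 && fd >= 0) ? 0 : 1;
    }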
I do have a number of concerns about several aspects of the design, however:
Batch API:
- It is not clear what, if any, purpose the batch API serves. The significant added complexity of using it would have to be justified by a significant performance advantage, but it is not at all clear that there is any performance advantage.
- I suggest removing it unless there is a strong argument otherwise.
File handle API:
- The file handle abstraction in AFIO is very heavyweight, involving a shared_ptr to a structure containing a whole bunch of fields in addition to the native file handle, including a mutex, numerous flags, the path used to open it, a pointer to a handle to the parent directory, a chrono time_point specifying the time the file was opened, and the total number of bytes read, written, and written since last fsync using this file handle.
- Because file handles are at the core of AFIO, it is critical that users precisely understand what this abstraction involves, and specifically how it differs from a native file handle.
- I'd be much more comfortable with a much leaner abstraction, like a unique_handle type that just contains the bare native handle and uses unique ownership semantics (a minimal sketch of such a type follows this list). The fat handles impose both computational/memory overhead and mental overhead on the programmer. I understand these extra fields are probably necessary to provide various functionality, particularly emulation of behavior provided by other operating systems, but it would be nicer to be able to opt in to such functionality and only pay for it when needed.
- At the very least, if these fat file handles are kept, a very strong argument needs to be given as a rationale. (Perhaps you have found the overhead to be negligible relative to the cost of the native file handle.)
- It should be possible to construct an AFIO handle from a native file descriptor.
- On POSIX, AFIO should support supplying platform-specific open flags, like O_NOATIME.
- I'd much prefer native_handle() to return the actual appropriate native handle type for the platform, e.g. int on POSIX, rather than a void *. I'm guessing this is done in order to support alternative dispatchers that aren't necessarily based on OS files at all, in the future, but perhaps a better way of doing this that doesn't require casting the void* to an int can be found.
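To make the unique_handle suggestion above concrete, here is a minimal sketch (POSIX flavour; the type is hypothetical, not part of AFIO):

    #include <unistd.h>

    // Hypothetical lean handle: just the native descriptor, unique ownership.
    class unique_handle {
        int fd_ = -1;
    public:
        unique_handle() = default;
        explicit unique_handle(int fd) noexcept : fd_(fd) {}
        unique_handle(const unique_handle &) = delete;
        unique_handle &operator=(const unique_handle &) = delete;
        unique_handle(unique_handle &&o) noexcept : fd_(o.fd_) { o.fd_ = -1; }
        unique_handle &operator=(unique_handle &&o) noexcept {
            if (this != &o) { reset(); fd_ = o.fd_; o.fd_ = -1; }
            return *this;
        }
        ~unique_handle() { reset(); }
        void reset() noexcept { if (fd_ >= 0) { ::close(fd_); fd_ = -1; } }
        int native_handle() const noexcept { return fd_; }
    };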
Future-based API:
- I agree with Thomas Heller regarding the Future-based API:
- The implicit future.then built into the API should be eliminated. Instead, API functions should take a file handle directly.
- future<T> should not implicitly store a file handle. The API functions that open a file can return a future to a file_handle. API functions that don't open a new file should just return a future to the actual result.
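In other words, something shaped roughly like this (names and signatures are purely illustrative, not AFIO's actual API):

    #include <cstddef>
    #include <future>
    #include <string>

    struct dispatcher {};      // stands in for AFIO's dispatcher
    struct file_handle {};     // a plain handle, not wrapped in a future
    enum class file_flags {};  // open flags

    // Opening yields a future to a handle; every other operation takes the
    // handle directly and returns a future to its actual result.
    std::future<file_handle> async_open(dispatcher &d, std::string path, file_flags f);
    std::future<std::size_t> async_read(dispatcher &d, file_handle &h,
                                        void *buf, std::size_t bytes, std::size_t offset);
    std::future<void> async_write(dispatcher &d, file_handle &h,
                                  const void *buf, std::size_t bytes, std::size_t offset);

This shape also fits the suggestions in the next two sections: operations become free functions, and the dispatcher is always explicit.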
Inconsistency in how operations are defined:
- Some operations are methods of the dispatcher, some are methods of the file handle, and others are free functions.
- I suggest that all file system operations be free functions, and take a dispatcher as an argument.
Thread-local current dispatcher:
- Some operations involve an explicitly specified dispatcher, others seem to obtain it from the file handle, and others default to a thread-local dispatcher that has been set.
- Given the nature of AFIO, it is extremely likely for user code to execute from multiple threads, so relying on a thread-local current dispatcher seems very error prone.
- I suggest that the thread-local current dispatcher be eliminated, and that the dispatcher always be specified explicitly to any API call that requires one.
Directory handle cache:
- A cache of open handles to directories is kept by AFIO by default. This is particularly problematic because (as is documented) it makes reading directory entries thread-unsafe by default. If this is to be kept, a very strong rationale needs to be given, and the behavior must be changed somehow to ensure that directory reading is reliable by default.
Enumerate API:
- The future-based API of async_enumerate seems to be unnecessarily complicated, due to the need to specify maxitems and deal with making repeated calls, and all of the copying and memory allocation required to return a std::vector of directory_entry. I suggest instead something like passing in a std::function<bool (expected<directory_entry> const &next)> callback (possibly replacing expected by the new name for your monad type). An empty next value means the end of the directory has been reached. Returning true means continue listing the directory; returning false means stop listing the directory.
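As a declaration, the suggestion would look something like this (names illustrative; expected stands in for the monad type):

    #include <functional>
    #include <future>

    template <class T> struct expected {};  // stand-in for the monad type
    struct dispatcher {};
    struct file_handle {};
    struct directory_entry {};

    // Invoked once per entry; an empty 'next' signals end of directory.
    // Return true to keep enumerating, false to stop.
    std::future<void> async_enumerate(
        dispatcher &d, file_handle &dir,
        std::function<bool(expected<directory_entry> const &next)> callback);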
4. What is your evaluation of the documentation?
- The extensive documentation about race guarantees is extremely useful.
- I appreciate the in-depth examples.
- Much more design rationale is needed.
- The term "filing system" is used in place of the more common "file system", leading me to wonder if there is a difference in meaning.
- Every instance where there is a platform-specific #ifdef in an example should be evaluated, so that you can figure out the appropriate platform-independent abstraction that could be added to AFIO in order to move that #ifdef into the library and out of user code.
- There is an attempt to document differences in behavior on different platforms, but this is not done sufficiently. For example, it is stated that async_write requires that the destination region does not go off the end of the file, but on POSIX pwrite supports this just fine as a way to extend the file. I assume that this restriction applies to Windows only.
- The semantics of the Batch API are not precisely documented, nor is any rationale given for it. For instance, I understand that the operations themselves cannot happen atomically, since that is not supported by operating systems, though this is not clearly stated. It is stated: "Once a batch of input ops has been verified at the point of entry as not errored, you are guaranteed that the batch is atomically scheduled as a whole, unless a failure to allocate memory occurs." Does this mean that if the *_req object involves futures, it waits for all of them to be ready before launching any of the operations in the batch? Is that the purpose of the Batch API, to do a wait_all on all of the relevant futures, then to launch each operation? This issue goes away though if the Batch API is eliminated, as seems to make sense.
- It would be useful to document for each operation which underlying system calls it invokes on each platform.
- Many operations other than open (e.g. rmfile) take file_flags, but it is not specified which flags are supported.
- References to historical versions of AFIO (prior to the review) should be removed from the main parts of the documentation, since they only distract from the useful information. This information could be kept in a different section at the end for the extra curious.
- Formatting of the "description" in the Boost Quickbook format function documentation has some problems: the lack of line breaks between the general description, other notes, and the common note about the use of NT kernel paths makes it difficult to read.
5. What is your evaluation of the potential usefulness of the library?
The library is potentially extremely useful. C++ and Boost iostreams and Boost/TS Filesystem provide only very limited functionality. The C library libuv is a close competitor but seems to offer less functionality and a less convenient API. For achieving similar functionality, the status quo in C++ is to rely on platform-specific C APIs that are much more cumbersome to use.
6. Did you try to use the library? With what compiler? Did you have any problems?
I tried some very simple usage of the library with GCC 5.2.0 on Linux. There were a few warnings (variable set but not used) with -Wall that would be best eliminated.
- The error messages are not ideal: for instance, in the case that there was an error opening a file, this is the result:
Dynamic exception type: boost::system::system_error
std::exception::what: File '/home/jbms/tmp/afio-test/test.txt' not found [Host OS Error: No such file or directory (2) in '/home/jbms/tmp/boost.afio/include/boost/afio/v2/detail/impl/afio.ipp': async_io_handle_posix:805]: No such file or directory.
Op was scheduled at:
1. 0x42fb4f: ./test (+0x2fb4f) : No such file or directory
Some of this is only useful when debugging AFIO, some only useful when debugging the application, and there is a lot of redundancy: for instance, the actual error "No such file or directory" is shown in 3 places. Ideally there would be a way to get a simple message along the lines of "Error opening file '/home/jbms/tmp/afio-test/test.txt': No such file or directory" that could be shown directly to the user. I tried to #define BOOST_AFIO_EXCEPTION_DISABLESOURCEINFO to see if that would make the error message better, but instead that seemed to trigger a bug:
Dynamic exception type: std::logic_error
std::exception::what: basic_string::_S_construct null not valid
- A separate issue is that attempting to read a range that is not present in a file, something that the library should handle elegantly, leads to the program aborting immediately with an error message, rather than allowing the error to be handled:
FATAL EXCEPTION: Failed to read all buffers
1. 0x40f0f4: ./test (+0xf0f4)
terminate called after throwing an instance of 'std::runtime_error'
  what(): Failed to read all buffers
zsh: abort (core dumped) ./test
7. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I gave the library a fairly in-depth study, spending at least 8 hours in total. I looked at the documentation fairly thoroughly, looked through parts of the implementation, and tried running a couple small programs that use the library.
8. Are you knowledgeable about the problem domain?
I have a fair amount of experience with asynchronous programming in C++, and I have pretty extensive knowledge of the file system APIs on Linux.

As a final note, I'd like to thank Niall for all of the work he has done to get the library this far. A powerful, elegant, cross-platform interface for file system operations is very much needed in C++, and I really do want to see (a future version of) this library in Boost and use it myself.
On 31 Aug 2015 at 5:26, Jeremy Maitin-Shepard wrote:
1. Should Boost.AFIO be accepted into Boost? Please state all conditions for acceptance explicitly.
In its current state, Boost.AFIO should NOT be accepted into Boost. However, I am confident that Niall will be able to deliver a future version of AFIO that should be accepted into Boost after another review. There is a great need for more powerful and efficient file handling functionality in C++ than is provided by iostreams and Boost.Filesystem/Filesystem TS. I appreciate the enormous complexity of trying to define a multi-platform filesystem interface, and it is clear that Niall has done a ton of very valuable research on this topic.
Thanks.
2. What is your evaluation of the design?
A key strength of the design is that it provides a common interface for asynchronous operations regardless of whether they are internally implemented using Windows IOCP support, async IO on FreeBSD, or regular POSIX operations via a thread pool.
Another key strength is that it exposes handle-based (i.e. *at) rather than path-based APIs where possible, in order to minimize race conditions, which tend to be difficult or impossible to avoid completely with file system operations.
I do have a number of concerns about several aspects of the design, however:
Batch API:
- It is not clear what, if any, purpose the batch API serves. The significant added complexity of using it would have to be justified by a significant performance advantage, but it is not at all clear that there is any performance advantage.
- I suggest removing it unless there is a strong argument otherwise.
The batch API may, in the end, not prove to be performance positive for the file:/// URI backend. But I can see it could be of great use for http:// or ftp:// URI backends. Even with a zip:// URI backend, batch write would save a lot of rebuilding the ZIP archive.
File handle API:
- The file handle abstraction in AFIO is very heavyweight, involving a shared_ptr to a structure containing a whole bunch of fields in addition to the native file handle, including a mutex, numerous flags, the path used to open it, a pointer to a handle to the parent directory, a chrono time_point specifying the time the file was opened, and the total number of bytes read, written, and written since last fsync using this file handle.
- Because file handles are at the core of AFIO, it is critical that users precisely understand what this abstraction involves, and specifically how it differs from a native file handle.
- I'd be much more comfortable with a much leaner abstraction, like a unique_handle type that just contains the bare native handle, and uses unique ownership semantics. The fat handles impose both computational/memory overhead as well as mental overhead on the programmer. I understand these extra fields are probably necessary to provide various functionality, particularly emulation of behavior provided by other operating systems, but it would be nicer to be able to opt-in to such functionality and only pay for it when needed.
- At the very least, if these fat file handles are kept, a very strong argument needs to be given as a rationale. (Perhaps you have found the overhead to be negligible relative to the cost of the native file handle.)
I really feel the need to strenuously disagree with the notion that AFIO has a "fat" handle. People have no comprehension of the cost of the open()/CreateFile() system call relative to a shared_ptr or any of the other things afio::handle does, and they are fixated on judging AFIO's design choice completely inappropriately as a result.

To show what I mean, here is a gist demonstrating how lightweight AFIO's handle is relatively speaking:

https://gist.github.com/ned14/affa007440f0568e4aa6

This test simply compares opening and closing a file with the same thing plus a shared_ptr creation and destruction. On Linux: +4.76%. On Windows: +1.8%.

Before people go "aha!", remember that afio::handle has proper error handling and has to configure a number of other items, including asking the OS for what its "real" path is and other housekeeping. When I said that the overhead of shared_ptr was about 0.5%, I meant it. I benchmarked it before taking the decision.
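The comparison is essentially of this shape (my paraphrase here, not the gist verbatim; assumes a file "testfile" exists):

    #include <chrono>
    #include <cstdio>
    #include <fcntl.h>
    #include <memory>
    #include <unistd.h>

    int main() {
        const int N = 100000;
        using clock = std::chrono::steady_clock;

        auto t0 = clock::now();
        for (int i = 0; i < N; ++i) {          // baseline: bare open()/close()
            int fd = open("testfile", O_RDONLY);
            close(fd);
        }
        auto t1 = clock::now();
        for (int i = 0; i < N; ++i) {          // same, plus shared_ptr churn
            auto extra = std::make_shared<int>(0);
            (void)extra;
            int fd = open("testfile", O_RDONLY);
            close(fd);
        }
        auto t2 = clock::now();

        auto us = [](auto d) {
            return std::chrono::duration_cast<std::chrono::microseconds>(d).count();
        };
        std::printf("open/close: %lld us, +shared_ptr: %lld us\n",
                    (long long)us(t1 - t0), (long long)us(t2 - t1));
    }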
- It should be possible to construct an AFIO handle from a native file descriptor.
That's already in there: simply subclass afio::handle with your own handle type. I can't allow existing afio::handle to use third party descriptors because they need to be configured very exactly on Windows for AFIO to work correctly. This is less of a problem on POSIX, but it still applies there too. When the handle subclass type is third party, AFIO can recognise the situation and handle it correctly.
- On POSIX, AFIO should support supplying platform-specific open flags, like O_NOATIME.
Hmm, maybe. I can see pros and cons. I've logged it to https://github.com/BoostGSoC13/boost.afio/issues/110.
- I'd much prefer native_handle() to return the actual appropriate native handle type for the platform, e.g. int on POSIX, rather than a void *. I'm guessing this is done in order to support alternative dispatchers that aren't necessarily based on OS files at all, in the future, but perhaps a better way of doing this that doesn't require casting the void* to an int can be found.
It's more ABI stability. Casting void * to int is fine.
- Given the nature of AFIO, it is extremely likely for user code to execute from multiple threads, so relying on a thread-local current dispatcher seems very error prone.
- I suggest that the thread-local current dispatcher be eliminated, and that the dispatcher always be specified explicitly to any API call that requires one.
Noted.
Directory handle cache:
- A cache of open handles to directories is kept by AFIO by default. This is particularly problematic because (as is documented) it makes reading directory entries thread-unsafe by default. If this is to be kept, a very strong rationale needs to be given, and the behavior must be changed somehow to ensure that directory reading is reliable by default.
The documentation explains exactly why there is this cache: it's a big win despite the complexity it introduces. The problem of races on directory enumeration on shared directory handles is eliminated if you make the number of entries enumerated in a single shot bigger than the total. In practice this is not a problem, even on 10M item directories; just throw max_items at the problem.
Enumerate API:
- The future-based API of async_enumerate seems to be unnecessarily complicated, due to the need to specify maxitems and deal with making repeated calls, and all of the copying and memory allocation required to return a std::vector of directory_entry.
If you want race free snapshots (up to the platform specific guarantees), simply make sure maxitems far exceeds any potential contents. std::vector does not copy memory in C++11 when returned by value.
I suggest instead something like passing in a std::function<bool (expected<directory_entry> const &next)> callback (possibly replacing expected by the new name for your monad type). An empty next value means the end of the directory has been reached. Returning true means continue listing the directory; returning false means stop listing the directory.
I didn't do this because we actually do fetch maxitems from the kernel in a single shot. If you ask for 10M items, all 10M items are returned in a single kernel syscall as a single, race free snapshot. Calling some callback per item is therefore not useful. On POSIX, with the current implementation, directory enumeration isn't particularly race free past the inode and type, which is still quite useful much of the time. However, I had in mind that in the future we'd pull the directory inode by hand in a single shot and parse it into the AFIO output. FreeBSD lets you do that without any filing system specific code; unfortunately, Linux requires you to understand ext4 inode layouts.

My point is that you need to think not in terms of conventional directory enumeration. AFIO will enumerate a 1M item directory faster than a traditional design can do 10,000 items. It's so fast you can afford to fetch more than you need and throw away the excess.
- Operations including rmdir, rmfile, symlink, and rmsymlink (as well as their async variants), which normally would not necessarily even operate on file handles, actually return a handle_ptr. Perhaps if on some platforms a handle has to be opened, then there could be a way for the user to specify that a handle is desired. Otherwise, it would make sense to just return void, or the std::future<void> equivalent for async variants.
Already logged.
- The documentation notes that on MS Windows, DOS short names are not supported. I don't have much familiarity with Windows, but if programs generally accept DOS short names when a filename is required, then AFIO should probably support these names as well.
The cost of this support is high, and the benefit very low unless you are likely to run original 16-bit DOS programs on files generated by AFIO (note that 64-bit Windows can no longer run 16-bit DOS programs). You'll run into long path problems on Windows long before 8.3 name compatibility problems.
- The documentation notes: "Don't do directory enumerations from an x86 binary on a x64 Windows system. This is due to a bug in Microsoft's WOW64 syscall translation layer which Microsoft have decided is wontfix." Given that a major purpose of AFIO is to provide a portable abstraction that works around OS limitations, AFIO needs to support directory enumeration even in this case (e.g. by detecting WOW64 and using a synchronous call via a threadpool in that case).
This problem is caused by how ASIO's reactor works and is due to code I can't modify in ASIO. It'll go away when I rewrite ASIO's reactor using lightweight futures because I can control the source code.
- It is noted that to do an asynchronous stat you have to use enumerate with an appropriate glob specified, but constructing such a glob may be non-trivial if the path contains characters that are interpreted specially by the glob syntax, and impossible to do in a race-free manner.
On POSIX, fnmatch(), which is used to implement glob, allows escaping of the metacharacters. On Windows the glob metacharacters are illegal in a path anyway.
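For example (a minimal sketch):

    #include <cassert>
    #include <fnmatch.h>

    int main() {
        // Backslash escapes glob metacharacters (unless FNM_NOESCAPE is set),
        // so a literal '*' in a leafname matches exactly:
        assert(fnmatch("weird\\*name.txt", "weird*name.txt", 0) == 0);
        // ...and no longer matches arbitrary characters:
        assert(fnmatch("weird\\*name.txt", "weirdXname.txt", 0) != 0);
        return 0;
    }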
- The implementation might be better split up into multiple independent files. There are various independent facilities, like secded_ecc, that could be placed in a separate file.
Already logged.
- Although this tends to be a problem with most Boost libraries, it would be nice if the compile times could be reduced, particularly when not in header-only mode. Some libraries like Boost Spirit have an interface inherently based on template metaprogramming, such that long compile times are pretty much unavoidable, but the Boost AFIO interface seems to be mostly non-template based, which suggests that it may be possible to get pretty good compile times.
AFIO, if configured to do so, can achieve excellent compile times; it's something I've worked hard to optimise: https://boostgsoc13.github.io/boost.afio/doc/html/afio/FAQ/slow_compile.html
4. What is your evaluation of the documentation?
- The extensive documentation about race guarantees is extremely useful.
- I appreciate the in-depth examples.
- Much more design rationale is needed.
- The term "filing system" is used in place of the more common "file system", leading me to wonder if there is a difference in meaning.
- Every instance where there is a platform-specific #ifdef in an example should be evaluated, so that you can figure out the appropriate platform-independent abstraction that could be added to AFIO in order to move that #ifdef into the library and out of user code.
- There is an attempt to document differences in behavior on different platforms, but this is not done sufficiently. For example, it is stated that async_write requires that the destination region does not go off the end of the file, but on POSIX pwrite supports this just fine as a way to extend the file. I assume that this restriction applies to Windows only.
It's more that pwrite off the end of the file is inherently racy to concurrent read/writers. If you want safe automatic file length extension on writes, open a handle with the append flag and use that. AFIO fatal exits the process if you try to read or write past the current file size: it's a logic error, and your program is inherently broken.
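On POSIX the append-flag route looks like this (a sketch; the filename is hypothetical and error handling is elided):

    #include <fcntl.h>
    #include <unistd.h>

    int main() {
        // O_APPEND: the kernel atomically positions every write at the
        // current end of file, so concurrent writers cannot race on the
        // file length the way pwrite past EOF can.
        int fd = open("log.txt", O_WRONLY | O_CREAT | O_APPEND, 0644);
        write(fd, "entry\n", 6);
        close(fd);
        return 0;
    }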
- The semantics of the Batch API are not precisely documented, nor is any rationale given for it. For instance, I understand that the operations themselves cannot happen atomically, since that is not supported by operating systems, though this is not clearly stated. It is stated: "Once a batch of input ops has been verified at the point of entry as not errored, you are guaranteed that the batch is atomically scheduled as a whole, unless a failure to allocate memory occurs." Does this mean that if the *_req object involves futures, it waits for all of them to be ready before launching any of the operations in the batch? Is that the purpose of the Batch API, to do a wait_all on all of the relevant futures, then to launch each operation? This issue goes away though if the Batch API is eliminated, as seems to make sense.
- It would be useful to document for each operation which underlying system calls it invokes on each platform.
- Many operations other than open (e.g. rmfile) take file_flags, but it is not specified which flags are supported.
Good point. Logged to https://github.com/BoostGSoC13/boost.afio/issues/111.
- References to historical versions of AFIO (prior to the review) should be removed from the main parts of the documentation, since they only distract from the useful information. This information could be kept in a different section at the end for the extra curious.
- Formatting of the "description" in the Boost Quickbook format function documentation has some problems: the lack of line breaks between the general description, other notes, and the common note about the use of NT kernel paths makes it difficult to read.
5. What is your evaluation of the potential usefulness of the library?
The library is potentially extremely useful. C++ and Boost iostreams and Boost/TS Filesystem provide only very limited functionality. The C library libuv is a close competitor but seems to offer less functionality and a less convenient API. For achieving similar functionality, the status quo in C++ is to rely on platform-specific C APIs that are much more cumbersome to use.
6. Did you try to use the library? With what compiler? Did you have any problems?
I tried some very simple usage of the library with GCC 5.2.0 on Linux. There were a few warnings (variable set but not used) with -Wall that would be best eliminated.
Logged to https://github.com/BoostGSoC13/boost.afio/issues/112
- The error messages are not ideal: for instance, in the case that there was an error opening a file, this is the result:
Dynamic exception type: boost::system::system_error
std::exception::what: File '/home/jbms/tmp/afio-test/test.txt' not found [Host OS Error: No such file or directory (2) in '/home/jbms/tmp/boost.afio/include/boost/afio/v2/detail/impl/afio.ipp': async_io_handle_posix:805]: No such file or directory.
Op was scheduled at:
1. 0x42fb4f: ./test (+0x2fb4f) : No such file or directory
Some of this is only useful when debugging AFIO, some only useful when debugging the application, and there is a lot of redundancy: for instance, the actual error "No such file or directory" is shown in 3 places. Ideally there would be a way to get a simple message along the lines of "Error opening file '/home/jbms/tmp/afio-test/test.txt': No such file or directory" that could be shown directly to the user.
Those "No such file or directory" were from the stack backtracer. I assume it could not figure out your debug info. Regarding your request, in the lightweight futures rewrite we will exclusively use std::error_code for unexceptional errors, and therefore you will have exactly as you ask.
- A separate issue is that attempting to read a range that is not present in a file, something that the library should handle elegantly, leads to the program aborting immediately with an error message, rather than allowing the error to be handled:
There is no way of gracefully handling this without introducing a lot of complexity to IPC with concurrent readers and writers. Besides, it's a logic error to do i/o outside the file's maximum permitted extent, and this is enforced in all asynchronous file i/o applications far outside AFIO. The reason I chose a fatal exception is that this is not a recoverable error: this is a fundamental mistake in your implementation (or memory corruption), and you need to fix it.
7. How much effort did you put into your evaluation? A glance? A quick reading? In-depth study?
I gave the library a fairly in-depth study, spending at least 8 hours in total. I looked at the documentation fairly thoroughly, looked through parts of the implementation, and tried running a couple small programs that use the library.
8. Are you knowledgeable about the problem domain?
I have a fair amount of experience with asynchronous programming in C++, and I have pretty extensive knowledge of the file system APIs on Linux.
As a final note, I'd like to thank Niall for all of the work he has done to get the library this far. A powerful, elegant, cross-platform interface for file system operations is very much needed in C++, and I really do want to see (a future version of) this library in Boost and use it myself.
Thank you for your review!

Niall
--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
On 31.08.2015 19:18, Niall Douglas wrote:
On 31 Aug 2015 at 5:26, Jeremy Maitin-Shepard wrote:
- At the very least, if these fat file handles are kept, a very strong argument needs to be given as a rationale. (Perhaps you have found the overhead to be negligible relative to the cost of the native file handle.)
I really feel the need to strenuously disagree with the notion that AFIO has a "fat" handle. People have no comprehension of the cost of the open()/CreateFile() system call relative to a shared_ptr or any of the other things afio::handle does, and they are fixated on judging AFIO's design choice completely inappropriately as a result.
To show what I mean, here is a gist demonstrating how lightweight AFIO's handle is relatively speaking:
https://gist.github.com/ned14/affa007440f0568e4aa6
This test simply compares opening and closing a file with the same thing plus a shared_ptr creation and destruction.
On Linux: +4.76%. On Windows: +1.8%.
Before people go "aha!", remember that afio::handle has proper error handling and has to configure a number of other items, including asking the OS for what its "real" path is and other housekeeping.
Why does the library do that when not asked? Why not collect this data only when the user requests?
When I said that the overhead of shared_ptr was about 0.5%, I meant it. I benchmarked it before taking the decision.
I think others have pointed that performance is not the only issue with shared_ptrs.
- I'd much prefer native_handle() to return the actual appropriate native handle type for the platform, e.g. int on POSIX, rather than a void *. I'm guessing this is done in order to support alternative dispatchers, that aren't necessarily based on OS files at all, in the future, but perhaps a better way of doing this that doesn't require casting the void* to an int, can be found.
It's more ABI stability.
Stability with respect to what? Do you expect native handles to be somehow different on the same platform? Or are you trying to maintain stable ABI across different platforms? Why on earth would you do that?
Casting void * to int is fine.
No, it's not. It's a hack that you impose on the library users. According to [expr.reinterpret.cast]/4 it's not even required to compile, as int may not be large enough to hold a pointer.
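If such a round trip must exist at all, the integer types guaranteed (where provided) to survive it are intptr_t/uintptr_t, not int:

    #include <cassert>
    #include <cstdint>

    int main() {
        int fd = 5;  // illustrative descriptor value
        // intptr_t is defined (when present) to round-trip a void*;
        // plain int is not, which is the [expr.reinterpret.cast]/4 problem.
        void *p = reinterpret_cast<void *>(static_cast<std::intptr_t>(fd));
        int back = static_cast<int>(reinterpret_cast<std::intptr_t>(p));
        assert(back == fd);
        return 0;
    }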
Directory handle cache:
- A cache of open handles to directories is kept by AFIO by default. This is particularly problematic because (as is documented) it makes reading directory entries thread-unsafe by default. If this is to be kept, a very strong rationale needs to be given, and the behavior must be changed somehow to ensure that directory reading is reliable by default.
The documentation explains exactly why there is this cache: it's a big win despite the complexity it introduces. The problem of races on directory enumeration on shared directory handles is eliminated if you make the number of entries enumerated in a single shot bigger than the total. In practice this is not a problem, even on 10M item directories; just throw max_items at the problem.
Will this result in the process having 10M open handles permanently?
Enumerate API:
- The future-based API of async_enumerate seems to be unnecessarily complicated, due to the need to specify maxitems and deal with making repeated calls, and all of the copying and memory allocation required to return a std::vector of directory_entry.
If you want race free snapshots (up to the platform specific guarantees), simply make sure maxitems far exceeds any potential contents.
std::vector does not copy memory in C++11 when returned by value.
Correction: when returned with a move constructor or RVO.
- There is an attempt to document differences in behavior on different platforms, but this is not done sufficiently. For example, it is stated that async_write requires that the destination region does not go off the end of the file, but on POSIX pwrite supports this just fine as a way to extend the file. I assume that this restriction applies to Windows only.
It's more that pwrite off the end of the file is inherently racy to concurrent read/writers. If you want safe automatic file length extension on writes, open a handle with the append flag and use that.
AFIO fatal exits the process if you try to read or write past the current file size: it's a logic error, and your program is inherently broken.
This is unacceptable behavior IMO. I would expect an exception in such a case.
On 31 Aug 2015 at 20:39, Andrey Semashev wrote:
Before people go "aha!", remember that afio::handle has proper error handling and has to configure a number of other items, including asking the OS for what its "real" path is and other housekeeping.
Why does the library do that when not asked? Why not collect this data only when the user requests?
The problem is the race free filesystem support, which isn't as simple, unfortunately, as simply using the POSIX *at() functions. You need a minimum of the true canonical path of the fd, and its stat_t struct (specifically the inode, created and modified timestamps). That's at least two additional syscalls just for those, and AFIO may do as many as seven syscalls per handle open, depending.

You can disable the race free semantics, which are slow, using afio::file_flags::no_race_protection. However AFIO fundamentally assumes that any file system path consuming function is slow and will not be frequently called by applications expecting high performance, so work is deliberately loaded into path consuming functions to keep it away from all other functions.

If you have a need to do a lot of file opens and closes and don't care about races on the file system, you are better off not using AFIO. STL iostreams is perfectly good for this, as the only OSs which allow async file opens and closes are QNX and the Hurd.
- I'd much prefer native_handle() to return the actual appropriate native handle type for the platform, e.g. int on POSIX, rather than a void *. I'm guessing this is done in order to support alternative dispatchers that aren't necessarily based on OS files at all, in the future, but perhaps a better way of doing this that doesn't require casting the void* to an int can be found.
It's more ABI stability.
Stability with respect to what? Do you expect native handles to be somehow different on the same platform?
AFIO's design allows many backends e.g. ZIP archives, HTTP etc. So yes, native handles can vary on the same platform and I needed to choose one for the ABI.
Casting void * to int is fine.
No, it's not. It's a hack that you impose on the library users. According to [expr.reinterpret.cast]/4 it's not even required to compile, as int may not be large enough to hold a pointer.
The native handle on POSIX is stored internally as an int. In native_handle(), it is cast to a size_t, and then to a void *. Please do correct me if I am wrong, but I had thought that this is defined behaviour:

    int a = 5;
    void *b = (void *)(size_t)a;
    int c = (int)(size_t)b;
    assert(c == a);

This is certainly a very common pattern in C.
Directory handle cache:
- A cache of open handles to directories is kept by AFIO by default. This is particularly problematic because (as is documented) it makes reading directory entries thread-unsafe by default. If this is to be kept, a very strong rationale needs to be given, and the behavior must be changed somehow to ensure that directory reading is reliable by default.
The documentation explains exactly why there is this cache: it's a big win despite the complexity it introduces. The problem of races on directory enumeration on shared directory handles is eliminated if you make the number of entries enumerated in a single shot bigger than the total. In practice this is not a problem, even on 10M item directories; just throw max_items at the problem.
Will this result in the process having 10M open handles permanently?
God no, it's exactly to achieve the opposite. Open handle count is severely limited on POSIX platforms, and AFIO engages in quite a bit of complexity to reduce the problem of running into process limits on open fds. If you are careful to keep metadata lookups within the directory_entry::metadata_fastpath() set, having to open each file entry to read metadata can be avoided. Your 10M item directory is therefore enumerated without opening a handle 10M times.
std::vector does not copy memory in C++11 when returned by value.
Correction: when returned with a move constructor or RVO.
Which AFIO makes sure of, and I have verified as working.
It's more that pwrite off the end of the file is inherently racy to concurrent read/writers. If you want safe automatic file length extension on writes, open a handle with the append flag and use that.
AFIO fatal exits the process if you try to read or write past the current file size: it's a logic error, and your program is inherently broken.
This is unacceptable behavior IMO. I would expect an exception in such a case.
AFIO assumes the only reason you are reading and writing past the maximum file size is either a logic error or memory corruption. It cannot tell which is which quickly, so we assume it's memory corruption, as your program should never have this logic error. And as I explained before, if AFIO thinks memory corruption has occurred it can no longer know if what it has been told to do is correct or wrong, and therefore not terminating the process could cause damage to other people's data.

I do have a very big concern that anything using AFIO is open to a denial of service attack: simply shrink a file it is using from another process to cause the AFIO-using process to fatal exit. So I will be trying to improve on the current situation in the engine rewrite. It's not as easy as it looks if you want to keep reads and writes fast, believe me, but again the ASIO reactor is in my way because that is code I can't control. Once I've written a replacement reactor I am hoping it can track scatter-gather writes a lot better than ASIO can and therefore handle this situation better without adding much overhead for tracking the individual operations.

Niall
--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
On Tue, Sep 1, 2015 at 3:42 PM, Niall Douglas wrote:
On 31 Aug 2015 at 20:39, Andrey Semashev wrote:
Before people go "aha!", remember that afio::handle has proper error handling and has to configure a number of other items, including asking the OS for what its "real" path is and other housekeeping.
Why does the library do that when not asked? Why not collect this data only when the user requests?
The problem is the race free filesystem support which isn't as simple, unfortunately, as simply using the POSIX *at() functions.
You need a minimum of the true canonical path of the fd, and its stat_t struct (specifically the inode, created and modified timestamps). That's at least two additional syscalls just for those, and AFIO may do as many as seven syscalls per handle open depending.
Can you describe in more detail exactly why this information is needed/how it is used?
You can disable the race free semantics, which are slow, using afio::file_flags::no_race_protection. However AFIO fundamentally assumes that any file system path consuming function is slow and will not be frequently called by applications expecting high performance, so work is deliberately loaded into path consuming functions to keep it away from all other functions.
If you have a need to do a lot of file opens and closes and don't care about races on the file system, you are better off not using AFIO. STL iostreams is perfectly good for this, as the only OSs which allow async file opens and closes are QNX and the Hurd.
There is certainly a large gap between what is possible with STL iostreams or Boost filesystem and what is possible using platform-specific APIs for file system manipulations. While it is obviously your choice what scope to pick, I think it may be possible for it to be usable for everything you intend to do with it, e.g. writing a database backend, while also being much more widely useful.
- I'd much prefer native_handle() to return the actual appropriate native handle type for the platform, e.g. int on POSIX, rather than a void *. I'm guessing this is done in order to support alternative dispatchers that aren't necessarily based on OS files at all, in the future, but perhaps a better way of doing this that doesn't require casting the void* to an int can be found.
It's more ABI stability.
Stability with respect to what? Do you expect native handles to be somehow different on the same platform?
AFIO's design allows many backends e.g. ZIP archives, HTTP etc. So yes, native handles can vary on the same platform and I needed to choose one for the ABI.
Perhaps you could provide a less generic API for the native platform filesystem, and then once you also have support for ZIP archives, etc. create a more generic interface on top. While archive files and different network protocols like WebDAV and FTP can certainly be presented as file systems, particularly for read-only access, and indeed it is often possible to access them as file systems through the native platform APIs, they tend to differ sufficiently from native file systems that a different API may be more suitable. [snip]
std::vector does not copy memory in C++11 when returned by value.
Correction: when returned with a move constructor or RVO.
Which AFIO makes sure of, and I have verified as working.
What I meant is that returning std::vector<directory_entry> means that you have to allocate memory for each path. In contrast, with a callback-based API, you could reuse the same directory_entry for all of the entries.
It's more that pwrite off the end of the file is inherently racy to concurrent read/writers. If you want safe automatic file length extension on writes, open a handle with the append flag and use that.
AFIO fatal exits the process if you try to read or write past the current file size: it's a logic error, and your program is inherently broken.
This is unacceptable behavior IMO. I would expect an exception in such a case.
AFIO assumes the only reason you are reading and writing past the maximum file size is either a logic error or memory corruption. It cannot tell which is which quickly, so we assume it's memory corruption as your program should never have this logic error. And as I explained before, if AFIO thinks memory corruption has occurred it can no longer know if what it has been told to do is correct or wrong, and therefore not terminating the process could cause damage to other people's data.
I do have a very big concern that anything using AFIO is open to a denial of service attack: simply shrink a file it is using from another process to cause the AFIO-using process to fatal exit. So I will be trying to improve on the current situation in the engine rewrite. It's not as easy as it looks if you want to keep reads and writes fast, believe me, but again the ASIO reactor is in my way because that is code I can't control. Once I've written a replacement reactor I am hoping it can track scatter-gather writes a lot better than ASIO can and therefore handle this situation better without adding much overhead for tracking the individual operations.
Given that you go to great lengths to provide reasonable behavior in the presence of other types of concurrent modification, it is very surprising that the library makes it impossible to handle concurrent file truncation. Writes could also fail due to running out of disk space or quota. Without this, AFIO is essentially only usable in highly controlled circumstances, which would seem to make all of the other race guarantees unnecessary. For instance, it would not be usable for writing a general file synchronization, backup, or file server program.
On 1 Sep 2015 at 17:40, Jeremy Maitin-Shepard wrote:
Before people go "aha!", remember that afio::handle has proper error handling and has to configure a number of other items, including asking the OS for what its "real" path is and other housekeeping.
Why does the library do that when not asked? Why not collect this data only when the user requests?
The problem is the race free filesystem support which isn't as simple, unfortunately, as simply using the POSIX *at() functions.
You need a minimum of the true canonical path of the fd, and its stat_t struct (specifically the inode, created and modified timestamps). That's at least two additional syscalls just for those, and AFIO may do as many as seven syscalls per handle open depending.
Can you describe in more detail exactly why this information is needed/how it is used?
This is a complex and lengthy answer if answered in full, and it is a large part of the AFIO value add. The housekeeping done during handle open is used throughout AFIO to skip work in later operations, under the assumption that handle opens are not common.

Limiting the discussion to just the problem of race free filesystem on POSIX, imagine the problem of opening the sibling of a file in a directory whose path is constantly changing. POSIX does not provide an API for opening the parent directory of an open file descriptor. To work around this, you must first get the canonical current path of the open file descriptor, strip off the end, open that directory, and then use that as a base directory for opening a file with the same leafname as your original file. You then loop all of that if you fail to open that leafname, or if the leafname opened has a different inode. Once you have the correct parent directory, you can open the sibling. This is an example of where caching the stat_t of a handle during open saves syscalls and branches in more performance important APIs later on.

A similar problem exists for race free file deletions and a long list of other scenarios. The cause is a number of defects in the POSIX race free APIs; the Austin Working Group are aware of the problem. Windows doesn't have problems with race free siblings and deletions due to a much better thought through race free API, but it does have other problems with deletions not being actual deletions, and different workarounds are needed there.
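In rough code, the dance looks something like this (a Linux-flavoured sketch using /proc to recover the canonical path; AFIO's real implementation is more involved and reuses the stat_t cached at open):

    #include <cstddef>
    #include <fcntl.h>
    #include <limits.h>
    #include <string>
    #include <sys/stat.h>
    #include <unistd.h>

    // Open the parent directory of 'fd', verifying we got the right one.
    // Returns a directory fd, or -1 on failure.
    int open_parent_dir(int fd) {
        struct stat fs;
        if (fstat(fd, &fs) < 0) return -1;  // AFIO caches this at handle open
        for (;;) {
            // Recover the canonical current path of the descriptor.
            char buf[PATH_MAX];
            std::string link = "/proc/self/fd/" + std::to_string(fd);
            ssize_t n = readlink(link.c_str(), buf, sizeof(buf) - 1);
            if (n <= 0) return -1;
            buf[n] = '\0';
            std::string path(buf);
            std::size_t slash = path.rfind('/');
            if (slash == std::string::npos) return -1;
            std::string dir = slash ? path.substr(0, slash) : std::string("/");
            std::string leaf = path.substr(slash + 1);
            int dirfd = open(dir.c_str(), O_RDONLY | O_DIRECTORY);
            if (dirfd < 0) return -1;
            // Same leafname still the same inode? Then this is the parent.
            struct stat ls;
            if (fstatat(dirfd, leaf.c_str(), &ls, 0) == 0 &&
                ls.st_ino == fs.st_ino && ls.st_dev == fs.st_dev)
                return dirfd;
            close(dirfd);  // directory was renamed in between; retry
        }
    }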
You can disable the race free semantics, which are slow, using afio::file_flags::no_race_protection. However AFIO fundamentally assumes that any file system path consuming function is slow and will not be frequently called by applications expecting high performance, so work is deliberately loaded into path consuming functions to keep it away from all other functions.
If you have a need to do a lot of file opens and closes and don't care about races on the file system, you are better off not using AFIO. STL iostreams is perfectly good for this, as the only OSs which allow async file opens and closes are QNX and the Hurd.
There is certainly a large gap between what is possible with STL iostreams or Boost filesystem and what is possible using platform-specific APIs for file system manipulations. While it is obviously your choice what scope to pick, I think it may be possible for it to be usable for everything you intend to do with it, e.g. writing a database backend, while also being much more widely useful.
It's always easy to say "it should do everything optimally". If I had more resources I could do much better than I intend to do. As it stands, without sponsorship you get just 350-400 hours per year; that's just two months' full-time equivalent. It's very limiting. You have to rationalise.
Stability with respect to what? Do you expect native handles to be somehow different on the same platform?
AFIO's design allows many backends e.g. ZIP archives, HTTP etc. So yes, native handles can vary on the same platform and I needed to choose one for the ABI.
Perhaps you could provide a less generic API for the native platform filesystem, and then once you also have support for ZIP archives, etc. create a more generic interface on top. While archive files and different network protocols like WebDAV and FTP can certainly be presented as file systems, particularly for read-only access, and indeed it is often possible to access them as file systems through the native platform APIs, they tend to differ sufficiently from native file systems that a different API may be more suitable.
I think this is another storm in a teacup. Dropping generic filesystem backends just because native_handle() doesn't return an int? Seems way overkill. Generic filesystem backends could let you do loopback mounts and a long list of value add scenarios. I would consider them a vital design point.
std::vector does not copy memory in C++11 when returned by value.
Correction: when returned with a move constructor or RVO.
Which AFIO makes sure of, and I have verified as working.
What I meant is that returning std::vector<directory_entry> means that you have to allocate memory for each path. In contrast, with a callback-based API, you could reuse the same directory_entry for all of the entries.
I can see what you mean. Ok, I'll have a think about it during the engine refactor.
I do have a very big concern that anything using AFIO is open to a denial of service attack: simply shrink a file it is using from another process to cause the AFIO-using process to fatal exit. So I will be trying to improve on the current situation in the engine rewrite. It's not as easy as it looks if you want to keep reads and writes fast, believe me, but again the ASIO reactor is in my way because that is code I can't control. Once I've written a replacement reactor I am hoping it can track scatter-gather writes a lot better than ASIO can and therefore handle this situation better without adding much overhead for tracking the individual operations.
Given that you go to great lengths to provide reasonable behavior in the presence of other types of concurrent modification, it is very surprising that the library makes it impossible to handle concurrent file truncation.
It's more that we explicitly give up on trying. File length queries are inherently useless. NTFS and ZFS have a particular knack of straight out lying to you for some random period. I eventually decided it was safer to just not try.
Writes could also fail due to running out of disk space or quota. Without this AFIO is essentially only usable in highly controlled circumstances, which would seem to make all of the other race guarantees unnecessary. For instance it would not be usable for writing a general file synchronization, backup, or file server program.
Also correct. Delayed allocation means the file system may only try to actually allocate storage on a first write into that extent, and that might fail. This would then blow up with a fatal app exit thanks to AFIO's current implementation. As I said, I am very aware of this; I just needed lightweight futures done before I could start the ASIO reactor replacement.

In the meantime, nobody attempting database type systems ever allows the filing system to run out of free space, so it's still useful for those sorts of application. Most major operating systems can still suffer permanent unfixable damage from running out of free space on their system drive, even in this modern age.

Niall
--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
On Tue, Sep 1, 2015 at 7:36 PM, Niall Douglas wrote:
This is a complex and lengthy answer if answered in full, and is a large part of the AFIO value add. The housekeeping done during handle open is used throughout AFIO to skip work in later operations under the assumption that handle opens are not common.
Limiting the discussion to just the problem of race free filesystem on POSIX, imagine the problem of opening the sibling of a file in a directory whose path is constantly changing. POSIX does not provide an API for opening the parent directory of an open file descriptor. To work around this, you must first get the canonical current path of the open file descriptor, strip off the end, open that directory and then use that as a base directory for opening a file with the same leafname as your original file. You then loop all that if you fail to open that leafname, or the leafname opened has a different inode. Once you have the correct parent directory, you can open the sibling. This is an example of where caching the stat_t of handle during open saves syscalls and branches in more performance important APIs later on.
The normal way to do this with POSIX *at APIs would be to just open a handle to the directory in the first place. I suppose the purpose of this more complex approach is to avoid having to keep an extra file descriptor to the directory open, or to allow the user to open a sibling file from an arbitrary AFIO file handle without preparing in advance (as I suppose would be required by your shadow file-based locking approach). It does seem like rather specific and not necessarily all that common functionality to require users to pay for by default.
A similar problem exists for race free file deletions and a long list of other scenarios. The cause is a number of defects in the POSIX race free APIs. The Austin Working Group are aware of the problem. Windows doesn't have problems with race free siblings and deletions due to a much better thought through race free API, but it does have other problems with deletions not being actual deletions and different workarounds are needed there.
Personally, I would prefer an API that lets me pay only for what I need. You could expose the low-level platform-specific behavior but also provide higher-level operations that have added complexity to avoid races or emulate behavior not provided natively.
You can disable the race free semantics, which are slow, using afio::file_flags::no_race_protection. However AFIO fundamentally assumes that any file system path consuming function is slow and will not be frequently called by applications expecting high performance, so work is deliberately loaded into path consuming functions to keep it away from all other functions.
If you have a need to do a lot of file opens and closes and don't care about races on the file system, you are better off not using AFIO. STL iostreams is perfectly good for this, as the only OSs which allow async file opens and closes are QNX and the Hurd.
There is certainly a large gap between what is possible with STL iostreams or Boost filesystem and what is possible using platform-specific APIs for file system manipulations. While it is obviously your choice what scope to pick, I think it may be possible for it to be usable for everything you intend to do with it, e.g. writing a database backend, while also being much more widely useful.
It's always easy to say "it should do everything optimally". If I had more resources I could do much better than I intend to do. As it stands, without sponsorship you get just 350-400 hours per year; that's just two months' full-time equivalent. It's very limiting. You have to rationalise.
I certainly understand. I'm just trying to convey what I'd like to see in a C++ filesystem API.
Stability with respect to what? Do you expect native handles to be somehow different on the same platform?
AFIO's design allows many backends e.g. ZIP archives, HTTP etc. So yes, native handles can vary on the same platform and I needed to choose one for the ABI.
Perhaps you could provide a less generic API for the native platform filesystem, and then once you also have support for ZIP archives, etc. create a more generic interface on top. While archive files and different network protocols like WebDAV and FTP can certainly be presented as file systems, particularly for read-only access, and indeed it is often possible to access them as file systems through the native platform APIs, they tend to differ sufficiently from native file systems that a different API may be more suitable.
I think this is another storm in a teacup. Dropping generic filesystem backends just because native_handle() doesn't return an int? Seems way overkill.
Generic filesystem backends could let you do loopback mounts and a long list of value add scenarios. I would consider them a vital design point.
I suspect that there may be a relatively easy solution to the particular problem of type safety for native_handle(), e.g. by exposing a platform-specific handle type, but also having some type erasure.

I agree that having a generic interface to filesystem-like things is potentially useful. However, I think this is quite a tricky thing, and it is difficult to define such an interface without looking at the capabilities of all of the backends you'd like to support and what functionality different applications would like to use. For instance, some backends might only support reading/writing entire files at once. Some might not allow random access. Some might only allow appending new files. Symlinks, hardlinks, file permissions, modes, flags and other metadata may not be supported or may operate completely differently. (Even if e.g. POSIX file owner/group and mode are supported, the semantics will in general be completely different than for a local filesystem.) Using Boost.Filesystem path doesn't really make sense for anything other than the native OS filesystem APIs.

Indeed there are numerous existing libraries/systems that try to do this, and even ways to expose arbitrary things as filesystems to the operating system, e.g. FUSE on Linux, so that individual programs don't need to concern themselves with it. However, there tend to be drawbacks to trying to make these pseudo-filesystems appear as regular filesystems.
Also correct. Delayed allocation means the file system may only try to actually allocate storage on a first write into that extent, and that might fail. This would then blow up with a fatal app exit thanks to AFIO's current implementation. As I said, I am very aware of this, I just needed lightweight futures done before I could start the ASIO reactor replacement.
It isn't clear to me why it would be particularly hard to expose this as a regular error rather than a fatal one, but it sounds like you are planning to fix it anyway.
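A minimal POSIX sketch of surfacing that failure eagerly (this is not AFIO's mechanism, just an illustration): force the allocation at file extension time so ENOSPC arrives as an ordinary error rather than at first write into a delayed-allocation extent.

    #include <fcntl.h>

    // Returns true if the whole range [0, new_length) is really allocated.
    bool extend_file(int fd, off_t new_length) {
        // posix_fallocate() returns 0 on success or an errno value directly;
        // unlike most POSIX calls it does not set errno.
        return posix_fallocate(fd, 0, new_length) == 0;
    }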
On 1 Sep 2015 at 23:51, Jeremy Maitin-Shepard wrote:
Limiting the discussion to just the problem of race free filesystem on POSIX, imagine the problem of opening the sibling of a file in a directory whose path is constantly changing. POSIX does not provide an API for opening the parent directory of an open file descriptor. To work around this, you must first get the canonical current path of the open file descriptor, strip off the end, open that directory and then use that as a base directory for opening a file with the same leafname as your original file. You then loop all that if you fail to open that leafname, or the leafname opened has a different inode. Once you have the correct parent directory, you can open the sibling. This is an example of where caching the stat_t of handle during open saves syscalls and branches in more performance important APIs later on.
The normal way to do this with POSIX *at APIs would be to just open a handle to the directory in the first place. I suppose the purpose of this more complex approach is to avoid having to keep an extra file descriptor to the directory open, or to allow the user to open a sibling file from an arbitrary AFIO file handle without preparing in advance (as I suppose would be required by your shadow file-based locking approach). It does seem like rather specific and not necessarily all that common functionality to require users to pay for by default.
The approach you just suggested doesn't handle the case where you open a file, and then it gets moved before the sibling lookup. AFIO can actually be asked to cache the containing directory of a handle. It will attempt to use that handle, when available, to skip inode looping. This feature needs to be specifically enabled as it is off by default.
A similar problem exists for race free file deletions and a long list of other scenarios. The cause is a number of defects in the POSIX race free APIs. The Austin Working Group are aware of the problem. Windows doesn't have problems with race free siblings and deletions due to a much better thought through race free API, but it does have other problems with deletions not being actual deletions and different workarounds are needed there.
Personally, I would prefer an API that lets me pay only for what I need. You could expose the low-level platform-specific behavior but also provide higher-level operations that have added complexity to avoid races or emulate behavior not provided natively.
As you're almost certainly aware by now, I benchmark each feature and decide if it's worth defaulting to on or defaulting to off given my best judgement of all the factors involved. If you really want race protection off all the time, simply instantiate a dispatcher with a file_flags mask which disables race protection for all operations on that dispatcher. Niall
On Wed, Sep 2, 2015 at 11:50 AM, Niall Douglas
The approach you just suggested doesn't handle the case where you open a file, and then it gets moved before the sibling lookup.
Fair enough; I think the desired semantics would depend on the particular case. It seems there is always a time at which you issue the open call in order to open/create the sibling file, and there will inevitably be a race between that and concurrently moving the original file.
On 2 Sep 2015 at 15:44, Jeremy Maitin-Shepard wrote:
The approach you just suggested doesn't handle the case where you open a file, and then it gets moved before the sibling lookup.
Fair enough; I think the desired semantics would depend on the particular case. It seems there is always a time at which you issue the open call in order to open/create the sibling file, and there will inevitably be a race between that and concurrently moving the original file.
Indeed. AFIO generally provides race guarantees for sibling lookups to the level of the parent directory only i.e. the parent directory and anything above it may be permuting and that is not racy, but the contents of the parent directory and deeper are not race free. Each operation has different guarantees on different platforms, and those are listed per operation in the reference docs. Niall
Can you describe in more detail exactly why this information is needed/how it is used?
This is a complex and lengthy answer if answered in full, and is a large part of the AFIO value add. The housekeeping done during handle open is used throughout AFIO to skip work in later operations under the assumption that handle opens are not common.
Limiting the discussion to just the problem of race free filesystem on POSIX, imagine the problem of opening the sibling of a file in a directory whose path is constantly changing. POSIX does not provide an API for opening the parent directory of an open file descriptor. To work around this, you must first get the canonical current path of the open file descriptor, strip off the end, open that directory and then use that as a base directory for opening a file with the same leafname as your original file. You then loop all that if you fail to open that leafname, or the leafname opened has a different inode. Once you have the correct parent directory, you can open the sibling. This is an example of where caching the stat_t of the handle during open saves syscalls and branches in more performance-important APIs later on.
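A minimal sketch of that loop, assuming Linux's /proc/self/fd for canonical path retrieval (AFIO's actual implementation differs and bounds its retries); note the fstat() per iteration that a stat_t cached at open time makes unnecessary:

    #include <fcntl.h>
    #include <limits.h>
    #include <string>
    #include <sys/stat.h>
    #include <unistd.h>

    // Opens "sibling", which lives next to fd's file named "leaf".
    // Returns the new descriptor, or -1 on error.
    int open_sibling(int fd, const std::string &leaf, const std::string &sibling) {
        for (;;) {
            // 1. Canonical current path of the open descriptor (Linux-specific).
            char buf[PATH_MAX];
            std::string link = "/proc/self/fd/" + std::to_string(fd);
            ssize_t n = readlink(link.c_str(), buf, sizeof(buf) - 1);
            if (n <= 0) return -1;
            std::string path(buf, (size_t)n);
            // 2. Strip the leafname and open the candidate parent directory.
            int dirfd = open(path.substr(0, path.rfind('/')).c_str(),
                             O_RDONLY | O_DIRECTORY);
            if (dirfd < 0) return -1;
            // 3. Does that directory hold our leafname with our inode?
            struct stat a, b;
            if (fstat(fd, &a) == 0 && fstatat(dirfd, leaf.c_str(), &b, 0) == 0 &&
                a.st_ino == b.st_ino && a.st_dev == b.st_dev) {
                // 4. Correct parent found: open the sibling relative to it.
                int sib = openat(dirfd, sibling.c_str(), O_RDONLY);
                close(dirfd);
                return sib;
            }
            close(dirfd);  // the file was moved in between; loop and retry
        }
    }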
Sorry for asking a question again (and I'm sure Niall will ignore it because he judges it as 'disrespectful, petty, or juvenile', but perhaps somebody else can answer this): In another mail you said AFIO wouldn't cache anything, thus must do a lot of syscalls over and over again, which is the reason for it to be slow. How does that relate to what you said above? Sorry if I misunderstood something - again. Regards Hartmut
On 2 Sep 2015 at 7:20, Hartmut Kaiser wrote:
Limiting the discussion to just the problem of race free filesystem on POSIX, imagine the problem of opening the sibling of a file in a directory whose path is constantly changing. POSIX does not provide an API for opening the parent directory of an open file descriptor. To work around this, you must first get the canonical current path of the open file descriptor, strip off the end, open that directory and then use that as a base directory for opening a file with the same leafname as your original file. You then loop all that if you fail to open that leafname, or the leafname opened has a different inode. Once you have the correct parent directory, you can open the sibling. This is an example of where caching the stat_t of handle during open saves syscalls and branches in more performance important APIs later on.
Sorry for asking a question again (and I'm sure Niall will ignore it because he judges it as 'disrespectful, petty, or juvenile', but perhaps somebody else can answer this):
You could treat me with the respect due to someone working in this field for several years the same as the respect I treat you for your ample experience in your domain. If I treated you the way you treat me, you would explode as we have seen before with you shouting at other people on this list who you felt were being disrespectful to you. I am not incompetent. If I chose a design, it was for a reason. It may have been a suboptimal choice, but it is not due to lack of experience nor a lack of prior research. I have been programming for over twenty years, and more than fifteen commercially. I have a very different style and approach to you, but that does not make it inferior or to be mocked or attacked just because I am imprecise in your opinion.
In another mail you said AFIO wouldn't cache anything, thus must do a lot of syscalls over and over again, which is the reason for it to be slow.
We don't second guess the OS. If the OS lies to us - and it does, frequently - we pass through those lies exactly. Similarly, if you loop requesting the length of a file, you'll get back exactly what the OS tells us each time. We always ask the OS for the length if you ask AFIO for the length. No caching.

However the semantics emulation and workaround code still needs to correctly emulate any missing semantics on some platform in order to provide a universal file system model to the AFIO user. AFIO provides race-free sibling lookup which works just fine on Windows, but requires some extra work on POSIX, so on POSIX it runs the algorithm described above.

The difference is not unlike the difference between FreeBSD and Linux with respect to POSIX. POSIX defines a universal portable standard, and both Linux and FreeBSD implement most of that standard. FreeBSD implements POSIX far more accurately than Linux, and has correspondingly lower peak performance in everything from mutexes to opening files.

However, once one includes the cost of the workarounds for Linux's divergences from POSIX in application code, much of that performance gap is eliminated. FreeBSD also scales out much better than Linux because the lower peak performance in a 1-4 CPU machine is really a situation with a higher fixed cost but a less steep slope on the variable cost curve, so as you ramp up the hardware FreeBSD scales better.

AFIO is similar to POSIX here. It provides a universal standard model. All platforms require workarounds to match that model, but the workarounds vary between even Linux and FreeBSD. If AFIO must execute extra syscalls on some particular platform in some particular case, you should accept that as a sunk cost on that platform and not worth worrying about in the bigger picture, as the simplified application code with fewer workarounds is likely to perform better, be easier to maintain, and transparently improves as OS facilities improve. Your application code is also much more portable.

Niall
Thanks! Regards Hartmut
Niall Douglas wrote
On 31 Aug 2015 at 20:39, Andrey Semashev wrote: Please do correct me if I am wrong, but I had thought that this is defined behaviour:
    int a=5;
    void *b=(void *)(size_t) a;
    int c=(int)(size_t) b;
    assert(c==a);
This is certainly a very common pattern in C.
Although it's a common pattern, it's not correct. It has UB since the standard doesn't guarantee that sizeof(void*) == sizeof(int). The issues arise on 64-bit systems where memory addresses are 64 bits wide and int is 32 bits wide. The safe way to handle integer conversion of pointers is through std::intptr_t, whose only purpose is to provide an integer type guaranteed to have the same width as pointers.
I don't see how a 64-bit pointer can't store a 32-bit integer. He's not converting some arbitrary void* to an int in the third line. He knows it is actually a 32-bit int stored in a void*.
On 9/12/2015 3:39 PM, Michael Marcin wrote:
Oops my newsreader wasn't updating, didn't realize this was old and already discussed. Apologies for the noise.
On 2015-09-01 18:42, Niall Douglas wrote:
Please do correct me if I am wrong, but I had thought that this is defined behaviour:
    int a=5;
    void *b=(void *)(size_t) a;
    int c=(int)(size_t) b;
    assert(c==a);
This is certainly a very common pattern in C.
I believe this is implementation-defined. Based on N3797 [expr.reinterpret.cast], paragraph 5: "A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined." The behaviour that is defined (pointer to integer and back) is the reverse of the one you want. John Bytheway
On 2 Sep 2015 at 5:31, John Bytheway wrote:
I believe this is implementation-defined. Based on N3797 [expr.reinterpret.cast], paragraph 5:
"A value of integral type or enumeration type can be explicitly converted to a pointer. A pointer converted to an integer of sufficient size (if any such exists on the implementation) and back to the same pointer type will have its original value; mappings between pointers and integers are otherwise implementation-defined."
The behaviour that is defined (pointer to integer and back) is the reverse of the one you want.
That's the whole point of the intermediate size_t cast. int to size_t is defined behaviour, size_t to void * is defined behaviour, void * to size_t is defined behaviour, size_t to int is defined behaviour. Older code, and still a lot of C code, directly casts between int and void * and back again, even though strictly speaking that's a no-no. Niall
On 02.09.2015 20:55, Niall Douglas wrote:
That's the whole point of the intermediate size_t cast. int to size_t is defined behaviour, size_t to void * is defined behaviour, void * to size_t is defined behaviour, size_t to int is defined behaviour.
I am absolutely no expert in C++, but I was under the impression that while it's highly likely, it's by no means guaranteed that size_t can hold a pointer. IIRC if a platform has segmented memory then this assumption could break. In any case, uintptr_t should be available on your target compilers, no? If so, that sounds like the better choice to me. I may be showing my inexperience, so feel free to ignore the above if it's bonkers :)

That said, as a humble Boost user, native_handle() returning void* sounds a bit iffy to me. Why is it a pointer if it does not point to anything, but must instead be cast to the appropriate type by the user? Cheers - Asbjørn
On 2 Sep 2015 at 23:20, Asbjørn wrote:
I am absolutely no expert in C++, but I was under the impression that while it's highly likely, it's by no means guaranteed that size_t can hold a pointer.
True. It is however true on all the platforms AFIO supports. That is still no excuse, however: size_t is for array indexes, not addresses, for which uintptr_t is the better type.
IIRC if a platform has segmented memory then this assumption could break.
In any case, uintptr_t should be available on your target compilers no? If so, that sounds like the better choice to me.
I've logged the issue to https://github.com/BoostGSoC13/boost.afio/issues/117 to do just that.
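For reference, the uintptr_t-based round trip would look something like this (helper names are illustrative, not AFIO's API):

    #include <cassert>
    #include <cstdint>

    // int -> intptr_t is a standard integral conversion; intptr_t <-> void*
    // is implementation-defined but value-preserving on mainstream platforms.
    inline void *fd_to_opaque(int fd) {
        return reinterpret_cast<void *>(static_cast<std::intptr_t>(fd));
    }
    inline int opaque_to_fd(void *h) {
        return static_cast<int>(reinterpret_cast<std::intptr_t>(h));
    }

    int main() {
        assert(opaque_to_fd(fd_to_opaque(5)) == 5);
    }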
That said, as a humble Boost user, native_handle() returning void* sounds a bit iffy to me. Why is it a pointer if it does not point to anything but should rather be casted to the appropriate type by the user?
Old habits of a C programmer: void * is the "anything" type that carries maximum information. Besides, there is precedent: Boost.Python defines the standard opaque Python-can't-understand-this type as a void * (I contributed the implementation myself a very long time ago). Niall
On 03.09.2015 01:21, Niall Douglas wrote:
Old habits of a C programmer: void * is the "anything" type that carries maximum information. Besides, there is precedent: Boost.Python defines the standard opaque Python-can't-understand-this type as a void * (I contributed the implementation myself a very long time ago).
Yeah I can understand that. However personally, again as a humble Boost user, I would prefer if you "dressed it up" with a typedef so that in the docs and user code there is no mention of this being a void*. This would be a bit similar to ASIO's "native_handle_type". Cheers - Asbjørn
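A sketch of that suggestion, mirroring ASIO's convention (the class shape here is illustrative, not AFIO's actual type):

    class handle {
    public:
        // Today an opaque pointer-sized value; hiding it behind a typedef
        // leaves room to change the representation per platform or backend.
        using native_handle_type = void *;
        native_handle_type native_handle() const noexcept { return native_; }
    private:
        native_handle_type native_ = nullptr;
    };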
On 02.09.2015 21:55, Niall Douglas wrote:
That's the whole point of the intermediate size_t cast. int to size_t is defined behaviour, size_t to void * is defined behaviour, void * to size_t is defined behaviour, size_t to int is defined behaviour.
Except that it is not guaranteed that the original and resulting ints are equal.
Older code, and still a lot of C code, directly casts between int and void * and back again, even though strictly speaking that's a no-no.
Horrible code exists. No need to add yet more.
On 2015-09-02 14:55, Niall Douglas wrote:
That's the whole point of the intermediate size_t cast. int to size_t is defined behaviour, size_t to void * is defined behaviour, void * to size_t is defined behaviour, size_t to int is defined behaviour.
To be clear, do you mean standard-defined (as opposed to implementation-defined)? I'm assuming so. Then, as Andrey pointed out, the result is not defined. Only integer values that were obtained from a previous cast from a pointer can be safely cast to a pointer type (unless you have evidence from elsewhere in the standard to suggest more).

If you actually want defined behaviour here, I think you should use a union { int; void*; }. That ought not to have any performance penalty.

That said, I imagine that in practice the implementation-defined behaviour of what you're doing will be what you want in any implementation where using AFIO makes sense.

John
On 2015-09-04 05:52, John Bytheway wrote:
If you actually want defined behaviour here, I think you should use a union { int; void*; }. That ought not to have any performance penalty.
I realised that this comment was ambiguous. To clarify: I do *not* mean using a union for type-punning to convert the int to a void*; rather I mean using a union as the native handle type so you never need to convert the int to a void*. John
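That is, something like the following as the native handle type itself (one reading of the suggestion, not a reviewed design): the handle stores whichever member the backend uses, so no int-to-void* conversion ever happens and no implementation-defined mapping is relied upon.

    union native_handle_type {
        int fd;        // POSIX file descriptor backends
        void *handle;  // Windows HANDLE and other pointer-based backends
    };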
On 02.09.2015 01:42, Niall Douglas wrote:
On 31 Aug 2015 at 20:39, Andrey Semashev wrote:
Before people go "aha!", remember that afio::handle has proper error handling and has to configure a number of other items, including asking the OS for what its "real" path is and other housekeeping.
Why does the library do that when not asked? Why not collect this data only when the user requests?
The problem is the race free filesystem support which isn't as simple, unfortunately, as simply using the POSIX *at() functions.
You need a minimum of the true canonical path of the fd, and its stat_t struct (specifically the inode, created and modified timestamps). That's at least two additional syscalls just for those, and AFIO may do as many as seven syscalls per handle open depending.
But if my program doesn't need these properties, why do I have to pay to obtain them from the system? The argument that opening a file is already expensive doesn't work as you're just making it yet more expensive for no benefit for the user. Also, I don't think I understand how obtaining this additional data helps to achieve race free semantics. But that's probably because I'm not an expert in the field.
You can disable the race free semantics, which are slow, using afio::file_flags::no_race_protection.
Will this avoid obtaining any unnecessary data? I.e. will opening a file then be equivalent to a single syscall, which is to open the file? If so, I think this should be the default and any additional data to obtain should be an opt-in.
If you have a need to do a lot of file opens and closes and don't care about races on the file system, you are better off not using AFIO. STL iostreams is perfectly good for this, as the only OSs which allow async file opens and closes are QNX and the Hurd.
It's all about the total effect of the optimization. I could use std::iostreams to process 1000 files in serial, or I could spawn 1000 threads and do the same somewhat faster, or I could use Boost.AFIO, hoping to save some development time and avoid CPU overcommit for roughly the same amount of benefit in terms of performance as my naive 1000-thread code. IMO, Boost.AFIO should live up to this expectation, otherwise I don't see much use for it and it should probably not be called AFIO.

I think you target two very different goals - provide an async API for file system operations and provide a race-free API to the file system. One of the key benefits of making things async is to make it more responsive, if not plain faster. If that is not achieved then there is no point in being async, as it just complicates the code. I think many reviewers who took a look at Boost.AFIO were looking for this kind of benefit, and seeing you saying that Boost.AFIO should not be used for that is surprising and discouraging. (This is why I questioned the name AFIO above.)

Race-free API may be interesting on its own, but apparently being race-free hampers performance, which is essential for the first goal. Additionally, race-free does not mean async, so the race-free filesystem library could be synchronous and as a consequence simpler to implement and use. Combining these two goals in one library, you're making compromises that result in neither a good async library nor a simple race-free one. Perhaps you should decide what you want to achieve in your library.
- I'd much prefer native_handle() to return the actual appropriate native handle type for the platform, e.g. int on POSIX, rather than a void *. I'm guessing this is done in order to support alternative dispatchers, that aren't necessarily based on OS files at all, in the future, but perhaps a better way of doing this that doesn't require casting the void* to an int, can be found.
It's more ABI stability.
Stability with respect to what? Do you expect native handles to be somehow different on the same platform?
AFIO's design allows many backends e.g. ZIP archives, HTTP etc. So yes, native handles can vary on the same platform and I needed to choose one for the ABI.
I wouldn't call that 'native' handles. In my understanding native means the underlying OS primitive. If you intend to emulate a file system API on top of non-file system entities, such as archives and remote services, then you probably need to design the library very differently. Something closer to ASIO comes to mind - each backend should define its own set of types, including the underlying primitives, and those types should be propagated up to the user's interface. The whole library should be much more flexible and concept-oriented.
Casting void * to int is fine.
No, it's not. It's a hack that you impose on the library users. According to [expr.reinterpret.cast]/4 it's not even required to compile, as int may not be large enough to hold a pointer.
The native handle on POSIX is stored internally as an int. In native_handle(), it is cast to a size_t, and then to a void *.
Please do correct me if I am wrong, but I had thought that this is defined behaviour:
    int a=5;
    void *b=(void *)(size_t) a;
    int c=(int)(size_t) b;
    assert(c==a);
This is certainly a very common pattern in C.
The standard defines conversion of a pointer to an integer of sufficient size (which is (u)intptr_t, and not necessarily size_t) and back, and that this roundtrip loses no information. The bit mapping function between pointers and integers is not defined and hence there is no guarantee that any integer can be represented in a pointer. The other way is true though, as long as there exists (u)intptr_t, which is also not guaranteed (i.e. pointers are allowed to be something completely different from integers). Besides all that, requiring your users to do tricks like that is not acceptable, IMO. We're in C++; use the type system it offers.
It's more that pwrite off the end of the file is inherently racy to concurrent read/writers. If you want safe automatic file length extension on writes, open a handle with the append flag and use that.
AFIO fatal exits the process if you try to read or write past the current file size, it's a logic error and your program is inherently broken.
This is unacceptable behavior IMO. I would expect an exception in such case.
AFIO assumes the only reason you are reading and writing past the maximum file size is either a logic error or memory corruption. It cannot tell which is which quickly, so we assume it's memory corruption as your program should never have this logic error. And as I explained before, if AFIO thinks memory corruption has occurred it can no longer know if what it has been told to do is correct or wrong, and therefore not terminating the process could cause damage to other people's data.
I'll repeat myself and say this is being too smart. This situation is perfectly recoverable and does not mean memory corruption (at least, far from 100% of the time). It should not result in program termination.
I do have a very big concern that this is a denial of service attack on anything using AFIO: simply shrink a file it is using from another process and you cause the AFIO-using process to fatal exit. So I will be trying to improve on the current situation in the engine rewrite. It's not as easy as it looks if you want to keep reads and writes fast, believe me, but again the ASIO reactor is in my way because that is code I can't control. Once I've written a replacement reactor I am hoping it can track scatter-gather writes a lot better than ASIO can and therefore handle this situation better without adding much overhead for tracking the individual operations.
I don't think I understood much of this, but if this makes the library not terminate the process in this situation then go for it.
On 2 Sep 2015 at 14:46, Andrey Semashev wrote:
You can disable the race free semantics, which are slow, using afio::file_flags::no_race_protection.
Will this avoid obtaining any unnecessary data? I.e. will opening a file then be equivalent to a single syscall, which is to open the file? If so, I think this should be the default and any additional data to obtain should be an opt-in.
It definitely breaks race protection, and you lose an undefined amount of that, i.e. there is still some race protection, but an undefined quantity. As for less work, it gives no guarantees. Probably yes, it will involve many fewer syscalls. The reason it will never be defaulted on is because it introduces substantial behaviour differences between platforms. Even FreeBSD and Linux will no longer behave the same. That's the whole point of the emulation and workaround code: to fix up platform-specific differences so you don't need to think about them when writing code against AFIO. You can program against a standardised, universal filesystem model and not worry about platform-specific differences (mostly). If you do disable race protection - and you can turn it off per operation, per handle or per dispatcher - you are explicitly exchanging predictability for an unspecified performance increase.
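Based on the flags discussed in this thread, dispatcher-wide disabling would look roughly like this (the construction call shape is an assumption, not checked against the AFIO docs; only file_flags::no_race_protection is quoted verbatim above):

    // Trade race-protection predictability for speed on every operation
    // issued through this dispatcher (call shape assumed, see above).
    auto d = boost::afio::make_dispatcher(
        "file:///", boost::afio::file_flags::no_race_protection);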
I think you target two very different goals - provide an async API for file system operations and provide a race-free API to file system. One of the key benefits of making things async is to make it more responsive, if not plain faster.
I will repeat myself once again because you are not listening to me: making things async without a fundamental rearchitecture makes them *slower* not faster. That even goes for socket i/o - feel free to test 1000 threads using a synchronous socket against single threaded ASIO with 1000 sockets. Here are some empirical benchmarks by Google: https://www.mailinator.com/tymaPaulMultithreaded.pdf. Async i/o is *always* slower because you are doing more work. It didn't used to be once upon a time, but in any OS of the past decade it is - as that Google presentation shows. The ONLY good reason to go async is CONTROL. You can achieve better worst case performance with async. You can achieve more easily better design with async. That goes for sockets as well as file i/o.
If that is not achieved then there is no point in being async as it just complicates the code. I think many reviewers who took a look at Boost.AFIO were looking for this kind of benefit and seeing you saying that Boost.AFIO should not be used for that is surprising and discouraging. (This is why I questioned the name AFIO above.)
Many reviewers have inaccurate starting preconceptions. They believe async automatically correlates with faster. That might have been true in the 1990s and early 2000s. It is no longer true, without a fundamental rearchitecture i.e. throw out all your design preconceptions.
Race-free API may be interesting on its own, but apparently being race-free hampers performance, which is essential for the first goal.
Again, I repeat the point about control and worst case behaviour. AFIO can and does produce superb file system performance an order of magnitude or more in excess of anything achievable with the STL or anything in Boost. But you'll need to completely rewrite your algorithms according to file system theory first. If I don't reply to any reply you make, I will have gone on my holidays away from Boost. Thanks though for the feedback Andrey. Niall
On 02.09.2015 22:18, Niall Douglas wrote:
On 2 Sep 2015 at 14:46, Andrey Semashev wrote:
The reason it will never be defaulted on is because it introduces substantial behaviour differences between platforms. Even FreeBSD and Linux will no longer behave the same. That's the whole point of the emulation and workaround code: to fix up platform-specific differences so you don't need to think about them when writing code against AFIO. You can program against a standardised, universal filesystem model and not worry about platform-specific differences (mostly).
If you do disable race protection - and you can turn it off per operation, per handle or per dispatcher - you are explicitly exchanging predictability for an unspecified performance increase.
Ok, so it's not about just being race-free, it's about providing a portable API. The cost on this I can accept, as long as the portable API is not requiring too much emulation on every platform. I'm not sure this precondition is fulfilled though, based on your comments. I mean, what portable API would require obtaining file metadata and actual path on opening a file?
I think you target two very different goals - provide an async API for file system operations and provide a race-free API to file system. One of the key benefits of making things async is to make it more responsive, if not plain faster.
I will repeat myself once again because you are not listening to me: making things async without a fundamental rearchitecture makes them *slower* not faster. That even goes for socket i/o - feel free to test 1000 threads using a synchronous socket against single threaded ASIO with 1000 sockets.
I wasn't comparing a _single_ threaded async solution to a 1000-thread sync one - that would be silly of me. I would expect AFIO to perform async operations in parallel, involving a thread pool with an optimal number of threads, if necessary. I would expect it to be about as fast, and possibly use less memory too. Much depends on the async reactor implementation though.
If that is not achieved then there is no point in being async as it just complicates the code. I think many reviewers who took a look at Boost.AFIO were looking for this kind of benefit and seeing you saying that Boost.AFIO should not be used for that is surprising and discouraging. (This is why I questioned the name AFIO above.)
Many reviewers have inaccurate starting preconceptions. They believe async automatically correlates with faster. That might have been true in the 1990s and early 2000s. It is no longer true, without a fundamental rearchitecture i.e. throw out all your design preconceptions.
No one's arguing about having to design your code properly to make use of asynchronicity. Async programming is a form of parallel programming, just like multithreaded programming is. Either can be slower or faster than the other for different reasons, but both are arguably faster than an equivalent synchronous single threaded program. That's where the speedup expectation is coming from, and I think it's a fair expectation. Async code is notably harder to write than multithreaded, so you have to give something in return if you design your library the async way. If not performance or resource utilization then what? You mention control and worst case performance, but I can't see why an async program would do any better in this regard than an equivalent multithreaded sync one. Can you give a more concrete example when it does?
Race-free API may be interesting on its own, but apparently being race-free hampers performance, which is essential for the first goal.
Again, I repeat the point about control and worst case behaviour.
AFIO can and does produce superb file system performance an order of magnitude or more in excess of anything achievable with the STL or anything in Boost. But you'll need to completely rewrite your algorithms according to file system theory first.
I'm sorry, you lost me here. You just said that AFIO, being an async library, will always be slower than STL, and now you say its performance is superb?
On 3 Sep 2015 at 1:53, Andrey Semashev wrote:
Race-free API may be interesting on its own, but apparently being race-free hampers performance, which is essential for the first goal.
Again, I repeat the point about control and worst case behaviour.
AFIO can and does produce superb file system performance an order of magnitude or more in excess of anything achievable with the STL or anything in Boost. But you'll need to completely rewrite your algorithms according to file system theory first.
I'm sorry, you lost me here. You just said that AFIO, being an async library, will always be slower than STL, and now you say its performance is superb?
I'm just about to go to bed, and we go on family holiday tomorrow, before which I'll unsubscribe from boost-dev until January. But just to answer the above: if you program AFIO using the same algorithms and programming model as the STL, you will almost always get slower code. AFIO provides much stronger guarantees, and ASIO adds more work.

With AFIO you can use algorithms and models much more tightly aligned with how filesystems actually work - designs not possible with the STL. Such designs can be an order of magnitude faster or better. You can see what I mean in the workshop tutorial, where the first example is written using STL iostreams, and the second and third using AFIO but still using a file per key-value. The fourth example has its design described in detail, and as you will quickly realise, such a design is not possible with the STL or anything currently in Boost. I would expect lookup performance to be at least 10x-1000x faster and write performance at least 10x faster than the STL-compatible design of a file per key-value.

Niall
On Mon, Aug 31, 2015 at 5:18 PM, Niall Douglas
On 31 Aug 2015 at 5:26, Jeremy Maitin-Shepard wrote:
1. Should Boost.AFIO be accepted into Boost? Please state all conditions for acceptance explicity.
In its current state, Boost.AFIO should NOT be accepted into Boost. However, I am confident that Niall will be able to deliver a future version of AFIO that should be accepted into Boost after another review. There is a great need for more powerful and efficient file handling functionality in C++ than is provided by iostreams and Boost.Filesystem/Filesystem TS. I appreciate the enormous complexity of trying to define a multi-platform filesystem interface, and it is clear that Niall has done a ton of very valuable research on this topic.
Thanks.
2. What is your evaluation of the design?
A key strength of the design is that it provides a common interface for asynchronous operations regardless of whether they are internally implemented using Windows IOCP support, async IO on FreeBSD, or regular POSIX operations via a thread pool.
Another key strength is that it exposes handle-based (i.e. *at), rather than path-based APIs where possible in order to minimize race conditions, which tend to be difficult or impossible to avoid completely with file system operations.
I do have a number of concerns about several aspects of the design, however:
Batch API:
- It is not clear what, if any, purpose the batch API serves. The significant added complexity of using it would have to be justified by a significant performance advantage, but it is not at all clear that there is even any performance advantage at all.
- I suggest removing it unless there is a strong argument otherwise.
The batch API may, in the end, not prove to be performance positive for the file:/// URI backend. But I can see it could be of great use for http:// or ftp:// URI backends. Even with a zip:// URI backend, batch write would save a lot of rebuilding the ZIP archive.
File handle API:
- The file handle abstraction in AFIO is very heavy weight, involving a shared_ptr to a structure containing a whole bunch of fields in addition to the native file handle, including a mutex, numerous flags, the path used to open it, a pointer to a handle to the parent directory, a chrono time_point specifying the time the file was opened, the total number of bytes read, written, and written since last fsync using this file handle.
- Because file handles are at the core of AFIO, it is critical that users precisely understand what this abstraction involves, and specifically how it differs from a native file handle.
- I'd be much more comfortable with a much leaner abstraction, like a unique_handle type that just contains the bare native handle, and uses unique ownership semantics. The fat handles impose both computational/memory overhead as well as mental overhead on the programmer. I understand these extra fields are probably necessary to provide various functionality, particularly emulation of behavior provided by other operating systems, but it would be nicer to be able to opt-in to such functionality and only pay for it when needed.
- At the very least, if these fat file handles are kept, a very strong argument needs to be given as a rationale. (Perhaps you have found the overhead to be negligible relative to the cost of the native file handle.)
I really feel the need to strenuously disagree with the notion that AFIO has a "fat" handle. People have no comprehension of the cost of the open()/CreateFile() system call relative to a shared_ptr or any of the other things afio::handle does, and they are fixated on judging AFIO's design choice completely inappropriately as a result.
To show what I mean, here is a gist demonstrating how lightweight AFIO's handle is relatively speaking:
https://gist.github.com/ned14/affa007440f0568e4aa6
This test simply compares opening and closing a file with the same thing plus a shared_ptr creation and destruction.
On Linux: +4.76%. On Windows: +1.8%.
Before people go "aha!", remember that afio::handle has proper error handling and has to configure a number of other items, including asking the OS for what its "real" path is and other housekeeping.
When I said that the overhead of shared_ptr was about 0.5%, I meant it. I benchmarked it before taking the decision.
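The comparison has essentially this shape (a paraphrase of the approach, not the gist's code):

    #include <chrono>
    #include <cstdio>
    #include <fcntl.h>
    #include <memory>
    #include <unistd.h>

    // Times n open()/close() cycles, optionally wrapping each descriptor in a
    // freshly allocated shared_ptr to measure the handle wrapper's overhead.
    double run(bool with_shared_ptr, int n = 100000) {
        auto t0 = std::chrono::steady_clock::now();
        for (int i = 0; i < n; ++i) {
            int fd = open("testfile", O_RDONLY);
            if (with_shared_ptr) {
                auto p = std::make_shared<int>(fd);
                close(*p);
            } else {
                close(fd);
            }
        }
        auto t1 = std::chrono::steady_clock::now();
        return std::chrono::duration<double>(t1 - t0).count();
    }

    int main() {
        double plain = run(false), wrapped = run(true);
        std::printf("plain: %.3fs  with shared_ptr: %.3fs\n", plain, wrapped);
    }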
- It should be possible to construct an AFIO handle from a native file descriptor.
That's already in there: simply subclass afio::handle with your own handle type. I can't allow existing afio::handle to use third party descriptors because they need to be configured very exactly on Windows for AFIO to work correctly. This is less of a problem on POSIX, but it still applies there too. When the handle subclass type is third party, AFIO can recognise the situation and handle it correctly.
- On POSIX, AFIO should support supplying platform-specific open flags, like O_NOATIME.
Hmm, maybe. I can see pros and cons. I've logged it to https://github.com/BoostGSoC13/boost.afio/issues/110.
- I'd much prefer native_handle() to return the actual appropriate native handle type for the platform, e.g. int on POSIX, rather than a void *. I'm guessing this is done in order to support alternative dispatchers, that aren't necessarily based on OS files at all, in the future, but perhaps a better way of doing this that doesn't require casting the void* to an int, can be found.
It's more ABI stability. Casting void * to int is fine.
- Given that it is extremely likely for user code to execute from multiple threads given the nature of AFIO, relying on a thread-local current dispatcher seems very error prone.
- I suggest that the thread-local current dispatcher be eliminated, and that the dispatcher always be specified explicitly to any API call that requires one.
Noted.
Directory handle cache:
- A cache of open handles to directories is kept by AFIO by default. This is particularly problematic because (as is documented) it makes reading directory entries thread-unsafe by default. If this is to be kept, a very strong rationale needs to be given, and the behavior must be changed somehow to ensure that directory reading is reliable by default.
The documentation explains exactly why there is this cache, it's a big win despite the complexity it introduces. The problem of races on directory enumeration on shared directory handles is eliminated if you make the number of entries enumerated in a single shot bigger than the total. In practice, this is not a problem, even on 10M item directories, just throw max_items at the problem.
Enumerate API:
- The future-based API of async_enumerate seems to be unnecessarily complicated, due to the need to specify maxitems and deal with making repeated calls, and all of the copying and memory allocation required to return a std::vector of directory_entry.
If you want race free snapshots (up to the platform specific guarantees), simply make sure maxitems far exceeds any potential contents.
std::vector does not copy memory in C++11 when returned by value.
I suggest instead something like passing in a std::function<bool (expected<directory_entry> const &next)> callback (possibly replacing expected by the new name for your monad type). An empty next value means end of directory has been reached. Returning true means continue listing the directory, returning false means stop listing the directory.
I didn't do this because we actually do fetch maxitems from the kernel in a single shot. If you ask for 10M items, all 10M items are returned in a single kernel syscall as a single, race free snapshot. Calling some callback per item is therefore not useful.
On POSIX, with the current implementation, directory enumeration isn't particularly race free past the inode and type, which is still quite useful much of the time. However, I had in mind that in the future we'd pull the directory inode by hand in a single shot and parse it into the AFIO output. FreeBSD lets you do that without any filing system specific code; unfortunately Linux requires you to understand ext4 inode layouts.
My point is that you need to think not in terms of conventional directory enumeration. AFIO will enumerate a 1M item directory faster than a traditional design can do 10,000 items. It's so fast you can afford to fetch more than you need and throw away the excess.
- Operations including rmdir, rmfile, symlink, rmsymlink (as well as async variants) that normally would not necessarily even operate on file handles, actually return a handle_ptr. Perhaps if on some platforms a handle has to be opened, then there could be a way for the user to specify that a handle is desired. Otherwise, it would make sense to just return void or std::future<void> equivalent for async variants.
Already logged.
- The documentation notes that on MS Windows, DOS short names are not supported. I don't have much familiarity with Windows, but if programs generally accept DOS short names when a filename is required, then AFIO should probably support these names as well.
The cost of this support is high, and the benefit very low unless you are likely to run original 16-bit DOS programs on files generated by AFIO (note that 64-bit Windows can no longer run 16-bit DOS programs). You'll run into long path problems on Windows long before 8.3 name compatibility problems.
- The documentation notes: "Don't do directory enumerations from an x86 binary on a x64 Windows system. This is due to a bug in Microsoft's WOW64 syscall translation layer which Microsoft have decided is wontfix. " Given that a major purpose of AFIO is to provide a portable abstraction that works around OS limitations, AFIO needs to support directory enumeration even in this case (e.g. by detecting WOW64 and using a synchronous call via a threadpool in that case).
This problem is caused by how ASIO's reactor works and is due to code I can't modify in ASIO. It'll go away when I rewrite ASIO's reactor using lightweight futures because I can control the source code.
- It is noted that to do an asynchronous stat you have to use enumerate with an appropriate glob specified, but constructing such a glob may be non-trivial if the path contains characters that are interpreted specially by the glob syntax, and impossible to do in a race-free manner.
On POSIX fnmatch() which is used to implement glob allows escaping of the metacharacters. On Windows the glob metacharacters are illegal in a path anyway.
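For example (POSIX fnmatch() without FNM_NOESCAPE, so backslash escapes are honoured):

    #include <cassert>
    #include <fnmatch.h>

    int main() {
        // Escaping the metacharacters matches the literal name "data[1].txt";
        // fnmatch() returns 0 on a match.
        assert(fnmatch("data\\[1\\].txt", "data[1].txt", 0) == 0);
        // Unescaped, "[1]" is a character class matching the single char '1'.
        assert(fnmatch("data[1].txt", "data1.txt", 0) == 0);
    }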
- The implementation might be better split up into multiple independent files. There are various independent facilities, like secded_ecc, that could be placed in a separate file.
Already logged.
- Although this tends to be a problem with most Boost libraries, it would be nice if the compile times could be reduced, particularly when not in header-only mode. Some libraries, like Boost.Spirit, have an interface inherently based on template metaprogramming, such that long compile times are pretty much unavoidable, but the Boost.AFIO interface seems to be mostly non-template-based, which suggests that good compile times should be achievable.
AFIO, if configured to do so, can achieve excellent compile times; it's something I've worked hard to optimise: https://boostgsoc13.github.io/boost.afio/doc/html/afio/FAQ/slow_compile.html
4. What is your evaluation of the documentation?
- The extensive documentation about race guarantees is extremely useful.
- I appreciate the in-depth examples.
- Much more design rationale is needed.
- The term "filing system" is used in place of the more common "file system", leading me to wonder if there is a difference in meaning.
- Every instance where there is a platform-specific #ifdef in an example should be evaluated, so that you can figure out the appropriate platform-independent abstraction that could be added to AFIO in order to move that #ifdef into the library and out of user code.
- There is an attempt to document differences in behavior on different platforms, but this is not done sufficiently. For example, it is stated that async_write requires that the destination region does not go off the end of the file, but on POSIX pwrite supports this just fine as a way to extend the file. I assume that this restriction applies to Windows only.
It's more that pwrite off the end of the file is inherently racy to concurrent read/writers. If you want safe automatic file-length extension on writes, open a handle with the append flag and use that.
AFIO fatal exits the process if you try to read or write past the current file size; it's a logic error and your program is inherently broken.
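To make the distinction concrete in plain POSIX terms (a sketch, not AFIO code): pwrite() past EOF does extend a file, but a concurrent reader can observe the intermediate states, whereas O_APPEND makes each write an atomic append at the current end of file:

    #include <fcntl.h>
    #include <unistd.h>

    // Racy with concurrent readers/writers: extends the file by writing
    // past the current EOF, leaving a hole before the written byte.
    void extend_by_pwrite(int fd)
    {
        char byte = 0;
        (void)pwrite(fd, &byte, 1, 1 << 20);
    }

    // Race-free alternative: every write() on this descriptor atomically
    // appends at whatever the end of file is at the time of the write.
    int open_for_safe_append(const char* path)
    {
        return open(path, O_WRONLY | O_APPEND);
    }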
- The semantics of the Batch API are not precisely documented, nor is any rationale given for it. For instance, I understand that the operations themselves cannot happen atomically, since that is not supported by operating systems, though this is not clearly stated. It is stated: "Once a batch of input ops has been verified at the point of entry as not errored, you are guaranteed that the batch is atomically scheduled as a whole, unless a failure to allocate memory occurs." Does this mean that if the *_req object involves futures, it waits for all of them to be ready before launching any of the operations in the batch? Is that the purpose of the Batch API, to do a wait_all on all of the relevant futures and then launch each operation? This issue goes away, though, if the Batch API is eliminated, as seems to make sense.
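If that reading is correct, the guarantee would amount to something like the following (a sketch of my interpretation, not AFIO's implementation):

    #include <functional>
    #include <future>
    #include <utility>
    #include <vector>

    // Wait until every precondition is ready, then launch all operations:
    // no operation in the batch starts before the whole batch can start.
    std::vector<std::future<void>>
    schedule_batch(std::vector<std::shared_future<void>> preconditions,
                   std::vector<std::function<void()>> ops)
    {
        for (auto const& p : preconditions)
            p.wait();                                  // the "wait_all" step
        std::vector<std::future<void>> results;
        results.reserve(ops.size());
        for (auto& op : ops)
            results.push_back(std::async(std::launch::async, std::move(op)));
        return results;
    }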
- It would be useful to document for each operation which underlying system calls it invokes on each platform.
- Many operations other than open (e.g. rmfile) take file_flags, but it is not specified which flags are supported.
Good point. Logged to https://github.com/BoostGSoC13/boost.afio/issues/111.
- References to historical versions of AFIO (prior to the review) should be removed from the main parts of the documentation, since they only distract from the useful information. This information could be kept in a different section at the end for the extra curious.
- Formatting of the "description" in the Boost Quickbook-generated function documentation has some problems: the lack of line breaks between the general description, other notes, and the common note about the use of NT kernel paths makes it difficult to read.
5. What is your evaluation of the potential usefulness of the library?
The library is potentially extremely useful. C++ and Boost iostreams and Boost/TS Filesystem provide only very limited functionality. The C library libuv is a close competitor but seems to offer less functionality and a less convenient API. For achieving similar functionality, the status quo in C++ is to rely on platform-specific C APIs that are much more cumbersome to use.
6. Did you try to use the library? With what compiler? Did you have any problems?
I tried some very simple usage of the library with GCC 5.2.0 on Linux. There were a few warnings (variable set but not used) with -Wall that would be best eliminated.
Logged to https://github.com/BoostGSoC13/boost.afio/issues/112
- The error messages are not ideal: for instance, in the case that there was an error opening a file, this is the result:
Dynamic exception type: boost::system::system_error std::exception::what: File '/home/jbms/tmp/afio-test/test.txt' not found [Host OS Error: No such file or directory (2) in '/home/jbms/tmp/boost.afio/include/boost/afio/v2/detail/impl/afio.ipp': async_io_handle_posix:805]: No such file or directory. Op was scheduled at: 1. 0x42fb4f: ./test (+0x2fb4f) : No such file or directory
Some of this is only useful when debugging AFIO, some only useful when debugging the application, and there is a lot of redundancy, for instance the actual error "No such file or directory" is shown in 3 places. Ideally there would be a way to get a simple message along the lines of "Error opening file '/home/jbms/tmp/afio-test/test.txt': No such file or directory" that could be shown directly to the user.
Those "No such file or directory" were from the stack backtracer. I assume it could not figure out your debug info.
Regarding your request, in the lightweight futures rewrite we will exclusively use std::error_code for unexceptional errors, and therefore you will have exactly as you ask.
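That would permit exactly the kind of message asked for above. In std::error_code style, the shape would be roughly this (a sketch using plain POSIX open() for illustration, not AFIO's actual API):

    #include <fcntl.h>
    #include <cerrno>
    #include <cstdio>
    #include <system_error>

    // Surface the unexceptional failure as a std::error_code and leave
    // message formatting entirely to the caller.
    int open_file(const char* path, std::error_code& ec)
    {
        int fd = ::open(path, O_RDONLY);
        ec = (fd < 0) ? std::error_code(errno, std::generic_category())
                      : std::error_code();
        return fd;
    }

    void report(const char* path)
    {
        std::error_code ec;
        if (open_file(path, ec) < 0)
            std::fprintf(stderr, "Error opening file '%s': %s\n",
                         path, ec.message().c_str());
    }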
- A separate issue is that attempting to read a range that is not present in a file, something that the library should handle elegantly, leads to the program aborting immediately with an error message, rather than allowing the error to be handled:
There is no way of gracefully handling this without introducing a lot of complexity to IPC with concurrent readers and writers.
Besides, it's a logic error to do i/o outside the file's maximum permitted extent, and this is enforced far beyond AFIO, in all asynchronous file i/o applications. The reason I chose a fatal exception is that this is not a recoverable error; it is a fundamental mistake in your implementation (or memory corruption) and you need to fix it.
What happens if another process truncates the file while I'm writing to it? Also, is the 'max size' cached? What happens if another process extends the file size, notifies me out of band of the new size, and I write beyond the original limit? I would certainly hope that the application won't crash in either scenario. It must never be possible for a process to cause another process in another address space to crash (discounting super-user privileges or ptrace, of course).
-- gpd
On 31 Aug 2015 at 20:31, Giovanni Piero Deretta wrote:
What happens if another process truncates the file while I'm writing to it?
Right now, a denial of service attack. I'm aware of it (see other thread), but I need to replace the ASIO reactor to fix it.
Also is the 'max size' cached? What happens if another process extends the file size, notifies me out of band of the new size and I write beyond the original limit?
AFIO doesn't second-guess the kernel, and doesn't cache anything anywhere at all in data structures it manages. There is no out-of-bounds checking in the scatter-gather read/write implementation; all AFIO knows is that a read or write came back partial, because that is all ASIO tells us. As you are guaranteed by the OS that filesystem i/o is NEVER partial, something is seriously wrong, so we fatal exit. Hence the message "buffers not filled".
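For reference, this is what the check looks like at the POSIX level with a scatter read (a sketch; AFIO gets the equivalent completion information from ASIO):

    #include <sys/types.h>
    #include <sys/uio.h>

    // A short count from preadv() on a regular file (other than at EOF)
    // is the "buffers not filled" condition described above.
    bool buffers_filled(int fd, const struct iovec* bufs, int n,
                        off_t offset, size_t expected_bytes)
    {
        ssize_t got = preadv(fd, bufs, n, offset);
        return got >= 0 && static_cast<size_t>(got) == expected_bytes;
    }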
I would certainly hope that the application won't crash in either scenario. It must never be possible for a process to cause another process in another address space to crash (discounting super-user privileges or ptrace, of course).
Better a controlled fatal exit than a security breach. But yes, I agree. This denial-of-service attack is a problem, and I hope to fix it in the engine refactor once ASIO isn't in the way.
Niall
-- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
On 1/09/2015 04:18, Niall Douglas wrote:
- The documentation notes: "Don't do directory enumerations from an x86 binary on a x64 Windows system. This is due to a bug in Microsoft's WOW64 syscall translation layer which Microsoft have decided is wontfix. " Given that a major purpose of AFIO is to provide a portable abstraction that works around OS limitations, AFIO needs to support directory enumeration even in this case (e.g. by detecting WOW64 and using a synchronous call via a threadpool in that case).
This problem is caused by how ASIO's reactor works and is due to code I can't modify in ASIO. It'll go away when I rewrite ASIO's reactor using lightweight futures because I can control the source code.
The suggestion was that you explicitly detect the problem case (ie. running under WOW64) and then execute an alternate enumeration codepath in that case (avoid the problem API and call a different one, perhaps synchronously). I'm not sure how ASIO's reactor bears any relationship to this.
If the problem is that you need to call a synchronous API, then note that you can queue up work via the common threadpool reactor that actually executes synchronously on a dedicated worker thread, or pool thereof (ASIO even does this itself internally for some things, where async APIs are not available). It's a little less efficient, of course, but not particularly difficult.
If it's not about synchrony, then presumably you need to call the Win32 API instead of the Native API in this case. So do that.
It's not acceptable to simply say "directory enumeration doesn't work in 32-bit Windows processes sometimes". That's a bug, and even if MS agreed to fix it you would still have to have code that worked around it on versions before the fix.
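In code, the suggested workaround is only a few lines (a sketch; the two enumeration routines are hypothetical placeholders):

    #include <windows.h>

    static bool running_under_wow64()
    {
        BOOL wow64 = FALSE;
        IsWow64Process(GetCurrentProcess(), &wow64);
        return wow64 != FALSE;
    }

    void enumerate_directory(/* args */)
    {
        static const bool wow64 = running_under_wow64();
        if (wow64) {
            // sync_enumerate_fallback(...): e.g. Win32 FindFirstFileW run
            // on a dedicated worker thread or pool
        } else {
            // async_enumerate_nt(...): the usual NT-API async path
        }
    }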
AFIO fatal exits the process if you try to read or write past the current file size; it's a logic error and your program is inherently broken.
FWIW, any library choosing to fatal exit the process for ANY reason causes me to fatally exit the library (ie. not use it), particularly if it doesn't provide interception hooks. My apps have their own methods for reacting to fatal errors of any kind (mostly unified logging and graceful exit) and I don't appreciate anything that tries to bypass them. (In fact I have some code specifically dedicated to preventing the VC++ CRT from trying to bypass the error handler.)
On 2015-09-01 07:34, Gavin Lambert wrote:
FWIW, any library choosing to fatal exit the process for ANY reason causes me to fatally exit the library (ie. not use it), particularly if it doesn't provide interception hooks.
+1. And even more so in this particular case, since reading past the end of a file is a recoverable error (it's not like we're deep in UB-land).
@Niall: Compare this to vector::at(). Assume for a second a standard in some other universe which said: "std::vector::at() fatal exits the process if you try to read or write past the current vector size; it's a logic error and your program is inherently broken." Would you want that? If not: why would you want to terminate in case of reading past the end of a file?
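To spell out the analogy: at() reports the same class of logic error as a catchable exception, so the caller decides what happens (a minimal example):

    #include <iostream>
    #include <stdexcept>
    #include <vector>

    int main()
    {
        std::vector<int> v(3);
        try {
            return v.at(10);                // out of range: a logic error...
        } catch (std::out_of_range const& e) {
            std::cerr << e.what() << '\n';  // ...yet recoverable, not fatal
            return 1;
        }
    }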
On 1 Sep 2015 at 17:34, Gavin Lambert wrote:
- The documentation notes: "Don't do directory enumerations from an x86 binary on a x64 Windows system. This is due to a bug in Microsoft's WOW64 syscall translation layer which Microsoft have decided is wontfix. " Given that a major purpose of AFIO is to provide a portable abstraction that works around OS limitations, AFIO needs to support directory enumeration even in this case (e.g. by detecting WOW64 and using a synchronous call via a threadpool in that case).
This problem is caused by how ASIO's reactor works and is due to code I can't modify in ASIO. It'll go away when I rewrite ASIO's reactor using lightweight futures because I can control the source code.
The suggestion was that you explicitly detect the problem case (ie. running under WOW64) and then execute an alternate enumeration codepath in that case (avoid the problem API and call a different one, perhaps synchronously). I'm not sure how ASIO's reactor bears any relationship to this.
ASIO uses the problematic parameter to pass a pointer to some state. The segfault is caused by ASIO dereferencing that pointer.
It's not acceptable to simply say "directory enumeration doesn't work in 32-bit Windows processes sometimes". That's a bug, and even if MS agreed to fix it you would still have to have code that worked around it on versions before the fix.
It's only x86 on x64 kernels, but that is still a common enough case. I do agree, and it'll be fixed in the refactor removing ASIO.
AFIO fatal exits the process if you try to read or write past the current file size; it's a logic error and your program is inherently broken.
FWIW, any library choosing to fatal exit the process for ANY reason causes me to fatally exit the library (ie. not use it), particularly if it doesn't provide interception hooks.
My apps have their own methods for reacting to fatal errors of any kind (mostly unified logging and graceful exit) and I don't appreciate anything that tries to bypass them. (In fact I have some code specifically dedicated to preventing the VC++ CRT from trying to bypass the error handler.)
I have been thinking about this, and I do see how it would be unhelpful if a shipping product saw a bit flip from a cosmic ray and AFIO hosed the process. It would get in the way of telemetry. I may instead consider flipping a global boolean which permanently stops AFIO dead and disables it in that process. That would prevent damage to other people's data, but still allow you to send telemetry.
Niall
-- ned Productions Limited Consulting http://www.nedproductions.biz/ http://ie.linkedin.com/in/nialldouglas/
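A minimal sketch of that idea, assuming a process-wide kill switch checked on entry to every AFIO operation (all names here are hypothetical):

    #include <atomic>
    #include <system_error>

    // Set once on the first fatal inconsistency; thereafter every AFIO
    // entry point refuses to run instead of aborting the process.
    static std::atomic<bool> afio_disabled{false};

    std::error_code guard_entry()
    {
        if (afio_disabled.load(std::memory_order_acquire))
            return std::make_error_code(std::errc::state_not_recoverable);
        return {};
    }

    void on_fatal_inconsistency()
    {
        afio_disabled.store(true, std::memory_order_release);
    }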
participants (11):
- Andrey Semashev
- Asbjørn
- Gavin Lambert
- Giovanni Piero Deretta
- Hartmut Kaiser
- Jeremy Maitin-Shepard
- John Bytheway
- Manu343726
- Michael Marcin
- Niall Douglas
- Roland Bock