On 04/03/2021 09:41, Andrey Semashev via Boost wrote:
On 3/3/21 11:00 PM, Nicholas Neumann via Boost wrote:
I recently moved a project to Boost.Log from a homemade logger. I had something like text_multifile_backend, so finding that as a drop-in replacement was awesome.
Unfortunately, the performance when using text_multifile_backend on Windows is really bad, because the repeated file close operations (one per log record) are unusually slow on Windows. Repeatedly logging a string to the same file via text_multifile_backend results in a throughput of about 200 log entries per second.
Just to quickly prove it is unique to Windows, I made a simple program that opens an ofstream, appends a single line, and closes it, in a loop. On a high-end Windows machine with an NVMe SSD, 1000 iterations take 2600ms. On an older Linux box with a SATA SSD, the same takes 16ms.
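For anyone who wants to reproduce the measurement, the loop described above might look roughly like the following. This is a minimal sketch, not the original test program: the function name time_open_append_close and the file name used with it are made up for illustration, and the timings will depend heavily on the machine and filesystem.

```cpp
#include <chrono>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: opens `path` in append mode, writes one line and
// closes the file again, `iterations` times in a row.
// Returns the elapsed wall-clock time in milliseconds.
long long time_open_append_close(const std::string& path, int iterations)
{
    const auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i)
    {
        std::ofstream out(path, std::ios::out | std::ios::app);
        out << "log line " << i << '\n';
        // `out` is destroyed at the end of each iteration, which flushes
        // and closes the file. That close is the operation being measured.
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::milliseconds>(stop - start).count();
}
```

Calling time_open_append_close("bench.log", 1000) on each platform and comparing the returned values should show the same kind of gap, although the exact numbers will differ.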
Opening a file for read/write on Windows using the NT kernel API is approx 76x slower than opening a file on Linux. Using the Win32 API is approx 40% slower again (~106x slower). That's without any antivirus. You're seeing things slower than that again, which is almost certainly due to the file handle close. On Linux closing the handle does no work, whereas on Windows it induces a blocking metadata flush plus a flush of the containing directory, i.e. an fsync of metadata.
Windows is competitive with Linux for file i/o once the file is opened. It is highly uncompetitive for file open/close. This is why compiling a large codebase is always much slower on Windows than elsewhere, because compiling C++ involves opening an awful lot of files for read.
What do others think about adding a note in the documentation about this performance issue? It's bad enough that I think anyone on Windows would want to avoid the backend. It's not the backend's "fault" at all; I could see some options for improving performance of the backend on Windows, but they would definitely complicate the current simple approach.
I'm not sure I feel good about documenting it, as it doesn't really solve the problem. I suppose I could add an LRU cache that keeps the most recently used files open, but that cache would be ineffective if its capacity is exceeded, which can happen in some valid cases (e.g. writing logs pertaining to a given dynamically generated "id" to a separate log). And it would only be useful on Windows.
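To make the cache idea concrete, here is a minimal sketch of an LRU cache of open output streams. It is not part of Boost.Log, and the class name OpenFileCache and its members are hypothetical. It also shows the weakness mentioned above: once more distinct files are in play than the capacity allows, every access evicts and reopens a file, and the close cost comes straight back.

```cpp
#include <cstddef>
#include <fstream>
#include <list>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical LRU cache of open append-mode streams, keyed by file path.
class OpenFileCache
{
public:
    explicit OpenFileCache(std::size_t capacity)
        : capacity_(capacity == 0 ? 1 : capacity) {}

    // Returns an open stream for `path`, opening the file if necessary.
    std::ofstream& get(const std::string& path)
    {
        auto found = index_.find(path);
        if (found != index_.end())
        {
            // Hit: move the entry to the front (most recently used).
            entries_.splice(entries_.begin(), entries_, found->second);
            return found->second->second;
        }
        if (entries_.size() >= capacity_)
        {
            // Full: evict the least recently used entry. Destroying the
            // stream flushes and closes the underlying file.
            index_.erase(entries_.back().first);
            entries_.pop_back();
        }
        entries_.emplace_front(path, std::ofstream(path, std::ios::out | std::ios::app));
        index_[path] = entries_.begin();
        return entries_.front().second;
    }

    std::size_t open_count() const { return entries_.size(); }
    bool is_open(const std::string& path) const { return index_.count(path) != 0; }

private:
    using Entry = std::pair<std::string, std::ofstream>;
    std::size_t capacity_;
    std::list<Entry> entries_;  // front = most recently used
    std::unordered_map<std::string, std::list<Entry>::iterator> index_;
};
```

A caller would write cache.get(name) << record and flush explicitly if each record must reach the file immediately, since the stream is no longer closed after every write.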
Windows, unlike POSIX, has no low soft limit on total open file descriptors, so you don't need to worry about fire-and-forget HANDLE allocation. You can open at least 16 million files per process on Windows without issue.
Just make sure that when opening the HANDLE you do not exclude other programs from also opening, deleting or renaming the file. Be aware that so long as the file is open, every directory in the hierarchy above it is locked and cannot be renamed. Be aware too that if you map the file, other programs will be denied many operations, such as shrinking the file, which does not happen on POSIX.
A lot of people coming from POSIX don't realise this, but on Windows opening a file with more permissions than you actually need is expensive. For example, if you only need to atomically append to a file, opening that file append-only, with no ability to read, write or query metadata, is much quicker than opening it with additional privileges. If you don't mind using NtCreateFile() instead of Win32 CreateFile(), that's 40% quicker, as you save the dynamic memory allocation and Unicode path re-encode that all the Win32 path APIs do.
In our custom DB at work, every new query opens several files on the filesystem, and that step is many times slower on Windows than on Linux. However, overall benchmarks are within 15% of Linux, because the hideously high cost of file open/close gets drowned out by other operations. We also heavily cache open file handles on all platforms (after raising the soft fd limit on POSIX to 1 million), so we avoid file open/close as much as we can, which helps Windows particularly.
(All numeric claims above come from LLFIO, which makes the Windows filesystem about twice as fast, and should be considered anecdata.)
Niall
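For the POSIX side of the handle caching mentioned above, raising the soft fd limit could be sketched as follows. This is an illustration only and not taken from LLFIO: the function name raise_soft_fd_limit is made up, the code is POSIX-only, and it assumes the hard limit is high enough for the request to be useful (the soft limit is capped at the hard limit, which an unprivileged process cannot raise).

```cpp
#include <sys/resource.h>  // getrlimit, setrlimit, RLIMIT_NOFILE (POSIX only)
#include <algorithm>

// Hypothetical helper: raises the soft limit on open file descriptors
// towards `wanted`, capped at the current hard limit. It never lowers the
// soft limit. Returns the soft limit now in effect, or 0 on failure.
rlim_t raise_soft_fd_limit(rlim_t wanted)
{
    struct rlimit lim;
    if (getrlimit(RLIMIT_NOFILE, &lim) != 0)
        return 0;

    rlim_t target = wanted;
    if (lim.rlim_max != RLIM_INFINITY)
        target = std::min(wanted, lim.rlim_max);

    if (target > lim.rlim_cur)
    {
        lim.rlim_cur = target;
        if (setrlimit(RLIMIT_NOFILE, &lim) != 0)
            return 0;
    }
    return lim.rlim_cur;
}
```

Calling raise_soft_fd_limit(1000000) early in main() and checking the returned value tells a handle cache how many files it can realistically keep open.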