Poor performance on Windows with logging sinks::file_collector when two file logs use the same directory. Thoughts?
I've got two different rotating file logs that point to the same directory (with different file name patterns). Is this in general just a bad idea? They end up sharing the same file_collector, which seems wrong, so perhaps that is a clue that I shouldn't have my logs set up like this.

In production I've got a service that compresses, archives, and manages the size of the logs. In dev, I don't, so the number of files in the directory slowly grew. But the startup time for my program grew much faster. On Windows, the scan_for_files function in the collector has a loop that is O(mn), where m is the number of files in the directory and n is the number that matched in previous calls to scan_for_files.

This means the scan_for_files call for the first rotating file log in the directory has no issue (n is 0), but the second can be problematic. For each file in the directory, it calls filesystem::equivalent on every match from previous scan_for_files calls. On Windows, filesystem::equivalent is particularly heavy, opening handles to both files.

Thoughts? Is the two file logs getting the same collector the real issue? Or is it my pointing two file logs at the same directory? I see some ways to mitigate the slowdown in scan_for_files - e.g., filesystem::equivalent could be called only after the filename pattern check - but the two file logs sharing the same collector feels like the real issue.
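To make the cost concrete, here is a minimal, self-contained sketch of the loop shape described above (this is not Boost.Log's actual code; the names `scan_for_files`, `previous_matches`, and the call counter are illustrative). It applies the suggested mitigation: the cheap filename-pattern test runs first, so the expensive `filesystem::equivalent` calls happen only for files that actually match this sink's pattern, rather than for every (file, previous match) pair.

```cpp
#include <cassert>
#include <cstddef>
#include <filesystem>
#include <fstream>
#include <regex>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Hypothetical sketch of the scan described above. "previous_matches"
// holds files found by earlier scans (e.g. by the other sink sharing
// this collector); "pattern" is this sink's filename pattern.
std::vector<fs::path> scan_for_files(const fs::path& dir,
                                     const std::regex& pattern,
                                     const std::vector<fs::path>& previous_matches,
                                     std::size_t& equivalent_calls)
{
    std::vector<fs::path> matches;
    for (const auto& entry : fs::directory_iterator(dir))
    {
        // Mitigation: test the cheap filename pattern first...
        if (!std::regex_match(entry.path().filename().string(), pattern))
            continue;

        // ...and only then pay for the equivalence checks, each of
        // which opens handles to both files on Windows.
        bool seen = false;
        for (const auto& prev : previous_matches)
        {
            ++equivalent_calls;
            if (fs::equivalent(entry.path(), prev)) { seen = true; break; }
        }
        if (!seen)
            matches.push_back(entry.path());
    }
    return matches;
}
```

Without the reordering, every one of the m directory entries would pay n `equivalent` calls; with it, only entries matching the pattern do.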
On 6/23/21 8:39 PM, Nicholas Neumann via Boost wrote:
One file collector per target directory is the intended behavior - that is what allows it to maintain limits on the log files to keep. That you have to call scan_for_files separately for each sink is unfortunate, but necessary, since each sink uses its own filename pattern and needs to initialize its own file counter value.

So, in a nutshell, what you're seeing is the expected behavior, and the expected performance cost. It could probably be optimized if Boost.Log used the POSIX API and WinAPI directly, but (a) that would not eliminate the fundamental O(M*N) complexity of scanning, and (b) Boost.Log uses Boost.Filesystem precisely to avoid dealing with the underlying API directly, as there are quite a few portability quirks.

My recommendation to you is to limit the number of files you keep in the target directory to a reasonable value.
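The recommendation to cap the number of kept files can be expressed in the sink setup itself. A configuration sketch, assuming Boost.Log's named-parameter collector API (the `keywords::max_files` parameter requires Boost 1.65 or later; the file names, sizes, and limits below are illustrative):

```cpp
#include <boost/log/core.hpp>
#include <boost/log/sinks/sync_frontend.hpp>
#include <boost/log/sinks/text_file_backend.hpp>
#include <boost/make_shared.hpp>

namespace sinks = boost::log::sinks;
namespace keywords = boost::log::keywords;

void init_logging()
{
    auto backend = boost::make_shared<sinks::text_file_backend>(
        keywords::file_name = "app_%N.log",
        keywords::rotation_size = 16 * 1024 * 1024);

    // Cap what the (shared) collector keeps. Old files beyond these
    // limits are deleted, which also bounds the scan_for_files cost.
    backend->set_file_collector(sinks::file::make_collector(
        keywords::target = "logs",
        keywords::max_size = 256 * 1024 * 1024,  // total bytes kept
        keywords::max_files = 100));             // total files kept

    // Pick up files left over from previous runs.
    backend->scan_for_files();

    auto sink = boost::make_shared<
        sinks::synchronous_sink<sinks::text_file_backend>>(backend);
    boost::log::core::get()->add_sink(sink);
}
```

With the file count bounded, the n in the O(mn) scan stays small even when two sinks share the directory.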
Got it, thank you very much for the detailed explanation and guidance.

On Wed, Jun 23, 2021 at 2:08 PM Andrey Semashev via Boost <boost@lists.boost.org> wrote:
On 23/06/2021 18:39, Nicholas Neumann via Boost wrote:
The Windows filesystem is much slower than the filesystems on POSIX, about 70x slower for non-i/o operations. On top of that, the Win32 API adds considerable overhead above the kernel, and on top of that again, filesystem abstractions such as C++ Filesystem add a lot more still. It would not be surprising if naive C++ were 200x-500x slower on Windows for some filesystem traversal operations than on Linux. No matter what anyone (apart from the Windows kernel team) does, the Windows filesystem will never be performance competitive with any POSIX implementation.

However, you can close the gap very significantly if you employ a bare metal abstraction such as LLFIO (disclaimer: it's mine, but parts are up before WG21 for standardisation). In my work, where there is a petabyte-scale custom DB which is LLFIO based, Windows is between 20-40% slower than Linux on the same hardware, which is pretty good given it's a really stupid DB which relies heavily on the filesystem to be fast. All modern filesystems perform well with LLFIO with > 10 million files in the same directory. We somewhat cheat on Windows because we avoid per-file handle open-close for metadata ops (e.g. filesystem::equivalent) by using a single-syscall kernel globbed directory enumeration of the parent directory instead; this yields similar performance to POSIX.

Anyway, I suspect all this is overkill for your use case, but I mention it anyway. Bad filesystem performance on Windows is very fixable, with some refactoring.

Niall
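The "enumerate once, then compare cheaply" idea can be sketched portably. This is not LLFIO code; it is a simplified POSIX-flavoured illustration (using `stat`'s device/inode pair as the file identity, where LLFIO on Windows would use a single globbed directory enumeration). One pass over the directory records each file's identity; equivalence checks then become map lookups instead of opening two handles per `filesystem::equivalent` call.

```cpp
#include <sys/stat.h>

#include <cassert>
#include <filesystem>
#include <fstream>
#include <map>
#include <string>
#include <utility>

namespace fs = std::filesystem;

// A file's identity on POSIX: (device, inode) uniquely names it.
using file_id = std::pair<dev_t, ino_t>;

// One enumeration pass collects every file's identity up front.
std::map<std::string, file_id> enumerate_ids(const fs::path& dir)
{
    std::map<std::string, file_id> ids;
    for (const auto& entry : fs::directory_iterator(dir))
    {
        struct stat st {};
        if (::stat(entry.path().c_str(), &st) == 0)
            ids[entry.path().filename().string()] = {st.st_dev, st.st_ino};
    }
    return ids;
}

// Equivalence check via the precomputed identities: a map lookup,
// with no handle opens after the initial enumeration.
bool same_file(const std::map<std::string, file_id>& ids,
               const std::string& a, const std::string& b)
{
    auto ia = ids.find(a), ib = ids.find(b);
    return ia != ids.end() && ib != ids.end() && ia->second == ib->second;
}
```

Applied to the scan_for_files problem above, this turns n expensive per-pair checks into one directory pass plus cheap lookups.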
participants (3)
- Andrey Semashev
- Niall Douglas
- Nicholas Neumann