Re: [boost] Poor performance on windows with logging sinks::file_collector when two file logs use same directory. Thoughts?

24 Jun 2021

Got it, thank you very much for the detailed explanation and guidance.

On Wed, Jun 23, 2021 at 2:08 PM Andrey Semashev via Boost <
boost@lists.boost.org> wrote:
On 6/23/21 8:39 PM, Nicholas Neumann via Boost wrote:
I've got two different rotating file logs that point to the same directory
(with different file name patterns). Is this in general just a bad idea?
They end up sharing the same file_collector, which seems wrong, so perhaps
that is a clue that I shouldn't have my logs set up like this.

In production I've got a service that compresses, archives, and manages the
size of the logs. But in dev, I don't, so the number of files in the
directory slowly grew. But the startup time for my program grew much
faster. On windows the scan_for_files function in the collector has a loop
that is O(mn), where m is the number of files in the directory, and n is
the number that matched in previous calls to the scan_for_files function.
This means the scan_for_files for the first rotating file log in the
directory has no issue (n is 0), but the second can be problematic. It
iterates over the files in the directory and for each file in the
directory, it calls filesystem::equivalent on all of the matches from
previous scan_for_files calls. On windows, filesystem::equivalent is
particularly heavy, opening handles to both files.

Thoughts? Is the two file logs getting the same collector the real issue?
Or is it my pointing two file logs to the same directory? I see some ways
to mitigate the slowdown in scan_for_files - e.g., filesystem::equivalent
could be called after all of the method/match_pattern check, but the two
file logs sharing the same collector feels like the real issue.

One file collector per target directory is the intended behavior - that
is what makes it possible to maintain limits on the log files to keep.

That you have to call scan_for_files separately for each sink is
unfortunate, but necessary, since each sink uses its own filename
pattern, and needs to initialize its own file counter value.

So, in a nutshell, what you're seeing is the expected behavior, and
expected performance cost. It could probably be optimized if Boost.Log
used POSIX API and WinAPI directly, but (a) that would not eliminate the
fundamental O(M*N) complexity of scanning and (b) Boost.Log is using
Boost.Filesystem precisely to avoid dealing with the underlying API
directly, as there are quite a few portability quirks.

My recommendation to you is to limit the number of files you keep in the
target directory to a reasonable value.
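
For readers who want to see the cost pattern discussed in this thread, here is a minimal sketch. It is not Boost.Log's actual code: same_file is a hypothetical stand-in for filesystem::equivalent that only counts calls, matches_pattern is a hypothetical prefix check standing in for the file name pattern match, and the directory is simulated as a list of names. It compares the two orderings mentioned above (equivalence check first versus pattern check first) by counting how many "expensive" comparisons the second sink's scan performs.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for filesystem::equivalent: it only counts how often
// the "expensive" comparison runs and never touches the file system.
static std::size_t g_equivalent_calls = 0;

bool same_file(const std::string& a, const std::string& b) {
    ++g_equivalent_calls;
    return a == b;
}

// Hypothetical stand-in for the file name pattern check (prefix match only).
bool matches_pattern(const std::string& name, const std::string& prefix) {
    return name.compare(0, prefix.size(), prefix) == 0;
}

// One scan over `dir` on behalf of one sink. `collected` holds the files found
// by earlier scans (n entries) and `dir` has m entries, so the nested loop
// performs up to m*n comparisons.
void scan(const std::vector<std::string>& dir,
          std::vector<std::string>& collected,
          const std::string& prefix,
          bool pattern_first) {
    std::vector<std::string> found;
    for (const auto& name : dir) {
        // Mitigation: reject non-matching names before the expensive check.
        if (pattern_first && !matches_pattern(name, prefix)) continue;

        bool known = false;
        for (const auto& prev : collected) {
            if (same_file(name, prev)) { known = true; break; }
        }
        if (known) continue;

        if (!pattern_first && !matches_pattern(name, prefix)) continue;
        found.push_back(name);
    }
    collected.insert(collected.end(), found.begin(), found.end());
}

// Simulates two sinks ("a_" and "b_") sharing one collector and returns the
// number of same_file calls made during the second sink's scan.
std::size_t second_scan_cost(std::size_t files_per_sink, bool pattern_first) {
    std::vector<std::string> dir;
    for (std::size_t i = 0; i < files_per_sink; ++i)
        dir.push_back("a_" + std::to_string(i) + ".log");
    for (std::size_t i = 0; i < files_per_sink; ++i)
        dir.push_back("b_" + std::to_string(i) + ".log");

    std::vector<std::string> collected;
    g_equivalent_calls = 0;
    scan(dir, collected, "a_", pattern_first);
    assert(g_equivalent_calls == 0);  // first scan: nothing collected yet (n is 0)

    scan(dir, collected, "b_", pattern_first);
    return g_equivalent_calls;
}
```

Calling second_scan_cost with pattern_first set to false and then true shows the reordered variant doing fewer comparisons. The count still grows with the number of matching files times the number of previously collected files, which is consistent with the point in the reply that reordering does not remove the underlying O(M*N) behavior.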