On 3/4/21 03:35 AM, Andrey Semashev via Boost wrote:
> On 3/3/21 11:28 PM, Peter Dimov via Boost wrote:
>> That's probably a Windows Defender (or another antivirus) "feature". Not that this helps.
>
> It does help. Nicholas, could you verify this by adding the directory where log files are written to the excluded ones in the AV software you have? Or by temporarily disabling the software?
Nice catch, Peter. Simply turning off the built-in realtime AV takes me from a throughput on the order of 200-300 messages per second to 5700 messages per second. So better, but still not great. The (IMHO) sad thing on Windows is that disabling AV has gotten progressively harder with each Windows 10 release, especially for the average user. Realistically, it's bad enough that developers might as well assume it will always be on, for end users and even for fellow developers.

On 3/4/21 03:42 AM, Andrey Semashev via Boost wrote:
> I'm not sure I feel good about documenting it, as it doesn't really solve the problem. I suppose I could add a cache of the least recently used files to keep them open, but that cache would be ineffective if exceeded, which can happen in some valid cases (e.g. writing logs pertaining to a given dynamically generated "id" to a separate log). And it would only be useful on Windows.
Just to get a feel for the performance improvement, I quickly implemented caching of all the destination paths with an unordered_map. Every consume() of a log record does a flush. Throughput went to about 53,000 log records per second (with or without AV). Quick profiling shows the obvious bottlenecks are gone at this point; removing the flush gets me to about 91,000 log records per second (with or without AV). Further rate improvements could be made by optimizing my formatter or file_name_composer, which are not the backend's concern.
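For concreteness, the experiment looked roughly like this (a simplified sketch of what I tried, not a patch; the class name and the path scheme are made up):

    #include <fstream>
    #include <string>
    #include <unordered_map>

    #include <boost/log/attributes/value_extraction.hpp>
    #include <boost/log/core/record_view.hpp>
    #include <boost/log/sinks/basic_sink_backend.hpp>

    namespace logging = boost::log;
    namespace sinks = boost::log::sinks;

    // A multifile-style backend that keeps every destination stream open in
    // an unordered_map instead of reopening the file for each record.
    class cached_multifile_backend :
        public sinks::basic_formatted_sink_backend< char, sinks::synchronized_feeding >
    {
    public:
        void consume(logging::record_view const& rec, string_type const& message)
        {
            std::string path = compose_file_name(rec);
            std::ofstream& file = m_files[path]; // created on first use, cached after
            if (!file.is_open())
                file.open(path, std::ios_base::app);
            file << message << '\n';
            file.flush(); // dropping this flush is what takes me from ~53K to ~91K records/sec
        }

    private:
        static std::string compose_file_name(logging::record_view const& rec)
        {
            // Hypothetical scheme: one file per value of the "Channel" attribute.
            return "logs/" +
                logging::extract_or_default< std::string >("Channel", rec, std::string("other")) +
                ".log";
        }

        std::unordered_map< std::string, std::ofstream > m_files; // unbounded cache
    };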
On 3/4/21 10:39 AM, Andrey Semashev via Boost wrote:
> Unfortunately, text_multifile_backend is supposed to open and close the file on every log record, as the file name is generated from the log record.
That makes sense. For my use case, I'm using text_multifile_backend to write to different files based on the channel in the record, and I've got on the order of 10-20 channels. I could do multiple regular streams with appropriate filters, but that would require declaring the channels I'm going to encounter ahead of time. Not having to do that is really nice. :-)

I could see a lot of folks using text_multifile_backend like this (where there is a reasonable limit on how many distinct paths the backend actually creates), and for them a cache (with or even without flush) would be fine. For cases where there is no such limit, it makes less sense. And of course, if every log record goes to a different file, no amount of caching will help with the Windows file open/close slowness.
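For reference, my setup looks roughly like the following (simplified; the "logs/" prefix is illustrative, and "Channel" is the attribute my channel loggers set):

    #include <boost/make_shared.hpp>
    #include <boost/shared_ptr.hpp>
    #include <boost/log/core.hpp>
    #include <boost/log/expressions.hpp>
    #include <boost/log/sinks/sync_frontend.hpp>
    #include <boost/log/sinks/text_multifile_backend.hpp>

    namespace logging = boost::log;
    namespace sinks = boost::log::sinks;
    namespace expr = boost::log::expressions;

    void init_channel_logging()
    {
        boost::shared_ptr< sinks::text_multifile_backend > backend =
            boost::make_shared< sinks::text_multifile_backend >();

        // The composer runs for every record, which is why the backend
        // reopens the file every time.
        backend->set_file_name_composer(sinks::file::as_file_name_composer(
            expr::stream << "logs/" << expr::attr< std::string >("Channel") << ".log"));

        typedef sinks::synchronous_sink< sinks::text_multifile_backend > sink_t;
        boost::shared_ptr< sink_t > sink = boost::make_shared< sink_t >(backend);
        logging::core::get()->add_sink(sink);
    }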
On 3/4/21 10:39 AM, Andrey Semashev via Boost wrote:
> In any case, I've created a ticket to consider adding a cache of open files:
> https://github.com/boostorg/log/issues/142
> Also, I've added a note to the docs.
Thanks so much. The note is excellent. The workarounds you mentioned make a lot of sense: multiple regular streams with filters would work, albeit a bit awkwardly for a use case like mine (sketched below). And I imagine an asynchronous frontend would help, although thinking about the wasted CPU cycles is still a little painful. ;-) I think an unbounded cache would be a good option for use cases like mine. Happy to help with feedback/benchmarking/contributing if that is useful.
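For completeness, the filter-based workaround would look something like this for my case (a sketch only; the add_channel_sink helper, channel names, and paths are all made up):

    #include <fstream>
    #include <string>

    #include <boost/make_shared.hpp>
    #include <boost/shared_ptr.hpp>
    #include <boost/log/core.hpp>
    #include <boost/log/expressions.hpp>
    #include <boost/log/sinks/sync_frontend.hpp>
    #include <boost/log/sinks/text_ostream_backend.hpp>

    namespace logging = boost::log;
    namespace sinks = boost::log::sinks;
    namespace expr = boost::log::expressions;

    // One ordinary stream sink per channel, each behind a filter on the
    // "Channel" attribute. Every channel has to be declared up front.
    void add_channel_sink(std::string const& channel)
    {
        boost::shared_ptr< sinks::text_ostream_backend > backend =
            boost::make_shared< sinks::text_ostream_backend >();
        backend->add_stream(
            boost::make_shared< std::ofstream >("logs/" + channel + ".log"));

        typedef sinks::synchronous_sink< sinks::text_ostream_backend > sink_t;
        boost::shared_ptr< sink_t > sink = boost::make_shared< sink_t >(backend);
        sink->set_filter(expr::attr< std::string >("Channel") == channel);
        logging::core::get()->add_sink(sink);
    }

    // add_channel_sink("net"); add_channel_sink("db"); ... for every channel
    // I might encounter. The asynchronous variant would instead wrap the
    // backend in sinks::asynchronous_sink<> from boost/log/sinks/async_frontend.hpp.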