On Mon, May 6, 2013 at 4:09 PM, Alexander Arhipenko
Hi Andrey,
Regarding points 5 and 6: after analyzing this case, I've dropped support for datetime pattern in rollover collector. You've also mentioned an option of using datetime placeholder in file sink:
... Maybe it's better to avoid dealing with date/time placeholders in the collector at all and only work with file counters. But scan_for_files should be changed so that it is possible to set a pattern with date/time placeholders in the sink backend.
I liked that option, but existing implementation should be enhanced to support this: currently during rollover phase files are simply renamed. I.e., fileN -> fileN+1, fileN-1 -> fileN, ..., file -> file1. Thus, datetime placeholder will be invalid after rollover - it will point to the wrong date. If you think this option is very nice to have - please tell.
Why, the date will stay relevant even after the collector renames the file (assuming it doesn't change the date). Say, you have these files: file-2013-05-01.3.log file-2013-05-03.2.log file-2013-05-05.1.log and you rotate file-2013-05-07.log. After rotation you should have: file-2013-05-01.4.log file-2013-05-03.3.log file-2013-05-05.2.log file-2013-05-07.1.log So the dates in the file names still indicate the beginning of the time frame of the files. I admit, however, that it may look strange to use both dates and file counters with rollover collector. Dates are useful for lookup purposes and it shouldn't be difficult to support them. On the other hand the current collector already supports them. I have no strong opinion on whether to support dates in rollover collector or not, I'll leave it to you. In regards to a. - collectors repository.
I agree that only one file_collector should serve one target directory. Regarding rollover collectors - they are different beasts.
No, they are not, actually. Let's see why.
Agree that Boost.Log should not detect any "external" file deletions or additions. However, I don't see any problems of having 2 _rollover_ collectors targeting one directory. Each of them will monitor own file series and manipulates (add, remove) only on that series. The only 1 point of "race condition" we can imagine - that is minimum free space requirements. However, this is still not a problem: even if both collectors perform rotation at one point in time and minimum free space requirement is not met, each collector will just remove its file(s), freeing a little bit more of free space.
Nope, that case is still broken. Suppose you have two sinks that have two rollover collectors with the common target directory. Disregarding thread contention, the library will feed records to sinks sequentially, always in the same order. When the space limit for the target directory is reached, only one of the collectors will typically invoke old files deletion, and the other collector will typically not. You see, storage limits are set for the target storage and not for the sink or set of files. While maintaining these limits the collector should typically ensure that the files are being deleted in a fair manner. Always deleting the oldest file (regardless of the sink that file came from) maintains this behavior, assuming that all sinks rotate files at a more or less similar rate. This approach also works with rollover collector, IMHO, and that is why I think only one collector should be managing any given storage. I think I can see why you want to make the collectors separate - this would simplify managing the queues of files from different sinks. You can implement multiple queues of files within a single collector, which will still exclusively manage its target directory. You can also implement additional limits for each file queue, if you like, but then you'll have to come up with some rules that will choose the files to delete in a fair manner, and this is also much easier to implement within a single collector. Also, some words regarding Linux example.
On my Ubuntu 12.04 there are both files that reside in separate directories: /var/log/samba, /var/log/mongodb, /var/log/nginx as well as in root log directory: /var/log/syslog.log*, /var/log/dpkg.log*, /var/log/auth.log*. That means that having files from different services in one directory is existing practice. We need to give user an ability to cover this cases in Boost.Log library.
Yes, some services allow that, although I suspect most if not all of the files in /var/log are written by syslog daemon. My point was that storing files in service-specific directories is a fairly common practice already.
However, I still think that having active log file with persistent name will be a good benefit for many cases (one of the examples - /var/log/*.log* files mentioned above). After some thinking, I've came to conclusion that when * file sink does not have placeholders (i.e., log file name is static), * active log resides in collection directory (this point is not mandatory) - then we can safely do not rotate such file during sink destruction. So, I propose to implement the behavior described above, i.e.: log file is not rotated during sink destruction when it's filename is static.
I don't think it is safe to implicitly rely on configuration in this case. After all, directories and file name patterns may change after application restart. And such subtle change in behavior without being explicitly requested by user also doesn't look like a good idea. I don't really like this, but we could add an option to the sink (a flag, rotate_on_destroy), which by default would be true to maintain the current behavior. If the user knows what he's doing, he would be able to disable rotation on destructor.