On 13 Sep 2016 at 22:18, Andrey Semashev wrote:
Some might ask why not immediately unlink it in RAM as Linux does? Linux historically didn't try hard to avoid data loss on sudden power loss, and even today it uniquely requires programmers to explicitly call fsync on containing directories in order to achieve sudden power loss safety. NTFS and Windows try much harder, and they try to always keep the *metadata* a program sees via the kernel syscalls equal to what is on physical storage (actual file data is a totally separate matter). That makes writing reliable filesystem code much easier on Windows than on Linux, which was traditionally a real bear.
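To make that concrete, here is a minimal sketch (mine, not from the thread) of the Linux pattern being described: after creating and writing a file you must also fsync() the containing directory, or the new directory entry may not survive a sudden power loss. The helper name and paths are illustrative only.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Create a file and make both its contents and its directory entry durable.
   On Linux the directory itself must be fsync()ed, otherwise the entry may
   not survive a sudden power loss even though the file's data did. */
int durable_create(const char *dirpath, const char *name, const char *data)
{
    char path[4096];
    snprintf(path, sizeof(path), "%s/%s", dirpath, name);

    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) return -1;
    if (write(fd, data, strlen(data)) < 0 || fsync(fd) < 0) { close(fd); return -1; }
    close(fd);

    /* Flush the containing directory so the new name itself reaches disk. */
    int dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0) return -1;
    int rc = fsync(dirfd);
    close(dirfd);
    return rc;
}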
I'm not sure I understand how the Windows behavior you described provides better protection against power loss. If the power is lost before the metadata is flushed to media, then the file stays present after reboot. The same happens in Linux, AFAICT, only there you can influence the FS behavior with mount options.
You're thinking in terms of "potential loss of user data", and in that sense you're right. I'm referring to "writing multi-process concurrent filesystem code which is algorithmically correct and won't lose data". For that, having the kernel only tell you what is actually physically on disk makes life much easier when writing correct code. On Linux in particular you have to spam fsync all over the place and pray the user hasn't mounted with "data=writeback" or turned barriers off etc., and such design patterns are also inefficient because you end up doing too many directory fsyncs.

During AFIO v1 I used to get very annoyed that metadata views on Windows from other processes did not match the modifying process' view until the updates reached physical storage, so process A could extend a file and process B wouldn't see the extension until potentially many seconds later (the same goes for hard links, timestamps etc.). It seemed easier if every process saw the same thing and had a sequentially consistent view. But having got used to it, and given that Linux (+ ext4) would appear to be the exceptional outlier here, the Windows design does have a compelling logic, and it can definitely be put to very good use when writing algorithmically correct filesystem code.
The irritating difference is that even though the file is deleted (by every means available to the application to observe it), the OS still doesn't let you delete the containing folder because it's not empty.
Ah, but the file is not deleted, so refusing to delete the containing folder is correct. It is "pending deletion", which means anything still using it can continue to do so, but nothing new can use it [1]. You can, of course, also unmark a file marked for deletion on Windows. Linux has a similar feature in that it lets you create a directory entry for an anonymous inode.

[1]: Also an opt-out Windows behaviour.
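For illustration, here is a minimal sketch using Linux's O_TMPFILE and linkat(), which I take to be the feature meant above: the file exists only as an anonymous inode until linkat() gives it a name, much as a Windows pending-delete file still exists but cannot be opened by name. The helper name and paths are hypothetical.

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Create an anonymous inode inside dirpath, then give it a visible name.
   Until linkat() succeeds the file has no directory entry at all. */
int create_anonymous_then_link(const char *dirpath, const char *finalname)
{
    int fd = open(dirpath, O_TMPFILE | O_RDWR, 0600);
    if (fd < 0) return -1;

    /* ... write the file's contents through fd here ... */

    /* The /proc/self/fd trick is the documented way to link the anonymous
       inode into the filesystem without special privileges. */
    char procpath[64], target[4096];
    snprintf(procpath, sizeof(procpath), "/proc/self/fd/%d", fd);
    snprintf(target, sizeof(target), "%s/%s", dirpath, finalname);
    int rc = linkat(AT_FDCWD, procpath, AT_FDCWD, target, AT_SYMLINK_FOLLOW);
    close(fd);
    return rc;
}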
I see this effect nearly every time I boot into Windows, when I delete the bin.v2 directory created by Boost.Build. There may be historical reasons for it, but seriously, if the OS tries to cheat and pretend the file is deleted then it should go the whole way and act as if it is.
Are you referring to Windows Explorer hiding stuff you delete with it when it's not really deleted? That's a relatively recent addition to Windows Explorer. It's very annoying.
Workarounds like rename+delete are a sorry excuse because it's really difficult to say where to rename the file to in the presence of reparse points, quotas and permissions. And most importantly: why should one jump through these hoops in one specific case, on Windows? The same goes for the inability to delete or move open files.
You can delete, rename and move open files just fine on Windows. Indeed, an AFIO v1 unit test fires up a thread which randomly renames a few dozen files and directories, and then ensures that a loop of filesystem operations on that rapidly changing filesystem neither races nor misoperates. You are correct that you must opt in to being able to do this. The Windows kernel folk correctly observed that most programmers, even otherwise expert ones, consistently write unsafe filesystem code. They therefore defaulted an abundance of options to the safe setting (too much so, I would agree, especially in making symbolic links an effectively unusable feature).

Regarding an ideally efficient way of correctly deleting a directory tree on Windows, AFIO v1 had an internal algorithm which, when faced with pending-delete files during a directory tree deletion, would probe around for suitable locations to rename them to so the directory tree could be scrubbed immediately. It was pretty effective, especially if %TEMP% is on the same volume, and the NT kernel API makes figuring out what else is on your volume trivial, as compared to say statfs() on Linux, which is awful. AFIO v2 will at some point expose that algorithm as a generic templated edition in afio::algorithm so anybody can use it.

In the end, these platform-specific differences are indeed annoying. But that's the whole point of system libraries and abstraction libraries like many of those in Boost: you write code once and it works equally everywhere.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
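For illustration, a minimal sketch of the Win32 opt-in described above (this is not AFIO's code; the helper name is hypothetical, and the new name is assumed to be a full path on the same volume): open the file with DELETE access and FILE_SHARE_DELETE sharing, after which the still-open file can be renamed via SetFileInformationByHandle().

#include <windows.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Rename a file while it is open. Every handle to the file must have been
   opened with FILE_SHARE_DELETE, and this handle needs DELETE access; that
   sharing/access combination is the opt-in mentioned above. */
int rename_open_file(const wchar_t *path, const wchar_t *newpath)
{
    HANDLE h = CreateFileW(path, DELETE | GENERIC_READ,
                           FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                           NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) return -1;

    /* FILE_RENAME_INFO carries the new name as a counted wide string placed
       inline at the end of the structure. */
    size_t namebytes = wcslen(newpath) * sizeof(wchar_t);
    size_t bufsize = sizeof(FILE_RENAME_INFO) + namebytes;
    FILE_RENAME_INFO *fri = (FILE_RENAME_INFO *)calloc(1, bufsize);
    if (!fri) { CloseHandle(h); return -1; }
    fri->ReplaceIfExists = TRUE;
    fri->RootDirectory = NULL;
    fri->FileNameLength = (DWORD)namebytes;
    memcpy(fri->FileName, newpath, namebytes);

    BOOL ok = SetFileInformationByHandle(h, FileRenameInfo, fri, (DWORD)bufsize);
    free(fri);
    CloseHandle(h);
    return ok ? 0 : -1;
}

Marking an open file for deletion, and unmarking it as mentioned earlier in the thread, works the same way: pass FileDispositionInfo and a FILE_DISPOSITION_INFO with DeleteFile set to TRUE or FALSE.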