On 13 Sep 2016 at 22:18, Andrey Semashev wrote:
Some might ask why not immediately unlink it in RAM as Linux does? Linux historically didn't try hard to avoid data loss on sudden power loss, and even today it uniquely requires programmers to explicitly call fsync on containing directories in order to achieve sudden power loss safety. NTFS and Windows try much harder, and they try to always keep the *metadata* a program sees via the kernel syscalls equal to what is on physical storage (actual file data is a totally separate matter). That makes writing reliable filesystem code much easier on Windows than on Linux, which was traditionally a real bear.
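To make that concrete, here is a minimal sketch (mine, not from the thread) of the Linux pattern being described: after creating and writing a file you must also fsync() the containing directory, or the new directory entry may not survive a sudden power loss. The helper name and paths are illustrative only.

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Create a file and make both its contents and its directory entry durable.
   On Linux the directory itself must be fsync()ed, otherwise the entry may
   not survive a sudden power loss even though the file's data did. */
int durable_create(const char *dirpath, const char *name, const char *data)
{
    char path[4096];
    snprintf(path, sizeof(path), "%s/%s", dirpath, name);

    int fd = open(path, O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) return -1;
    if (write(fd, data, strlen(data)) < 0 || fsync(fd) < 0) { close(fd); return -1; }
    close(fd);

    /* Flush the containing directory so the new name itself reaches disk. */
    int dirfd = open(dirpath, O_RDONLY | O_DIRECTORY);
    if (dirfd < 0) return -1;
    int rc = fsync(dirfd);
    close(dirfd);
    return rc;
}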
I'm not sure I understand how the Windows behavior you described provides better protection against power loss. If the power is lost before the metadata is flushed to media, then the file stays present after reboot. The same happens in Linux, AFAICT, only there you can influence the FS behavior with mount options.
You're thinking in terms of "potential loss of user data", and in that sense you're right. I'm referring to "writing multi-process concurrent filesystem code which is algorithmically correct and won't lose data". For that, having the kernel only tell you what is actually physically on disk makes life much easier when writing correct code. On Linux in particular you have to spam fsync all over the place and pray the user hasn't mounted with "data=writeback" or turned barriers off etc., and such design patterns are also inefficient because you end up doing too many directory fsyncs.

During AFIO v1 I used to get very annoyed that metadata views on Windows from other processes did not match the modifying process' view until the updates reached physical storage, so process A could extend a file and process B wouldn't see the extension until potentially many seconds later (the same goes for hard links, timestamps etc.). It seemed easier if every process saw the same thing and had a sequentially consistent view. But having got used to it, and given that Linux (+ ext4) would appear to be the exceptional outlier here, the Windows design does have a compelling logic, and it can definitely be put to very good use when writing algorithmically correct filesystem code.
The irritating difference is that even though the file is deleted (by every means available to the application to observe it), the OS still doesn't let you delete the containing folder because it's not empty.
Ah, but the file is not deleted, so refusing to delete the containing folder is correct. It is "pending deletion", which means anything still using it can continue to do so, but nothing new can use it [1]. You can, of course, also unmark a file marked for deletion on Windows. Linux has a similar feature in that it lets you create a directory entry for an anonymous inode.

[1]: Also an opt-out Windows behaviour.
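For illustration, here is a minimal sketch using Linux's O_TMPFILE and linkat(), which I take to be the feature meant above: the file exists only as an anonymous inode until linkat() gives it a name, much as a Windows pending-delete file still exists but cannot be opened by name. The helper name and paths are hypothetical.

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Create an anonymous inode inside dirpath, then give it a visible name.
   Until linkat() succeeds the file has no directory entry at all. */
int create_anonymous_then_link(const char *dirpath, const char *finalname)
{
    int fd = open(dirpath, O_TMPFILE | O_RDWR, 0600);
    if (fd < 0) return -1;

    /* ... write the file's contents through fd here ... */

    /* The /proc/self/fd trick is the documented way to link the anonymous
       inode into the filesystem without special privileges. */
    char procpath[64], target[4096];
    snprintf(procpath, sizeof(procpath), "/proc/self/fd/%d", fd);
    snprintf(target, sizeof(target), "%s/%s", dirpath, finalname);
    int rc = linkat(AT_FDCWD, procpath, AT_FDCWD, target, AT_SYMLINK_FOLLOW);
    close(fd);
    return rc;
}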
I see this effect nearly every time I boot into Windows, when I delete the bin.v2 directory created by Boost.Build. There may be historical reasons for it, but seriously, if the OS tries to cheat and pretend the file is deleted then it should go the whole way and act as if it is.
Are you referring to Windows Explorer hiding stuff you delete with it when it's not really deleted? That's a relatively recent addition to Windows Explorer. It's very annoying.
Workarounds like rename+delete are a sorry excuse because it's really difficult to say where to rename the file to in the presence of reparse points, quotas and permissions. And most importantly: why should one jump through these hoops in one specific case, on Windows? The same goes for the inability to delete or move open files.
You can delete, rename and move open files just fine on Windows. Indeed, an AFIO v1 unit test fires up a thread which randomly renames a few dozen files and directories, and then ensures that a loop of filesystem operations on that rapidly changing filesystem neither races nor misoperates. You are correct that you must opt in to being able to do this. The Windows kernel folk correctly observed that most programmers, even otherwise expert ones, consistently write unsafe filesystem code. They therefore defaulted an abundance of options to the safe setting (too much so, I would agree, especially in making symbolic links an effectively unusable feature).

Regarding an ideally efficient way of correctly deleting a directory tree on Windows, AFIO v1 had an internal algorithm which, when faced with pending-delete files during a directory tree deletion, would probe around for suitable locations to rename them to so the directory tree could be scrubbed immediately. It was pretty effective, especially if %TEMP% is on the same volume, and the NT kernel API makes figuring out what else is on your volume trivial, as compared to say statfs() on Linux, which is awful. AFIO v2 will at some point expose that algorithm as a generic templated edition in afio::algorithm so anybody can use it.

In the end, these platform-specific differences are indeed annoying. But that's the whole point of system libraries and abstraction libraries like many of those in Boost: you write code once and it works equally everywhere.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
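For illustration, a minimal sketch of the Win32 opt-in described above (this is not AFIO's code; the helper name is hypothetical, and the new name is assumed to be a full path on the same volume): open the file with DELETE access and FILE_SHARE_DELETE sharing, after which the still-open file can be renamed via SetFileInformationByHandle().

#include <windows.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Rename a file while it is open. Every handle to the file must have been
   opened with FILE_SHARE_DELETE, and this handle needs DELETE access; that
   sharing/access combination is the opt-in mentioned above. */
int rename_open_file(const wchar_t *path, const wchar_t *newpath)
{
    HANDLE h = CreateFileW(path, DELETE | GENERIC_READ,
                           FILE_SHARE_READ | FILE_SHARE_WRITE | FILE_SHARE_DELETE,
                           NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) return -1;

    /* FILE_RENAME_INFO carries the new name as a counted wide string placed
       inline at the end of the structure. */
    size_t namebytes = wcslen(newpath) * sizeof(wchar_t);
    size_t bufsize = sizeof(FILE_RENAME_INFO) + namebytes;
    FILE_RENAME_INFO *fri = (FILE_RENAME_INFO *)calloc(1, bufsize);
    if (!fri) { CloseHandle(h); return -1; }
    fri->ReplaceIfExists = TRUE;
    fri->RootDirectory = NULL;
    fri->FileNameLength = (DWORD)namebytes;
    memcpy(fri->FileName, newpath, namebytes);

    BOOL ok = SetFileInformationByHandle(h, FileRenameInfo, fri, (DWORD)bufsize);
    free(fri);
    CloseHandle(h);
    return ok ? 0 : -1;
}

Marking an open file for deletion, and unmarking it as mentioned earlier in the thread, works the same way: pass FileDispositionInfo and a FILE_DISPOSITION_INFO with DeleteFile set to TRUE or FALSE.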