On 14 Mar 2015 at 13:11, Beman Dawes wrote:
On Fri, Mar 13, 2015 at 8:26 PM, Niall Douglas wrote:
... deleting a directory tree which will randomly fail on Windows if you do it too quickly :).
I've run into that several times, but only when TortoiseGit, or more precisely its cache, is running or was running very recently. I wouldn't be surprised if similar cache programs exhibit the same behavior on any operating system that prevents deleting a directory tree while another program has any of the directories still open.
The cause of this is actually very interesting, or maybe it is only to people interested in filing systems like you and me. Anyway, time to bore the list ...

NT inherited from VMS the notion of pending deletion rather than actual deletion, so when you delete a file you don't actually delete it, you merely tag it as likely to be deleted at some future point. Here's how DeleteFile/RemoveDirectory works internally:

1. Open the file/directory as a HANDLE with DELETE privs.
2. Tell the kernel to set the PendingDelete boolean using NtSetInformationFile.
3. Close the handle.

The PendingDelete flag being set has two consequences. Firstly, all new file handle opens with read or write privs will now fail with STATUS_DELETE_PENDING (ERROR_ACCESS_DENIED in Win32), though you can still open a new handle if you ask for no privs. Existing handles are unaffected.

Secondly, when the count of open handles to the file hits zero, the PendingDelete flag causes the following to happen:

1. The file name is marked as hidden in its directory. It will no longer appear in directory enumerations, but attempting to create a file with the same name will return a STATUS_DELETE_PENDING error with no apparent cause.
2. Secure erasing now occurs, which on CIA/NSA editions of Windows means multiple scrubs of the file contents, for each of the named streams attached to the file entry.
3. Actual deallocation of the inode and the extents containing data now occurs. These get scheduled to be flushed to the disc as soon as possible.
4. Only once the extents deallocation hits the journal does the file name become actually deleted from the directory, so that a new file with the same name can be created. On marking the entry as empty, Windows again flushes the directory to physical storage (i.e. an fsync).

Note that on a busy hard drive it can take milliseconds for the extents deallocation to reach physical storage.
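The life cycle above can be played with as a toy in-memory model. This is only a sketch under stated assumptions: it is not the NT kernel, every name in it (ToyDirectory, Status, toy_delete_file, flush) is invented for illustration, DELETE access is lumped in with read/write privs, and a hidden entry is assumed to refuse every open.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Toy model of the delete-pending life cycle. NOT the NT kernel;
// all names here are hypothetical.
enum class Status { Ok, NotFound, AlreadyExists, DeletePending };

class ToyDirectory {
    struct Entry {
        int handles = 0;              // open handle count
        bool pending_delete = false;  // the PendingDelete flag
        bool hidden = false;          // last handle closed, not yet flushed
    };
    std::map<std::string, Entry> entries_;

public:
    Status create(const std::string& name) {
        auto it = entries_.find(name);
        if (it == entries_.end()) {
            entries_.emplace(name, Entry{});
            return Status::Ok;
        }
        // A name that is pending deletion blocks creation with a confusing error.
        return it->second.pending_delete ? Status::DeletePending
                                         : Status::AlreadyExists;
    }

    // wants_privs = true models asking for read or write privs
    // (assumption: DELETE access is treated the same way).
    Status open(const std::string& name, bool wants_privs) {
        auto it = entries_.find(name);
        if (it == entries_.end()) return Status::NotFound;
        Entry& e = it->second;
        // Assumption: a hidden zombie entry refuses every open.
        if (e.hidden || (e.pending_delete && wants_privs)) {
            return Status::DeletePending;
        }
        ++e.handles;
        return Status::Ok;
    }

    // Set the flag through an already open handle.
    void mark_for_delete(const std::string& name) {
        Entry& e = entries_.at(name);
        assert(e.handles > 0);
        e.pending_delete = true;
    }

    // Closing the last handle of a flagged entry hides it, but keeps it.
    void close(const std::string& name) {
        Entry& e = entries_.at(name);
        assert(e.handles > 0);
        --e.handles;
        if (e.handles == 0 && e.pending_delete) e.hidden = true;
    }

    // Stands in for the extents deallocation reaching the journal.
    void flush() {
        for (auto it = entries_.begin(); it != entries_.end();) {
            if (it->second.hidden) it = entries_.erase(it);
            else ++it;
        }
    }

    // What a directory enumeration would show.
    std::size_t visible_count() const {
        std::size_t n = 0;
        for (const auto& kv : entries_) {
            if (!kv.second.hidden) ++n;
        }
        return n;
    }

    // Hidden entries still count, so the directory is not really empty.
    bool can_be_removed() const { return entries_.empty(); }
};

// DeleteFile as the three steps listed above: open, flag, close.
Status toy_delete_file(ToyDirectory& dir, const std::string& name) {
    Status s = dir.open(name, /*wants_privs=*/true);
    if (s != Status::Ok) return s;
    dir.mark_for_delete(name);
    dir.close(name);
    return Status::Ok;
}
```

In this model a directory can report zero visible entries and still refuse removal until flush() runs, which is the window a fast tree delete can fall into.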
Note that so long as the file name remains in the directory, even if hidden, you *cannot* delete that directory because the directory is not empty, even if it is indistinguishable from being empty.

You can at this stage see how many ways a directory tree delete can fail. Firstly, any program holding open a handle to any file or directory in the tree will prevent deletion occurring, and therefore directories are not empty, and therefore cannot be removed. As you mentioned, TortoiseGit is a devil for that, but so are virus checkers or anything else which opens file handles.

Secondly, if you try to delete a directory tree too quickly - which AFIO usually does because it will parallelise deletes on all CPU cores - you get caught by files taking up to a millisecond in the "file entry hidden but not actually deleted" stage, which stops the directory being deleted. Not being able to delete a directory means everything higher up the tree can't be deleted either. It's a big pain.

There is also a big problem with these semantics and lock files. If many threads are creating and deleting a lock file quickly, much of the time you get back access denied errors due to the zombie "being deleted" stage rather than more obvious errors like "file already exists". You also get enormously lowered lock file performance such that Windows looks very slow compared to Linux.

All of the above plague AFIO's unit testing on Windows because code which works perfectly on POSIX will fail in all sorts of random ways on Windows, and AFIO does a lot of heavy filing system stress testing. So, in the v1.3 release (any day now, everything is finished except for fixing the last of the filing system races as AFIO now has a "race free" mode) I've added the following workarounds:

1. When tagging an item for deletion, first rename it to a 128 bit hex crypto random name. DELETE privs also allow renaming, so this allows a new file with the same name to be created immediately.
Performance with this workaround alone is about 20x faster, and suddenly NTFS looks competitive with ext4.

2. I'm shortly about to improve workaround 1 by renaming the about to be deleted item to live somewhere else on the same volume, not in its original directory, and still with its crypto random name. I have yet to write logic to figure out some suitable other location on the same mounted volume, but it should be easy enough. This improved workaround stops pending file deletion getting in the way of deleting directory trees, and should make Windows filing system semantics identical to POSIX [1][2].

[1]: Well, apart from renaming. Windows does not permit renaming a directory containing an open file handle, so some future AFIO version may depth rename all the contents of a directory to the temp location, rename the directory, and then rename all the contents back in again. Yes, this is completely daft. The only good news is that renames when using the NT kernel API are amazingly quick, and atomic, because metadata flushes of the containing directories don't appear to occur until the handle is closed. As the NT kernel API requires renames to open a handle to the item, you simply hold open the item during the switch out/switch in.

[2]: The only other semantic difference is in symbolic link traversal. Windows doesn't have the same semantics as POSIX, period. I can't do much about that. Most of the time no one will notice, however.

Okay, boring the list is over! Hopefully something in the above might be useful in helping Boost.Filesystem work around NT idiosyncrasies. As much as they appear to be a pain, there is a logic to them, and judicious use of renaming can allow a reasonable emulation of POSIX semantics.

Niall

--
ned Productions Limited Consulting
http://www.nedproductions.biz/
http://ie.linkedin.com/in/nialldouglas/
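As a postscript, workaround 1 above (rename to a random name, then delete) can be sketched portably with std::filesystem. This is a minimal sketch and not AFIO's code: AFIO uses the NT kernel API on an open handle, whereas this goes through whatever the standard library does. The helper names random_hex_name and rename_then_remove are hypothetical, and std::random_device is only assumed to be a good enough stand-in for a crypto random source.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <iomanip>
#include <random>
#include <sstream>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: 128 random bits rendered as 32 hex digits.
// Assumes std::random_device is an adequate entropy source here.
std::string random_hex_name() {
    std::random_device rd;
    std::ostringstream out;
    out << std::hex << std::setfill('0');
    for (int i = 0; i < 4; ++i) {
        out << std::setw(8) << static_cast<std::uint32_t>(rd());
    }
    return out.str();
}

// Hypothetical helper: rename the item to a random name in the same
// directory, then delete it under that name. Returns the temporary path.
fs::path rename_then_remove(const fs::path& victim) {
    fs::path temp = victim.parent_path() / random_hex_name();
    assert(!fs::exists(temp));  // a 128 bit collision would be astonishing
    fs::rename(victim, temp);   // the original name is free from here on
    fs::remove(temp);           // any lingering entry carries the random name
    return temp;
}
```

After rename_then_remove(p) returns, creating p again should not collide with the item being deleted, because that item now carries the random name. The temporary name still lives in the original directory, so this sketch does nothing for directory tree deletes; that is what workaround 2 is aimed at.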