Comment 204 for bug 317781

Daniel Colascione (dcolascione) wrote :

First of all, the program under discussion got it wrong. It shouldn't have unlinked the destination filename. But the scenario it unwittingly created is *identical* to the first-time creation of a filename via a rename, and that's a very important case. EVERY program will encounter it the first time it creates a file via an atomic rename. If the system dies at the wrong time, the program will see a zero-length file in place of the one it just wrote.
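For concreteness, here is a minimal sketch of the write-then-rename pattern being discussed, assuming a POSIX system. The name `replace_via_rename` is made up for illustration and is not taken from any program in this bug. On POSIX, Python's `os.replace` is a rename() call.

```python
import os
import tempfile


def replace_via_rename(path, data):
    """Write data to a temporary file, then rename it over path.

    Hypothetical helper. It deliberately does NOT call fsync: the
    program only wants "old contents or new contents", not durability.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory (same filesystem),
    # otherwise the rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        # Works the same whether or not path already exists. The first
        # call for a given path is the "first-time creation" case.
        os.replace(tmp_path, path)
    except BaseException:
        # Do not leave the temp file behind if anything failed.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

From the program's point of view, `path` either does not exist or holds the complete data at every moment. There is no step at which it wrote an empty file under that name.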

This is your scenario two. This is *NOT* about data loss. If the program cared about data loss, it'd use fsync(), dammit. This is about consistent state.
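To make the distinction explicit: a program that does care about data loss would write something like the sketch below, again assuming POSIX. `durable_replace` is a hypothetical name. The fsync of the containing directory is the usual belt-and-braces step on Linux, not something this bug report prescribes.

```python
import os
import tempfile


def durable_replace(path, data):
    """Like write-then-rename, but also forces the data to disk.

    Hypothetical helper for programs that want durability, not just
    a consistent view after a crash.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the kernel
            os.fsync(tmp.fileno())  # push the kernel's buffers to disk
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
    # Also flush the directory entry so the rename itself is on disk.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```

The cost is a synchronous disk flush on every call, which is exactly what a program that only wants consistency is trying to avoid.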

The program didn't put that zero-length file there. Why should it be expected to handle it? It's perfectly reasonable to barf on a zero-length file. What if it's XML and needs a root element? What if it's a database that needs metadata? It's unreasonable to expect every program and library to be modified to not barf on empty files *they didn't write*, just like it's unreasonable to modify every program to fsync gratuitously. Again -- from the point of view of the program on a running system, there was at *NO TIME* a zero-length file. Why should these programs have to deal with them mysteriously appearing after a crash?
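The XML case is easy to demonstrate. The sketch below uses Python's standard `xml.etree.ElementTree`; `load_settings` and `describe` are hypothetical names standing in for any program that reads back a file it wrote earlier.

```python
import xml.etree.ElementTree as ET


def load_settings(path):
    """Parse an XML settings file and return its root element."""
    return ET.parse(path).getroot()


def describe(path):
    """Report what a perfectly ordinary reader does with the file."""
    try:
        return "ok: <%s>" % load_settings(path).tag
    except ET.ParseError as exc:
        # A zero-length file has no root element, so parsing fails.
        return "parse error: %s" % exc
```

A file containing `<root/>` parses fine. A zero-length file raises `ParseError`, which is a reasonable reaction to a file the program never wrote.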

Okay, and now what about XFS? XFS fills files with NUL bytes instead of truncating them down to zero length (technically, it just makes the whole file sparse, but that's beside the point). Do programs need to specially handle the full-of-NULs case too? How many hoops will they have to go through just to pacify sadistic filesystems?
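To show the kind of hoop in question, here is a rough sketch of the check every reader would need before trusting a file. `looks_like_crash_garbage` is a hypothetical name, and the heuristic (empty or entirely NUL bytes) is only an assumption about what such garbage looks like.

```python
def looks_like_crash_garbage(path, chunk_size=65536):
    """Return True if the file is empty or consists only of NUL bytes."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                # Reached end of file without seeing a non-NUL byte.
                # This also covers the zero-length case.
                return True
            if chunk.count(b"\x00") != len(chunk):
                return False
```

Even this is a guess: a file format that legitimately starts with a long run of zero bytes is still handled correctly only because the scan continues to the end, and the program still has no idea what to do once the check returns True.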

A commit after every rename has a whole host of advantages. It rounds out and completes the partial guarantee provided by a commit after an overwriting rename. It completely prevents the appearance of a garbage file regardless of whether a program is writing the destination for the first or the nth time. It prevents anyone from having to worry about garbage files at all.

It's far better to fix a program completely than to get it right 99% of the time and leave a sharp edge hiding in some dark corner. Just fix rename.

And what's the downside? High-throughput applications don't rename brand-new files right after creating them anyway.

As for no users being able to log in -- I was referring to an old BSD network daemon. But for a more modern example, how about cron.deny? On a typical setup with no cron.allow, if cron.deny does not exist, only root can use cron. If cron.deny exists *AND IS EMPTY*, all users can use cron.
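A simplified model of that rule shows why an empty file appearing after a crash is more than cosmetic. This is a sketch of the rule as stated above, not any real cron implementation; `may_use_cron` is a hypothetical name, and cron.allow (which real implementations also consult) is ignored.

```python
import os


def may_use_cron(user, deny_path):
    """Decide cron access from a cron.deny-style file (simplified)."""
    if user == "root":
        return True
    if not os.path.exists(deny_path):
        # No deny file at all: only root may use cron.
        return False
    with open(deny_path) as f:
        denied = {line.strip() for line in f if line.strip()}
    # An empty deny file denies nobody, so every user is allowed.
    return user not in denied
```

With no file, `may_use_cron("alice", path)` is False. With a zero-length file at the same path, it becomes True for every user, so a garbage empty file silently changes the access policy.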