On Thu, Mar 11, 2010 at 04:55:37PM -0000, Bela Lubkin wrote:
> The documentation for -D comments that it "will detect directory entries
> with duplicate names in a single directory, which e2fsck normally does
> not enforce". It was for this enhanced detection that I added this
> flag. I realize that it is a flag which directs fsck to write, but I
> believe that it -- as with all(*) other writing flags -- would be
> rendered inoperable by "-n". That is, I believed that the combination
> "-n -D" would cause additional checks (for directories needing
> optimization & for duplicate directory entries) without causing any
> writes.
>
> (*)I realize this isn't fully true, that the three bad-block-related
> flags -[clL] are effective even under -n. This is clearly
> documented; the clarity of _that_ documentation lends support to the
> supposition that no _other_ flags will override -n.

Yeah, sorry. The -D option was added later, and I forgot to update
the man page for the -n option. The -D option does indeed allow the
file system to be opened read/write, and will rewrite the directories,
which is dangerous when the file system is mounted. Your assumption
that -n would cancel the part of -D that modifies the filesystem was a
bad one.

> In any case, I do not know if it was -D, the combination of -D -E
> fragcheck, or some other random issue which caused the problem. For all
> I know, `fsck -n` is fundamentally broken on ext4. I do not wish to
> conduct further experiments after this unwitting one, which will leave
> me reconstructing a system.

There is a bug with -D and small directories in e2fsprogs 1.41.10,
which I've since fixed and which may have affected you, but
fundamentally, it is dangerous to run e2fsck -D while the filesystem
is mounted.

> As the transcript shows, fsck responded with:
>
>   /dev/sda5 is mounted.
>
>   WARNING!!! Running e2fsck on a mounted filesystem may cause
>   SEVERE filesystem damage.
>
>   Do you really want to continue (y/n)?
>
> Perhaps foolishly, I assumed that this message is issued in all cases --
> whether or not fsck will actually be writing.

Nope, e2fsck is smart. It only issues this warning when the
filesystem is opened read/write. If you try to run "e2fsck -n
/dev/XXX" on a mounted filesystem, it won't ask that question. So
yeah, you made two bad assumptions, and that's what led to your file
system getting badly screwed up.

I'll change things so the message is made more explicit. In case
you're curious, the reason it was originally worded the way it was is
that if /etc/mtab hasn't been cleared by the init scripts when a user
boots into single-user mode, it is possible for e2fsck to think the
filesystem is mounted when it really isn't. But I'd much rather
someone get scared off from running e2fsck if their /etc/mtab hasn't
been cleared after a system crash, if it avoids the user who thinks,
"surely this message doesn't apply to *me*".

> After that I ran `fdisk -l`, which failed with an I/O error (I assume
> due to the binary or shared objects not being accessible); and then
> `df`, which succeeded but showed the root filesystem (/dev/sda5) in bad
> shape.
>
> At that point I was sure the system was destroyed. Just in case, I
> switched power off without doing any software shutdown actions; but this
> did not help. Upon reboot I see:
>
>   error: unknown filesystem.
>   grub rescue> _

Now *that's* surprising. That makes it sound like the superblock was
destroyed, but that shouldn't have happened even when e2fsck is run
read/write on a mounted filesystem. I can't really explain that.

> POSSIBLE CAUSE: system was in-place upgraded from Ubuntu 9.10 Karmic
> Koala. Root filesystem was ext3, not ext4, before the upgrade. I don't
> believe I did anything to explicitly upgrade it to ext4. I probably
> should not have invoked fsck as `fsck.ext4` but rather just `e2fsck` or
> `fsck`, allowing the system to draw its own conclusion about filesystem
> type.
Nope, that's not it. Whether you invoke e2fsck as e2fsck, fsck.ext4,
or fsck.ext3 doesn't change anything at all about its behaviour. The
fatal mistake was -n -D, and then answering "yes" to the WARNING!!!
question.

> WARNING!!! Running e2fsck on a mounted filesystem may cause
> SEVERE filesystem damage.
>
> Do you really want to continue (y/n)? yes
>
> /dev/sda5: recovering journal

Ah.... recovering the journal while the file system is mounted might
very well have somehow wiped out the superblock.

Anyway, I've applied the following two patches to e2fsck, which will
be in e2fsprogs 1.41.11. Thanks for the feedback, and I'm sorry you
managed to corrupt your filesystem. I think if you run e2fsck from a
rescue CD, you should hopefully be able to recover most of your data.
Some of the files might end up in /lost+found, but hopefully you'll be
able to recover your home directory files, even if you need to
reinstall the system afterwards.

Best regards,

					- Ted

From: Theodore Ts'o
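
For anyone scripting around e2fsck, the rule above (never combine -D
with a mounted filesystem, even alongside -n) can be enforced before
e2fsck is ever started. What follows is only a rough sketch, not
anything that ships with e2fsprogs. The function names are made up for
illustration, and it assumes the caller passes in text in the format of
/proc/mounts, which reflects the kernel's view and so avoids the stale
/etc/mtab problem mentioned earlier.

```python
import os


def mounted_devices(mounts_text):
    """Return the device fields found in /proc/mounts-style text."""
    devices = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        # Each entry is: device mountpoint fstype options dump pass
        if len(fields) >= 2 and fields[0].startswith("/"):
            devices.add(fields[0])
    return devices


def is_mounted(device, mounts_text):
    """True if `device` appears as a mounted device in `mounts_text`."""
    # Resolve symlinks so that aliases of the same device node compare equal.
    target = os.path.realpath(device)
    return any(os.path.realpath(d) == target
               for d in mounted_devices(mounts_text))


def build_e2fsck_args(device, flags, mounts_text):
    """Hypothetical helper: build an e2fsck argument list, refusing -D
    on a mounted device, because -D opens the filesystem read/write
    even when -n is also given."""
    if "-D" in flags and is_mounted(device, mounts_text):
        raise RuntimeError(
            "%s is mounted; refusing to run e2fsck -D on it" % device)
    return ["e2fsck"] + list(flags) + [device]
```

On a Linux system the text would come from reading /proc/mounts, and
the returned list could be handed to subprocess.run(). The check is
deliberately conservative: it only looks at the device field, so it
will not catch every way a filesystem can be in use.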