Comment 7 for bug 317781

Theodore Ts'o (tytso) wrote :

Ben --- can you tell me what version of the kernel you are using? Since you are a Gentoo user, it's not obvious to me which kernel you are running, or whether you have any ext4-related patches installed.

Bogden --- *any* files written during the previous boot cycle?

I've done some testing using Ubuntu Intrepid and a stock (unmodified) 2.6.28 kernel on a Lenovo S10 netbook (my crash and burn machine; great for doing testing :-). On it, I created a fresh ext4 filesystem on an LVM partition, and as a test source I used the directory /home/worf, which belongs to a test account that was used briefly right after I installed it, so it has GNOME dot files plus a relatively small number of files in the Firefox cache. Its total size is 21 megabytes.

With that ext4 filesystem mounted on /mnt, I tested it as follows:

% sudo bash
# cp -r /home/worf /mnt ; sleep 120; echo b > /proc/sysrq-trigger

After the system was forcibly rebooted (the echo b > /proc/sysrq-trigger emulates a crash), I checked the contents of /mnt/worf using cp -r and cfv, and then repeated the test with different sleep times. What I found was that at sleep times above 65 seconds, all of /mnt/worf was safely written to disk. Below 30 seconds, none of /mnt/worf was written to disk. If the sleep 120 was replaced with a sync, everything was written to disk.
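
For anyone who wants to repeat this check without cfv, here is a minimal sketch of the same idea using checksums. The function names make_manifest and verify_manifest are made up for illustration, and it assumes GNU coreutils (sha256sum) is available; it is not the exact procedure I used above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of the original test); assumes GNU coreutils.

# make_manifest DIR MANIFEST
# Record a SHA-256 checksum for every regular file under DIR.
make_manifest() {
    ( cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2 ) > "$2"
}

# verify_manifest DIR MANIFEST
# Succeed only if every file listed in MANIFEST is present in DIR and unchanged.
# MANIFEST must be an absolute path, because the check runs from inside DIR.
verify_manifest() {
    ( cd "$1" && sha256sum --quiet -c "$2" )
}

# Example (paths are illustrative):
#   before the crash:  make_manifest /home/worf /root/worf.sha256
#   after the reboot:  verify_manifest /mnt/worf /root/worf.sha256
```

The manifest should be kept on a filesystem other than the one under test, and written out with sync before the crash, so that it survives the reboot. Files that are missing or truncated after the crash show up as failures; extra files that are not in the manifest are not reported.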

How aggressively the system writes things back out to disk can be controlled via some tuning parameters, in particular /proc/sys/vm/dirty_expire_centisecs and /proc/sys/vm/dirty_writeback_centisecs. The latter, in particular, will be adjusted by laptop_mode and other tools that are trying to extend battery life.
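
As a rough illustration of how to inspect these two parameters, the sketch below reads them and prints the values in seconds. The helper names centisecs_to_secs and show_writeback_settings are made up for this example, and the values mentioned in the comments are only the defaults I would expect on many kernels, so check your own system.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting the writeback tunables.

# Convert a whole number of centiseconds to seconds with two decimals.
centisecs_to_secs() {
    printf '%d.%02d\n' "$(( $1 / 100 ))" "$(( $1 % 100 ))"
}

# Print both tunables in seconds, skipping any that cannot be read.
show_writeback_settings() {
    for name in dirty_expire_centisecs dirty_writeback_centisecs; do
        f="/proc/sys/vm/$name"
        if [ -r "$f" ]; then
            printf '%s = %s s\n' "$name" "$(centisecs_to_secs "$(cat "$f")")"
        fi
    done
}

# Many kernels default to 3000 (30 s) and 500 (5 s) respectively; tools such
# as laptop_mode may have changed them. To write dirty data back sooner, as
# root (the value 1500 is only an illustration):
#   sysctl -w vm.dirty_expire_centisecs=1500
```

Running show_writeback_settings before and after enabling laptop_mode is a quick way to see whether a power-saving tool has lengthened the writeback interval.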

So the bottom line is that I'm not able to replicate any data loss except for data written very shortly before a crash, and this can be controlled by explicitly using the "sync" command or by adjusting how aggressively the system writes back dirty pages via /proc/sys/vm/dirty_expire_centisecs and /proc/sys/vm/dirty_writeback_centisecs.

It would be useful if you could send me the output of "sysctl -a", and tell me whether the amount of data you are losing decreases if you explicitly issue the "sync" command before the crash (which you can simulate via "echo b > /proc/sysrq-trigger").