Activity log for bug #1847340

Date Who What changed Old value New value Message
2019-10-08 20:01:39 dann frazier bug added bug
2019-10-08 20:30:10 Ubuntu Kernel Bot linux (Ubuntu): status New Incomplete
2019-10-08 21:41:28 dann frazier linux (Ubuntu): status Incomplete Confirmed
2019-10-08 21:41:36 dann frazier nominated for series Ubuntu Eoan
2019-10-08 21:41:36 dann frazier bug task added linux (Ubuntu Eoan)
2019-10-08 21:41:36 dann frazier nominated for series Ubuntu Ff-series
2019-10-08 21:41:36 dann frazier bug task added linux (Ubuntu Ff-series)
2019-10-08 21:41:36 dann frazier nominated for series Ubuntu Bionic
2019-10-08 21:41:36 dann frazier bug task added linux (Ubuntu Bionic)
2019-10-08 21:41:36 dann frazier nominated for series Ubuntu Disco
2019-10-08 21:41:36 dann frazier bug task added linux (Ubuntu Disco)
2019-10-08 21:41:45 dann frazier linux (Ubuntu Bionic): status New Confirmed
2019-10-08 21:41:51 dann frazier linux (Ubuntu Eoan): status Confirmed New
2019-10-08 22:00:07 Ubuntu Kernel Bot linux (Ubuntu): status New Incomplete
2019-10-08 22:00:10 Ubuntu Kernel Bot linux (Ubuntu Disco): status New Incomplete
2020-01-27 10:39:03 Mauricio Faria de Oliveira bug added subscriber Mauricio Faria de Oliveira
2020-07-02 19:56:27 Steve Langasek linux (Ubuntu Disco): status Incomplete Won't Fix
2021-08-24 20:21:55 Terry Rudd linux (Ubuntu Eoan): status Incomplete Won't Fix
2021-09-09 19:29:16 Mauricio Faria de Oliveira nominated for series Ubuntu Hirsute
2021-09-09 19:29:16 Mauricio Faria de Oliveira bug task added linux (Ubuntu Hirsute)
2021-09-09 19:29:29 Mauricio Faria de Oliveira linux (Ubuntu Bionic): status Confirmed In Progress
2021-09-09 19:29:33 Mauricio Faria de Oliveira linux (Ubuntu Bionic): importance Undecided Medium
2021-09-09 19:29:36 Mauricio Faria de Oliveira linux (Ubuntu Bionic): assignee Mauricio Faria de Oliveira (mfo)
2021-09-09 19:29:40 Mauricio Faria de Oliveira linux (Ubuntu Focal): status New In Progress
2021-09-09 19:29:43 Mauricio Faria de Oliveira linux (Ubuntu Focal): importance Undecided Medium
2021-09-09 19:29:45 Mauricio Faria de Oliveira linux (Ubuntu Focal): assignee Mauricio Faria de Oliveira (mfo)
2021-09-09 19:29:47 Mauricio Faria de Oliveira linux (Ubuntu Hirsute): status New Fix Released
2021-09-09 19:37:09 Mauricio Faria de Oliveira description

Old value:

[Impact]
In the event of a loss of power, ext4 filesystems mounted w/ data=journal,journal_checksum are subject to a corruption issue that requires a fsck to recover. This is exacerbated by installations by curtin that set passno=0 in /etc/fstab, preventing fsck from running automatically and thus requiring a manual recovery. And *that* is further exacerbated because initramfs-tools is smart enough to not include fsck.ext4 when passno=0 is detected in /etc/fstab, requiring the user to boot from recovery media.

[Test Case]
Forcibly power cycle a system running 'stress-ng --dir 0'. I've created a package to automate the reproduction:
https://git.launchpad.net/~dannf/+git/dgx2-ext4-csum-repro?h=master

[Fix]

[Regression Risk]

New value:

[Impact]
With mmap()ed files on ext4's data journaling it's possible to change a mapped page's buffers contents during their jbd2 transaction commit (as currently nothing prevents/blocks the write access at that time.)

This might happen between the buffers checksum calculation and actual write to journal, so the (old) checksum is invalid for the (new) data.

If the system crashes after that, but before such journal entry makes it to the filesystem, the journal replay on the next mount just fails, and the filesystem now requires fsck. (apparently curtin might set up /etc/fstab with passno=0, requiring manual intervention.)

    [39751.096455] EXT4-fs: Warning: mounting with data=journal disables delayed allocation and O_DIRECT support!
    [39751.114435] JBD2: Invalid checksum recovering block 87305 in log
    [39751.146133] JBD2: Invalid checksum recovering block 88039 in log
    [39751.195950] JBD2: Invalid checksum recovering block 49633 in log
    [39751.265158] JBD2: recovery failed
    [39751.265163] EXT4-fs (vdc): error loading journal

[Fix]
The fix is to write-protect the pages during journal transaction commit, so that writes to mapped pages hit a page fault, then ext4's page_mkwrite hook can block until the commit finishes and the buffers can be modified.

In order to do that, add jbd2 journal callbacks that the filesystems can customize, called before/after the critical region in transaction commit, then have ext4 in data journaling mode to write-protect the pages whose buffers are being committed (and handle cases that need pages redirtied.)

The changes are restricted to the data journaling mode and page_mkwrite hook, and other modes/paths use the same code/behavior in the callbacks.

[Test Case]
Set up an ext4 filesystem in data journaling mode, and run stress-ng's mmap file test on it, then crash the system after a bit; check whether the filesystem can mount again or not (i.e., with jbd2 checksum errors.)

    # mkfs.ext4 $DEV
    # mount -o data=journal $DEV $DIR
    # cd $DIR
    # stress-ng --mmap $((4*$(nproc))) --mmap-file &
    # sleep 60
    # echo c >/proc/sysrq-trigger
    ...
    # mount -o data=journal $DEV $DIR # PASS/FAIL.
    # dmesg | tail

[Regression Potential]
Regressions would likely manifest in ext4 data journaling mode (which is not the default mode, 'ordered') with memory mapped access, as the other modes/paths are largely unaffected by the changes/same behavior.

This has been tested with (x)fstests, that showed no regressions on data=ordered and data=journal on both Bionic and Focal (with kernel versions 4.15.0-156-generic and 5.4.0-84-generic) w/in 10 runs each. And the stress-ng test-case as well. (Numbers/details in the LP bug.)

[Other info]
The patchset is applied on 5.10, so Hirsute (5.11) is already fixed; only Focal and Bionic need it.

There are little changes in the patches between Focal and Bionic (mostly minor backport adjustments, mainly due to no vm_fault_t) but unfortunately that needs separate versions for most patches.

...

[Original Bug Description]

[Impact]
In the event of a loss of power, ext4 filesystems mounted w/ data=journal,journal_checksum are subject to a corruption issue that requires a fsck to recover. This is exacerbated by installations by curtin that set passno=0 in /etc/fstab, preventing fsck from running automatically and thus requiring a manual recovery. And *that* is further exacerbated because initramfs-tools is smart enough to not include fsck.ext4 when passno=0 is detected in /etc/fstab, requiring the user to boot from recovery media.

[Test Case]
Forcibly power cycle a system running 'stress-ng --dir 0'. I've created a package to automate the reproduction:
https://git.launchpad.net/~dannf/+git/dgx2-ext4-csum-repro?h=master

[Fix]

[Regression Risk]
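
Editor's note on the description above: it says that passno=0 in /etc/fstab keeps fsck from running automatically after a failed journal replay. A minimal sketch for spotting such entries follows. It is not part of the bug report: the function name list_ext4_passno_zero is hypothetical, and it assumes a standard six-field fstab layout, where a missing sixth field is treated the same as 0.

```shell
# Hypothetical helper (not from the bug report): print the mount points of
# ext4 entries whose sixth fstab field (fs_passno) is 0 or absent, i.e. the
# filesystems that boot-time fsck will skip.
list_ext4_passno_zero() {
    # $1: path to an fstab-format file; defaults to /etc/fstab
    awk '
        /^[[:space:]]*(#|$)/ { next }                      # skip comments and blank lines
        $3 == "ext4" && ($6 == "" || $6 == 0) { print $2 } # field 3 = fs type, field 2 = mount point
    ' "${1:-/etc/fstab}"
}
```

Calling list_ext4_passno_zero with no argument reads /etc/fstab. Any mount point it prints would need a manual fsck after a failed journal replay, if that filesystem is also mounted with data=journal.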
2021-09-09 20:12:55 Mauricio Faria de Oliveira attachment added fstests-run.sh https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1847340/+attachment/5524206/+files/fstests-run.sh
2021-09-09 20:13:08 Mauricio Faria de Oliveira attachment added fstests-check-logs.sh https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1847340/+attachment/5524207/+files/fstests-check-logs.sh
2021-09-09 20:13:20 Mauricio Faria de Oliveira attachment added test-loop.sh https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1847340/+attachment/5524208/+files/test-loop.sh
2021-09-09 20:19:32 Mauricio Faria de Oliveira summary ext4 journal recovery fails w/ data=journal + journal_checksum + mmap ext4 journal recovery fails w/ data=journal + mmap
2021-09-23 13:44:14 Pedro Principeza bug added subscriber Pedro Principeza
2021-09-23 16:25:14 Kleber Sacilotto de Souza linux (Ubuntu Bionic): status In Progress Fix Committed
2021-09-23 22:57:25 Kelsey Steele linux (Ubuntu Focal): status In Progress Fix Committed
2021-09-27 17:27:06 Ubuntu Kernel Bot tags verification-needed-focal
2021-09-28 20:07:41 Ubuntu Kernel Bot tags verification-needed-focal verification-needed-bionic verification-needed-focal
2021-09-28 20:28:28 Mauricio Faria de Oliveira tags verification-needed-bionic verification-needed-focal verification-done-focal verification-needed-bionic
2021-09-30 14:19:25 Mauricio Faria de Oliveira tags verification-done-focal verification-needed-bionic verification-done-bionic verification-done-focal
2021-10-18 19:49:29 Launchpad Janitor linux (Ubuntu Focal): status Fix Committed Fix Released
2021-10-18 19:49:29 Launchpad Janitor cve linked 2021-40490
2021-10-19 16:19:27 Launchpad Janitor linux (Ubuntu Bionic): status Fix Committed Fix Released