Activity log for bug #1933074

Date Who What changed Old value New value Message
2021-06-21 08:16:08 Jan Vollendorf bug added bug
2021-06-22 12:59:37 Terry Rudd bug added subscriber Terry Rudd
2021-06-23 08:29:57 Stefan Bader affects linux-signed (Ubuntu) linux (Ubuntu)
2021-06-23 08:30:10 Stefan Bader nominated for series Ubuntu Bionic
2021-06-23 08:30:10 Stefan Bader bug task added linux (Ubuntu Bionic)
2021-06-23 08:30:24 Stefan Bader linux (Ubuntu Bionic): status New Triaged
2021-06-23 08:30:24 Stefan Bader linux (Ubuntu Bionic): assignee Stefan Bader (smb)
2021-07-06 13:27:50 Colin Ian King description
Old value:
I believe I found a bug in ext4 in recent kernel versions. I stumbled across this while I was trying to restore a backup to a new VM.

How to reproduce this bug:
1. Use a virtual/physical machine with "Ubuntu 18.04.5 LTS" and kernel version 4.15.0-144-generic.
2. Add a secondary disk to hold the test files.
3. Prepare and mount the filesystem with the 'large_dir' flag enabled:
    mkfs.ext4 -m0 /dev/sdb1;
    tune2fs -O large_dir /dev/sdb1;
    mkdir /mnt/storage;
    mount /dev/sdb1 /mnt/storage;
4. Change to the directory and create approx. 16 million files:
    cd /mnt/storage;
    i=0;
    while (( $i < 20000000 )); do
      i=$(( $i + 1 ));
      (( $i % 1000 == 0 )) && echo $i;
      touch file_$i.dat || break;
    done

Expected behaviour:
- 20 million files should be created without error

What happened instead:
- The loop aborts with an error message:
    # 16263100
    # touch: cannot touch 'file_16263173.dat': Structure needs cleaning
- dmesg gives a little more detail:
    # [Mon Jun 21 03:15:18 2021] EXT4-fs error (device sdb): dx_probe:855: inode #2: block 146221: comm touch: directory leaf block found instead of index block

Additional notes:
- This occurs on kernel version 4.15.0-144-generic
- Not sure, but I believe one test was run on 4.15.0-143-generic and failed too.
- Did not check against 4.15.0-142-generic
- On 4.15.0-141-generic, the problem does not exist. Behaviour is as expected.

New value:
== SRU, Bionic, Focal, Groovy, Hirsute, Impish ==

[Impact]
Creating millions of files on an ext4 partition with large_dir support by touching them will eventually trip an ext4 leaf node issue in the index hash. This occurs more frequently with smaller block sizes and ends with either an EEXIST or an EUCLEAN failure. It occurs on the restart condition when performing do_split.

[Fix]
The fix protects do_split() from the restart condition, making it safe from both current and future ordering of goto statements in earlier sections of the code. The fix is from a patch sent upstream and cc'd to Ted Tso, but it didn't appear on the ext4 mailing list, presumably because it got marked as spam.

[Test Case]
Without the fix, touching tens of thousands of empty files will trip the issue. It seems to occur more frequently with memory pressure and smaller block sizes, e.g.:
    sudo mkdir -p /mnt/tmpfs /mnt/storage
    sudo mount -t tmpfs -o size=9000M tmpfs /mnt/tmpfs
    sudo dd if=/dev/urandom of=/mnt/tmpfs/ext4.img bs=1M
    sudo mkfs.ext4 -O large_dir -N 21000000 -O dir_index /mnt/tmpfs/ext4.img -b 1024 -F
    sudo mount /mnt/tmpfs/ext4.img /mnt/storage
Then compile and run the attached C program, which quickly populates /mnt/storage with empty files. Without the fix this will terminate with an -EEXIST or -EUCLEAN error on file creation after several tens of thousands of files.

[Where problems could occur]
This changes the behaviour of the directory index hashing, so there is regression potential: it may introduce subsequent index hashing issues when a split is (or is not) needed. The patch seems to cover all the necessary cases, so I believe this risk is relatively low. I have also tested this on all the kernel series in the SRU with 21,000,000 files, so I am confident we have enough test coverage to show the fix is OK.

----------------------------------------------------------
(The original report follows here, identical to the old value of this entry.)
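The dmesg line quoted in the report is the visible signature of this failure. As a minimal sketch, assuming the kernel log text is supplied on standard input, a check for that signature could look like the following; the function name has_dx_probe_error is hypothetical and not part of the bug report.

```shell
# Hypothetical helper: succeed (exit status 0) if the kernel log text on
# stdin contains the dx_probe error quoted in the report, fail otherwise.
has_dx_probe_error() {
    grep -q 'EXT4-fs error .*dx_probe.*directory leaf block found instead of index block'
}
```

It could be used as `dmesg | has_dx_probe_error && echo "dx_probe error present"`; on some systems reading the kernel log requires root.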
2021-07-06 13:28:06 Colin Ian King nominated for series Ubuntu Impish
2021-07-06 13:28:06 Colin Ian King bug task added linux (Ubuntu Impish)
2021-07-06 13:28:06 Colin Ian King nominated for series Ubuntu Focal
2021-07-06 13:28:06 Colin Ian King bug task added linux (Ubuntu Focal)
2021-07-06 13:28:06 Colin Ian King nominated for series Ubuntu Groovy
2021-07-06 13:28:06 Colin Ian King bug task added linux (Ubuntu Groovy)
2021-07-06 13:28:06 Colin Ian King nominated for series Ubuntu Hirsute
2021-07-06 13:28:06 Colin Ian King bug task added linux (Ubuntu Hirsute)
2021-07-06 13:31:09 Colin Ian King attachment added C source to touch millions of files https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1933074/+attachment/5509402/+files/touch.c
2021-07-06 13:31:46 Colin Ian King description
Old value:
(Identical to the new value of the 2021-07-06 13:27:50 description change.)

New value:
== SRU, Bionic, Focal, Groovy, Hirsute, Impish ==

[Impact]
Creating millions of files on an ext4 partition with large_dir support by touching them will eventually trip an ext4 leaf node issue in the index hash. This occurs more frequently with smaller block sizes and ends with either an EEXIST or an EUCLEAN failure. It occurs on the restart condition when performing do_split.

[Fix]
The fix protects do_split() from the restart condition, making it safe from both current and future ordering of goto statements in earlier sections of the code. The fix is from a patch sent upstream and cc'd to Ted Tso, but it didn't appear on the ext4 mailing list, presumably because it got marked as spam.

[Test Case]
Without the fix, touching tens of thousands of empty files will trip the issue. It seems to occur more frequently with memory pressure and smaller block sizes, e.g.:
    sudo mkdir -p /mnt/tmpfs /mnt/storage
    sudo mount -t tmpfs -o size=9000M tmpfs /mnt/tmpfs
    sudo dd if=/dev/urandom of=/mnt/tmpfs/ext4.img bs=1M
    sudo mkfs.ext4 -O large_dir -N 21000000 -O dir_index /mnt/tmpfs/ext4.img -b 1024 -F
    sudo mount /mnt/tmpfs/ext4.img /mnt/storage
Then compile and run the attached C program (see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1933074/+attachment/5509402/+files/touch.c), which quickly populates /mnt/storage with empty files. Without the fix this will terminate with an -EEXIST or -EUCLEAN error on file creation after several tens of thousands of files.

[Where problems could occur]
This changes the behaviour of the directory index hashing, so there is regression potential: it may introduce subsequent index hashing issues when a split is (or is not) needed. The patch seems to cover all the necessary cases, so I believe this risk is relatively low. I have also tested this on all the kernel series in the SRU with 21,000,000 files, so I am confident we have enough test coverage to show the fix is OK.

----------------------------------------------------------
I believe I found a bug in ext4 in recent kernel versions. I stumbled across this while I was trying to restore a backup to a new VM.

How to reproduce this bug:
1. Use a virtual/physical machine with "Ubuntu 18.04.5 LTS" and kernel version 4.15.0-144-generic.
2. Add a secondary disk to hold the test files.
3. Prepare and mount the filesystem with the 'large_dir' flag enabled:
    mkfs.ext4 -m0 /dev/sdb1;
    tune2fs -O large_dir /dev/sdb1;
    mkdir /mnt/storage;
    mount /dev/sdb1 /mnt/storage;
4. Change to the directory and create approx. 16 million files:
    cd /mnt/storage;
    i=0;
    while (( $i < 20000000 )); do
      i=$(( $i + 1 ));
      (( $i % 1000 == 0 )) && echo $i;
      touch file_$i.dat || break;
    done

Expected behaviour:
- 20 million files should be created without error

What happened instead:
- The loop aborts with an error message:
    # 16263100
    # touch: cannot touch 'file_16263173.dat': Structure needs cleaning
- dmesg gives a little more detail:
    # [Mon Jun 21 03:15:18 2021] EXT4-fs error (device sdb): dx_probe:855: inode #2: block 146221: comm touch: directory leaf block found instead of index block

Additional notes:
- This occurs on kernel version 4.15.0-144-generic
- Not sure, but I believe one test was run on 4.15.0-143-generic and failed too.
- Did not check against 4.15.0-142-generic
- On 4.15.0-141-generic, the problem does not exist. Behaviour is as expected.
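The reproduction loop in the report starts one touch process per file, which is slow for millions of files; the attached C program exists to avoid that. The attachment is not reproduced in this log, so the following is only a shell sketch of the same idea under stated assumptions: the function name create_files is hypothetical, the file_N.dat naming and the progress line every 1000 files follow the report's loop, and files are created with a shell redirection rather than touch.

```shell
# Hypothetical helper: create COUNT empty files named file_N.dat in DIR,
# printing progress every 1000 files and stopping at the first failure.
# Not the attached touch.c; a stand-in for the report's touch loop.
create_files() {
    dir=$1
    count=$2
    i=1
    while [ "$i" -le "$count" ]; do
        # A redirection on the true builtin creates the file without
        # starting a new process. Unlike touch, it truncates an
        # existing file.
        if ! true > "$dir/file_$i.dat"; then
            echo "failed at file $i" >&2
            return 1
        fi
        if [ $((i % 1000)) -eq 0 ]; then
            echo "$i"
        fi
        i=$((i + 1))
    done
    return 0
}
```

A run matching the report would be `create_files /mnt/storage 20000000`. On an affected kernel the function should return non-zero at the first file whose creation fails, with the shell printing the underlying error.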
2021-07-06 13:41:41 Colin Ian King linux (Ubuntu Bionic): assignee Stefan Bader (smb) Colin Ian King (colin-king)
2021-07-07 07:31:40 Stefan Bader linux (Ubuntu Bionic): importance Undecided High
2021-07-07 07:32:07 Stefan Bader linux (Ubuntu Focal): importance Undecided High
2021-07-07 07:32:07 Stefan Bader linux (Ubuntu Focal): status New Triaged
2021-07-07 07:32:20 Stefan Bader linux (Ubuntu Groovy): importance Undecided High
2021-07-07 07:32:20 Stefan Bader linux (Ubuntu Groovy): status New Triaged
2021-07-07 07:32:32 Stefan Bader linux (Ubuntu Hirsute): importance Undecided High
2021-07-07 07:32:32 Stefan Bader linux (Ubuntu Hirsute): status New Triaged
2021-07-07 07:32:47 Stefan Bader linux (Ubuntu Impish): importance Undecided High
2021-07-07 07:32:47 Stefan Bader linux (Ubuntu Impish): status New Triaged
2021-07-15 16:55:36 Kleber Sacilotto de Souza linux (Ubuntu Bionic): status Triaged Fix Committed
2021-07-15 16:55:37 Kleber Sacilotto de Souza linux (Ubuntu Focal): status Triaged Fix Committed
2021-07-15 16:55:39 Kleber Sacilotto de Souza linux (Ubuntu Groovy): status Triaged Fix Committed
2021-07-15 16:55:41 Kleber Sacilotto de Souza linux (Ubuntu Hirsute): status Triaged Fix Committed
2021-07-21 14:57:55 Ubuntu Kernel Bot tags verification-needed-hirsute
2021-07-21 15:00:31 Ubuntu Kernel Bot tags verification-needed-hirsute verification-needed-focal verification-needed-hirsute
2021-07-21 15:03:22 Ubuntu Kernel Bot tags verification-needed-focal verification-needed-hirsute verification-needed-bionic verification-needed-focal verification-needed-hirsute
2021-07-21 17:15:26 Colin Ian King tags verification-needed-bionic verification-needed-focal verification-needed-hirsute verification-done-bionic verification-done-focal verification-done-hirsute
2021-07-28 23:11:29 Brian Murray linux (Ubuntu Groovy): status Fix Committed Won't Fix
2021-08-12 20:35:44 Launchpad Janitor linux (Ubuntu Focal): status Fix Committed Fix Released
2021-08-16 16:20:07 Launchpad Janitor linux (Ubuntu Hirsute): status Fix Committed Fix Released
2021-08-16 19:46:29 Launchpad Janitor linux (Ubuntu Bionic): status Fix Committed Fix Released
2021-08-16 19:46:29 Launchpad Janitor cve linked 2019-19036
2022-07-18 22:56:11 Brian Murray linux (Ubuntu Impish): status Triaged Won't Fix