large_dir in ext4 broken

Bug #1933074 reported by Jan Vollendorf
14
This bug affects 2 people
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Triaged
High
Unassigned
Bionic
Fix Released
High
Colin Ian King
Focal
Fix Released
High
Unassigned
Groovy
Won't Fix
High
Unassigned
Hirsute
Fix Released
High
Unassigned
Impish
Won't Fix
High
Unassigned

Bug Description

== SRU, Bionic, Focal, Groovy, Hirsute, Impish ==

[Impact]

Creating millions of files on ext4 partition with large_dir support by touching them will eventually trip an ext4 leaf node issue in the index hash. This occurs more frequently when also using smaller block sizes and ends up either with a EXIST or EUCLEAN failure.

This occurs on the restart condition when performing do_split.

[ Fix ]

The fix protects do_split() from the restart condition, making it safe from both current and future ordering of goto statements in earlier sections of the code.

The fix is from a patch sent upstream and cc'd to Ted Tso but didn't appear on the ext4 mailing list presumably because it got marked as SPAM.

[ Test Case ]

Without the fix touching tens of thousands of empty files will trip the issue. It seems to occur more frequently with memory pressure and smaller block sizes, e.g.:

sudo mkdir -p /mnt/tmpfs /mnt/storage
sudo mount -t tmpfs -o size=9000M tmpfs /mnt/tmpfs
sudo dd if=/dev/urandom of=/mnt/tmpfs/ext4.img bs=1M
sudo mkfs.ext4 -O large_dir -N 21000000 -O dir_index /mnt/tmpfs/ext4.img -b 1024 -F
sudo mount /mnt/tmpfs/ext4.img /mnt/storage

and compile and run the attached C program (see https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1933074/+attachment/5509402/+files/touch.c) that quickly populates /mnt/storage with empty files. Without the fix this will terminate with an -EEXIST or -EUCLEAN error on the file creation after several tens of thousands of files.

[Where problems could occur]

This changes the behaviour of the directory indexing hashing so there is a regression potential that this may introduce subsequent index hashing issues when needed (or not) to do a split. This patch seems to cover all the necessary cases, so I believe this risk is relatively low. I have also tested this on all the kernel series in the SRU with 21,000,000 files so I am confident we have enough test coverage to show the fix is OK.

----------------------------------------------------------

I believe, I found a bug in ext4 in recent kernel versions.
I stumbled across this while I was trying to restore a backup to a new VM.

How to reproduce this bug:

1. Use a virtual/physical machine with "Ubuntu 18.04.5 LTS" and kernel version 4.15.0-144-generic.
2. add a secondary disk to hold the test files.
3. prepare and mount the filesystem with enabled 'large_dir' flag:
mkfs.ext4 -m0 /dev/sdb1;
tune2fs -O large_dir /dev/sdb1;
mkdir /mnt/storage;
mount /dev/sdb1 /mnt/storage;
4. change to directory and create approx. 16 mio files
cd /mnt/storage;
i=0;
while (( $i < 20000000 )); do
  i=$(( $i + 1 ));
  (( $i % 1000 == 0 )) && echo $i;
  touch file_$i.dat || break;
done

Expected behaviour:
- 20 mio files shoud be created without error

What happened instead:
- The loop aborts with an error message:
# 16263100
# touch: cannot touch 'file_16263173.dat': Structure needs cleaning
- dmesg gives a little more details:
# [Mon Jun 21 03:15:18 2021] EXT4-fs error (device sdb): dx_probe:855: inode #2: block 146221: comm touch: directory leaf block found instead of index block

Additional notes:
- This occurs on kernel version 4.15.0-144-generic
- Not sure, but I believe one test was run on 4.15.0-143-generic and failed too.
- Did not check against 4.15.0-142-generic
- On 4.15.0-141-generic, the problem does not exist. Behaviour is as expected.

CVE References

Stefan Bader (smb)
affects: linux-signed (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Stefan Bader (smb)
status: New → Triaged
Revision history for this message
Colin Ian King (colin-king) wrote :

C source to reproduce the problem as fast as possible.

The -r option will remove the files.

description: updated
description: updated
Changed in linux (Ubuntu Bionic):
assignee: Stefan Bader (smb) → Colin Ian King (colin-king)
Stefan Bader (smb)
Changed in linux (Ubuntu Bionic):
importance: Undecided → High
Changed in linux (Ubuntu Focal):
importance: Undecided → High
status: New → Triaged
Changed in linux (Ubuntu Groovy):
importance: Undecided → High
status: New → Triaged
Changed in linux (Ubuntu Hirsute):
importance: Undecided → High
status: New → Triaged
Changed in linux (Ubuntu Impish):
importance: Undecided → High
status: New → Triaged
Changed in linux (Ubuntu Bionic):
status: Triaged → Fix Committed
Changed in linux (Ubuntu Focal):
status: Triaged → Fix Committed
Changed in linux (Ubuntu Groovy):
status: Triaged → Fix Committed
Changed in linux (Ubuntu Hirsute):
status: Triaged → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-hirsute' to 'verification-done-hirsute'. If the problem still exists, change the tag 'verification-needed-hirsute' to 'verification-failed-hirsute'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-hirsute
tags: added: verification-needed-focal
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

Revision history for this message
Colin Ian King (colin-king) wrote :

Tested bionic 4.15.0-152-generic, test passes fine. PASSED.

Revision history for this message
Colin Ian King (colin-king) wrote :

Tested focal 5.4.0-81-generic, test passes fine. PASSED.

Revision history for this message
Colin Ian King (colin-king) wrote :

Tested hirsute 5.11.0-26-generic, test passes fine. PASSED.

tags: added: verification-done-bionic verification-done-focal verification-done-hirsute
removed: verification-needed-bionic verification-needed-focal verification-needed-hirsute
Revision history for this message
Brian Murray (brian-murray) wrote :

The Groovy Gorilla has reached end of life, so this bug will not be fixed for that release

Changed in linux (Ubuntu Groovy):
status: Fix Committed → Won't Fix
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (18.2 KiB)

This bug was fixed in the package linux - 5.4.0-81.91

---------------
linux (5.4.0-81.91) focal; urgency=medium

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * large_dir in ext4 broken (LP: #1933074)
    - SAUCE: ext4: fix directory index node split corruption

  * Some test in kselftest/net on focal source tree were not tested at all
    (LP: #1934282)
    - selftests/net: add missing tests to Makefile

  * curtin: install flash-kernel in arm64 UEFI unexpected (LP: #1918427)
    - [Packaging] Allow grub-efi-arm* to satisfy recommends on ARM

  * Add l2tp.sh in net from ubuntu_kernel_selftests back (LP: #1934293)
    - Revert "UBUNTU: SAUCE: selftests/net -- disable l2tp.sh test"

  * icmp_redirect.sh in net from ubuntu_kernel_selftests failed on F-OEM-5.6 /
    F-OEM-5.10 / F-OEM-5.13 / F / G / H (LP: #1880645)
    - selftests: icmp_redirect: support expected failures

  * Focal update: v5.4.128 upstream stable release (LP: #1934179)
    - dmaengine: ALTERA_MSGDMA depends on HAS_IOMEM
    - dmaengine: QCOM_HIDMA_MGMT depends on HAS_IOMEM
    - dmaengine: stedma40: add missing iounmap() on error in d40_probe()
    - afs: Fix an IS_ERR() vs NULL check
    - mm/memory-failure: make sure wait for page writeback in memory_failure
    - kvm: LAPIC: Restore guard to prevent illegal APIC register access
    - batman-adv: Avoid WARN_ON timing related checks
    - net: ipv4: fix memory leak in netlbl_cipsov4_add_std
    - vrf: fix maximum MTU
    - net: rds: fix memory leak in rds_recvmsg
    - net: lantiq: disable interrupt before sheduling NAPI
    - udp: fix race between close() and udp_abort()
    - rtnetlink: Fix regression in bridge VLAN configuration
    - net/sched: act_ct: handle DNAT tuple collision
    - net/mlx5e: Remove dependency in IPsec initialization flows
    - net/mlx5e: Fix page reclaim for dead peer hairpin
    - net/mlx5: Consider RoCE cap before init RDMA resources
    - net/mlx5e: allow TSO on VXLAN over VLAN topologies
    - net/mlx5e: Block offload of outer header csum for UDP tunnels
    - netfilter: synproxy: Fix out of bounds when parsing TCP options
    - sch_cake: Fix out of bounds when parsing TCP options and header
    - alx: Fix an error handling path in 'alx_probe()'
    - net: stmmac: dwmac1000: Fix extended MAC address registers definition
    - net: make get_net_ns return error if NET_NS is disabled
    - qlcnic: Fix an error handling path in 'qlcnic_probe()'
    - netxen_nic: Fix an error handling path in 'netxen_nic_probe()'
    - net: qrtr: fix OOB Read in qrtr_endpoint_post
    - ptp: improve max_adj check against unreasonable values
    - net: cdc_ncm: switch to eth%d interface naming
    - lantiq: net: fix duplicated skb in rx descriptor ring
    - net: usb: fix possible use-after-free in smsc75xx_bind
    - net: fec_ptp: fix issue caused by refactor the fec_devtype
    - net: ipv4: fix memory leak in ip_mc_add1_src
    - net/af_unix: fix a data-race in unix_dgram_sendmsg / unix_release_sock
    - be2net: Fix an error handling path in 'be_probe()'
    - net: hamradio: fix memory leak in mkiss_close
    - net: cdc_eem: fix tx fixup skb leak
    - cxgb4: fix wrong shift.
    - bnx...

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (48.1 KiB)

This bug was fixed in the package linux - 5.11.0-31.33

---------------
linux (5.11.0-31.33) hirsute; urgency=medium

  * hirsute/linux: 5.11.0-31.33 -proposed tracker (LP: #1939553)

  * REGRESSION: shiftfs lets sendfile fail with EINVAL (LP: #1939301)
    - SAUCE: shiftfs: fix sendfile() invocations

linux (5.11.0-26.28) hirsute; urgency=medium

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * large_dir in ext4 broken (LP: #1933074)
    - SAUCE: ext4: fix directory index node split corruption

  * Add l2tp.sh in net from ubuntu_kernel_selftests back (LP: #1934293)
    - Revert "UBUNTU: SAUCE: selftests/net -- disable l2tp.sh test"

  * icmp_redirect.sh in net from ubuntu_kernel_selftests failed on F-OEM-5.6 /
    F-OEM-5.10 / F-OEM-5.13 / F / G / H (LP: #1880645)
    - selftests: icmp_redirect: support expected failures

  * Mute/mic LEDs no function on some HP platfroms (LP: #1934878)
    - ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 450 G8
    - ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 445 G8
    - ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 630 G8

  * [SRU][OEM-5.10/H] Fix HDMI output issue on Intel TGL GPU (LP: #1934864)
    - drm/i915: Fix HAS_LSPCON macro for platforms between GEN9 and GEN10

  * mute/micmute LEDs no function on HP EliteBook 830 G8 Notebook PC
    (LP: #1934239)
    - ALSA: hda/realtek: fix mute/micmute LEDs for HP EliteBook 830 G8 Notebook PC

  * ubuntu-host driver lacks lseek ops (LP: #1934110)
    - ubuntu-host: add generic lseek op

  * ubuntu_kernel_selftests ftrace fails on arm64 F / aws-5.8 / amd64 F
    azure-5.8 (LP: #1927749)
    - selftests/ftrace: fix event-no-pid on 1-core machine

  * Hirsute update: upstream stable patchset 2021-06-29 (LP: #1934012)
    - proc: Track /proc/$pid/attr/ opener mm_struct
    - ASoC: max98088: fix ni clock divider calculation
    - ASoC: amd: fix for pcm_read() error
    - spi: Fix spi device unregister flow
    - spi: spi-zynq-qspi: Fix stack violation bug
    - bpf: Forbid trampoline attach for functions with variable arguments
    - net/nfc/rawsock.c: fix a permission check bug
    - usb: cdns3: Fix runtime PM imbalance on error
    - ASoC: Intel: bytcr_rt5640: Add quirk for the Glavey TM800A550L tablet
    - ASoC: Intel: bytcr_rt5640: Add quirk for the Lenovo Miix 3-830 tablet
    - vfio-ccw: Reset FSM state to IDLE inside FSM
    - vfio-ccw: Serialize FSM IDLE state with I/O completion
    - ASoC: sti-sas: add missing MODULE_DEVICE_TABLE
    - spi: sprd: Add missing MODULE_DEVICE_TABLE
    - usb: chipidea: udc: assign interrupt number to USB gadget structure
    - isdn: mISDN: netjet: Fix crash in nj_probe:
    - bonding: init notify_work earlier to avoid uninitialized use
    - netlink: disable IRQs for netlink_lock_table()
    - net: mdiobus: get rid of a BUG_ON()
    - cgroup: disable controllers at parse time
    - wq: handle VM suspension in stall detection
    - net/qla3xxx: fix schedule while atomic in ql_sem_spinlock
    - RDS tcp loopback connection can hang
    - net:sfc: fix non-freed irq in legacy irq mode
    - scsi: bnx2fc: Return failure if io_req is already in ABTS processing
    - scsi:...

Changed in linux (Ubuntu Hirsute):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (4.5 KiB)

This bug was fixed in the package linux - 4.15.0-154.161

---------------
linux (4.15.0-154.161) bionic; urgency=medium

  * bionic/linux: 4.15.0-154.161 -proposed tracker (LP: #1938411)

  * Potential reverts of 4.19.y stable changes in 18.04 (LP: #1938537)
    - SAUCE: Revert "locking/mutex: clear MUTEX_FLAGS if wait_list is empty due to
      signal"
    - SAUCE: Revert "drm/amd/amdgpu: fix refcount leak"

  * Packaging resync (LP: #1786013)
    - [Packaging] resync getabis
    - [Packaging] update helper scripts
    - update dkms package versions

  * btrfs: Automatic balance returns -EUCLEAN and leads to forced readonly
    filesystem (LP: #1934709) // CVE-2019-19036
    - btrfs: Validate child tree block's level and first key
    - btrfs: Detect unbalanced tree with empty leaf before crashing btree
      operations

  * btrfs: Automatic balance returns -EUCLEAN and leads to forced readonly
    filesystem (LP: #1934709)
    - Revert "btrfs: Detect unbalanced tree with empty leaf before crashing btree
      operations"
    - Revert "btrfs: Validate child tree block's level and first key"
    - btrfs: Only check first key for committed tree blocks
    - btrfs: Fix wrong first_key parameter in replace_path

  * Enable fib-onlink-tests.sh and msg_zerocopy.sh in kselftests/net on Bionic
    (LP: #1934759)
    - selftests: Add fib-onlink-tests.sh to TEST_PROGS
    - selftests: net: use TEST_PROGS_EXTENDED
    - selftests/net: enable msg_zerocopy test
    - SAUCE: selftests: Make fib-onlink-tests.sh executable

  * Kernel oops due to uninitialized list on kernfs (kernfs_kill_sb)
    (LP: #1934175)
    - kernfs: deal with kernfs_fill_super() failures
    - unfuck sysfs_mount()

  * large_dir in ext4 broken (LP: #1933074)
    - SAUCE: ext4: fix directory index node split corruption

  * btrfs: Attempting to balance a nearly full filesystem with relocated root
    nodes fails (LP: #1933172) // CVE-2019-19036
    - btrfs: reloc: fix reloc root leak and NULL pointer dereference

  * btrfs: Attempting to balance a nearly full filesystem with relocated root
    nodes fails (LP: #1933172)
    - Revert "btrfs: reloc: fix reloc root leak and NULL pointer dereference"

  * Pixel format change broken for Elgato Cam Link 4K (LP: #1932367)
    - (upstream) media: uvcvideo: Fix pixel format change for Elgato Cam Link 4K

  * Bionic update: upstream stable patchset 2021-06-23 (LP: #1933375)
    - net: usb: cdc_ncm: don't spew notifications
    - efi: Allow EFI_MEMORY_XP and EFI_MEMORY_RO both to be cleared
    - efi: cper: fix snprintf() use in cper_dimm_err_location()
    - vfio/pci: Fix error return code in vfio_ecap_init()
    - vfio/pci: zap_vma_ptes() needs MMU
    - vfio/platform: fix module_put call in error flow
    - ipvs: ignore IP_VS_SVC_F_HASHED flag when adding service
    - HID: pidff: fix error return code in hid_pidff_init()
    - HID: i2c-hid: fix format string mismatch
    - netfilter: nfnetlink_cthelper: hit EBUSY on updates if size mismatches
    - ieee802154: fix error return code in ieee802154_add_iface()
    - ieee802154: fix error return code in ieee802154_llsec_getparams()
    - Bluetooth: fix the erroneous flush_work() order
    - Blu...

Read more...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote :

Ubuntu 21.10 (Impish Indri) has reached end of life, so this bug will not be fixed for that specific release.

Changed in linux (Ubuntu Impish):
status: Triaged → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.