superblock checksum mismatch in resize2fs

Bug #2036467 reported by Krister Johansen
12
This bug affects 2 people
Affects Status Importance Assigned to Milestone
e2fsprogs (Ubuntu)
Confirmed
Undecided
Unassigned

Bug Description

Hi,
We run ext4 on EBS volumes on EC2. During provisioning, cloud-init will occasionally report that resize2fs has failed due to a superblock checksum mismatch. We debugged this internally, and were able to come up with the following reproducer:

   #!/usr/bin/bash
   set -euxo pipefail

   while true
   do
           parted /dev/nvme1n1 mklabel gpt mkpart primary 2048s 2099200s
           sleep .5
           mkfs.ext4 /dev/nvme1n1p1
           mount -t ext4 /dev/nvme1n1p1 /mnt
           stress-ng --temp-path /mnt -D 4 &
           STRESS_PID=$!
           sleep 1
           growpart /dev/nvme1n1 1
           resize2fs /dev/nvme1n1p1
           kill $STRESS_PID
           wait $STRESS_PID
           umount /mnt
           wipefs -a /dev/nvme1n1p1
           wipefs -a /dev/nvme1n1
   done

(This was on a 60gb gp3 volume attached to a c5.4xlarge)

We were able to find a fix that works and get the patch accepted upstream. The short explanation is that by switching the superblock read to direct io, we no longer see the problem.

The patch is available here, but hasn't been published in a released version of e2fsprogs:

https://git.kernel.org/pub/scm/fs/ext2/e2fsprogs.git/commit/?id=43a498e938887956f393b5e45ea6ac79cc5f4b84

A longer thread with the maintainer is available here:

https://<email address hidden>/

This bug report is to request that Ubuntu backport this patch to the versions of e2fsprogs that are in releases that are available in images on AWS, preferably Focal and Jammy.

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in e2fsprogs (Ubuntu):
status: New → Confirmed
Revision history for this message
Ye Lu (luye98) wrote :

Hi, we were seeing similar issues when bootstrapping AWS EC2 hosts in our service. We applied the patch provided in upstream internally and it indeed resolved the filesystem resize errors we previously encountered. It would be helpful to also backport the patch in Ubuntu and make it generally available for focal and jammy distributions.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.