[Regression] EXT4-fs error (device sda1): ext4_validate_inode_bitmap:99: comm stress-ng: Corrupt inode bitmap

Bug #1780137 reported by dann frazier
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
Medium
Unassigned
Bionic
Fix Released
Medium
Unassigned

Bug Description

[Impact]
We're seeing a very reproducible regression in the bionic kernel triggered by the stress-ng chdir test performed by the Ubuntu certification suite. We see this on both the HiSilicon D05 arm64 server and the HiSilicon D06 arm64 server. We have been unable to reproduce on other servers so far.

[Test Case]
$ sudo apt-add-repository -y ppa:hardware-certification/public
$ sudo apt install -y canonical-certification-server
$ sudo mkfs.ext4 /dev/sda1 (Obviously, this should not be your root disk!!)
$ sudo /usr/lib/plainbox-provider-checkbox/bin/disk_stress_ng sda --base-time 240 --really-run

This test runs a series of stress-ng tests against /dev/sda, and fails on the "chdir" test. To speed up reproduction, reduce the test list to just "chdir" in the disk_stress_ng script. Attempts to reproduce this directly with stress-ng have failed - presumably because of other environment setup that this script performs (e.g. setting aio-max-nr to 524288).

Our reproduction test is to use a non-root disk because it can lead to corruption, and mkfs.ext4'ing the partition just before running the test, to get to a pristine fs state.

[Fix]
I bisected this down to the following commit:

commit 555bc9b1421f10d94a1192c7eea4a59faca3e711
Author: Theodore Ts'o <email address hidden>
Date: Mon Feb 19 14:16:47 2018 -0500

    ext4: don't update checksum of new initialized bitmaps

    BugLink: http://bugs.launchpad.net/bugs/1773233

    commit 044e6e3d74a3d7103a0c8a9305dfd94d64000660 upstream.

Reverting that fixes the problem. Meanwhile, a proposed fix has been posted upstream:
https://www.spinics.net/lists/linux-ext4/msg61578.html

[Regression Risk]

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1780137

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: Incomplete → Triaged
Changed in linux (Ubuntu Bionic):
status: New → Triaged
importance: Undecided → Medium
tags: added: bionic
Revision history for this message
Colin Ian King (colin-king) wrote :

What is the stress-ng command that is being run by /usr/lib/plainbox-provider-checkbox/bin/disk_stress_ng - without knowing that it's hard to figure out the initial stressor conditions

Revision history for this message
dann frazier (dannf) wrote : Re: [Bug 1780137] Re: [Regression] EXT4-fs error (device sda1): ext4_validate_inode_bitmap:99: comm stress-ng: Corrupt inode bitmap

On Thu, Jul 5, 2018 at 9:35 AM Colin Ian King
<email address hidden> wrote:
>
> What is the stress-ng command that is being run by /usr/lib/plainbox-
> provider-checkbox/bin/disk_stress_ng - without knowing that it's hard to
> figure out the initial stressor conditions

It runs several stress-ng tests against the target. In my
reproduction, I've disabled all but one, and that is 'chdir'. Here's
the invocation:

stress-ng --aggressive --verify --timeout 240 --aio 0 --hdd-opts dsync
--readahead-bytes 16M -k

This is the only stress-ng command that runs after boot/fresh-mkfs.
However, as I mentioned, this stress-ng command alone did not appear
to trigger the issue. I suspect that this is due to sysctls that cert
sets before the running the test, but I have not attempted to prove
that yet.

  -dann

Revision history for this message
Colin Ian King (colin-king) wrote :

It may be worth grabbing a copy of the /proc/sys on a clean boot and then a copy after the sysctl changes so we can get and idea of any specific tweaks that may have occurred.

Just to note, I've been running the stress-ng command as noted in comment #3 on a 24 CPU ARM64 Synquacer box with the alleged faulty kernel and cannot reproduce the issue.

dann frazier (dannf)
description: updated
Revision history for this message
dann frazier (dannf) wrote :

The significant difference in the cert test wrapper actually appears to be that it creates a directory in which to run the test, vs. running it at the root of the mount point. I'm able to reproduce the failure outside of cert after a fresh reboot with the following script. The fs.aio-max-nr sysctl was a red herring.

mkdir -p /tmp/mnt
# My script does a mkfs.ext4 -F, but omitting that here in case someone blindly cut & pastes
mkfs.ext4 /dev/sda1
mount /dev/sda1 /tmp/mnt
# Running directly under /tmp/mnt doesn't trigger the issue, we need
# this subdirectory for some reason
mkdir -p /tmp/mnt/tmp/disk_stress_ng_f70f0f26-b332-4c48-9e07-67c529770e3d
stress-ng --aggressive --verify --timeout 240 --temp-path /tmp/mnt/tmp/disk_stress_ng_f70f0f26-b332-4c48-9e07-67c529770e3d --chdir 0 --hdd-opts dsync --readahead-bytes 16M -k

@Colin: I updated the description to note that we have seen this on 2 different HiSilicon servers that use different HiSilicon SoCs. We've also tested on a Cavium Sabre (ThunderX2-based) board, and were unable to reproduce it there.

Revision history for this message
dann frazier (dannf) wrote :
Revision history for this message
Colin Ian King (colin-king) wrote :

I would be useful to see if a non-SMP boot will cause the same issue; if it only occurs in SMP boots then we know it's a lower level locking/race style of problem.

Revision history for this message
dann frazier (dannf) wrote :

Copying over comments from Ike from a different forum:

Ike Panhc (ikepanhc) wrote on 2018-07-09: #42
I build kernel debs based on Ted's response on ext4 mailing list at

http://kernel.ubuntu.com/~ikepanhc/ext4msg61578/

Please also see

https://www.spinics.net/lists/linux-ext4/msg61578.html

Now running test on d05-4

Ike Panhc (ikepanhc) wrote 14 hours ago: #43
kernel debs in #42 generates other ext4 error message. I am going to tell if there is patch missing from upstream ext4 or we can response these to Ted.

# grep EXT4 d05-4-ipmi.log
[ 26.215587] EXT4-fs (sdb2): mounted filesystem with ordered data mode. Opts: (null)
[ 29.844105] EXT4-fs (sdb2): re-mounted. Opts: errors=remount-ro
[ 3586.211348] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 4705: bad block bitmap checksum
[ 8254.776992] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 4193: bad block bitmap checksum
[12921.245105] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 21254: bad block bitmap checksum
[17595.654170] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 13197: bad block bitmap checksum
[17627.071833] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 13214: bad block bitmap checksum
[17668.508879] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 13230: bad block bitmap checksum
[17764.903199] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 13199: bad block bitmap checksum
[19799.801233] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 19978: bad block bitmap checksum
[36257.005491] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 14158: bad block bitmap checksum
[40924.017749] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 18668: bad block bitmap checksum
[40927.479721] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 18679: bad block bitmap checksum
[50223.976763] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 2865: bad block bitmap checksum
[54896.662047] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 5029: bad block bitmap checksum
[54896.751793] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 5041: bad block bitmap checksum

Revision history for this message
dann frazier (dannf) wrote :

I am no longer able to reproduce after applying Ted's patch. I was able to run my unit test 20 times w/o failing on upstream + patch on d05-6. I then switched over to the Ubuntu kernel + patch, and it has now passed 64 times (and counting).

I then looked to see why Ike is still observing a failure. My theory is that the filesystem Ike was testing was already corrupted by a previous *unpatched* run, so the kernel is finding pre-existing corruption. Evidence follows.

The last record of "sudo mkfs.ext4" running in /var/log/auth.log:

Jul 5 06:43:39 d05-4 sudo: ubuntu : TTY=ttyAMA0 ; PWD=/home/ubuntu ; USER=root ; COMMAND=/sbin/mkfs.ext4 /dev/sda2

While the kernel w/ the fix wasn't built until Jul 9:
[ 0.000000] Linux version 4.15.0-25-generic (root@recht) (gcc version
 7.3.0 (Ubuntu/Linaro 7.3.0-16ubuntu3)) #27+ext4msg61578.1 SMP Mon Jul 9 08:28:49 UTC 2018 (Ubuntu 4.1
5.0-25.27+ext4msg61578.1-generic 4.15.18)

The first time /dev/sda2 was mounted after booting this kernel, it reported known errors:
Jul 9 05:43:33 d05-4 kernel: [ 138.522140] EXT4-fs (sda2): warning: mounting fs with errors, running e2fsck is recommended

Looking at the conserver log (logs all console activity on this system), it looks like the test used did not reformat the disk between iterations:
root@d05-4:~# ^G^G^G^G^G^G^G^G^G^G^G^Gwhile true; do sudo /usr/lib/plainbox-provider-checkbox/bin/disk
_st^Mtress_ng sda --base-time 240 --really-run; done

Finally, I manually ran mkfs.ext4 on /dev/sda2. Afterwards, both my unit test (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1780137/comments/5) and the full disk_stress_ng cert test Ike was running passed without error.

Revision history for this message
dann frazier (dannf) wrote :

@Ike: Let me know if the above sounds plausible. I should also make the following notes, which may provide evidence counter to my theory of pre-existing corruption:

It is possible that mkfs.ext4 was ran outside of sudo (e.g. directly in a root shell), and would therefore not be logged in /var/log/auth.log. It's also possible that testing was running in an ssh shell, so the commands I gleaned from the console may not be the full picture.

While it is true that the first mount of /dev/sda2 after applying Ted's patch reported "mounting fs with errors", subsequent mounts do not have that message, and yet there are "bad block bitmap checksum" messages that follow:

Jul 9 08:19:42 d05-4 kernel: [ 139.572607] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
Jul 9 09:17:09 d05-4 kernel: [ 3586.211348] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 4705: bad block bitmap checksum
Jul 9 09:37:34 d05-4 kernel: [ 4810.952360] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
Jul 9 10:34:58 d05-4 kernel: [ 8254.776992] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383: comm stress-ng: bg 4193: bad block bitmap checksum

Does that mean the filesystem was somehow no longer corrupted at that time (e.g. fsck'd, or re-mkfs'd), or is it possible that the fs-has-errors flag was just cleared while corruption persisted?

dann frazier (dannf)
description: updated
Changed in linux (Ubuntu Bionic):
status: Triaged → Fix Committed
Revision history for this message
Ike Panhc (ikepanhc) wrote :

@dannf, my command is

 `while true; do sudo /usr/lib/plainbox-provider-checkbox/bin/disk_stress_ng sda --base-time 240 --really-run; mkfs.ext4 -F /dev/sda2; done`

But I also find out that the e2fsprogs I am using is upstream one.

mke2fs 1.44.3-rc2 (3-July-2018)

I am going to reset my environment and re-test.

Seth Forshee (sforshee)
Changed in linux (Ubuntu):
status: Triaged → Fix Committed
Revision history for this message
Brad Figg (brad-figg) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Revision history for this message
dann frazier (dannf) wrote :

Verification results:

ubuntu@d05-6:~$ sudo ./stress
+ umount /tmp/mnt
umount: /tmp/mnt: not mounted.
+ /bin/true
+ mkdir -p /tmp/mnt
+ mkfs.ext4 -F /dev/sda1
mke2fs 1.44.1 (24-Mar-2018)
/dev/sda1 contains a ext4 file system
 last mounted on Wed Jul 11 23:26:23 2018
/dev/sda1 alignment is offset by 244736 bytes.
This may result in very poor performance, (re)-partitioning suggested.
Creating filesystem with 1953124855 4k blocks and 244142080 inodes
Filesystem UUID: 7aa77e9b-8518-4b64-9507-b65d09aa6382
Superblock backups stored on blocks:
 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208,
 4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968,
 102400000, 214990848, 512000000, 550731776, 644972544, 1934917632

Allocating group tables: done
Writing inode tables: done
Creating journal (262144 blocks): done
Writing superblocks and filesystem accounting information: done

+ mount /dev/sda1 /tmp/mnt
+ dir=/tmp/mnt/tmp/disk_stress_ng_f70f0f26-b332-4c48-9e07-67c529770e3d
+ mkdir -p /tmp/mnt/tmp/disk_stress_ng_f70f0f26-b332-4c48-9e07-67c529770e3d
+ stress-ng --aggressive --verify --timeout 240 --temp-path /tmp/mnt/tmp/disk_stress_ng_f70f0f26-b332-4c48-9e07-67c529770e3d --chdir 0 --hdd-opts dsync --readahead-bytes 16M -k
stress-ng: info: [2670] dispatching hogs: 64 chdir
stress-ng: info: [2670] cache allocate: using built-in defaults as unable to determine cache details
stress-ng: info: [2670] successful run completed in 244.71s (4 mins, 4.71 secs)
ubuntu@d05-6:~$ cat /proc/version
Linux version 4.15.0-27-generic (buildd@bos02-arm64-027) (gcc version 7.3.0 (Ubuntu/Linaro 7.3.0-16ubuntu3)) #29-Ubuntu SMP Wed Jul 11 08:20:51 UTC 2018

tags: added: verification-done-bionic
removed: verification-needed-bionic
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (4.1 KiB)

This bug was fixed in the package linux - 4.15.0-29.31

---------------
linux (4.15.0-29.31) bionic; urgency=medium

  * linux: 4.15.0-29.31 -proposed tracker (LP: #1782173)

  * [SRU Bionic][Cosmic] kernel panic in ipmi_ssif at msg_done_handler
    (LP: #1777716)
    - ipmi_ssif: Fix kernel panic at msg_done_handler

  * Update to ocxl driver for 18.04.1 (LP: #1775786)
    - misc: ocxl: use put_device() instead of device_unregister()
    - powerpc: Add TIDR CPU feature for POWER9
    - powerpc: Use TIDR CPU feature to control TIDR allocation
    - powerpc: use task_pid_nr() for TID allocation
    - ocxl: Rename pnv_ocxl_spa_remove_pe to clarify it's action
    - ocxl: Expose the thread_id needed for wait on POWER9
    - ocxl: Add an IOCTL so userspace knows what OCXL features are available
    - ocxl: Document new OCXL IOCTLs
    - ocxl: Fix missing unlock on error in afu_ioctl_enable_p9_wait()

  * Critical upstream bugfix missing in Ubuntu 18.04 - frequent Xorg crash after
    suspend (LP: #1776887)
    - ocxl: Document the OCXL_IOCTL_GET_METADATA IOCTL

  * Hard LOCKUP observed on stressing Ubuntu 18 04 (LP: #1777194)
    - powerpc: use NMI IPI for smp_send_stop
    - powerpc: Fix smp_send_stop NMI IPI handling

  * IPL: ppc64_cpu --frequency hang with INFO: rcu_sched detected stalls on
    CPUs/tasks on w34 and wsbmc016 with 920.1714.20170330n (LP: #1773964)
    - rtc: opal: Fix OPAL RTC driver OPAL_BUSY loops

  * [Regression] EXT4-fs error (device sda2): ext4_validate_block_bitmap:383:
    comm stress-ng: bg 4705: bad block bitmap checksum (LP: #1781709)
    - SAUCE: Revert "UBUNTU: SAUCE: ext4: fix ext4_validate_inode_bitmap: comm
      stress-ng: Corrupt inode bitmap"
    - SAUCE: ext4: check for allocation block validity with block group locked

linux (4.15.0-28.30) bionic; urgency=medium

  * linux: 4.15.0-28.30 -proposed tracker (LP: #1781433)

  * Cannot set MTU higher than 1500 in Xen instance (LP: #1781413)
    - xen-netfront: Fix mismatched rtnl_unlock
    - xen-netfront: Update features after registering netdev

linux (4.15.0-27.29) bionic; urgency=medium

  * linux: 4.15.0-27.29 -proposed tracker (LP: #1781062)

  * [Regression] EXT4-fs error (device sda1): ext4_validate_inode_bitmap:99:
    comm stress-ng: Corrupt inode bitmap (LP: #1780137)
    - SAUCE: ext4: fix ext4_validate_inode_bitmap: comm stress-ng: Corrupt inode
      bitmap

linux (4.15.0-26.28) bionic; urgency=medium

  * linux: 4.15.0-26.28 -proposed tracker (LP: #1780112)

  * failure to boot with linux-image-4.15.0-24-generic (LP: #1779827) // Cloud-
    init causes potentially huge boot delays with 4.15 kernels (LP: #1780062)
    - random: Make getrandom() ready earlier

linux (4.15.0-25.27) bionic; urgency=medium

  * linux: 4.15.0-25.27 -proposed tracker (LP: #1779354)

  * hisi_sas_v3_hw: internal task abort: timeout and not done. (LP: #1777736)
    - scsi: hisi_sas: Update a couple of register settings for v3 hw

  * hisi_sas: Add missing PHY spinlock init (LP: #1777734)
    - scsi: hisi_sas: Add missing PHY spinlock init

  * hisi_sas: improve read performance by pre-allocating slot DMA buffers
    (LP: #1777727)
    - scsi: hisi_sas: use dma_zalloc_cohe...

Read more...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.