Activity log for bug #1780137

Date Who What changed Old value New value Message
2018-07-04 16:51:45 dann frazier bug added bug
2018-07-04 17:00:05 Ubuntu Kernel Bot linux (Ubuntu): status New Incomplete
2018-07-05 14:20:48 Joseph Salisbury linux (Ubuntu): importance Undecided Medium
2018-07-05 14:21:02 Joseph Salisbury tags kernel-da-key
2018-07-05 14:21:08 Joseph Salisbury linux (Ubuntu): status Incomplete Triaged
2018-07-05 14:21:26 Joseph Salisbury nominated for series Ubuntu Bionic
2018-07-05 14:21:26 Joseph Salisbury bug task added linux (Ubuntu Bionic)
2018-07-05 14:21:33 Joseph Salisbury linux (Ubuntu Bionic): status New Triaged
2018-07-05 14:21:36 Joseph Salisbury linux (Ubuntu Bionic): importance Undecided Medium
2018-07-05 14:21:42 Joseph Salisbury tags kernel-da-key bionic kernel-da-key
2018-07-05 21:53:58 dann frazier description
  Old value:
    We're seeing a very reproducible regression in the bionic kernel triggered by the stress-ng chdir test performed by the Ubuntu certification suite. Platform is a HiSilicon D05 arm64 server, but we don't have reason to believe it is platform specific at this time.
    [Test Case]
    $ sudo apt-add-repository -y ppa:hardware-certification/public
    $ sudo apt install -y canonical-certification-server
    $ sudo mkfs.ext4 /dev/sda1 (Obviously, this should not be your root disk!!)
    $ sudo /usr/lib/plainbox-provider-checkbox/bin/disk_stress_ng sda --base-time 240 --really-run
    This test runs a series of stress-ng tests against /dev/sda, and fails on the "chdir" test. To speed up reproduction, reduce the test list to just "chdir" in the disk_stress_ng script. Attempts to reproduce this directly with stress-ng have failed - presumably because of other environment setup that this script performs (e.g. setting aio-max-nr to 524288). Our reproduction test is to use a non-root disk because it can lead to corruption, and mkfs.ext4'ing the partition just before running the test, to get to a pristine fs state.
    I bisected this down to the following commit:
    commit 555bc9b1421f10d94a1192c7eea4a59faca3e711
    Author: Theodore Ts'o <tytso@mit.edu>
    Date: Mon Feb 19 14:16:47 2018 -0500
        ext4: don't update checksum of new initialized bitmaps
        BugLink: http://bugs.launchpad.net/bugs/1773233
        commit 044e6e3d74a3d7103a0c8a9305dfd94d64000660 upstream.
  New value:
    We're seeing a very reproducible regression in the bionic kernel triggered by the stress-ng chdir test performed by the Ubuntu certification suite. We see this on both the HiSilicon D05 arm64 server and the HiSilicon D06 arm64 server. We have been unable to reproduce on other servers so far.
    [Test Case]
    $ sudo apt-add-repository -y ppa:hardware-certification/public
    $ sudo apt install -y canonical-certification-server
    $ sudo mkfs.ext4 /dev/sda1 (Obviously, this should not be your root disk!!)
    $ sudo /usr/lib/plainbox-provider-checkbox/bin/disk_stress_ng sda --base-time 240 --really-run
    This test runs a series of stress-ng tests against /dev/sda, and fails on the "chdir" test. To speed up reproduction, reduce the test list to just "chdir" in the disk_stress_ng script. Attempts to reproduce this directly with stress-ng have failed - presumably because of other environment setup that this script performs (e.g. setting aio-max-nr to 524288). Our reproduction test is to use a non-root disk because it can lead to corruption, and mkfs.ext4'ing the partition just before running the test, to get to a pristine fs state.
    I bisected this down to the following commit:
    commit 555bc9b1421f10d94a1192c7eea4a59faca3e711
    Author: Theodore Ts'o <tytso@mit.edu>
    Date: Mon Feb 19 14:16:47 2018 -0500
        ext4: don't update checksum of new initialized bitmaps
        BugLink: http://bugs.launchpad.net/bugs/1773233
        commit 044e6e3d74a3d7103a0c8a9305dfd94d64000660 upstream.
2018-07-10 20:19:21 dann frazier bug added subscriber Ike Panhc
2018-07-10 21:54:05 dann frazier description
  Old value:
    We're seeing a very reproducible regression in the bionic kernel triggered by the stress-ng chdir test performed by the Ubuntu certification suite. We see this on both the HiSilicon D05 arm64 server and the HiSilicon D06 arm64 server. We have been unable to reproduce on other servers so far.
    [Test Case]
    $ sudo apt-add-repository -y ppa:hardware-certification/public
    $ sudo apt install -y canonical-certification-server
    $ sudo mkfs.ext4 /dev/sda1 (Obviously, this should not be your root disk!!)
    $ sudo /usr/lib/plainbox-provider-checkbox/bin/disk_stress_ng sda --base-time 240 --really-run
    This test runs a series of stress-ng tests against /dev/sda, and fails on the "chdir" test. To speed up reproduction, reduce the test list to just "chdir" in the disk_stress_ng script. Attempts to reproduce this directly with stress-ng have failed - presumably because of other environment setup that this script performs (e.g. setting aio-max-nr to 524288). Our reproduction test is to use a non-root disk because it can lead to corruption, and mkfs.ext4'ing the partition just before running the test, to get to a pristine fs state.
    I bisected this down to the following commit:
    commit 555bc9b1421f10d94a1192c7eea4a59faca3e711
    Author: Theodore Ts'o <tytso@mit.edu>
    Date: Mon Feb 19 14:16:47 2018 -0500
        ext4: don't update checksum of new initialized bitmaps
        BugLink: http://bugs.launchpad.net/bugs/1773233
        commit 044e6e3d74a3d7103a0c8a9305dfd94d64000660 upstream.
  New value:
    [Impact]
    We're seeing a very reproducible regression in the bionic kernel triggered by the stress-ng chdir test performed by the Ubuntu certification suite. We see this on both the HiSilicon D05 arm64 server and the HiSilicon D06 arm64 server. We have been unable to reproduce on other servers so far.
    [Test Case]
    $ sudo apt-add-repository -y ppa:hardware-certification/public
    $ sudo apt install -y canonical-certification-server
    $ sudo mkfs.ext4 /dev/sda1 (Obviously, this should not be your root disk!!)
    $ sudo /usr/lib/plainbox-provider-checkbox/bin/disk_stress_ng sda --base-time 240 --really-run
    This test runs a series of stress-ng tests against /dev/sda, and fails on the "chdir" test. To speed up reproduction, reduce the test list to just "chdir" in the disk_stress_ng script. Attempts to reproduce this directly with stress-ng have failed - presumably because of other environment setup that this script performs (e.g. setting aio-max-nr to 524288). Our reproduction test is to use a non-root disk because it can lead to corruption, and mkfs.ext4'ing the partition just before running the test, to get to a pristine fs state.
    [Fix]
    I bisected this down to the following commit:
    commit 555bc9b1421f10d94a1192c7eea4a59faca3e711
    Author: Theodore Ts'o <tytso@mit.edu>
    Date: Mon Feb 19 14:16:47 2018 -0500
        ext4: don't update checksum of new initialized bitmaps
        BugLink: http://bugs.launchpad.net/bugs/1773233
        commit 044e6e3d74a3d7103a0c8a9305dfd94d64000660 upstream.
    Reverting that fixes the problem. Meanwhile, a proposed fix has been posted upstream:
    https://www.spinics.net/lists/linux-ext4/msg61578.html
    [Regression Risk]
2018-07-11 08:03:49 Kleber Sacilotto de Souza linux (Ubuntu Bionic): status Triaged Fix Committed
2018-07-11 08:30:19 Ike Panhc attachment added consolelog.txt https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1780137/+attachment/5162302/+files/consolelog.txt
2018-07-11 18:46:34 Seth Forshee linux (Ubuntu): status Triaged Fix Committed
2018-07-12 12:01:18 Brad Figg tags bionic kernel-da-key bionic kernel-da-key verification-needed-bionic
2018-07-12 16:41:39 dann frazier tags bionic kernel-da-key verification-needed-bionic bionic kernel-da-key verification-done-bionic
2018-07-20 15:39:45 Launchpad Janitor linux (Ubuntu Bionic): status Fix Committed Fix Released
2018-07-26 05:13:52 Launchpad Janitor linux (Ubuntu): status Fix Committed Fix Released