Activity log for bug #1916468

Date Who What changed Old value New value Message
2021-02-22 10:21:44 Po-Hsu Lin bug added bug
2021-02-22 10:21:44 Po-Hsu Lin attachment added eeh-basic-dmesg.log https://bugs.launchpad.net/bugs/1916468/+attachment/5465912/+files/eeh-basic-dmesg.log
2021-02-22 10:21:54 Po-Hsu Lin nominated for series Ubuntu Focal
2021-02-22 10:21:54 Po-Hsu Lin bug task added linux (Ubuntu Focal)
2021-02-22 10:30:15 Ubuntu Kernel Bot linux (Ubuntu): status New Incomplete
2021-02-22 10:30:17 Ubuntu Kernel Bot linux (Ubuntu Focal): status New Incomplete
2021-02-23 11:48:17 Guilherme G. Piccoli bug added subscriber Guilherme G. Piccoli
2021-02-24 05:37:17 Po-Hsu Lin bug task added ubuntu-kernel-tests
2021-02-24 05:43:45 Po-Hsu Lin linux (Ubuntu): status Incomplete Fix Released
2021-02-24 05:49:54 Po-Hsu Lin description

Old value:

Issue found on node entei with the Focal kernel.

When this test is run, it tries to break 4 devices, and one of them uses the AHCI driver:

$ sudo ./eeh-basic.sh
0000:00:00.0, Skipped: bridge
0001:00:00.0, Skipped: bridge
0020:00:00.0, Skipped: bridge
0021:00:00.0, Skipped: bridge
0021:01:00.0, Skipped: bridge
0021:02:01.0, Skipped: bridge
0021:02:08.0, Skipped: bridge
0021:02:09.0, Skipped: bridge
0021:02:0a.0, Skipped: bridge
0021:02:0b.0, Skipped: bridge
0021:02:0c.0, Skipped: bridge
0021:0d:00.0, Added
0021:0e:00.0, Added
0021:0f:00.0, Skipped: bridge
0021:10:00.0, Added
0022:00:00.0, Skipped: bridge
0022:01:00.0, Added
Found 4 breakable devices...
Breaking 0021:0d:00.0...
0021:0d:00.0, waited 0/60
0021:0d:00.0, waited 1/60
0021:0d:00.0, waited 2/60
0021:0d:00.0, waited 3/60
0021:0d:00.0, waited 4/60
0021:0d:00.0, waited 5/60
0021:0d:00.0, waited 6/60
0021:0d:00.0, waited 7/60
0021:0d:00.0, waited 8/60
0021:0d:00.0, Recovered after 9 seconds
Breaking 0021:0e:00.0...
0021:0e:00.0, waited 0/60
0021:0e:00.0, waited 1/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 2/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 3/60
./eeh-basic.sh: 74: sleep: Input/output error
....
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 59/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 60/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, Failed to recover!
Breaking 0021:10:00.0...
Skipping 0021:10:00.0, Initial PE state is not ok
Breaking 0022:01:00.0...
Skipping 0022:01:00.0, Initial PE state is not ok
3 devices failed to recover (4 tested)
./eeh-basic.sh: 81: lspci: Input/output error
./eeh-basic.sh: 81: diff: Input/output error
./eeh-basic.sh: 82: rm: Input/output error
./eeh-basic.sh: 84: test: 3: unexpected operator

Once the driver has failed to recover, the system starts acting up:

$ ls
ls: command not found

It then drops into a read-only state; dmesg can be found in the attachment.

New value:

[Impact]
When this test is run on the P8 node entei with the Focal kernel, it tries to break 4 devices, and one of them uses the AHCI driver:

$ sudo ./eeh-basic.sh
0000:00:00.0, Skipped: bridge
0001:00:00.0, Skipped: bridge
0020:00:00.0, Skipped: bridge
0021:00:00.0, Skipped: bridge
0021:01:00.0, Skipped: bridge
0021:02:01.0, Skipped: bridge
0021:02:08.0, Skipped: bridge
0021:02:09.0, Skipped: bridge
0021:02:0a.0, Skipped: bridge
0021:02:0b.0, Skipped: bridge
0021:02:0c.0, Skipped: bridge
0021:0d:00.0, Added
0021:0e:00.0, Added
0021:0f:00.0, Skipped: bridge
0021:10:00.0, Added
0022:00:00.0, Skipped: bridge
0022:01:00.0, Added
Found 4 breakable devices...
Breaking 0021:0d:00.0...
0021:0d:00.0, waited 0/60
0021:0d:00.0, waited 1/60
0021:0d:00.0, waited 2/60
0021:0d:00.0, waited 3/60
0021:0d:00.0, waited 4/60
0021:0d:00.0, waited 5/60
0021:0d:00.0, waited 6/60
0021:0d:00.0, waited 7/60
0021:0d:00.0, waited 8/60
0021:0d:00.0, Recovered after 9 seconds
Breaking 0021:0e:00.0...
0021:0e:00.0, waited 0/60
0021:0e:00.0, waited 1/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 2/60
./eeh-basic.sh: 74: sleep: Input/output error
....
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 59/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 60/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, Failed to recover!
Breaking 0021:10:00.0...
Skipping 0021:10:00.0, Initial PE state is not ok
Breaking 0022:01:00.0...
Skipping 0022:01:00.0, Initial PE state is not ok
3 devices failed to recover (4 tested)
./eeh-basic.sh: 81: lspci: Input/output error
./eeh-basic.sh: 81: diff: Input/output error
./eeh-basic.sh: 82: rm: Input/output error
./eeh-basic.sh: 84: test: 3: unexpected operator

Once the driver has failed to recover, the system starts acting up:

$ ls
ls: command not found

It then drops into a read-only state.

[Fixes]
* bbe9064f30f06e ("selftests/eeh: Skip ahci adapters")

This only affects Focal, and the commit can be cherry-picked.

[Test case]
Run the eeh-basic.sh script in tools/testing/selftests/powerpc/eeh/ on the affected P8 node; the test should pass without any issue.

[Where problems could occur]
This fix is limited to the PowerPC testing tool; it should not cause any issue.

2021-02-24 05:50:32 Po-Hsu Lin description

Old value:

[Impact]
When this test is run on the P8 node entei with the Focal kernel, it tries to break 4 devices, and one of them uses the AHCI driver:

$ sudo ./eeh-basic.sh
0000:00:00.0, Skipped: bridge
0001:00:00.0, Skipped: bridge
0020:00:00.0, Skipped: bridge
0021:00:00.0, Skipped: bridge
0021:01:00.0, Skipped: bridge
0021:02:01.0, Skipped: bridge
0021:02:08.0, Skipped: bridge
0021:02:09.0, Skipped: bridge
0021:02:0a.0, Skipped: bridge
0021:02:0b.0, Skipped: bridge
0021:02:0c.0, Skipped: bridge
0021:0d:00.0, Added
0021:0e:00.0, Added
0021:0f:00.0, Skipped: bridge
0021:10:00.0, Added
0022:00:00.0, Skipped: bridge
0022:01:00.0, Added
Found 4 breakable devices...
Breaking 0021:0d:00.0...
0021:0d:00.0, waited 0/60
0021:0d:00.0, waited 1/60
0021:0d:00.0, waited 2/60
0021:0d:00.0, waited 3/60
0021:0d:00.0, waited 4/60
0021:0d:00.0, waited 5/60
0021:0d:00.0, waited 6/60
0021:0d:00.0, waited 7/60
0021:0d:00.0, waited 8/60
0021:0d:00.0, Recovered after 9 seconds
Breaking 0021:0e:00.0...
0021:0e:00.0, waited 0/60
0021:0e:00.0, waited 1/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 2/60
./eeh-basic.sh: 74: sleep: Input/output error
....
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 59/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 60/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, Failed to recover!
Breaking 0021:10:00.0...
Skipping 0021:10:00.0, Initial PE state is not ok
Breaking 0022:01:00.0...
Skipping 0022:01:00.0, Initial PE state is not ok
3 devices failed to recover (4 tested)
./eeh-basic.sh: 81: lspci: Input/output error
./eeh-basic.sh: 81: diff: Input/output error
./eeh-basic.sh: 82: rm: Input/output error
./eeh-basic.sh: 84: test: 3: unexpected operator

Once the driver has failed to recover, the system starts acting up:

$ ls
ls: command not found

It then drops into a read-only state.

[Fixes]
* bbe9064f30f06e ("selftests/eeh: Skip ahci adapters")

This only affects Focal, and the commit can be cherry-picked.

[Test case]
Run the eeh-basic.sh script in tools/testing/selftests/powerpc/eeh/ on the affected P8 node; the test should pass without any issue.

[Where problems could occur]
This fix is limited to the PowerPC testing tool; it should not cause any issue.

New value:

[Impact]
When this test is run on the P8 node entei with the Focal kernel, it tries to break 4 devices, and one of them uses the AHCI driver, which doesn't support error recovery:

$ sudo ./eeh-basic.sh
0000:00:00.0, Skipped: bridge
0001:00:00.0, Skipped: bridge
0020:00:00.0, Skipped: bridge
0021:00:00.0, Skipped: bridge
0021:01:00.0, Skipped: bridge
0021:02:01.0, Skipped: bridge
0021:02:08.0, Skipped: bridge
0021:02:09.0, Skipped: bridge
0021:02:0a.0, Skipped: bridge
0021:02:0b.0, Skipped: bridge
0021:02:0c.0, Skipped: bridge
0021:0d:00.0, Added
0021:0e:00.0, Added
0021:0f:00.0, Skipped: bridge
0021:10:00.0, Added
0022:00:00.0, Skipped: bridge
0022:01:00.0, Added
Found 4 breakable devices...
Breaking 0021:0d:00.0...
0021:0d:00.0, waited 0/60
0021:0d:00.0, waited 1/60
0021:0d:00.0, waited 2/60
0021:0d:00.0, waited 3/60
0021:0d:00.0, waited 4/60
0021:0d:00.0, waited 5/60
0021:0d:00.0, waited 6/60
0021:0d:00.0, waited 7/60
0021:0d:00.0, waited 8/60
0021:0d:00.0, Recovered after 9 seconds
Breaking 0021:0e:00.0...
0021:0e:00.0, waited 0/60
0021:0e:00.0, waited 1/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 2/60
./eeh-basic.sh: 74: sleep: Input/output error
....
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 59/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 60/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, Failed to recover!
Breaking 0021:10:00.0...
Skipping 0021:10:00.0, Initial PE state is not ok
Breaking 0022:01:00.0...
Skipping 0022:01:00.0, Initial PE state is not ok
3 devices failed to recover (4 tested)
./eeh-basic.sh: 81: lspci: Input/output error
./eeh-basic.sh: 81: diff: Input/output error
./eeh-basic.sh: 82: rm: Input/output error
./eeh-basic.sh: 84: test: 3: unexpected operator

Once the driver has failed to recover, the system starts acting up:

$ ls
ls: command not found

It then drops into a read-only state.

[Fixes]
* bbe9064f30f06e ("selftests/eeh: Skip ahci adapters")

This only affects Focal, and the commit can be cherry-picked.

[Test case]
Run the eeh-basic.sh script in tools/testing/selftests/powerpc/eeh/ on the affected P8 node; the test should pass without any issue.

[Where problems could occur]
This fix is limited to the PowerPC testing tool; it should not cause any issue.

2021-02-24 07:15:12 Po-Hsu Lin ubuntu-kernel-tests: status New In Progress
2021-02-24 07:15:14 Po-Hsu Lin ubuntu-kernel-tests: assignee Po-Hsu Lin (cypressyew)
2021-02-24 07:15:16 Po-Hsu Lin linux (Ubuntu Focal): assignee Po-Hsu Lin (cypressyew)
2021-02-24 07:15:17 Po-Hsu Lin linux (Ubuntu Focal): status Incomplete In Progress
2021-02-26 02:03:27 Kelsey Steele linux (Ubuntu Focal): status In Progress Fix Committed
2021-03-12 08:07:39 Po-Hsu Lin tags ubuntu-kernel-selftests
2021-03-12 08:07:47 Po-Hsu Lin tags ubuntu-kernel-selftests 5.4 focal ppc64el ubuntu-kernel-selftests
2021-03-26 04:49:02 Ubuntu Kernel Bot tags 5.4 focal ppc64el ubuntu-kernel-selftests 5.4 focal ppc64el ubuntu-kernel-selftests verification-needed-focal
2021-03-29 15:20:19 Po-Hsu Lin tags 5.4 focal ppc64el ubuntu-kernel-selftests verification-needed-focal 5.4 focal ppc64el ubuntu-kernel-selftests verification-done-focal
2021-04-12 13:53:40 Launchpad Janitor linux (Ubuntu Focal): status Fix Committed Fix Released
2021-04-15 06:55:18 Po-Hsu Lin ubuntu-kernel-tests: status In Progress Fix Released
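
Note on the fix: the description names bbe9064f30f06e ("selftests/eeh: Skip ahci adapters") and states that the AHCI driver doesn't support error recovery. The following is a minimal sketch of how a test script could detect such a device before breaking it. It is not the upstream patch. It assumes the standard sysfs layout, where /sys/bus/pci/devices/<address>/driver is a symlink to the bound driver; the function names, the SYSFS_PCI override, and the log wording are hypothetical and do not come from eeh-basic.sh.

```shell
#!/bin/sh
# Sketch only: decide whether a PCI device should be skipped because it is
# bound to the ahci driver. Function names, SYSFS_PCI, and the messages
# printed here are illustrative assumptions, not taken from eeh-basic.sh.

# Root of the PCI device tree in sysfs. Overridable so the logic can be
# exercised against a fake directory tree.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/devices}"

# Print the name of the driver bound to device $1, or nothing if unbound.
bound_driver() {
	drv_link="$SYSFS_PCI/$1/driver"
	if [ -L "$drv_link" ]; then
		basename "$(readlink "$drv_link")"
	fi
}

# Return 0 (skip) if device $1 is bound to ahci, 1 otherwise.
should_skip_ahci() {
	[ "$(bound_driver "$1")" = "ahci" ]
}

# Classify every device currently visible under SYSFS_PCI.
list_devices() {
	for path in "$SYSFS_PCI"/*; do
		[ -e "$path" ] || continue
		dev="$(basename "$path")"
		if should_skip_ahci "$dev"; then
			echo "$dev, Skipped: ahci"
		else
			echo "$dev, Candidate"
		fi
	done
}
```

Calling list_devices on a real system prints one line per PCI device. Checking the bound driver before injecting an error matters here because, as the log in the description shows, a failed recovery left the system unable to run even ls.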