2021-02-22 10:21:44 |
Po-Hsu Lin |
bug |
|
|
added bug |
2021-02-22 10:21:44 |
Po-Hsu Lin |
attachment added |
|
eeh-basic-dmesg.log https://bugs.launchpad.net/bugs/1916468/+attachment/5465912/+files/eeh-basic-dmesg.log |
|
2021-02-22 10:21:54 |
Po-Hsu Lin |
nominated for series |
|
Ubuntu Focal |
|
2021-02-22 10:21:54 |
Po-Hsu Lin |
bug task added |
|
linux (Ubuntu Focal) |
|
2021-02-22 10:30:15 |
Ubuntu Kernel Bot |
linux (Ubuntu): status |
New |
Incomplete |
|
2021-02-22 10:30:17 |
Ubuntu Kernel Bot |
linux (Ubuntu Focal): status |
New |
Incomplete |
|
2021-02-23 11:48:17 |
Guilherme G. Piccoli |
bug |
|
|
added subscriber Guilherme G. Piccoli |
2021-02-24 05:37:17 |
Po-Hsu Lin |
bug task added |
|
ubuntu-kernel-tests |
|
2021-02-24 05:43:45 |
Po-Hsu Lin |
linux (Ubuntu): status |
Incomplete |
Fix Released |
|
2021-02-24 05:49:54 |
Po-Hsu Lin |
description |
Issue found on node entei with the Focal kernel.
When run on Focal, this test tries to break 4 devices, one of which uses the AHCI driver:
$ sudo ./eeh-basic.sh
0000:00:00.0, Skipped: bridge
0001:00:00.0, Skipped: bridge
0020:00:00.0, Skipped: bridge
0021:00:00.0, Skipped: bridge
0021:01:00.0, Skipped: bridge
0021:02:01.0, Skipped: bridge
0021:02:08.0, Skipped: bridge
0021:02:09.0, Skipped: bridge
0021:02:0a.0, Skipped: bridge
0021:02:0b.0, Skipped: bridge
0021:02:0c.0, Skipped: bridge
0021:0d:00.0, Added
0021:0e:00.0, Added
0021:0f:00.0, Skipped: bridge
0021:10:00.0, Added
0022:00:00.0, Skipped: bridge
0022:01:00.0, Added
Found 4 breakable devices...
Breaking 0021:0d:00.0...
0021:0d:00.0, waited 0/60
0021:0d:00.0, waited 1/60
0021:0d:00.0, waited 2/60
0021:0d:00.0, waited 3/60
0021:0d:00.0, waited 4/60
0021:0d:00.0, waited 5/60
0021:0d:00.0, waited 6/60
0021:0d:00.0, waited 7/60
0021:0d:00.0, waited 8/60
0021:0d:00.0, Recovered after 9 seconds
Breaking 0021:0e:00.0...
0021:0e:00.0, waited 0/60
0021:0e:00.0, waited 1/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 2/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 3/60
./eeh-basic.sh: 74: sleep: Input/output error
....
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 59/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 60/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, Failed to recover!
Breaking 0021:10:00.0...
Skipping 0021:10:00.0, Initial PE state is not ok
Breaking 0022:01:00.0...
Skipping 0022:01:00.0, Initial PE state is not ok
3 devices failed to recover (4 tested)
./eeh-basic.sh: 81: lspci: Input/output error
./eeh-basic.sh: 81: diff: Input/output error
./eeh-basic.sh: 82: rm: Input/output error
./eeh-basic.sh: 84: test: 3: unexpected operator
Once the driver fails to recover, the system starts acting up:
$ ls
ls: command not found
The system then drops into a read-only state; dmesg can be found in the attachment. |
[Impact]
Running this test on the P8 node entei with the Focal kernel tries to break 4 devices, one of which uses the AHCI driver:
$ sudo ./eeh-basic.sh
0000:00:00.0, Skipped: bridge
0001:00:00.0, Skipped: bridge
0020:00:00.0, Skipped: bridge
0021:00:00.0, Skipped: bridge
0021:01:00.0, Skipped: bridge
0021:02:01.0, Skipped: bridge
0021:02:08.0, Skipped: bridge
0021:02:09.0, Skipped: bridge
0021:02:0a.0, Skipped: bridge
0021:02:0b.0, Skipped: bridge
0021:02:0c.0, Skipped: bridge
0021:0d:00.0, Added
0021:0e:00.0, Added
0021:0f:00.0, Skipped: bridge
0021:10:00.0, Added
0022:00:00.0, Skipped: bridge
0022:01:00.0, Added
Found 4 breakable devices...
Breaking 0021:0d:00.0...
0021:0d:00.0, waited 0/60
0021:0d:00.0, waited 1/60
0021:0d:00.0, waited 2/60
0021:0d:00.0, waited 3/60
0021:0d:00.0, waited 4/60
0021:0d:00.0, waited 5/60
0021:0d:00.0, waited 6/60
0021:0d:00.0, waited 7/60
0021:0d:00.0, waited 8/60
0021:0d:00.0, Recovered after 9 seconds
Breaking 0021:0e:00.0...
0021:0e:00.0, waited 0/60
0021:0e:00.0, waited 1/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 2/60
./eeh-basic.sh: 74: sleep: Input/output error
....
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 59/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 60/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, Failed to recover!
Breaking 0021:10:00.0...
Skipping 0021:10:00.0, Initial PE state is not ok
Breaking 0022:01:00.0...
Skipping 0022:01:00.0, Initial PE state is not ok
3 devices failed to recover (4 tested)
./eeh-basic.sh: 81: lspci: Input/output error
./eeh-basic.sh: 81: diff: Input/output error
./eeh-basic.sh: 82: rm: Input/output error
./eeh-basic.sh: 84: test: 3: unexpected operator
Once the driver fails to recover, the system starts acting up:
$ ls
ls: command not found
The system then drops into a read-only state.
[Fixes]
* bbe9064f30f06e ("selftests/eeh: Skip ahci adapters")
This only affects Focal, and the fix can be cherry-picked.
[Test case]
Run the eeh-basic.sh script in tools/testing/selftests/powerpc/eeh/ on the affected P8 node; the test should pass without any issue.
[Where problems could occur]
This fix is limited to the PowerPC testing tool, so it should not cause any issues. |
|
2021-02-24 05:50:32 |
Po-Hsu Lin |
description |
[Impact]
Running this test on the P8 node entei with the Focal kernel tries to break 4 devices, one of which uses the AHCI driver:
$ sudo ./eeh-basic.sh
0000:00:00.0, Skipped: bridge
0001:00:00.0, Skipped: bridge
0020:00:00.0, Skipped: bridge
0021:00:00.0, Skipped: bridge
0021:01:00.0, Skipped: bridge
0021:02:01.0, Skipped: bridge
0021:02:08.0, Skipped: bridge
0021:02:09.0, Skipped: bridge
0021:02:0a.0, Skipped: bridge
0021:02:0b.0, Skipped: bridge
0021:02:0c.0, Skipped: bridge
0021:0d:00.0, Added
0021:0e:00.0, Added
0021:0f:00.0, Skipped: bridge
0021:10:00.0, Added
0022:00:00.0, Skipped: bridge
0022:01:00.0, Added
Found 4 breakable devices...
Breaking 0021:0d:00.0...
0021:0d:00.0, waited 0/60
0021:0d:00.0, waited 1/60
0021:0d:00.0, waited 2/60
0021:0d:00.0, waited 3/60
0021:0d:00.0, waited 4/60
0021:0d:00.0, waited 5/60
0021:0d:00.0, waited 6/60
0021:0d:00.0, waited 7/60
0021:0d:00.0, waited 8/60
0021:0d:00.0, Recovered after 9 seconds
Breaking 0021:0e:00.0...
0021:0e:00.0, waited 0/60
0021:0e:00.0, waited 1/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 2/60
./eeh-basic.sh: 74: sleep: Input/output error
....
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 59/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 60/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, Failed to recover!
Breaking 0021:10:00.0...
Skipping 0021:10:00.0, Initial PE state is not ok
Breaking 0022:01:00.0...
Skipping 0022:01:00.0, Initial PE state is not ok
3 devices failed to recover (4 tested)
./eeh-basic.sh: 81: lspci: Input/output error
./eeh-basic.sh: 81: diff: Input/output error
./eeh-basic.sh: 82: rm: Input/output error
./eeh-basic.sh: 84: test: 3: unexpected operator
Once the driver fails to recover, the system starts acting up:
$ ls
ls: command not found
The system then drops into a read-only state.
[Fixes]
* bbe9064f30f06e ("selftests/eeh: Skip ahci adapters")
This only affects Focal, and the fix can be cherry-picked.
[Test case]
Run the eeh-basic.sh script in tools/testing/selftests/powerpc/eeh/ on the affected P8 node; the test should pass without any issue.
[Where problems could occur]
This fix is limited to the PowerPC testing tool, so it should not cause any issues. |
[Impact]
Running this test on the P8 node entei with the Focal kernel tries to break 4 devices, one of which uses the AHCI driver, which doesn't support error recovery:
$ sudo ./eeh-basic.sh
0000:00:00.0, Skipped: bridge
0001:00:00.0, Skipped: bridge
0020:00:00.0, Skipped: bridge
0021:00:00.0, Skipped: bridge
0021:01:00.0, Skipped: bridge
0021:02:01.0, Skipped: bridge
0021:02:08.0, Skipped: bridge
0021:02:09.0, Skipped: bridge
0021:02:0a.0, Skipped: bridge
0021:02:0b.0, Skipped: bridge
0021:02:0c.0, Skipped: bridge
0021:0d:00.0, Added
0021:0e:00.0, Added
0021:0f:00.0, Skipped: bridge
0021:10:00.0, Added
0022:00:00.0, Skipped: bridge
0022:01:00.0, Added
Found 4 breakable devices...
Breaking 0021:0d:00.0...
0021:0d:00.0, waited 0/60
0021:0d:00.0, waited 1/60
0021:0d:00.0, waited 2/60
0021:0d:00.0, waited 3/60
0021:0d:00.0, waited 4/60
0021:0d:00.0, waited 5/60
0021:0d:00.0, waited 6/60
0021:0d:00.0, waited 7/60
0021:0d:00.0, waited 8/60
0021:0d:00.0, Recovered after 9 seconds
Breaking 0021:0e:00.0...
0021:0e:00.0, waited 0/60
0021:0e:00.0, waited 1/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 2/60
./eeh-basic.sh: 74: sleep: Input/output error
....
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 59/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, waited 60/60
./eeh-basic.sh: 74: sleep: Input/output error
0021:0e:00.0, Failed to recover!
Breaking 0021:10:00.0...
Skipping 0021:10:00.0, Initial PE state is not ok
Breaking 0022:01:00.0...
Skipping 0022:01:00.0, Initial PE state is not ok
3 devices failed to recover (4 tested)
./eeh-basic.sh: 81: lspci: Input/output error
./eeh-basic.sh: 81: diff: Input/output error
./eeh-basic.sh: 82: rm: Input/output error
./eeh-basic.sh: 84: test: 3: unexpected operator
Once the driver fails to recover, the system starts acting up:
$ ls
ls: command not found
The system then drops into a read-only state.
[Fixes]
* bbe9064f30f06e ("selftests/eeh: Skip ahci adapters")
This only affects Focal, and the fix can be cherry-picked.
[Test case]
Run the eeh-basic.sh script in tools/testing/selftests/powerpc/eeh/ on the affected P8 node; the test should pass without any issue.
[Where problems could occur]
This fix is limited to the PowerPC testing tool, so it should not cause any issues. |
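For illustration only, here is a minimal sketch of how a script could recognise PCI devices bound to the ahci driver and leave them out of the list of devices to break. It is not taken from commit bbe9064f30f06e, which may implement the check differently. The names SYSFS_PCI, pci_driver_name and is_ahci are hypothetical, and the sketch assumes the usual sysfs layout in which each device directory has a "driver" symlink pointing at its bound driver.

```shell
#!/bin/sh
# Sketch only: list PCI devices bound to the ahci driver so that a test
# could skip them. SYSFS_PCI is a hypothetical override, used so the
# logic can be exercised against a fake directory tree.
SYSFS_PCI="${SYSFS_PCI:-/sys/bus/pci/devices}"

# Print the name of the driver bound to device $1, or nothing if the
# device has no "driver" symlink (no driver bound).
pci_driver_name() {
    link="$SYSFS_PCI/$1/driver"
    [ -L "$link" ] || return 0
    basename "$(readlink "$link")"
}

# Succeed (exit status 0) when device $1 is bound to ahci.
is_ahci() {
    [ "$(pci_driver_name "$1")" = "ahci" ]
}

# Walk every PCI device and report the ones that would be skipped.
for path in "$SYSFS_PCI"/*; do
    [ -e "$path" ] || continue
    dev=$(basename "$path")
    if is_ahci "$dev"; then
        echo "$dev, Skipped: bound to ahci"
    fi
done
```

Checking the bound driver rather than the device class keeps the test simple: a device with no driver, or with any other driver, is left in the candidate list exactly as before.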
|
2021-02-24 07:15:12 |
Po-Hsu Lin |
ubuntu-kernel-tests: status |
New |
In Progress |
|
2021-02-24 07:15:14 |
Po-Hsu Lin |
ubuntu-kernel-tests: assignee |
|
Po-Hsu Lin (cypressyew) |
|
2021-02-24 07:15:16 |
Po-Hsu Lin |
linux (Ubuntu Focal): assignee |
|
Po-Hsu Lin (cypressyew) |
|
2021-02-24 07:15:17 |
Po-Hsu Lin |
linux (Ubuntu Focal): status |
Incomplete |
In Progress |
|
2021-02-26 02:03:27 |
Kelsey Steele |
linux (Ubuntu Focal): status |
In Progress |
Fix Committed |
|
2021-03-12 08:07:39 |
Po-Hsu Lin |
tags |
|
ubuntu-kernel-selftests |
|
2021-03-12 08:07:47 |
Po-Hsu Lin |
tags |
ubuntu-kernel-selftests |
5.4 focal ppc64el ubuntu-kernel-selftests |
|
2021-03-26 04:49:02 |
Ubuntu Kernel Bot |
tags |
5.4 focal ppc64el ubuntu-kernel-selftests |
5.4 focal ppc64el ubuntu-kernel-selftests verification-needed-focal |
|
2021-03-29 15:20:19 |
Po-Hsu Lin |
tags |
5.4 focal ppc64el ubuntu-kernel-selftests verification-needed-focal |
5.4 focal ppc64el ubuntu-kernel-selftests verification-done-focal |
|
2021-04-12 13:53:40 |
Launchpad Janitor |
linux (Ubuntu Focal): status |
Fix Committed |
Fix Released |
|
2021-04-15 06:55:18 |
Po-Hsu Lin |
ubuntu-kernel-tests: status |
In Progress |
Fix Released |
|