2019-01-07 12:47:25 |
Guilherme G. Piccoli |
bug |
|
|
added bug |
2019-01-07 12:49:02 |
Guilherme G. Piccoli |
nominated for series |
|
Ubuntu Disco |
|
2019-01-07 12:49:02 |
Guilherme G. Piccoli |
nominated for series |
|
Ubuntu Xenial |
|
2019-01-07 12:49:02 |
Guilherme G. Piccoli |
nominated for series |
|
Ubuntu Cosmic |
|
2019-01-07 12:49:02 |
Guilherme G. Piccoli |
nominated for series |
|
Ubuntu Bionic |
|
2019-01-07 12:52:43 |
Guilherme G. Piccoli |
attachment added |
|
dmesg snippet showing the error https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1810781/+attachment/5227414/+files/dmesg-error |
|
2019-01-07 12:54:50 |
Dan Streetman |
bug task added |
|
linux (Ubuntu Bionic) |
|
2019-01-07 12:54:54 |
Dan Streetman |
bug task added |
|
linux (Ubuntu Cosmic) |
|
2019-01-07 12:54:59 |
Dan Streetman |
bug task added |
|
linux (Ubuntu Disco) |
|
2019-01-07 12:55:03 |
Dan Streetman |
bug task added |
|
linux (Ubuntu Xenial) |
|
2019-01-07 12:58:05 |
Guilherme G. Piccoli |
linux (Ubuntu Bionic): importance |
Undecided |
Critical |
|
2019-01-07 12:58:09 |
Guilherme G. Piccoli |
linux (Ubuntu Cosmic): importance |
Undecided |
Critical |
|
2019-01-07 12:58:11 |
Guilherme G. Piccoli |
linux (Ubuntu Xenial): importance |
Undecided |
Critical |
|
2019-01-07 12:58:12 |
Guilherme G. Piccoli |
linux (Ubuntu Xenial): status |
New |
Confirmed |
|
2019-01-07 12:58:14 |
Guilherme G. Piccoli |
linux (Ubuntu Bionic): status |
New |
Confirmed |
|
2019-01-07 12:58:17 |
Guilherme G. Piccoli |
linux (Ubuntu Cosmic): status |
New |
Confirmed |
|
2019-01-07 12:58:28 |
Guilherme G. Piccoli |
linux (Ubuntu Disco): status |
Confirmed |
Fix Released |
|
2019-01-07 12:58:33 |
Guilherme G. Piccoli |
linux (Ubuntu Cosmic): assignee |
|
Guilherme G. Piccoli (gpiccoli) |
|
2019-01-07 12:58:34 |
Guilherme G. Piccoli |
linux (Ubuntu Bionic): assignee |
|
Guilherme G. Piccoli (gpiccoli) |
|
2019-01-07 12:58:36 |
Guilherme G. Piccoli |
linux (Ubuntu Xenial): assignee |
|
Guilherme G. Piccoli (gpiccoli) |
|
2019-01-07 12:59:21 |
Guilherme G. Piccoli |
bug |
|
|
added subscriber Mauricio Faria de Oliveira |
2019-01-07 14:24:22 |
Guilherme G. Piccoli |
linux (Ubuntu Xenial): status |
Confirmed |
Won't Fix |
|
2019-01-07 14:24:32 |
Guilherme G. Piccoli |
linux (Ubuntu Xenial): importance |
Critical |
Medium |
|
2019-01-07 14:24:43 |
Guilherme G. Piccoli |
linux (Ubuntu Disco): importance |
Critical |
Medium |
|
2019-01-07 14:25:08 |
Guilherme G. Piccoli |
linux (Ubuntu Xenial): assignee |
Guilherme G. Piccoli (gpiccoli) |
Mauricio Faria de Oliveira (mfo) |
|
2019-01-07 14:25:16 |
Guilherme G. Piccoli |
linux (Ubuntu Bionic): assignee |
Guilherme G. Piccoli (gpiccoli) |
Mauricio Faria de Oliveira (mfo) |
|
2019-01-07 14:25:24 |
Guilherme G. Piccoli |
linux (Ubuntu Cosmic): assignee |
Guilherme G. Piccoli (gpiccoli) |
Mauricio Faria de Oliveira (mfo) |
|
2019-01-07 14:25:31 |
Guilherme G. Piccoli |
linux (Ubuntu Disco): assignee |
Guilherme G. Piccoli (gpiccoli) |
Mauricio Faria de Oliveira (mfo) |
|
2019-01-07 15:07:17 |
Mauricio Faria de Oliveira |
description |
[Impact]
* The mpt3sas driver relies on a FW queue (Reply Post Descriptor Queue) in the I/O completion path; there is an MMIO register, called Reply Post Host Index, that the driver uses to flag an empty entry in this queue. This value is updated during the driver interrupt routine [in the _base_interrupt() function].
* There are 2 registers representing the Reply Post Host Index, depending on the type of adapter. They are differentiated in the driver through the "ioc->combined_reply_queue" check. Per the MPI specification (vendor spec), the driver should use the combined reply queue according to the maximum number of MSI-X vectors that the adapter exposes and the spec version (SAS 3.0 vs SAS 3.5).
* Currently, this check is wrong for a class of adapters, which was fixed in the upstream
kernel commit 2b48be65685a [0]. Without this commit, we can observe spontaneous resets in the
driver due to queue overflow (the FW is not aware that there are free entries in the Reply Post Descriptor Queue). The dmesg log will show the following output when this error occurs:
mpt3sas_cm0: fault_state(0x2100)!
mpt3sas_cm0: sending diag reset !!
mpt3sas_cm0: diag reset: SUCCESS
[followed by many driver messages as a result of the reset procedure]
* During these resets, I/O is stalled, so performance may be affected.
[Test Case]
* It's not trivial to test the problem, but given a machine with an affected device, an I/O benchmark like FIO could be used to exercise the I/O path heavily and trigger the issue. We have reports that the adapter "LSI Logic / Symbios Logic Device [1000:00ac]" is affected by the issue.
[Regression Potential]
* This is a long-standing issue in the mpt3sas driver, affecting only a class of adapters from this vendor. Since it's a clear bug, the fix is necessary. The potential for regressions is unknown, but likely low: the fix changes the register used for the index updates based on a set of adapter characteristics (according to the spec), which further restricts the scope of this patch. |
[Impact]
* Adapter resets periodically during high-load activity.
* I/O stalls until reset/reinit is complete (latency) and I/O performance
degrades across cluster (e.g., low throughput from data spread over nodes).
* The mpt3sas driver relies on a FW queue (Reply Post Descriptor Queue) in the I/O completion path; there is an MMIO register, called Reply Post Host Index, that the driver uses to flag an empty entry in this queue. This value is updated during the driver interrupt routine [in the _base_interrupt() function].
* There are 2 registers representing the Reply Post Host Index, depending on the type of adapter. They are differentiated in the driver through the "ioc->combined_reply_queue" check. Per the MPI specification (vendor spec), the driver should use the combined reply queue according to the maximum number of MSI-X vectors that the adapter exposes and the spec version (SAS 3.0 vs SAS 3.5).
* Currently, this check is wrong for a class of adapters, which was fixed in the upstream
kernel commit 2b48be65685a [0]. Without this commit, we can observe spontaneous resets in the
driver due to queue overflow (the FW is not aware that there are free entries in the Reply Post Descriptor Queue). The dmesg log will show the following output when this error occurs:
mpt3sas_cm0: fault_state(0x2100)!
mpt3sas_cm0: sending diag reset !!
mpt3sas_cm0: diag reset: SUCCESS
[followed by many driver messages as a result of the reset procedure]
* During these resets, I/O is stalled, so performance may be affected.
[Test Case]
* It's not trivial to test the problem, but given a machine with an affected device, an I/O benchmark like FIO could be used to exercise the I/O path heavily and trigger the issue. We have reports that the adapter "LSI Logic / Symbios Logic Device [1000:00ac]" is affected by the issue.
[Regression Potential]
* This is a long-standing issue in the mpt3sas driver, affecting only a class of adapters from this vendor. Since it's a clear bug, the fix is necessary. The potential for regressions is unknown, but likely low: the fix changes the register used for the index updates based on a set of adapter characteristics (according to the spec), which further restricts the scope of this patch. |
|
2019-01-07 15:16:25 |
Mauricio Faria de Oliveira |
description |
[Impact]
* Adapter resets periodically during high-load activity.
* I/O stalls until reset/reinit is complete (latency) and I/O performance
degrades across cluster (e.g., low throughput from data spread over nodes).
* The mpt3sas driver relies on a FW queue (Reply Post Descriptor Queue) in the I/O completion path; there is an MMIO register, called Reply Post Host Index, that the driver uses to flag an empty entry in this queue. This value is updated during the driver interrupt routine [in the _base_interrupt() function].
* There are 2 registers representing the Reply Post Host Index, depending on the type of adapter. They are differentiated in the driver through the "ioc->combined_reply_queue" check. Per the MPI specification (vendor spec), the driver should use the combined reply queue according to the maximum number of MSI-X vectors that the adapter exposes and the spec version (SAS 3.0 vs SAS 3.5).
* Currently, this check is wrong for a class of adapters, which was fixed in the upstream
kernel commit 2b48be65685a [0]. Without this commit, we can observe spontaneous resets in the
driver due to queue overflow (the FW is not aware that there are free entries in the Reply Post Descriptor Queue). The dmesg log will show the following output when this error occurs:
mpt3sas_cm0: fault_state(0x2100)!
mpt3sas_cm0: sending diag reset !!
mpt3sas_cm0: diag reset: SUCCESS
[followed by many driver messages as a result of the reset procedure]
* During these resets, I/O is stalled, so performance may be affected.
[Test Case]
* It's not trivial to test the problem, but given a machine with an affected device, an I/O benchmark like FIO could be used to exercise the I/O path heavily and trigger the issue. We have reports that the adapter "LSI Logic / Symbios Logic Device [1000:00ac]" is affected by the issue.
[Regression Potential]
* This is a long-standing issue in the mpt3sas driver, affecting only a class of adapters from this vendor. Since it's a clear bug, the fix is necessary. The potential for regressions is unknown, but likely low: the fix changes the register used for the index updates based on a set of adapter characteristics (according to the spec), which further restricts the scope of this patch. |
[Impact]
* Adapter resets periodically during high-load activity.
* I/O stalls until reset/reinit is complete (latency) and I/O performance
degrades across cluster (e.g., low throughput from data spread over nodes).
* The mpt3sas driver relies on a FW queue (Reply Post Descriptor Queue) in the I/O completion path; there is an MMIO register, called Reply Post Host Index, that the driver uses to flag an empty entry in this queue. This value is updated during the driver interrupt routine [in the _base_interrupt() function].
* There are 2 registers representing the Reply Post Host Index, depending on the type of adapter. They are differentiated in the driver through the "ioc->combined_reply_queue" check. Per the MPI specification (vendor spec), the driver should use the combined reply queue according to the maximum number of MSI-X vectors that the adapter exposes and the spec version (SAS 3.0 vs SAS 3.5).
* Currently, this check is wrong for a class of adapters, which was fixed in the upstream
kernel commit 2b48be65685a [0]. Without this commit, we can observe spontaneous resets in the
driver due to queue overflow (the FW is not aware that there are free entries in the Reply Post Descriptor Queue). The dmesg log will show the following output when this error occurs:
mpt3sas_cm0: fault_state(0x2100)!
mpt3sas_cm0: sending diag reset !!
mpt3sas_cm0: diag reset: SUCCESS
[followed by many driver messages as a result of the reset procedure]
* During these resets, I/O is stalled, so performance may be affected.
[Test Case]
* It's not trivial to test the problem, but given a machine with an affected device, an I/O benchmark like FIO could be used to exercise the I/O path heavily and trigger the issue.
* We have reports that the adapter "LSI Logic / Symbios Logic Device [1000:00ac]" is affected by the issue, and that this commit resolved the problem.
[Regression Potential]
* This is a long-standing issue in the mpt3sas driver, affecting only a class of adapters from this vendor. Since it's clearly a bug, the fix is necessary. The potential for regressions is unknown, but likely low: the fix changes the register used for the index updates based on a set of adapter characteristics (according to the spec), which further restricts the scope of this patch. |
|
2019-01-08 00:38:46 |
Khaled El Mously |
linux (Ubuntu Bionic): status |
Confirmed |
Fix Committed |
|
2019-01-08 00:38:50 |
Khaled El Mously |
linux (Ubuntu Cosmic): status |
Confirmed |
Fix Committed |
|
2019-01-08 01:00:00 |
Dominique Poulain |
bug |
|
|
added subscriber Dominique Poulain |
2019-01-15 10:33:23 |
Brad Figg |
tags |
sts |
sts verification-needed-cosmic |
|
2019-01-15 10:37:13 |
Brad Figg |
tags |
sts verification-needed-cosmic |
sts verification-needed-bionic verification-needed-cosmic |
|
2019-01-17 15:02:41 |
Mauricio Faria de Oliveira |
tags |
sts verification-needed-bionic verification-needed-cosmic |
sts verification-done-cosmic verification-needed-bionic |
|
2019-01-18 16:17:11 |
Mauricio Faria de Oliveira |
tags |
sts verification-done-cosmic verification-needed-bionic |
sts verification-done-bionic verification-done-cosmic |
|
2019-01-28 17:12:01 |
Launchpad Janitor |
linux (Ubuntu Bionic): status |
Fix Committed |
Fix Released |
|
2019-01-28 17:12:01 |
Launchpad Janitor |
cve linked |
|
2018-14625 |
|
2019-01-28 17:12:01 |
Launchpad Janitor |
cve linked |
|
2018-16882 |
|
2019-01-28 17:12:01 |
Launchpad Janitor |
cve linked |
|
2018-17972 |
|
2019-01-28 17:12:01 |
Launchpad Janitor |
cve linked |
|
2018-18281 |
|
2019-01-28 17:12:01 |
Launchpad Janitor |
cve linked |
|
2018-19407 |
|
2019-02-04 08:48:45 |
Launchpad Janitor |
linux (Ubuntu Cosmic): status |
Fix Committed |
Fix Released |
|
2019-03-15 21:09:50 |
Tomasz |
bug |
|
|
added subscriber Tomasz |
2023-04-14 13:19:12 |
Junien F |
bug |
|
|
added subscriber The Canonical Sysadmins |