Activity log for bug #1810781

Date Who What changed Old value New value Message
2019-01-07 12:47:25 Guilherme G. Piccoli bug added bug
2019-01-07 12:49:02 Guilherme G. Piccoli nominated for series Ubuntu Disco
2019-01-07 12:49:02 Guilherme G. Piccoli nominated for series Ubuntu Xenial
2019-01-07 12:49:02 Guilherme G. Piccoli nominated for series Ubuntu Cosmic
2019-01-07 12:49:02 Guilherme G. Piccoli nominated for series Ubuntu Bionic
2019-01-07 12:52:43 Guilherme G. Piccoli attachment added dmesg snippet showing the error https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1810781/+attachment/5227414/+files/dmesg-error
2019-01-07 12:54:50 Dan Streetman bug task added linux (Ubuntu Bionic)
2019-01-07 12:54:54 Dan Streetman bug task added linux (Ubuntu Cosmic)
2019-01-07 12:54:59 Dan Streetman bug task added linux (Ubuntu Disco)
2019-01-07 12:55:03 Dan Streetman bug task added linux (Ubuntu Xenial)
2019-01-07 12:58:05 Guilherme G. Piccoli linux (Ubuntu Bionic): importance Undecided Critical
2019-01-07 12:58:09 Guilherme G. Piccoli linux (Ubuntu Cosmic): importance Undecided Critical
2019-01-07 12:58:11 Guilherme G. Piccoli linux (Ubuntu Xenial): importance Undecided Critical
2019-01-07 12:58:12 Guilherme G. Piccoli linux (Ubuntu Xenial): status New Confirmed
2019-01-07 12:58:14 Guilherme G. Piccoli linux (Ubuntu Bionic): status New Confirmed
2019-01-07 12:58:17 Guilherme G. Piccoli linux (Ubuntu Cosmic): status New Confirmed
2019-01-07 12:58:28 Guilherme G. Piccoli linux (Ubuntu Disco): status Confirmed Fix Released
2019-01-07 12:58:33 Guilherme G. Piccoli linux (Ubuntu Cosmic): assignee Guilherme G. Piccoli (gpiccoli)
2019-01-07 12:58:34 Guilherme G. Piccoli linux (Ubuntu Bionic): assignee Guilherme G. Piccoli (gpiccoli)
2019-01-07 12:58:36 Guilherme G. Piccoli linux (Ubuntu Xenial): assignee Guilherme G. Piccoli (gpiccoli)
2019-01-07 12:59:21 Guilherme G. Piccoli bug added subscriber Mauricio Faria de Oliveira
2019-01-07 14:24:22 Guilherme G. Piccoli linux (Ubuntu Xenial): status Confirmed Won't Fix
2019-01-07 14:24:32 Guilherme G. Piccoli linux (Ubuntu Xenial): importance Critical Medium
2019-01-07 14:24:43 Guilherme G. Piccoli linux (Ubuntu Disco): importance Critical Medium
2019-01-07 14:25:08 Guilherme G. Piccoli linux (Ubuntu Xenial): assignee Guilherme G. Piccoli (gpiccoli) Mauricio Faria de Oliveira (mfo)
2019-01-07 14:25:16 Guilherme G. Piccoli linux (Ubuntu Bionic): assignee Guilherme G. Piccoli (gpiccoli) Mauricio Faria de Oliveira (mfo)
2019-01-07 14:25:24 Guilherme G. Piccoli linux (Ubuntu Cosmic): assignee Guilherme G. Piccoli (gpiccoli) Mauricio Faria de Oliveira (mfo)
2019-01-07 14:25:31 Guilherme G. Piccoli linux (Ubuntu Disco): assignee Guilherme G. Piccoli (gpiccoli) Mauricio Faria de Oliveira (mfo)
2019-01-07 15:07:17 Mauricio Faria de Oliveira description
Old value:
[Impact]
 * The mpt3sas driver relies in a FW queue (Reply Post Descriptor Queue) in the I/O completion path; there's a MMIO register that driver uses to flag an empty entry in such queue, called Reply Post Host Index. This value is updated during the driver interrupt routine [in _base_interrupt() function].
 * Happens that there are 2 registers representing the Reply Post Host Index according to the type of the adapter. They are differentiated in the driver through the "ioc->combined_reply_queue" check. By the MPI specification (vendor spec), driver should use this combined reply queue according to the number of maximum MSI-X vectors that the adapter exposes and the spec version (SAS 3.0 vs SAS 3.5).
 * Currently, this is wrong checked for a class of adapters, which was fixed in the upstream kernel commit 2b48be65685a [0]. Without this commit, we can observe spontaneous resets in the driver due to queue overflow (FW is not aware that there are free entries in the Reply Post Descriptor Queue). The dmesg log will show the following output in case of this error:
   mpt3sas_cm0: fault_state(0x2100)!
   mpt3sas_cm0: sending diag reset !!
   mpt3sas_cm0: diag reset: SUCCESS
   [followed by a lot of driver messages as result of the reset procedure]
 * During these resets, I/O is stalled so it may affect performance.
[Test Case]
 * It's not trivial to test the problem, but given a machine with an affected device, an I/O benchmark like FIO could be used to exercise the I/O path in a heavy way and trigger the issue. We have reports that the adapter "LSI Logic / Symbios Logic Device [1000:00ac]" is affected by the issue.
[Regression Potential]
 * This is a long-term issue from the mpt3sas driver, affecting only a class of adapters of this vendor. Since it's a clear bug, the fix is necessary. The potential of regressions is unknown, but likely low - it changes the register used for the index updates given some set of characteristics of the adapter (according to the spec.), which restricts even more the scope of this patch.
New value:
[Impact]
 * Adapter resets periodically during high-load activity.
 * I/O stalls until reset/reinit is complete (latency) and I/O performance degrades across cluster (e.g., low throughput from data spread over nodes).
 * The mpt3sas driver relies in a FW queue (Reply Post Descriptor Queue) in the I/O completion path; there's a MMIO register that driver uses to flag an empty entry in such queue, called Reply Post Host Index. This value is updated during the driver interrupt routine [in _base_interrupt() function].
 * Happens that there are 2 registers representing the Reply Post Host Index according to the type of the adapter. They are differentiated in the driver through the "ioc->combined_reply_queue" check. By the MPI specification (vendor spec), driver should use this combined reply queue according to the number of maximum MSI-X vectors that the adapter exposes and the spec version (SAS 3.0 vs SAS 3.5).
 * Currently, this is wrong checked for a class of adapters, which was fixed in the upstream kernel commit 2b48be65685a [0]. Without this commit, we can observe spontaneous resets in the driver due to queue overflow (FW is not aware that there are free entries in the Reply Post Descriptor Queue). The dmesg log will show the following output in case of this error:
   mpt3sas_cm0: fault_state(0x2100)!
   mpt3sas_cm0: sending diag reset !!
   mpt3sas_cm0: diag reset: SUCCESS
   [followed by a lot of driver messages as result of the reset procedure]
 * During these resets, I/O is stalled so it may affect performance.
[Test Case]
 * It's not trivial to test the problem, but given a machine with an affected device, an I/O benchmark like FIO could be used to exercise the I/O path in a heavy way and trigger the issue. We have reports that the adapter "LSI Logic / Symbios Logic Device [1000:00ac]" is affected by the issue.
[Regression Potential]
 * This is a long-term issue from the mpt3sas driver, affecting only a class of adapters of this vendor. Since it's a clear bug, the fix is necessary. The potential of regressions is unknown, but likely low - it changes the register used for the index updates given some set of characteristics of the adapter (according to the spec.), which restricts even more the scope of this patch.
2019-01-07 15:16:25 Mauricio Faria de Oliveira description
Old value:
[Impact]
 * Adapter resets periodically during high-load activity.
 * I/O stalls until reset/reinit is complete (latency) and I/O performance degrades across cluster (e.g., low throughput from data spread over nodes).
 * The mpt3sas driver relies in a FW queue (Reply Post Descriptor Queue) in the I/O completion path; there's a MMIO register that driver uses to flag an empty entry in such queue, called Reply Post Host Index. This value is updated during the driver interrupt routine [in _base_interrupt() function].
 * Happens that there are 2 registers representing the Reply Post Host Index according to the type of the adapter. They are differentiated in the driver through the "ioc->combined_reply_queue" check. By the MPI specification (vendor spec), driver should use this combined reply queue according to the number of maximum MSI-X vectors that the adapter exposes and the spec version (SAS 3.0 vs SAS 3.5).
 * Currently, this is wrong checked for a class of adapters, which was fixed in the upstream kernel commit 2b48be65685a [0]. Without this commit, we can observe spontaneous resets in the driver due to queue overflow (FW is not aware that there are free entries in the Reply Post Descriptor Queue). The dmesg log will show the following output in case of this error:
   mpt3sas_cm0: fault_state(0x2100)!
   mpt3sas_cm0: sending diag reset !!
   mpt3sas_cm0: diag reset: SUCCESS
   [followed by a lot of driver messages as result of the reset procedure]
 * During these resets, I/O is stalled so it may affect performance.
[Test Case]
 * It's not trivial to test the problem, but given a machine with an affected device, an I/O benchmark like FIO could be used to exercise the I/O path in a heavy way and trigger the issue. We have reports that the adapter "LSI Logic / Symbios Logic Device [1000:00ac]" is affected by the issue.
[Regression Potential]
 * This is a long-term issue from the mpt3sas driver, affecting only a class of adapters of this vendor. Since it's a clear bug, the fix is necessary. The potential of regressions is unknown, but likely low - it changes the register used for the index updates given some set of characteristics of the adapter (according to the spec.), which restricts even more the scope of this patch.
New value:
[Impact]
 * Adapter resets periodically during high-load activity.
 * I/O stalls until reset/reinit is complete (latency) and I/O performance degrades across cluster (e.g., low throughput from data spread over nodes).
 * The mpt3sas driver relies in a FW queue (Reply Post Descriptor Queue) in the I/O completion path; there's a MMIO register that driver uses to flag an empty entry in such queue, called Reply Post Host Index. This value is updated during the driver interrupt routine [in _base_interrupt() function].
 * Happens that there are 2 registers representing the Reply Post Host Index according to the type of the adapter. They are differentiated in the driver through the "ioc->combined_reply_queue" check. By the MPI specification (vendor spec), driver should use this combined reply queue according to the number of maximum MSI-X vectors that the adapter exposes and the spec version (SAS 3.0 vs SAS 3.5).
 * Currently, this is wrong checked for a class of adapters, which was fixed in the upstream kernel commit 2b48be65685a [0]. Without this commit, we can observe spontaneous resets in the driver due to queue overflow (FW is not aware that there are free entries in the Reply Post Descriptor Queue). The dmesg log will show the following output in case of this error:
   mpt3sas_cm0: fault_state(0x2100)!
   mpt3sas_cm0: sending diag reset !!
   mpt3sas_cm0: diag reset: SUCCESS
   [followed by a lot of driver messages as result of the reset procedure]
 * During these resets, I/O is stalled so it may affect performance.
[Test Case]
 * It's not trivial to test the problem, but given a machine with an affected device, an I/O benchmark like FIO could be used to exercise the I/O path in a heavy way and trigger the issue.
 * We have reports that the adapter "LSI Logic / Symbios Logic Device [1000:00ac]" is affected by the issue. And this commit resolved the problem.
[Regression Potential]
 * This is a long-term issue from the mpt3sas driver, affecting only a class of adapters of this vendor. Since it's a clearly bug, the fix is necessary. The potential of regressions is unknown, but likely low - it changes the register used for the index updates given some set of characteristics of the adapter (according to the spec.), which restricts even more the scope of this patch.
2019-01-08 00:38:46 Khaled El Mously linux (Ubuntu Bionic): status Confirmed Fix Committed
2019-01-08 00:38:50 Khaled El Mously linux (Ubuntu Cosmic): status Confirmed Fix Committed
2019-01-08 01:00:00 Dominique Poulain bug added subscriber Dominique Poulain
2019-01-15 10:33:23 Brad Figg tags sts sts verification-needed-cosmic
2019-01-15 10:37:13 Brad Figg tags sts verification-needed-cosmic sts verification-needed-bionic verification-needed-cosmic
2019-01-17 15:02:41 Mauricio Faria de Oliveira tags sts verification-needed-bionic verification-needed-cosmic sts verification-done-cosmic verification-needed-bionic
2019-01-18 16:17:11 Mauricio Faria de Oliveira tags sts verification-done-cosmic verification-needed-bionic sts verification-done-bionic verification-done-cosmic
2019-01-28 17:12:01 Launchpad Janitor linux (Ubuntu Bionic): status Fix Committed Fix Released
2019-01-28 17:12:01 Launchpad Janitor cve linked 2018-14625
2019-01-28 17:12:01 Launchpad Janitor cve linked 2018-16882
2019-01-28 17:12:01 Launchpad Janitor cve linked 2018-17972
2019-01-28 17:12:01 Launchpad Janitor cve linked 2018-18281
2019-01-28 17:12:01 Launchpad Janitor cve linked 2018-19407
2019-02-04 08:48:45 Launchpad Janitor linux (Ubuntu Cosmic): status Fix Committed Fix Released
2019-03-15 21:09:50 Tomasz bug added subscriber Tomasz
2023-04-14 13:19:12 Junien F bug added subscriber The Canonical Sysadmins