Update mpt3sas Driver to 38.100.00.00 for Ubuntu 21.10 and 20.04

Bug #1935034 reported by Suganath Prabu
This bug affects 2 people
Affects         Status        Importance  Assigned to  Milestone
linux (Ubuntu)  Fix Released  Undecided   Unassigned
Focal           New           Undecided   Unassigned
Hirsute         Won't Fix     Undecided   Unassigned
Impish          Fix Released  Undecided   Unassigned

Bug Description

[Impact]
This is a feature request to update the mpt3sas driver to the latest upstream version, 38.100.00.00, in Ubuntu 21.10 and Ubuntu 20.04.
This will allow users to use fixes and enhancements that landed upstream in 5.14. For this reason,
Broadcom and Dell have requested that these patches be pulled to update the
mpt3sas driver to the current upstream version.

Both Ubuntu 21.10 and 20.04 HWE currently ship mpt3sas driver v36.100.00.00. The following commit IDs are needed to update to 38.100.00.00:

[Fixes]
8b3c803529 scsi: mpt3sas: Signedness bug in _base_get_diag_triggers()
39718fe7ad scsi: mpt3sas: Fix spelling mistake in Kconfig "compatiblity" -> "compatibility"
bfb3f00c06 scsi: mpt3sas: Simplify bool comparison
d309ae0732 scsi: mpt3sas: Fix ReplyPostFree pool allocation
664f0dce20 scsi: mpt3sas: Add support for shared host tagset for CPU hotplug
688c1a0a13 scsi: mpt3sas: Additional diagnostic buffer query interface
446b5f3d3f scsi: mpt3sas: Update driver version to 37.100.00.00

8278807abd scsi: core: Add scsi_device_busy() wrapper
020b0f0a31 scsi: core: Replace sdev->device_busy with sbitmap

e015e0ded1 scsi: mpt3sas: Fix misspelling of _base_put_smid_default_atomic()
2111ba8781 scsi: mpt3sas: Move a little data from the stack onto the heap
cf9e575e62 scsi: mpt3sas: Fix a bunch of potential naming doc-rot
54cb88dc30 scsi: mpt3sas: Fix a couple of misdocumented functions/params
782a1ab33f scsi: mpt3sas: Fix some kernel-doc misnaming issues
a50bd64616 scsi: mpt3sas: Do not use GFP_KERNEL in atomic context
a1c4d77413 scsi: mpt3sas: Replace unnecessary dynamic allocation with a static one
d6adc251dd scsi: mpt3sas: Force PCIe scatterlist allocations to be within same 4 GB region
7dd847dae1 scsi: mpt3sas: Force chain buffer allocations to be within same 4 GB region
970ac2bb70 scsi: mpt3sas: Force sense buffer allocations to be within same 4 GB region
58501fd937 scsi: mpt3sas: Force reply buffer allocations to be within same 4 GB region
2e4e858732 scsi: mpt3sas: Force reply post buffer allocations to be within same 4 GB region
c569de899b scsi: mpt3sas: Force reply post array allocations to be within same 4 GB region
37067b9793 scsi: mpt3sas: Update driver version to 37.101.00.00

a8d548b0b3 scsi: mpt3sas: Fix a few kernel-doc issues
3401ecf7fc scsi: mpt3sas: Fix error return code of mpt3sas_base_attach()

206a3afa94 scsi: mpt3sas: Fix a typo
4c51f95696 scsi: mpt3sas: Only one vSES is present even when IOC has multi vSES
c0629d70ca scsi: mpt3sas: Fix endianness for ActiveCablePowerRequirement
3c8604691d scsi: mpt3sas: Block PCI config access from userspace during reset
16660db3fc scsi: mpt3sas: Fix out-of-bounds warnings in _ctl_addnl_diag_query
3ad0b1da0d scsi: mpt3sas: Fix two kernel-doc headers
2910a4a9e9 scsi: mpt3sas: Documentation cleanup
e2fac6c44a scsi: mpt3sas: Fix deadlock while cancelling the running firmware event
19a622c39a scsi: mpt3sas: Handle firmware faults during first half of IOC init
a0815c45c8 scsi: mpt3sas: Handle firmware faults during second half of IOC init

f2b1e9c6f8 scsi: core: Introduce scsi_build_sense()

84a84cc6af scsi: mpt3sas: Fix fall-through warnings for Clang
cf750be8e6 scsi: mpt3sas: Fix Coverity reported issue
d6c2ce435f scsi: mpt3sas: Fix error return value in _scsih_expander_add()

Note:
We have already posted a patch to update the driver version to 38.100.00.00.
The other patches are already available upstream.

[Testing]
Load the driver
Run IO

[Regression Risk]
No extra test cases are needed; normal regression test cases such as loading the driver and running I/O are enough.

[Other Info]

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu):
status: New → Confirmed
Tim Gardner (timg-tpi) wrote :

Suganath Prabu - You appear to be missing some scaffolding patches. At least 1ea18ac2e5f703e5648f663412ff6abab2f41def ("scsi: sbitmap: Export sbitmap_weight") and c548e62bcf6adc7066ff201e9ecc88e536dd8890 ("scsi: sbitmap: Move allocation hint into sbitmap").

Perhaps you should attempt to apply this list of commits yourself, then submit a pull request.

Michael Reed (mreed8855) wrote :

If patches are missing, can you add them please? Currently this patch set does not cherry-pick cleanly.

Michael Reed (mreed8855) wrote :

Also, when a driver is updated to a specific version, there is typically a patch that updates the version string. This patch set has an example for 37.101.00.00, "scsi: mpt3sas: Update driver version to 37.101.00.00", but I do not see one for 38.100.00.00, nor do I see it in Linus' tree when I try to pull in the patches. Can you verify that all the patches needed are included in this listing?

Suganath Prabu (suganath) wrote :

Hi Michael,

I'll check and update.

Thanks,
Suganath

Suganath Prabu (suganath) wrote :

Hi Michael/Tim,

The patch to update the version to 38.100.00.00 was posted some time ago but has not yet been accepted.
We re-sent it today. This patch contains only the version update.
https://marc.info/?l=linux-scsi&m=162797310627327&w=2
---
 drivers/scsi/mpt3sas/mpt3sas_base.h | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 0c6c3df0038d..ec0be3e80561 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -77,9 +77,9 @@
 #define MPT3SAS_DRIVER_NAME "mpt3sas"
 #define MPT3SAS_AUTHOR "Avago Technologies <email address hidden>"
 #define MPT3SAS_DESCRIPTION "LSI MPT Fusion SAS 3.0 Device Driver"
-#define MPT3SAS_DRIVER_VERSION "37.101.00.00"
-#define MPT3SAS_MAJOR_VERSION 37
-#define MPT3SAS_MINOR_VERSION 101
+#define MPT3SAS_DRIVER_VERSION "38.100.00.00"
+#define MPT3SAS_MAJOR_VERSION 38
+#define MPT3SAS_MINOR_VERSION 100
 #define MPT3SAS_BUILD_VERSION 0
 #define MPT3SAS_RELEASE_VERSION 00
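
A version string in the format set by the macros above can be checked programmatically. The following is a minimal userspace sketch, not driver code: the `drv_version` struct and both function names are hypothetical, and obtaining the string from the loaded module is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical container for the four dotted fields of the version string. */
struct drv_version {
    unsigned int major;
    unsigned int minor;
    unsigned int build;
    unsigned int release;
};

/* Parse "MAJOR.MINOR.BUILD.RELEASE"; returns 0 on success, -1 on bad input.
 * *out is written only when all four fields were parsed. */
int parse_driver_version(const char *s, struct drv_version *out)
{
    unsigned int a, b, c, d;

    assert(s != NULL && out != NULL);
    if (sscanf(s, "%u.%u.%u.%u", &a, &b, &c, &d) != 4)
        return -1;
    out->major = a;
    out->minor = b;
    out->build = c;
    out->release = d;
    return 0;
}

/* True if v is at least major.minor (build and release are ignored). */
bool version_at_least(const struct drv_version *v,
                      unsigned int major, unsigned int minor)
{
    return v->major > major || (v->major == major && v->minor >= minor);
}
```

Parsing "38.100.00.00" and then calling `version_at_least(&v, 38, 100)` would confirm that a kernel carries the requested update, while "37.101.00.00" would fail the same check.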

Below is the updated list of patches. It looks like some fixes have already been pulled into the Ubuntu 5.11 git tree, so I removed those from the previous list.
==================================================================================================
bfb3f00c06 scsi: mpt3sas: Simplify bool comparison
d309ae0732 scsi: mpt3sas: Fix ReplyPostFree pool allocation
664f0dce20 scsi: mpt3sas: Add support for shared host tagset for CPU hotplug
688c1a0a13 scsi: mpt3sas: Additional diagnostic buffer query interface
446b5f3d3f scsi: mpt3sas: Update driver version to 37.100.00.00
8278807abd scsi: core: Add scsi_device_busy() wrapper
020b0f0a31 scsi: core: Replace sdev->device_busy with sbitmap
e015e0ded1 scsi: mpt3sas: Fix misspelling of _base_put_smid_default_atomic()
2111ba8781 scsi: mpt3sas: Move a little data from the stack onto the heap
cf9e575e62 scsi: mpt3sas: Fix a bunch of potential naming doc-rot
54cb88dc30 scsi: mpt3sas: Fix a couple of misdocumented functions/params
782a1ab33f scsi: mpt3sas: Fix some kernel-doc misnaming issues
a1c4d77413 scsi: mpt3sas: Replace unnecessary dynamic allocation with a static one
d6adc251dd scsi: mpt3sas: Force PCIe scatterlist allocations to be within same 4 GB region
7dd847dae1 scsi: mpt3sas: Force chain buffer allocations to be within same 4 GB region
970ac2bb70 scsi: mpt3sas: Force sense buffer allocations to be within same 4 GB region
58501fd937 scsi: mpt3sas: Force reply buffer allocations to be within same 4 GB region
2e4e858732 scsi: mpt3sas: Force reply post buffer allocations to be within same 4 GB region
c569de899b scsi: mpt3sas: Force reply post array allocations to be within same 4 GB region
37067b9793 scsi: mpt3sas: Update driver version to 37.101.00.00
a8d548b0b3 scsi: mpt3sas: Fix a few kernel-doc issues
206a3afa94 scsi: mpt3sas: Fix a typo
c0629d70ca scsi: mpt3sas: Fix endianness for ActiveCablePowerRequirement
16660db3fc scsi: mpt3sas: Fix out-of-bounds warnings in _ctl_addnl_diag_query
3ad0b1da0d scsi: mpt3sas: Fix two kernel-doc headers
2910a4a9e9 scsi: mpt3sas: Documentation cleanup
e2fac6c44a scsi: mpt3sas: Fix deadlock while cancelling the running firmware event
19a62...


Michael Reed (mreed8855) wrote :

I have created a test kernel for 21.10 at the following link:

https://people.canonical.com/~mreed/lp_1935034_mpt3sas_update_38_100/impish/

Michael Reed (mreed8855) wrote :

I just updated the files at the link in comment #7. Please test with the updated test kernel.

Sujith Pandel (sujithpandel) wrote :

Hi Michael,
Driver version check
Install and boot
12hr stress test
12hr reboot loop

All these tests went fine with the above test kernel.

Suganath Prabu (suganath) wrote :

Michael,
The test kernel is under validation; we will share the results by the end of this week.

Suganath Prabu (suganath) wrote :

We have completed basic validation of the test kernel and observed no issues.

Michael Reed (mreed8855) wrote :

I need an impact statement before I can submit this to the mailing list for approval.

SRU Justification:
[Impact]
[FIXES]
[TESTING]
[REGRESSION RISK]
[Other Info] (optional)

Here is an additional reference.

https://wiki.ubuntu.com/Kernel/Dev/StablePatchFormat

Suganath Prabu (suganath) wrote :

Only the two patches below contain driver functionality fixes, and these are not that critical.

1. d309ae0732 scsi: mpt3sas: Fix ReplyPostFree pool allocation
Here is the SRU for this bug-fix patch:
[Impact]
The driver always allocates ReplyPostFree queue pools in sets of 16 ReplyPostFree queues. On a VM with only one or two CPUs and little RAM, the kernel may be unable to allocate a contiguous memory segment for a set of ReplyPostFree queues, in which case IOC initialization is terminated.
[Fix]
On a system with fewer than 16 CPUs, the driver doesn't need 16 ReplyPostFree queues; it needs only as many ReplyPostFree queues as there are CPUs. So the driver now allocates only a CPU-count number of ReplyPostFree queues instead of 16, and the IOC is initialized successfully.
[Testing]
Load the driver on a VM with one or two CPUs and little RAM (such as 1 GB); the driver should successfully initialize the IOC.
[REGRESSION RISK]
There is no regression risk with this bug-fix patch.
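
The sizing rule described in the [Fix] above can be sketched in plain userspace C. This is an illustration only, not the driver's code: `QUEUES_PER_SET`, both function names, and the `queue_bytes` parameter are assumptions introduced for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: one ReplyPostFree pool holds at most 16 queues, as stated above. */
#define QUEUES_PER_SET 16u

/* Hypothetical helper: number of queues to place in one pool for a CPU count. */
unsigned int queues_to_allocate(unsigned int cpu_count)
{
    assert(cpu_count > 0);
    return cpu_count < QUEUES_PER_SET ? cpu_count : QUEUES_PER_SET;
}

/* Contiguous bytes one pool needs; queue_bytes is an illustrative parameter. */
size_t pool_bytes(unsigned int cpu_count, size_t queue_bytes)
{
    return (size_t)queues_to_allocate(cpu_count) * queue_bytes;
}
```

With two CPUs and a hypothetical 4096-byte queue, the contiguous request shrinks from 65536 bytes (16 queues) to 8192 bytes (2 queues), which is easier to satisfy on a small VM.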

2. e2fac6c44a scsi: mpt3sas: Fix deadlock while cancelling the running firmware event
Here is the SRU for this bug-fix patch:

[Impact]
The driver hangs if an MPT request message times out and requires a hard reset while a firmware event is being processed.
[Fix]
During a hard reset, don't call the cancel_work_sync() API for the firmware event work if the hard reset was invoked while processing that same firmware event.
[Testing]
Load the driver and perform many hot-plug and hot-unplug operations on target drives (so that more firmware events are generated).
[REGRESSION RISK]
There is no regression risk with this bugfix patch.
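
The idea behind this [Fix] can be modelled in userspace C. This is a simplified sketch under stated assumptions, not the driver's code: `struct fw_event` and `cancel_pending_fw_events()` are hypothetical, and setting a flag stands in for a synchronous cancel such as cancel_work_sync(), which waits for the work item to finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a queued firmware event. */
struct fw_event {
    int id;
    bool cancelled;
};

/*
 * Cancel every queued event except the one whose handler triggered the
 * hard reset. Waiting for that event to finish from inside its own
 * handler would never return, which is the deadlock described above.
 * Pass running == NULL when the reset did not start from an event handler.
 * Returns the number of events cancelled.
 */
size_t cancel_pending_fw_events(struct fw_event *events, size_t n,
                                const struct fw_event *running)
{
    size_t cancelled = 0;

    assert(events != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (&events[i] == running)
            continue; /* skip the event we are currently executing in */
        events[i].cancelled = true; /* stands in for a synchronous cancel */
        cancelled++;
    }
    return cancelled;
}
```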

The fix patches below are cosmetic and have no impact on driver functionality:
bfb3f00c06 scsi: mpt3sas: Simplify bool comparison
cf9e575e62 scsi: mpt3sas: Fix a bunch of potential naming doc-rot
54cb88dc30 scsi: mpt3sas: Fix a couple of misdocumented functions/params
782a1ab33f scsi: mpt3sas: Fix some kernel-doc misnaming issues
a8d548b0b3 scsi: mpt3sas: Fix a few kernel-doc issues
206a3afa94 scsi: mpt3sas: Fix a typo
c0629d70ca scsi: mpt3sas: Fix endianness for ActiveCablePowerRequirement
16660db3fc scsi: mpt3sas: Fix out-of-bounds warnings in _ctl_addnl_diag_query
3ad0b1da0d scsi: mpt3sas: Fix two kernel-doc headers
2910a4a9e9 scsi: mpt3sas: Documentation cleanup
e2fac6c44a scsi: mpt3sas: Fix deadlock while cancelling the running firmware event
84a84cc6af scsi: mpt3sas: Fix fall-through warnings for Clang
cf750be8e6 scsi: mpt3sas: Fix Coverity reported issue
d6c2ce435f scsi: mpt3sas: Fix error return value in _scsih_expander_add()

The patches below relate to kernel API compatibility changes and have no impact on driver functionality:
8278807abd scsi: core: Add scsi_device_busy() wrapper
020b0f0a31 scsi: core: Replace sdev->device_busy with sbitmap
f2b1e9c6f8 scsi: core: Introduce scsi_build_sense()

Below are the driver enhancement patches that we added in Phase18 and Phase19.

1. Add host tagset support:
Through this feature, the driver can emulate the number of request queues to more than one at kernel level even t...


Michael Reed (mreed8855)
description: updated
Michael Reed (mreed8855) wrote :

Many of the patches were already in the impish kernel, so I have summarized the impact statement for what was cherry picked.

SRU Justification:

[Impact]
This is a feature request to update the mpt3sas driver to the latest upstream version, 38.100.00.00, in Ubuntu 21.10 and Ubuntu 20.04.
This will allow users to use fixes and enhancements that landed upstream in 5.14. For this reason,
Broadcom and Dell have requested that these patches be pulled to update the
mpt3sas driver to the current upstream version.

[FIXES]
The following fix patches are cosmetic and have no impact on driver functionality:
- 84a84cc6af scsi: mpt3sas: Fix fall-through warnings for Clang
- cf750be8e6 scsi: mpt3sas: Fix Coverity reported issue

The following patches relate to kernel API compatibility changes and have no impact on driver functionality:
- f2b1e9c6f8 scsi: core: Introduce scsi_build_sense()
- 2910a4a9e9 scsi: mpt3sas: Documentation cleanup

Handle firmware faults gracefully during IOC initialization time:
Currently, if the driver detects a firmware fault during IOC initialization, it terminates the initialization. With this feature, the driver tries to recover the IOC from the fault state; if the IOC recovers successfully, the driver continues with IOC initialization.
- 19a622c39a scsi: mpt3sas: Handle firmware faults during first half of IOC init
- a0815c45c8 scsi: mpt3sas: Handle firmware faults during second half of IOC init

[TESTING]
Load the driver
Run IO

[REGRESSION RISK]
No extra test cases are needed; normal regression test cases such as loading the driver and running I/O are enough.

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-hwe-5.13/5.13.0-17.17~20.04.1 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Sujith Pandel (sujithpandel) wrote :

Hi Michael,

Driver version check
Install and boot
12hr stress test
12hr reboot loop

All these tests went fine with the above proposed kernel. Please help include the updates in the SRU.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.13.0-19.19

---------------
linux (5.13.0-19.19) impish; urgency=medium

  * impish/linux: 5.13.0-19.19 -proposed tracker (LP: #1946337)

  * impish:linux-aws 5.13 panic during systemd autotest (LP: #1946001)
    - [Config] disable KFENCE

 -- Andrea Righi <email address hidden> Thu, 07 Oct 2021 11:09:51 +0200

Changed in linux (Ubuntu Impish):
status: Confirmed → Fix Released
tags: added: verification-done-focal
removed: verification-needed-focal
Michael Reed (mreed8855)
description: updated
Brian Murray (brian-murray) wrote :

The Hirsute Hippo has reached End of Life, so this bug will not be fixed for that release.

Changed in linux (Ubuntu Hirsute):
status: New → Won't Fix