2024-03-06 16:55:10 |
Thibf |
bug |
|
|
added bug |
2024-03-06 16:58:32 |
Thibf |
nominated for series |
|
Ubuntu Noble |
|
2024-03-06 16:58:32 |
Thibf |
bug task added |
|
linux (Ubuntu Noble) |
|
2024-03-06 16:59:48 |
Thibf |
description |
[Impact]
Some improvement are made to qat which makes it
more resilient and able to handle reset in a
better way.
There is an upstream patch set that improve this
but it's applied to linux-next and scheduled for 6.9.
We should apply the patch set to the Noble 6.8 kernel,
so we experience less issue with qat and be more maintainable.
[Test case]
Use the added mechanism to inject errors and rmmod/modprobe the module
to verify that qat recover and log issues properly.
[Fix]
Apply the following commits (from linux-next):
2ecd43413d76 Documentation: qat: fix auto_reset section
7d42e097607c crypto: qat - resolve race condition during AER recovery
c2304e1a0b80 crypto: qat - change SLAs cleanup flow at shutdown
9567d3dc7609 crypto: qat - improve aer error reset handling
750fa7c20e60 crypto: qat - limit heartbeat notifications
f5419a4239af crypto: qat - add auto reset on error
2aaa1995a94a crypto: qat - add fatal error notification
4469f9b23468 crypto: qat - re-enable sriov after pf reset
ec26f8e6c784 crypto: qat - update PFVF protocol for recovery
758a0087db98 crypto: qat - disable arbitration before reset
ae508d7afb75 crypto: qat - add fatal error notify method
e2b67859ab6e crypto: qat - add heartbeat error simulator
[Regression potential]
We may experience qat regression when crashing or restarting the module. |
[Impact]
Some improvement are made to qat which makes it
more resilient and able to handle reset in a
better way.
There is an upstream patch set that improve this
but it's applied to linux-next and scheduled for 6.9.
We should apply the patch set to the Noble 6.8 kernel,
so we experience less issues with qat and be more maintainable.
[Test case]
Use the added mechanism to inject errors and rmmod/modprobe the module
to verify that qat recover and log issues properly.
[Fix]
Apply the following commits (from linux-next):
2ecd43413d76 Documentation: qat: fix auto_reset section
7d42e097607c crypto: qat - resolve race condition during AER recovery
c2304e1a0b80 crypto: qat - change SLAs cleanup flow at shutdown
9567d3dc7609 crypto: qat - improve aer error reset handling
750fa7c20e60 crypto: qat - limit heartbeat notifications
f5419a4239af crypto: qat - add auto reset on error
2aaa1995a94a crypto: qat - add fatal error notification
4469f9b23468 crypto: qat - re-enable sriov after pf reset
ec26f8e6c784 crypto: qat - update PFVF protocol for recovery
758a0087db98 crypto: qat - disable arbitration before reset
ae508d7afb75 crypto: qat - add fatal error notify method
e2b67859ab6e crypto: qat - add heartbeat error simulator
[Regression potential]
We may experience qat regression when crashing or restarting the module. |
|
2024-03-07 16:12:08 |
Ricardo Martins |
linux (Ubuntu Noble): status |
New |
In Progress |
|
2024-03-07 22:04:37 |
Thibf |
description |
[Impact]
Some improvement are made to qat which makes it
more resilient and able to handle reset in a
better way.
There is an upstream patch set that improve this
but it's applied to linux-next and scheduled for 6.9.
We should apply the patch set to the Noble 6.8 kernel,
so we experience less issues with qat and be more maintainable.
[Test case]
Use the added mechanism to inject errors and rmmod/modprobe the module
to verify that qat recover and log issues properly.
[Fix]
Apply the following commits (from linux-next):
2ecd43413d76 Documentation: qat: fix auto_reset section
7d42e097607c crypto: qat - resolve race condition during AER recovery
c2304e1a0b80 crypto: qat - change SLAs cleanup flow at shutdown
9567d3dc7609 crypto: qat - improve aer error reset handling
750fa7c20e60 crypto: qat - limit heartbeat notifications
f5419a4239af crypto: qat - add auto reset on error
2aaa1995a94a crypto: qat - add fatal error notification
4469f9b23468 crypto: qat - re-enable sriov after pf reset
ec26f8e6c784 crypto: qat - update PFVF protocol for recovery
758a0087db98 crypto: qat - disable arbitration before reset
ae508d7afb75 crypto: qat - add fatal error notify method
e2b67859ab6e crypto: qat - add heartbeat error simulator
[Regression potential]
We may experience qat regression when crashing or restarting the module. |
[Impact]
This set improves the error recovery flows in the QAT drivers and
adds a mechanism to test them through a heartbeat error simulator.
This is an upstream patch set applied to linux-next and scheduled for 6.9.
Link to the upstream submission:
https://patchwork.kernel.org/project/linux-crypto/cover/20240202105324.50391-1-mun.chun.yep@intel.com/
We should apply this set to the Noble 6.8 kernel
to experience fewer issues with qat and improve maintainability.
An additional commit is required to update the configuration.
[Test case]
Unload and reload the module to verify that qat recovers
and logs issues properly. Use the added error injection mechanism
to verify the recovery flow.
[Fix]
Apply the following commits (from linux-next):
2ecd43413d76 Documentation: qat: fix auto_reset section
7d42e097607c crypto: qat - resolve race condition during AER recovery
c2304e1a0b80 crypto: qat - change SLAs cleanup flow at shutdown
9567d3dc7609 crypto: qat - improve aer error reset handling
750fa7c20e60 crypto: qat - limit heartbeat notifications
f5419a4239af crypto: qat - add auto reset on error
2aaa1995a94a crypto: qat - add fatal error notification
4469f9b23468 crypto: qat - re-enable sriov after pf reset
ec26f8e6c784 crypto: qat - update PFVF protocol for recovery
758a0087db98 crypto: qat - disable arbitration before reset
ae508d7afb75 crypto: qat - add fatal error notify method
e2b67859ab6e crypto: qat - add heartbeat error simulator
[Regression potential]
We may experience qat regressions when crashing or restarting the module. |
|
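The test case in the description above (reload the module, inject an error, check the logs) could be scripted roughly as follows. This is only a sketch under stated assumptions: the module name `qat_4xxx`, the example PCI address, and the debugfs path `heartbeat/inject_error` are not given in this bug report. The path is assumed to be the interface added by the "add heartbeat error simulator" commit and should be checked against the kernel's debugfs documentation for QAT. The helper names are hypothetical. The script only builds the command list and prints it by default; actually running it needs root and QAT hardware.

```python
import subprocess
from pathlib import PurePosixPath


def build_recovery_test_plan(module, debugfs_dir):
    """Return the test steps as argv lists. Nothing is executed here."""
    # ASSUMPTION: the error simulator is exposed as heartbeat/inject_error
    # under the device's debugfs directory; this path is not in the bug report.
    inject = PurePosixPath(debugfs_dir) / "heartbeat" / "inject_error"
    return [
        ["rmmod", module],                      # unload the driver
        ["modprobe", module],                   # reload the driver
        ["sh", "-c", f"echo 1 > {inject}"],     # inject a simulated error
        ["dmesg", "--level=err,warn"],          # inspect what the driver logged
    ]


def run_plan(plan, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for argv in plan:
        print(" ".join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    # ASSUMPTION: module name and PCI address are placeholders for illustration.
    plan = build_recovery_test_plan(
        "qat_4xxx", "/sys/kernel/debug/qat_4xxx_0000:6b:00.0"
    )
    run_plan(plan, dry_run=True)
```

Keeping the plan as data and defaulting to a dry run makes it possible to review the exact commands before anything touches the device.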
2024-03-07 22:04:41 |
Thibf |
linux (Ubuntu Noble): assignee |
|
Thibf (thibf) |
|
2024-03-08 13:59:01 |
Thibf |
summary |
qat: Improve error/reset handling |
qat: Improve error recovery flows |
|
2024-03-08 15:51:42 |
Thibf |
linux (Ubuntu Noble): status |
In Progress |
Fix Committed |
|
2024-03-20 23:09:33 |
Thibf |
linux (Ubuntu Noble): status |
Fix Committed |
Fix Released |
|
2024-03-21 09:27:03 |
Kleber Sacilotto de Souza |
linux (Ubuntu Noble): status |
Fix Released |
Fix Committed |
|
2024-03-28 08:15:32 |
Launchpad Janitor |
linux (Ubuntu Noble): status |
Fix Committed |
Fix Released |
|
2024-05-14 19:47:47 |
Ubuntu Kernel Bot |
tags |
|
kernel-spammed-jammy-linux-nvidia-6.8-v2 verification-needed-jammy-linux-nvidia-6.8 |
|