Comment 25 for bug 1779756

Nivedita Singhvi (niveditasinghvi) wrote :

We have a user who has been running successfully under load
with the test kernel provided here, which was patched with
the following two commits:

"i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled"
Commit: fa38e30ac73fbb01d7e5d0fd1b12d412fa3ac3ee

"i40e: prevent overlapping tx_timeout recover"
Commit: d5585b7b6846a6d0f9517afe57be3843150719da

The issue was hit while running on 4.15.0-38-generic #41~16.04.1-Ubuntu
on Xenial (the hwe kernel).

Symptoms include messages in the kernel log of the form:

[4733544.982116] i40e 0000:18:00.1 eno2: tx_timeout: VSI_seid: 390, Q 6, NTC: 0x1a0, HWB: 0x66, NTU: 0x66, TAIL: 0x66, INT: 0x0
[4733544.982119] i40e 0000:18:00.1 eno2: tx_timeout recovery level 1, hung_queue 6
[4733572.116270] i40e 0000:18:00.1 eno2: tx_timeout: VSI_seid: 390, Q 2, NTC: 0x49, HWB: 0x123, NTU: 0x123, TAIL: 0x123, INT: 0x0
[4733572.116272] i40e 0000:18:00.1 eno2: tx_timeout recovery level 1, hung_queue 2
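
For anyone wanting to check whether their own logs show the same symptom, here is a minimal sketch that scans kernel log lines for the "tx_timeout recovery" message quoted above. The regular expression is modelled only on the lines in this comment, so other i40e driver versions may format the message differently; the function name find_tx_timeouts is made up for this illustration.

```python
import re

# Pattern modelled on the "tx_timeout recovery" lines quoted in this comment.
# Assumption: other driver versions may word or order the fields differently.
TX_TIMEOUT_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+i40e\s+(?P<pci>\S+)\s+(?P<iface>\S+):\s+"
    r"tx_timeout recovery level (?P<level>\d+), hung_queue (?P<queue>\d+)"
)


def find_tx_timeouts(lines):
    """Return one dict per i40e tx_timeout recovery message found in lines."""
    events = []
    for line in lines:
        match = TX_TIMEOUT_RE.search(line)
        if match is None:
            continue
        events.append({
            "timestamp": float(match.group("ts")),  # seconds since boot
            "pci": match.group("pci"),               # PCI address of the port
            "iface": match.group("iface"),           # network interface name
            "level": int(match.group("level")),      # recovery level reported
            "queue": int(match.group("queue")),      # Tx queue reported hung
        })
    return events
```

The function accepts any iterable of lines, so it can be fed an open log file (on Ubuntu, typically /var/log/kern.log) or the captured output of dmesg. Repeated hits on the same interface across different queues would match the pattern shown above.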

These timeouts led to Kafka server issues, among other problems.

We are fairly confident this is the same issue the original reporter hit,
and we'd like to use this bug to proceed with the stable release update process.