Activity log for bug #1779756

Date Who What changed Old value New value Message
2018-07-02 20:21:10 Vivien GUEANT bug added bug
2018-07-02 20:21:10 Vivien GUEANT attachment added Ubuntu server 16.04 with Kernel 4.15 boot https://bugs.launchpad.net/bugs/1779756/+attachment/5158895/+files/201807_bug_driver_i40e_kernel_415.png
2018-07-02 20:21:40 Vivien GUEANT affects ichthux ubuntu
2018-07-02 20:25:31 Vivien GUEANT affects ubuntu linux-meta-hwe (Ubuntu)
2018-07-02 20:27:16 Vivien GUEANT attachment added dmesg | grep i40e https://bugs.launchpad.net/ubuntu/+source/linux-meta-hwe/+bug/1779756/+attachment/5158896/+files/dmesg_i40e.txt
2018-08-08 11:04:25 Julien Royannais bug added subscriber Julien Royannais
2018-08-27 20:43:58 Launchpad Janitor linux-meta-hwe (Ubuntu): status New Confirmed
2018-09-04 13:06:54 Janåke Rönnblom bug added subscriber Janåke Rönnblom
2018-09-11 13:07:15 Vivien GUEANT attachment added Dmesg of the bug https://bugs.launchpad.net/ubuntu/+source/linux-meta-hwe/+bug/1779756/+attachment/5187571/+files/dmesg.txt
2018-09-12 08:14:28 Julien Royannais removed subscriber Julien Royannais
2018-09-27 07:21:29 Roman Karlstetter bug added subscriber Roman Karlstetter
2018-10-05 18:41:39 Vivien GUEANT affects linux-meta-hwe (Ubuntu) linux-firmware (Ubuntu)
2018-10-05 18:53:08 Vivien GUEANT summary i40e driver does not work with kernel 4.15 Intel XL710 - i40e driver does not work with kernel 4.15 (Ubuntu 18.04)
2018-10-06 02:29:23 Joseph Salisbury linux-firmware (Ubuntu): importance Undecided High
2018-10-06 02:30:45 Joseph Salisbury affects linux-firmware (Ubuntu) linux (Ubuntu)
2018-10-06 02:30:55 Joseph Salisbury nominated for series Ubuntu Bionic
2018-10-06 02:30:55 Joseph Salisbury bug task added linux (Ubuntu Bionic)
2018-10-06 02:31:01 Joseph Salisbury linux (Ubuntu Bionic): importance Undecided High
2018-10-06 02:31:04 Joseph Salisbury linux (Ubuntu Bionic): status New Triaged
2018-10-06 02:31:08 Joseph Salisbury linux (Ubuntu): status Confirmed Triaged
2018-10-06 02:31:12 Joseph Salisbury linux (Ubuntu): assignee Joseph Salisbury (jsalisbury)
2018-10-06 02:31:15 Joseph Salisbury linux (Ubuntu Bionic): assignee Joseph Salisbury (jsalisbury)
2018-10-18 11:57:54 Visa Hänninen bug added subscriber Visa Hänninen
2018-11-13 12:00:21 flynx bug added subscriber flynx
2018-11-15 21:35:38 Joseph Salisbury linux (Ubuntu Bionic): status Triaged Incomplete
2018-11-15 21:35:41 Joseph Salisbury linux (Ubuntu): status Triaged Incomplete
2018-12-19 14:55:24 Joseph Salisbury linux (Ubuntu): status Incomplete In Progress
2018-12-19 14:55:27 Joseph Salisbury linux (Ubuntu Bionic): status Incomplete In Progress
2018-12-20 00:07:32 Dominique Poulain bug added subscriber Dominique Poulain
2018-12-20 10:16:45 Szilard Cserey bug added subscriber Szilard Cserey
2019-01-22 20:20:09 Joseph Salisbury bug added subscriber Joseph Salisbury
2019-01-22 20:20:13 Joseph Salisbury removed subscriber Joseph Salisbury
2019-01-23 01:07:55 Joseph Salisbury linux (Ubuntu Bionic): status In Progress Confirmed
2019-01-23 01:08:00 Joseph Salisbury linux (Ubuntu): status In Progress Confirmed
2019-01-23 01:08:02 Joseph Salisbury linux (Ubuntu Bionic): assignee Joseph Salisbury (jsalisbury)
2019-01-23 01:08:04 Joseph Salisbury linux (Ubuntu): assignee Joseph Salisbury (jsalisbury)
2019-01-25 15:33:06 Seth Forshee linux (Ubuntu): status Confirmed Fix Committed
2019-02-04 14:46:37 Launchpad Janitor linux (Ubuntu): status Fix Committed Fix Released
2019-02-13 20:30:19 Terry Rudd bug added subscriber Terry Rudd
2019-02-19 11:29:28 Terry Rudd nominated for series Ubuntu Cosmic
2019-02-19 14:10:55 Stefan Bader bug task added linux (Ubuntu Cosmic)
2019-02-19 14:11:12 Stefan Bader linux (Ubuntu Cosmic): status New Confirmed
2019-02-19 14:11:34 Stefan Bader linux (Ubuntu Cosmic): importance Undecided High
2019-03-04 14:53:14 Nivedita Singhvi linux (Ubuntu Bionic): assignee Nivedita Singhvi (niveditasinghvi)
2019-03-04 14:53:19 Nivedita Singhvi linux (Ubuntu Cosmic): assignee Nivedita Singhvi (niveditasinghvi)
2019-03-04 14:54:08 Nivedita Singhvi linux (Ubuntu Bionic): status Confirmed In Progress
2019-03-04 14:54:12 Nivedita Singhvi linux (Ubuntu Cosmic): status Confirmed In Progress
2019-03-19 07:57:47 Nivedita Singhvi description

Old value:

Today Ubuntu 16.04 LTS Enablement Stacks has moved from the Kernel 4.13 to the Kernel 4.15.0-24-generic.

On a "Dell PowerEdge R330" server with a network adapter "Intel Ethernet Converged Network Adapter X710-DA2" (driver i40e) the network card no longer works and permanently displays these three lines :

[ 98.012098] i40e 0000:01:00.0 enp1s0f0: tx_timeout: VSI_seid: 388, Q 8, NTC: 0x0, HWB: 0x0, NTU: 0x1, TAIL: 0x1, INT: 0x1
[ 98.012119] i40e 0000:01:00.0 enp1s0f0: tx_timeout recovery level 11, hung_queue 8
[ 98.012125] i40e 0000:01:00.0 enp1s0f0: tx_timeout recovery unsuccessful

New value:

[Impact]

The i40e driver can get stalled on tx timeouts. This can happen when DCB is enabled on the connected switch. This can also trigger a second situation when a tx timeout occurs before the recovery of a previous timeout has completed due to CPU load, which is not handled correctly. This leads to networking delays, drops and application timeouts and hangs. Note that the first tx timeout cause is just one of the ways to end up in the second situation.

This issue was seen on a heavily loaded Kafka broker node running the 4.15.0-38-generic kernel on Xenial.

Symptoms include messages in the kernel log of the form:

---
[4733544.982116] i40e 0000:18:00.1 eno2: tx_timeout: VSI_seid: 390, Q 6, NTC: 0x1a0, HWB: 0x66, NTU: 0x66, TAIL: 0x66, INT: 0x0
[4733544.982119] i40e 0000:18:00.1 eno2: tx_timeout recovery level 1, hung_queue 6
----

With the test kernel provided in this LP bug which had these two commits compiled in, the problem has not been seen again, and has been running successfully for several months:

"i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled"
Commit: fa38e30ac73fbb01d7e5d0fd1b12d412fa3ac3ee

"i40e: prevent overlapping tx_timeout recover"
Commit: d5585b7b6846a6d0f9517afe57be3843150719da

* The first commit is already in Disco, Cosmic
* The second commit is already in Disco
* Bionic needs both patches and Cosmic needs the second

[Test Case]

* We are considering the case of both issues above occurring.
* Seen by reporter on a Kafka broker node with heavy traffic.
* Not easy to reproduce as it requires something like the following example environment and heavy load:

Kernel: 4.15.0-38-generic
Network driver: i40e
version: 2.1.14-k
firmware-version: 6.00 0x800034e6 18.3.6
NIC: Intel 40Gb XL710
DCB enabled

[Regression Potential]

Low, as the first only impacts i40e DCB environment, and has been running for several months in production-load testing successfully.

---
Original Description

Today Ubuntu 16.04 LTS Enablement Stacks has moved from the Kernel 4.13 to the Kernel 4.15.0-24-generic.

On a "Dell PowerEdge R330" server with a network adapter "Intel Ethernet Converged Network Adapter X710-DA2" (driver i40e) the network card no longer works and permanently displays these three lines :

[ 98.012098] i40e 0000:01:00.0 enp1s0f0: tx_timeout: VSI_seid: 388, Q 8, NTC: 0x0, HWB: 0x0, NTU: 0x1, TAIL: 0x1, INT: 0x1
[ 98.012119] i40e 0000:01:00.0 enp1s0f0: tx_timeout recovery level 11, hung_queue 8
[ 98.012125] i40e 0000:01:00.0 enp1s0f0: tx_timeout recovery unsuccessful
2019-03-19 08:10:09 Nivedita Singhvi tags 4.15.0-24-generic kernel 4.15.0-24-generic bionic cosmic kernel
2019-03-27 06:07:42 Khaled El Mously linux (Ubuntu Bionic): status In Progress Fix Committed
2019-03-27 06:07:44 Khaled El Mously linux (Ubuntu Cosmic): status In Progress Fix Committed
2019-04-04 18:01:58 Ubuntu Kernel Bot tags 4.15.0-24-generic bionic cosmic kernel 4.15.0-24-generic bionic cosmic kernel verification-needed-cosmic
2019-04-04 18:04:17 Ubuntu Kernel Bot tags 4.15.0-24-generic bionic cosmic kernel verification-needed-cosmic 4.15.0-24-generic bionic cosmic kernel verification-needed-bionic verification-needed-cosmic
2019-04-08 05:22:57 Nivedita Singhvi tags 4.15.0-24-generic bionic cosmic kernel verification-needed-bionic verification-needed-cosmic bionic verification-done-bionic verification-done-cosmic
2019-04-08 05:23:09 Nivedita Singhvi description

Old value: unchanged from the new value recorded in the 2019-03-19 07:57:47 description entry.

New value:

[Impact]

The i40e driver can get stalled on tx timeouts. This can happen when DCB is enabled on the connected switch. This can also trigger a second situation when a tx timeout occurs before the recovery of a previous timeout has completed due to CPU load, which is not handled correctly. This leads to networking delays, drops and application timeouts and hangs. Note that the first tx timeout cause is just one of the ways to end up in the second situation.

This issue was seen on a heavily loaded Kafka broker node running the 4.15.0-38-generic kernel on Xenial.

Symptoms include messages in the kernel log of the form:

---
[4733544.982116] i40e 0000:18:00.1 eno2: tx_timeout: VSI_seid: 390, Q 6, NTC: 0x1a0, HWB: 0x66, NTU: 0x66, TAIL: 0x66, INT: 0x0
[4733544.982119] i40e 0000:18:00.1 eno2: tx_timeout recovery level 1, hung_queue 6
----

With the test kernel provided in this LP bug which had these two commits compiled in, the problem has not been seen again, and has been running successfully for several months:

"i40e: Fix for Tx timeouts when interface is brought up if DCB is enabled"
Commit: fa38e30ac73fbb01d7e5d0fd1b12d412fa3ac3ee

"i40e: prevent overlapping tx_timeout recover"
Commit: d5585b7b6846a6d0f9517afe57be3843150719da

* The first commit is already in Disco, Cosmic
* The second commit is already in Disco
* Bionic needs both patches and Cosmic needs the second

[Test Case]

* We are considering the case of both issues above occurring.
* Seen by reporter on a Kafka broker node with heavy traffic.
* Not easy to reproduce as it requires something like the
  following example environment and heavy load:

  Kernel: 4.15.0-38-generic
  Network driver: i40e
        version: 2.1.14-k
        firmware-version: 6.00 0x800034e6 18.3.6
  NIC: Intel 40Gb XL710
  DCB enabled

[Regression Potential]

Low, as the first only impacts i40e DCB environment, and has been running for several months in production-load testing successfully.

---
Original Description

Today Ubuntu 16.04 LTS Enablement Stacks has moved from the Kernel 4.13 to the Kernel 4.15.0-24-generic.

On a "Dell PowerEdge R330" server with a network adapter "Intel Ethernet Converged Network Adapter X710-DA2" (driver i40e) the network card no longer works and permanently displays these three lines :

[ 98.012098] i40e 0000:01:00.0 enp1s0f0: tx_timeout: VSI_seid: 388, Q 8, NTC: 0x0, HWB: 0x0, NTU: 0x1, TAIL: 0x1, INT: 0x1
[ 98.012119] i40e 0000:01:00.0 enp1s0f0: tx_timeout recovery level 11, hung_queue 8
[ 98.012125] i40e 0000:01:00.0 enp1s0f0: tx_timeout recovery unsuccessful
2019-04-08 10:06:18 Po-Hsu Lin tags bionic verification-done-bionic verification-done-cosmic bionic cosmic verification-done-bionic verification-done-cosmic
2019-04-23 21:35:02 Launchpad Janitor linux (Ubuntu Cosmic): status Fix Committed Fix Released
2019-04-23 21:35:02 Launchpad Janitor cve linked 2017-5715
2019-04-24 07:39:21 Launchpad Janitor linux (Ubuntu Bionic): status Fix Committed Fix Released
2019-04-24 07:39:21 Launchpad Janitor cve linked 2017-5754
2019-04-24 07:39:21 Launchpad Janitor cve linked 2018-3639
2019-05-23 02:41:34 Nivedita Singhvi tags bionic cosmic verification-done-bionic verification-done-cosmic bionic cosmic sts verification-done-bionic verification-done-cosmic
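
The description entries in this log quote i40e "tx_timeout recovery" kernel messages as the visible symptom. As a rough illustration of how such lines could be spotted in saved dmesg output, here is a minimal Python sketch. It is not part of the fix or of the i40e driver. The names TX_TIMEOUT_RE, TxTimeoutEvent and parse_tx_timeouts are hypothetical, and the pattern is modeled only on the message shapes quoted in this bug, so other kernel versions may word these messages differently.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Assumption: this pattern covers only the two "tx_timeout recovery" message
# shapes quoted in this bug's description, nothing else the driver may print.
TX_TIMEOUT_RE = re.compile(
    r"i40e (?P<pci>\S+) (?P<iface>\S+): tx_timeout recovery "
    r"(?:level (?P<level>\d+), hung_queue (?P<queue>\d+)|(?P<failed>unsuccessful))"
)


class TxTimeoutEvent(NamedTuple):
    pci: str               # PCI address, e.g. "0000:01:00.0"
    iface: str             # interface name, e.g. "enp1s0f0"
    level: Optional[int]   # recovery level, None on the "unsuccessful" line
    queue: Optional[int]   # hung queue index, None on the "unsuccessful" line
    failed: bool           # True when the line reports "recovery unsuccessful"


def parse_tx_timeouts(lines: Iterable[str]) -> Iterator[TxTimeoutEvent]:
    """Yield one event per i40e tx_timeout recovery line; skip everything else."""
    for line in lines:
        m = TX_TIMEOUT_RE.search(line)
        if m is None:
            continue
        failed = m.group("failed") is not None
        yield TxTimeoutEvent(
            pci=m.group("pci"),
            iface=m.group("iface"),
            level=None if failed else int(m.group("level")),
            queue=None if failed else int(m.group("queue")),
            failed=failed,
        )


# The three lines quoted in the original description of this bug.
SAMPLE = [
    "[ 98.012098] i40e 0000:01:00.0 enp1s0f0: tx_timeout: VSI_seid: 388, Q 8, "
    "NTC: 0x0, HWB: 0x0, NTU: 0x1, TAIL: 0x1, INT: 0x1",
    "[ 98.012119] i40e 0000:01:00.0 enp1s0f0: tx_timeout recovery level 11, hung_queue 8",
    "[ 98.012125] i40e 0000:01:00.0 enp1s0f0: tx_timeout recovery unsuccessful",
]

if __name__ == "__main__":
    for event in parse_tx_timeouts(SAMPLE):
        print(event)
```

The first sample line (the register dump beginning "tx_timeout: VSI_seid") is skipped on purpose: only the recovery lines carry the level, queue and failure information used here.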