2017-10-12 14:09:26 |
Dan Streetman |
bug |
|
|
added bug |
2017-10-12 14:10:54 |
Dan Streetman |
nominated for series |
|
Ubuntu Xenial |
|
2017-10-12 14:10:59 |
Dan Streetman |
linux (Ubuntu): status |
New |
In Progress |
|
2017-10-12 14:11:02 |
Dan Streetman |
linux (Ubuntu): importance |
Undecided |
Medium |
|
2017-10-12 14:11:04 |
Dan Streetman |
linux (Ubuntu): assignee |
|
Dan Streetman (ddstreet) |
|
2017-10-12 14:27:03 |
Adam Thorn |
bug |
|
|
added subscriber Adam Thorn |
2017-10-12 14:57:16 |
Eric Desrochers |
bug task added |
|
linux (Ubuntu Xenial) |
|
2017-10-12 14:57:39 |
Dan Streetman |
linux (Ubuntu Xenial): importance |
Undecided |
Medium |
|
2017-10-12 14:57:42 |
Dan Streetman |
linux (Ubuntu Xenial): status |
New |
In Progress |
|
2017-10-12 14:57:45 |
Dan Streetman |
linux (Ubuntu Xenial): assignee |
|
Dan Streetman (ddstreet) |
|
2017-10-12 15:40:33 |
Logan V |
bug |
|
|
added subscriber Logan V |
2017-10-12 18:26:27 |
Vinson Lee |
bug |
|
|
added subscriber Vinson Lee |
2017-10-13 08:02:20 |
Björn Zettergren |
bug |
|
|
added subscriber Björn Zettergren |
2017-10-18 08:11:04 |
Björn Zettergren |
attachment added |
|
redacted_i40e_syslog.txt https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1723127/+attachment/4974593/+files/redacted_i40e_syslog.txt |
|
2018-01-24 08:19:38 |
Stefan Kooman |
bug |
|
|
added subscriber Stefan Kooman |
2018-03-09 00:46:45 |
Nobuto Murata |
bug |
|
|
added subscriber Nobuto Murata |
2018-03-14 06:13:25 |
Roman Karlstetter |
bug |
|
|
added subscriber Roman Karlstetter |
2018-03-20 15:18:21 |
Dan Streetman |
nominated for series |
|
Ubuntu Artful |
|
2018-03-20 15:18:21 |
Dan Streetman |
bug task added |
|
linux (Ubuntu Artful) |
|
2018-03-20 15:18:28 |
Dan Streetman |
linux (Ubuntu Artful): assignee |
|
Dan Streetman (ddstreet) |
|
2018-03-20 15:18:31 |
Dan Streetman |
linux (Ubuntu Artful): importance |
Undecided |
Medium |
|
2018-03-20 15:18:33 |
Dan Streetman |
linux (Ubuntu Artful): status |
New |
Incomplete |
|
2018-03-20 15:18:39 |
Dan Streetman |
linux (Ubuntu Artful): status |
Incomplete |
In Progress |
|
2018-03-20 15:18:48 |
Dan Streetman |
nominated for series |
|
Ubuntu Bionic |
|
2018-03-20 15:18:48 |
Dan Streetman |
bug task added |
|
linux (Ubuntu Bionic) |
|
2018-03-20 15:18:55 |
Dan Streetman |
linux (Ubuntu Bionic): status |
In Progress |
Fix Released |
|
2018-03-20 15:19:50 |
Dan Streetman |
nominated for series |
|
Ubuntu Trusty |
|
2018-03-20 15:19:50 |
Dan Streetman |
bug task added |
|
linux (Ubuntu Trusty) |
|
2018-03-20 15:19:56 |
Dan Streetman |
linux (Ubuntu Trusty): status |
New |
Won't Fix |
|
2018-03-21 10:46:45 |
Dan Streetman |
description |
This is a continuation from bug 1713553; a patch was added in that bug to attempt to fix this, and it may have helped reduce the issue but appears not to have fixed it, based on more reports.
The issue is that the i40e driver, when TSO is enabled, sometimes sees the NIC firmware issue an "MDD event", where MDD is "Malicious Driver Detection". This is vaguely defined in the i40e spec, and there is no way to tell what the NIC actually saw that it didn't like. So, the driver can do nothing but print an error message and reset the PF (or VF). Unfortunately, this resets the interface, which causes an interruption in network traffic flow while the PF is resetting.
See bug 1713553 for more details. |
[impact]
The i40e driver sometimes triggers a "Malicious Driver Detection" (MDD) event in the NIC firmware. The event leads to a reset of the NIC, which interrupts network traffic for at least the duration of the reset and can cause further problems, e.g. if the interface is part of a bond.
[fix]
The upstream patch adjusts how the driver fragments TX data. The "malicious driver" event detected by the firmware results from incorrectly crafted TX fragment descriptors (the firmware has specific, complicated restrictions on these). The patch is from Intel, who suggested this specific patch to address the problem. Additionally, I provided a test kernel with the patch to someone who reported this to me, and they have run it for ~6 weeks so far without reproducing the issue; previously they could reproduce it in as little as a day, and usually within 2-3 weeks.
[test case]
The bug is unfortunately very difficult to reproduce, but as shown in the comments on this and the previous bug, some i40e users have traffic that can consistently reproduce the problem (although it usually takes days or longer). A reproduction is easy to detect: network traffic is interrupted and the system logs contain a message like:
i40e 0000:02:00.1: TX driver issue detected, PF reset issued
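Because a reproduction can take days or weeks, it may help to scan the system log for that message rather than watch for it by hand. The following is only a minimal sketch: the regular expression is derived from the single sample line above, and the function name `find_mdd_resets` is made up for illustration; other kernel versions may word the message differently.

```python
import re

# Assumption: the pattern is modeled on the one sample log line quoted in
# the test case; it matches a PCI address followed by the reset message.
MDD_RESET_RE = re.compile(
    r"i40e (?P<pci>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"TX driver issue detected, PF reset issued"
)


def find_mdd_resets(lines):
    """Return (line_number, pci_address) for each line reporting a PF reset.

    `lines` is any iterable of log lines, e.g. an open syslog file.
    The name and return shape are hypothetical, chosen for this sketch.
    """
    hits = []
    for number, line in enumerate(lines, start=1):
        match = MDD_RESET_RE.search(line)
        if match:
            hits.append((number, match.group("pci")))
    return hits
```

It could be run against an open log file, for example `find_mdd_resets(open("/var/log/syslog", errors="replace"))`; the log path is also an assumption and depends on how the system is configured.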
[regression potential]
The patch alters how the driver fragments TX data, so a regression would most likely show up as problems with TX traffic and/or additional "Malicious Driver Detection" events.
[original description]
This is a continuation from bug 1713553; a patch was added in that bug to attempt to fix this, and it may have helped reduce the issue but appears not to have fixed it, based on more reports.
The issue is that the i40e driver, when TSO is enabled, sometimes sees the NIC firmware issue an "MDD event", where MDD is "Malicious Driver Detection". This is vaguely defined in the i40e spec, and there is no way to tell what the NIC actually saw that it didn't like. So, the driver can do nothing but print an error message and reset the PF (or VF). Unfortunately, this resets the interface, which causes an interruption in network traffic flow while the PF is resetting.
See bug 1713553 for more details. |
|
2018-03-22 13:12:21 |
Janåke Rönnblom |
bug |
|
|
added subscriber Janåke Rönnblom |
2018-04-03 11:20:45 |
Kleber Sacilotto de Souza |
linux (Ubuntu Xenial): status |
In Progress |
Fix Committed |
|
2018-04-03 11:20:48 |
Kleber Sacilotto de Souza |
linux (Ubuntu Artful): status |
In Progress |
Fix Committed |
|
2018-04-10 09:30:46 |
Brad Figg |
tags |
|
verification-needed-xenial |
|
2018-04-10 09:32:32 |
Brad Figg |
tags |
verification-needed-xenial |
verification-needed-artful verification-needed-xenial |
|
2018-04-10 11:38:26 |
Dan Streetman |
tags |
verification-needed-artful verification-needed-xenial |
verification-done-artful verification-done-xenial |
|
2018-04-23 08:22:44 |
Launchpad Janitor |
linux (Ubuntu Xenial): status |
Fix Committed |
Fix Released |
|
2018-04-23 08:22:44 |
Launchpad Janitor |
cve linked |
|
2017-5715 |
|
2018-04-23 08:22:44 |
Launchpad Janitor |
cve linked |
|
2017-5754 |
|
2018-04-23 08:22:45 |
Launchpad Janitor |
linux (Ubuntu Xenial): status |
Fix Committed |
Fix Released |
|
2018-04-23 09:21:59 |
Launchpad Janitor |
linux (Ubuntu Artful): status |
Fix Committed |
Fix Released |
|
2018-04-23 09:21:59 |
Launchpad Janitor |
cve linked |
|
2018-8043 |
|
2018-04-23 09:21:59 |
Launchpad Janitor |
linux (Ubuntu Artful): status |
Fix Committed |
Fix Released |
|