Activity log for bug #1940957

Date Who What changed Old value New value Message
2021-08-24 12:49:00 Nobuto Murata bug added bug
2021-09-03 13:57:46 Nobuto Murata summary i40e: support 25G AOC/ACC cables DPDK ports get disabled after Open vSwitch restart with Intel XXV710(i40e) and 25G AOC cables
2021-09-03 14:21:36 Nobuto Murata description

Old value:

Ubuntu 20.04 LTS
dpdk 19.11.7-0ubuntu0.20.04.1

We are seeing issues with the link status of ports that are DPDK-bond members: the links suddenly go away and are marked as down. There are multiple parameters that could cause this issue, but one of the suggestions we've got from a server vendor was that the following upstream patch would be required to support 25G AOC/ACC cables.

https://github.com/DPDK/dpdk/commit/b1daa34614
(available for v21.05 onward)

(todo: define a test case)

[expected status]
    ---- dpdk-bond0 ----
    bond_mode: balance-tcp
    bond may use recirculation: yes, Recirc-ID : 1
    bond-hash-basis: 0
    updelay: 0 ms
    downdelay: 0 ms
    next rebalance: 1691 ms
    lacp_status: negotiated
    lacp_fallback_ab: false
    active slave mac: 40:a6:b7:3e:4a:60(dpdk-d2cb784)
    slave dpdk-7272e20: enabled
      may_enable: true
    slave dpdk-d2cb784: enabled
      active slave
      may_enable: true

[after some time - links are lost]
    ---- dpdk-bond0 ----
    bond_mode: balance-tcp
    bond may use recirculation: yes, Recirc-ID : 1
    bond-hash-basis: 0
    updelay: 0 ms
    downdelay: 0 ms
    next rebalance: 7267 ms
    lacp_status: configured
    lacp_fallback_ab: false
    active slave mac: 00:00:00:00:00:00(none)
    slave dpdk-7272e20: disabled
      may_enable: false
    slave dpdk-d2cb784: disabled
      may_enable: false

New value:

- Ubuntu 20.04 LTS
- dpdk 19.11.7-0ubuntu0.20.04.1 (we tested it with 19.11.10~rc1, but the problem persists)
- Intel XXV710
- Cisco 25G AOC cables

Patch to backport: https://git.dpdk.org/dpdk/commit/?id=b1daa3461429e7674206a714c17adca65e9b44b4

[Impact]
DPDK ports for a bond get disabled and no traffic goes in or out after an openvswitch restart with the combination above. If that happens, the DPDK bond has to be re-created as a workaround, but that is not feasible since a service restart basically breaks everything.

    ---- dpdk-bond0 ----
    bond_mode: balance-tcp
    bond may use recirculation: yes, Recirc-ID : 1
    bond-hash-basis: 0
    updelay: 0 ms
    downdelay: 0 ms
    next rebalance: 7267 ms
    lacp_status: configured
    lacp_fallback_ab: false
    active slave mac: 00:00:00:00:00:00(none)
    slave dpdk-7272e20: disabled
      may_enable: false
    slave dpdk-d2cb784: disabled
      may_enable: false

[Test Plan]
1. Configure a DPDK bond with openvswitch, for example as follows.
$ sudo ovs-appctl bond/show dpdk-bond0
    ---- dpdk-bond0 ----
    bond_mode: balance-tcp
    bond may use recirculation: yes, Recirc-ID : 1
    bond-hash-basis: 0
    updelay: 0 ms
    downdelay: 0 ms
    next rebalance: 1691 ms
    lacp_status: negotiated
    lacp_fallback_ab: false
    active slave mac: 40:a6:b7:XX:YY:ZZ(dpdk-d2cb784)
    slave dpdk-7272e20: enabled
      may_enable: true
    slave dpdk-d2cb784: enabled
      active slave
      may_enable: true
2. Apply the updated packages.
3. Reboot the machine (just to make sure we are not using anything old).
4. Restart openvswitch.
$ sudo systemctl restart openvswitch-switch
5. Confirm that the ports are enabled after both step 3 and step 4, and that the port status matches the one in step 1.

[Where problems could occur]
The scope of the patch is i40e and the two specific cable types only (i40e + 25G AOC and ACC cables), so it is unlikely to affect any other combinations. Before this patch, 25G AOC/ACC cables were not in the additional PHY types handled by the driver, so it is not likely to make things worse.
2021-09-06 04:43:54 Utkarsh Gupta bug added subscriber Christian Ehrhardt 
2021-09-06 07:24:35 Christian Ehrhardt  dpdk (Ubuntu): status New Confirmed
2021-09-06 13:17:35 Christian Ehrhardt  nominated for series Ubuntu Impish
2021-09-06 13:17:35 Christian Ehrhardt  bug task added dpdk (Ubuntu Impish)
2021-09-06 13:17:35 Christian Ehrhardt  nominated for series Ubuntu Focal
2021-09-06 13:17:35 Christian Ehrhardt  bug task added dpdk (Ubuntu Focal)
2021-09-06 13:17:35 Christian Ehrhardt  nominated for series Ubuntu Hirsute
2021-09-06 13:17:35 Christian Ehrhardt  bug task added dpdk (Ubuntu Hirsute)
2021-09-06 13:17:42 Christian Ehrhardt  dpdk (Ubuntu Focal): status New Confirmed
2021-09-06 13:17:44 Christian Ehrhardt  dpdk (Ubuntu Hirsute): status New Confirmed
2021-09-06 13:17:47 Christian Ehrhardt  dpdk (Ubuntu Impish): status Confirmed Triaged
2021-09-06 14:19:52 Christian Ehrhardt description

Old value:

- Ubuntu 20.04 LTS
- dpdk 19.11.7-0ubuntu0.20.04.1 (we tested it with 19.11.10~rc1, but the problem persists)
- Intel XXV710
- Cisco 25G AOC cables

Patch to backport: https://git.dpdk.org/dpdk/commit/?id=b1daa3461429e7674206a714c17adca65e9b44b4

[Impact]
DPDK ports for a bond get disabled and no traffic goes in or out after an openvswitch restart with the combination above. If that happens, the DPDK bond has to be re-created as a workaround, but that is not feasible since a service restart basically breaks everything.

    ---- dpdk-bond0 ----
    bond_mode: balance-tcp
    bond may use recirculation: yes, Recirc-ID : 1
    bond-hash-basis: 0
    updelay: 0 ms
    downdelay: 0 ms
    next rebalance: 7267 ms
    lacp_status: configured
    lacp_fallback_ab: false
    active slave mac: 00:00:00:00:00:00(none)
    slave dpdk-7272e20: disabled
      may_enable: false
    slave dpdk-d2cb784: disabled
      may_enable: false

[Test Plan]
1. Configure a DPDK bond with openvswitch, for example as follows.
$ sudo ovs-appctl bond/show dpdk-bond0
    ---- dpdk-bond0 ----
    bond_mode: balance-tcp
    bond may use recirculation: yes, Recirc-ID : 1
    bond-hash-basis: 0
    updelay: 0 ms
    downdelay: 0 ms
    next rebalance: 1691 ms
    lacp_status: negotiated
    lacp_fallback_ab: false
    active slave mac: 40:a6:b7:XX:YY:ZZ(dpdk-d2cb784)
    slave dpdk-7272e20: enabled
      may_enable: true
    slave dpdk-d2cb784: enabled
      active slave
      may_enable: true
2. Apply the updated packages.
3. Reboot the machine (just to make sure we are not using anything old).
4. Restart openvswitch.
$ sudo systemctl restart openvswitch-switch
5. Confirm that the ports are enabled after both step 3 and step 4, and that the port status matches the one in step 1.

[Where problems could occur]
The scope of the patch is i40e and the two specific cable types only (i40e + 25G AOC and ACC cables), so it is unlikely to affect any other combinations. Before this patch, 25G AOC/ACC cables were not in the additional PHY types handled by the driver, so it is not likely to make things worse.

New value:

[Impact]
 * Cable detection breaks i40e-driver-based use cases in some setups.
 * An upstream patch that resolves the issue was identified, proposed, and accepted into upstream stable. It is hereby backported along with the 19.11.10 updates (as we do not want to wait for 19.11.11 in December).

[Test Plan]
 * Nobuto has contact with a site that has a setup with the right cables and devices to trigger this. He will coordinate the testing of this on Focal.
 * For non-Focal this is part of the normal MRE policy for DPDK (see bug 1940913), as it is (will be) part of the upstream stable releases.

[Where problems could occur]
 * First of all, this only affects a certain driver (i40e); all others are unchanged. When that driver is used, the detection of cables is adjusted, so the use cases to watch for regressions are more like "establish connection", "restart connection" and "setup" than, say, "bulk traffic".

[Other Info]
 * The patch is accepted in the WIP 19.11.11 stable release and will be everywhere (not just in Ubuntu) with the next MRE.

---

- Ubuntu 20.04 LTS
- dpdk 19.11.7-0ubuntu0.20.04.1 (we tested it with 19.11.10~rc1, but the problem persists)
- Intel XXV710
- Cisco 25G AOC cables

Patch to backport: https://git.dpdk.org/dpdk/commit/?id=b1daa3461429e7674206a714c17adca65e9b44b4

[Impact]
DPDK ports for a bond get disabled and no traffic goes in or out after an openvswitch restart with the combination above. If that happens, the DPDK bond has to be re-created as a workaround, but that is not feasible since a service restart basically breaks everything.

    ---- dpdk-bond0 ----
    bond_mode: balance-tcp
    bond may use recirculation: yes, Recirc-ID : 1
    bond-hash-basis: 0
    updelay: 0 ms
    downdelay: 0 ms
    next rebalance: 7267 ms
    lacp_status: configured
    lacp_fallback_ab: false
    active slave mac: 00:00:00:00:00:00(none)
    slave dpdk-7272e20: disabled
      may_enable: false
    slave dpdk-d2cb784: disabled
      may_enable: false

[Test Plan]
1. Configure a DPDK bond with openvswitch, for example as follows.
$ sudo ovs-appctl bond/show dpdk-bond0
    ---- dpdk-bond0 ----
    bond_mode: balance-tcp
    bond may use recirculation: yes, Recirc-ID : 1
    bond-hash-basis: 0
    updelay: 0 ms
    downdelay: 0 ms
    next rebalance: 1691 ms
    lacp_status: negotiated
    lacp_fallback_ab: false
    active slave mac: 40:a6:b7:XX:YY:ZZ(dpdk-d2cb784)
    slave dpdk-7272e20: enabled
      may_enable: true
    slave dpdk-d2cb784: enabled
      active slave
      may_enable: true
2. Apply the updated packages.
3. Reboot the machine (just to make sure we are not using anything old).
4. Restart openvswitch.
$ sudo systemctl restart openvswitch-switch
5. Confirm that the ports are enabled after both step 3 and step 4, and that the port status matches the one in step 1.

[Where problems could occur]
The scope of the patch is i40e and the two specific cable types only (i40e + 25G AOC and ACC cables), so it is unlikely to affect any other combinations. Before this patch, 25G AOC/ACC cables were not in the additional PHY types handled by the driver, so it is not likely to make things worse.
2021-09-06 14:21:17 Launchpad Janitor merge proposal linked https://code.launchpad.net/~paelzer/ubuntu/+source/dpdk/+git/dpdk/+merge/408161
2021-09-06 14:22:09 Launchpad Janitor merge proposal linked https://code.launchpad.net/~paelzer/ubuntu/+source/dpdk/+git/dpdk/+merge/408162
2021-09-06 14:23:57 Launchpad Janitor merge proposal linked https://code.launchpad.net/~paelzer/ubuntu/+source/dpdk/+git/dpdk/+merge/408163
2021-09-08 07:19:08 Christian Ehrhardt  dpdk (Ubuntu Hirsute): status Confirmed Triaged
2021-09-08 07:19:10 Christian Ehrhardt  dpdk (Ubuntu Focal): status Confirmed Triaged
2021-09-08 07:19:11 Christian Ehrhardt  dpdk (Ubuntu Impish): status Triaged In Progress
2021-09-08 11:13:05 Launchpad Janitor dpdk (Ubuntu Impish): status In Progress Fix Released
2021-09-09 15:20:32 Łukasz Zemczak dpdk (Ubuntu Hirsute): status Triaged Fix Committed
2021-09-09 15:20:35 Łukasz Zemczak bug added subscriber Ubuntu Stable Release Updates Team
2021-09-09 15:20:38 Łukasz Zemczak bug added subscriber SRU Verification
2021-09-09 15:20:42 Łukasz Zemczak tags verification-needed verification-needed-hirsute
2021-09-09 15:26:28 Łukasz Zemczak dpdk (Ubuntu Focal): status Triaged Fix Committed
2021-09-09 15:26:33 Łukasz Zemczak tags verification-needed verification-needed-hirsute verification-needed verification-needed-focal verification-needed-hirsute
2021-09-13 08:45:13 Christian Ehrhardt  tags verification-needed verification-needed-focal verification-needed-hirsute verification-done verification-done-focal verification-needed-hirsute
2021-09-13 15:21:26 Christian Ehrhardt  tags verification-done verification-done-focal verification-needed-hirsute verification-done verification-done-focal verification-done-hirsute
2021-09-21 18:28:21 Launchpad Janitor dpdk (Ubuntu Hirsute): status Fix Committed Fix Released
2021-09-21 18:28:53 Brian Murray removed subscriber Ubuntu Stable Release Updates Team
2021-09-21 18:30:03 Launchpad Janitor dpdk (Ubuntu Focal): status Fix Committed Fix Released
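
Step 5 of the test plan recorded in the description (confirm that the bond members are enabled after a reboot and after an openvswitch restart) could be automated roughly as follows. This is only a minimal sketch, not part of the bug report: the helper names `parse_bond_show` and `bond_is_healthy` are hypothetical, and the parser assumes the `ovs-appctl bond/show` output looks like the excerpts quoted in this log (a `lacp_status:` line and one `slave <name>: enabled|disabled` line per member).

```python
import re

# Matches lines such as "slave dpdk-7272e20: enabled" from the excerpts in this log.
MEMBER_RE = re.compile(r"^slave (\S+): (enabled|disabled)$")


def parse_bond_show(text):
    """Extract lacp_status and per-member state from bond/show output text."""
    status = {"lacp_status": None, "members": {}}
    for raw in text.splitlines():
        line = raw.strip()
        match = MEMBER_RE.match(line)
        if match:
            status["members"][match.group(1)] = match.group(2)
        elif line.startswith("lacp_status:"):
            status["lacp_status"] = line.split(":", 1)[1].strip()
    return status


def bond_is_healthy(text):
    """True if LACP is negotiated and every listed member is enabled."""
    status = parse_bond_show(text)
    members = status["members"]
    return (
        status["lacp_status"] == "negotiated"
        and bool(members)
        and all(state == "enabled" for state in members.values())
    )


# Sample inputs copied from the description in this log.
HEALTHY = """\
---- dpdk-bond0 ----
bond_mode: balance-tcp
lacp_status: negotiated
lacp_fallback_ab: false
active slave mac: 40:a6:b7:XX:YY:ZZ(dpdk-d2cb784)
slave dpdk-7272e20: enabled
  may_enable: true
slave dpdk-d2cb784: enabled
  active slave
  may_enable: true
"""

BROKEN = """\
---- dpdk-bond0 ----
bond_mode: balance-tcp
lacp_status: configured
lacp_fallback_ab: false
active slave mac: 00:00:00:00:00:00(none)
slave dpdk-7272e20: disabled
  may_enable: false
slave dpdk-d2cb784: disabled
  may_enable: false
"""

if __name__ == "__main__":
    print("healthy sample:", bond_is_healthy(HEALTHY))
    print("broken sample:", bond_is_healthy(BROKEN))
```

On a real host, the text passed to `bond_is_healthy` would be the captured output of `sudo ovs-appctl bond/show dpdk-bond0`, taken once after step 3 and once after step 4 of the test plan.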