Ubuntu 16.04.02: ibmveth: Support to enable LSO/CSO for Trunk VEA

Bug #1692538 reported by bugproxy
This bug affects 2 people
Affects                           Status        Importance  Assigned to            Milestone
The Ubuntu-power-systems project  Fix Released  Critical    Canonical Kernel Team
linux (Ubuntu)                    Fix Released  Critical    Joseph Salisbury
  Zesty                           Fix Released  Critical    Joseph Salisbury
  Artful                          Fix Released  Critical    Joseph Salisbury

Bug Description

== SRU Justification ==
Commit 66aa0678ef is requested to fix four issues with the ibmveth driver.
The issues are as follows:
- Issue 1: ibmveth doesn't support largesend and checksum offload features when configured as "Trunk".
- Issue 2: SYN packet drops seen at destination VM. When the packet
originates, it has CHECKSUM_PARTIAL flag set and as it gets delivered to IO
server's inbound Trunk ibmveth, on validating "checksum good" bits in ibmveth
receive routine, SKB's ip_summed field is set with CHECKSUM_UNNECESSARY flag.
- Issue 3: First packet of a TCP connection will be dropped, if there is
no OVS flow cached in datapath.
- Issue 4: ibmveth driver doesn't have support for SKB's with frag_list.

The details of the fixes for these issues are described in the commit's git log.

== Comment: #0 - BRYANT G. LY <email address hidden> - 2017-05-22 08:40:16 ==
---Problem Description---

 - Issue 1: ibmveth doesn't support largesend and checksum offload features
   when configured as "Trunk". Driver has explicit checks to prevent
   enabling these offloads.

 - Issue 2: SYN packet drops seen at destination VM. When the packet
   originates, it has CHECKSUM_PARTIAL flag set and as it gets delivered to
   IO server's inbound Trunk ibmveth, on validating "checksum good" bits
   in ibmveth receive routine, SKB's ip_summed field is set with
   CHECKSUM_UNNECESSARY flag. This packet is then bridged by OVS (or Linux
   Bridge) and delivered to outbound Trunk ibmveth. At this point the
   outbound ibmveth transmit routine will not set "no checksum" and
   "checksum good" bits in transmit buffer descriptor, as it does so only
   when the ip_summed field is CHECKSUM_PARTIAL. When this packet gets
   delivered to destination VM, TCP layer receives the packet with checksum
   value of 0 and with no checksum related flags in ip_summed field. This
   leads to packet drops, so TCP connections never complete successfully.

 - Issue 3: First packet of a TCP connection will be dropped, if there is
   no OVS flow cached in datapath. OVS while trying to identify the flow,
   computes the checksum. The computed checksum will be invalid at the
   receiving end, as ibmveth transmit routine zeroes out the pseudo
   checksum value in the packet. This leads to packet drop.

 - Issue 4: ibmveth driver doesn't have support for SKB's with frag_list.
   When the physical NIC has GRO enabled and OVS bridges these packets,
   the OVS vport send code will end up calling dev_queue_xmit, which in turn
   calls validate_xmit_skb.
   In the validate_xmit_skb routine, the larger packets will get segmented into
   MSS-sized segments if the SKB has a frag_list and the driver to which
   they are delivered doesn't support the NETIF_F_FRAGLIST feature.
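
The checksum-state problem in Issue 2 can be traced with a small model. The sketch below is illustrative Python, not driver code. The CHECKSUM_* names mirror the kernel's ip_summed values; `Skb`, `rx_trunk`, `tx_trunk`, and `delivered_ok` are hypothetical helpers. The `remark_partial` switch models one possible remedy (keeping the packet marked CHECKSUM_PARTIAL on the inbound Trunk side); that is an assumption made for illustration, and the actual fix should be checked against the commit's git log.

```python
from dataclasses import dataclass

CHECKSUM_NONE = "CHECKSUM_NONE"
CHECKSUM_UNNECESSARY = "CHECKSUM_UNNECESSARY"
CHECKSUM_PARTIAL = "CHECKSUM_PARTIAL"


@dataclass
class Skb:
    ip_summed: str


def rx_trunk(csum_good: bool, remark_partial: bool = False) -> Skb:
    """Inbound Trunk ibmveth: map the "checksum good" bit to ip_summed.

    remark_partial is a hypothetical switch (an assumption, not taken from
    this report) that keeps the packet marked CHECKSUM_PARTIAL.
    """
    if not csum_good:
        return Skb(CHECKSUM_NONE)
    return Skb(CHECKSUM_PARTIAL if remark_partial else CHECKSUM_UNNECESSARY)


def tx_trunk(skb: Skb) -> dict:
    """Outbound Trunk ibmveth: descriptor bits are set only for CHECKSUM_PARTIAL."""
    offload = skb.ip_summed == CHECKSUM_PARTIAL
    return {"no_csum": offload, "csum_good": offload}


def delivered_ok(desc: dict) -> bool:
    """Destination VM: the TCP checksum field is 0, so the packet survives
    only if the descriptor says the checksum is already good."""
    return desc["csum_good"]


# Behaviour described in Issue 2: the bridged packet is dropped.
print(delivered_ok(tx_trunk(rx_trunk(csum_good=True))))  # False
# With the assumed remedy, the outbound side sets the descriptor bits again.
print(delivered_ok(tx_trunk(rx_trunk(csum_good=True, remark_partial=True))))  # True
```

The model only tracks the flag hand-off across the bridge; it does not compute any checksum.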

Contact Information = Bryant G. <email address hidden>

---uname output---
4.8.0-51.54

Machine Type = p8

---Debugger---
A debugger is not configured

---Steps to Reproduce---
 Increases performance greatly

The patch has been accepted upstream:
https://patchwork.ozlabs.org/patch/764533/

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-154875 severity-critical targetmilestone-inin16042
Changed in ubuntu:
assignee: nobody → Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
affects: ubuntu → linux (Ubuntu)
tags: added: kernel-da-key
Manoj Iyer (manjo)
tags: added: ubuntu-16.04
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu):
status: New → Confirmed
Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
status: New → Confirmed
Manoj Iyer (manjo)
Changed in linux (Ubuntu):
importance: Undecided → Critical
Changed in ubuntu-power-systems:
importance: Undecided → Critical
Changed in linux (Ubuntu):
assignee: Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) → Canonical Kernel Team (canonical-kernel-team)
Joseph Salisbury (jsalisbury) wrote :

I built Xenial, Zesty and Artful test kernels with commit 66aa0678efc2. The test kernels can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1692538/

Can you test these kernels and see if they resolve this bug?

Thanks in advance!

Frank Heimes (fheimes)
Changed in ubuntu-power-systems:
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Manoj Iyer (manjo)
tags: added: triage-g
bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2017-07-25 14:52 EDT-------
It looks like the directory had only Artful and Zesty kernels. I didn't see anything for Xenial, but Zesty and Artful work.

Joseph Salisbury (jsalisbury) wrote :

I had a build failure with Xenial. There are some prereq commits required that I am working on identifying now.

Changed in linux (Ubuntu Zesty):
importance: Undecided → Critical
Changed in linux (Ubuntu Xenial):
importance: Undecided → Critical
Changed in linux (Ubuntu Zesty):
status: New → In Progress
Changed in linux (Ubuntu Xenial):
status: New → In Progress
Changed in linux (Ubuntu Artful):
status: Confirmed → In Progress
assignee: Canonical Kernel Team (canonical-kernel-team) → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Zesty):
assignee: nobody → Joseph Salisbury (jsalisbury)
Changed in linux (Ubuntu Xenial):
assignee: nobody → Joseph Salisbury (jsalisbury)
Seth Forshee (sforshee)
Changed in linux (Ubuntu Artful):
status: In Progress → Fix Committed
Stefan Bader (smb)
Changed in linux (Ubuntu Zesty):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.12.0-11.12

---------------
linux (4.12.0-11.12) artful; urgency=low

  * linux: 4.12.0-11.12 -proposed tracker (LP: #1709929)

  * CVE-2017-1000111
    - packet: fix tp_reserve race in packet_set_ring

  * CVE-2017-1000112
    - udp: consistently apply ufo or fragmentation

  * Please only recommend or suggest initramfs-tools | linux-initramfs-tool for
    kernels able to boot without initramfs (LP: #1700972)
    - Revert "UBUNTU: [Debian] Don't depend on initramfs-tools"
    - [Debian] Don't depend on initramfs-tools

  * Miscellaneous Ubuntu changes
    - SAUCE: (noup) Update spl to 0.6.5.11-ubuntu1, zfs to 0.6.5.11-1ubuntu3
    - SAUCE: powerpc: Always initialize input array when calling epapr_hypercall()

  * Miscellaneous upstream changes
    - selftests: typo correction for memory-hotplug test
    - selftests: check hot-pluggagble memory for memory-hotplug test
    - selftests: check percentage range for memory-hotplug test
    - selftests: add missing test name in memory-hotplug test
    - selftests: fix memory-hotplug test

 -- Seth Forshee <email address hidden> Thu, 10 Aug 2017 13:37:00 -0500

Changed in linux (Ubuntu Artful):
status: Fix Committed → Fix Released
Kleber Sacilotto de Souza (kleber-souza) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-zesty' to 'verification-done-zesty'. If the problem still exists, change the tag 'verification-needed-zesty' to 'verification-failed-zesty'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-zesty
bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-08-16 16:14 EDT-------
I added the tag: verification-done-zesty

tags: added: verification-done-zesty
removed: verification-needed-zesty
Joseph Salisbury (jsalisbury) wrote :

There is now a Xenial test kernel, which has a backport of commit 66aa0678efc2. The test kernels can be downloaded from:

http://kernel.ubuntu.com/~jsalisbury/lp1692538/xenial

Can you test these kernels and see if they resolve this bug?

Thanks in advance!

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-08-24 16:07 EDT-------
Works for us.

tags: added: verification-done-xenial
description: updated
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.10.0-33.37

---------------
linux (4.10.0-33.37) zesty; urgency=low

  * linux: 4.10.0-33.37 -proposed tracker (LP: #1709303)

  * CVE-2017-1000112
    - Revert "udp: consistently apply ufo or fragmentation"
    - udp: consistently apply ufo or fragmentation

  * CVE-2017-1000111
    - Revert "net-packet: fix race in packet_set_ring on PACKET_RESERVE"
    - packet: fix tp_reserve race in packet_set_ring

  * ThunderX: soft lockup on 4.8+ kernels when running qemu-efi with vhost=on
    (LP: #1673564)
    - irqchip/gic-v3: Add missing system register definitions
    - arm64: KVM: Do not use stack-protector to compile EL2 code
    - KVM: arm/arm64: vgic-v3: Use PREbits to infer the number of ICH_APxRn_EL2
      registers
    - KVM: arm/arm64: vgic-v3: Fix nr_pre_bits bitfield extraction
    - arm64: Add a facility to turn an ESR syndrome into a sysreg encoding
    - KVM: arm/arm64: vgic-v3: Add accessors for the ICH_APxRn_EL2 registers
    - KVM: arm64: Make kvm_condition_valid32() accessible from EL2
    - KVM: arm64: vgic-v3: Add hook to handle guest GICv3 sysreg accesses at EL2
    - KVM: arm64: vgic-v3: Add ICV_BPR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IGRPEN1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IAR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_EOIR1_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_AP1Rn_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_HPPIR1_EL1 handler
    - KVM: arm64: vgic-v3: Enable trapping of Group-1 system registers
    - KVM: arm64: Enable GICv3 Group-1 sysreg trapping via command-line
    - KVM: arm64: vgic-v3: Add ICV_BPR0_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_IGNREN0_EL1 handler
    - KVM: arm64: vgic-v3: Add misc Group-0 handlers
    - KVM: arm64: vgic-v3: Enable trapping of Group-0 system registers
    - KVM: arm64: Enable GICv3 Group-0 sysreg trapping via command-line
    - arm64: Add MIDR values for Cavium cn83XX SoCs
    - [Config] CONFIG_CAVIUM_ERRATUM_30115=y
    - arm64: Add workaround for Cavium Thunder erratum 30115
    - KVM: arm64: vgic-v3: Add ICV_DIR_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_RPR_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_CTLR_EL1 handler
    - KVM: arm64: vgic-v3: Add ICV_PMR_EL1 handler
    - KVM: arm64: Enable GICv3 common sysreg trapping via command-line
    - KVM: arm64: vgic-v3: Log which GICv3 system registers are trapped
    - arm64: KVM: Make unexpected reads from WO registers inject an undef
    - KVM: arm64: Log an error if trapping a read-from-write-only GICv3 access
    - KVM: arm64: Log an error if trapping a write-to-read-only GICv3 access

  * ibmvscsis: Do not send aborted task response (LP: #1689365)
    - target: Fix unknown fabric callback queue-full errors
    - ibmvscsis: Do not send aborted task response
    - ibmvscsis: Clear left-over abort_cmd pointers
    - ibmvscsis: Fix the incorrect req_lim_delta

  * hisi_sas performance improvements (LP: #1708734)
    - scsi: hisi_sas: define hisi_sas_device.device_id as int
    - scsi: hisi_sas: optimise the usage of hisi_hba.lock
    - scsi: hisi_sas: relocate sata_done_v2_hw()
    - scsi: hisi_sas: optimise DMA slot memory

  * hisi_sas...


Changed in linux (Ubuntu Zesty):
status: Fix Committed → Fix Released
bugproxy (bugproxy)
tags: added: targetmilestone-inin16043
removed: targetmilestone-inin16042
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: Confirmed → In Progress
status: In Progress → Fix Committed
Changed in ubuntu-power-systems:
status: Fix Committed → In Progress
Manoj Iyer (manjo) wrote :

Needs testing for Xenial.

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-11-13 09:53 EDT-------
Tested on Xenial, looks good.

Joseph Salisbury (jsalisbury) wrote :

Our support team has encountered a case where ibmveth + openvswitch + bnx2x has
led to some issues, which IBM should probably be aware of before
turning on large segments in more places.

Here's a summary from support for that issue:

==========

[Issue: we see a firmware assertion from an IBM branded bnx2x card.
Decoding the dump with the help of upstream shows that the assert is
caused by a packet with GSO on and gso_size > ~9700 bytes being passed
to the card. We traced the packets through the system, and came up
with this root cause. The system uses ibmveth to talk to AIX LPARs, a
bnx2x network card to talk to the world, and Open vSwitch to tie them
together. There is no VIOS involvement - the card is attached to the
Linux partition.]

The packets causing the issue come through the ibmveth interface -
from the AIX LPAR. The veth protocol is 'special' - communication
between LPARs on the same chassis can use very large (64k) frames to
reduce overhead. Normal networks cannot handle such large packets, so
traditionally, the VIOS partition would signal to the AIX partitions
that it was 'special', and AIX would send regular, ethernet-sized
packets to VIOS, which VIOS would then send out.

This signalling between VIOS and AIX is done in a way that is not
standards-compliant, and so was never made part of Linux. Instead, the
Linux driver has always understood large frames and passed them up the
network stack.

In some cases (e.g. with TCP), multiple TCP segments are coalesced
into one large packet. In Linux, this goes through the generic receive
offload code, using a similar mechanism to GSO. These segments can be
very large which presents as a very large MSS (maximum segment size)
or gso_size.

Normally, the large packet is simply passed to whatever network
application on Linux is going to consume it, and everything is OK.

However, in this case, the packets go through Open vSwitch, and are
then passed to the bnx2x driver. The bnx2x driver/hardware supports
TSO and GSO, but with a restriction: the maximum segment size is
limited to around 9700 bytes. Normally this is more than adequate as
jumbo frames are limited to 9000 bytes. However, if a large packet
with large (>9700 byte) TCP segments arrives through ibmveth, and is
passed to bnx2x, the hardware will panic.

Turning off TSO prevents the crash as the kernel resegments the data
and assembles the packets in software. This has a performance cost.

Clearly at the very least, bnx2x should not crash in this case, and I
am working towards a patch for that.

However, this still leaves us with some issues. The only thing the
bnx2x driver can sensibly do is drop the packet, which will prevent
the crash. However, there will still be issues with large packets:
when they are dropped, the other side will eventually realise that the
data is missing and ask for a retransmit, but the retransmit might
also be too big - there's no way of signalling back to the AIX LPAR
that it should reduce the MSS. Even if the data eventually gets
through there will be a latency/throughput/performance hit.

==========
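
What "resegments the data in software" amounts to in the summary above can be sketched roughly as follows. This is a simplified illustration, not the kernel's GSO code: `segment_payload` is a hypothetical helper that only splits a byte payload into chunks of at most `mss` bytes and ignores headers, sequence numbers, and checksums. The 1460-byte value assumes a 1500-byte MTU with 20-byte IPv4 and TCP headers and no options.

```python
from typing import List


def segment_payload(payload: bytes, mss: int) -> List[bytes]:
    """Split a payload into chunks of at most mss bytes (headers ignored)."""
    if mss <= 0:
        raise ValueError("mss must be positive")
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]


# A 64 KiB payload, as might arrive in one large ibmveth frame, cut down to
# segments that fit a 1500-byte MTU.
segments = segment_payload(bytes(64 * 1024), 1460)
print(len(segments), len(segments[-1]))  # 45 1296
```

Doing this per packet on the CPU instead of in the NIC is where the performance cost mentioned in the summary comes from.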

Seeing as IBM seems to be in active development in this area - indeed
this code explicitly deals with ibm...


Daniel Axtens (daxtens) wrote :

Just as an update: I am working with Jay V on a set of patches to drop the oversized packets at the openvswitch/bridge level to prevent the crash I mentioned.

But that is not sufficient to solve the underlying problem: there will still be packet loss when there's an MTU mismatch here. A device in AIX with a 64k MTU being bridged (via openvswitch or a native bridge) to a device with a 1500 or 9000 byte MTU is never going to work reliably and efficiently, and IBM will need to figure out how they want to solve this.
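
Why dropping alone does not solve the problem can be seen in a toy model. The sketch below is hypothetical Python, not the proposed patches: `NIC_GSO_LIMIT` takes the "around 9700 bytes" figure from the support summary as an approximation, and `nic_accepts` and `first_successful_attempt` are made-up names. It assumes the sender retransmits with an unchanged segment size.

```python
from typing import Optional

NIC_GSO_LIMIT = 9700  # approximate figure from the summary, not an exact hardware value


def nic_accepts(gso_size: int, limit: int = NIC_GSO_LIMIT) -> bool:
    """Bridge-level guard: forward only segments the NIC can handle."""
    return gso_size <= limit


def first_successful_attempt(gso_size: int, max_attempts: int) -> Optional[int]:
    """Retransmit with an unchanged segment size; return the attempt that
    gets through, or None if every attempt is dropped."""
    for attempt in range(1, max_attempts + 1):
        if nic_accepts(gso_size):
            return attempt
    return None


print(first_successful_attempt(8960, max_attempts=5))   # 1
print(first_successful_attempt(65000, max_attempts=5))  # None
```

The guard prevents the firmware assert, but unless the sender lowers its segment size every retransmission is dropped again.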

Regards,
Daniel

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-11-17 11:07 EDT-------
I was told that this issue is due to a misconfiguration: if the end-to-end MTU isn't set correctly, you will see this problem. As long as the user sets the MTU correctly end to end, the problem shouldn't occur.

Daniel Axtens (daxtens) wrote :

Hi Bryant,

So, to be crystal clear, IBM's position is that customers using this setup should set the MTU in their AIX partitions to 1500 (or 9000 if using jumbo frames)?

Is this documented anywhere on your website that we can point users to?

I ask because I have asked one of your customers/our users to do this in a support context and they were unhappy about the performance impact. So if this is the official line, can we have some official documentation of it?

Regards,
Daniel

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-11-20 04:13 EDT-------
Hi Daniel,

When one enables largesend offload in AIX, the MTU still stays at 1500 (when jumbo frames are disabled) or 9000 (when jumbo frames are enabled). So, when a workload pushes data larger than the MTU, the AIX LPAR will send the bigger payload with an MSS value based on the MTU. This MSS value is what the adapter later uses for segmenting.
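
As a rough illustration of how an MSS follows from the MTU: the comment above only says the MSS is "based on" the MTU, so the sketch below assumes the usual IPv4 and TCP header sizes of 20 bytes each, without options. `mss_for_mtu` and `segments_needed` are hypothetical helper names, not AIX or Linux APIs.

```python
import math

IPV4_HEADER_LEN = 20  # bytes, assuming no IP options
TCP_HEADER_LEN = 20   # bytes, assuming no TCP options


def mss_for_mtu(mtu: int) -> int:
    """Largest TCP payload that fits in one packet of the given MTU."""
    return mtu - IPV4_HEADER_LEN - TCP_HEADER_LEN


def segments_needed(payload_len: int, mtu: int) -> int:
    """Number of segments the adapter must cut a large-send payload into."""
    return math.ceil(payload_len / mss_for_mtu(mtu))


for mtu in (1500, 9000):
    print(mtu, mss_for_mtu(mtu), segments_needed(64 * 1024, mtu))
# 1500 1460 45
# 9000 8960 8
```

With TCP options such as timestamps in use, the usable payload per segment would be slightly smaller.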

By default AIX doesn't set MTU to 64k when largesend is enabled. In the scenario described in bnx2x driver issue, I suspect the end-user manually changed the MTU to a bigger value (~64k or so), otherwise we shouldn't be seeing this issue.

Now, coming back to your question on what happens if the user configures a bigger MTU value (for example, 64k): the AIX LPAR will send the bigger payload to VIOS with an MSS value of ~64k, which will cause the physical NICs in VIOS to drop the packet, leading to retransmissions from the AIX LPAR. Eventually, after a certain number of retransmissions, the AIX LPAR will disable largesend offload for the specific connection, and then the data flow goes through fine. So, in the event of user misconfiguration, I agree there will be a performance impact.

This issue may happen in non-virtualized environment too, when the end-user sets a higher MTU than the one supported by the physical adapter. Here the driver/adapter may drop the packet, leading to retransmissions.

Regards,
Siva K

Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: In Progress → Fix Released
status: Fix Released → In Progress
Daniel Axtens (daxtens) wrote :

Hi Siva,

Thank you for your quick and thoughtful response.

I will ask about the default MTU for the veth interface to see if the user increased it themselves.

I'm not sure I completely understand what you mean about largesend offload being disabled after retransmits. I'm also not completely sure if it's largesend offload or just large packets that are causing issues. If I have understood correctly (e.g. https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/com.ibm.aix.performance/tcp_large_send_offload.htm) large-send offload is what Linux would call TCP Segmentation Offload (TSO) - does that match your understanding?

Here's my concern. The code I'm looking at (let's look at Zesty, so v4.10) is in ibmveth.c, ibmveth_poll(). There we see:

 if (length > netdev->mtu + ETH_HLEN) {
         ibmveth_rx_mss_helper(skb, mss, lrg_pkt);
         adapter->rx_large_packets++;
 }

Then ibmveth_rx_mss_helper() has the following - setting GSO on regardless of the large_pkt bit:

 /* if mss is not set through Large Packet bit/mss in rx buffer,
  * expect that the mss will be written to the tcp header checksum.
  */
 tcph = (struct tcphdr *)(skb->data + offset);
 if (lrg_pkt) {
         skb_shinfo(skb)->gso_size = mss;
 } else if (offset) {
         skb_shinfo(skb)->gso_size = ntohs(tcph->check);
         tcph->check = 0;
 }

It looks to me that Linux will interpret a packet from the veth adaptor as a GSO/GRO packet based only on whether or not the size of the received packet is greater than the linux-side MTU plus the header size - not based on whether AIX thinks it is transmitting a LSO packet.

To put it another way - if I have understood correctly - there are two ways we could end up with a GSO/GRO packet coming out of a veth adaptor. The ibmveth_rx_mss_helper path is taken when the size of the packet is greater than MTU+ETH_HLEN, which can happen when:

 1) The AIX end has turned on LSO, so the large_packet bit is set
 2) Large-send is off in AIX but there is a mis-matched MTU between AIX and Linux

In the first case, you say that AIX will turn off largesend, which will fix the issue. But in the second case, if I have understood correctly, AIX will not be able to do anything. Unless you are saying that AIX will dynamically reduce the MTU for a connection in the presence of a number of re-transmits?

This isn't necessarily wrong behaviour from AIX - Linux can't do anything in this situation either; a 'hop' that can participate in Path MTU Discovery would be needed.
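
The receive-side decision quoted above can be modelled in a few lines. This is a hedged Python sketch of the two C excerpts, not the driver itself: `rx_gso_size` is a hypothetical function, `tcp_offset` stands in for the driver's `offset` variable (assumed here to be nonzero only when a TCP header was found), and `tcp_check` is the TCP checksum field already converted to host byte order.

```python
from typing import Optional

ETH_HLEN = 14  # Ethernet header length in bytes, as in the kernel


def rx_gso_size(length: int, mtu: int, lrg_pkt: bool, mss: int,
                tcp_check: int, tcp_offset: int) -> Optional[int]:
    """Return the gso_size the receive path would set, or None if unset."""
    if length <= mtu + ETH_HLEN:
        return None       # helper is not called for ordinary-sized packets
    if lrg_pkt:
        return mss        # mss taken from the rx buffer descriptor
    if tcp_offset:
        return tcp_check  # mss is expected in the TCP checksum field
    return None


# Case 1: AIX has LSO on, so the large packet bit is set.
print(rx_gso_size(65000, 1500, True, 1460, 0, 34))      # 1460
# Case 2: large-send off but mismatched MTUs; whatever the checksum field
# holds is interpreted as the segment size.
print(rx_gso_size(9014, 1500, False, 0, 0xBEEF, 34))    # 48879
```

In the second call the packet is treated as GSO purely because of its length, which is the concern raised above.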

If I understand it, then, the optimal configuration would be for the AIX LPAR to set an MTU of 1500/9000 and turn on LSO for veth on the AIX side - does that sound right?

Thanks again!
Regards,
Daniel

bugproxy (bugproxy) wrote :

------- Comment From <email address hidden> 2017-11-21 02:33 EDT-------
Hi Daniel,

Yes. I meant TCP Segmentation Offload when I said large-send offload.

Your understanding of the current ibmveth code is right.

AIX Path MTU Discovery is enabled by default, so although the very first packet (larger payload) can leave the LPAR with a higher MSS, AIX will discover that the path MTU is lower than the negotiated MSS. AIX TCP then recovers (based on path MTU values) and sends subsequent packets at the right MTU size.

But if the user disables Path MTU Discovery and configures an MTU higher than what the physical NIC can support, this will lead to packet drops (and the AIX stack cannot rediscover the MTU without Path MTU Discovery). So, the second scenario you mentioned in your note will lead to communication issues.

Also, by default the AIX MTU settings are the optimal ones, i.e. 1500 bytes (even when TSO is enabled). This helps the AIX LPAR send larger payloads with an MSS value of 1500. If a user knows that the physical NIC in VIOS or the Linux IO server has jumbo frames enabled, the MTU can be set to 9000 (here again, if TSO is enabled, larger payloads will go with an MSS of 9000). So, the optimal values are the default values (1500/9000).

Regards,
Siva K.

tags: added: triage-a
removed: triage-g
Frank Heimes (fheimes) wrote :

Siva and Daniel, may I just ask where we are on this?
Well, it looks to me that Siva/IBM sees this more as a misconfiguration, so that the changes in comment #18 are _not_ needed. Daniel, do you see it the same way now?
But in this case this needs to be documented somewhere, so that we can point customers to it - right?

Daniel Axtens (daxtens) wrote :

Hi Frank,

Yes, that is how I see it - these changes can go through, but we need
good docs to point people to as there is an incredibly high likelihood
of misconfiguration at various points.

Regards,
Daniel


Manoj Iyer (manjo)
tags: added: triage-g
removed: triage-a
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: In Progress → Fix Committed
Manoj Iyer (manjo)
Changed in ubuntu-power-systems:
status: Fix Committed → Fix Released
no longer affects: linux (Ubuntu Xenial)