Checksum drop of metadata traffic on isolated networks with DPDK

Bug #1832021 reported by David Ames on 2019-06-07
This bug affects 2 people
Affects                              Status        Importance  Assigned to  Milestone
OpenStack neutron-openvswitch charm  Fix Released  Undecided   Unassigned   19.07
neutron                              New           Medium      Unassigned

Bug Description

When using an isolated network with provider networks for tenants (meaning without virtual routers, either DVR or a network node), metadata access occurs in the qdhcp IP netns rather than the qrouter netns.

The following options are set in the dhcp_agent.ini file:
force_metadata = True
enable_isolated_metadata = True
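
For reference, both options live under the [DEFAULT] section of dhcp_agent.ini, i.e. roughly:

    [DEFAULT]
    force_metadata = True
    enable_isolated_metadata = True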

VMs on the provider tenant network are unable to access metadata, as packets are dropped due to an invalid TCP checksum.
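
A quick way to reproduce the failure from inside an affected VM is to query the metadata service directly and watch it time out (the /openstack path matches the requests captured later in this bug):

    curl -sv --max-time 10 http://169.254.169.254/openstack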

When we added the following in the qdhcp netns, VMs regained access to metadata:

 iptables -t mangle -A OUTPUT -o ns-+ -p tcp --sport 80 -j CHECKSUM --checksum-fill
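
Note the rule has to be added from inside the qdhcp namespace; as a sketch, with a hypothetical network UUID:

    ip netns exec qdhcp-<network-uuid> iptables -t mangle -A OUTPUT \
        -o ns-+ -p tcp --sport 80 -j CHECKSUM --checksum-fill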

It seems this setting was recently removed from the qrouter netns [0], but it never existed in the qdhcp netns to begin with.

[0] https://review.opendev.org/#/c/654645/

Related LP Bug #1831935
See https://bugs.launchpad.net/charm-neutron-openvswitch/+bug/1831935/comments/10

Jeff Hillman (jhillman) on 2019-06-07
tags: added: cpe-onsite
Brian Haley (brian-haley) wrote :

David - please see the link that explains why I reverted this change: https://lore.kernel.org/patchwork/patch/824819/. It basically says this rule has no effect for TCP, that it was only ever meant for UDP, and that the kernel was eventually changed to log a warning when it is used this way.

There is probably something else going on here causing issues, possibly outside of neutron.

David Ames (thedac) wrote :

Brian,

Thanks for getting back to me. It seems this is a duplicate of LP Bug #1722584 [0]. And the explanation for my running into it is that we have not yet pushed your reversion into our Ubuntu packaging.

Marking this bug a duplicate of LP Bug #1722584

[0] https://bugs.launchpad.net/cloud-archive/+bug/1722584

James Page (james-page) on 2019-06-13
Changed in neutron:
status: New → Incomplete
David Ames (thedac) wrote :

We have removed the duplicate status from this bug as we have narrowed the focus: this is DPDK-specific.

When using isolated provider networks AND DPDK, metadata traffic is dropped due to an incorrect TCP checksum, specifically when the provider interface is a DPDK interface.

It is temporarily mitigated by adding the iptables rule:
iptables -t mangle -A OUTPUT -o ns-+ -p tcp --sport 80 -j CHECKSUM --checksum-fill
However, this is not sustainable, as any restart of openvswitch will clear this setting.
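
As a sanity check that the mitigation is still in place after a restart, list the mangle OUTPUT rules inside the namespace (hypothetical UUID):

    ip netns exec qdhcp-<network-uuid> iptables -t mangle -S OUTPUT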

Here is an example of a tcpdump in the qdhcp netns:

    172.20.0.23.50060 > 169.254.169.254.80: Flags [S], cksum 0xb3c1 (correct), seq 3293184694, win 29200, options [mss 1460,sackOK,TS val 78090881 ecr 0,nop,wscale 7], length 0
23:36:43.932728 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)

    169.254.169.254.80 > 172.20.0.23.50060: Flags [S.], cksum 0x0057 (incorrect -> 0x2b02), seq 2915867972, ack 3293184695, win 28960, options [mss 1460,sackOK,TS val 4009971593 ecr 78090881,nop,wscale 7], length 0

Continuing with re-transmissions of the same. Note the incorrect cksum.

With the mangle rule in place:

    172.20.0.4.46706 > 169.254.169.254.80: Flags [S], cksum 0xbf25 (correct), seq 4115688639, win 29200, options [mss 1460,sackOK,TS val 2390126443 ecr 0,nop,wscale 7], length 0
23:40:01.510745 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], proto TCP (6), length 60)
    169.254.169.254.80 > 172.20.0.4.46706: Flags [S.], cksum 0xe3f3 (correct), seq 1829822633, ack 4115688640, win 28960, options [mss 1460,sackOK,TS val 812998113 ecr 2390126443,nop,wscale 7], length 0
23:40:01.510919 IP (tos 0x0, ttl 64, id 38572, offset 0, flags [DF], proto TCP (6), length 52)
    172.20.0.4.46706 > 169.254.169.254.80: Flags [.], cksum 0x82fb (correct), seq 1, ack 1, win 229, options [nop,nop,TS val 2390126443 ecr 812998113], length 0
23:40:01.510974 IP (tos 0x0, ttl 64, id 38573, offset 0, flags [DF], proto TCP (6), length 229)
    172.20.0.4.46706 > 169.254.169.254.80: Flags [P.], cksum 0x5e40 (correct), seq 1:178, ack 1, win 229, options [nop,nop,TS val 2390126443 ecr 812998113], length 177: HTTP, length: 177
        GET /openstack HTTP/1.1
        Host: 169.254.169.254
        User-Agent: Cloud-Init/19.1-1-gbaa47854-0ubuntu1~18.04.1
        Accept-Encoding: gzip, deflate
        Accept: */*
        Connection: keep-alive

23:40:01.553043 IP (tos 0x0, ttl 64, id 62471, offset 0, flags [DF], proto TCP (6), length 52)
    169.254.169.254.80 > 172.20.0.4.46706: Flags [.], cksum 0x8219 (correct), seq 1, ack 178, win 235, options [nop,nop,TS val 812998156 ecr 2390126443], length 0
23:40:02.036984 IP (tos 0x0, ttl 64, id 62472, offset 0, flags [DF], proto TCP (6), length 252)
    169.254.169.254.80 > 172.20.0.4.46706: Flags [P.], cksum 0xd303 (correct), seq 1:201, ack 178, win 235, options [nop,nop,TS val 812998640 ecr 2390126443], length 200: HTTP, length: 200
        HTTP/1.1 200 OK
        Content-Type: text/plain; charset=UTF-8
        Content-Length: 83
        Date: Thu, 13 Jun 2019 23:40:02 GMT

        2012-08-10
        2013-04-04
        2013-10-17
        2015-10-15
        2016-06-30
        2016-10-06
        2017-02-22
        latest

We can provi...


David Ames (thedac) on 2019-06-14
Changed in neutron:
status: Incomplete → New
David Ames (thedac) on 2019-06-14
summary: - Checksum drop of metadata traffic on isolated provider networks
+ Checksum drop of metadata traffic on isolated provider networks with
+ DPDK

Further testing shows the provider network is irrelevant. With DPDK and an isolated network (qdhcp only, no qrouter), whether GRE or provider, any traffic initiated by the qdhcp netns, including response traffic, gets an incorrect TCP checksum.

This packet gets put on the "wire" and it is the VM that drops the packet due to an invalid TCP checksum.

In a DPDK isolated network environment, you can see this in action from the qdhcp netns with an arbitrary netcat call to any TCP port:

nc -vz $VM_IP 73

Run tcpdump on the VM side and you will see:

22:06:31.424716 IP (tos 0x0, ttl 64, id 14532, offset 0, flags [DF], proto TCP (6), length 60)
    172.20.0.2.39784 > 172.20.0.6.73: Flags [S], cksum 0x585f (incorrect -> 0x0437), seq 114680395, win 26880, options [mss 8960,sackOK,TS val 1660321155 ecr 0,nop,wscale 7], length 0
22:06:39.616633 IP (tos 0x0, ttl 64, id 14533, offset 0, flags [DF], proto TCP (6), length 60)
    172.20.0.2.39784 > 172.20.0.6.73: Flags [S], cksum 0x585f (incorrect -> 0xe436), seq 114680395, win 26880, options [mss 8960,sackOK,TS val 1660329347 ecr 0,nop,wscale 7], length 0
22:06:55.744502 IP (tos 0x0, ttl 64, id 14534, offset 0, flags [DF], proto TCP (6), length 60)
    172.20.0.2.39784 > 172.20.0.6.73: Flags [S], cksum 0x585f (incorrect -> 0xa536), seq 114680395, win 26880, options [mss 8960,sackOK,TS val 1660345475 ecr 0,nop,wscale 7], length 0

So the VM sees the response traffic from the qdhcp netns but drops it because the TCP checksum is invalid.
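
Spelled out end to end, the reproduction looks roughly like this (the network UUID and VM-side interface name are placeholders):

    # on the hypervisor, from inside the qdhcp namespace
    ip netns exec qdhcp-<network-uuid> nc -vz $VM_IP 73

    # on the VM, watch the incoming SYNs arrive with bad checksums
    tcpdump -vvnn -i ens3 'tcp port 73'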

When we turn on DVR and create an (unused) virtual router, the qrouter netns does not have this problem. I have not root-caused why, but there are a number of iptables settings in the qrouter netns that are not in the qdhcp netns and may be required.

So what we are looking for is differences between the setup of the qdhcp netns and that of the qrouter netns.
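
A starting point for that comparison is the interface and offload settings in each namespace (a sketch; UUIDs and device names are placeholders):

    ip netns exec qdhcp-<network-uuid> ip -o link show
    ip netns exec qrouter-<router-uuid> ip -o link show
    ip netns exec qdhcp-<network-uuid> ethtool -k <ns-device> | grep checksum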

summary: - Checksum drop of metadata traffic on isolated provider networks with
- DPDK
+ Checksum drop of metadata traffic on isolated networks with DPDK
David Ames (thedac) wrote :

Also, note that the qrouter netns works with or without the neutron-l3-agent-POSTROUTING checksum-fill rule. This is important, as this checksum fill was removed from the neutron code.

Miguel Lavalle (minsel) on 2019-06-14
Changed in neutron:
importance: Undecided → Medium
David Ames (thedac) wrote :

A bit more research: I tried to find differences in the iptables rules between the qdhcp and qrouter namespaces.

I went so far as to restore an iptables-save dump of the qrouter rules into the qdhcp netns, with no change.
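
For the record, that experiment looked roughly like this (UUIDs are placeholders):

    ip netns exec qrouter-<router-uuid> iptables-save > /tmp/qrouter.rules
    ip netns exec qdhcp-<network-uuid> iptables-restore < /tmp/qrouter.rules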

Any TCP connection (or response) in the qdhcp netns generates an invalid TCP checksum.

Liam Young (gnuoy) wrote :

The issue appears to be that ovs_use_veth was set to True in dhcp_agent.ini. This causes the qdhcp namespace to be connected to the bridge via a veth pair, which appears to leave checksum offloading enabled on the device in the qdhcp namespace. Manually turning off tx-checksum-ip-generic appears to fix the issue, but for a permanent fix, switch ovs_use_veth to False.
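
As a sketch of both the temporary and permanent fixes (namespace UUID and device name are placeholders):

    # temporary workaround: disable checksum offload on the veth end inside the namespace
    ip netns exec qdhcp-<network-uuid> ethtool -K <ns-device> tx-checksum-ip-generic off

    # permanent fix, in dhcp_agent.ini:
    #   [DEFAULT]
    #   ovs_use_veth = False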

Brian Haley (brian-haley) wrote :

Liam - thanks for the information. Since the default value for ovs_use_veth is False in the neutron tree, was it the installer tools that changed the value?

Liam Young (gnuoy) wrote :

Hi Brian, yes it was, I'll be proposing a change to fix that today.

Slawek Kaplonski (slaweq) wrote :

Maybe we should add some warning about it to our docs too?

Nivedita Singhvi (niveditasinghvi) wrote :

Liam, can you clarify what change you're proposing and where?

I'm assuming setting ovs_use_veth=false in dhcp_agent.ini by default?

(and not disabling checksumming somehow).

Liam Young (gnuoy) wrote :

Hi niveditasinghvi. Yes, I'm proposing that ovs_use_veth be set to False rather than manually disabling checksumming. The changes are here: https://review.opendev.org/#/q/topic:bug/1832021+(status:open+OR+status:merged). We are having a few issues with CI, but I hope to have them landed ASAP.

Reviewed: https://review.opendev.org/666277
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=051b58f566dd382f22df937d2e90c06e9ca2faad
Submitter: Zuul
Branch: master

commit 051b58f566dd382f22df937d2e90c06e9ca2faad
Author: Slawek Kaplonski <email address hidden>
Date: Wed Jun 19 14:27:18 2019 +0200

    Update DPDK docs with note about using veth pairs

    When ovs-dpdk is used together with the ``ovs_use_veth`` config
    option set to True, it causes an invalid checksum on all packets sent
    from the qdhcp namespace.
    This commit adds a short note about this limitation to the ovs-dpdk
    config guide.

    Change-Id: I6237abab3d9e625440e95e75f5091d09a1ec44f0
    Related-Bug: #1832021

Reviewed: https://review.opendev.org/666293
Committed: https://git.openstack.org/cgit/openstack/charm-neutron-openvswitch/commit/?id=7578326c592a81e6cc6737f360fc2b15ed175cf0
Submitter: Zuul
Branch: master

commit 7578326c592a81e6cc6737f360fc2b15ed175cf0
Author: Liam Young <email address hidden>
Date: Wed Jun 19 13:29:18 2019 +0000

    Stop using veth pairs to connect qdhcp ns

    veth pairs are currently being used to connect the qdhcp namespace
    to the underlying bridge. This behaviour appears to only be needed
    for old kernels with limited namespaces support (pre trusty).

    Change-Id: If1f669de09e2499e74e88e2b72203047e7f9f957
    Closes-Bug: #1832021

Changed in charm-neutron-openvswitch:
status: New → Fix Committed
Andre Ruiz (andre-ruiz) wrote :

Just a late note: the upgrade was applied to the customer in question and it indeed fixed the problem.

charm neutron-openvswitch-next-359 -> neutron-openvswitch-next-367

James Page (james-page) on 2019-08-07
Changed in charm-neutron-openvswitch:
milestone: none → 19.07
David Ames (thedac) on 2019-08-12
Changed in charm-neutron-openvswitch:
status: Fix Committed → Fix Released