Traffic leaked from dhcp port before vlan tag is applied

Bug #1930414 reported by Bence Romsics
This bug affects 3 people
Affects                      Status        Importance  Assigned to      Milestone
OpenStack Security Advisory  Won't Fix     Undecided   Unassigned
neutron                      Fix Released  High        Bence Romsics

Bug Description

This is a bug with potential security implications. I don't see a clear way to exploit it at the moment, but to err on the safe side, I'm opening this as private to the security team.

Short summary: Using openvswitch-agent, traffic sent on some ports (at least dhcp ports) before ovs-agent applies the port's vlan tag can be seen and intercepted on ports of other networks on the same integration bridge.

We observed this bug:
* using vlan and vxlan networks
* using the noop and openvswitch firewall drivers
* on openstack versions mitaka, pike and master (commit 5a6f61af4a)

The time window between the port's creation and ovs-agent applying its vlan tag is usually very short. We observed this bug in the wild on a heavily loaded host. However, to make the reproduction reliable on lightly loaded systems, I inserted a sleep() into ovs-agent's source (just before the port's vlan tag is set):

$ git --no-pager format-patch --stdout 5a6f61af4a
From 8389b3e8e5c60c81ff2bb262e3ae2e8aab73d3f5 Mon Sep 17 00:00:00 2001
From: Bence Romsics <email address hidden>
Date: Mon, 31 May 2021 13:12:34 +0200
Subject: [PATCH] WIP

Change-Id: Ibef4278a2f6a85f52a8ffa43caef6de38cbb40cb
---
 .../plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py | 1 +
 1 file changed, 1 insertion(+)

diff --git a/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py b/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
index 2c209bd387..355584b325 100644
--- a/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
+++ b/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py
@@ -1190,6 +1190,7 @@ class OVSNeutronAgent(l2population_rpc.L2populationRpcCallBackTunnelMixin,
                 self.setup_arp_spoofing_protection(self.int_br,
                                                    port, port_detail)
             if cur_tag != lvm.vlan:
+                time.sleep(3)
                 self.int_br.set_db_attribute(
                     "Port", port.port_name, "tag", lvm.vlan)

--
2.31.0

We discovered the bug by the following procedure:

* a test environment created multiple neutron nets in short succession with the exact same ipv6 subnet ranges
* therefore neutron selected the exact same ipv6 address for the subnet's dhcp port
* the host running the dhcp-agent and the ovs-agent was heavily loaded
* we observed that many of these networks' services relying on the dhcp port's address (all but one network's, when ovs-agent is made slow enough) were unavailable
* this was because duplicate address detection (DAD) for the ipv6 dhcp port address failed
* we believe DAD failed because of temporary crosstalk between the dhcp namespaces of different networks
* we believe this bug is not ipv6-specific; ipv6's default DAD merely helped us discover it

Exact reproduction steps:

$ date --iso-8601=s
2021-05-31T13:10:14+00:00

# when the ovs-agent is slow enough, even 2 networks are sufficient
$ for i in {1..5}
do
   openstack network create xnet$i
   openstack subnet create xsubnet$i-v6 --ip-version 6 --network xnet$i --subnet-range 2001:db8::/32
done

# for the record
$ openstack subnet list -f value -c Name -c ID | egrep xsubnet
01d614da-820b-418d-8fa4-71952713f0ad xsubnet5-v6
72158e8e-5059-4abb-98a4-5adc9e4ef39c xsubnet2-v6
8f263143-a69b-4c42-b74c-6f30aca7b19d xsubnet4-v6
9ab4159e-12f8-44ed-8947-35a56b62eaf8 xsubnet1-v6
d4ed53e2-7b70-43d7-bd9f-d45f006a8179 xsubnet3-v6

# note that all dhcp ports got the same ip
$ openstack port list --device-owner network:dhcp -f value -c id -c mac_address -c fixed_ips | egrep 2001:db8::
130855be-ead1-40bb-9ca0-5336428aa74b fa:16:3e:24:76:41 [{'subnet_id': '01d614da-820b-418d-8fa4-71952713f0ad', 'ip_address': '2001:db8::1'}]
19fcabfd-f32a-40ea-b68e-ced41f394822 fa:16:3e:43:80:fe [{'subnet_id': '9ab4159e-12f8-44ed-8947-35a56b62eaf8', 'ip_address': '2001:db8::1'}]
46963dbd-c844-4986-a07f-fb78adbd95e9 fa:16:3e:4f:23:bf [{'subnet_id': '72158e8e-5059-4abb-98a4-5adc9e4ef39c', 'ip_address': '2001:db8::1'}]
b8bf2fb1-d52a-41af-90bb-01aa23529015 fa:16:3e:90:40:8e [{'subnet_id': 'd4ed53e2-7b70-43d7-bd9f-d45f006a8179', 'ip_address': '2001:db8::1'}]
ba67f2c0-c714-45fd-aec8-7233ba379dfa fa:16:3e:35:10:8d [{'subnet_id': '8f263143-a69b-4c42-b74c-6f30aca7b19d', 'ip_address': '2001:db8::1'}]

# all but one of the dhcp ports' addresses (and, by the way, the metadata addresses) are in DAD-failed mode
$ for net in $( openstack network list -f value -c Name -c ID | awk '/ xnet/ { print $1 }' ) ; do sudo ip netns exec qdhcp-$net ip a ; done | egrep '(link/ether|inet6 (2001:db8::|fe80::a9fe:a9fe))'
    link/ether fa:16:3e:90:40:8e brd ff:ff:ff:ff:ff:ff
    inet6 2001:db8::1/32 scope global dadfailed tentative
    inet6 fe80::a9fe:a9fe/64 scope link dadfailed tentative
    link/ether fa:16:3e:4f:23:bf brd ff:ff:ff:ff:ff:ff
    inet6 2001:db8::1/32 scope global dadfailed tentative
    inet6 fe80::a9fe:a9fe/64 scope link dadfailed tentative
    link/ether fa:16:3e:24:76:41 brd ff:ff:ff:ff:ff:ff
    inet6 2001:db8::1/32 scope global dadfailed tentative
    inet6 fe80::a9fe:a9fe/64 scope link dadfailed tentative
    link/ether fa:16:3e:35:10:8d brd ff:ff:ff:ff:ff:ff
    inet6 2001:db8::1/32 scope global dadfailed tentative
    inet6 fe80::a9fe:a9fe/64 scope link dadfailed tentative
    link/ether fa:16:3e:43:80:fe brd ff:ff:ff:ff:ff:ff
    inet6 2001:db8::1/32 scope global
    inet6 fe80::a9fe:a9fe/64 scope link

# dmesg also shows the DAD failures
# man dmesg: Be aware that the timestamp could be inaccurate!
$ LC_TIME=en_US sudo dmesg -T
[snip]
[Mon May 31 13:10:10 2021] device tap19fcabfd-f3 entered promiscuous mode
[Mon May 31 13:10:15 2021] device tap46963dbd-c8 entered promiscuous mode
[Mon May 31 13:10:15 2021] IPv6: tap46963dbd-c8: IPv6 duplicate address fe80::a9fe:a9fe used by fa:16:3e:43:80:fe detected!
[Mon May 31 13:10:15 2021] IPv6: tap46963dbd-c8: IPv6 duplicate address 2001:db8::1 used by fa:16:3e:43:80:fe detected!
[Mon May 31 13:10:18 2021] device tapb8bf2fb1-d5 entered promiscuous mode
[Mon May 31 13:10:20 2021] IPv6: tapb8bf2fb1-d5: IPv6 duplicate address 2001:db8::1 used by fa:16:3e:43:80:fe detected!
[Mon May 31 13:10:20 2021] IPv6: tapb8bf2fb1-d5: IPv6 duplicate address fe80::a9fe:a9fe used by fa:16:3e:43:80:fe detected!
[Mon May 31 13:10:22 2021] device tapba67f2c0-c7 entered promiscuous mode
[Mon May 31 13:10:23 2021] IPv6: tapba67f2c0-c7: IPv6 duplicate address 2001:db8::1 used by fa:16:3e:43:80:fe detected!
[Mon May 31 13:10:24 2021] IPv6: tapba67f2c0-c7: IPv6 duplicate address fe80::a9fe:a9fe used by fa:16:3e:43:80:fe detected!
[Mon May 31 13:10:26 2021] device tap130855be-ea entered promiscuous mode
[Mon May 31 13:10:27 2021] IPv6: tap130855be-ea: IPv6 duplicate address 2001:db8::1 used by fa:16:3e:43:80:fe detected!
[Mon May 31 13:10:28 2021] IPv6: tap130855be-ea: IPv6 duplicate address fe80::a9fe:a9fe used by fa:16:3e:43:80:fe detected!

# while there were no errors in ovs-agent
$ sudo LC_TIME=en_US journalctl -u devstack@q-agt -S '2021-05-31 13:10:14' | egrep ERROR
[empty]

# clean up
$ openstack network list | awk '/xnet/ { print $2 }' | xargs -r openstack network delete

For a bit of further analysis I created only two networks and ran tcpdump on the first one's dhcp port while creating the second:

$ openstack network create xnet0
$ openstack subnet create xsubnet0-v6 --ip-version 6 --network xnet0 --subnet-range 2001:db8::/32

# run tcpdump on the 1st net's dhcp port
$ sudo ip netns exec qdhcp-$( openstack network show -f value -c id xnet0 ) tcpdump -n -vvv -i tapcaa92c34-53

# create the 2nd net while tcpdump is already running
$ openstack network create xnet1
$ openstack subnet create xsubnet1-v6 --ip-version 6 --network xnet1 --subnet-range 2001:db8::/32

# the 2nd net's dhcp port's mac address
$ openstack port list --device-owner network:dhcp --network xnet1 -f value -c mac_address
fa:16:3e:0d:be:aa

# tcpdump's capture contains packets with the 2nd net's dhcp port's mac address
tcpdump: listening on tapcaa92c34-53, link-type EN10MB (Ethernet), capture size 262144 bytes
^C14:24:14.541893 fa:16:3e:0d:be:aa > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 90: (hlim 1, next-header Options (0) payload length: 36) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 1 group record(s) [gaddr ff02::1:ff0d:beaa to_ex, 0 source(s)]
14:24:14.929686 fa:16:3e:0d:be:aa > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 110: (hlim 1, next-header Options (0) payload length: 56) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 2 group record(s) [gaddr ff02::1:fffe:a9fe to_ex, 0 source(s)] [gaddr ff02::1:ff0d:beaa to_ex, 0 source(s)]
14:24:14.985673 fa:16:3e:0d:be:aa > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 130: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff00:1 to_ex, 0 source(s)] [gaddr ff02::1:fffe:a9fe to_ex, 0 source(s)] [gaddr ff02::1:ff0d:beaa to_ex, 0 source(s)]
14:24:14.993930 fa:16:3e:0d:be:aa > 33:33:ff:0d:be:aa, ethertype IPv6 (0x86dd), length 86: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:ff0d:beaa: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::f816:3eff:fe0d:beaa
          unknown option (14), length 8 (1):
          0x0000: bfdb db94 d695
14:24:15.209668 fa:16:3e:0d:be:aa > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 130: (hlim 1, next-header Options (0) payload length: 76) :: > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff00:1 to_ex, 0 source(s)] [gaddr ff02::1:fffe:a9fe to_ex, 0 source(s)] [gaddr ff02::1:ff0d:beaa to_ex, 0 source(s)]
14:24:15.858999 fa:16:3e:0d:be:aa > 33:33:ff:00:00:01, ethertype IPv6 (0x86dd), length 86: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:ff00:1: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has 2001:db8::1
          unknown option (14), length 8 (1):
          0x0000: 6e67 6a99 e72e
14:24:15.859815 fa:16:3e:6a:ed:cb > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 86: (hlim 255, next-header ICMPv6 (58) payload length: 32) 2001:db8::1 > ff02::1: [icmp6 sum ok] ICMP6, neighbor advertisement, length 32, tgt is 2001:db8::1, Flags [override]
          destination link-address option (2), length 8 (1): fa:16:3e:6a:ed:cb
14:24:15.860188 fa:16:3e:0d:be:aa > 33:33:ff:fe:a9:fe, ethertype IPv6 (0x86dd), length 86: (hlim 255, next-header ICMPv6 (58) payload length: 32) :: > ff02::1:fffe:a9fe: [icmp6 sum ok] ICMP6, neighbor solicitation, length 32, who has fe80::a9fe:a9fe
          unknown option (14), length 8 (1):
          0x0000: 11e1 9aab 157a
14:24:15.860731 fa:16:3e:6a:ed:cb > 33:33:00:00:00:01, ethertype IPv6 (0x86dd), length 86: (hlim 255, next-header ICMPv6 (58) payload length: 32) fe80::a9fe:a9fe > ff02::1: [icmp6 sum ok] ICMP6, neighbor advertisement, length 32, tgt is fe80::a9fe:a9fe, Flags [override]
          destination link-address option (2), length 8 (1): fa:16:3e:6a:ed:cb
14:24:16.017855 fa:16:3e:0d:be:aa > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 130: (hlim 1, next-header Options (0) payload length: 76) fe80::f816:3eff:fe0d:beaa > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff00:1 to_ex, 0 source(s)] [gaddr ff02::1:fffe:a9fe to_ex, 0 source(s)] [gaddr ff02::1:ff0d:beaa to_ex, 0 source(s)]
14:24:16.137879 fa:16:3e:0d:be:aa > 33:33:00:00:00:16, ethertype IPv6 (0x86dd), length 130: (hlim 1, next-header Options (0) payload length: 76) fe80::f816:3eff:fe0d:beaa > ff02::16: HBH (rtalert: 0x0000) (padn) [icmp6 sum ok] ICMP6, multicast listener report v2, 3 group record(s) [gaddr ff02::1:ff00:1 to_ex, 0 source(s)] [gaddr ff02::1:fffe:a9fe to_ex, 0 source(s)] [gaddr ff02::1:ff0d:beaa to_ex, 0 source(s)]

11 packets captured
11 packets received by filter
0 packets dropped by kernel
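
(For illustration, not part of the original report: the same check can be scripted. A minimal sketch, assuming scapy is available and that it runs as root inside xnet0's qdhcp namespace; the tap name and the port's own MAC below are taken from the capture above.)

# Minimal sketch: flag frames on the dhcp tap whose source MAC is not
# the port's own, i.e. the crosstalk traffic captured above.
# Assumptions: scapy installed, run as root inside the qdhcp namespace.
from scapy.all import sniff

OWN_MAC = "fa:16:3e:6a:ed:cb"   # xnet0's dhcp port MAC (seen in the capture)
IFACE = "tapcaa92c34-53"        # the tap device tcpdump listened on

def flag_foreign(pkt):
    # any other source MAC means traffic leaked in from another network
    if pkt.src.lower() != OWN_MAC:
        print("foreign frame:", pkt.src, ">", pkt.dst, "-", pkt.summary())

sniff(iface=IFACE, prn=flag_foreign, store=False, timeout=60)

Running this while creating xnet1 should print the fa:16:3e:0d:be:aa frames shown above.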

At the moment I have not yet tested:
* whether the leaked traffic can be intercepted on a vm port
* whether a vm port can similarly leak traffic
* whether ovs-agent can be attacked to intentionally slow it down

Revision history for this message
Jeremy Stanley (fungi) wrote :

Since this report concerns a possible security risk, an incomplete
security advisory task has been added while the core security
reviewers for the affected project or projects confirm the bug and
discuss the scope of any vulnerability along with potential
solutions.

description: updated
Changed in ossa:
status: New → Incomplete
Revision history for this message
Bence Romsics (bence-romsics) wrote :

Interestingly, when I modify the last test from the original report to run tcpdump inside a vm, on its vnic attached to xnet0 (instead of on the dhcp port), I don't see the crosstalk traffic. From a security perspective that sounds good, though I don't understand the reason for the difference.

Revision history for this message
Slawek Kaplonski (slaweq) wrote :

This is a bit strange to me, as our ovs_lib's replace_port method sets DEAD_VLAN_TAG on such ports in the same transaction in which the port is created. See https://github.com/openstack/neutron/blob/0bdf3b56e0d4ede2d46eed09a4bb07dd3c00807d/neutron/agent/common/ovs_lib.py#L348 and that is called from the openvswitch interface driver: https://github.com/openstack/neutron/blob/master/neutron/agent/linux/interface.py#L376

Can you check if your DHCP ports got that 4095 vlan tag set properly?
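
(For illustration: the atomic create-and-tag behaviour described here can be approximated from the command line. A minimal sketch with a hypothetical port name, issuing both operations in a single ovs-vsctl invocation and therefore in a single OVSDB transaction; this is not neutron's actual code:)

# Minimal sketch, not neutron's actual code: create the port and set the
# dead vlan tag in one ovs-vsctl invocation, i.e. one OVSDB transaction,
# mirroring the idea of ovs_lib's replace_port.
import subprocess

DEAD_VLAN_TAG = 4095
BRIDGE = "br-int"        # assumption: the integration bridge name
PORT = "tap-example"     # hypothetical port name

subprocess.run(
    ["ovs-vsctl",
     "--", "--may-exist", "add-port", BRIDGE, PORT,
     "--", "set", "Port", PORT, "tag=%d" % DEAD_VLAN_TAG],
    check=True,
)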

Revision history for this message
Bence Romsics (bence-romsics) wrote :

Hi Slawek,

Thanks for your answer.

First, I made a mistake above. Python's sleep() of course takes seconds, not milliseconds, so the sleep I inserted was meant to be sleep(3), not sleep(3000). Having said that, I can still reproduce the DAD failure. And now I think I understand the bug a bit better.

This is happening in ovs while creating two networks:

$ sudo ovsdb-client monitor Port name,tag
[snip]
row action name tag
------------------------------------ ------ -------------- ----
0ed93be1-1cd7-4029-b38a-46236ec89ae9 insert tapdfdd492f-bc 4095

row action name tag
------------------------------------ ------ -------------- ----
6dac850b-078d-4fab-91f1-4a4d856dd69c insert tapddce8cfc-13 4095

row action name tag
------------------------------------ ------ -------------- ----
0ed93be1-1cd7-4029-b38a-46236ec89ae9 old 4095
                                     new tapdfdd492f-bc 6

row action name tag
------------------------------------ ------ -------------- ----
6dac850b-078d-4fab-91f1-4a4d856dd69c old 4095
                                     new tapddce8cfc-13 7

I believe the "crosstalk" happens while both ports have the DEAD_VLAN_TAG. So crosstalk is probably possible between dhcp ports (ports having the DEAD_VLAN_TAG), but not between dhcp and vm ports or between vm ports. That is good news, because it suggests there is no security implication: no tenant has direct access to the traffic of a dhcp port, right? There is still a slight chance that the wrong dhcp server (over which a potential attacker has some control) could respond, but that is not something I could ever trigger or observe.

Maybe this is just an ordinary bug that manifests when ovs-agent is very slow.

What do you think?
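
(For illustration: the dead-vlan window can be measured by watching the same OVSDB updates as above. A minimal sketch, assuming ovsdb-client's csv output format; the exact column layout is an assumption based on the table output in this comment and may need adjusting:)

# Minimal sketch: time how long each port spends on the dead vlan (4095)
# before ovs-agent assigns its real tag. Run as root. The csv column
# layout (row,action,name,tag) is assumed from the table output above.
import subprocess
import time

proc = subprocess.Popen(
    ["ovsdb-client", "monitor", "Port", "name,tag", "--format=csv"],
    stdout=subprocess.PIPE, text=True,
)

dead_since = {}  # port name -> monotonic time when tagged 4095
for line in proc.stdout:
    parts = [p.strip().strip('"') for p in line.strip().split(",")]
    if len(parts) < 4:
        continue
    _row, action, name, tag = parts[:4]
    if action == "insert" and tag == "4095":
        dead_since[name] = time.monotonic()
    elif action == "new" and name in dead_since and tag != "4095":
        window = time.monotonic() - dead_since.pop(name)
        print("%s: %.3fs on the dead vlan, new tag %s" % (name, window, tag))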

Changed in neutron:
assignee: nobody → Bence Romsics (bence-romsics)
Revision history for this message
Jeremy Stanley (fungi) wrote :

It's been well over a month now and still nobody's come up with a practical exploit scenario whereby a malicious party could cause and then take advantage of this condition. Is there a strong desire to continue (very slowly) discussing this in private? If not, we should switch it to public security both to help determine relative criticality and so as to make it easier to design a possible fix.

Revision history for this message
Slawek Kaplonski (slaweq) wrote :

I agree with Bence. It doesn't seem like a real security bug that we should keep private. IMO we can make it public now.

Revision history for this message
Jeremy Stanley (fungi) wrote :

Thanks, both of you. I've switched the report to public now but left it as a suspected vulnerability for the moment, pending further feedback.

description: updated
information type: Private Security → Public Security
Revision history for this message
Bence Romsics (bence-romsics) wrote :

I believe I managed to reproduce this bug without OpenStack, so it seems quite likely to me that the root cause is an ovs-internal bug.

On my Linux laptop I installed ovs (2.15.0) and created a bridge called br0.
I defined and started a virtual machine with the attached libvirt domain xml file (vm0.xml).
In that file there are two interfaces; I used the first for management access.
The second interface was plugged into br0 (vnet8 as seen from the host, enp3s0 from the guest).

From the libvirt vm I configured a vlan interface:

# ip link add link enp3s0 name enp3s0.4094 type vlan id 4094
# ip link set up dev enp3s0
# ip link set up dev enp3s0.4094 # maybe not even needed
# ip address add 10.0.0.1/24 dev enp3s0.4094

Here I could not use Neutron's vlan 4095, because Linux correctly recognizes this as a reserved vlan id and rejects it.
So everywhere I used vlan id 4094 instead.

Outside the vm, on my laptop, I added the catchall drop rule for this vlan:

# ovs-ofctl add-flow br0 cookie=0xdeadbeef,priority=65535,vlan_tci=0x0ffe/0x1fff,actions=drop
# ovs-ofctl dump-flows br0
 cookie=0xdeadbeef, duration=14166.655s, table=0, n_packets=0, n_bytes=0, priority=65535,vlan_tci=0x0ffe/0x1fff actions=drop
 cookie=0x0, duration=24492.684s, table=0, n_packets=11796, n_bytes=543744, priority=0 actions=NORMAL

For the sake of completeness I first tried with the port in the state libvirt plugged it, that is, in its original trunk state.
I also tried with the port tagged:

# ovs-vsctl set port vnet8 tag=4094

In both tries I saw the same outcome.

From the vm I started pinging 10.0.0.2. Of course nobody answers there, but the ping generated arp requests.

As you can see in the above output, all traffic generated by me (and other traffic too) was caught by the NORMAL action rule and not by the drop rule.

I also saw the traffic tagged with vlan 4094 in 'tcpdump -e -i vnet8'.
The packet counter of the NORMAL action rule was also counting as I started the ping, and it stopped when I stopped the ping.
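
(For illustration: the packet-counter check can be scripted. A minimal sketch parsing ovs-ofctl dump-flows output; run it before and after the ping to see which rule's counter moves:)

# Minimal sketch: print the per-flow packet counters of br0, keyed by
# cookie, to confirm which rule matches the generated traffic. In the
# repro above, cookie=0x0 (NORMAL) counts while 0xdeadbeef (drop) stays 0.
import re
import subprocess

out = subprocess.run(
    ["ovs-ofctl", "dump-flows", "br0"],
    check=True, capture_output=True, text=True,
).stdout

for line in out.splitlines():
    m = re.search(r"cookie=(0x[0-9a-fA-F]+).*?n_packets=(\d+)", line)
    if m:
        print("cookie %s: %s packets" % m.groups())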

I used the following software versions. Open vSwitch used the Linux kernel module datapath.

$ ovs-vsctl --version
ovs-vsctl (Open vSwitch) 2.15.0
DB Schema 8.2.0

$ libvirtd --version
libvirtd (libvirt) 7.0.0

$ uname -a
Linux dfx 5.10.0-6-amd64 #1 SMP Debian 5.10.28-1 (2021-04-09) x86_64 GNU/Linux

Revision history for this message
Jeremy Stanley (fungi) wrote :

Thanks Bence! Based on your findings, the OpenStack VMT will consider this a class C2 report (vulnerability in a dependency) per our taxonomy:

https://security.openstack.org/vmt-process.html#report-taxonomy

If new information comes to light, please update this bug and we'll revisit the classification.

information type: Public Security → Public
Changed in ossa:
status: Incomplete → Won't Fix
tags: added: security
Changed in neutron:
importance: Undecided → High
Revision history for this message
Bence Romsics (bence-romsics) wrote (last edit ):

I have updates to this bug report.
First, in my previous analysis I made a mistake and skipped over a bit erroneously set by neutron.
Now I have a patch fixing that part of the bug; I am uploading it soon.
However, even after applying that fix, the original error is still present, so there must be another part to this bug.
Despite the long time since this report was opened, we did not succeed in identifying the root cause of the remaining part.
I can only speculate about that root cause (and about which component it is in); please see the end of this comment.

Just to sum it up again, I insert here the reproduction that reflects the partial fix I'll upload.
The one-bit difference is in the vlan_tci field.
I also succeeded in reproducing the bug with dpdk-based ovs.
I attached the libvirt domain definition xml I used in the dpdk environment.

This reproduction works for me every time, both with the dpdk-based userspace ovs datapath and the default linux kernel module ovs datapath.
However, colleagues reported to me that on various other dpdk-based deployments these reproduction steps do not work.
I'll return to this strange fact at the end.

So the steps (this time in a dpdk env):

[edit: added the missing add-port command after testing]

host# ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
host# ovs-vsctl add-port br0 vhu-vm0-if0 -- set Interface vhu-vm0-if0 type=dpdkvhostuserclient options:vhost-server-path=/tmp/vhu-vm0-if0
host# virsh define vm0-dpdk.xml
host# virsh start vm0

guest# ip link set up dev enp3s0
guest# ip address add 10.0.0.1/24 dev enp3s0

host# ovs-vsctl set port vhu-vm0-if0 tag=4094

guest# ping 10.0.0.2

Please note that the bit in vlan_tci at position 0x1000 is now different.
I believe vlan_tci=0x1ffe/0x1fff is equivalent to dl_vlan=4094.
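
(For illustration: the arithmetic behind this equivalence can be checked directly. A short worked sketch; per ovs-fields(7), bit 0x1000 of vlan_tci marks the presence of an 802.1Q header:)

# The vlan_tci extension field: bit 0x1000 marks that an 802.1Q header
# is present, and the low 12 bits (mask 0x0fff) carry the VLAN ID.
VLAN_PRESENT = 0x1000
VID_MASK = 0x0fff

assert VLAN_PRESENT | 4094 == 0x1ffe  # vlan_tci=0x1ffe/0x1fff, i.e. dl_vlan=4094
assert VLAN_PRESENT | 4095 == 0x1fff  # neutron's dead vlan: vlan_tci=0x1fff/0x1fff
# The earlier value 0x0ffe lacks the presence bit, so with mask 0x1fff it
# cannot match any tagged frame.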

host# ovs-ofctl add-flow br0 cookie=0x1,priority=65535,vlan_tci=0x1ffe/0x1fff,actions=drop
host# ovs-ofctl dump-flows br0

In the dump-flows output you can observe that the flow with cookie=0x1 does not match the arp traffic.
I believe it should.

The presence of the traffic can also be verified with tcpdump:

host# ovs-tcpdump -vne -i vhu-vm0-if0
host# ovs-tcpdump -vne -i vhu-vm0-if0 ether src 52:54:00:2a:d8:a0

Where the MAC is the proper vNIC's MAC.

In summary, I believe neutron's dead vlan is leaking.
During my investigation I found a problem, for which I'm uploading a fix.
But that fix does not solve the original problem.
The remaining problem, I believe, can be:

* a bug in neutron: not configuring ovs as it should be configured to drop all traffic of the dead vlan
* a bug in ovs: traffic not matching a flow it should be matching

However, if ovs had such a bug, it would probably be widely noticed, since this is a very basic flow match.
The fact that it has not been noticed, and the failed reproductions in other environments (which I personally never saw), also hint at another possibility: that there is some slight difference between the environments where the bug can and cannot be reproduced; however, we could not find that difference so far.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/820897

Changed in neutron:
status: New → In Progress
Revision history for this message
Bence Romsics (bence-romsics) wrote :

To get some help with understanding the remaining part of the bug's root cause I sent the following mail to <email address hidden>:
https://mail.openvswitch.org/pipermail/ovs-discuss/2021-December/051646.html

Revision history for this message
Bence Romsics (bence-romsics) wrote :

We have got a really helpful response:
https://mail.openvswitch.org/pipermail/ovs-discuss/2021-December/051647.html

Now I think I understand how to approach a fix.

Revision history for this message
Mark Goddard (mgoddard) wrote :

I hit https://bugs.launchpad.net/neutron/+bug/1953165, which is a duplicate of this issue. I commented there because it seemed closer to my issue than this more generic report.

https://bugs.launchpad.net/neutron/+bug/1953165/comments/7

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/neutron/+/825122

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/825123

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/825124

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/820897
Committed: https://opendev.org/openstack/neutron/commit/7aae31c9f9ed938760ca0be3c461826b598c7004
Submitter: "Zuul (22348)"
Branch: master

commit 7aae31c9f9ed938760ca0be3c461826b598c7004
Author: Bence Romsics <email address hidden>
Date: Tue Oct 5 17:02:41 2021 +0200

    Make the dead vlan actually dead

    All ports plugged into the dead vlan (DEAD_VLAN_TAG 4095 or 0xfff)
    should not be able to send or receive traffic. We install a flow
    to br-int to drop all traffic of the dead vlan [1]. However before
    this patch the flow we install looks like:

    priority=65535,vlan_tci=0x0fff/0x1fff actions=drop

    Which is wrong and it usually does not match anything.

    According to ovs-fields (7) section Open vSwitch Extension VLAN Field,
    VLAN TCI Field [2] (see especially the usage example
    vlan_tci=0x1123/0x1fff) we need to explicitly set the bit 0x1000
    to match the presence of an 802.1Q header.

    Setting that bit this flow becomes:
    priority=65535,vlan_tci=0x1fff/0x1fff actions=drop

    which is equivalent to:
    priority=65535,dl_vlan=4095 actions=drop

    which should match and drop dead vlan traffic.

    However there's a second problem: ovs access ports were designed to
    work together with the NORMAL action. The NORMAL action considers the
    vlan of an access port, but the openflow pipeline does not. An openflow
    rule does not see the vlan set for an access port, because that vlan
    tag is only pushed to the frame if and when the frame leaves the switch
    on a trunk port [3][4].

    So we have to explicitly push the DEAD_VLAN_TAG if we want the dead
    vlan's drop flow match anything.

    That means we are adding a flow to push the dead vlan tag from
    dhcp-agent/l3-agent but we are deleting that flow from ovs-agent right
    after ovs-agent sets the vlan tag of the port to a non-dead vlan. Which
    is ugly but we have to keep adding the flow as early as possible if we
    want to minimize the window until frames can leak onto the dead vlan.
    Even with this change there's a short time window in which the dead vlan
    could theoretically leak.

    [1] https://opendev.org/openstack/neutron/src/commit/ecdc11a56448428f77f5a64fd028f1e4c9644ea3/neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py#L60-L62
    [2] http://www.openvswitch.org/support/dist-docs/ovs-fields.7.html
    [3] https://mail.openvswitch.org/pipermail/ovs-discuss/2021-December/051647.html
    [4] https://docs.openvswitch.org/en/latest/faq/vlan/
        see 'Q: My OpenFlow controller doesn’t see the VLANs that I expect.'

    Change-Id: Ib6b70114efb140cf1393b57ebc350fea4b0a2443
    Closes-Bug: #1930414

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/825122
Committed: https://opendev.org/openstack/neutron/commit/5025a6a7274df13436fe540a044710337a3d8236
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 5025a6a7274df13436fe540a044710337a3d8236
Author: Bence Romsics <email address hidden>
Date: Tue Oct 5 17:02:41 2021 +0200

    Make the dead vlan actually dead

    [commit message body identical to the master commit 7aae31c9f9ed above]

    Change-Id: Ib6b70114efb140cf1393b57ebc350fea4b0a2443
    Closes-Bug: #1930414
    (cherry picked from commit 7aae31c9f9ed938760ca0be3c461826b598c7004)

tags: added: in-stable-xena
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/825123
Committed: https://opendev.org/openstack/neutron/commit/9f5b745a5eaceb503b5a4eb85f786dab7e8e071e
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 9f5b745a5eaceb503b5a4eb85f786dab7e8e071e
Author: Bence Romsics <email address hidden>
Date: Tue Oct 5 17:02:41 2021 +0200

    Make the dead vlan actually dead

    [commit message body identical to the master commit 7aae31c9f9ed above]

    Conflicts:
        neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py
            Trivial merge conflict by changed adjacent lines.

    Change-Id: Ib6b70114efb140cf1393b57ebc350fea4b0a2443
    Closes-Bug: #1930414
    (cherry picked from commit 7aae31c9f9ed938760ca0be3c461826b598c7004)
    (cherry picked from commit 5025a6a7274df13436fe540a044710337a3d8236)

tags: added: in-stable-wallaby
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (stable/victoria)

Related fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/826291

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/826460

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (stable/victoria)

Change abandoned by "Bence Romsics <email address hidden>" on branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/826291

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to neutron (master)

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/827315

description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by "Bence Romsics <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/neutron/+/826460
Reason: https://review.opendev.org/c/openstack/neutron/+/827315

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/827315
Committed: https://opendev.org/openstack/neutron/commit/0ddca284542aed89df4a22607a2da03f193f083c
Submitter: "Zuul (22348)"
Branch: master

commit 0ddca284542aed89df4a22607a2da03f193f083c
Author: Oleg Bondarev <email address hidden>
Date: Tue Feb 1 18:56:02 2022 +0300

    Make sure "dead vlan" ports cannot transmit packets

    https://review.opendev.org/c/openstack/neutron/+/820897 added
    a dead vlan flow that pushes the dead vlan tag onto frames
    belonging to dead ports before these ports are reassigned to
    their proper vlans. However add_flow and delete_flows race and
    delete_flows may run before add_flow, in this case deleting 0 flows
    but not giving us a chance to detect this: neither does it throw
    an error nor does it return the number of deleted flows.
    This leads to port staying inaccessible forever and hence
    breaks corresponding DHCP or router.

    Current patch suggests another approach to make sure no packets are
    leaked from newly plugged ports: setting their "vlan_mode" attribute
    to "trunk" and "trunks"=[4095] (along with assigning dead VLAN tag).
    With this OVS normal pipeline will allow only packets tagged with 4095
    from such ports [1], which normally not happens, but even if it does -
    default rule in br-int will drop them anyway.
    Thus untagged packets from such ports will also be dropped until
    ovs agent sets proper VLAN tag and clears vlan_mode to default
    ("access").

    This approach avoids the race between dhcp/l3 and ovs agents because
    dhcp/l3 agents no longer modify flow table.

    This partially reverts commit 7aae31c9f9ed938760ca0be3c461826b598c7004

    [1] https://docs.openvswitch.org/en/latest/ref/ovs-actions.7/?highlight=ovs-actions#the-ovs-normal-pipeline

    Closes-Bug: #1930414
    Closes-Bug: #1959564
    Change-Id: I0391dd24224f8656a09ddb002e7dae8783ba37a4
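
(For illustration: the port state this approach sets up can be reproduced from the command line. A minimal sketch with a hypothetical port name; this is not the merged neutron code itself:)

# Minimal sketch: put a newly plugged port on the dead vlan and restrict
# it to trunk mode allowing only vlan 4095, as the commit above describes.
# Untagged frames from the port are then dropped by the NORMAL pipeline.
import subprocess

PORT = "tap-example"  # hypothetical port name

subprocess.run(
    ["ovs-vsctl", "set", "Port", PORT,
     "tag=4095", "vlan_mode=trunk", "trunks=4095"],
    check=True,
)

# Later, ovs-agent assigns the real vlan and clears the trunk settings,
# conceptually something like (hypothetical values):
#   ovs-vsctl remove Port tap-example trunks 4095 \
#       -- set Port tap-example vlan_mode=access tag=6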

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/neutron/+/828230

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/neutron/+/828231

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/828230
Committed: https://opendev.org/openstack/neutron/commit/78c63d4ec6a94ba7bf9efb576850f7b38f1f8722
Submitter: "Zuul (22348)"
Branch: stable/xena

commit 78c63d4ec6a94ba7bf9efb576850f7b38f1f8722
Author: Oleg Bondarev <email address hidden>
Date: Tue Feb 1 18:56:02 2022 +0300

    Make sure "dead vlan" ports cannot transmit packets

    [commit message body identical to the master commit 0ddca284542a above]

    Closes-Bug: #1930414
    Closes-Bug: #1959564
    Change-Id: I0391dd24224f8656a09ddb002e7dae8783ba37a4
    (cherry picked from commit 0ddca284542aed89df4a22607a2da03f193f083c)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/828231
Committed: https://opendev.org/openstack/neutron/commit/9d5cea0e2bb85b3b6ea27eb71279c57c419b0102
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 9d5cea0e2bb85b3b6ea27eb71279c57c419b0102
Author: Oleg Bondarev <email address hidden>
Date: Tue Feb 1 18:56:02 2022 +0300

    Make sure "dead vlan" ports cannot transmit packets

    [commit message body identical to the master commit 0ddca284542a above]

    Closes-Bug: #1930414
    Closes-Bug: #1959564
    Change-Id: I0391dd24224f8656a09ddb002e7dae8783ba37a4
    (cherry picked from commit 0ddca284542aed89df4a22607a2da03f193f083c)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/828497

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/neutron/+/828497
Committed: https://opendev.org/openstack/neutron/commit/88abffcb48d3237a3758e415cc9db86c312b0f68
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 88abffcb48d3237a3758e415cc9db86c312b0f68
Author: Bence Romsics <email address hidden>
Date: Tue Oct 5 17:02:41 2021 +0200

    Make sure "dead vlan" ports cannot transmit packets

    [commit message body essentially identical to the master commit 0ddca284542a above, with this backport-only note added:]

    backport-only:
    On newer than victoria branches we merged two changes of which the first
    got problematic side effects (a race condition between agents
    manipulating flows) and a second change partially reverting the first
    and using the above mentioned other approach. The first change never got
    merged on victoria or older, so this backport squashes the two changes
    together.


    Conflicts:
        neutron/plugins/ml2/drivers/openvswitch/agent/openflow/native/br_int.py

    Closes-Bug: #1930414
    Change-Id: I0391dd24224f8656a09ddb002e7dae8783ba37a4
    (squashed from cherry-picked commits 7aae31c9f9ed938760ca0be3c461826b598c7004, 0ddca284542aed89df4a22607a2da03f193f083c)

tags: added: in-stable-victoria
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (stable/victoria)

Change abandoned by "Bence Romsics <email address hidden>" on branch: stable/victoria
Review: https://review.opendev.org/c/openstack/neutron/+/825124
Reason: https://review.opendev.org/c/openstack/neutron/+/828497

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 20.0.0.0rc1

This issue was fixed in the openstack/neutron 20.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 17.4.0

This issue was fixed in the openstack/neutron 17.4.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 18.3.0

This issue was fixed in the openstack/neutron 18.3.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 19.2.0

This issue was fixed in the openstack/neutron 19.2.0 release.

Revision history for this message
Pavlo Shchelokovskyy (pshchelo) wrote :

Is there any recommended procedure for applying this change now that it is backported? Should all the agents be stopped first, so that e.g. old L3 agents do not run at the same time as a new OVS agent? Or can old and new (patched and unpatched) combinations of agents run on the same node during the update?

We ask because we are now investigating a failure during a victoria update that seems connected with this patch and with different versions of agents running at the same time...

Revision history for this message
Oleg Bondarev (obondarev) wrote :

@Pavlo, actually yes: the recommended procedure is to upgrade the neutron OVS agent first and only after that upgrade the l3/dhcp agents. Running new l3/dhcp agents together with old OVS agents on the same node may cause connectivity issues.
