DVR denial of service observed when using DVR+VLAN project networks

Bug #1866725 reported by Tom Carroll
This bug affects 1 person
Affects    Status     Importance    Assigned to    Milestone
neutron    Expired    Undecided     Unassigned

Bug Description

Neutron: 15.0.1
Open vSwitch: 2.12
Linux kernel: 5.3

Summary: If a qr interface transmits data to another hypervisor on the VLAN, a datapath rule is installed that inhibits communication between qr and the virtual machines colocated on that hypervisor. The problematic datapath rule is defined only in terms of the qr in_port, the source MAC address, and the output port; it does not match on the destination MAC address. Consequently, traffic transmitted by the qr interface is disrupted. Floating IPs and routing services are lost until the datapath rule expires or is flushed.

Description:

           hypervisor M1                                     hypervisor M2
====================================              ====================================
            VM1                                                            VM2
         10.10.0.10                                                     10.10.0.20
             |                                                              |
     qr -- br-int -- br-vlan -- ethX -----vlan 500---- ethX -- br-vlan -- br-int -- qr
10.10.0.1                                                                       10.10.0.1

In this setup, VM1 and VM2 are running on hypervisors M1 and M2, respectively. The DVR interface is designated qr. The br-vlan bridge provides access to VLAN ID 500 via interface ethX.

br-vlan has a rule to rewrite the qr source MAC (DVR MAC) to its unique local instance (LOCAL DVR MAC) before transmitting the frames onto the physical network. If this rule is used, a datapath rule DP1 is installed that is structured as:

in_port(qr),eth(src=DVR MAC),eth_type(0x0800),ipv4(frag=no) actions: set(eth(src=LOCAL DVR MAC)),push_vlan(vid=500,pcp=0),ethX

Once any existing datapath flows have expired, all traffic originating from the qr will have its source MAC address rewritten and be forwarded out of ethX. If this happens on M1, it disrupts communication between qr and VM1, including routing and floating IP access. Open vSwitch will not install a new datapath flow, because the existing one already matches the traffic from the qr.
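To illustrate why a flow like DP1 captures traffic destined for the local VM, here is a minimal Python sketch of a wildcarded datapath lookup. It is a simplified model, not Open vSwitch code: the `DatapathFlow` class, the `lookup` function, and the placeholder names (`DVR_MAC`, `VM1_MAC`, `VM2_MAC`) are assumptions made for illustration. The only behaviour taken from the report is that DP1 matches on in_port, source MAC, and eth_type, and leaves the destination MAC unmatched.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class DatapathFlow:
    # Only the fields listed in `match` are compared; every other
    # packet field is treated as wildcarded (simplified model).
    match: Dict[str, str]
    actions: List[str]


def lookup(flows: List[DatapathFlow], packet: Dict[str, str]) -> Optional[DatapathFlow]:
    """Return the first flow whose match fields all equal the packet's fields."""
    for flow in flows:
        if all(packet.get(key) == value for key, value in flow.match.items()):
            return flow
    return None  # in this model, a miss stands for an upcall to userspace


# DP1 as described in the report: no destination MAC in the match.
DP1 = DatapathFlow(
    match={"in_port": "qr", "eth_src": "DVR_MAC", "eth_type": "0x0800"},
    actions=["set(eth(src=LOCAL_DVR_MAC))", "push_vlan(vid=500,pcp=0)", "output:ethX"],
)

# Hypothetical packets leaving the qr interface on M1.
to_vm2 = {"in_port": "qr", "eth_src": "DVR_MAC", "eth_dst": "VM2_MAC", "eth_type": "0x0800"}
to_vm1 = {"in_port": "qr", "eth_src": "DVR_MAC", "eth_dst": "VM1_MAC", "eth_type": "0x0800"}

for name, packet in (("to VM2", to_vm2), ("to VM1", to_vm1)):
    hit = lookup([DP1], packet)
    print(name, "->", hit.actions[-1] if hit else "upcall")
```

In this model both packets hit DP1 and are sent to ethX, so the packet for the local VM1 never reaches its port on br-int.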

To cause this to happen, configure VM1 to communicate with VM2 by bouncing traffic off of qr.
On VM1, configure the routing table

ip route add 10.10.0.20/32 via 10.10.0.1

and then ping 10.10.0.20

On M1, the datapath flow DP1 will be installed and access to VM1 via floating IP will be lost. This is probably not the only way to get the datapath flow installed.

When the system is working properly, the datapath flow explicitly matches both the source and destination MAC addresses.
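As a rough aid for spotting the condition, the following Python sketch flags datapath flow lines whose eth(...) match names a source MAC but no destination MAC. It assumes flow text shaped like the DP1 line quoted in this report; the helper names `eth_match_fields` and `lacks_destination_mac` are hypothetical, and the "healthy" sample line is invented for contrast rather than taken from a real dump.

```python
import re

# Matches the eth(...) clause; "eth_type(" does not match because of the "_type".
ETH_RE = re.compile(r"eth\(([^)]*)\)")


def eth_match_fields(flow: str) -> set:
    """Return the field names inside the eth(...) clause of the match part."""
    # Ignore everything after "actions:" so set(eth(src=...)) is not counted.
    match_part = flow.split("actions:", 1)[0]
    found = ETH_RE.search(match_part)
    if found is None:
        return set()
    fields = set()
    for item in found.group(1).split(","):
        if "=" in item:
            fields.add(item.split("=", 1)[0].strip())
    return fields


def lacks_destination_mac(flow: str) -> bool:
    """True if the flow matches on eth src but leaves eth dst unmatched."""
    fields = eth_match_fields(flow)
    return "src" in fields and "dst" not in fields


# The DP1 line from the report.
suspect = (
    "in_port(qr),eth(src=DVR MAC),eth_type(0x0800),ipv4(frag=no) "
    "actions: set(eth(src=LOCAL DVR MAC)),push_vlan(vid=500,pcp=0),ethX"
)
# Hypothetical flow that matches both MAC addresses.
healthy = (
    "in_port(qr),eth(src=DVR MAC,dst=VM1 MAC),eth_type(0x0800),ipv4(frag=no) "
    "actions: vm1_port"
)

print(lacks_destination_mac(suspect))
print(lacks_destination_mac(healthy))
```

The check is a heuristic only: a flow without a destination MAC match is not necessarily wrong in general, but for traffic entering from the qr port it corresponds to the situation described above.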

Jeremy Stanley (fungi) wrote :

Since this report concerns a possible security risk, an incomplete
security advisory task has been added while the core security
reviewers for the affected project or projects confirm the bug and
discuss the scope of any vulnerability along with potential
solutions.

description: updated
Changed in ossa:
status: New → Incomplete
Slawek Kaplonski (slaweq) wrote :

Is my understanding correct that the user needs to set a route to 10.10.0.20/32 via 10.10.0.1 on their own VM, and that communication to this VM using its floating IP will then be broken? Or will all other floating IPs on the compute node be broken?

tags: added: l3-dvr-backlog
removed: dvr vlan
Tom Carroll (h-thomas-carroll) wrote :

The virtual machines on M1 connected to network 10.10.0.0/24 will have routing and floating IPs disrupted. One cannot use floating IPs to access VM1, nor can machines on different subnets transmit to VM1.

The problem seems to be easy to trigger. Consider the following:

                    monitor1                                          monitor2
               ===================                               ===================
                   vm1
                 10.10.0.10/24                                           vm2
                    |                                               10.10.0.20/24
qr1 10.10.0.1/24 -- |                                                     |
                 br-int - br-vlan - ethX - vlan 500 - ethX - br-vlan - br-int - qr1 10.10.0.1/24
qr2 10.20.0.1/24 -- |
                    |
                  vm3
               10.20.0.10/24

There are two subnetworks. vm1 and vm2 are in network 10.10.0.0/24, while vm3 is in 10.20.0.0/24.

If vm3 pings 10.10.0.20, qr1 will no longer be able to communicate with vm1. While vm3 can successfully ping 10.10.0.20, a datapath flow is generated that sends qr1 traffic to ethX (qr1 -> ethX). The floating IP is then broken for vm1, but remains available for vm3.

Slawek Kaplonski (slaweq) wrote :

Hi Tom,

But I'm still not sure what the security concern is here (if any). If my understanding of your bug description is correct, then connectivity to VM1 can be broken ONLY by configuring a specific route INSIDE VM1. So there is no risk that someone else can attack my VM1 and break connectivity to it. Am I right, or am I missing something here?

Jeremy Stanley (fungi)
description: updated
Jeremy Stanley (fungi) wrote :

The embargo for this report has expired and is now lifted, so it's acceptable to discuss further in public.

description: updated
information type: Private Security → Public Security
Slawek Kaplonski (slaweq) wrote :

I'm marking this bug as Incomplete for now, as the only case I can see is one where the owner of a VM breaks connectivity to their own VM by doing something unusual (strange routing) inside it.
Feel free to reopen it when you can provide more information about how other users could break this connectivity as well.

Changed in neutron:
status: New → Incomplete
Jeremy Stanley (fungi) wrote :

I've switched this to a regular Public bug and removed our security advisory task, but marked it with the security tag as a possible hardening opportunity.

information type: Public Security → Public
no longer affects: ossa
tags: added: security
Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired