Eavesdropping private traffic
=============================

Abstract
--------

We've discovered a security issue that allows end users within their own
private network to receive traffic from, and send traffic to, other private
networks on the same compute node.

Description
-----------

During live migration there is a small time window in which the ports of
instances are untagged. Instances have a port trunked to the integration
bridge and receive 802.1Q tagged private traffic from other tenants.

If the port is administratively down during live migration, the port will
remain in trunk mode indefinitely.

Traffic is possible between ports that are administratively down, even
between the self-service networks of different tenants.

Conditions
----------

The following conditions are necessary.

* Open vSwitch self-service networks
* An OpenStack administrator or an automated process needs to schedule a
  live migration

We tested this on Newton.

Issues
------

This outcome is the result of multiple independent issues. We list the most
important first, and follow with bugs that create a fragile situation.

Issue #1 Initially creating a trunk port

When the port is initially created, it is in trunk mode. This creates a
fail-open situation.
See: https://github.com/openstack/os-vif/blob/newton-eol/vif_plug_ovs/linux_net.py#L52
Recommendation: create ports in the port_dead state, don't leave them
dangling in trunk mode. Add a drop flow initially.

Issue #2 Order of creation

The instance is actually migrated before the (networking) configuration is
completed.
Recommendation: do not finish the live migration until the underlying
configuration has been applied completely.

Issue #3 Not closing the port when it is down

Neutron calls the port_dead function to ensure the port is down. It sets the
tag to 4095 and adds a "drop" flow if (and only if) there is already another
tag on the port. The port_dead function will keep untagged ports untagged.
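The behaviour described in Issue #3 comes down to a single boolean check. The following is a minimal sketch of that check, not the Neutron code itself: the two function names are hypothetical, and an untagged port is assumed to show up as an empty value (None or an empty list).

```python
DEAD_VLAN_TAG = 4095  # dead VLAN tag as named in this report


def needs_dead_vlan_original(cur_tag):
    """Check as described in Issue #3: act only if another tag is already set."""
    return bool(cur_tag and cur_tag != DEAD_VLAN_TAG)


def needs_dead_vlan_fail_closed(cur_tag):
    """Fail-closed variant: also act when no tag is found at all."""
    return bool(not cur_tag or cur_tag != DEAD_VLAN_TAG)


if __name__ == "__main__":
    # None and [] stand in for "no tag" (an assumption of this sketch).
    for tag in (None, [], 10, DEAD_VLAN_TAG):
        print(tag, needs_dead_vlan_original(tag), needs_dead_vlan_fail_closed(tag))
```

With the first check an untagged port is skipped and stays in trunk mode; with the second it is moved to the dead VLAN like any other port that is not yet dead.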
https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L995
Recommendation: Make port_dead also shut the port if no tag is found. Log a
warning if this happens.

Issue #4 Putting the port administratively down actually puts the port on a
compute node shared VLAN

Instances from different projects on different private networks can talk to
each other if they put their ports down. The code does install an OpenFlow
"drop" rule, but it has a lower priority (2) than the allow rules.
Recommendation: Increase the port_dead OpenFlow drop rule priority to MAX.

Timeline
--------

2017-09-14 Discovery eavesdropping issue
2017-09-15 Verify workaround.
2017-10-04 Discovery port-down-traffic issue
2017-11-24 Vendor Disclosure to OpenStack

Steps to reproduce
------------------

1. Attach an instance to two networks:

    admin$ openstack server create --nic net-id= --nic net-id= --image --flavor instance_temp

2. Attach a FIP to the instance to be able to log in to this instance

3. Verify:

    admin$ openstack server show -c name -c addresses fe28a2ee-098f-4425-9d3c-8e2cd383547d
    +-----------+-----------------------------------------------------------------------------+
    | Field     | Value                                                                       |
    +-----------+-----------------------------------------------------------------------------+
    | addresses | network1=192.168.99.8, ; network2=192.168.80.14                             |
    | name      | instance_temp                                                               |
    +-----------+-----------------------------------------------------------------------------+

4. SSH to the instance using network1 and run tcpdump on the other port
   (network2):

    [root@instance_temp]$ tcpdump -eeenni eth1

5. Get the port ID of network2:

    admin$ nova interface-list fe28a2ee-098f-4425-9d3c-8e2cd383547d
    +------------+--------------------------------------+--------------------------------------+---------------+-------------------+
    | Port State | Port ID                              | Net ID                               | IP addresses  | MAC Addr          |
    +------------+--------------------------------------+--------------------------------------+---------------+-------------------+
    | ACTIVE     | a848520b-0814-4030-bb48-49e4b5cf8160 | d69028f7-9558-4f14-8ce6-29cb8f1c19cd | 192.168.80.14 | fa:16:3e:2d:8b:7b |
    | ACTIVE     | fad148ca-cf7a-4839-aac3-a2cd8d1d2260 | d22c22ae-0a42-4e3b-8144-f28534c3439a | 192.168.99.8  | fa:16:3e:60:2c:fa |
    +------------+--------------------------------------+--------------------------------------+---------------+-------------------+

6. Force the port on network2 down:

    admin$ neutron port-update a848520b-0814-4030-bb48-49e4b5cf8160 --admin-state-up False

7. The port gets tagged with VLAN 4095, the dead VLAN tag, which is normal:

    compute1# grep a848520b-0814-4030-bb48-49e4b5cf8160 /var/log/neutron/neutron-openvswitch-agent.log | tail -1
    INFO neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-e008feb3-8a35-4c97-adac-b48ff88165b2 - - - - -] VIF port: a848520b-0814-4030-bb48-49e4b5cf8160 admin state up disabled, putting on the dead VLAN

8. Verify the port is tagged with VLAN 4095:

    compute1# ovs-vsctl show | grep -A3 qvoa848520b-08
        Port "qvoa848520b-08"
            tag: 4095
            Interface "qvoa848520b-08"

9. Now live-migrate the instance:

    admin# nova live-migration fe28a2ee-098f-4425-9d3c-8e2cd383547d

10. Verify the tag is gone on compute2, and take a deep breath:

    compute2# ovs-vsctl show | grep -A3 qvoa848520b-08
        Port "qvoa848520b-08"
            Interface "qvoa848520b-08"
        Port...
    compute2# echo "Wut!"

11. Now traffic of all other self-service networks present on compute2 can
    be sniffed from instance_temp:

    [root@instance_temp] tcpdump -eenni eth1
    13:14:31.748266 fa:16:3e:6a:17:38 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.152, length 28
    13:14:31.804573 fa:16:3e:e8:a2:d2 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.70, length 28
    13:14:31.810482 fa:16:3e:95:ca:3a > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.154, length 28
    13:14:31.977820 fa:16:3e:6f:f4:9b > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.150, length 28
    13:14:31.979590 fa:16:3e:0f:3d:cc > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 9, p 0, ethertype ARP, Request who-has 10.103.9.163 tell 10.103.9.1, length 28
    13:14:32.048082 fa:16:3e:65:64:38 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.101, length 28
    13:14:32.127400 fa:16:3e:30:cb:b5 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.165, length 28
    13:14:32.141982 fa:16:3e:96:cd:b0 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.100, length 28
    13:14:32.205327 fa:16:3e:a2:0b:76 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.153, length 28
    13:14:32.444142 fa:16:3e:1f:db:ed > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 58: vlan 72, p 0, ethertype IPv4, 192.168.99.212 > 224.0.0.18: VRRPv2, Advertisement, vrid 50, prio 103, authtype none, intvl 1s, length 20
    13:14:32.449497 fa:16:3e:1c:24:c0 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.20, length 28
    13:14:32.476015 fa:16:3e:f2:3b:97 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 33, p 0, ethertype ARP, Request who-has 10.0.1.9 tell 10.0.1.22, length 28
    13:14:32.575034 fa:16:3e:44:fe:35 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.163, length 28
    13:14:32.676185 fa:16:3e:1e:92:d7 > ff:ff:ff:ff:ff:ff, ethertype 802.1Q (0x8100), length 46: vlan 10, p 0, ethertype ARP, Request who-has 10.103.12.160 tell 10.103.12.150, length 28
    13:14:32.711755 fa:16:3e:99:6c:c8 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 62: vlan 10, p 0, ethertype IPv4, 10.103.12.154 > 224.0.0.18: VRRPv2, Advertisement, vrid 2, prio 49, authtype simple, intvl 1s, length 24
    13:14:32.711773 fa:16:3e:f5:23:d5 > 01:00:5e:00:00:12, ethertype 802.1Q (0x8100), length 58: vlan 12, p 0, ethertype IPv4, 10.103.15.154 > 224.0.0.18: VRRPv2, Advertisement, vrid 1, prio 49, authtype simple, intvl 1s, length 20

Workaround
----------

We temporarily fixed this issue by forcing the dead VLAN tag on port creation
on compute nodes:

/usr/lib/python2.7/site-packages/vif_plug_ovs/linux_net.py:

     def _create_ovs_vif_cmd(bridge, dev, iface_id, mac,
                             instance_id, interface_type=None,
                             vhost_server_path=None):
    +    # ODCN: initialize port as dead
    +    # ODCN: TODO: set drop flow
         cmd = ['--', '--if-exists', 'del-port', dev, '--',
                'add-port', bridge, dev,
    +           'tag=4095',
                '--', 'set', 'Interface', dev,
                'external-ids:iface-id=%s' % iface_id,
                'external-ids:iface-status=active',
                'external-ids:attached-mac=%s' % mac,
                'external-ids:vm-uuid=%s' % instance_id]
         if interface_type:
             cmd += ['type=%s' % interface_type]
         if vhost_server_path:
             cmd += ['options:vhost-server-path=%s' % vhost_server_path]
         return cmd
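To check whether a compute node currently has instance ports without a tag, the Open vSwitch database can be queried directly. The sketch below is an illustration only: the function names are hypothetical, and filtering on the "qvo" name prefix is an assumption based on the port names shown in this report. It relies on the JSON table output of ovs-vsctl, where an empty tag column is encoded as ["set", []].

```python
import json
import subprocess


def untagged_ports(ovs_json, prefix="qvo"):
    """Return names of Port rows that have no VLAN tag.

    ovs_json is the output of:
        ovs-vsctl --format=json --columns=name,tag list Port
    Only ports whose name starts with `prefix` are considered, so that
    bridge-internal and patch ports are skipped (assumption of this sketch).
    """
    table = json.loads(ovs_json)
    name_idx = table["headings"].index("name")
    tag_idx = table["headings"].index("tag")
    found = []
    for row in table["data"]:
        name, tag = row[name_idx], row[tag_idx]
        # OVSDB encodes an empty optional column as ["set", []].
        if name.startswith(prefix) and tag == ["set", []]:
            found.append(name)
    return found


def list_untagged_ports():
    """Query the local Open vSwitch database; needs ovs-vsctl and privileges."""
    out = subprocess.check_output(
        ["ovs-vsctl", "--format=json", "--columns=name,tag", "list", "Port"])
    return untagged_ports(out.decode())
```

Calling list_untagged_ports() on a compute node right after a live migration should return an empty list once port creation forces the dead VLAN tag; any name it returns is a port still in trunk mode.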
https://github.com/openstack/neutron/blob/stable/newton/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L995

     def port_dead(self, port, log_errors=True):
         '''Once a port has no binding, put it on the "dead vlan".

         :param port: an ovs_lib.VifPort object.
         '''
         # Don't kill a port if it's already dead
         cur_tag = self.int_br.db_get_val("Port", port.port_name, "tag",
                                          log_errors=log_errors)
    +    # ODCN GM 20170915
    +    if not cur_tag:
    +        LOG.error('port_dead(): port %s has no tag', port.port_name)
    +    # ODCN AJS 20170915
    +    if not cur_tag or cur_tag != constants.DEAD_VLAN_TAG:
    -    if cur_tag and cur_tag != constants.DEAD_VLAN_TAG:
             LOG.info('port_dead(): put port %s on dead vlan', port.port_name)
             self.int_br.set_db_attribute("Port", port.port_name, "tag",
                                          constants.DEAD_VLAN_TAG,
                                          log_errors=log_errors)
             self.int_br.drop_port(in_port=port.ofport)

plugins/ml2/drivers/openvswitch/agent/openflow/ovs_ofctl/ovs_bridge.py

     def drop_port(self, in_port):
    +    # ODCN AJS 20171004:
    -    self.install_drop(priority=2, in_port=in_port)
    +    self.install_drop(priority=65535, in_port=in_port)

Regards,

ODC Noord.
Gerhard Muntingh
Albert Siersema
Paul Peereboom