Modifying security groups when using openvswitch firewall causes existing connections to drop

Bug #1731953 reported by James Denton
This bug affects 2 people
Affects: neutron
Status: Confirmed
Importance: Medium
Assigned to: Slawek Kaplonski

Bug Description

Environment: OpenStack Newton
Driver: ML2 w/ OVS
Firewall: openvswitch

Clients using an OpenStack cloud based on the Newton release see existing connections stall when security groups or their rules are updated. We can reproduce the issue by modifying the rules of a security group that is already applied to a port.

Test scenario:
--------------
1. Build a test instance. Example:

root@osctrl-utility-container-8ad9622f:~# openstack server show rackspace-jamesdenton-01
WARNING: openstackclient.common.utils is deprecated and will be removed after Jun 2017. Please use osc_lib.utils
+--------------------------------------+----------------------------------------------------------------------------+
| Field | Value |
+--------------------------------------+----------------------------------------------------------------------------+
| OS-DCF:diskConfig | MANUAL |
| OS-EXT-AZ:availability_zone | nova |
| OS-EXT-SRV-ATTR:host | oscomp-h126 |
| OS-EXT-SRV-ATTR:hypervisor_hostname | oscomp-h126 |
| OS-EXT-SRV-ATTR:instance_name | instance-00014fed |
| OS-EXT-STS:power_state | Running |
| OS-EXT-STS:task_state | None |
| OS-EXT-STS:vm_state | active |
| OS-SRV-USG:launched_at | 2017-11-13T14:57:09.000000 |
| OS-SRV-USG:terminated_at | None |
| accessIPv4 | |
| accessIPv6 | |
| addresses | Public=2001:ffff:ffff:ffff:f816:3eff:fef2:457a, 192.168.2.200 |
| config_drive | |
| created | 2017-11-13T14:56:54Z |
| flavor | m1.medium (103) |
| hostId | 1599f0caa6bb0775a5b8b2b4ee76a23a9135e9d84e7844c53543541f |
| id | 5d5afb5b-778c-46fc-8dbb-31c62a4e45d5 |
| image | Ubuntu-Trusty-20170310 (80267974-d0fc-4016-9338-3a057671782a) |
| key_name | rpc_support |
| name | rackspace-jamesdenton-01 |
| os-extended-volumes:volumes_attached | [] |
| progress | 0 |
| project_id | 723cdf11c4dd41ca9eeb47cb0576eb71 |
| properties | |
| security_groups | [{u'name': u'rpc-support'}] |
| status | ACTIVE |
| updated | 2017-11-13T14:57:10Z |
| user_id | 74cebd9525a843fcb374af1ea3a91fea |
+--------------------------------------+----------------------------------------------------------------------------+

2. Initiate a 4G image download from the VM

# wget -4 -O /dev/null http://centos.mirror.constant.com/7.4.1708/isos/x86_64/CentOS-7-x86_64-DVD-1708.iso

--2017-11-13 15:00:59-- http://centos.mirror.constant.com/7.4.1708/isos/x86_64/CentOS-7-x86_64-DVD-1708.iso
Resolving centos.mirror.constant.com (centos.mirror.constant.com)... 108.61.5.83
Connecting to centos.mirror.constant.com (centos.mirror.constant.com)|108.61.5.83|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4521459712 (4.2G) [application/octet-stream]
Saving to: ‘/dev/null’

20% [===============================> ]

3. Add a rule to the security group

root@osctrl-utility-container-8ad9622f:~# openstack security group rule create --ingress --protocol tcp --dst-port 443 rpc-support
WARNING: openstackclient.common.utils is deprecated and will be removed after Jun 2017. Please use osc_lib.utils
+-------------------+--------------------------------------+
| Field | Value |
+-------------------+--------------------------------------+
| created_at | 2017-11-13T15:01:11Z |
| description | |
| direction | ingress |
| ethertype | IPv4 |
| headers | |
| id | d9b28673-d7bd-49af-b4b1-c1830c16af4a |
| port_range_max | 443 |
| port_range_min | 443 |
| project_id | 723cdf11c4dd41ca9eeb47cb0576eb71 |
| project_id | 723cdf11c4dd41ca9eeb47cb0576eb71 |
| protocol | tcp |
| remote_group_id | None |
| remote_ip_prefix | 0.0.0.0/0 |
| revision_number | 1 |
| security_group_id | 2870f0a0-fa34-4c7a-b419-2c13eacfd3d6 |
| updated_at | 2017-11-13T15:01:11Z |
+-------------------+--------------------------------------+

4. Observe that the download stalls after a few seconds

Saving to: ‘/dev/null’

24% [=================================> ] 1,104,898,752 --.-K/s eta 76s
24% [=================================> ] 1,104,898,752 --.-K/s eta 82s
24% [=================================> ] 1,104,898,752 --.-K/s eta 2m 9s
24% [=================================> ] 1,104,898,752 --.-K/s eta 42m 44s

After 20 minutes, I cancelled the transfer.
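
The stall in step 4 shows up as a byte counter that stops advancing while the ETA keeps growing. For anyone who wants to check this automatically when re-testing, here is a minimal sketch in Python. It assumes the wget progress lines have been captured as text in the format shown above; the function names and the `min_repeats` threshold are hypothetical and not part of wget or Neutron.

```python
import re

# Matches the comma-grouped byte counter in a wget progress line,
# e.g. "24% [====> ] 1,104,898,752 --.-K/s eta 76s".
_BYTES_RE = re.compile(r"\]\s+([\d,]+)\s")


def bytes_transferred(progress_line):
    """Return the byte counter from one wget progress line, or None."""
    match = _BYTES_RE.search(progress_line)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


def is_stalled(progress_lines, min_repeats=3):
    """Return True if the last `min_repeats` samples show the same byte count.

    `min_repeats` is an arbitrary threshold chosen for illustration.
    """
    counts = [b for b in map(bytes_transferred, progress_lines) if b is not None]
    if len(counts) < min_repeats:
        return False
    return len(set(counts[-min_repeats:])) == 1
```

Feeding it the four progress lines from step 4 should report a stall, since all of them show 1,104,898,752 bytes.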

Retrying immediately afterwards results in a successful download:

ubuntu@rackspace-jamesdenton-01:~$ wget -4 -O /dev/null http://centos.mirror.constant.com/7.4.1708/isos/x86_64/CentOS-7-x86_64-DVD-1708.iso
--2017-11-13 15:15:29-- http://centos.mirror.constant.com/7.4.1708/isos/x86_64/CentOS-7-x86_64-DVD-1708.iso
Resolving centos.mirror.constant.com (centos.mirror.constant.com)... 108.61.5.83
Connecting to centos.mirror.constant.com (centos.mirror.constant.com)|108.61.5.83|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4521459712 (4.2G) [application/octet-stream]
Saving to: ‘/dev/null’

100%[===========================================================================================================================================>] 4,521,459,712 103MB/s in 48s

2017-11-13 15:16:17 (89.9 MB/s) - ‘/dev/null’ saved [4521459712/4521459712]

--

We have identified areas in the code we feel may be responsible for this:

Newton: https://github.com/openstack/neutron/blob/newton-eol/neutron/agent/linux/openvswitch_firewall/firewall.py#L312
Master: https://github.com/openstack/neutron/blob/master/neutron/agent/linux/openvswitch_firewall/firewall.py#L511

This has had a negative impact on the user experience. Thanks for taking a look, and let me know if you have any questions.

Tags: ovs-fw
tags: added: ovs-fw
Changed in neutron:
status: New → Confirmed
importance: Undecided → Medium
Changed in neutron:
assignee: nobody → Slawek Kaplonski (slaweq)
Changed in neutron:
assignee: Slawek Kaplonski (slaweq) → nobody
Slawek Kaplonski (slaweq) wrote :

I checked this today on devstack with Neutron from the master branch, and the issue still occurs there.

Changed in neutron:
assignee: nobody → Slawek Kaplonski (slaweq)
Slawek Kaplonski (slaweq) wrote :

I'm not sure, but I think https://review.openstack.org/#/c/451257/ will be needed to fix this issue with SG rule updates.

IWAMOTO Toshihiro (iwamoto) wrote :

Could you check whether https://review.openstack.org/#/c/460395 fixes this issue?

Slawek Kaplonski (slaweq) wrote :

I checked whether the patch proposed as a fix for https://bugs.launchpad.net/neutron/+bug/1708731 also fixes the issue described here, and it does.
