[OVS][FW] Remote SG IDs left behind when a SG is removed
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu Cloud Archive | Fix Released | Undecided | Unassigned |
Queens | Fix Released | High | Unassigned |
Stein | Fix Released | Undecided | Unassigned |
Train | Fix Released | Undecided | Unassigned |
Ussuri | Fix Released | Undecided | Unassigned |
Victoria | Fix Released | Undecided | Unassigned |
neutron | Fix Released | Medium | Rodolfo Alonso |
neutron (Ubuntu) | Fix Released | Medium | Unassigned |
Bionic | Fix Released | High | Unassigned |
Focal | Fix Released | Medium | Unassigned |
Groovy | Fix Released | Medium | Unassigned |
Bug Description
[Impact]
Neutron does not remove all traces of remote SG conj_ids when deleting a security group.
[Test Case]
* deploy openstack (no particular feature needed)
* create two networks N1, N2 with security groups SG1, SG2 respectively
* SG2 must have a custom ingress tcp rule from remote SG1
* create a vm on each network, make a note of their fixed_ip then delete those vms
* on compute host running VM2 do the following:
* sudo ovs-ofctl dump-flows br-int table=82| grep <vm1-ip>
* sudo ovs-ofctl dump-flows br-int table=82| egrep "conjunction(
* the above should not return anything
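The manual check above can also be scripted. The following is only a sketch: `leftover_flows` is a hypothetical helper (not part of neutron) that takes the text printed by `sudo ovs-ofctl dump-flows br-int table=82` and returns the lines that still mention a given IP address. It matches the address as a whole token, so looking for 10.2.0.2 does not match 10.2.0.29.

```python
import re


def leftover_flows(dump_text, ip):
    """Return the flow lines in dump_text that still reference ip.

    dump_text is assumed to be the raw output of
    "ovs-ofctl dump-flows br-int table=82"; ip is a dotted IPv4 string.
    """
    # Reject matches that are only a prefix or suffix of a longer address.
    pattern = re.compile(r"(?<![\d.])" + re.escape(ip) + r"(?!\.?\d)")
    return [line.strip() for line in dump_text.splitlines()
            if pattern.search(line)]


if __name__ == "__main__":
    sample = (
        " cookie=0x1, table=82, priority=71,ip,reg6=0x3,"
        "nw_src=10.2.0.29 actions=conjunction(27,1/2)\n"
    )
    # After the fix, this list is expected to be empty for a deleted VM's IP.
    print(leftover_flows(sample, "10.2.0.29"))
```

An empty list corresponds to the "should not return anything" expectation of the test case.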
[Regression Potential]
Since the flows being deleted belong to deleted ports, their deletion is not expected to have a noticeable impact. However, as this bug describes, their existence could have an unexpected impact on ports that have a security group that happens to share the same conjunction ID.
-------
When a SG that is used by any port in the OVS agent is removed, it is marked to be deleted. This deletion process is done in [1].
The SG deletion process consists of removing any reference to this SG from the firewall and the SG port map. The firewall removes this SG in [2].
The information of a SG is stored in:
- ConjIPFlowManag
ConjIdMap.
- ConjIPFlowManag
self.
When a SG is removed, its reference should be deleted from both "conj_id_map" and "conj_ids". It is correctly removed from "conj_id_map" in [3], but it is not being deleted properly from "conj_ids". Instead of the current logic, we should walk through the nested dictionary and remove any entry with "remote_sg_id" == "sg_id" (<-- SG ID to be removed).
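The walk over the nested dictionary could look roughly like the sketch below. This is not the merged patch. The layout vlan_tag -> (direction, ethertype) -> remote_sg_id -> set of conj_ids is an assumption made for illustration, and `remove_remote_sg` is a hypothetical name.

```python
def remove_remote_sg(conj_ids, sg_id):
    """Remove every entry keyed by sg_id from a nested conj_ids dict.

    Assumed layout (illustrative only):
        conj_ids[vlan_tag][(direction, ethertype)][remote_sg_id] = {conj_id, ...}
    Returns the set of conj_ids that were freed.
    """
    freed = set()
    # Iterate over copies of the keys so entries can be deleted safely.
    for vlan_tag in list(conj_ids):
        by_direction = conj_ids[vlan_tag]
        for key in list(by_direction):
            remote_map = by_direction[key]
            freed |= remote_map.pop(sg_id, set())
            if not remote_map:
                # Prune levels that became empty.
                del by_direction[key]
        if not by_direction:
            del conj_ids[vlan_tag]
    return freed
```

Matching on the "remote_sg_id" key, rather than on the conj_id values, means no entry for the removed SG can be missed because of how the stored conj_ids were derived.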
The current implementation leaves some "remote_sg_id" entries in the nested dictionary "conj_ids". That could cause:
- A memory leak in the OVS agent, which keeps those unneeded remote SGs in memory.
- An increase in the complexity of the OVS rules, which still include those unused SGs (actually the conj_ids related to those SGs).
- A security breach between SGs if the conj_ids left in an unused SG are deleted and reused again (the FW stores the unused conj_ids to be recycled in later rules).
[1]https:/
[2]https:/
[3]https:/
Related branches
- Corey Bryant: Approve
Diff: 425 lines (+403/-0), 3 files modified
debian/changelog (+7/-0)
debian/patches/lp1881157-remote-sg-is-left-behind.patch (+395/-0)
debian/patches/series (+1/-0)
Changed in neutron: | |
assignee: | nobody → Rodolfo Alonso (rodolfo-alonso-hernandez) |
Changed in neutron: | |
importance: | Undecided → Medium |
Changed in neutron (Ubuntu Groovy): | |
status: | New → Fix Committed |
Changed in neutron (Ubuntu Focal): | |
status: | New → Fix Released |
description: | updated |
tags: | added: sts-sru-needed |
Changed in neutron (Ubuntu): | |
importance: | Undecided → Medium |
Changed in neutron (Ubuntu Bionic): | |
importance: | Undecided → Medium |
Changed in neutron (Ubuntu Focal): | |
importance: | Undecided → Medium |
Changed in neutron (Ubuntu Groovy): | |
importance: | Undecided → Medium |
Changed in neutron (Ubuntu Bionic): | |
importance: | Medium → High |
status: | New → Triaged |
Changed in neutron (Ubuntu Groovy): | |
status: | Fix Committed → Fix Released |
Changed in neutron: | |
status: | New → Fix Released |
Hello:
I found an "easy way" to reproduce this issue. We need first to create two SGs:
- The first one without any specific rule (the default ones), "SG1"
- The other one, "SG2", accepting a custom TCP rule (ingress) from "SG1". It is important to use a custom TCP rule because the way the conj_id is generated depends on this [1].
Then we need to create two networks (with one subnet per network).
Then we need to create two VMs: VM1 in net1 with SG1, and VM2 in net2 with SG2. When we delete both, some rules are still left in OVS because of [2]. In [2], we use the SG ID to retrieve the conj_ids from "self.conj_id_map". Then we use those conj_ids to clean "self.conj_ids".
The problem we have here: in "self.conj_id_map" we store the conj_id generated in "_conj_id_factory". This conj_id is a number divisible by 8. But in "self.conj_ids" we store the conj_id plus the given priority [3].
That means the "sg_removed" method [2] does not correctly clean the flows for some specific ports (and their assigned IPs). If we create VMs again with ports using those IP addresses, even if those VMs/ports are not assigned to SG1, they will still have rules like:
  cookie=0x2f9dd929399d81fa, duration=3090.772s, table=82, n_packets=0, n_bytes=0, idle_age=3090, priority=71,ct_state=+new-est,ip,reg6=0x3,nw_src=10.2.0.29 actions=conjunction(27,1/2)
  cookie=0x2f9dd929399d81fa, duration=2916.283s, table=82, n_packets=0, n_bytes=0, idle_age=2923, priority=71,ct_state=+new-est,icmp,reg5=0x16 actions=conjunction(27,2/2)
That could be used by the VM with IP address 10.2.0.29 to connect to the (IP addresses/port) represented by conj_id 27.
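The mismatch described above can be shown with a few lines of Python. This is a sketch under stated assumptions: the block size of 8 comes from the "divisible by 8" remark, the stored values 26 and 27 are hypothetical (27 is chosen only because it appears in the flows above), and both function names are made up for illustration.

```python
CONJ_ID_BLOCK_SIZE = 8  # base conj_ids are multiples of 8, per the comment


def stale_after_intersection(base_ids, stored_ids):
    """IDs left behind when cleanup only intersects with the base IDs."""
    return stored_ids - (stored_ids & base_ids)


def stale_after_block_match(base_ids, stored_ids):
    """IDs left behind when each stored ID is mapped back to its base ID."""
    return {
        conj_id for conj_id in stored_ids
        if conj_id - conj_id % CONJ_ID_BLOCK_SIZE not in base_ids
    }


if __name__ == "__main__":
    base_ids = {24}    # what the conj_id map returns for the removed SG
    stored = {26, 27}  # base ID plus hypothetical priority offsets
    print(stale_after_intersection(base_ids, stored))  # both IDs survive
    print(stale_after_block_match(base_ids, stored))   # nothing survives
```

The first function models the faulty cleanup: no stored ID equals a base ID, so nothing is removed. The second shows one possible way to match them; the actual fix proposed in this bug instead removes entries by "remote_sg_id".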
Regards.
[1]https://github.com/openstack/neutron/blob/31280695a26cdcf211cb964ac5f401296398a19f/neutron/agent/linux/openvswitch_firewall/rules.py#L158-L174
[2]https://github.com/openstack/neutron/blob/31280695a26cdcf211cb964ac5f401296398a19f/neutron/agent/linux/openvswitch_firewall/firewall.py#L399-L422
[3]https://github.com/openstack/neutron/blob/31280695a26cdcf211cb964ac5f401296398a19f/neutron/agent/linux/openvswitch_firewall/firewall.py#L390