Comment 5 for bug 1934917

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

First, a bit of context. The conjunction IDs are internal variables of the OVS agent and are not persisted, so they are lost when the agent is restarted (the same happens, for example, with the internal VLAN IDs of the different networks).

When the security group (SG) of a port is updated (for example during the OVS agent restart), the former rules are overwritten. This is because OVS cannot have two rules with the same condition. That means the conjunction ID of a port rule can change during the OVS agent restart. E.g.:

(conj_id=1008)
 cookie=0x1, duration=6299.950s, table=82, n_packets=0, n_bytes=0, idle_age=6434, priority=70,conj_id=1008,ct_state=+est-rel-rpl,ip,reg5=0xc actions=load:0x3f0->NXM_NX_REG7[],output:12

will be overwritten by:

(conj_id=2008)
 cookie=0x2, duration=7000.000s, table=82, n_packets=0, n_bytes=0, idle_age=6434, priority=70,conj_id=2008,ct_state=+est-rel-rpl,ip,reg5=0xc actions=load:0x3f0->NXM_NX_REG7[],output:12

However, as Thomas pointed out, if a port SG rule receives a new conjunction ID that matches a previous one and (1) the match rules (those matching the traffic parameters) are already set but (2) the action rules (those matching the conjunction ID) are not, then we'll have the issue presented in this bug. In other words: we'll match for traffic A and we'll apply a rule for traffic B.
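To make that window concrete, here is a toy model in Python. It is only a sketch and not Neutron or OVS code: the two dicts stand in for the match flows and the action flows, and the names `deliver`, `match_flows` and `action_flows` are made up for the illustration.

```python
# Toy model only: two dicts stand in for the OpenFlow tables.
# match_flows:  traffic description -> conjunction ID
# action_flows: conjunction ID -> what is done with the traffic

def deliver(match_flows, action_flows, traffic):
    """Return the action applied to `traffic`, or None if nothing matches."""
    conj_id = match_flows.get(traffic)
    if conj_id is None:
        return None
    return action_flows.get(conj_id)

# Flows left in the bridge from before the agent restart.
match_flows = {"traffic-B": 1008}
action_flows = {1008: "apply rule for traffic B"}

# After the restart the agent hands out 1008 again, this time for traffic A.
# First batch: only the match flow for traffic A is written.
match_flows["traffic-A"] = 1008
during_window = deliver(match_flows, action_flows, "traffic-A")

# Second batch: the action flow for 1008 is finally overwritten.
action_flows[1008] = "apply rule for traffic A"
after_window = deliver(match_flows, action_flows, "traffic-A")

print(during_window)
print(after_window)
```

Between the two writes, traffic A is handled by the stale action that was installed for traffic B.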

We can take two actions (each of them helps independently):
1) When (re)starting the OVS agent, dump the flows of br-int, parse all rules to find the highest conjunction ID in use, and make the OVS agent instance allocate new conjunction IDs only above that value. Of course, we should take the maximum conjunction ID into account (it is an unsigned 32-bit number [1]). The method generating the conjunction IDs should handle this limit, and so should the method calculating the minimum conjunction ID of the instance (based on the current OF rules).

2) As Thomas commented, the flows are written in batches [2], with at most 100 flows per batch. If the traffic match flows and the action flows (the ones sending the traffic to the correct port) are written in different batches, we can end up in the same scenario. All flows related to one port should be written in the same batch, regardless of the number of flows.
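A rough sketch of both ideas in Python, to show what I mean. This is not the Neutron implementation: the function names are hypothetical, the input is assumed to be the text output of a flow dump of br-int, and the flows to write are assumed to arrive as (port_id, flow) pairs. What to do when the 32-bit range is exhausted is left open here (the sketch just raises).

```python
import re
from collections import defaultdict

MAX_CONJ_ID = 2**32 - 1  # conjunction IDs are unsigned 32-bit values

# Matches "conj_id=1008" (match field) and "conjunction(1008,1/2)" (action).
_CONJ_ID_RE = re.compile(r"\bconj_id=(\d+)|\bconjunction\((\d+)\s*,")


def highest_conj_id(dump_text):
    """Return the highest conjunction ID found in a flow dump, or 0 if none."""
    ids = [int(a or b) for a, b in _CONJ_ID_RE.findall(dump_text)]
    return max(ids, default=0)


def min_new_conj_id(dump_text):
    """Lowest ID the restarted agent may hand out (assumed policy: max + 1)."""
    highest = highest_conj_id(dump_text)
    if highest >= MAX_CONJ_ID:
        # Placeholder: a real agent would need a proper recovery strategy.
        raise OverflowError("no conjunction ID left above the existing flows")
    return highest + 1


def batch_by_port(flows, soft_limit=100):
    """Yield batches of flows without splitting the flows of one port.

    `flows` is an iterable of (port_id, flow) pairs. A port with more than
    `soft_limit` flows still goes into a single (oversized) batch.
    """
    per_port = defaultdict(list)
    for port_id, flow in flows:
        per_port[port_id].append(flow)

    batch = []
    for port_flows in per_port.values():
        if batch and len(batch) + len(port_flows) > soft_limit:
            yield batch
            batch = []
        batch.extend(port_flows)
    if batch:
        yield batch


dump = (
    " cookie=0x1, table=82, priority=70,conj_id=1008,ip,reg5=0xc actions=output:12\n"
    " cookie=0x2, table=82, priority=70,conj_id=2008,ip,reg5=0xc actions=output:12\n"
    " cookie=0x2, table=82, priority=70,ip,reg5=0xc actions=conjunction(1500,1/2)\n"
)
print(min_new_conj_id(dump))

flows = [("port-a", "a%d" % i) for i in range(3)] + [("port-b", "b%d" % i) for i in range(3)]
print(list(batch_by_port(flows, soft_limit=4)))
```

With this grouping the limit of 100 becomes a soft limit: a batch is closed before it would split a port, and a port with more flows than the limit still goes out in one piece.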

About the embargo on this bug: it is very unlikely that a VM user could use this behaviour to inject traffic into another destination. The user cannot know when the OVS agent is restarted or, if this issue is hit, what the destination is. I suggest making it public.

Regards.

[1] https://www.openvswitch.org/support/dist-docs-2.5/ovs-ofctl.8.html
[2] https://github.com/openstack/neutron/blob/0ccfed0ae13182f820e6a8c11a2fa801506f3a3a/neutron/agent/common/ovs_lib.py#L471