ovs-fw does not reinstate GRE conntrack entry
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
neutron | Triaged | High | Unassigned |
Bug Description
We have VMs running GRE tunnels between them, with OVSFW and SGs
implemented along with the GRE conntrack helper loaded on the
hypervisor. GRE works as expected, but the tunnel breaks whenever a
neutron OVS agent event causes an exception such as the AMQP timeouts
or OVSFW port-not-found errors below:
AMQP Timeout:
2017-04-07 19:07:03.001 5275 ERROR neutron. MessagingTimeout: Timed out waiting for a reply to message ID 4035644808d24
2017-04-07 19:07:03.001 5275 ERROR neutron.
2017-04-07 19:07:03.003 5275 WARNING oslo.service. Function 'neutron. run outlasted interval by 120.01 sec
2017-04-07 19:07:03.041 5275 INFO neutron. Agent has just been revived. Doing a full sync.
2017-04-07 19:07:06.747 5275 INFO neutron. [req- full sync.
2017-04-07 19:07:06.841 5275 INFO neutron. [req- with plugin!
OVSFWPortNotFound:
2017-03-30 18:31:05.048 5160 ERROR neutron. self.
2017-03-30 18:31:05.048 5160 ERROR neutron. "/openstack/ line 272, in prepare_port_filter
2017-03-30 18:31:05.048 5160 ERROR neutron. of_port = self.get_
2017-03-30 18:31:05.048 5160 ERROR neutron. "/openstack/ line 246, in get_or_
2017-03-30 18:31:05.048 5160 ERROR neutron. OVSFWPortNotF
2017-03-30 18:31:05.048 5160 ERROR neutron. OVSFWPortNotF managed by this agent.
2017-03-30 18:31:05.048 5160 ERROR neutron.
2017-03-30 18:31:05.072 5160 INFO neutron. [req- with plugin!
The agent throws out-of-sync messages and starts to initialize the
neutron ports again, along with fresh SG rules.
2017-04-07 19:07:07.110 5275 INFO neutron. [req- for devices set([u'
2017-04-07 19:07:07.215 5275 ERROR neutron. [req- 4b14619f-
During this process, when it prepares new filters for all ports, it
marks the conntrack entry for certain GRE connections (high traffic)
as invalid.
root@server:
ipv4 2 gre 47 178 src=1.1.1.203 dst=2.2.2.66 srckey=0x0 dstkey=0x0 src=2.2.2.66 dst=1.1.1.203 srckey=0x0 dstkey=0x0 [ASSURED] mark=1 zone=5 use=1
ipv4 2 gre 47 179 src=5.5.5.104 dst=4.4.4.187 srckey=0x0 dstkey=0x0 src=4.4.4.187 dst=5.5.5.104 srckey=0x0 dstkey=0x0 [ASSURED] mark=0 zone=5 use=1
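The mark field is the telling part of the listing above: to our understanding of the neutron OVS firewall driver, a ct mark of 0 is a normal connection while a mark of 1 flags a connection the firewall treats as invalid (hedged reading; check the driver's constants in your release). A small illustrative parser, not part of neutron, can pick out the entries carrying the invalid mark:

```python
import re

def find_invalid_gre_entries(conntrack_output, zone=5):
    """Return (src, dst) pairs of GRE conntrack entries carrying mark=1.

    Assumes `conntrack -L`-style lines like the ones above; mark=1 is
    assumed to be the value the OVS firewall driver uses for
    connections it considers invalid (mark=0 being a normal one).
    """
    invalid = []
    for line in conntrack_output.splitlines():
        if " gre " not in line:
            continue
        if "mark=1" not in line or ("zone=%d" % zone) not in line:
            continue
        # First src=/dst= pair is the original direction of the tunnel.
        m = re.search(r"src=(\S+) dst=(\S+)", line)
        if m:
            invalid.append((m.group(1), m.group(2)))
    return invalid

sample = (
    "ipv4 2 gre 47 178 src=1.1.1.203 dst=2.2.2.66 srckey=0x0 dstkey=0x0 "
    "src=2.2.2.66 dst=1.1.1.203 srckey=0x0 dstkey=0x0 [ASSURED] mark=1 zone=5 use=1\n"
    "ipv4 2 gre 47 179 src=5.5.5.104 dst=4.4.4.187 srckey=0x0 dstkey=0x0 "
    "src=4.4.4.187 dst=5.5.5.104 srckey=0x0 dstkey=0x0 [ASSURED] mark=0 zone=5 use=1\n"
)
# Only the 1.1.1.203 -> 2.2.2.66 tunnel is marked invalid here.
print(find_invalid_gre_entries(sample))
```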
That connection state remains invalid unless someone reboots the VM or
flushes the connection, either directly in conntrack or through OVS.
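As a sketch of the conntrack-side flush mentioned above, the following only builds the argv for deleting the stuck GRE entry (the addresses and zone come from the listing above; `-D`, `-p`, `-s`, `-d` and `-w` are the standard conntrack-tools flags, but verify them against your version, and run the command as root):

```python
def conntrack_delete_cmd(src, dst, zone):
    """Build an argv for deleting a GRE conntrack entry.

    -D deletes matching entries, -p selects the protocol, -s/-d the
    original-direction addresses, and -w the conntrack zone. This is
    an illustrative helper, not a neutron API.
    """
    return ["conntrack", "-D", "-p", "gre",
            "-s", src, "-d", dst, "-w", str(zone)]

cmd = conntrack_delete_cmd("1.1.1.203", "2.2.2.66", 5)
print(" ".join(cmd))
# Execute it (as root) with e.g. subprocess.check_call(cmd).
```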
We have a blanket any-protocol/any-port/any-IP SG rule in this
scenario, and we even tried adding specific rules to allow IP protocol
47 (GRE), but nothing fixed the problem.
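For reference, the "allow IP protocol 47" rule we tried corresponds to a body like the following for the Neutron security-group-rules API (a sketch; the security group ID is a placeholder, and Neutron also accepts the protocol name "gre" in place of the number):

```python
# Request body for POST /v2.0/security-group-rules allowing GRE
# (IP protocol 47) ingress. "SECURITY_GROUP_ID" is a placeholder.
gre_rule = {
    "security_group_rule": {
        "security_group_id": "SECURITY_GROUP_ID",
        "direction": "ingress",
        "ethertype": "IPv4",
        "protocol": "47",  # or "gre"
        "remote_ip_prefix": "0.0.0.0/0",
    }
}
print(gre_rule["security_group_rule"]["protocol"])
```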
Was checking for ovs-conntrack helper-specific bugs and came across
patchwork. triggered in the above scenario? Is this a bug in the
ovs-fw code, or is it something in the ovs-conntrack implementation?
OpenStack version: Newton
Hypervisor OS: Ubuntu 16.04.2
Kernel version: 4.4.0-70-generic
OVS version: 2.6.1
Please see: https:/
affects: | neutron → null-and-void |
information type: | Public → Private |
Changed in null-and-void: | |
status: | New → Invalid |
information type: | Private → Public |
affects: | null-and-void → neutron |
Changed in neutron: | |
status: | Invalid → New |
Actually this bug is a dup of [1]; marking as such.
[1] https://bugs.launchpad.net/neutron/+bug/1708731