[OVS][FW] In some cases, OVS FW tries to set an OF rule when ofport = -1

Bug #1895677 reported by Rodolfo Alonso
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
neutron | In Progress | Undecided | Rodolfo Alonso |

Bug Description

Error log: http://paste.openstack.org/show/797881/

That was reported in an internal company channel. I still need to reproduce this issue locally or catch when or how we are calling the FW with a port with ofport=-1. That behavior was captured while implementing a fix in Nova for [1][2]; this is not happening right now with the current Nova code.

The possible culprit of this issue could be [3]. Because we need this patch to solve the related bugs, I need to find the condition that triggers the reported error and fix it instead of reverting the patch.

[1]https://bugs.launchpad.net/neutron/+bug/1734320
[2]https://bugs.launchpad.net/neutron/+bug/1815989
[3]https://review.opendev.org/#/c/640258/

Revision history for this message
sean mooney (sean-k-mooney) wrote :

If I swap to the noop driver, I also see:

Sep 15 13:40:06 numa-2 neutron-openvswitch-agent[186235]: DEBUG neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [None req-f619b595-eea9-4612-92e4-0fa0abf05fe0 None None] Failed to remove accepted egress flows for port 449411db-5d36-4ba9-9199-c9b7e87735a2, error: 'NoneType' object has no attribute 'ofport' {{(pid=186235) process_deleted_ports /opt/stack/neutron/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py:694}}

Hongbin Lu (hongbin.lu) wrote :

@Rodolfo,

As you mentioned, please provide the reproduction steps and reset the status of this bug to "New". I have set the status to "Incomplete" for now.

Changed in neutron:
status: New → Incomplete
sean mooney (sean-k-mooney) wrote :

The reproducer is to deploy with a patched Nova and boot any VM using the conntrack security group driver.

When you deploy Nova with https://review.opendev.org/#/c/602432/, it uses the code in https://review.opendev.org/#/c/640258/. I removed the Depends-On since it is not merged in Neutron.

That then highlights that there are code paths where -1 is still propagated to the security group driver.
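
To illustrate the point above, here is a minimal sketch of a guard that keeps a port with an invalid ofport away from the security group driver. It is not Neutron code: `INVALID_OFPORT`, `is_valid_ofport` and `filter_ports_for_firewall` are hypothetical names, the port dictionaries are a simplified stand-in, and treating only positive integers as valid is an assumption.

```python
INVALID_OFPORT = -1  # value from this report; the constant name is hypothetical


def is_valid_ofport(ofport):
    """Return True only for a usable OpenFlow port number.

    Assumption: a usable ofport is a positive integer. None, -1 and
    non-integer placeholders are all rejected.
    """
    if isinstance(ofport, bool) or not isinstance(ofport, int):
        return False
    return ofport > 0


def filter_ports_for_firewall(ports):
    """Split port dicts into (usable, skipped) based on their 'ofport' key."""
    usable, skipped = [], []
    for port in ports:
        if is_valid_ofport(port.get('ofport')):
            usable.append(port)
        else:
            skipped.append(port)
    return usable, skipped


if __name__ == '__main__':
    ports = [
        {'port_id': 'a', 'ofport': 5},
        {'port_id': 'b', 'ofport': INVALID_OFPORT},
        {'port_id': 'c', 'ofport': None},
    ]
    usable, skipped = filter_ports_for_firewall(ports)
    print([p['port_id'] for p in usable], [p['port_id'] for p in skipped])
```

A guard like this would only hide the symptom; it does not explain which code path hands the firewall a port before OVS has assigned it an ofport.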

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello:

I have detected several issues in the current code. Even without the Nova patch, we can hit some problems in the OVS agent.

ISSUE 1: even when "explicitly_egress_direct" is disabled (the default value is False), the flow deletion code could be called [2]. This section of code should be called only when "explicitly_egress_direct" is True. I'll push a patch for this.

ISSUE 2: if "explicitly_egress_direct" is True, when a port is deleted, this is treated as "removed" [3]. That means this port won't be processed in [2]. That implies when a port is deleted, the explicit egress flows are left behind in br-int. I'll open a bug for this problem.

ISSUE 3: if "explicitly_egress_direct" is True, when a port is deleted, as commented in ISSUE 2, the code does not delete the explicit egress flows. However, this method can be called in the first polling cycle because the OVS agent has detected a deleted port. Then the OVS agent RPC server receives the port delete call [4], but this deleted port is processed in the next polling cycle. The "port_info" variable does not contain the deleted port (it was handled in the previous cycle). The code in [2] tries to read the port from the OVS DB, but it is not there anymore.

This can be reproduced by adding a small delay in "process_deleted_ports".

I'll add this information to the bug related to ISSUE 2.

ISSUE 4: as reported in [5], there is also a problem when using the OVS FW and the Nova patch [6]. This is still under investigation.

Regards.

[1]https://review.opendev.org/#/q/I14fefe289a19b718b247bf0740ca9bc47f8903f4
[2]https://github.com/openstack/neutron/blob/8575f60e86029cd91d5fa6f6be4596a22b1ee35b/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L685-L695
[3]https://github.com/openstack/neutron/blob/8575f60e86029cd91d5fa6f6be4596a22b1ee35b/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L676
[4]https://github.com/openstack/neutron/blob/8575f60e86029cd91d5fa6f6be4596a22b1ee35b/neutron/plugins/ml2/drivers/openvswitch/agent/ovs_neutron_agent.py#L625-L628
[5]http://paste.openstack.org/show/797881/
[6]https://review.opendev.org/#/c/602432/
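
ISSUE 1 and ISSUE 3 above can be sketched as follows. This is a simplified, self-contained model and not the agent's actual code: `FakeBridge`, `get_ofport`, `delete_egress_flows` and `remove_egress_direct_flows` are hypothetical names; only the "explicitly_egress_direct" option and the missing-port condition come from this report.

```python
import logging

LOG = logging.getLogger(__name__)


class FakeBridge:
    """Hypothetical stand-in for the integration bridge (not Neutron's class)."""

    def __init__(self, ports):
        self._ports = dict(ports)  # port_id -> ofport
        self.deleted_flows = []    # ofports whose egress flows were removed

    def get_ofport(self, port_id):
        # Returns None when the port is no longer present in the OVS DB.
        return self._ports.get(port_id)

    def delete_egress_flows(self, ofport):
        self.deleted_flows.append(ofport)


def remove_egress_direct_flows(bridge, port_ids, explicitly_egress_direct):
    """Remove explicit egress flows; return the port IDs that were skipped."""
    if not explicitly_egress_direct:
        # ISSUE 1: the flows were never installed, so there is nothing to do.
        return []
    skipped = []
    for port_id in port_ids:
        ofport = bridge.get_ofport(port_id)
        if ofport is None or ofport <= 0:
            # ISSUE 3: the port is already gone from the OVS DB, or its
            # ofport is invalid (-1), so no flow can be matched on it.
            LOG.debug("Skipping port %s, ofport=%s", port_id, ofport)
            skipped.append(port_id)
            continue
        bridge.delete_egress_flows(ofport)
    return skipped


if __name__ == '__main__':
    bridge = FakeBridge({'p1': 7, 'p2': -1})
    print(remove_egress_direct_flows(bridge, ['p1', 'p2', 'p3'], True))
    print(bridge.deleted_flows)
```

Skipping a missing port only avoids the error in the log; it does not remove the flows left behind in br-int (ISSUE 2), which would need the ofport to be known before the port disappears.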

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.opendev.org/752672

Changed in neutron:
assignee: nobody → Rodolfo Alonso (rodolfo-alonso-hernandez)
status: Incomplete → In Progress
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hi @Sean:

I'm trying to reproduce the case when adding new flows:
  ['ovs-ofctl', 'add-flows', '-O', 'OpenFlow10', 'br-int', '--bundle', '-']

This is with the OVS FW enabled (I see table=60, which means it is enabled). I tried the Nova patch [1] but I still can't reproduce it. Can you share some logs? Do you know when this happens? Do you have some steps to trigger the error?

Regards.

[1]https://review.opendev.org/#/c/602432/

OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Rodolfo Alonso Hernandez (<email address hidden>) on branch: master
Review: https://review.opendev.org/752672
