neutron-openvswitch-agent failed to add default table

Bug #1642223 reported by sunzuohua
This bug affects 2 people
Affects   Status         Importance   Assigned to   Milestone
neutron   Fix Released   High         shihanzhang

Bug Description

Problem

After the host is powered off and restarted, the tenant network is not available.

The cause is that the default flow tables of br-int are not set up successfully when neutron-openvswitch-agent starts:

    1) The neutron-openvswitch-agent fails to add the default flow to table 0 but adds the default flow to table 23 successfully in setup_default_table(). The flows look as follows:

    cookie=0x8f4c30f934586d9c, duration=617166.781s, table=0, n_packets=31822416, n_bytes=2976996304, idle_age=0, hard_age=65534, priority=2,in_port=1 actions=drop
    cookie=0x8f4c30f934586d9c, duration=617167.023s, table=23, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=drop
    cookie=0x8f4c30f934586d9c, duration=617167.007s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=drop

    2) In rpc_loop, the neutron-openvswitch-agent checks the ovs status by checking flow table 23, and flow table 23 exists. The neutron-openvswitch-agent therefore thinks the ovs is normal, but the default flow in table 0 is missing and the network connection is not available.

Affected Neutron version:
Newton

Possible Solution:
Check the default table 0, or check all the default flow tables, in check_ovs_status().
Alternatively, add the default flow table 23 first and then add the default table 0 in setup_default_table().
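
A minimal sketch of the first option (checking every default table rather than only table 23), run against the flow dump quoted above. It is not neutron's check_ovs_status(); the names EXPECTED_DEFAULTS, parse_flows and missing_default_tables are hypothetical. The expected action for table 0 (normal) is taken from the add-flows stdin in the agent log later in this report, and the drop actions for tables 23 and 24 are taken from the dump. Matching on table and action only is a simplification.

```python
import re

# Flow dump copied verbatim from the bug description.
DUMP = """\
cookie=0x8f4c30f934586d9c, duration=617166.781s, table=0, n_packets=31822416, n_bytes=2976996304, idle_age=0, hard_age=65534, priority=2,in_port=1 actions=drop
cookie=0x8f4c30f934586d9c, duration=617167.023s, table=23, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=drop
cookie=0x8f4c30f934586d9c, duration=617167.007s, table=24, n_packets=0, n_bytes=0, idle_age=65534, hard_age=65534, priority=0 actions=drop
"""

# Assumption: one expected default action per table (hypothetical mapping).
EXPECTED_DEFAULTS = {0: "normal", 23: "drop", 24: "drop"}

# Captures the table number and the action of each dumped flow.
FLOW_RE = re.compile(r"table=(\d+),.*?actions=(\S+)")


def parse_flows(dump):
    """Return the set of (table, action) pairs found in dump-flows text."""
    found = set()
    for line in dump.splitlines():
        match = FLOW_RE.search(line)
        if match:
            found.add((int(match.group(1)), match.group(2).lower()))
    return found


def missing_default_tables(dump, expected=EXPECTED_DEFAULTS):
    """List the tables whose expected default action is absent from the dump."""
    found = parse_flows(dump)
    return sorted(t for t, action in expected.items() if (t, action) not in found)


print(missing_default_tables(DUMP))  # [0]
```

Note that table 0 is not empty in the dump (it holds the in_port=1 drop flow), so a check that only asks whether table 0 has any flow would still pass; the sketch looks for the default action instead.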

Thanks

sunzuohua (zuohuasun)
Changed in neutron:
assignee: nobody → sunzuohua (zuohuasun)
Revision history for this message
Kevin Benton (kevinbenton) wrote :

If I understand what you are saying, traffic does not flow after a compute host restart.

Changed in neutron:
status: New → Triaged
importance: Undecided → High
tags: added: ovs
Revision history for this message
Rossella Sblendido (rossella-o) wrote :

It's not clear to me what's happening. Why is table 0 not set up correctly and table 23 is? If you restarted the machine, all the flows should be gone. Can you clarify what you are doing please and attach relevant log snippets or flows dumps?

Revision history for this message
sunzuohua (zuohuasun) wrote :

When the ovs agent is restarted, table 0 and table 23 are set up. Then, in the rpc loop, the ovs agent checks the ovs status by dumping table 23 and sets the flow tables up again if table 23 does not exist. If the ovs is not ready, neither table 0 nor table 23 is set up correctly. The log is as follows:
2016-10-26 18:18:56.031 3862 ERROR neutron.agent.linux.utils [-]
Command: ['sudo', 'ovs-ofctl', 'add-flows', 'br-int', '-']
Exit code: 1
Stdin: hard_timeout=0,idle_timeout=0,priority=1,actions=normal
Stdout:
Stderr: ovs-ofctl: br-int is not a bridge or a socket

2016-10-26 18:18:56.031 3862 ERROR neutron.agent.common.ovs_lib [-] Unable to execute ['ovs-ofctl', 'add-flows', 'br-int', '-']. Exception:
Command: ['sudo', 'ovs-ofctl', 'add-flows', 'br-int', '-']
Exit code: 1
Stdin: hard_timeout=0,idle_timeout=0,priority=1,actions=normal
Stdout:
Stderr: ovs-ofctl: br-int is not a bridge or a socket

2016-10-26 18:18:56.041 3862 ERROR neutron.agent.linux.utils [-]
Command: ['sudo', 'ovs-ofctl', 'add-flows', 'br-int', '-']
Exit code: 1
Stdin: hard_timeout=0,idle_timeout=0,priority=0,table=23,actions=drop
Stdout:
Stderr: ovs-ofctl: br-int is not a bridge or a socket

2016-10-26 18:18:56.041 3862 ERROR neutron.agent.common.ovs_lib [-] Unable to execute ['ovs-ofctl', 'add-flows', 'br-int', '-']. Exception:
Command: ['sudo', 'ovs-ofctl', 'add-flows', 'br-int', '-']
Exit code: 1
Stdin: hard_timeout=0,idle_timeout=0,priority=0,table=23,actions=drop
Stdout:
Stderr: ovs-ofctl: br-int is not a bridge or a socket

But in our error log, we can see that only table 0 failed to be set up; there is no error about setting up table 23. The log is as follows:

2016-10-26 18:18:13.551 3832 ERROR neutron.agent.linux.utils [-]
Command: ['sudo', 'ovs-ofctl', 'add-flows', 'br-int', '-']
Exit code: 1
Stdin: hard_timeout=0,idle_timeout=0,priority=1,actions=normal
Stdout:
Stderr: ovs-ofctl: br-int is not a bridge or a socket

2016-10-26 18:18:13.551 3832 ERROR neutron.agent.common.ovs_lib [-] Unable to execute ['ovs-ofctl', 'add-flows', 'br-int', '-']. Exception:
Command: ['sudo', 'ovs-ofctl', 'add-flows', 'br-int', '-']
Exit code: 1
Stdin: hard_timeout=0,idle_timeout=0,priority=1,actions=normal
Stdout:
Stderr: ovs-ofctl: br-int is not a bridge or a socket

Table 23 is set up correctly and table 0 is not. In the rpc loop, the ovs agent finds that table 23 exists and does not set the flow tables up again.

Revision history for this message
sunzuohua (zuohuasun) wrote :

The log is from Kilo, but the problem also exists in Newton.

Revision history for this message
sunzuohua (zuohuasun) wrote :

I guess the problem is that:

When I restarted the machine, the ovs was not ready yet, so the ovs agent's attempt to set up table 0 failed.
By the time the ovs agent set up table 23, the ovs was ready, so table 23 was set up correctly.
In the rpc loop, the ovs agent checked table 23 for the ovs status and found it was NORMAL. The flows never became normal until I restarted the ovs agent or the ovs.
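
The sequence guessed at above can be reproduced with a small simulation. This is a sketch under stated assumptions, not agent code: FakeBridge, add_default_flow, setup_default_tables, ovs_looks_normal and run are all hypothetical names, and "OVS becomes ready after one failed call" stands in for the timing on the real host. It also shows why installing the table 23 flow first would let the canary check catch the failure.

```python
CANARY_TABLE = 23   # table the status check looks at, per this report
DEFAULT_TABLE = 0   # table whose default flow was lost


class FakeBridge:
    """Hypothetical stand-in for br-int: the first N flow additions fail."""

    def __init__(self, failures_before_ready):
        self.failures_left = failures_before_ready
        self.tables = set()

    def add_default_flow(self, table):
        if self.failures_left > 0:
            # Mimics "ovs-ofctl: br-int is not a bridge or a socket".
            self.failures_left -= 1
            return False
        self.tables.add(table)
        return True


def setup_default_tables(bridge, order):
    """Install default flows in the given order, ignoring failures."""
    for table in order:
        bridge.add_default_flow(table)


def ovs_looks_normal(bridge):
    """Canary-only status check: only table 23 is inspected."""
    return CANARY_TABLE in bridge.tables


def run(order):
    """Return (status check passes, table 0 default flow present)."""
    bridge = FakeBridge(failures_before_ready=1)
    setup_default_tables(bridge, order)
    return ovs_looks_normal(bridge), DEFAULT_TABLE in bridge.tables


print(run([DEFAULT_TABLE, CANARY_TABLE]))  # (True, False)
print(run([CANARY_TABLE, DEFAULT_TABLE]))  # (False, True)
```

With table 0 first, the check passes even though table 0 is broken, which matches the behaviour described here. With table 23 first, the same early failure makes the check fail, so the agent would be expected to set the flows up again.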

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/415804

Changed in neutron:
assignee: sunzuohua (zuohuasun) → shihanzhang (shihanzhang)
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/415804
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=0af6e6ded04c994b788836906d94fe097b581a0b
Submitter: Jenkins
Branch: master

commit 0af6e6ded04c994b788836906d94fe097b581a0b
Author: shihanzhang <email address hidden>
Date: Fri Dec 30 14:23:12 2016 +0800

    Change the order of installing flows for br-int

    For ovs-agent, it uses CANARY_TABLE table to check ovs status, when
    ovs-agent restarts, it should firstly install flows for CANARY_TABLE
    table.

    Closes-bug: #1642223
    Change-Id: I2aebbe5faca2fd4ec137255f0413cc2c129a4588

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/422464

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/newton)

Reviewed: https://review.openstack.org/422464
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=7baeff219fe2af0c8789d7f35bac4fadcbf2ea06
Submitter: Jenkins
Branch: stable/newton

commit 7baeff219fe2af0c8789d7f35bac4fadcbf2ea06
Author: shihanzhang <email address hidden>
Date: Fri Dec 30 14:23:12 2016 +0800

    Change the order of installing flows for br-int

    For ovs-agent, it uses CANARY_TABLE table to check ovs status, when
    ovs-agent restarts, it should firstly install flows for CANARY_TABLE
    table.

    Closes-bug: #1642223
    Change-Id: I2aebbe5faca2fd4ec137255f0413cc2c129a4588
    (cherry picked from commit 0af6e6ded04c994b788836906d94fe097b581a0b)

tags: added: in-stable-newton
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 10.0.0.0b3

This issue was fixed in the openstack/neutron 10.0.0.0b3 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 9.3.0

This issue was fixed in the openstack/neutron 9.3.0 release.
