When ovn-central 'leader: ovnnb_db' moves to a new unit, neutron-server needs restarting to reconnect

Bug #1921986 reported by Xav Paice
This bug affects 1 person
Affects: OpenStack Neutron API Charm
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (not set)

Bug Description

cs:ovn-central-5, cs:neutron-api-292, Focal, openstack-origin=distro.

When the leader for ovnnb_db moved from one ovn-central unit to another (due to a service restart), the neutron-server logs started to show tracebacks (see https://pastebin.ubuntu.com/p/QhMy58NyFb/).

This resulted in the Neutron API being either unresponsive or showing that all Neutron agents were marked XXX, depending on whether the request timed out or not.

Workaround: restart neutron-server after the leader moves.

Frode Nordahl (fnordahl) wrote :

Hello Xav, thank you for reporting this issue.

We have recently triaged a similar issue in bug 1907686 and have an SRU underway for it. Would it be possible for you to look for any traceback similar to https://bugs.launchpad.net/charm-ovn-chassis/+bug/1907686/comments/9 prior to the timeouts?

Xav Paice (xavpaice) wrote :

From neutron-ovn-metadata-agent.log:

2021-03-31 03:12:52.228 6968 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.130.12.66:6642: clustered database server is not cluster leader; trying another server
2021-03-31 03:12:52.230 6968 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.130.12.66:6642: connection closed by client
2021-03-31 03:12:59.195 4114 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.130.11.131:6642: connected

I see similar messages in neutron-server.log, but no evidence of ECONNREFUSED in either log.

Once the SRU lands, we can determine if that fixes this issue.
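
For checking other logs for the same pattern, here is a minimal Python sketch. It assumes the ovsdbapp vlog line format seen in the excerpt above; the names `VLOG_RE`, `scan_vlog` and `SAMPLE_LOG` are hypothetical and not part of any charm or library. It collects "not cluster leader" events and any lines mentioning ECONNREFUSED.

```python
import re

# Assumed line format, taken from the log excerpt in this bug:
# <date> <time> <pid> <level> ovsdbapp.backend.ovs_idl.vlog [-] <proto:ip:port>: <message>
VLOG_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+) \d+ \w+ "
    r"ovsdbapp\.backend\.ovs_idl\.vlog \[-\] "
    r"(?P<remote>\w+:[\d.]+:\d+): (?P<msg>.*)$"
)

# The three lines quoted from neutron-ovn-metadata-agent.log.
SAMPLE_LOG = """\
2021-03-31 03:12:52.228 6968 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.130.12.66:6642: clustered database server is not cluster leader; trying another server
2021-03-31 03:12:52.230 6968 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.130.12.66:6642: connection closed by client
2021-03-31 03:12:59.195 4114 INFO ovsdbapp.backend.ovs_idl.vlog [-] ssl:10.130.11.131:6642: connected
"""


def scan_vlog(lines):
    """Return (leader_changes, refused).

    leader_changes: list of (timestamp, remote) for "not cluster leader" lines.
    refused: list of raw lines that mention ECONNREFUSED, in any format.
    """
    leader_changes = []
    refused = []
    for raw in lines:
        line = raw.strip()
        if "ECONNREFUSED" in line:
            refused.append(line)
        match = VLOG_RE.match(line)
        if match and "is not cluster leader" in match["msg"]:
            leader_changes.append((match["ts"], match["remote"]))
    return leader_changes, refused


if __name__ == "__main__":
    changes, refused = scan_vlog(SAMPLE_LOG.splitlines())
    for ts, remote in changes:
        print(f"{ts}: {remote} reported it is not the cluster leader")
    print(f"ECONNREFUSED lines: {len(refused)}")
```

To scan a real file, pass an open file object instead of the sample, for example `scan_vlog(open(path))`, where `path` is wherever the log lives on the unit.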

Hua Zhang (zhhuabj) wrote :

The customer confirmed that using the temporary PPA mentioned in [1] fixed this problem as well.

[1] https://bugs.launchpad.net/charm-ovn-chassis/+bug/1907686/comments/15
