ovs-ctl ops may restart openvswitch during upgrade

Bug #1695893 reported by Brent Eagles on 2017-06-05
This bug affects 1 person
Affects: tripleo (Importance: High, Assigned to: Unassigned)
Affects: Newton (Importance: Undecided, Assigned to: Unassigned)

Bug Description

Note: This bug is causing upgrades to fail for Mitaka->Newton but may affect newer releases as well.

os-net-config was updated in Newton to set a default value for failmode on OVS bridges, which results in a restart of the affected interfaces. Restarting OVS interfaces and bridges invokes ifdown-ovs and ifup-ovs, which in turn invoke ovs-ctl, and ovs-ctl appears to restart openvswitch when an upgrade to OVS 2.6 has occurred. This results in a break in connectivity and causes the upgrade to fail.

A possible workaround is to apply the os-net-config package upgrade and the resulting interface changes before openvswitch is updated.
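
The ordering in that workaround can be sketched as a small plan builder. This is only an illustration of the sequence, not the change that was merged: the use of yum, the package names, and the config path /etc/os-net-config/config.json are assumptions about a typical deployment, and `build_upgrade_plan` is a hypothetical helper.

```python
from typing import List


def build_upgrade_plan(ovs_package: str = "openvswitch") -> List[List[str]]:
    """Return commands in the order suggested by the workaround.

    Nothing is executed here; the function only fixes the ordering so that
    interface restarts happen while the old openvswitch is still installed.
    Package names and the config path are assumptions, not taken from the fix.
    """
    return [
        # 1. Update os-net-config first, before touching openvswitch.
        ["yum", "-y", "update", "os-net-config"],
        # 2. Re-apply the network configuration so any interface restarts
        #    (for example from the new failmode default) happen now.
        ["os-net-config", "-c", "/etc/os-net-config/config.json"],
        # 3. Only then update openvswitch.
        ["yum", "-y", "update", ovs_package],
    ]


if __name__ == "__main__":
    for command in build_upgrade_plan():
        print(" ".join(command))
```

To actually run the steps, each command could be passed to `subprocess.run(command, check=True)` so that a failure stops the sequence before openvswitch is touched.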

Brent Eagles (beagles) wrote :

Log snippet:

May 22 19:09:44 localhost os-collect-config: [2017/05/22 11:09:44 PM] [INFO] No changes required for vlan interface: vlan301
May 22 19:09:44 localhost os-collect-config: [2017/05/22 11:09:44 PM] [INFO] running ifdown on interface: vlan200
May 22 19:09:44 localhost systemd: Starting Open vSwitch Database Unit...
May 22 19:09:44 localhost ovs-ctl: ovsdb-server is already running.
May 22 19:09:44 localhost ovs-ctl: Enabling remote OVSDB managers [ OK ]
May 22 19:09:44 localhost systemd: Stopping Open vSwitch...
May 22 19:09:44 localhost systemd: Stopped Open vSwitch.
May 22 19:09:44 localhost ovs-ctl: Killing ovsdb-server (830) [ OK ]
May 22 19:09:44 localhost systemd: Stopped Open vSwitch Database Unit.
May 22 19:09:44 localhost ovs-vsctl: ovs|00001|vsctl|INFO|Called as ovs-vsctl -t 10 -- --if-exists del-port br-infra vlan200
May 22 19:09:44 localhost ovs-vsctl: ovs|00002|db_ctl_base|ERR|unix:/var/run/openvswitch/db.sock: database connection failed (No such file or directory)
May 22 19:09:44 localhost os-collect-config: [2017/05/22 11:09:44 PM] [INFO] running ifdown on interface: eth2
May 22 19:09:44 localhost systemd: Starting Open vSwitch Database Unit...
May 22 19:09:44 localhost ovs-ctl: Backing up database to /etc/openvswitch/conf.db.backup7.12.1-2211824403 [ OK ]
May 22 19:09:44 localhost ovs-ctl: Compacting database [ OK ]
May 22 19:09:44 localhost ovs-ctl: Converting database schema [ OK ]
May 22 19:09:44 localhost ovs-ctl: Starting ovsdb-server [ OK ]

May 22 19:09:48 localhost systemd: Starting Open vSwitch Forwarding Unit...
May 22 19:09:48 localhost ovs-vswitchd: ovs|00006|bridge|ERR|another ovs-vswitchd process is running, disabling this process (pid 29426) until it goes away
May 22 19:12:53 localhost systemd: ovs-vswitchd.service start operation timed out. Terminating.
May 22 19:12:53 localhost systemd: Failed to start Open vSwitch Forwarding Unit.
May 22 19:12:53 localhost systemd: Dependency failed for Open vSwitch.
May 22 19:12:53 localhost systemd: Job openvswitch.service/start failed with result 'dependency'.
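
To check other nodes for the same pattern, a minimal sketch along these lines could scan a syslog file for Open vSwitch stop messages that follow an os-net-config ifdown. The regular expressions are based only on the message text in the snippet above, and `ovs_stops_after_ifdown` is a hypothetical helper; other OVS or systemd versions may log different wording.

```python
import re
from typing import Iterable, List, Tuple

# Patterns derived from the log snippet in this bug; treat them as assumptions.
IFDOWN = re.compile(r"running ifdown on interface: (\S+)")
OVS_STOP = re.compile(r"systemd: (Stopping|Stopped) Open vSwitch")


def ovs_stops_after_ifdown(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (interface, log line) pairs for OVS stop messages.

    Each stop message is attributed to the most recent ifdown seen before it.
    Stop messages that appear before any ifdown are ignored.
    """
    hits: List[Tuple[str, str]] = []
    current_interface = None
    for line in lines:
        match = IFDOWN.search(line)
        if match:
            current_interface = match.group(1)
            continue
        if current_interface is not None and OVS_STOP.search(line):
            hits.append((current_interface, line.strip()))
    return hits


if __name__ == "__main__":
    import sys

    with open(sys.argv[1]) as log_file:
        for interface, message in ovs_stops_after_ifdown(log_file):
            print(f"{interface}: {message}")
```

Run against the snippet above, this attributes the "Stopping Open vSwitch..." and "Stopped" messages to the ifdown of vlan200.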

Changed in tripleo:
status: New → Confirmed
importance: Undecided → High
description: updated
Marios Andreou (marios-b) wrote :

adding for context, this is also discussed at https://bugzilla.redhat.com/show_bug.cgi?id=1454640

Brent Eagles (beagles) on 2017-06-06
Changed in tripleo:
assignee: nobody → Brent Eagles (beagles)
Changed in tripleo:
status: Confirmed → Triaged
milestone: none → pike-3

Reviewed: https://review.openstack.org/471381
Committed: https://git.openstack.org/cgit/openstack/tripleo-heat-templates/commit/?id=9f8ba2c052e04c1ba8db756a48181a54c9cd8f68
Submitter: Jenkins
Branch: stable/newton

commit 9f8ba2c052e04c1ba8db756a48181a54c9cd8f68
Author: Brent Eagles <email address hidden>
Date: Tue Jun 6 11:13:48 2017 -0230

    Reconfigure interfaces before updating openvswitch

    os-net-config might restart interfaces after the package has been
    updated if the interface configurations have changed. However,
    restarting interfaces may cause openvswitch to be started/restarted as a
    side-effect during updates, possibly breaking connectivity during
    upgrades.

    Note: this is a newton only fix. Network related upgrade operations
    function differently starting in Ocata.

    Change-Id: I187922d6017ea72f2b26caeaf742e4de9478aced
    Closes-bug: #1695893

tags: added: in-stable-newton
Emilien Macchi (emilienm) wrote :

There are no currently open reviews on this bug, changing the status back to the previous state and unassigning. If there are active reviews related to this bug, please include links in comments.

Changed in tripleo:
assignee: Brent Eagles (beagles) → nobody
Changed in tripleo:
milestone: pike-3 → pike-rc1
Ben Nemec (bnemec) wrote :

This was newton-specific and the fix has merged.

Changed in tripleo:
status: Triaged → Fix Released

This issue was fixed in the openstack/tripleo-heat-templates 5.3.1 release.
