Migration from ml2/ovs fails because instances don't get ssh keys

Bug #1824984 reported by Jakub Libosvar
This bug affects 1 person
Affects          Status         Importance   Assigned to   Milestone
networking-ovn   Fix Released   Undecided    Unassigned

Bug Description

Description of problem:
On a Port_Binding event the OVN metadata agent fails. As a result, no haproxy processes are spawned in the namespace, and newly created instances cannot reach the metadata API.

2019-04-16 09:10:48.961 182681 INFO networking_ovn.agent.metadata.agent [-] Port 52e950ed-efad-4067-b14f-01c695e441e9 in datapath 20e76bff-3093-4a30-bb8a-3d7f17bf4be2 bound to our chassis
2019-04-16 09:10:49.728 182681 ERROR networking_ovn.agent.metadata.agent [-] Configured OVN bridge br-migration cannot be found in the system.: KeyError: u'br-migration'
2019-04-16 09:10:49.728 182681 ERROR networking_ovn.agent.metadata.agent Traceback (most recent call last):
2019-04-16 09:10:49.728 182681 ERROR networking_ovn.agent.metadata.agent File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/agent.py", line 367, in provision_datapath
2019-04-16 09:10:49.728 182681 ERROR networking_ovn.agent.metadata.agent ovs_bridges.remove(self.ovn_bridge)
2019-04-16 09:10:49.728 182681 ERROR networking_ovn.agent.metadata.agent KeyError: u'br-migration'
2019-04-16 09:10:49.728 182681 ERROR networking_ovn.agent.metadata.agent
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event [-] Unexpected exception in notify_loop: KeyError: u'br-migration'
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event Traceback (most recent call last):
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event File "/usr/lib/python2.7/site-packages/ovsdbapp/event.py", line 117, in notify_loop
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event match.run(event, row, updates)
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/agent.py", line 67, in wrapped
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event return f(*args, **kwargs)
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/agent.py", line 92, in run
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event self.agent.update_datapath(str(row.datapath.uuid))
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/agent.py", line 276, in update_datapath
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event self.provision_datapath(datapath)
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/agent.py", line 371, in provision_datapath
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event "the system.", self.ovn_bridge)
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 220, in __exit__
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event self.force_reraise()
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event File "/usr/lib/python2.7/site-packages/oslo_utils/excutils.py", line 196, in force_reraise
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event six.reraise(self.type_, self.value, self.tb)
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event File "/usr/lib/python2.7/site-packages/networking_ovn/agent/metadata/agent.py", line 367, in provision_datapath
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event ovs_bridges.remove(self.ovn_bridge)
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event KeyError: u'br-migration'
2019-04-16 09:10:49.729 182681 ERROR ovsdbapp.event

Version-Release number of selected component (if applicable):
python-networking-ovn-metadata-agent-5.0.2-0.20190307204430.6a774a0.el7ost.noarch

How reproducible:
Always

Steps to Reproduce:
1. Migrate to OVN
2. Spawn an instance with keypair

Actual results:
The instance does not get its SSH key injected because the metadata service fails.

Expected results:

Additional info:
The reason is that ovn-controller is started while the br-migration bridge still exists on the system. ovn-controller spawns ovn-metadata-agent, which caches its bridge setting. Later, br-migration is removed and br-int is set in ovsdb, but the agent does not pick up the change because it keeps using the value cached when the process started.
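
The failure mode described above can be reproduced in isolation. The following is only a minimal sketch, not the agent's real code: `FakeOvsdb` and `CachingAgent` are hypothetical stand-ins, and the only detail taken from the traceback is that `ovs_bridges.remove(self.ovn_bridge)` raises `KeyError` when the cached bridge name is no longer present.

```python
class FakeOvsdb:
    """Hypothetical stand-in for the local Open vSwitch database."""

    def __init__(self, bridge):
        self.configured_bridge = bridge   # bridge OVN is configured to use
        self.bridges = {bridge}           # bridges that exist on the system


class CachingAgent:
    """Hypothetical agent that reads the bridge name once, at start-up."""

    def __init__(self, ovsdb):
        self.ovsdb = ovsdb
        # Cached for the lifetime of the process; never refreshed.
        self.ovn_bridge = ovsdb.configured_bridge

    def provision_datapath(self):
        ovs_bridges = set(self.ovsdb.bridges)
        # set.remove() raises KeyError if the element is missing.
        ovs_bridges.remove(self.ovn_bridge)
        return ovs_bridges


# The agent starts while br-migration still exists.
ovsdb = FakeOvsdb("br-migration")
agent = CachingAgent(ovsdb)

# The migration completes: br-migration is removed, br-int is configured.
ovsdb.bridges = {"br-int"}
ovsdb.configured_bridge = "br-int"

try:
    agent.provision_datapath()
    failed = False
except KeyError as exc:
    failed = True
    missing = exc.args[0]
```

In this sketch the agent still holds `br-migration` after the database has moved on to `br-int`, so the lookup fails in the same way as in the log above.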

OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-ovn (master)

Reviewed: https://review.opendev.org/655360
Committed: https://git.openstack.org/cgit/openstack/networking-ovn/commit/?id=2037bfe5626c8d451786feae879cfb7aa80c8123
Submitter: Zuul
Branch: master

commit 2037bfe5626c8d451786feae879cfb7aa80c8123
Author: Jakub Libosvar <email address hidden>
Date: Tue Apr 23 14:32:28 2019 +0000

    metadata: Resync agent when misconfiguration is detected

    In case of migration, the metadata agent is kept running with old
    br-migration bridge cached in agent's attribute. Everything else is
    configured correctly but agent still uses old value. This patch adds a
    resync mechanism that's triggered by PortBindingChassisEvent in case
    agent fails to use currently cached bridge and the bridge doesn't exist
    on the system.

    Closes-bug: #1824984
    Change-Id: Ia17ad3476e61a729136622b4356ce59234ab023a
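
The idea behind the resync described in this commit message can be sketched as follows. This is an illustration under stated assumptions and not the merged patch: `FakeOvsdb`, `ResyncingAgent`, `resync_count`, and the retry-once policy are all hypothetical, and the real agent triggers its resync from PortBindingChassisEvent rather than from inside `provision_datapath`.

```python
class FakeOvsdb:
    """Hypothetical stand-in for the local Open vSwitch database."""

    def __init__(self, bridge, bridges):
        self.configured_bridge = bridge
        self.bridges = set(bridges)


class ResyncingAgent:
    """Hypothetical agent that refreshes its cached bridge on failure."""

    def __init__(self, ovsdb):
        self.ovsdb = ovsdb
        self.resync_count = 0
        self.ovn_bridge = ovsdb.configured_bridge

    def resync(self):
        # Re-read the configured bridge instead of trusting the cache.
        self.ovn_bridge = self.ovsdb.configured_bridge
        self.resync_count += 1

    def provision_datapath(self):
        ovs_bridges = set(self.ovsdb.bridges)
        try:
            ovs_bridges.remove(self.ovn_bridge)
        except KeyError:
            # The cached bridge does not exist on the system:
            # refresh the cache and retry once (assumed policy).
            self.resync()
            ovs_bridges.remove(self.ovn_bridge)
        return ovs_bridges


# The agent starts during migration, then the bridges change underneath it.
ovsdb = FakeOvsdb("br-migration", {"br-migration", "br-ex"})
agent = ResyncingAgent(ovsdb)
ovsdb.bridges = {"br-int", "br-ex"}
ovsdb.configured_bridge = "br-int"

remaining = agent.provision_datapath()
```

After the call, the sketch agent holds `br-int` and has resynced exactly once. If the refreshed bridge were also missing, the second `remove()` would still raise, which keeps a genuine misconfiguration visible.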

Changed in networking-ovn:
status: New → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to networking-ovn (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/660986

OpenStack Infra (hudson-openstack) wrote : Fix proposed to networking-ovn (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/660987

OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-ovn (stable/rocky)

Reviewed: https://review.opendev.org/660987
Committed: https://git.openstack.org/cgit/openstack/networking-ovn/commit/?id=09ab1ff1f0c2811ab0b12ceb73492b5f81015ef0
Submitter: Zuul
Branch: stable/rocky

commit 09ab1ff1f0c2811ab0b12ceb73492b5f81015ef0
Author: Jakub Libosvar <email address hidden>
Date: Tue Apr 23 14:32:28 2019 +0000

    metadata: Resync agent when misconfiguration is detected

    In case of migration, the metadata agent is kept running with old
    br-migration bridge cached in agent's attribute. Everything else is
    configured correctly but agent still uses old value. This patch adds a
    resync mechanism that's triggered by PortBindingChassisEvent in case
    agent fails to use currently cached bridge and the bridge doesn't exist
    on the system.

    Closes-bug: #1824984
    Change-Id: Ia17ad3476e61a729136622b4356ce59234ab023a
    (cherry picked from commit 2037bfe5626c8d451786feae879cfb7aa80c8123)
    (cherry picked from commit 3ecd9618d3da1e99064924997ee090e9d6dbff7d)

tags: added: in-stable-rocky
OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-ovn (stable/stein)

Reviewed: https://review.opendev.org/660986
Committed: https://git.openstack.org/cgit/openstack/networking-ovn/commit/?id=3ecd9618d3da1e99064924997ee090e9d6dbff7d
Submitter: Zuul
Branch: stable/stein

commit 3ecd9618d3da1e99064924997ee090e9d6dbff7d
Author: Jakub Libosvar <email address hidden>
Date: Tue Apr 23 14:32:28 2019 +0000

    metadata: Resync agent when misconfiguration is detected

    In case of migration, the metadata agent is kept running with old
    br-migration bridge cached in agent's attribute. Everything else is
    configured correctly but agent still uses old value. This patch adds a
    resync mechanism that's triggered by PortBindingChassisEvent in case
    agent fails to use currently cached bridge and the bridge doesn't exist
    on the system.

    Closes-bug: #1824984
    Change-Id: Ia17ad3476e61a729136622b4356ce59234ab023a
    (cherry picked from commit 2037bfe5626c8d451786feae879cfb7aa80c8123)

tags: added: in-stable-stein
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn 7.0.0.0b1

This issue was fixed in the openstack/networking-ovn 7.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn 6.0.1

This issue was fixed in the openstack/networking-ovn 6.0.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-ovn 5.1.0

This issue was fixed in the openstack/networking-ovn 5.1.0 release.
