KeyError in ovs_neutron_agent._bind_devices

Bug #1452903 reported by Matt Riedemann
This bug affects 1 person
Affects   Status         Importance   Assigned to    Milestone
neutron   Fix Released   Undecided    Kevin Benton

Bug Description

Seeing this all over the gate lately:

http://logs.openstack.org/95/176395/1/gate/gate-tempest-dsvm-neutron-full/37e1139/logs/screen-q-agt.txt.gz?level=TRACE#_2015-05-07_19_27_38_157

2015-05-07 19:27:38.157 ERROR neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-0266e813-4a74-47c6-a1ed-6acc81317cb9 None None] Error while processing VIF ports
2015-05-07 19:27:38.157 21868 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent Traceback (most recent call last):
2015-05-07 19:27:38.157 21868 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent File "/opt/stack/new/neutron/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 1637, in rpc_loop
2015-05-07 19:27:38.157 21868 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent ovs_restarted)
2015-05-07 19:27:38.157 21868 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent File "/opt/stack/new/neutron/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 1422, in process_network_ports
2015-05-07 19:27:38.157 21868 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent self._bind_devices(need_binding_devices)
2015-05-07 19:27:38.157 21868 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent File "/opt/stack/new/neutron/neutron/plugins/openvswitch/agent/ovs_neutron_agent.py", line 736, in _bind_devices
2015-05-07 19:27:38.157 21868 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent lvm = self.local_vlan_map[port_detail['network_id']]
2015-05-07 19:27:38.157 21868 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent KeyError: u'323b3bcb-1530-4c76-83b1-49a35f255179'
2015-05-07 19:27:38.157 21868 TRACE neutron.plugins.openvswitch.agent.ovs_neutron_agent

http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiRXJyb3Igd2hpbGUgcHJvY2Vzc2luZyBWSUYgcG9ydHNcIiBBTkQgbWVzc2FnZTpcIktleUVycm9yXCIgQU5EIHRhZ3M6XCJzY3JlZW4tcS1hZ3QudHh0XCIiLCJmaWVsZHMiOltdLCJvZmZzZXQiOjAsInRpbWVmcmFtZSI6IjYwNDgwMCIsImdyYXBobW9kZSI6ImNvdW50IiwidGltZSI6eyJ1c2VyX2ludGVydmFsIjowfSwic3RhbXAiOjE0MzEwMzIyNzE5MTR9

2321 hits in 7 days. It isn't directly tied to failures (the jobs are still 93% successful), but it has been spiking in the last few days.

Matt Riedemann (mriedem) wrote :

It looks like this happens in some polling loop, so it may be expected: we might just be in the middle of tearing down a port. Not sure.

Changed in neutron:
status: New → Confirmed
Matt Riedemann (mriedem) wrote :

Given the timing of the spike, this is probably the culprit:

https://github.com/openstack/neutron/commit/bd5373b670cdd7f21f8a1ece98fde6be9fda71ab

Sean M. Collins (scollins) wrote :

It's possibly being caused by https://review.openstack.org/#/c/118274/, which added code that modifies the local_vlan_map data structure in the agent.

Changed in neutron:
assignee: nobody → Kevin Benton (kevinbenton)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/181185

Changed in neutron:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/181185
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=3a1175b88a436eecf00b8f04e5cc9f5cbce3ee06
Submitter: Jenkins
Branch: master

commit 3a1175b88a436eecf00b8f04e5cc9f5cbce3ee06
Author: Kevin Benton <email address hidden>
Date: Sat May 2 23:10:52 2015 -0700

    Check for missing network in _bind_devices

    _bind_devices was making the assumption that the ports it
    was operating on had local VLAN map entries for their network.
    This wasn't the case when a network was deleted right before
    _bind_devices was called because the VLAN was reclaimed.

    This patch just checks to see if the network ID has an entry
    in the map. If not, it skips the port. The port will be handled
    on the next scan_ports iteration when the agent will discover that
    the port is no longer defined on the plugin and it will be placed
    in the DEAD vlan.

    Change-Id: Ica51d727aceb41848fec0f4edbd16916365941ee
    Closes-Bug: #1452903
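
The check described in the commit message can be sketched roughly as follows. This is not the actual neutron code: `bind_devices`, the `port_id` key, and the sample data are hypothetical stand-ins, and only `local_vlan_map` and `port_detail['network_id']` come from the traceback above. The idea is simply to look the network up without assuming it is present, and skip the port if its entry is gone.

```python
import logging

LOG = logging.getLogger(__name__)


def bind_devices(local_vlan_map, need_binding_devices):
    """Hypothetical stand-in for the agent's _bind_devices.

    local_vlan_map maps network_id -> local VLAN info (assumed shape).
    Returns (bound, skipped) so the effect of the guard is visible.
    """
    bound, skipped = [], []
    for port_detail in need_binding_devices:
        # dict.get() returns None instead of raising KeyError when the
        # network was deleted (and its VLAN reclaimed) before binding ran.
        lvm = local_vlan_map.get(port_detail['network_id'])
        if lvm is None:
            LOG.debug("Network %s not in local VLAN map; skipping port %s",
                      port_detail['network_id'], port_detail['port_id'])
            skipped.append(port_detail['port_id'])
            continue
        bound.append((port_detail['port_id'], lvm))
    return bound, skipped


# Illustrative data only: one port on a known network, one on a deleted one.
vlan_map = {'net-a': 1}
ports = [
    {'port_id': 'p1', 'network_id': 'net-a'},
    {'port_id': 'p2', 'network_id': 'net-deleted'},
]
bound, skipped = bind_devices(vlan_map, ports)
```

In this sketch the skipped port is just reported back to the caller; per the commit message, the real agent leaves such ports to be picked up on the next scan_ports iteration.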

Changed in neutron:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (neutron-pecan)

Fix proposed to branch: neutron-pecan
Review: https://review.openstack.org/185072

Thierry Carrez (ttx)
Changed in neutron:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-1 → 7.0.0