Neutron ml2 linux bridge agent fails to clean up bridges on high volumes of deletes

Bug #1698271 reported by Collin M.
This bug affects 1 person
Affects   Status         Importance   Assigned to    Milestone
neutron   Fix Released   Low          Kevin Benton

Bug Description

The neutron linux bridge ml2 plugin fails to clean up brq bridges on nodes hosting dhcp agents.

Environment:
Mirantis OpenStack 8 / Liberty
Linux Bridge ml2 plugin
vxlan segmentation

The workload in this environment creates and deletes large numbers of networks and subnets over very short periods of time.

The issue appears to arise from the network being marked as deleted before the agent's cleanup has had a chance to run.

An example log section:

2017-06-15 00:19:11.594 449231 DEBUG neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-d682e11a-f441-4417-b584-07ecc6f7737a ebfe3ac655d0babdd553ea5f8e1ef87bc94dece509a5bc26cd7d26fc58532cfc a5c292ffb8cf484ab74d7954a94cfad1 - - -] network_delete received network_delete /usr/lib/python2.7/dist-packages/neutron/plugins/ml2/drivers/linuxbridge/agent/linuxbridge_neutron_agent.py:770
2017-06-15 00:19:11.595 449231 ERROR neutron.plugins.ml2.drivers.linuxbridge.agent.linuxbridge_neutron_agent [req-d682e11a-f441-4417-b584-07ecc6f7737a ebfe3ac655d0babdd553ea5f8e1ef87bc94dece509a5bc26cd7d26fc58532cfc a5c292ffb8cf484ab74d7954a94cfad1 - - -] Network f834212c-1e11-4bc1-b3ee-89a61b5ee6ed is not available.

A collection of example bridges that failed to get cleaned up:
brqfed1d4bd-de 8000.92bb5a5feba5 no vxlan-77488
brqfeebf10d-df 8000.12e256acb4d7 no vxlan-72765
brqff281599-d2 8000.f26313c6ba80 no vxlan-66828
brqff2dca83-db 8000.c6f2263ea94d no vxlan-71111
brqff40fcfa-8e 8000.3621679c4a97 no vxlan-75989
brqff96bdd2-5c 8000.4efaf9b1ce1f no vxlan-69185
brqffe6412f-20 8000.ca59d8a1a1aa no vxlan-66860
brqffea1c9e-82 8000.8a917769c6e4 no vxlan-67359
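
For triage, rows like the ones above can be split into fields mechanically. The sketch below assumes the listing is `brctl show` output without its header row (bridge name, bridge ID, STP flag, attached interface) and that the agent names each bridge `brq` plus the first 11 characters of the network UUID. `BridgeRow` and `parse_bridge_row` are hypothetical names used only for illustration.

```python
from typing import NamedTuple, Optional

BRIDGE_PREFIX = "brq"  # assumed prefix used by the Linux bridge agent


class BridgeRow(NamedTuple):
    bridge: str
    network_id_prefix: str  # leading characters of the network UUID
    vni: Optional[int]      # VXLAN network identifier, if an interface is attached


def parse_bridge_row(line: str) -> Optional[BridgeRow]:
    """Parse one headerless row; return None for rows that are not brq bridges."""
    fields = line.split()
    if not fields or not fields[0].startswith(BRIDGE_PREFIX):
        return None
    bridge = fields[0]
    vni = None
    # The last column, when present, is assumed to look like "vxlan-<VNI>".
    if len(fields) >= 4 and fields[3].startswith("vxlan-"):
        vni = int(fields[3].split("-", 1)[1])
    return BridgeRow(bridge, bridge[len(BRIDGE_PREFIX):], vni)


row = parse_bridge_row("brqfed1d4bd-de 8000.92bb5a5feba5 no vxlan-77488")
print(row)
```

The recovered UUID prefix can then be matched against the network IDs that appear in the agent log.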

Collin M. (ziggit)
description: updated
Kevin Benton (kevinbenton) wrote :

This is fundamentally an issue with the network being deleted either before the port was processed on the agent in that network or while the agent was offline.

We can adjust the logic to always try to delete the bridge associated with a deleted network's ID. However, that won't solve the case where bridges are left behind because a network was deleted while the agent was offline.

To address the offline case I think you will always need to run a manual cleanup script unless we change the agent to try to delete bridges it doesn't recognize, but that seems risky.
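
A manual cleanup script of the kind mentioned above could start by working out which bridges no longer map to a network. This is only a sketch: it assumes the `brq` plus first-11-characters naming, takes the host's bridge names and the current network IDs as plain inputs (how they are collected is left out), and prints candidate commands instead of running them. `bridge_name_for` and `find_orphan_bridges` are hypothetical helpers, not part of neutron, and the "live" network ID is made up.

```python
from typing import Iterable, List

BRIDGE_PREFIX = "brq"  # assumed agent bridge prefix
ID_CHARS = 11          # assumed number of network-UUID characters in the name


def bridge_name_for(network_id: str) -> str:
    """Bridge name the agent is assumed to use for a network ID."""
    return BRIDGE_PREFIX + network_id[:ID_CHARS]


def find_orphan_bridges(host_bridges: Iterable[str],
                        live_network_ids: Iterable[str]) -> List[str]:
    """Return brq bridges on the host that match no live network."""
    expected = {bridge_name_for(n) for n in live_network_ids}
    return sorted(b for b in host_bridges
                  if b.startswith(BRIDGE_PREFIX) and b not in expected)


# Made-up inputs: one leftover bridge, one bridge with a live network,
# and one bridge the agent does not manage.
host_bridges = ["brqfed1d4bd-de", "brq11111111-22", "virbr0"]
live_networks = ["11111111-2222-3333-4444-555555555555"]

for bridge in find_orphan_bridges(host_bridges, live_networks):
    # Printed for review only; nothing is deleted by this sketch.
    print("candidate: ip link delete %s type bridge" % bridge)
```

Only names with the `brq` prefix are considered, which sidesteps the risk of touching bridges the agent does not own. Any VXLAN interface still attached to a leftover bridge would need separate handling.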

Changed in neutron:
status: New → Confirmed
importance: Undecided → Low
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/475317

Changed in neutron:
assignee: nobody → Kevin Benton (kevinbenton)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/475317
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=499faa307489811b9ae2373427f72d95e62aeab5
Submitter: Jenkins
Branch: master

commit 499faa307489811b9ae2373427f72d95e62aeab5
Author: Kevin Benton <email address hidden>
Date: Mon Jun 19 03:05:39 2017 -0700

    Always try to delete bridge for ID on network_delete

    If network_deletes are received before port creates
    are processed, the agent might not have the network in
    its map even though it has a bridge to delete.

    This adjusts the logic to always try to delete the bridge
    corresponding to a network_id even if it's not in the
    network_map yet.

    Change-Id: I5e72bff2ffd9568f272ed48187ad543ab5a3d1ec
    Closes-Bug: #1698271
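
The behaviour change described in the commit message can be illustrated with a small simulation. This is not the neutron code: `FakeAgent`, its `bridges` set, and both handler methods are hypothetical stand-ins, and the `brq` plus first-11-characters bridge naming is assumed. It shows a `network_delete` arriving before the agent has recorded the network in its map.

```python
BRIDGE_PREFIX = "brq"  # assumed agent bridge prefix


def bridge_name_for(network_id):
    """Bridge name the agent is assumed to use for a network ID."""
    return BRIDGE_PREFIX + network_id[:11]


class FakeAgent:
    """Toy stand-in for the agent: a network map plus the bridges on the host."""

    def __init__(self, bridges):
        self.network_map = {}        # filled in only when ports are processed
        self.bridges = set(bridges)  # bridges that currently exist on the host

    def network_delete_old(self, network_id):
        # Pre-fix behaviour as described in the report: give up when the
        # network is unknown, leaving any existing bridge behind.
        if network_id not in self.network_map:
            return False
        return self._delete_bridge(network_id)

    def network_delete_new(self, network_id):
        # Post-fix behaviour per the commit message: always try the bridge
        # derived from the ID, whether or not the network is in the map.
        return self._delete_bridge(network_id)

    def _delete_bridge(self, network_id):
        name = bridge_name_for(network_id)
        self.network_map.pop(network_id, None)
        if name in self.bridges:
            self.bridges.remove(name)
            return True
        return False


net = "f834212c-1e11-4bc1-b3ee-89a61b5ee6ed"
agent = FakeAgent(["brqf834212c-1e"])
print(agent.network_delete_old(net), sorted(agent.bridges))
print(agent.network_delete_new(net), sorted(agent.bridges))
```

In this toy model the old handler returns without touching the bridge, while the new one removes it. As noted earlier in the thread, neither variant helps when the delete happens while the agent is offline.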

Changed in neutron:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 11.0.0.0b3

This issue was fixed in the openstack/neutron 11.0.0.0b3 development milestone.
