OVS port pulled from under dnsmasq

Bug #1624701 reported by Armando Migliaccio
This bug affects 1 person
Affects   Status         Importance   Assigned to    Milestone
neutron   Fix Released   Critical     Kevin Benton

Bug Description

Change [1] has triggered an issue with the DHCP agent. This can be reproduced as below:

- A subnet gets deleted
- A dhcp port is deleted
- A notification is sent to the agent
- DHCP agent deletes the port from OVS without killing dnsmasq first
- dnsmasq still holds a file handle to the interface
- Do this many times
- OVS goes nuts with [2], where the trace 'added interface tap%% on port ##' appears a gazillion times, pointing to the same OVS ofport.

[1] https://review.openstack.org/#/c/355117/
[2] http://logs.openstack.org/74/370974/2/check/gate-tempest-dsvm-neutron-dvr-ubuntu-xenial/4d01980/logs/openvswitch/ovs-vswitchd.txt.gz
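The ordering problem in the steps above can be illustrated with a toy model. This is only a sketch: `FakeDnsmasq`, `FakeBridge`, and both removal functions are hypothetical stand-ins invented for illustration, not neutron or OVS code. It shows only that removing the port first leaves a running process holding the interface, while stopping the process first does not.

```python
class FakeDnsmasq:
    """Hypothetical stand-in for a dnsmasq process bound to one interface."""

    def __init__(self, interface):
        self.interface = interface
        self.running = True

    def stop(self):
        # Once stopped, the process no longer holds the interface.
        self.running = False


class FakeBridge:
    """Hypothetical stand-in for an OVS bridge that records port deletions."""

    def __init__(self):
        self.ports = set()
        self.events = []

    def add_port(self, name):
        self.ports.add(name)

    def del_port(self, name, processes):
        # Record whether a running process still held the interface
        # at the moment the port was removed.
        held = any(p.running and p.interface == name for p in processes)
        self.events.append((name, "held" if held else "free"))
        self.ports.discard(name)


def remove_port_then_stop(bridge, proc):
    """Order described in this bug: the port is pulled while dnsmasq runs."""
    bridge.del_port(proc.interface, [proc])
    proc.stop()


def stop_then_remove_port(bridge, proc):
    """Safer order: stop dnsmasq first, then remove the port."""
    proc.stop()
    bridge.del_port(proc.interface, [proc])


if __name__ == "__main__":
    for remove in (remove_port_then_stop, stop_then_remove_port):
        bridge = FakeBridge()
        proc = FakeDnsmasq("tap0")
        bridge.add_port("tap0")
        remove(bridge, proc)
        print(remove.__name__, bridge.events)
```

The model does not reproduce the duplicate ofport assignment itself; it only makes the "held at deletion time" condition visible.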

Revision history for this message
Armando Migliaccio (armando-migliaccio) wrote :
Changed in neutron:
milestone: none → newton-rc2
importance: Undecided → High
tags: added: l3-ipam-dhcp
tags: added: newton-rc-potential
Changed in neutron:
status: New → Confirmed
assignee: nobody → Kevin Benton (kevinbenton)
Changed in neutron:
status: Confirmed → In Progress
Revision history for this message
Kevin Benton (kevinbenton) wrote :

This has to be fixed for RC2. It makes OVS completely unstable otherwise.

Changed in neutron:
importance: High → Critical
Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

Could this affect other stable branches? Checking the code.

Revision history for this message
Miguel Angel Ajo (mangelajo) wrote :

I mean, not by [1], but because of the way we delete ports in the linux/dhcp.py driver.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/371890
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=251922f5de2de35fb4766da6ab3e7a195c30310c
Submitter: Jenkins
Branch: master

commit 251922f5de2de35fb4766da6ab3e7a195c30310c
Author: Kevin Benton <email address hidden>
Date: Thu Sep 15 14:10:20 2016 -0700

    Disable DHCP on agent port removal

    The previous logic was just ripping the interface out without
    stopping dnsmasq. This would lead to a file handle remaining to the
    interface which would cause OVS to completely freak out and assign
    the same ofport to multiple ports.

    This preserves the behavior introduced in
    I40b85033d075562c43ce4d0e68296211b3241197 but just fully disables
    DHCP rather than relying on an exception generation to cause the
    resync.

    Closes-bug: #1624701
    Change-Id: Icdd9ac136eeb3707c912853b134dbb58109e6940
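The approach described in the commit message above (fully disable DHCP for the network, then resync, instead of waiting for an exception to trigger the resync) might be sketched as follows. This is a toy model under stated assumptions: `ToyDhcpAgent` and all of its methods and attributes are hypothetical names chosen for illustration and are not the neutron DHCP agent's API.

```python
class ToyDhcpAgent:
    """Hypothetical agent tracking DHCP state per network (not neutron code)."""

    def __init__(self):
        self.dnsmasq_running = {}   # network_id -> bool
        self.port_plugged = {}      # network_id -> bool
        self.resync_queue = []      # networks waiting to be rebuilt
        self.log = []               # ordered record of actions taken

    def enable_dhcp(self, network_id):
        self.port_plugged[network_id] = True
        self.dnsmasq_running[network_id] = True

    def disable_dhcp(self, network_id):
        # Stop the DHCP server before touching the interface so that
        # nothing holds the device when it is unplugged.
        if self.dnsmasq_running.get(network_id):
            self.log.append(("stop_dnsmasq", network_id))
            self.dnsmasq_running[network_id] = False
        if self.port_plugged.get(network_id):
            self.log.append(("unplug_port", network_id))
            self.port_plugged[network_id] = False

    def on_agent_port_deleted(self, network_id):
        # Explicitly disable DHCP, then request a resync directly
        # rather than relying on a later failure to trigger one.
        self.disable_dhcp(network_id)
        if network_id not in self.resync_queue:
            self.resync_queue.append(network_id)


if __name__ == "__main__":
    agent = ToyDhcpAgent()
    agent.enable_dhcp("net-1")
    agent.on_agent_port_deleted("net-1")
    print(agent.log)
    print(agent.resync_queue)
```

Keeping the stop and unplug steps inside one `disable_dhcp` method means every caller gets the same ordering, which is the property the bug shows to matter.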

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/372595

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/newton)

Reviewed: https://review.openstack.org/372595
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=2f44402777a662fb68a069443b41c75b68b05287
Submitter: Jenkins
Branch: stable/newton

commit 2f44402777a662fb68a069443b41c75b68b05287
Author: Kevin Benton <email address hidden>
Date: Thu Sep 15 14:10:20 2016 -0700

    Disable DHCP on agent port removal

    The previous logic was just ripping the interface out without
    stopping dnsmasq. This would lead to a file handle remaining to the
    interface which would cause OVS to completely freak out and assign
    the same ofport to multiple ports.

    This preserves the behavior introduced in
    I40b85033d075562c43ce4d0e68296211b3241197 but just fully disables
    DHCP rather than relying on an exception generation to cause the
    resync.

    Closes-bug: #1624701
    Change-Id: Icdd9ac136eeb3707c912853b134dbb58109e6940
    (cherry picked from commit 251922f5de2de35fb4766da6ab3e7a195c30310c)

tags: added: in-stable-newton
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 9.0.0.0rc2

This issue was fixed in the openstack/neutron 9.0.0.0rc2 release candidate.

tags: removed: newton-rc-potential
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 10.0.0.0b1

This issue was fixed in the openstack/neutron 10.0.0.0b1 development milestone.
