[SRU] DHCP agent: interface unplug leads to exception

Bug #1498370 reported by Gary Kotton on 2015-09-22
This bug affects 1 person
Affects            Importance  Assigned to
neutron            High        Gary Kotton
neutron (Ubuntu)   High        Unassigned
  Vivid            High        Edward Hope-Morley
  Wily             High        Unassigned
  Xenial           High        Unassigned

Bug Description

[Impact]

    There are edge cases when the DHCP agent attempts to unplug an interface
    and the device does not exist. This patch ensures that the agent can
    tolerate this case.

[Test Case]

    * create subnet with dhcp enabled
    * set pdb.set_trace() in neutron.agent.linux.dhcp.DeviceManager.destroy()
    * manually delete ns-<id> device in tenant namespace
    * pdb continue and should not raise any error

[Regression Potential]

    None

2015-09-22 01:23:42.612 ERROR neutron.agent.dhcp.agent [-] Unable to disable dhcp for c543db4d-e077-488f-b58c-5805f63f86b6.
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent Traceback (most recent call last):
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/dhcp/agent.py", line 115, in call_driver
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent getattr(driver, action)(**action_kwargs)
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/dhcp.py", line 221, in disable
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent self._destroy_namespace_and_port()
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/dhcp.py", line 226, in _destroy_namespace_and_port
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent self.device_manager.destroy(self.network, self.interface_name)
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/dhcp.py", line 1223, in destroy
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent self.driver.unplug(device_name, namespace=network.namespace)
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/interface.py", line 358, in unplug
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent tap_name = self._get_tap_name(device_name, prefix)
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/interface.py", line 299, in _get_tap_name
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent dev_name = dev_name.replace(prefix or self.DEV_NAME_PREFIX,
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent AttributeError: 'NoneType' object has no attribute 'replace'
2015-09-22 01:23:42.612 TRACE neutron.agent.dhcp.agent
2015-09-22 01:23:42.616 INFO neutron.agent.dhcp.agent [-] Synchronizing state complete

The reason is that the device name passed to unplug is None.
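
The failure at the bottom of the traceback can be reproduced in isolation. The following is a minimal sketch, not the neutron code itself: `get_tap_name` and `try_unplug` are hypothetical stand-ins, the `ns-` prefix is taken from the device names in the test case, and the `tap` replacement string is an assumption because the traceback line is cut off before its second argument.

```python
# Assumed prefix, based on the ns-<id> device names mentioned in the test case.
DEV_NAME_PREFIX = "ns-"


def get_tap_name(dev_name, prefix=None):
    """Hypothetical stand-in for the failing line in the traceback.

    The replacement string "tap" is an assumption; the original line is
    truncated in the log.
    """
    return dev_name.replace(prefix or DEV_NAME_PREFIX, "tap")


def try_unplug(dev_name):
    """Return the tap name, or a description of the error that was raised."""
    try:
        return get_tap_name(dev_name)
    except AttributeError as exc:
        return "AttributeError: %s" % exc


if __name__ == "__main__":
    print(try_unplug("ns-abc"))  # a normal device name is rewritten
    print(try_unplug(None))      # a None device name raises AttributeError
```

Calling `.replace` on None is what produces the `'NoneType' object has no attribute 'replace'` message seen in the log.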

Assaf Muller (amuller) wrote :

How does one reproduce this? Did you see this in a CI run?

Kevin Benton (kevinbenton) wrote :

Hi Gary,

I traced through this code and self.interface can't be set to None with the regular drivers in neutron/agent/linux/interface.py because get_device_name returns a string.

Are you using a different interface driver? If so, I would suggest looking at the get_device_name method and making sure it can't return None.

Changed in neutron:
status: New → Incomplete
Gary Kotton (garyk) wrote :

This happens when running a new devstack. It reproduces pretty much 100% of the time. The difference between this and the vanilla devstack is that it uses iniset $Q_DHCP_CONF_FILE DEFAULT ovs_use_veth True
I was unable to get to the root cause.

Fix proposed to branch: master
Review: https://review.openstack.org/228229

Changed in neutron:
assignee: nobody → Gary Kotton (garyk)
status: Incomplete → In Progress
Gary Kotton (garyk) on 2015-09-27
Changed in neutron:
importance: Undecided → High

It seems the DHCP agent did not receive some earlier messages, but it did receive the message when the network needed to be deleted.

I cannot reproduce this with latest devstack.

Reviewed: https://review.openstack.org/228229
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=caebc8fb8e8d9782746c3cc3ddc86f786342c819
Submitter: Jenkins
Branch: master

commit caebc8fb8e8d9782746c3cc3ddc86f786342c819
Author: Gary Kotton <email address hidden>
Date: Sun Sep 27 00:24:31 2015 -0700

    DHCP: protect against case when device name is None

    There are edge cases when the agent attempts to unplug an interface and
    the device does not exist.

    Change-Id: I6917ec94f685f3dd3bff6aa1d43dc56aab76274a
    Closes-bug: #1498370
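
The kind of guard this commit message describes might look like the following. This is a sketch under stated assumptions, not the committed patch: `FakeDriver` and the free function `destroy` are hypothetical, and the only behaviour taken from the report is that unplug must not be called when the device name is None.

```python
import logging

LOG = logging.getLogger(__name__)


class FakeDriver:
    """Hypothetical interface driver that records unplug calls."""

    def __init__(self):
        self.unplugged = []

    def unplug(self, device_name, namespace=None):
        # Mirrors the failure in the traceback: a None name is unusable.
        if device_name is None:
            raise AttributeError(
                "'NoneType' object has no attribute 'replace'")
        self.unplugged.append((device_name, namespace))


def destroy(driver, device_name, namespace):
    """Unplug the DHCP device only when a device name is known.

    Returns True if unplug was attempted, False if it was skipped.
    """
    if device_name:
        driver.unplug(device_name, namespace=namespace)
        return True
    # Nothing to unplug; log and carry on instead of raising.
    LOG.debug("No interface to unplug in namespace %s", namespace)
    return False


if __name__ == "__main__":
    driver = FakeDriver()
    destroy(driver, None, "qdhcp-example")      # skipped, no exception
    destroy(driver, "ns-example", "qdhcp-example")
    print(driver.unplugged)
```

Checking the name at the caller keeps the driver untouched, so any remaining cleanup can still run when the device is already gone.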

Changed in neutron:
status: In Progress → Fix Committed
Kyle Mestery (mestery) on 2015-10-07
Changed in neutron:
milestone: none → liberty-rc2

Reviewed: https://review.openstack.org/231924
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=0b07910f33ed26fbdd13530eafbdefd74104424d
Submitter: Jenkins
Branch: stable/liberty

commit 0b07910f33ed26fbdd13530eafbdefd74104424d
Author: Gary Kotton <email address hidden>
Date: Sun Sep 27 00:24:31 2015 -0700

    DHCP: protect against case when device name is None

    There are edge cases when the agent attempts to unplug an interface and
    the device does not exist.

    Change-Id: I6917ec94f685f3dd3bff6aa1d43dc56aab76274a
    Closes-bug: #1498370
    (cherry picked from commit caebc8fb8e8d9782746c3cc3ddc86f786342c819)

tags: added: in-stable-liberty
Thierry Carrez (ttx) on 2015-10-08
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2015-10-15
Changed in neutron:
milestone: liberty-rc2 → 7.0.0

Reviewed: https://review.openstack.org/235300
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=140ccc36d172bead2605968b3d61b36cca8a0040
Submitter: Jenkins
Branch: master

commit 6dcfe3a9362ae5fcf18e5cfb59663e43446cd59c
Author: Kevin Benton <email address hidden>
Date: Tue Oct 6 19:28:47 2015 -0700

    Mock oslo policy HTTPCheck instead of urllib

    We were mocking internal behavior of oslo policy by
    patching urllib. This will break with the upcoming oslo
    release that switches to requests.

    This patch changes the mock to the HTTPCheck level and we
    can leave implementation details testing up to oslo_policy.

    Change-Id: I07957f01307e25f1547197c720eea6e3e7f0ef5a
    Closes-Bug: #1503890
    (cherry picked from commit a0f1d9d6de1560be91d3001c8ac9f880a7a5a7e0)

    Add testresources used by oslo.db fixture

    If we use oslo.db fixtures, we'll need the package or
    the next version of oslo.db release will break us.

    Closes-Bug: #1503501
    Change-Id: I7dfbf240333095d91a414ba15a439bdc4804eb25
    (cherry picked from commit 86ad967e40c2c6752ec0fb46cfd3098ede0c7178)

    Fix functional test_server tests

    Now oslo.service 0.10.0 no longer sends SIGHUP to parent and
    children services.

    This was a change introduced by 286a6ea, and since it invalidated
    the very logic under test, this must be revised.

    (cherry picked from commit 090fe713592c2b6398d999bfa03b80cbb2054609)

    Change-Id: I18a11283925369bc918002477774f196010a1bc3
    Closes-bug: #1505438
    (cherry picked from commit 090fe713592c2b6398d999bfa03b80cbb2054609)

    Make test_server work with older versions of oslo.service

    Change I18a11283925369bc918002477774f196010a1bc3 fixed the test for
    oslo.service >= 0.10.0, but it also broke it for older versions of
    oslo.service. Since the library has minimal version of >= 0.7.0 in
    requirements.txt, test should pass for those versions too.

    Now, instead of validating that either reset() or restart() of workers
    are triggered on SIGHUP, just validate that .start() is triggered the
    expected number of times (either way, no matter how oslo.service decides
    to clean up the children, they exit and then are respawned).

    Change-Id: I41f9d3af780b3178b075bc1e7084f417a2bd1378
    Closes-Bug: #1505645
    (cherry picked from commit 7bb40921660cf29beb68e338e205499efd6ffa36)

    Fixed multiple py34 gate issues

    1. Scope mock of 'open' to module

    By mocking 'open' at the module level, we can avoid affecting
    'open' calls from other modules.

    2. Stop using LOG.exception in contexts with no sys.exc_info set

    Python 3.4 logger fills in record.exc_info with sys.exc_info() result
    [1], and then it uses it to determine the current exception [2] to
    append to the log message. Since there is no exception, exc_info[1] is
    None, and we get AttributeError inside traceback module.

    It's actually a bug in the Python interpreter that it attempts to access the
    attribute when there is no exception. It turns out that it's fixed in
    latest master of cPython [3] (...

tags: added: kilo-backport-potential
James Page (james-page) on 2015-11-05
Changed in neutron (Ubuntu Vivid):
status: New → Fix Released
Changed in neutron (Ubuntu Wily):
status: New → Fix Released
Changed in neutron (Ubuntu Vivid):
status: Fix Released → In Progress
Changed in neutron (Ubuntu Xenial):
status: New → Fix Released
Changed in neutron (Ubuntu Vivid):
assignee: nobody → Edward Hope-Morley (hopem)
summary: - DHCP agent: interface unplug leads to exeception
+ DHCP agent: interface unplug leads to exception

Reviewed: https://review.openstack.org/242003
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=a27b30d7263aefce0ffbae95316e4e5cd48165a5
Submitter: Jenkins
Branch: stable/kilo

commit a27b30d7263aefce0ffbae95316e4e5cd48165a5
Author: Gary Kotton <email address hidden>
Date: Sun Sep 27 00:24:31 2015 -0700

    DHCP: protect against case when device name is None

    There are edge cases when the agent attempts to unplug an interface and
    the device does not exist.

    Closes-bug: #1498370
    (cherry picked from commit caebc8fb8e8d9782746c3cc3ddc86f786342c819)
    (cherry picked from commit 0b07910f33ed26fbdd13530eafbdefd74104424d)
    Change-Id: I6917ec94f685f3dd3bff6aa1d43dc56aab76274a

tags: added: in-stable-kilo
description: updated
summary: - DHCP agent: interface unplug leads to exception
+ [SRU] DHCP agent: interface unplug leads to exception
Changed in neutron (Ubuntu Vivid):
importance: Undecided → High
Changed in neutron (Ubuntu Xenial):
importance: Undecided → High
Changed in neutron (Ubuntu Wily):
importance: Undecided → High
Martin Pitt (pitti) wrote :

The attached debdiff is for trusty, adding task. Vivid is EOL in about a month, so let's not worry about this any more.

Changed in neutron (Ubuntu Trusty):
assignee: nobody → Edward Hope-Morley (hopem)
Changed in neutron (Ubuntu Vivid):
assignee: Edward Hope-Morley (hopem) → nobody
status: In Progress → Won't Fix
Martin Pitt (pitti) wrote :

trusty has 2014.1.5-0ubuntu, this patch is against some 2015 "trusty-kilo" which is not Ubuntu. Is this actually an issue in trusty? If so, please adjust the patch and set back to new, otherwise invalidate the trusty task.

Unsubscribing sponsors, as there is nothing to do here.

Changed in neutron (Ubuntu Trusty):
status: New → Incomplete
Changed in neutron (Ubuntu Trusty):
importance: Undecided → High
Edward Hope-Morley (hopem) wrote :

@pitti Apologies, this SRU was incorrectly targeted. It is intended as an SRU for OpenStack Kilo and therefore should be targeted at Ubuntu Vivid (then implicitly Trusty Kilo Ubuntu Cloud Archive). The current Kilo version in Vivid is 1:2015.1.2-0ubuntu1 so this SRU should have version 1:2015.1.2-0ubuntu2. I will resubmit an updated debdiff.

Edward Hope-Morley (hopem) wrote :
Changed in neutron (Ubuntu Trusty):
status: Incomplete → In Progress
Changed in neutron (Ubuntu Vivid):
status: Won't Fix → In Progress
Changed in neutron (Ubuntu Trusty):
status: In Progress → New
assignee: Edward Hope-Morley (hopem) → nobody
Changed in neutron (Ubuntu Vivid):
assignee: nobody → Edward Hope-Morley (hopem)
tags: added: patch
Corey Bryant (corey.bryant) wrote :

Thanks for the patch Ed. I've uploaded this to vivid-proposed and it's awaiting review.

Hello Gary, or anyone else affected,

Accepted neutron into vivid-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/neutron/1:2015.1.2-0ubuntu2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in neutron (Ubuntu Vivid):
status: In Progress → Fix Committed
tags: added: verification-needed
Corey Bryant (corey.bryant) wrote :

This fix is now available in the proposed pocket of the kilo cloud-archive and can be enabled with:

sudo add-apt-repository cloud-archive:kilo-proposed

James Page (james-page) on 2015-11-27
tags: added: verification-done
removed: verification-needed

The verification of the Stable Release Update for neutron has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package neutron - 1:2015.1.2-0ubuntu2

---------------
neutron (1:2015.1.2-0ubuntu2) vivid; urgency=medium

  * Fix DHCP agent delete non-existant interface (LP: #1498370)
    - d/p/dhcp-protect-against-case-when-device-name-is-none.patch

 -- Edward Hope-Morley <email address hidden> Tue, 10 Nov 2015 00:11:31 +0000

Changed in neutron (Ubuntu Vivid):
status: Fix Committed → Fix Released
no longer affects: neutron (Ubuntu Trusty)