DHCP agent resync restarts all dnsmasq processes on any dhcp driver exception

Bug #1384402 reported by Terry Wilson
This bug affects 1 person
Affects   Status         Importance   Assigned to    Milestone
neutron   Fix Released   Medium       Terry Wilson

Bug Description

The sync_state/periodic_resync implementation loops through and restarts the dhcp process for every active network whenever a call to a dhcp driver function raises an exception for any single network. A tenant who can trigger an unhandled exception can therefore cause every dhcp process on the system to restart. On systems with many networks, restarting everything can easily take longer than the default resync timeout, leaving the system unresponsive under the load of continual restarts.

Tags: l3-ipam-dhcp
Changed in neutron:
assignee: nobody → Terry Wilson (otherwiseguy)
status: New → In Progress
Eugene Nikanorov (enikanorov) wrote :

Is there a patch proposed to fix the issue?
If not, please leave the bug in Confirmed state.

tags: added: l3-ipam-dhcp
Changed in neutron:
importance: Undecided → Medium
status: In Progress → Confirmed
Oleg Bondarev (obondarev) wrote :
Changed in neutron:
status: Confirmed → In Progress
Kyle Mestery (mestery)
Changed in neutron:
milestone: none → kilo-1
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/129025
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=bb7cce32f3ca5ccf903e13d37c2c62a0222090da
Submitter: Jenkins
Branch: master

commit bb7cce32f3ca5ccf903e13d37c2c62a0222090da
Author: Terry Wilson <email address hidden>
Date: Wed Oct 15 20:56:17 2014 -0500

    Only resync DHCP for a particular network when there is a failure

    The previous implementation will loop through and restart the dhcp
    process for all active networks any time there is an exception calling
    a dhcp driver function. This allows a tenant who can create an exception
    to cause every dhcp process to restart. On systems with lots of networks
    this can easily take longer than the default resync timeout leading to a
    system that becomes unresponsive because of the load continually restarting
    causes.

    This patch restarts only dhcp processes related to the network on which
    operations are failing. It should be noted that if there was some kind
    of missed notification for a subnet update, the previous implementation
    may have incidentally fixed it by restarting everything on the off
    chance that something else caused an exception, but obviously relying
    on that would be a bad idea as exceptions should be, well, exceptional.

    Closes-bug: #1384402

    Change-Id: I0b348a1657a7eb3a595f9bf6b217716a37ce38c6
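
The difference between the two behaviors described in the commit message can be illustrated with a small, self-contained model. This is a sketch only, not the neutron code: FakeDriver, GlobalResyncAgent, PerNetworkResyncAgent and restarts_after_one_failure are hypothetical names invented for this example, and the network IDs and counts are arbitrary. It assumes one failing network out of 100 and counts how many restarts a single resync pass performs.

```python
class FakeDriver:
    """Hypothetical stand-in for a DHCP driver; fails for IDs in `broken`."""

    def __init__(self, broken):
        self.broken = set(broken)
        self.restarts = []  # network IDs for which a restart was attempted

    def restart(self, network_id):
        self.restarts.append(network_id)
        if network_id in self.broken:
            raise RuntimeError(f"driver failure on {network_id}")


class GlobalResyncAgent:
    """Models the old behavior: any failure flags a resync of every network."""

    def __init__(self, driver, networks):
        self.driver = driver
        self.networks = list(networks)
        self.needs_resync = False

    def call_driver(self, network_id):
        try:
            self.driver.restart(network_id)
        except Exception:
            self.needs_resync = True  # no record of which network failed

    def periodic_resync(self):
        if self.needs_resync:
            self.needs_resync = False
            for network_id in self.networks:
                self.call_driver(network_id)


class PerNetworkResyncAgent:
    """Models the patched behavior: only failing networks are resynced."""

    def __init__(self, driver, networks):
        self.driver = driver
        self.networks = list(networks)
        self.pending = set()  # network IDs whose last driver call failed

    def call_driver(self, network_id):
        try:
            self.driver.restart(network_id)
        except Exception:
            self.pending.add(network_id)

    def periodic_resync(self):
        # Swap the set out first so failures during this pass are queued
        # for the next pass instead of being retried immediately.
        pending, self.pending = self.pending, set()
        for network_id in pending:
            self.call_driver(network_id)


def restarts_after_one_failure(agent_cls, network_count=100, broken="net-7"):
    """Return how many restarts one resync pass performs after one failure."""
    networks = [f"net-{i}" for i in range(network_count)]
    driver = FakeDriver(broken=[broken])
    agent = agent_cls(driver, networks)
    agent.call_driver(broken)   # the initial failure on a single network
    driver.restarts.clear()     # count only what the resync pass does
    agent.periodic_resync()
    return len(driver.restarts)


if __name__ == "__main__":
    print(restarts_after_one_failure(GlobalResyncAgent))      # prints 100
    print(restarts_after_one_failure(PerNetworkResyncAgent))  # prints 1
```

In this model, a network that keeps failing re-arms the global flag on every pass, so the first agent restarts all 100 processes each time, while the second only ever retries the one failing network.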

Changed in neutron:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: kilo-1 → 2015.1.0