neutron

DHCP agent resync restarts all dnsmasq processes on any dhcp driver exception

Bug #1384402 reported by Terry Wilson on 2014-10-22

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
neutron	Fix Released	Medium	Terry Wilson	neutron 2015.1.0 "kilo"

Bug Description

The sync_state/periodic_resync implementation loops through and restarts the dhcp process for every active network any time a dhcp driver function raises an exception for a single network. A tenant who can trigger an unhandled exception can therefore cause every dhcp process on the system to restart. On systems with many networks, restarting everything can easily take longer than the default resync timeout, leaving the system unresponsive under the load of continual restarts.
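
The behaviour described above can be illustrated with a minimal sketch. This is not neutron's actual code: `StubDriver`, `GlobalResyncAgent`, and the network names are hypothetical stand-ins, and the only assumption carried over from the report is that a single agent-wide flag is set whenever any driver call fails.

```python
class StubDriver:
    """Hypothetical stand-in for a DHCP driver; not neutron's real class."""

    def __init__(self, failing_networks):
        self.failing_networks = set(failing_networks)
        self.restarted = []

    def call(self, action, network_id):
        # Only the illustrative "reload_allocations" action fails here.
        if action == "reload_allocations" and network_id in self.failing_networks:
            raise RuntimeError("driver failure on %s" % network_id)
        if action == "restart":
            self.restarted.append(network_id)


class GlobalResyncAgent:
    """Sketch of the reported behaviour: one flag for the whole agent."""

    def __init__(self, driver, active_networks):
        self.driver = driver
        self.active_networks = list(active_networks)
        self.needs_resync = False

    def call_driver(self, action, network_id):
        try:
            self.driver.call(action, network_id)
        except Exception:
            # The failing network is not recorded, only that something failed.
            self.needs_resync = True

    def periodic_resync(self):
        if self.needs_resync:
            self.needs_resync = False
            # Every active network is restarted, not just the failing one.
            for network_id in self.active_networks:
                self.call_driver("restart", network_id)


global_driver = StubDriver(failing_networks=["net-3"])
global_agent = GlobalResyncAgent(
    global_driver, ["net-1", "net-2", "net-3", "net-4", "net-5"])
global_agent.call_driver("reload_allocations", "net-3")
global_agent.periodic_resync()
print(global_driver.restarted)
```

In this sketch one failure on `net-3` results in a restart for all five networks, which is the amplification the report is concerned about.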

Tags:

OpenStack Infra (hudson-openstack) on 2014-10-22

Changed in neutron:
assignee:	nobody → Terry Wilson (otherwiseguy)
status:	New → In Progress

Eugene Nikanorov (enikanorov) wrote on 2014-10-27:

Is there a patch proposed to fix the issue?
If not, please leave the bug in Confirmed state.

tags:	added: l3-ipam-dhcp
Changed in neutron:
importance:	Undecided → Medium
status:	In Progress → Confirmed

Oleg Bondarev (obondarev) wrote on 2014-10-29:

Proposed patch: https://review.openstack.org/#/c/129025/

Changed in neutron:
status:	Confirmed → In Progress

Kyle Mestery (mestery) on 2014-10-29

Changed in neutron:
milestone:	none → kilo-1

OpenStack Infra (hudson-openstack) wrote on 2014-10-29: Fix merged to neutron (master)

Reviewed: https://review.openstack.org/129025
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=bb7cce32f3ca5ccf903e13d37c2c62a0222090da
Submitter: Jenkins
Branch: master

commit bb7cce32f3ca5ccf903e13d37c2c62a0222090da
Author: Terry Wilson <email address hidden>
Date: Wed Oct 15 20:56:17 2014 -0500

Only resync DHCP for a particular network when there is a failure

    The previous implementation will loop through and restart the dhcp
    process for all active networks any time there is an exception calling
    a dhcp driver function. This allows a tenant who can create an exception
    to cause every dhcp process to restart. On systems with lots of networks
    this can easily take longer than the default resync timeout leading to a
    system that becomes unresponsive because of the load continually restarting
    causes.

    This patch restarts only dhcp processes related to the network on which
    operations are failing. It should be noted that if there was some kind
    of missed notification for a subnet update, the previous implementation
    may have incidentally fixed it by restarting everything on the off
    chance that something else caused an exception, but obviously relying
    on that would be a bad idea as exceptions should be, well, exceptional.

Closes-bug: #1384402

Change-Id: I0b348a1657a7eb3a595f9bf6b217716a37ce38c6
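
The approach described in the commit message, restarting only the dhcp processes for the network on which operations are failing, can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the merged patch: `StubDriver`, `PerNetworkResyncAgent`, and the `pending_resync` mapping are hypothetical names chosen for the example.

```python
class StubDriver:
    """Hypothetical stand-in for a DHCP driver; not neutron's real class."""

    def __init__(self, failing_networks):
        self.failing_networks = set(failing_networks)
        self.restarted = []

    def call(self, action, network_id):
        # Only the illustrative "reload_allocations" action fails here.
        if action == "reload_allocations" and network_id in self.failing_networks:
            raise RuntimeError("driver failure on %s" % network_id)
        if action == "restart":
            self.restarted.append(network_id)


class PerNetworkResyncAgent:
    """Sketch of per-network tracking: remember which networks failed."""

    def __init__(self, driver, active_networks):
        self.driver = driver
        self.active_networks = list(active_networks)
        # Hypothetical structure: network_id -> list of failure reasons.
        self.pending_resync = {}

    def call_driver(self, action, network_id):
        try:
            self.driver.call(action, network_id)
        except Exception as exc:
            self.pending_resync.setdefault(network_id, []).append(str(exc))

    def periodic_resync(self):
        # Swap the mapping out first so failures raised during the resync
        # are queued for the next pass rather than lost.
        pending, self.pending_resync = self.pending_resync, {}
        for network_id in pending:
            self.call_driver("restart", network_id)


scoped_driver = StubDriver(failing_networks=["net-3"])
scoped_agent = PerNetworkResyncAgent(
    scoped_driver, ["net-1", "net-2", "net-3", "net-4", "net-5"])
scoped_agent.call_driver("reload_allocations", "net-3")
scoped_agent.periodic_resync()
print(scoped_driver.restarted)
```

Here the same single failure leads to one restart, for `net-3` only. As the commit message notes, this also means an unrelated network with a missed notification is no longer repaired as a side effect of a full restart.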

Changed in neutron:
status:	In Progress → Fix Committed

Thierry Carrez (ttx) on 2014-12-18

Changed in neutron:
status:	Fix Committed → Fix Released

Thierry Carrez (ttx) on 2015-04-30

Changed in neutron:
milestone:	kilo-1 → 2015.1.0
