Caught Exception in dhcp agent sync_state may block or delay configuration of new networks

Bug #1202722 reported by Stephen Ma
Affects: neutron
Status: Fix Released
Importance: Low
Assigned to: Stephen Ma
Milestone: 2013.2

Bug Description

The following error is sometimes seen in dhcp-agent.log.
dhcp_agent.ini is configured with no router_id defined. A single DHCP agent manages all DHCP servers on one node.

This is the traceback:

    2013-07-10 16:16:15 ERROR [quantum.agent.dhcp_agent] Unable to sync network state.
    Traceback (most recent call last):
      File "/usr/lib/python2.7/dist-packages/quantum/agent/dhcp_agent.py", line 152, in sync_state
        self.disable_dhcp_helper(deleted_id)
      File "/usr/lib/python2.7/dist-packages/quantum/agent/dhcp_agent.py", line 197, in disable_dhcp_helper
        self.disable_isolated_metadata_proxy(network)
      File "/usr/lib/python2.7/dist-packages/quantum/agent/dhcp_agent.py", line 340, in disable_isolated_metadata_proxy
        pm.disable()
      File "/usr/lib/python2.7/dist-packages/quantum/agent/linux/external_process.py", line 67, in disable
        ip_wrapper.netns.execute(cmd)
      File "/usr/lib/python2.7/dist-packages/quantum/agent/linux/ip_lib.py", line 407, in execute
        check_exit_code=check_exit_code)
      File "/usr/lib/python2.7/dist-packages/quantum/agent/linux/utils.py", line 61, in execute
        raise RuntimeError(m)
    RuntimeError:...

The dhcp_agent.py in commit 1bd456371f9909d5cb33536e84a3fdd7aac40f8c shows:

    def sync_state(self):
        """Sync the local DHCP state with Neutron."""
        LOG.info(_('Synchronizing state'))
        pool = eventlet.GreenPool(cfg.CONF.num_sync_threads)
        known_network_ids = set(self.cache.get_network_ids())

        try:
            active_networks = self.plugin_rpc.get_active_networks_info()
            active_network_ids = set(network.id for network in active_networks)
            for deleted_id in known_network_ids - active_network_ids:
                self.disable_dhcp_helper(deleted_id)

            for network in active_networks:
                pool.spawn_n(self.configure_dhcp_for_network, network)

        except Exception:
            self.needs_resync = True
            LOG.exception(_('Unable to sync network state.'))

When self.disable_dhcp_helper raises an exception inside the "for deleted_id in ..." loop, control jumps straight to the outer except block, so the whole "for network in active_networks" loop is skipped. As a result, DHCP configuration for those networks is either delayed or never happens.

This routine touches many networks in a system. Any exception, whether or not it is caused by a bug, should therefore be caught inside disable_dhcp_helper() so that processing of the other networks is not blocked.
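The idea can be illustrated with a small self-contained sketch. It is not the merged fix and does not use the Neutron classes: SketchDhcpAgent, failing_ids, and the disabled/configured lists are hypothetical names invented for illustration, and the RPC call and the eventlet pool are left out. It only shows that catching the exception per deleted network lets the loop over active networks still run, while needs_resync is still set.

```python
import logging

LOG = logging.getLogger(__name__)


class SketchDhcpAgent:
    """Hypothetical stand-in for the DHCP agent; not the Neutron class."""

    def __init__(self, known_ids, active_ids, failing_ids=()):
        self.known_ids = set(known_ids)       # networks in the local cache
        self.active_ids = list(active_ids)    # networks reported as active
        self.failing_ids = set(failing_ids)   # networks whose teardown fails
        self.needs_resync = False
        self.disabled = []
        self.configured = []

    def disable_dhcp_helper(self, network_id):
        # Simulate the RuntimeError raised while tearing down a network.
        if network_id in self.failing_ids:
            raise RuntimeError("simulated failure for %s" % network_id)
        self.disabled.append(network_id)

    def configure_dhcp_for_network(self, network_id):
        self.configured.append(network_id)

    def sync_state(self):
        for deleted_id in sorted(self.known_ids - set(self.active_ids)):
            try:
                self.disable_dhcp_helper(deleted_id)
            except Exception:
                # Record the failure and request a resync, but keep going
                # so one bad network does not block the others.
                self.needs_resync = True
                LOG.exception("Unable to disable DHCP for deleted network %s",
                              deleted_id)

        # Reached even if some deleted networks failed above.
        for network_id in self.active_ids:
            self.configure_dhcp_for_network(network_id)


agent = SketchDhcpAgent(known_ids={"a", "b", "c"},
                        active_ids=["c", "d"],
                        failing_ids={"a"})
agent.sync_state()
```

In this sketch the failure on network "a" is logged, network "b" is still disabled, and networks "c" and "d" are still configured. Whether the try/except belongs in the loop or inside disable_dhcp_helper() itself is a design choice; either placement keeps one failing network from blocking the rest.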

Tags: l3-ipam-dhcp
Changed in neutron:
importance: Undecided → Low
status: New → Triaged
tags: added: l3-ipam-dhcp
removed: dhcp
tags: added: grizzly-backport-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/38655

Changed in neutron:
assignee: nobody → Stephen Ma (stephen-ma)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/38655
Committed: http://github.com/openstack/neutron/commit/3d5bec962c1e5bb923107f54585c94a9b24f2d30
Submitter: Jenkins
Branch: master

commit 3d5bec962c1e5bb923107f54585c94a9b24f2d30
Author: Stephen Ma <email address hidden>
Date: Thu Jul 25 07:25:48 2013 -0700

    Dhcp agent sync_state may block or delay
    configuration of new networks.

    Fixes Bug 1202722

    Change-Id: I368cb064057d48be1491df6825cc67c265706b50

Changed in neutron:
status: In Progress → Fix Committed
Changed in neutron:
milestone: none → next
milestone: next → havana-3
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: havana-3 → 2013.2
Alan Pevec (apevec)
tags: removed: grizzly-backport-potential