dhcp-agent with reserved_dhcp_port raises "cannot find tap device" error

Bug #1513758 reported by ZongKai LI
This bug affects 1 person
Affects   Status    Importance   Assigned to   Milestone
neutron   Expired   Undecided    Unassigned

Bug Description

=====
my env
=====
upstream code.
2 dhcp-agents, with dhcp_agents_per_network set to 2.
optional: check out patch [1] https://review.openstack.org/#/c/239264/

===========
steps to reproduce
===========
1. Create a private net and its subnet; enable_dhcp is True by default.
2. Verify with "ip netns" that both dhcp-agents host the net. dhcp-port tapA is used by dhcp-agent-1, and dhcp-port tapB is used by dhcp-agent-2.
3. Stop/kill both dhcp-agents.
4. Update the device_id of both dhcp-ports from its previous value to "reserved_dhcp_port":
>>neutron port-update --device_id='reserved_dhcp_port' PORT-ID
5. Start both dhcp-agents again. When dhcp-agent-1 tries to set up tapB and dhcp-agent-2 tries to set up tapA, an error like 'Cannot find device "tapX"' is raised.
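
Step 4 can be scripted when more than one port is involved. This is only a minimal sketch: it assumes the legacy "neutron" CLI is on PATH with credentials already exported, and the helper names are hypothetical. Only the port-update form itself comes from the step above.

```python
import subprocess

RESERVED_DEVICE_ID = "reserved_dhcp_port"


def build_port_update_cmd(port_id, device_id=RESERVED_DEVICE_ID):
    """Return the argv for the port-update command shown in step 4."""
    return ["neutron", "port-update", "--device_id=%s" % device_id, port_id]


def reserve_dhcp_ports(port_ids, dry_run=True):
    """Build one port-update command per port.

    With dry_run=True (the default) nothing is executed and the commands
    are only returned, so they can be reviewed first.
    """
    cmds = [build_port_update_cmd(port_id) for port_id in port_ids]
    if not dry_run:
        for cmd in cmds:
            # Raises CalledProcessError if the CLI exits non-zero.
            subprocess.check_call(cmd)
    return cmds
```

Calling reserve_dhcp_ports([PORT_A, PORT_B]) with the two dhcp-port IDs returns the commands without running them; pass dry_run=False to actually update the ports.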

---------------
explanation
---------------
1. Step 4 simulates the remove_networks_from_down_agents case. When we stop/kill a dhcp-agent, even after "neutron agent-status" shows it is no longer alive, the device_id of the dhcp-port it used is not updated to "reserved_dhcp_port" for a while. Modifying it manually speeds things up.
2. The patch in [1] is optional; this issue can occur even without it. Sometimes, when stale ports exist, the issue does not occur, but that is not a good reason to keep stale dhcp-ports. The patch helps clean up stale ports and makes this issue easier to see.
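
To check which tap device an agent will look for after it picks up a reserved port, the name can be derived from the port ID. This is a rough sketch, not the agent's code: it assumes the interface name is "tap" followed by the start of the port ID, truncated to 14 characters, and that the namespace is "qdhcp-" plus the network ID. Both assumptions match the names in the trace below. The function names are hypothetical, and device_exists needs root because it runs "ip netns exec".

```python
import subprocess

DEV_NAME_PREFIX = "tap"
DEV_NAME_LEN = 14  # assumed limit on the interface name length


def tap_name_for_port(port_id):
    """Derive the expected tap device name from a port ID (assumed scheme)."""
    return (DEV_NAME_PREFIX + port_id)[:DEV_NAME_LEN]


def namespace_for_network(network_id):
    """Derive the DHCP namespace name from a network ID (assumed scheme)."""
    return "qdhcp-" + network_id


def device_exists(namespace, device):
    """Return True if 'ip link show' finds the device inside the namespace."""
    cmd = ["ip", "netns", "exec", namespace, "ip", "link", "show", device]
    result = subprocess.call(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result == 0
```

If device_exists returns False on a host for the tap name derived from the port its agent now owns, that host is in the state described in step 5.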

=======
TRACE log
=======
2015-11-06 05:46:41.634 DEBUG neutron.agent.linux.dhcp [req-6e9631c6-84b4-4283-a975-cc40819b638d admin b7adf07ab24c40cc98f0f4835bb2e43d] Reloading allocations for network: 79673257-aa5e-4d19-91b5-225391b2691c from (pid=20965) reload_allocations /opt/stack/neutron/neutron/agent/linux/dhcp.py:466
2015-11-06 05:46:41.635 DEBUG neutron.agent.linux.utils [req-6e9631c6-84b4-4283-a975-cc40819b638d admin b7adf07ab24c40cc98f0f4835bb2e43d] Running command (rootwrap daemon): ['ip', 'netns', 'exec', 'qdhcp-79673257-aa5e-4d19-91b5-225391b2691c', 'ip', 'route', 'list', 'dev', 'tapbcd64879-be'] from (pid=20965) execute_rootwrap_daemon /opt/stack/neutron/neutron/agent/linux/utils.py:99
2015-11-06 05:46:41.664 ERROR neutron.agent.linux.utils [req-6e9631c6-84b4-4283-a975-cc40819b638d admin b7adf07ab24c40cc98f0f4835bb2e43d]
Command: ['ip', 'netns', 'exec', u'qdhcp-79673257-aa5e-4d19-91b5-225391b2691c', 'ip', 'route', 'list', 'dev', 'tapbcd64879-be']
Exit code: 1
Stdin:
Stdout:
Stderr: Cannot find device "tapbcd64879-be"

2015-11-06 05:46:41.665 ERROR neutron.agent.dhcp.agent [req-6e9631c6-84b4-4283-a975-cc40819b638d admin b7adf07ab24c40cc98f0f4835bb2e43d] Unable to reload_allocations dhcp for 79673257-aa5e-4d19-91b5-225391b2691c.
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent Traceback (most recent call last):
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/dhcp/agent.py", line 115, in call_driver
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent getattr(driver, action)(**action_kwargs)
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/dhcp.py", line 467, in reload_allocations
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent self.device_manager.update(self.network, self.interface_name)
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/dhcp.py", line 1227, in update
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent self._set_default_route(network, device_name)
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/dhcp.py", line 1005, in _set_default_route
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent gateway = device.route.get_gateway()
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 710, in get_gateway
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent route_list_lines = self._run(options, tuple(args)).split('\n')
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 303, in _run
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent return self._parent._run(options, self.COMMAND, args)
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 67, in _run
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent return self._as_root(options, command, args)
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 82, in _as_root
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent log_fail_as_error=self.log_fail_as_error)
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/ip_lib.py", line 91, in _execute
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent log_fail_as_error=log_fail_as_error)
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent File "/opt/stack/neutron/neutron/agent/linux/utils.py", line 157, in execute
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent raise RuntimeError(m)
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent RuntimeError:
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent Command: ['ip', 'netns', 'exec', u'qdhcp-79673257-aa5e-4d19-91b5-225391b2691c', 'ip', 'route', 'list', 'dev', 'tapbcd64879-be']
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent Exit code: 1
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent Stdin:
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent Stdout:
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent Stderr: Cannot find device "tapbcd64879-be"
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent
2015-11-06 05:46:41.665 TRACE neutron.agent.dhcp.agent

ZongKai LI (zongkai)
Changed in neutron:
assignee: nobody → ZongKai LI (lzklibj)
ZongKai LI (zongkai)
Changed in neutron:
status: New → Incomplete
Armando Migliaccio (armando-migliaccio) wrote :

This bug is > 240 days without activity. We are unsetting assignee and milestone and setting status to Incomplete in order to allow its expiry in 60 days.

If the bug is still valid, then update the bug status.

Changed in neutron:
assignee: ZongKai LI (lzklibj) → nobody
Launchpad Janitor (janitor) wrote :

[Expired for neutron because there has been no activity for 60 days.]

Changed in neutron:
status: Incomplete → Expired