dhcp-agent does not always provide IP address for instances with re-cycled IP addresses.

Bug #1189909 reported by James Page
This bug affects 17 people
Affects           Status        Importance  Assigned to  Milestone
neutron           Fix Released  Undecided   Lawrance
CentOS            New           Undecided   Unassigned
quantum (Ubuntu)  Won't Fix     Undecided   Unassigned

Bug Description

Configuration: OpenStack Networking, OpenvSwitch Plugin (GRE tunnels), OpenStack Networking Security Groups
Release: Grizzly

Sometimes when creating instances, the dnsmasq instance associated with the tenant L2 network has no configuration for the requesting MAC address:

Jun 11 09:30:23 d7m88-cofgod dnsmasq-dhcp[10083]: DHCPDISCOVER(tap98031044-d8) fa:16:3e:da:41:45 no address available
Jun 11 09:30:33 d7m88-cofgod dnsmasq-dhcp[10083]: DHCPDISCOVER(tap98031044-d8) fa:16:3e:da:41:45 no address available

Restarting the quantum-dhcp-agent resolved the issue:

Jun 11 09:30:41 d7m88-cofgod dnsmasq-dhcp[11060]: DHCPDISCOVER(tap98031044-d8) fa:16:3e:da:41:45
Jun 11 09:30:41 d7m88-cofgod dnsmasq-dhcp[11060]: DHCPOFFER(tap98031044-d8) 10.5.0.2 fa:16:3e:da:41:45

The IP address (10.5.0.2) was re-cycled from an instance that was destroyed just prior to creation of this one.
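
To see which instances are affected, the dnsmasq syslog lines above can be scanned for unanswered DHCPDISCOVERs. The following is only a minimal sketch: the regular expression is inferred from the log lines quoted in this report, and the function name is made up for illustration.

```python
import re

# Pattern inferred from the dnsmasq-dhcp lines quoted in this report, e.g.
#   DHCPDISCOVER(tap98031044-d8) fa:16:3e:da:41:45 no address available
NO_ADDRESS = re.compile(
    r"DHCPDISCOVER\((?P<iface>[^)]+)\)\s+"
    r"(?P<mac>(?:[0-9a-f]{2}:){5}[0-9a-f]{2})\s+no address available",
    re.IGNORECASE,
)


def macs_without_address(log_lines):
    """Count, per MAC, the DHCPDISCOVERs that dnsmasq could not answer."""
    counts = {}
    for line in log_lines:
        match = NO_ADDRESS.search(line)
        if match:
            mac = match.group("mac").lower()
            counts[mac] = counts.get(mac, 0) + 1
    return counts


# Log lines copied from the description above.
sample = [
    "Jun 11 09:30:23 d7m88-cofgod dnsmasq-dhcp[10083]: DHCPDISCOVER(tap98031044-d8) fa:16:3e:da:41:45 no address available",
    "Jun 11 09:30:33 d7m88-cofgod dnsmasq-dhcp[10083]: DHCPDISCOVER(tap98031044-d8) fa:16:3e:da:41:45 no address available",
    "Jun 11 09:30:41 d7m88-cofgod dnsmasq-dhcp[11060]: DHCPDISCOVER(tap98031044-d8) fa:16:3e:da:41:45",
]
print(macs_without_address(sample))
```

A DHCPDISCOVER that is followed by a DHCPOFFER, like the last sample line, is not counted.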

ProblemType: Bug
DistroRelease: Ubuntu 13.04
Package: quantum-dhcp-agent 1:2013.1.1-0ubuntu1
ProcVersionSignature: Ubuntu 3.8.0-23.34-generic 3.8.11
Uname: Linux 3.8.0-23-generic x86_64
ApportVersion: 2.9.2-0ubuntu8.1
Architecture: amd64
Date: Tue Jun 11 09:31:38 2013
MarkForUpload: True
PackageArchitecture: all
ProcEnviron:
 TERM=screen
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: quantum
UpgradeStatus: No upgrade log present (probably fresh install)
modified.conffile..etc.quantum.dhcp.agent.ini: [deleted]
modified.conffile..etc.quantum.rootwrap.d.dhcp.filters: [deleted]

James Page (james-page) wrote :
description: updated
James Page (james-page) wrote :

The message is received by the agent:

2013-06-11 09:28:30 DEBUG [quantum.openstack.common.rpc.amqp] received {u'_context_roles': [u'Member', u'_member_', u'admin'], u'_context_read_deleted': u'no', u'_context_tenant_id': u'670fedaf933a43d2b2a64148d781839f', u'args': {u'payload': {u'port': {u'status': u'DOWN', u'name': u'', u'admin_state_up': True, u'network_id': u'864facd7-0179-40fe-8928-48c3993b8c00', u'tenant_id': u'670fedaf933a43d2b2a64148d781839f', u'device_owner': u'compute:None', u'mac_address': u'fa:16:3e:da:41:45', u'fixed_ips': [{u'subnet_id': u'2128dbee-d153-47a1-86e4-5e2611233d58', u'ip_address': u'10.5.0.2'}], u'id': u'76672204-43b8-4407-b8f5-bcb847cde39a', u'security_groups': [u'b0dbbcf1-0512-429f-b40b-c24b0f6e08c7', u'b98efe39-08ac-45d3-925d-83c9f1307456'], u'device_id': u'a1e1d055-68c7-4c76-a0c1-47b1e116a8a7'}}}, u'_unique_id': u'e989617aad374a5e94284c174394377f', u'_context_is_admin': False, u'version': u'1.0', u'_context_project_id': u'670fedaf933a43d2b2a64148d781839f', u'_context_timestamp': u'2013-06-11 13:28:30.197746', u'_context_user_id': u'9b118873367f427e950790a0b1504881', u'method': u'port_create_end'}
2013-06-11 09:28:30 DEBUG [quantum.openstack.common.rpc.amqp] unpacked context: {'user_id': u'9b118873367f427e950790a0b1504881', 'roles': [u'Member', u'_member_', u'admin'], 'tenant_id': u'670fedaf933a43d2b2a64148d781839f', 'is_admin': False, 'timestamp': u'2013-06-11 13:28:30.197746', 'project_id': u'670fedaf933a43d2b2a64148d781839f', 'read_deleted': u'no'}
2013-06-11 09:28:30 DEBUG [quantum.openstack.common.lockutils] Got semaphore "agent" for method "port_update_end"...
2013-06-11 09:28:30 DEBUG [quantum.agent.linux.utils] Running command: ['sudo', '/usr/bin/quantum-rootwrap', '/etc/quantum/rootwrap.conf', 'ip', 'netns', 'exec', 'qdhcp-864facd7-0179-40fe-8928-48c3993b8c00', 'kill', '-HUP', '10083']
2013-06-11 09:28:30 DEBUG [quantum.agent.linux.utils]
Command: ['sudo', '/usr/bin/quantum-rootwrap', '/etc/quantum/rootwrap.conf', 'ip', 'netns', 'exec', 'qdhcp-864facd7-0179-40fe-8928-48c3993b8c00', 'kill', '-HUP', '10083']
Exit code: 0

James Page (james-page) wrote :

And a nice message from dnsmasq:

Jun 11 09:28:30 d7m88-cofgod dnsmasq[10083]: cleared cache
Jun 11 09:28:30 d7m88-cofgod dnsmasq[10083]: duplicate dhcp-host IP address 10.5.0.2 at line 7 of /var/lib/quantum/dhcp/864facd7-0179-40fe-8928-48c3993b8c00/host
Jun 11 09:28:30 d7m88-cofgod dnsmasq-dhcp[10083]: read /var/lib/quantum/dhcp/864facd7-0179-40fe-8928-48c3993b8c00/host
Jun 11 09:28:30 d7m88-cofgod dnsmasq-dhcp[10083]: read /var/lib/quantum/dhcp/864facd7-0179-40fe-8928-48c3993b8c00/opts

I guess the previously allocated IP address is already present so the new one gets ignored.

summary: - dhcp-agent does not provide IP address for instances
+ dhcp-agent does not provide IP address for instances with re-cycled IP
+ addresses.
James Page (james-page)
summary: - dhcp-agent does not provide IP address for instances with re-cycled IP
- addresses.
+ dhcp-agent does always provide IP address for instances with re-cycled
+ IP addresses.
tags: added: l3-ipam-dhcp
James Page (james-page)
tags: added: serverstack
James Page (james-page) wrote :

Similar issues are being discussed on the openstack ML; might be a dnsmasq issue:

http://lists.thekelleys.org.uk/pipermail/dnsmasq-discuss/2013q2/007212.html

Raising task for dnsmasq as well.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in quantum (Ubuntu):
status: New → Confirmed
guanxiaohua2k6 (guanxiaohua2k6) wrote :

According to the message "duplicate dhcp-host IP address 10.5.0.2", there should be multiple entries for the same IP, 10.5.0.2, in the file /var/lib/quantum/dhcp/864facd7-0179-40fe-8928-48c3993b8c00/host, possibly with different MAC addresses.

It seems that in this case the dnsmasq server reports an error and does not allocate an IP address for the DHCP request.
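
A quick way to check this is to look for IPs that appear under more than one MAC in the host file. The sketch below assumes every entry has the form mac,hostname,ip (the layout quoted elsewhere in this report); the function name and the first MAC in the sample are hypothetical.

```python
from collections import defaultdict


def duplicate_ips(host_lines):
    """Map each IP listed under more than one MAC to those MACs.

    Assumes entries of the form mac,hostname,ip (an assumption based on
    the host-file excerpt quoted in this report).
    """
    macs_by_ip = defaultdict(list)
    for raw in host_lines:
        line = raw.strip()
        if not line:
            continue
        # The MAC and hostname contain no commas, so splitting at most
        # twice leaves the address in the last field.
        mac, _hostname, ip = line.split(",", 2)
        if mac not in macs_by_ip[ip]:
            macs_by_ip[ip].append(mac)
    return {ip: macs for ip, macs in macs_by_ip.items() if len(macs) > 1}


# Illustrative contents only: fa:16:3e:00:00:01 and fa:16:3e:00:00:02 are
# invented MACs standing in for an old instance and an unrelated one.
sample = [
    "fa:16:3e:00:00:01,host-10-5-0-2.openstacklocal,10.5.0.2",
    "fa:16:3e:00:00:02,host-10-5-0-3.openstacklocal,10.5.0.3",
    "fa:16:3e:da:41:45,host-10-5-0-2.openstacklocal,10.5.0.2",
]
print(duplicate_ips(sample))
```

Run against the real host file (for example with open(path) as f: duplicate_ips(f)), a non-empty result would match the "duplicate dhcp-host IP address" warning.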

guanxiaohua2k6 (guanxiaohua2k6) wrote :

In my environment, the problem occurred when I tried to start 20 instances at the same time with the following command.

nova boot --image IMAGE --num-instances 20 test

Some of the instances couldn't get a DHCP IP address.

I found the following warning message in /var/log/quantum/server.log.

2013-07-04 18:25:41 WARNING [quantum.scheduler.dhcp_agent_scheduler] No active DHCP agents

I then checked the related code in the file ocpcc1:/usr/lib/python2.6/site-packages/quantum/scheduler/dhcp_agent_scheduler.py:

71     def is_agent_down(cls, heart_beat_time):
72         LOG.debug("now: %s, heart_beat_time: %s, agent_down_time: %d" % (str(timeutils.utcnow()), heart_beat_time, cfg.CONF.agent_down_time))
73         return timeutils.is_older_than(heart_beat_time,
74                                        cfg.CONF.agent_down_time)

Line 72 is what I inserted for debugging.

After checking the output of line 72, I found that agent_down_time is too short. The default value is 5 seconds.

I added agent_down_time to /etc/quantum/quantum.conf and set it to 15 seconds, and then everything worked fine.

I hope this helps.
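
The effect of the threshold can be illustrated with a small stand-alone sketch. This is not the quantum code: is_agent_down below is a simplified stand-in for the timeutils.is_older_than comparison, and the 8-second heartbeat gap is a made-up value chosen only to show the difference between the two settings.

```python
from datetime import datetime, timedelta, timezone


def is_agent_down(heart_beat_time, agent_down_time, now=None):
    """Return True if the last heartbeat is older than agent_down_time seconds.

    Simplified stand-in for the check quoted above; expects timezone-aware
    datetimes.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - heart_beat_time > timedelta(seconds=agent_down_time)


# Hypothetical timing: the agent last reported 8 seconds ago, e.g. because
# it was busy handling a burst of port events.
now = datetime(2013, 7, 4, 18, 25, 41, tzinfo=timezone.utc)
last_heartbeat = now - timedelta(seconds=8)

print(is_agent_down(last_heartbeat, 5, now))   # 5 s threshold
print(is_agent_down(last_heartbeat, 15, now))  # 15 s threshold
```

With the 5-second threshold the agent is treated as down, while with 15 seconds the same heartbeat still counts as alive, which is consistent with the "No active DHCP agents" warning disappearing after the change.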

li,chen (chen-li) wrote :

I met the same issue!

Lawrance (jing) wrote :

Same problem here. Who can reproduce it?

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/40360

Changed in neutron:
assignee: nobody → Lawrance (jing)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/40361

Lawrance (jing) wrote :

I think neutron does not delete the port right away; maybe we can set the dnsmasq lease time to 3600 and wait for neutron to delete the port.

Mark McClain (markmcclain) wrote :

The problem is the lease database is not updated properly. This is being addressed by: https://review.openstack.org/#/c/37580/

James Page (james-page) wrote :

FWIW I still see this (or at least an issue with the same symptoms and effects) in Havana RC2.

The dnsmasq host file contains what I think is an old mac->IP mapping; the IP has been recycled for a new instance, which appears lower down in the file, but dnsmasq discards this record.

The mac address for the old instance is not present anywhere in the deployment (I checked ports across all tenants).
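
The manual check described here can be sketched as a comparison between the host file and the set of MACs that still belong to a port. How that set is obtained is left out; active_macs is assumed to come from whatever port listing is available. The function name, the stale MAC and the hostnames in the sample are illustrative, not taken from the deployment.

```python
def stale_entries(host_lines, active_macs):
    """Return host-file entries whose MAC is not in active_macs.

    Assumes entries start with the MAC followed by a comma.
    """
    active = {mac.lower() for mac in active_macs}
    stale = []
    for raw in host_lines:
        line = raw.strip()
        if not line:
            continue
        mac = line.split(",", 1)[0].lower()
        if mac not in active:
            stale.append(line)
    return stale


# Hypothetical host file: fa:16:3e:00:00:01 stands in for the destroyed
# instance; fa:16:3e:f5:a3:2d is the MAC seen in the DHCPDISCOVER log.
sample = [
    "fa:16:3e:00:00:01,host-10-5-0-45.openstacklocal,10.5.0.45",
    "fa:16:3e:f5:a3:2d,host-10-5-0-45.openstacklocal,10.5.0.45",
]
print(stale_entries(sample, active_macs={"fa:16:3e:f5:a3:2d"}))
```

Any line returned is a candidate for the old mapping that shadows the new one.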

James Page (james-page) wrote :

Oct 16 13:07:22 cofgod dnsmasq[14049]: cleared cache
Oct 16 13:07:22 cofgod dnsmasq-dhcp[14049]: read /var/lib/neutron/dhcp/bab112ff-46c3-4034-8d27-936d7bb1ecd5/host
Oct 16 13:07:22 cofgod dnsmasq-dhcp[14049]: read /var/lib/neutron/dhcp/bab112ff-46c3-4034-8d27-936d7bb1ecd5/opts
Oct 16 13:07:22 cofgod dnsmasq-dhcp[14049]: duplicate IP address 10.5.0.45 in .

...

Oct 16 13:13:38 cofgod dnsmasq-dhcp[14049]: DHCPDISCOVER(ns-169187f7-9c) fa:16:3e:f5:a3:2d no address available

Tim Spriggs (tims-t) wrote :

Possibly a new vector: I have IPv6 as well as IPv4 on a network. I see this segment of the (rather large) host file:

fa:16:3e:9d:de:21,host-192-168-2-103.openstacklocal,192.168.2.103
fa:16:3e:9d:de:21,host-2607-f088-0-2--1339.openstacklocal,2607:f088:0:2::1339

When I remove the second line, and kill -HUP the daemon, IPv4 starts working. I tried running kill -HUP without removing the line and did not get a response.

Hopefully, whatever the problem is will be considered and fixed for both IPv4 and IPv6.

I am running dnsmasq from ubuntu packages: 2.66-4ubuntu1
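
In a large host file, MACs that carry both an IPv4 and an IPv6 entry can be found with a short script. This is a sketch under the assumption that every line has the form mac,hostname,address as in the excerpt above; the function name is made up, and stripping square brackets is only a precaution in case addresses are written bracketed.

```python
import ipaddress
from collections import defaultdict


def dual_stack_macs(host_lines):
    """Return the sorted MACs that have both an IPv4 and an IPv6 entry."""
    versions_by_mac = defaultdict(set)
    for raw in host_lines:
        line = raw.strip()
        if not line:
            continue
        # Split at most twice so the colons in an IPv6 address are untouched.
        mac, _hostname, address = line.split(",", 2)
        version = ipaddress.ip_address(address.strip("[]")).version
        versions_by_mac[mac].add(version)
    return sorted(mac for mac, v in versions_by_mac.items() if v == {4, 6})


# The two lines quoted in the comment above.
sample = [
    "fa:16:3e:9d:de:21,host-192-168-2-103.openstacklocal,192.168.2.103",
    "fa:16:3e:9d:de:21,host-2607-f088-0-2--1339.openstacklocal,2607:f088:0:2::1339",
]
print(dual_stack_macs(sample))
```

The MACs it reports are the ones whose IPv6 line would have to be removed for the workaround described above.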

Carl Baldwin (carl-baldwin) wrote :

It seems this has been fixed with https://review.openstack.org/#/c/37580/

Changed in neutron:
status: In Progress → Fix Released
Ryan Moats (rmoats)
affects: quantum (CentOS) → centos
James Page (james-page) wrote :

quantum is no longer found in any supported Ubuntu release.

Changed in quantum (Ubuntu):
status: Confirmed → Won't Fix