Mirantis OpenStack

[mos] dnsmasq (for neutron-dhcp-agent) is sometimes configured with duplicate leases

Bug #1295715 reported by Brad Durrow on 2014-03-21

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
Mirantis OpenStack	Invalid	Medium	MOS Neutron	Mirantis OpenStack 6.0

Bug Description

Today I had an instance that didn't get an IP on boot. I looked in the dnsmasq logs and saw this:
2014-03-21T14:52:34.197649+00:00 err: duplicate dhcp-host IP address 10.29.8.6 at line 16 of /var/lib/neutron/dhcp/50644057-b518-4e85-843a-3321c9a4073f/host

I confirmed in Horizon that there was no instance with a duplicate IP.
I went to the node that the log came from, removed the first entry for 10.29.8.6 from the host file,
then sent dnsmasq a SIGHUP (kill -HUP) and confirmed it was still running with the same PID.

I rebooted the instance and it got an IP this time.
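
For anyone checking a host file by hand, the duplicate check that dnsmasq complains about can be approximated with a short script. This is only a sketch: it assumes each line of the host file is comma-separated with the IP address as the last field, and the sample entries (MAC addresses, host names) are made up for illustration. The function name `find_duplicate_ips` is hypothetical.

```python
from collections import defaultdict


def find_duplicate_ips(host_file_text):
    """Return {ip: [line numbers]} for IPs that appear on more than one line.

    Assumption: each non-empty line is comma-separated and ends with the
    IP address, e.g. "fa:16:3e:00:00:01,host-10-29-8-6.openstacklocal,10.29.8.6".
    """
    lines_by_ip = defaultdict(list)
    for lineno, line in enumerate(host_file_text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue
        # Take the last comma-separated field as the IP address.
        ip = line.rsplit(",", 1)[-1].strip()
        lines_by_ip[ip].append(lineno)
    return {ip: nums for ip, nums in lines_by_ip.items() if len(nums) > 1}


# Made-up sample data, not taken from the affected environment.
sample = """\
fa:16:3e:00:00:01,host-10-29-8-5.openstacklocal,10.29.8.5
fa:16:3e:00:00:02,host-10-29-8-6.openstacklocal,10.29.8.6
fa:16:3e:00:00:03,host-10-29-8-6.openstacklocal,10.29.8.6
"""
print(find_duplicate_ips(sample))
```

On a real node the text would be read from the host file named in the dnsmasq error message instead of the inline sample.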

{"build_id": "2013-12-27_00-24-14", "ostf_sha": "83ada35fec2664089e07fdc0d34861ae2a4d948a", "build_number": "214", "nailgun_sha": "af1598bcc9faf468d4d9265cc5c51fa8cea53136", "fuelmain_sha": "17eed776b30886851ae0042fa7a30184f5cd8eb6", "astute_sha": "6ce36837882399e0d3bb1ffdb2c3b2d8dcb84b54", "release": "4.0", "fuellib_sha": "eebe07913ee09311c8e7c9231f6785081327dc0e"}

Tags:

Revision history for this message

Vladimir Kuklin (vkuklin) wrote on 2014-03-24:

Brad, we will try to reproduce the issue, but it is really hard to understand which flow led to this problem. Would you please attach a diagnostic snapshot as usual?

Changed in fuel:
milestone:	none → 5.0
tags:	added: backports-4.1.1

Vladimir Kuklin (vkuklin) on 2014-03-26

Changed in fuel:
importance:	Undecided → Medium
status:	New → Incomplete

Brad Durrow (l-brad) wrote on 2014-04-07:

This problem is quite a bit more serious now. I have several IPs that are duplicated in the lease file every time I add or remove an instance. I can reliably cause the problem by trying to launch an instance without a large enough root disk for the image.

Brad Durrow (l-brad) wrote on 2014-04-07:

I couldn't find the MAC addresses of the duplicate (old) leases in any of the recent logs, so I thought it might have to do with the DHCP agent cache. `crm resource restart p_neutron-dhcp-agent` resolved the problem, at least temporarily.

Mike Scherbakov (mihgen) wrote on 2014-04-26:

Thanks for posting and debugging this issue.
I have to move this to 5.1: we have entered the soft code freeze phase, so all bugs with Medium and Low importance are moved to the next release. Please provide more details if you find any, and feel free to raise questions or concerns about this issue on the mailing list.

Changed in fuel:
milestone:	5.0 → 5.1

tdsparrow (sqallowlee) wrote on 2014-05-07:

I hit the same issue on Neutron 2013.2.1. The difference is that our VMs are destroyed after a short time, so I can find the MAC address for the old record in dnsmasq. There are seven hosts with DHCP agents, and each of them reports different duplicates.

I suspect the cause in my case is that the DHCP agent cannot update its heartbeat timestamp in time, so agents_db.py treats it as down and no release notification is sent to that agent. My system uses the default configuration for agent_down_time (5s) and report_interval (4s). After changing report_interval to 3 on one host, the issue did not recur on that host for 12 hours.

The logic in the code at HEAD is almost the same.
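
The timing argument above can be illustrated with a small sketch. It is not the Neutron code: `is_agent_down` and `slack_seconds` are hypothetical helpers that assume the server treats an agent as down once its last heartbeat is older than agent_down_time seconds. The values 4, 3 and 5 are the ones quoted in this comment and are used only as example inputs.

```python
from datetime import datetime, timedelta


def is_agent_down(last_heartbeat, now, agent_down_time):
    """Simplified stand-in for the server-side check: the agent counts as
    down when its last heartbeat is older than agent_down_time seconds."""
    return now - last_heartbeat > timedelta(seconds=agent_down_time)


def slack_seconds(report_interval, agent_down_time):
    """How late a single report may arrive before the agent looks down."""
    return agent_down_time - report_interval


now = datetime(2014, 5, 7, 12, 0, 0)

# A report that arrives 2 s late with report_interval=4 and agent_down_time=5
# leaves a 6 s gap since the previous heartbeat.
late = now - timedelta(seconds=4 + 2)
print(is_agent_down(late, now, agent_down_time=5))

# Lowering report_interval from 4 to 3 doubles the tolerated delay.
print(slack_seconds(4, 5), slack_seconds(3, 5))
```

Under these assumptions, a smaller report_interval (or a larger agent_down_time) leaves more room for a delayed heartbeat before notifications to the agent are skipped.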

tdsparrow (sqallowlee) wrote on 2014-05-07:

Correction: the version is 2013.2, and the default value of agent_down_time was changed to 9 in 2013.2.

Bogdan Dobrelya (bogdando) wrote on 2014-06-17:

The bug may be fixed in the 4.1.1 release; please try to reproduce it and provide feedback.

Changed in fuel:
assignee:	nobody → Fuel QA Team (fuel-qa)

Ilya Shakhat (shakhat) on 2014-06-20

Changed in mos:
assignee:	nobody → MOS Neutron (mos-neutron)

Ilya Shakhat (shakhat) on 2014-06-26

Changed in mos:
status:	New → Incomplete

Eugene Nikanorov (enikanorov) wrote on 2014-06-26:

I think this is the upstream version of this issue
https://bugs.launchpad.net/neutron/+bug/1288493

Eugene Nikanorov (enikanorov) wrote on 2014-06-26:

It looks like this issue applies to Havana, where the same IP address can be reused right after an allocate-deallocate operation.
Under load this can lead to duplicate entries.

In Icehouse the IP generation logic was changed so that the same IP address is not reused immediately after deallocation. The issue should therefore appear much less often, which makes it much harder to reproduce with Icehouse or upstream.
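
The difference between the two behaviours can be shown with a toy model. This is not Neutron's allocation code: both classes are hypothetical and only contrast a pool that hands a released address straight back out with one that recycles released addresses only after the fresh ones run out.

```python
from collections import deque


class ImmediateReusePool:
    """Toy model: a released address is the next one handed out."""

    def __init__(self, addresses):
        self.free = deque(addresses)

    def allocate(self):
        return self.free.popleft()

    def release(self, ip):
        # Released address goes to the front of the free list.
        self.free.appendleft(ip)


class DeferredReusePool:
    """Toy model: released addresses are recycled only when the
    never-used addresses are exhausted."""

    def __init__(self, addresses):
        self.free = deque(addresses)
        self.released = deque()

    def allocate(self):
        if not self.free:
            # Recycle released addresses only once fresh ones run out.
            self.free, self.released = self.released, deque()
        return self.free.popleft()

    def release(self, ip):
        self.released.append(ip)


addrs = ["10.29.8.5", "10.29.8.6", "10.29.8.7"]
for pool_cls in (ImmediateReusePool, DeferredReusePool):
    pool = pool_cls(addrs)
    first = pool.allocate()
    pool.release(first)
    second = pool.allocate()
    print(pool_cls.__name__, first, second)
```

With immediate reuse, a new port can receive the address of a port deleted moments earlier, so a stale host entry that was never cleaned up collides with the new one. With deferred reuse the collision only becomes possible after the pool wraps around.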

Dmitry Mescheryakov (dmitrymex) on 2014-07-15

tags:	added: neutron

Dmitry Ilyin (idv1985) on 2014-07-15

summary:

- dnsmasq (for neutron-dhcp-agent) is sometimes configured with duplicate
- leases
+ [mos] dnsmasq (for neutron-dhcp-agent) is sometimes configured with
+ duplicate leases

Alexander Ignatov (aignatov) on 2014-08-01

Changed in mos:
importance:	Undecided → Medium
milestone:	none → 6.0
Changed in fuel:
milestone:	5.1 → 6.0

Alexander Ignatov (aignatov) wrote on 2014-09-24:

#10

Moved to the Confirmed state because it is not clear whether this was fixed upstream before Juno.

no longer affects:	fuel
Changed in mos:
status:	Incomplete → Confirmed

Ilya Shakhat (shakhat) wrote on 2014-11-27:

#11

Unreproducible on 6.0; the issue is suspected to be fixed in Icehouse.

Changed in mos:
status:	Confirmed → Won't Fix

Dmitry Mescheryakov (dmitrymex) wrote on 2014-11-27:

#12

Invalid is a more appropriate state for unreproducible issues.

Changed in mos:
status:	Won't Fix → Invalid
