Wrong expire date in nova-dhcpbridge init output

Bug #1104915 reported by PierreF
This bug affects 2 people
Affects                   Status     Importance  Assigned to  Milestone
OpenStack Compute (nova)  New        Undecided   Unassigned
nova (Ubuntu)             Confirmed  Undecided   Unassigned

Bug Description

TL;DR:

nova-dhcpbridge init generates the leases file using the instance's updated_at instead of the fixed IP's updated_at, so every lease appears to have expired several months ago. As a result, every time dnsmasq is restarted, it sends a DHCPNAK the first time a client asks to renew its lease.

Long version:

dnsmasq expects the leases format to have:

* one line per lease
* the expiry date, in seconds since the epoch, in the first column
* the other columns don't matter for this issue :)

This is the case for dnsmasq 2.59 and 2.65 (the precise and raring versions).

But this is the output of nova-dhcpbridge init (which dnsmasq calls to load the leases):

1352420950 xx:xx:3e:01:7d:xx 10.0.0.3 app01.domain *
1352421657 xx:xx:3e:7e:0a:xx 10.0.0.4 app02.domain *
[...]

So the expiry date for those entries is 9 November 2012, around 1am. The script was run on 25 January 2013 at 9am (UTC). With our lease time of 1 day, we expected an expiry date of 25 January 2013 at 10am.
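
The mismatch can be checked by decoding the first column. This is a minimal sketch, assuming the five-column layout shown above; `parse_lease` and `is_expired` are illustrative helper names, not part of nova or dnsmasq:

```python
import calendar
from datetime import datetime, timezone


def parse_lease(line):
    """Split one leases line into (expiry_epoch, mac, ip, hostname)."""
    expiry, mac, ip, hostname = line.split()[:4]
    return int(expiry), mac, ip, hostname


def is_expired(expiry_epoch, now_epoch):
    """A lease counts as expired once its expiry time is not in the future."""
    return expiry_epoch <= now_epoch


line = "1352420950 xx:xx:3e:01:7d:xx 10.0.0.3 app01.domain *"
expiry, mac, ip, hostname = parse_lease(line)

# Time at which the script was run in the report: 25 January 2013, 09:00 UTC.
run_time = calendar.timegm((2013, 1, 25, 9, 0, 0))

expiry_utc = datetime.fromtimestamp(expiry, tz=timezone.utc)
print(expiry_utc.isoformat())        # 2012-11-09T00:29:10+00:00
print(is_expired(expiry, run_time))  # True
```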

So when dnsmasq loads the leases, it finds them all expired and discards them. This causes dnsmasq to reply with DHCPNAK to a DHCPREQUEST (as far as dnsmasq is concerned, the requested lease does not exist because it has expired). Fortunately, when the client comes back with a DHCPDISCOVER, dnsmasq gets the information from the configuration file (/var/lib/nova/networks/nova-brxxx.conf) and creates a lease with the correct expiry time. But this lease is only tracked in memory, so subsequent DHCPREQUESTs will work only until the next restart of dnsmasq.
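
The sequence described above can be sketched as a toy model. This is not dnsmasq's code; the function names and the dictionary-based lease table are assumptions made only to illustrate why the first renewal after a restart is refused and the next one succeeds:

```python
def load_leases(lines, now):
    """Keep only leases whose expiry (first column) is still in the future."""
    leases = {}
    for line in lines:
        expiry, mac, ip = line.split()[:3]
        if int(expiry) > now:
            leases[ip] = (mac, int(expiry))
    return leases


def handle_request(leases, ip):
    """Renewal (DHCPREQUEST): NAK if the server knows no lease for the address."""
    return "DHCPACK" if ip in leases else "DHCPNAK"


def handle_discover(leases, ip, mac, now, lease_time):
    """DHCPDISCOVER: create a fresh in-memory lease from static configuration."""
    leases[ip] = (mac, now + lease_time)
    return "DHCPOFFER"


now = 1359104400  # 25 January 2013, 09:00 UTC
stale = ["1352420950 xx:xx:3e:01:7d:xx 10.0.0.3 app01.domain *"]

leases = load_leases(stale, now)            # every lease is dropped as expired
first = handle_request(leases, "10.0.0.3")  # renewal right after the restart
handle_discover(leases, "10.0.0.3", "xx:xx:3e:01:7d:xx", now, 86400)
second = handle_request(leases, "10.0.0.3")  # renewal after rediscovery
print(first, second)  # DHCPNAK DHCPACK
```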

In the end, every time dnsmasq is restarted, a client that tries to renew its lease gets a DHCPNAK and its interface goes down (it loses all its IPs). Even though the DHCPDISCOVER sent right after re-adds the IP, this can disrupt some services (in our case, pacemaker, which manages a virtual IP).

Digging a bit into how nova-dhcpbridge generates the leases file, it seems to come from:

* _host_lease function in nova/network/linux_net.py:

    if data['instance_updated']:
        timestamp = data['instance_updated']
    else:
        timestamp = data['instance_created']

    seconds_since_epoch = calendar.timegm(timestamp.utctimetuple())

    return '%d %s %s %s *' % (seconds_since_epoch + FLAGS.dhcp_lease_time,
                              data['vif_address'],
                              data['address'],
                              data['instance_hostname'] or '*')

data['instance_updated'] is taken from the instances table, and it matches the date seen in the output of nova-dhcpbridge init. It's also the creation date of our machine (give or take a few minutes; probably the end of the first boot).

From my understanding of how nova-dhcpbridge works, every time dnsmasq replies to a client with a new lease, it calls the nova-dhcpbridge script, which updates the database (table fixed_ips, column updated_at). So I think that instead of "instance_updated", we should use "fixed_ips.updated_at" when generating the leases.
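
A rough sketch of that idea, kept standalone so it can be run outside nova. The 'updated' key (meant to carry fixed_ips.updated_at), the DHCP_LEASE_TIME constant standing in for FLAGS.dhcp_lease_time, and all the example values are assumptions for illustration, not the actual nova code or the attached patch:

```python
import calendar
from datetime import datetime

DHCP_LEASE_TIME = 86400  # stand-in for FLAGS.dhcp_lease_time (1 day here)


def host_lease(data):
    """Build a leases line whose expiry is based on the fixed IP's last update."""
    # 'updated' is a hypothetical key carrying fixed_ips.updated_at; fall back
    # to the instance timestamps when it is missing or None.
    timestamp = (data.get('updated')
                 or data.get('instance_updated')
                 or data['instance_created'])
    seconds_since_epoch = calendar.timegm(timestamp.utctimetuple())
    return '%d %s %s %s *' % (seconds_since_epoch + DHCP_LEASE_TIME,
                              data['vif_address'],
                              data['address'],
                              data['instance_hostname'] or '*')


# Illustrative values only: a lease last renewed on 24 January 2013, 10:00 UTC.
example = {
    'updated': datetime(2013, 1, 24, 10, 0, 0),
    'instance_updated': datetime(2012, 11, 8, 0, 29, 10),
    'instance_created': datetime(2012, 11, 8, 0, 20, 0),
    'vif_address': 'xx:xx:3e:01:7d:xx',
    'address': '10.0.0.3',
    'instance_hostname': 'app01.domain',
}
print(host_lease(example))
# 1359108000 xx:xx:3e:01:7d:xx 10.0.0.3 app01.domain *
```

With these made-up inputs the expiry lands on 25 January 2013 at 10:00 UTC, one lease period after the last renewal rather than one lease period after the instance's first boot.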

Version of software (Ubuntu version):

* Ubuntu 12.04 (precise) amd64
* nova-* 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1
* dnsmasq 2.65-1~precise1

The way the leases file is generated by nova-dhcpbridge (_host_lease function in nova/network/linux_net.py) is the same in the nova git repository at both tag 2012.1.3 (b00f759) and master (97a5274, dated Thu Jan 24).

Tags: patch
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nova (Ubuntu):
status: New → Confirmed
PierreF (pierre-fersing) wrote :

I see two ways of fixing the issue. The first one doesn't change the db/api, but it's not a very nice fix. The second one seems to be the correct way to fix this issue.

patch1.diff : always use time.time() + lease_time to set expiry in nova/network/linux_net.py

patch2.diff : add "updated" (models.FixedIp.updated_at) in data returned by db.api.network_get_associated_fixed_ips, and use this time when generating the leases.
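
For comparison, the first approach can be sketched as follows. This is an illustration of the idea only, not the content of patch1.diff; `host_lease_from_now` and the DHCP_LEASE_TIME constant (standing in for FLAGS.dhcp_lease_time) are assumed names:

```python
import time

DHCP_LEASE_TIME = 86400  # stand-in for FLAGS.dhcp_lease_time (1 day here)


def host_lease_from_now(data, now=None):
    """Build a leases line that always expires one lease period from 'now'."""
    if now is None:
        now = time.time()
    return '%d %s %s %s *' % (now + DHCP_LEASE_TIME,
                              data['vif_address'],
                              data['address'],
                              data['instance_hostname'] or '*')


entry = {
    'vif_address': 'xx:xx:3e:01:7d:xx',
    'address': '10.0.0.3',
    'instance_hostname': 'app01.domain',
}
# Passing 'now' explicitly (25 January 2013, 09:00 UTC) keeps the output stable.
print(host_lease_from_now(entry, now=1359104400))
# 1359190800 xx:xx:3e:01:7d:xx 10.0.0.3 app01.domain *
```

With this variant no lease is ever reported as expired, but the expiry no longer reflects when the client last renewed, which may be why it is the less attractive option.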

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "always use time.time() + lease_time to set expiry in nova/network/linux_net.py" of this bug report has been identified as being a patch. The ubuntu-reviewers team has been subscribed to the bug report so that they can review the patch. In the event that this is in fact not a patch, you can resolve this situation by removing the tag 'patch' from the bug report and editing the attachment so that it is not flagged as a patch. Additionally, if you are a member of the ubuntu-reviewers team, please also unsubscribe the team from this bug report.

[This is an automated message performed by a Launchpad user owned by Brian Murray. Please contact him regarding any issues with the action taken in this bug report.]

tags: added: patch
Chet Burgess (cfb-n) wrote :

This bug looks like a duplicate of an existing bug that has a fix pending.

https://bugs.launchpad.net/nova/+bug/1103260

PierreF (pierre-fersing) wrote :

Yes, the root cause is the same (nova-dhcpbridge init emits leases that are already expired). The fix proposed in bug #1103260 should fix this issue.
