Thousands of duplicate leases issued for the same IP/MAC

Bug #1314692 reported by Jason Hobbs
18
This bug affects 4 people
Affects Status Importance Assigned to Milestone
MAAS
Triaged
High
Unassigned

Bug Description

This is with maas bzr2252-0ubuntu1, dhcpd 4.2.4-7ubuntu12.

rodsmith made me aware of an issue he's seeing on a MAAS server being used for certification.

One node, 'decathalete2', an Intel Decathalete system, is for some reason issuing DHCP requests very rapidly, and maas-dhcp-server is issuing new leases for it just as rapidly. I've observed 5 new leases created in a minute. This is causing the lease file to swell to hundreds of thousands of lines, even though only 6 nodes are being managed. This in turn causes celeryd to take forever to upload leases to MAAS.

MAAS/dhcpd should be more resilient here - it should not be creating so many leases for the same IP/MAC.

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

Attached an example dhcpd leases file from the system.

description: updated
Changed in maas:
status: New → Triaged
importance: Undecided → High
Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

As a work around, maas-dhcp-server can be restarted periodically. This causes the duplicate entries to be cleaned up.

Revision history for this message
Jason Hobbs (jason-hobbs) wrote :

There is a relevant option for DHCPD (from the dhcpd.conf man page):
       The one-lease-per-client statement

         one-lease-per-client flag;

         If this flag is enabled, whenever a client sends a DHCPREQUEST for a
         particular lease, the server will automatically free any other leases
         the client holds. This presumes that when the client sends a DHCPRE-
         QUEST, it has forgotten any lease not mentioned in the DHCPREQUEST -
         i.e., the client has only a single network interface and it does not
         remember leases it's holding on networks to which it is not currently
         attached. Neither of these assumptions are guaranteed or provable,
         so we urge caution in the use of this statement.

Revision history for this message
Rod Smith (rodsmith) wrote :

The system making the requests was running Ubuntu 14.04 server. It has four Ethernet ports, which were originally called eth0 to eth3; but after a firmware update, udev decided to rename them to em1 to em4. /var/log/syslog had lots of dhclient activity:

$ grep dhclient /var/log/syslog
Apr 30 07:11:33 decathlete2 dhclient: DHCPREQUEST of 10.20.30.60 on eth3 to 10.20.30.1 port 67 (xid=0x2a1ff0ee)
Apr 30 07:11:33 decathlete2 dhclient: DHCPACK of 10.20.30.60 from 10.20.30.1
Apr 30 07:11:33 decathlete2 dhclient: bound to 10.20.30.60 -- renewal in 19137 seconds.
Apr 30 08:49:02 decathlete2 dhclient: message repeated 1425 times: [ DHCPREQUEST of 10.20.30.58 on eth1 to 10.20.30.1 port 67 (xid=0x424c0758)]
Apr 30 08:49:15 decathlete2 dhclient: DHCPREQUEST of 10.20.30.58 on eth1 to 255.255.255.255 port 67 (xid=0x424c0758)
Apr 30 08:58:01 decathlete2 dhclient: message repeated 1400 times: [ DHCPREQUEST of 10.20.30.59 on eth2 to 10.20.30.1 port 67 (xid=0x7cc18273)]
Apr 30 08:58:18 decathlete2 dhclient: DHCPREQUEST of 10.20.30.59 on eth2 to 255.255.255.255 port 67 (xid=0x7cc18273)
Apr 30 10:19:05 decathlete2 dhclient: message repeated 398 times: [ DHCPREQUEST of 10.20.30.58 on eth1 to 255.255.255.255 port 67 (xid=0x424c0758)]
Apr 30 10:19:10 decathlete2 dhclient: DHCPDISCOVER on eth1 to 255.255.255.255 port 67 interval 3 (xid=0x3bd05ae3)
Apr 30 10:19:10 decathlete2 dhclient: DHCPREQUEST of 10.20.30.58 on eth1 to 255.255.255.255 port 67 (xid=0x3bd05ae3)
Apr 30 10:19:10 decathlete2 dhclient: DHCPOFFER of 10.20.30.58 from 10.20.30.1
Apr 30 10:19:10 decathlete2 dhclient: DHCPACK of 10.20.30.58 from 10.20.30.1
Apr 30 10:19:11 decathlete2 dhclient: bound to 10.20.30.58 -- renewal in 20622 seconds.
Apr 30 10:28:06 decathlete2 dhclient: message repeated 388 times: [ DHCPREQUEST of 10.20.30.59 on eth2 to 255.255.255.255 port 67 (xid=0x7cc18273)]
Apr 30 10:28:09 decathlete2 dhclient: DHCPDISCOVER on eth2 to 255.255.255.255 port 67 interval 3 (xid=0x2d134628)
Apr 30 10:28:09 decathlete2 dhclient: DHCPREQUEST of 10.20.30.59 on eth2 to 255.255.255.255 port 67 (xid=0x2d134628)
Apr 30 10:28:09 decathlete2 dhclient: DHCPOFFER of 10.20.30.59 from 10.20.30.1
Apr 30 10:28:09 decathlete2 dhclient: DHCPACK of 10.20.30.59 from 10.20.30.1
Apr 30 10:28:10 decathlete2 dhclient: bound to 10.20.30.59 -- renewal in 18798 seconds.
Apr 30 10:44:19 decathlete2 dhclient: DHCPREQUEST of 10.20.30.57 on em0 to 10.20.30.1 port 67 (xid=0x6d441934)
Apr 30 12:30:30 decathlete2 dhclient: DHCPREQUEST of 10.20.30.60 on eth3 to 10.20.30.1 port 67 (xid=0x2a1ff0ee)
Apr 30 12:30:30 decathlete2 dhclient: DHCPACK of 10.20.30.60 from 10.20.30.1
Apr 30 12:30:30 decathlete2 dhclient: bound to 10.20.30.60 -- renewal in 19290 seconds.
Apr 30 15:41:28 decathlete2 dhclient: DHCPREQUEST of 10.20.30.59 on eth2 to 10.20.30.1 port 67 (xid=0x2d134628)

I've shut it down for the moment, but will bring it up again soon to experiment, and with any luck find the cause of dhclient going berserk.

Revision history for this message
Raphaël Badin (rvb) wrote :

This is related (but not a dupe) to bug 1305102.

Revision history for this message
Rod Smith (rodsmith) wrote :

I must retract what I said earlier about the firmware update; I was confusing two different systems, and there was no firmware update on the computer that was sending out excessive DHCP requests.

Upon starting that system up again, it's been behaving normally, and I see no clues in its logs about what caused it to misbehave. Of course, from the perspective of the MAAS and DHCP server, this is irrelevant; the server should be robust against a client that's misbehaving like this.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

Here's a snippet from the leases:

lease 10.20.30.59 {
  starts 3 2014/04/30 15:41:39;
  ends 4 2014/05/01 03:41:39;
  cltt 3 2014/04/30 15:41:39;
  binding state active;
  next binding state free;
  rewind binding state free;
  hardware ethernet 00:1e:67:31:3b:eb;
  client-hostname "decathlete2";
}
lease 10.20.30.59 {
  starts 3 2014/04/30 15:41:43;
  ends 4 2014/05/01 03:41:43;
  cltt 3 2014/04/30 15:41:43;
  binding state active;
  next binding state free;
  rewind binding state free;
  hardware ethernet 00:1e:67:31:3b:eb;
  client-hostname "decathlete2";
}
lease 10.20.30.59 {
  starts 3 2014/04/30 15:41:50;
  ends 4 2014/05/01 03:41:50;
  cltt 3 2014/04/30 15:41:50;
  binding state active;
  next binding state free;
  rewind binding state free;
  hardware ethernet 00:1e:67:31:3b:eb;
  client-hostname "decathlete2";
}
lease 10.20.30.59 {
  starts 3 2014/04/30 15:42:00;
  ends 4 2014/05/01 03:42:00;
  cltt 3 2014/04/30 15:42:00;
  binding state active;
  next binding state free;
  rewind binding state free;
  hardware ethernet 00:1e:67:31:3b:eb;
  client-hostname "decathlete2";
}

Notice the lease times are overlapping.

Surely this is a bug in DHCPD? Why on earth is it writing a new lease out when one already exists? I guess it's just "renewing" it but perhaps we just need to make it more aggressively clean up old leases. (although having just read further it will only clean once per hour and that's a compiled-in setting)

Revision history for this message
Julian Edwards (julian-edwards) wrote : Re: [Bug 1314692] Re: Thousands of duplicate leases issued for the same IP/MAC

On 01/05/14 02:00, Jason Hobbs wrote:
> There is a relevant option for DHCPD (from the dhcpd.conf man page):
> The one-lease-per-client statement
>
> one-lease-per-client flag;
>
> If this flag is enabled, whenever a client sends a DHCPREQUEST for a
> particular lease, the server will automatically free any other leases
> the client holds. This presumes that when the client sends a DHCPRE-
> QUEST, it has forgotten any lease not mentioned in the DHCPREQUEST -
> i.e., the client has only a single network interface and it does not
> remember leases it's holding on networks to which it is not currently
> attached. Neither of these assumptions are guaranteed or provable,
> so we urge caution in the use of this statement.
>

This would still cause new leases to get appended to the leases file AFAICS.

The only way around the problem in MAAS is to make the lease parsing
faster, which is bug 1305102.

Any other fixes need to be done in DHCPD I think, unless there is more
config that MAAS can tweak.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

Actually I'm going to dupe that bug, its fix which is coming RSN will alleviate the symptoms of this one.

Revision history for this message
Zoltan Arnold Nagy (zoltan) wrote :

Did that fix ever happen? I'm still seeing this issue yet the referenced bug is "fixed".

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Bug attachments

Remote bug watches

Bug watches keep track of this bug in other bug trackers.