ifupdown calling dhclient with -1 causes it to fail when dhcp server unavailable

Bug #1196975 reported by Sven Mueller
This bug affects 1 person
Affects: isc-dhcp (Ubuntu)
Status: Triaged
Importance: Medium
Assigned to: Unassigned

Bug Description

Present in at least the Precise version (0.7~beta2ubuntu8):

ifupdown is hardcoded to call dhclient with -1, which causes it to exit on the first failure.
This can happen in two ways:
1) Failure to bring up the interface if the dhcp server doesn't respond in time (default 60s).
2) Failure to renew if the dhcp server was unavailable during a previous renewal try.

This seems to be the same issue as http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=694541.

Stéphane Graber (stgraber) wrote :

The -1 option in Ubuntu's dhclient differs from upstream in that it never times out on the first try; that was done specifically to avoid this bug.

Changed in ifupdown (Ubuntu):
status: New → Invalid
Sven Mueller (smu-u) wrote :

Well, that would cover the first failure mode I described, but we experience the second one.
I don't see any workaround, apart from removing the -1 from the dhclient options.

I'm sorry, but this means that a _part_ of my report might be invalid (though we think we saw that issue), but the other part is valid, unless you can provide instructions how to not experience this issue.

When an interface is set to use dhcp and is set to an online state, dhclient shouldn't give up retrying to renew a lease, but it does.

Changed in ifupdown (Ubuntu):
status: Invalid → New
Stéphane Graber (stgraber) wrote :

Can you provide a syslog showing that happening?

Changed in ifupdown (Ubuntu):
status: New → Incomplete
Clint Byrum (clint-fewbar) wrote : Re: [Bug 1196975] Re: ifupdown calling dhclient with -1 causes it to fail when dhcp server unavailable

Excerpts from Sven Mueller's message of 2013-07-02 15:43:57 UTC:
> Well, that would cover the first failure mode I described, but we experience the second one.
> I don't see any workaround, apart from removing the -1 from the dhclient options.
>
> I'm sorry, but this means that a _part_ of my report might be invalid
> (though we think we saw that issue), but the other part is valid, unless
> you can provide instructions how to not experience this issue.
>
> When an interface is set to use dhcp and is set to an online state,
> dhclient shouldn't give up retrying to renew a lease, but it does.
>

I have to agree with you that dhclient should not give up trying to renew
a lease that is still valid, or trying to obtain a new one once its lease
has expired. Basically if the interface is up and configured for DHCP,
I want it to never give up trying to obtain a lease.

It seems like we're abusing -1 instead of adding a more appropriate
"try forever and stay in the foreground" mode, which I think is what we
actually want from -1.

Stéphane Graber (stgraber) wrote :

Well, what we want is "try forever in the foreground and daemonize on success", so as not to block ifupdown (which is single-threaded) from executing the post-up scripts. That is what our patch to dhclient is supposed to do; if it doesn't, I want to see a log so we can figure out exactly which case is causing it to exit.
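
To make the two policies discussed here concrete, the following is a toy Python model, not dhclient's actual code. It contrasts a one-shot attempt that stops after a time budget with "try forever, then hand control back on success". All names (`acquire_lease`, `flaky_server`, `backoff_intervals`) and the backoff values are hypothetical and chosen only for illustration.

```python
from typing import Callable, Iterator, Optional


def backoff_intervals(start: float = 4.0, cap: float = 64.0) -> Iterator[float]:
    """Yield retry delays that double up to a cap (illustrative values only)."""
    delay = start
    while True:
        yield delay
        delay = min(delay * 2, cap)


def acquire_lease(
    request: Callable[[], Optional[str]],
    give_up_after: Optional[float],
    sleep: Callable[[float], None],
) -> Optional[str]:
    """Call request() until it returns a lease.

    give_up_after=None models "try forever"; a number models a one-shot
    attempt that stops once the next wait would exceed that many seconds.
    """
    waited = 0.0
    for delay in backoff_intervals():
        lease = request()
        if lease is not None:
            # A real client would daemonize here so the caller can continue.
            return lease
        if give_up_after is not None and waited + delay > give_up_after:
            # One-shot policy: exit, and nothing retries afterwards.
            return None
        sleep(delay)
        waited += delay
    return None  # not reached: backoff_intervals() never ends


def flaky_server(failures: int) -> Callable[[], Optional[str]]:
    """Return a request() stub that fails `failures` times, then succeeds."""
    remaining = [failures]

    def request() -> Optional[str]:
        if remaining[0] > 0:
            remaining[0] -= 1
            return None
        return "lease-for-eth0"

    return request


def no_sleep(seconds: float) -> None:
    """Stand-in for time.sleep so the demo runs instantly."""


# Server unavailable for the first 10 requests.
print(acquire_lease(flaky_server(10), give_up_after=60.0, sleep=no_sleep))
print(acquire_lease(flaky_server(10), give_up_after=None, sleep=no_sleep))
```

With the same simulated outage, the bounded attempt returns `None` while the unbounded one eventually returns a lease, which is the behavioural difference this thread is about.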

Sven Mueller (smu-u) wrote :

I will see if I can dig up the logs, but this happened a while ago and I only got around to filing this bug now. Anyway, what happened on dozens of machines (might have been hundreds):
1) Each machine has a working Ethernet connection (to the switch).
2) The network itself works reliably for the whole time.
3) The DHCP servers fail for a prolonged timespan (several hours, long enough for the leases of a percentage of the machines to expire).
4) Client machines which retained a valid lease might have tried to renew (renewal time << expiry time) and failed, but dhclient kept going.
5) At some point, the leases for a number of machines expired.
6) The DHCP servers came back up.
7) Machines that still had valid leases at this point just renewed and worked as expected.
8) Machines with expired leases didn't try to get a new lease. (I'm unsure if dhclient exited, or just failed to try a new discovery, but I think it actually exited.)

I found this log snippet. It isn't very helpful, but it is from a host that failed to recover, taken close to the time the DHCP servers came back (around 10am):

Nov 11 09:20:07 host dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 16
Nov 11 09:20:23 host dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 10
Nov 11 09:20:33 host dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7
Nov 11 09:20:40 host dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
Nov 11 09:20:48 host dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7
Nov 11 09:20:55 host dhclient: No DHCPOFFERS received.

No further dhclient entries after that point, until it was started again after a reboot (no login possible, because LDAP server was inaccessible), over 3 hours later (our renewal time is 2 hours, expiry 4 hours or more).
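
The log excerpt above can be checked mechanically. The sketch below is a hypothetical helper, not part of any DHCP tooling. It parses the six quoted lines, collects the DHCPDISCOVER retry intervals, and compares their sum with the time elapsed between the first and last entry. It assumes the exact line format shown in this report and that all lines fall on the same day, so only the time of day is compared.

```python
import re

# The six lines quoted in this bug report, verbatim.
LOG = """\
Nov 11 09:20:07 host dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 16
Nov 11 09:20:23 host dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 10
Nov 11 09:20:33 host dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7
Nov 11 09:20:40 host dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 8
Nov 11 09:20:48 host dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 7
Nov 11 09:20:55 host dhclient: No DHCPOFFERS received.
"""

# "<Mon> <day> HH:MM:SS <host> dhclient: <message>"
LINE = re.compile(r"^\w{3} +\d+ (\d\d):(\d\d):(\d\d) \S+ dhclient: (.*)$")
INTERVAL = re.compile(r"^DHCPDISCOVER on \S+ to \S+ port \d+ interval (\d+)$")


def summarize(log: str) -> dict:
    """Collect retry intervals and the elapsed time covered by the excerpt."""
    first = None
    last = None
    intervals = []
    gave_up = False
    for line in log.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # not a dhclient line in the expected format
        hours, minutes, seconds, message = match.groups()
        stamp = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
        if first is None:
            first = stamp
        last = stamp
        discover = INTERVAL.match(message)
        if discover:
            intervals.append(int(discover.group(1)))
        elif message == "No DHCPOFFERS received.":
            gave_up = True
    elapsed = (last - first) if first is not None else 0
    return {
        "intervals": intervals,
        "interval_sum": sum(intervals),
        "elapsed": elapsed,
        "gave_up": gave_up,
    }


result = summarize(LOG)
print(result)
```

For this excerpt the announced intervals add up to the elapsed time between the first DHCPDISCOVER shown and the "No DHCPOFFERS received." line, so the client did not stop early within the excerpt. The excerpt cannot show whether dhclient exited afterwards or merely stopped logging.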

Launchpad Janitor (janitor) wrote :

[Expired for ifupdown (Ubuntu) because there has been no activity for 60 days.]

Changed in ifupdown (Ubuntu):
status: Incomplete → Expired
Sven Mueller (smu-u) wrote :

This was previously marked incomplete, then expired. However, all the information that is available is in the bug. Here is a summary again:

dhclient stops trying to renew a lease if a previous renewal failed (due to server being unavailable).

See #6 for all the logs available. See how the client tried to renew, then tried a new DHCPDISCOVER, then didn't try again for at least 3 hours, with expiry set to 4 hours or more and renewal time set to 2 hours.

This is a different scenario from the server not being available on start, as clarified in #2.

Note that during the whole time, the client network interface was up, as the switch it was connected to didn't fail.
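
The timing in this report can be sketched as simple arithmetic. The Python below is an illustrative model only. It uses the values reported in this thread (renewal time 2 hours, expiry taken as exactly 4 hours although the report says "4 hours or more") and assumes the rebinding time is the RFC 2131 default of 0.875 of the lease. The 3-hour outage and the two sample clients are hypothetical, and all names are invented for the example.

```python
from dataclasses import dataclass

HOUR = 3600  # seconds


@dataclass(frozen=True)
class LeaseTimers:
    renewal: int    # T1: client starts unicast renewal attempts
    rebinding: int  # T2: client starts broadcasting to any server
    expiry: int     # lease is no longer valid


def state_at(timers: LeaseTimers, elapsed: int) -> str:
    """Client state `elapsed` seconds after the last acknowledged lease."""
    if elapsed < timers.renewal:
        return "BOUND"
    if elapsed < timers.rebinding:
        return "RENEWING"
    if elapsed < timers.expiry:
        return "REBINDING"
    return "EXPIRED"


def expires_during_outage(timers: LeaseTimers, last_ack: int, outage_end: int) -> bool:
    """True if a lease acknowledged at last_ack runs out before the servers return.

    Times are seconds relative to the start of the outage, so a last_ack
    before the outage is negative.
    """
    return last_ack + timers.expiry <= outage_end


# Values from the report; rebinding is the assumed RFC 2131 default (0.875 * lease).
timers = LeaseTimers(renewal=2 * HOUR, rebinding=int(3.5 * HOUR), expiry=4 * HOUR)

outage_end = 3 * HOUR  # hypothetical: servers unavailable for three hours

# Client A renewed 30 minutes before the outage, client B 90 minutes before.
for name, last_ack in (("A", -30 * 60), ("B", -90 * 60)):
    expired = expires_during_outage(timers, last_ack, outage_end)
    print(name, "expired during outage" if expired else "still had a valid lease")
```

Under these assumptions, clients that last renewed shortly before the outage keep a valid lease until the servers return, while those that renewed earlier hit expiry first. That matches the report that only a percentage of machines were affected.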

Changed in ifupdown (Ubuntu):
status: Expired → New
affects: ifupdown (Ubuntu) → isc-dhcp (Ubuntu)
Changed in isc-dhcp (Ubuntu):
importance: Undecided → Medium
status: New → Triaged