dhclient does not remain running on boot
| Affects | Status | Importance | Assigned to | Milestone | |
|---|---|---|---|---|---|
| | systemd (Ubuntu) |
High
|
Martin Pitt | ||
Bug Description
Not sure whether this is a systemd or ifupdown issue. I just happened to notice today for a Wily VM (set up from server iso) that I failed to get a DNS resolution for a VM I started earlier the day. DNS updates in my case are related to DHCP updates. And on the DHCP/DNS server I noticed that the IP address had been removed at some point. Looking at the VM, I find no dhclient being started. When I manually run ifdown and ifup on the interface this does happen. But not whenever I boot.
Curious about this I also checked a Vivid VM (also server install) and this looks the same. However for Vivid this seems fixed in the latest versions.
Network config in /etc/network/
auto eth0
iface eth0 inet dhcp
Expected result:
dhclient running for eth0
Actual result after boot:
No dhclient process is running.
| Stefan Bader (smb) wrote : | #2 |
| Martin Pitt (pitti) wrote : | #4 |
The journal clearly shows that ifup did get started:
Jun 19 13:49:08 lam-wily6401 systemd[1]: Starting ifup for eth0...
[...]
Jun 19 13:49:08 lam-wily6401 ifup[456]: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 3 (xid=0x15a1af31)
Jun 19 13:49:08 lam-wily6401 dhclient[488]: DHCPREQUEST of 192.168.2.189 on eth0 to 255.255.255.255 port 67 (xid=0x31afa115)
Jun 19 13:49:08 lam-wily6401 dhclient[488]: DHCPOFFER of 192.168.2.189 from 192.168.2.1
Jun 19 13:49:08 lam-wily6401 ifup[456]: DHCPREQUEST of 192.168.2.189 on eth0 to 255.255.255.255 port 67 (xid=0x31afa115)
Jun 19 13:49:08 lam-wily6401 ifup[456]: DHCPOFFER of 192.168.2.189 from 192.168.2.1
[...]
Jun 19 13:49:09 lam-wily6401 dhclient[488]: DHCPACK of 192.168.2.189 from 192.168.2.1
Jun 19 13:49:09 lam-wily6401 ifup[456]: DHCPACK of 192.168.2.189 from 192.168.2.1
Jun 19 13:49:09 lam-wily6401 dhclient[488]: bound to 192.168.2.189 -- renewal in 297 seconds.
Jun 19 13:49:09 lam-wily6401 ifup[456]: bound to 192.168.2.189 -- renewal in 297 seconds.
Jun 19 13:49:09 lam-wily6401 ifquery[748]: eth0=eth0
Jun 19 13:49:09 lam-wily6401 systemd[1]: Started ifup for eth0.
So I'm a bit confused what's wrong now. Can you please copy&paste the output of "sudo systemctl status -s ifup@eth0"?
| Stefan Bader (smb) wrote : | #5 |
Hm -s did not work. So I used -l instead. Clearly it gets started. It just feels like it gets ended when ifup exits. As if not damonized enough...
<email address hidden> - ifup for eth0
Loaded: loaded (/lib/systemd/
Active: active (exited) since Fr 2015-06-19 13:49:09 CEST; 1h 29min ago
Process: 748 ExecStartPost=
Process: 456 ExecStartPost=
Process: 455 ExecStart=
Main PID: 455 (code=exited, status=0/SUCCESS)
CGroup: /<email address hidden>
Jun 19 13:49:08 lam-wily6401 dhclient[488]: DHCPREQUEST of 192.168.2.189 on eth0 to 255.255.255.255 port 67 (xid=0x31afa115)
Jun 19 13:49:08 lam-wily6401 dhclient[488]: DHCPOFFER of 192.168.2.189 from 192.168.2.1
Jun 19 13:49:08 lam-wily6401 ifup[456]: DHCPREQUEST of 192.168.2.189 on eth0 to 255.255.255.255 port 67 (xid=0x31afa115)
Jun 19 13:49:08 lam-wily6401 ifup[456]: DHCPOFFER of 192.168.2.189 from 192.168.2.1
Jun 19 13:49:09 lam-wily6401 dhclient[488]: DHCPACK of 192.168.2.189 from 192.168.2.1
Jun 19 13:49:09 lam-wily6401 ifup[456]: DHCPACK of 192.168.2.189 from 192.168.2.1
Jun 19 13:49:09 lam-wily6401 dhclient[488]: bound to 192.168.2.189 -- renewal in 297 seconds.
Jun 19 13:49:09 lam-wily6401 ifup[456]: bound to 192.168.2.189 -- renewal in 297 seconds.
Jun 19 13:49:09 lam-wily6401 ifquery[748]: eth0=eth0
Jun 19 13:49:09 lam-wily6401 systemd[1]: Started ifup for eth0.
| Stefan Bader (smb) wrote : | #6 |
Seems on Vivid this was fixed in some way (at least systemd was very outdated in the VM I had quickly booted up). Before the update systemd was at version 219-4*, now updated to 219-7ubuntu3.
The wily VM was up-to-date and runs systemd 220-7ubuntu1. Could it be that something about keeping daemons started during ifup running be accidentally dropped?
| summary: |
- dhclient not started on boot + dhclient does not remain running on boot |
| description: | updated |
| Launchpad Janitor (janitor) wrote : | #7 |
Status changed to 'Confirmed' because the bug affects multiple users.
| Changed in ubuntu: | |
| status: | New → Confirmed |
| Stefan Bader (smb) wrote : | #8 |
One thing that I noticed on the Wily VM but did not happen on a server of someone else I asked: somehow ntpdate seems to get disrupted in some way that leaves the lockfile in /var/lock around. Which then causes the ominous "exceeded maximum number of lock attempts".
| Stefan Bader (smb) wrote : | #9 |
Addition to above statement. The ntpdate lock error happens when I manually down and up eth to get dhclient running.
| Martin Pitt (pitti) wrote : | #10 |
> Could it be that something about keeping daemons started during ifup running be accidentally dropped?
Not that I know of, and in my wily VMs dhclient runs happily.
I don't know whether ifupdown has some useful debugging facilities. I tried adding /usr/bin/strace -fvvs1024 -o /run/ifup.trace to /etc/init.
| affects: | ubuntu → ifupdown (Ubuntu) |
| Changed in ifupdown (Ubuntu): | |
| status: | Confirmed → Incomplete |
| Stefan Bader (smb) wrote : | #11 |
Seems on bare-metal servers I have, I do not see this. The VM as well as well as bare-metal server of the team mate I asked are booting rather quickly. That as side note. I upgraded the VM today and now run on systemd/udev 221-1ubuntu2.
Still the same. One observation that looks odd: There is a pid file in /run: /run/dhclient.
● <email address hidden> - ifup for eth0
Loaded: loaded (/lib/systemd/
Active: active (exited) since Di 2015-06-30 10:14:31 CEST; 10min ago
Process: 729 ExecStartPost=
Process: 450 ExecStartPost=
Process: 448 ExecStart=
Main PID: 448 (code=exited, status=0/SUCCESS)
CGroup: /<email address hidden>
Jun 30 10:14:31 lam-wily6401 dhclient[477]: DHCPREQUEST of 192.168.2.189 on ...)
Jun 30 10:14:31 lam-wily6401 dhclient[477]: DHCPOFFER of 192.168.2.189 from ...1
Jun 30 10:14:31 lam-wily6401 ifup[450]: DHCPREQUEST of 192.168.2.189 on eth0...)
Jun 30 10:14:31 lam-wily6401 ifup[450]: DHCPOFFER of 192.168.2.189 from 192....1
Jun 30 10:14:31 lam-wily6401 dhclient[477]: DHCPACK of 192.168.2.189 from 19...1
Jun 30 10:14:31 lam-wily6401 ifup[450]: DHCPACK of 192.168.2.189 from 192.16...1
Jun 30 10:14:31 lam-wily6401 dhclient[477]: bound to 192.168.2.189 -- renewa....
Jun 30 10:14:31 lam-wily6401 ifup[450]: bound to 192.168.2.189 -- renewal in....
Jun 30 10:14:31 lam-wily6401 ifquery[729]: eth0=eth0
Jun 30 10:14:31 lam-wily6401 systemd[1]: Started ifup for eth0.
| Changed in ifupdown (Ubuntu): | |
| status: | Incomplete → New |
| Stefan Bader (smb) wrote : | #12 |
Changed the disk from virtio to IDE without change. Added debug to the kernel options and got a bit more info in journal but it looks odd. There is a dhclient with PID 496 early on. Which does not seem to exit. But then later on there is a 699 getting killed which matches the pid file but never seems to be started.
Could be a gap from pivoting root but I cannot really say how to tell for sure.
| Stefan Bader (smb) wrote : | #13 |
Comparing to output of a vivid VM, the sysctl status differs in one important detail. The initial dhclient again is gone but under the system-ifup.slice there is the new pid...
● <email address hidden> - ifup for eth0
Loaded: loaded (/lib/systemd/
Active: active (exited) since Tue 2015-06-30 15:08:44 CEST; 20s ago
Process: 403 ExecStart=/bin/sh -ec ifup --allow=hotplug %I; ifup --allow=auto %I; if ifquery %I >/dev/null; then ifquery --state %I >/dev/null; fi (code=exited, status=0/SUCCESS)
Main PID: 403 (code=exited, status=0/SUCCESS)
CGroup: /<email address hidden>
└─736 dhclient -1 -v -pf /run/dhclient.
Jun 30 15:08:45 lam-vivid6401 sh[403]: Sending on Socket/fallback
Jun 30 15:08:45 lam-vivid6401 sh[403]: DHCPDISCOVER on eth0 to 255.255.255....7)
Jun 30 15:08:46 lam-vivid6401 dhclient[430]: DHCPREQUEST of 192.168.2.159 on...)
Jun 30 15:08:46 lam-vivid6401 dhclient[430]: DHCPOFFER of 192.168.2.159 from...1
Jun 30 15:08:46 lam-vivid6401 sh[403]: DHCPREQUEST of 192.168.2.159 on eth0...7)
Jun 30 15:08:46 lam-vivid6401 sh[403]: DHCPOFFER of 192.168.2.159 from 192.....1
Jun 30 15:08:47 lam-vivid6401 dhclient[430]: DHCPACK of 192.168.2.159 from 1...1
Jun 30 15:08:47 lam-vivid6401 sh[403]: DHCPACK of 192.168.2.159 from 192.168.2.1
Jun 30 15:08:47 lam-vivid6401 dhclient[430]: bound to 192.168.2.159 -- renew....
Jun 30 15:08:47 lam-vivid6401 sh[403]: bound to 192.168.2.159 -- renewal in...s.
Hint: Some lines were ellipsized, use -l to show in full.
ubuntu@
736
| Stefan Bader (smb) wrote : | #14 |
I think I may have a clue now what the problem is. Incidentally on my bare-metal servers the network interface seems to be brought up by something else than the ifup@.service. The status there reports it already up when the service runs. The same seems to be happening to MAAS instances after initial deployment. On first login dhclient is not running but after reboot it is, though then again the ifup@.service was not starting the interface.
The problem looks to be that systemd acts differently on ExecStart and ExecStart[Pre|Post] elements. In Vivid we still had only a ExecStart element which was a shell command running two ifup and some conditional ifquery command. In Wily this changed into a ExecStart which is ifup for hotplug and two ExecStartPost which are a second ifup (but this time for auto class) and an ifquery. The interface is brought up by the second ifup (as it is set to auto), but it seems systemd does not check/handle any processes started up in the background of a ExecStartPost.
So one way that fixed the dhclient issue was to exchange the hotplug and auto ifup calls, so auto was done in the ExecStart call. But that would only lead to problems if there was ever an interface in the hotplug class.
What seems to work for me (not sure this is all valid) is to make the service oneshot and have multiple ExecStart elements, like this:
[Service]
Type=oneshot
ExecStart=
ExecStart=
ExecStart=
ExecStop=
RemainAfterExit
| Launchpad Janitor (janitor) wrote : | #15 |
Status changed to 'Confirmed' because the bug affects multiple users.
| Changed in systemd (Ubuntu): | |
| status: | New → Confirmed |
| affects: | ifupdown (Ubuntu) → systemd (Ubuntu) |
| Changed in systemd (Ubuntu): | |
| status: | New → In Progress |
| importance: | Undecided → High |
| assignee: | nobody → Martin Pitt (pitti) |
| Changed in systemd (Ubuntu): | |
| status: | In Progress → Fix Committed |
| Launchpad Janitor (janitor) wrote : | #16 |
This bug was fixed in the package systemd - 222-2ubuntu1
---------------
systemd (222-2ubuntu1) wily; urgency=medium
* Merge with Debian unstable. Remaining Ubuntu changes:
- Hack to support system-image read-only /etc, and modify files in
/
- Keep our much simpler udev maintainer scripts (all platforms must
support udev, no debconf).
- initramfs init-bottom: If LVM is installed, settle udev,
otherwise we get missing LV symlinks. Workaround for LP #1185394.
- Add debian/
dependencies to "lvm2" which is handled with udev rules in Ubuntu.
- Add debian/
script.
- Provide shutdown fallback for upstart. (LP: #1370329)
- debian/
really support "allow-hotplug" in Ubuntu at the moment, so we need to
deal with "auto" devices appearing after "/etc/init.
already ran. (LP: #1374521) Also run ifup in the background during boot,
to avoid blocking network.target. (LP: #1425376)
- ifup@.service: Drop dependency on networking.service (i. e.
/
This avoids unnecessary dependencies/
cycles if hooks wait for other interfaces to come up (like ifenslave
with bonding interfaces). (LP: #1414544)
- Add Get-RTC-
Ubuntu we currently keep the setting whether the RTC is in local or UTC
time in /etc/default/rcS "UTC=yes|no", instead of /etc/adjtime.
(LP: #1377258)
- Put session scopes into all cgroup controllers. This makes unprivileged
user LXC containers work under systemd. (LP: #1346734)
- systemctl: Don't forward telinit u to upstart. This works around
upstart's Restart() always reexec'ing /sbin/init on Restart(), even if
that changes to point to systemd during the upgrade. This avoids running
systemd during a dist-upgrade. (LP: #1430479)
- Drop hwdb-update dependency from udev-trigger.
introduced in v219-stable. This causes udev and plymouth to start too
late and isn't really needed in Ubuntu yet as we don't support stateless
systems yet and handle hwdb.bin updates through dpkg triggers. This can
be dropped again with initramfs-tools 0.117.
- Lower Breaks: to plymouth version which has the udev inotify fix in
Ubuntu.
- Lower apparmor Breaks: to the Ubuntu version that dropped $remote_fs.
- Change systemd-sysv's conflicts to upstart-sysv. (LP: #1422681)
Upgrade fixes, keep until 16.04 LTS release:
- systemd Conflicts/
- Remove obsolete systemd-logind upstart job.
- Clean up obsolete /etc/udev/
- systemd.postinst: Migrate mountall specific fstab options to standard
util-linux "nofail" option.
* ifup@.service: Run ifup for "auto" class in ExecStart instead of
ExecStartPost, as it spawns a long-run...
| Changed in systemd (Ubuntu): | |
| status: | Fix Committed → Fix Released |


Can you please attach the "sudo journalctl" output after booting?