upstart hangs at boot when LDAP authentication is used

Bug #1453861 reported by rw
This bug affects 1 person
Affects: upstart (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

We have a few machines running Ubuntu 14.04, fully patched. Reboots usually get scheduled for the weekend after a kernel update. After this weekend's reboot, most of the machines didn't come up again. We determined that upstart gets hung during the boot sequence.

We tried to re-install 14.04 on one of the affected machines and managed to boot it after the install, but the machine was in no usable state because again, upstart was hung, this time apparently after the boot process. The process list was filling up with zombies that didn't get reaped, and programs got stuck when trying to connect to the upstart socket.
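
One way to confirm the zombie symptom is to scan /proc for processes in state "Z". The following is a minimal sketch, not something we ran at the time; the function names are made up for illustration, and it assumes a Linux /proc filesystem.

```python
import os


def process_state(stat_line):
    """Extract the state letter from the contents of /proc/<pid>/stat.

    The format is "pid (comm) state ...". The command name may itself
    contain spaces or ')', so split at the last ')' instead of on spaces.
    """
    return stat_line.rsplit(")", 1)[1].split()[0]


def zombie_pids(proc_root="/proc"):
    """Return the sorted PIDs of all processes currently in state 'Z'."""
    pids = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a per-process directory
        try:
            path = os.path.join(proc_root, entry, "stat")
            with open(path, errors="replace") as f:
                if process_state(f.read()) == "Z":
                    pids.append(int(entry))
        except OSError:
            continue  # the process went away while we were scanning
    return sorted(pids)
```

A list from zombie_pids() that keeps growing between calls would match what we saw: init (PID 1) is not reaping the children it inherits.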

rw (rwichmann) wrote :

We could reproduce the problem on a machine that was mothballed after installing 14.04 on March 12, 2015.

1) The machine still booted fine.
Linux 3.13.0-48-generic #80-Ubuntu SMP Thu Mar 12 11:16:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

2) We updated the machine with all patches released between March 12 and today.

3) The machine still booted fine.
Linux 3.13.0-52-generic #86-Ubuntu SMP Mon May 4 04:32:59 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

4) We switched from NIS to LDAP authentication. (We had implemented that sometime after the machine was mothballed.)

5) LDAP works fine.

6) Machine HANGS at reboot, just like the other affected machines.

The conclusion would be that upstart gets stuck at boot because of something related to the switch to LDAP.
The changes were:

a) installed packages
libnss-ldap:amd64 (264-2.2ubuntu4.14.04.1),
auth-client-config:amd64 (0.9ubuntu1, automatic),
ldap-auth-config:amd64 (0.5.3, automatic),
libpam-ldap:amd64 (184-8.5ubuntu3),
ldap-auth-client:amd64 (0.5.3, automatic)

b) edited files:
/etc/ldap.conf
/etc/ldap/ldap.conf
/etc/nsswitch.conf (added lines "passwd_compat: ldap", "group_compat: ldap", "shadow_compat: ldap")

In order to confirm this, we booted from CD and commented out the lines added to /etc/nsswitch.conf, to make the machine use NIS instead of LDAP again. The machine booted without any problem.
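
The manual edit described above can be sketched as a small script, for anyone who needs to apply the same workaround to several machines. This is an illustration only: the function name is hypothetical, and it simply comments out every active entry whose source list contains "ldap".

```python
def comment_out_ldap(nsswitch_text):
    """Return nsswitch.conf text with active 'ldap' entries commented out."""
    out = []
    for line in nsswitch_text.splitlines(keepends=True):
        stripped = line.strip()
        if stripped and not stripped.startswith("#") and ":" in stripped:
            _database, sources = stripped.split(":", 1)
            # Ignore a trailing comment when inspecting the source list.
            sources = sources.split("#", 1)[0]
            if "ldap" in sources.split():
                out.append("#" + line)
                continue
        out.append(line)
    return "".join(out)
```

Applied to a file that contains the three lines listed under b), this yields "#passwd_compat: ldap", "#group_compat: ldap" and "#shadow_compat: ldap" and leaves all other lines untouched. Writing the result back to /etc/nsswitch.conf (from a rescue system, if the machine no longer boots) is left to the caller.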

At this stage, our guess is that some upstart script forces a lookup of a user ID before the network is up, such that the LDAP server can't be reached, and the init process gets stuck. For some reason, with NIS this gets handled better.
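
To get a feeling for how long a user lookup blocks when the LDAP server is unreachable, one could time a lookup on a running system with the network down. The sketch below is an assumption-laden illustration, not a reproduction of the boot-time hang: it shells out to getent (from libc-bin), which goes through the same NSS configuration, and the function name and default timeout are arbitrary.

```python
import subprocess
import time


def timed_lookup(name, timeout=5.0, command=("getent", "passwd")):
    """Run an NSS lookup for `name` and report how it ended and how long it took.

    Returns (status, seconds), where status is "ok", "failed" or "timeout".
    The `command` parameter exists only so that another program can be
    substituted when testing the function.
    """
    start = time.monotonic()
    try:
        result = subprocess.run(
            [*command, name],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,  # the child is killed when this expires
        )
        status = "ok" if result.returncode == 0 else "failed"
    except subprocess.TimeoutExpired:
        status = "timeout"
    return status, time.monotonic() - start
```

For example, timed_lookup("someuser") with the network cable unplugged would show whether the lookup returns promptly or stalls until the timeout. If it stalls, any boot job that resolves a user or group name before the network is up would presumably stall in the same way.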

Looking for a UID problem in the log files, we found only the following (from a successful boot with NIS, because no logs are written when LDAP is used and the boot fails). I have no idea whether it is related to the issue in any way.

May 12 13:27:38 machine_name NetworkManager[1195]: <warn> error requesting auth for org.freedesktop.NetworkManager.wifi.share.protected: (3) GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Could not get UID of name ':1.20': no such name
May 12 13:27:38 machine_name NetworkManager[1195]: <warn> error requesting auth for org.freedesktop.NetworkManager.wifi.share.open: (3) GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: GDBus.Error:org.freedesktop.DBus.Error.NameHasNoOwner: Could not get UID of name ':1.20': no such name

summary: - upstart hangs at boot
+ upstart hangs at boot when LDAP authentication is used
rw (rwichmann) wrote :

Output of initctl list after boot with NIS.

rw (rwichmann) wrote :

Output of dpkg --get-selections
