libnss-ldap causes boot hang on 12.04 precise, 14.04 trusty, 16.04 xenial

Bug #1024475 reported by ghomem on 2012-07-13
This bug affects 18 people
Affects: libnss-ldap (Ubuntu)
Importance: High
Assigned to: Unassigned

Bug Description

A configuration that works perfectly after setup prevents an Ubuntu 12.04 Precise client from booting.

Checks before rebooting:

1. winbind authentication is working (console login, xrdp, etc)
2. libnss-ldap name resolution is working (getent passwd)

(this is the intended setup)

After booting the default Grub option we see the machine hung without printing anything.

Booting in recovery mode allows us to see that the last printed message is:

Begin: Running /scripts/init-bottom ... done.

The problem IS related to libnss-ldap because if we boot via cdrom and change nsswitch.conf to use local authentication the machine boots again perfectly. We can then change it back to use local authentication + ldap (compat ldap) and verify that it works. However the system won't come up after rebooting.

Even though nss_initgroups_ignoreusers is set up correctly, there is probably some service that is trying to use LDAP before networking is available. The extra options (see below) intended to lower timeouts seem to have no effect.

Configuration details:

/etc/ldap.conf
-----------------------------------

base dc=DOMAIN,dc=COM
binddn uid=ldapuser,ou=users,dc=DOMAIN,dc=COM
bindpw XXXXXYYYYZZZZ
ldap_version 3
uri ldap://192.168.1.8
nss_initgroups_ignoreusers avahi,avahi-autoipd,backup,bin,colord,daemon,games,gnats,hplip,irc,kernoops,libuuid,lightdm,list,lp,mail,man,messagebus,news,ntp,proxy,pulse,root,rtkit,saned,speech-dispatcher,sshd,sync,sys,syslog,usbmux,uucp,whoopsie,www-data,xrdp

/etc/nsswitch.conf
-----------------------------------

passwd: compat ldap
group: compat ldap
shadow: compat ldap

hosts: files mdns4_minimal [NOTFOUND=return] dns mdns4 wins
networks: files

protocols: db files
services: db files
ethers: db files
rpc: db files

netgroup: nis

extra options tried on /etc/ldap.conf
-----------------------------------

timelimit 2
bind_timelimit 1
nss_reconnect_sleeptime 1
nss_reconnect_maxsleeptime 1
bind_policy soft

ghomem (gustavo) wrote :

Further debug indicates that this is a group resolution problem at boot time because

passwd: compat ldap
group: compat
shadow: compat ldap

boots perfectly.
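
To see at a glance which NSS databases are routed through LDAP (and so which ones could block early boot, as the group line does here), the file can be scanned with a short script. This is only a minimal sketch; the function name ldap_databases is made up for illustration and is not part of libnss-ldap.

```shell
# Hypothetical helper (not shipped with libnss-ldap):
# print the name of every nsswitch database whose source list includes "ldap".
ldap_databases() {
    awk '
        /^[[:space:]]*#/ { next }            # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i == "ldap") {
                    name = $1
                    sub(/:$/, "", name)      # "group:" -> "group"
                    print name
                    break
                }
            }
        }
    ' "$1"
}

# Example: ldap_databases /etc/nsswitch.conf
```

Removing ldap from one listed database at a time and rebooting should narrow down which lookup is the one that hangs.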

ghomem (gustavo) wrote :

Current workaround is changing the existing script /etc/init.d/libnss-ldap to include:

[...]

case "$1" in
        start)
                cp -f /etc/nsswitch.conf.ldap /etc/nsswitch.conf
                ;;
        stop)
                cp -f /etc/nsswitch.conf.local /etc/nsswitch.conf
                ;;

[...]

This will get you nss ldap working and no problems during reboot.

It's funny that this script is there to be used.
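
The two files the workaround copies are not shown in this report. A plausible reading is that /etc/nsswitch.conf.ldap is the working configuration with ldap, and /etc/nsswitch.conf.local is the same file with ldap removed from the passwd, group and shadow lines. Under that assumption, both could be generated from the current file roughly like this (make_nsswitch_variants is a hypothetical name):

```shell
# Hypothetical helper: make_nsswitch_variants SRC DEST_DIR
# Assumes SRC already lists "ldap" for passwd, group and shadow.
make_nsswitch_variants() {
    src=$1
    dest=$2
    # The .ldap variant is the current file, unchanged.
    cp -f "$src" "$dest/nsswitch.conf.ldap"
    # The .local variant drops the word "ldap" from the passwd, group
    # and shadow lines only; every other line is copied as-is.
    sed -E '/^(passwd|group|shadow):/ s/[[:space:]]+ldap([[:space:]]|$)/\1/g' \
        "$src" > "$dest/nsswitch.conf.local"
}

# Example (as root): make_nsswitch_variants /etc/nsswitch.conf /etc
```

It is worth inspecting both generated files before rebooting, since a wrong nsswitch.conf can lock out logins.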

Clint Byrum (clint-fewbar) wrote :

Hi gustavo, thanks for taking the time to file this bug report.

Network interfaces have two modes. One is /etc/network/interfaces, and the other is NetworkManager. Which are you using? The former will bring up any interfaces as soon as udev detects them. It will also do a blanket bring up of any interfaces that aren't udev detected (like bridges and bonded interfaces) after udevtrigger finishes.

The latter starts later, and is more dynamic about interfaces.

Anyway, can you try booting with '--verbose noquiet' added to the kernel command line so we can see the messages being printed out. This should give an idea of what upstart jobs and services are being started that might be blocking the rest of the boot.

Changed in libnss-ldap (Ubuntu):
status: New → Incomplete
importance: Undecided → High
ghomem (gustavo) wrote :

Hello Clint,

We are using the default for networking, ie, Network Manager.

Booting with --verbose noquiet results in only these two lines:

Loading configuration from /etc/init.conf
Loading configuration from /etc/init

being printed.

The system hangs right there.

Robie Basak (racb) wrote :

I'm a bit confused.

If you're using Network Manager, then you can only expect to get a working network after login, right? But if you want to use LDAP to authenticate the login, how is this going to work before the network is brought up?

Changed in libnss-ldap (Ubuntu):
status: Incomplete → New
ghomem (gustavo) wrote :

Hi,

We are using Ubuntu's default network configuration; nothing was done manually in /etc/network/interfaces.

It does bring the network up before login and works perfectly with nss_winbind.

So the problem is likely the integration of nss_ldap on Ubuntu and not the network configuration.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in libnss-ldap (Ubuntu):
status: New → Confirmed
Craig Yoshioka (craigyk) wrote :

seriously? 6 months and this isn't fixed? I just got bit by this after upgrading and rebooting my LDAP server! Imagine the panic that set in at first.

Eric Martel (shrodi+launchpad) wrote :

Got bit by this, too; however, in my case it's adding "ldap" to the hosts line that got me into trouble.

jcat (jcat-l) wrote :

This is still an issue on 14.04 LTS.

This was fixed ages ago with this change:

libnss-ldap (251-5.2) unstable; urgency=high

  * Change the init script policy. Instead of stopping libnss-ldap.init on
    clean shutdown (touching a file) and starting it after networking (rm-ing
    it), we touch the file in /lib/init/rw as soon as possible (right before
    udev is started, touching a file) and stop it after initial system bootup.
    This fixes both issues with /var being on a separate partition, and
    unclean shutdown where the file would not be created. (To make sure we
    don't get similar problems during shutdown, we create it in runlevels 0
    and 6 as before, but we don't assume it's still there when we boot, since
    it's on a tmpfs now.) (Closes: #375077)

..but at some point got removed with this change:

libnss-ldap (259-1) unstable; urgency=low

  * Remove old kluge /etc/init.d/libnss-ldap

Not totally sure what was supposed to be replacing that "kluge", maybe it was the "nss_initgroups_ignoreusers" thing, but it's not working currently, that's for sure.
Boot time is well over 2 minutes at the moment, versus about 5 seconds with the ldap entry removed for groups in nsswitch.conf.

Someone must have some ideas for this.

Cheers,
jcat

husein (vonprichel) wrote :

Hello, I have a similar problem. I'm booting fat clients (LTSP). After configuring the LDAP client (installing libnss-ldap) and adding the parameter "ldap" to nsswitch.conf, the fat client doesn't boot! If I remove "ldap", everything works fine, but this is no solution for me!

By the way, the same configuration works for my thin clients!

I hope you can help!

Ben Brown (ben-generik) wrote :

Yep, sad to say this is still happening, 2 years later. Brand new install of 14.04 LTS goes into high CPU usage during boot and you never make it to a usable login prompt with LDAP enabled. Tried setting the LDAP URI to localhost, LAN IP and its FQDN, no effect.

Luckily, the workaround posted in #2 worked for me and I'm up and running... at least until the script is replaced in a future update.

Kludgy perhaps, but better than nothing.

husein (vonprichel) wrote :

The workaround in #2 isn't working for me, but I'm not sure if I did it right:

------------------------------------------------------------------------------------
#! /bin/sh -e

### BEGIN INIT INFO
# Provides: libnss-ldap
# Required-Start:
# Required-Stop: mountall.sh
# Default-Start:
# Default-Stop: 0 1 6
# Short-Description: Updates /etc/ldap.conf
# Description: Updates nss_initgroups_ignoreusers based on
# nss_initgroups_minimum_uid
### END INIT INFO

PATH="/sbin:/bin:/usr/sbin:/usr/bin"
. /lib/lsb/init-functions

#case "$1" in
# start)
# ;;
# restart|force-reload|stop)
# log_action_begin_msg "Running nssldap-update-ignoreusers"
# if nssldap-update-ignoreusers ; then
# log_action_end_msg 0
# else
# log_action_end_msg 1
# exit 1
# fi
# ;;
# *)
# echo "Usage: $0 {start|restart|force-reload|stop}"
# exit 1
# ;;

case "$1" in
        start)
                cp -f /etc/nsswitch.conf.ldap /etc/nsswitch.conf
                ;;
        stop)
                cp -f /etc/nsswitch.conf.local /etc/nsswitch.conf
                ;;

esac
exit 0
------------------------------------------------------------------------------------

Right? Please correct me!

Graham Eames (graham-eames) wrote :

Having recently run into this and spent some time debugging, it would appear that one potential cause is having edited /etc/ldap.conf so that it lacks a newline at the end of the file (in our case, by copying a standard config in as part of a deployment script).

When the libnss-ldap init script runs on shutdown, it updates ldap.conf with the nss_initgroups_ignoreusers directive, but appends it to the end of the last line in the file rather than putting it on a new line.

As such, the last line, which did read

ldap_version 3

ended up reading

ldap_version 3nss_initgroups_ignoreusers.........

rather than

ldap_version 3
nss_initgroups_ignoreusers..........

By ensuring we have a newline character at the end of the config file, when the init script runs we end up with the correct content in the file, and the boot completes as expected rather than hanging.
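
The newline check described above can be scripted so that a deployment tool applies it after copying the file. This is a minimal sketch; the function name ensure_trailing_newline is an illustrative choice, not something shipped with the package.

```shell
# Hypothetical helper: append a newline to FILE only if its last byte
# is not already a newline. Safe to run repeatedly.
ensure_trailing_newline() {
    file=$1
    # An empty file needs no fix. For a non-empty file, $(tail -c 1) is
    # empty when the last byte is "\n", because command substitution
    # strips trailing newlines.
    if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
        printf '\n' >> "$file"
    fi
}

# Example (as root): ensure_trailing_newline /etc/ldap.conf
```

If the directive has already been glued onto the last line (as in the "ldap_version 3nss_initgroups_ignoreusers" case), that line still has to be split by hand; the helper only prevents the problem from recurring.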

aaron jonen (ajonen) wrote :

I also tried fix #2. I had to create my own files for:

      /etc/nsswitch.conf.ldap
      /etc/nsswitch.conf.local

I assume the .ldap file contains ldap and the .local file does not.

The fix did not work for me.
Please post the above files and the full working script for /etc/init.d/libnss-ldap.

Till then I just won't reboot. :)

jcat (jcat-l) wrote :

At some point, this started working again after a dist-upgrade, so nss_initgroups_ignoreusers is working for me.

The only thing I've done on top of that, for housekeeping purposes, is to modify the init script to remove the ignoreusers line on system start (the configs are controlled via Puppet, and the file will keep changing otherwise).

For anyone else with a similar issue, I'm using this:

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
#! /bin/sh -e

### BEGIN INIT INFO
# Provides: libnss-ldap
# Required-Start:
# Required-Stop: mountall.sh
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Updates /etc/ldap.conf
# Description: Updates nss_initgroups_ignoreusers based on
# nss_initgroups_minimum_uid
### END INIT INFO

PATH="/sbin:/bin:/usr/sbin:/usr/bin"
. /lib/lsb/init-functions

case "$1" in
 start)
  log_action_begin_msg "Removing nssldap-update-ignoreusers changes"
  if sed -i "/^nss_initgroups_ignoreusers/d" /etc/ldap.conf ; then
   log_action_end_msg 0
  else
   log_action_end_msg 1
   exit 1
  fi
  ;;
 restart|force-reload|stop)
  log_action_begin_msg "Running nssldap-update-ignoreusers"
  if nssldap-update-ignoreusers ; then
   log_action_end_msg 0
  else
   log_action_end_msg 1
   exit 1
  fi
  ;;
 *)
  echo "Usage: $0 {start|restart|force-reload|stop}"
  exit 1
 ;;
esac
exit 0
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Anaxim (anaxim) wrote :

Hi,

I lost more than 4 hours on this issue and found a solution. I hope it could help others.

Got the issue on a brand new 14.04.1 installation. Modification of the libnss-ldap (#2) script didn't solve the issue.

I found a solution on the following page: http://backdrift.org/how-to-get-pam-ldap-local-logins-to-work-when-networking-is-down and changed the /etc/ldap.conf file, adding the following lines before nss_initgroups_ignoreusers:
nss_reconnect_tries 2
nss_reconnect_sleeptime 1
nss_reconnect_maxsleeptime 1
nss_reconnect_maxconntries 1

After this modification the system was able to boot normally and I could directly log in with ldap user.
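
For anyone applying these settings from a script, a small helper can set each option without creating duplicate lines. This is only a sketch: set_ldap_option is a hypothetical name, it relies on GNU sed's -i option, and it appends new options at the end of the file on the assumption that their position relative to nss_initgroups_ignoreusers does not matter (the comment above adds them before that line).

```shell
# Hypothetical helper: set_ldap_option FILE KEY VALUE
# Replaces an existing "KEY ..." line, or appends "KEY VALUE" on its own line.
set_ldap_option() {
    file=$1
    key=$2
    value=$3
    if grep -q "^${key}[[:space:]]" "$file"; then
        # Option already present: rewrite that line in place (GNU sed).
        sed -i "s|^${key}[[:space:]].*|${key} ${value}|" "$file"
    else
        # Make sure the previous last line is terminated before appending.
        if [ -s "$file" ] && [ -n "$(tail -c 1 "$file")" ]; then
            printf '\n' >> "$file"
        fi
        printf '%s %s\n' "$key" "$value" >> "$file"
    fi
}

# Example (as root), using the values from the comment above:
#   set_ldap_option /etc/ldap.conf nss_reconnect_tries 2
#   set_ldap_option /etc/ldap.conf nss_reconnect_sleeptime 1
#   set_ldap_option /etc/ldap.conf nss_reconnect_maxsleeptime 1
#   set_ldap_option /etc/ldap.conf nss_reconnect_maxconntries 1
```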

Giuseppe Attardi (attardi-h) wrote :

I had a similar problem after a release upgrade from 12.04.3 to 14.04.1.

After the upgrade, only local users could log in and any command was very slooow.
In particular commands like add-user or add-group would never finish.

Removing mentions of ldap in nsswitch.conf seemed to solve the problem.

In the end I discovered that the apparent cause was removing this line from nsswitch:

passwd_compat: ldap

and using instead

passwd: compat ldap

After this change, everything worked normally.

Lumen (lumineszenz) wrote :

#1 worked for me, but now I have to set the groups manually which kind of defeats the purpose (fortunately they don't change much).
#17 did not work.
I also tested libpam-ldap on an Ubuntu server (static IP, no network-manager) without having any problems,
so the timing of the network connection coming up could be the source of the problem.
This bug was reported 1 1/2 years ago and it is still broken, not even assigned to anyone... so abysmal.

Claudio Kuenzler (napsty) wrote :

I ran into this issue today and I'm very surprised this issue is that old. This bug makes Ubuntu as an LDAP client unusable.

#17 did help for faster booting of the machine, but that's it. Any command is still so slow, the machine is useless.

#2 tried it like this:
- Adapted init script /etc/init.d/libnss-ldap like mentioned
- created /etc/nsswitch.conf.local which does NOT contain ldap
- created /etc/nsswitch.conf.ldap which does contain ldap
This again works for a very fast boot, but once libnss-ldap is started (and therefore the ldap version of nsswitch.conf becomes active), the system is bloody slow again.

#1 works to regain a "normal" system with local logins but ldap logins still time out.

Björn K. (isolusx) wrote :

I ran into this issue several months ago. I tried every workaround I could find, but nothing worked.
The Debian documentation already warns that libnss-ldap has been orphaned. This means that it does not have a real maintainer at the moment.
It suggests using the alternative package libnss-ldapd. I tried it and all problems are gone now.

Eric (8-eric-i) wrote :

Dang! This is not a little thing! How does one get some attention on this issue? What does it take to get this bug fixed? Anyone?
This has cost me, and is still costing me, countless hours.

Eric (8-eric-i) wrote :

OK, so it appears that libnss-ldapd also works for me. (looks more like RH anyway).

Thomas Werschlein (werschlein) wrote :

Solution #14 did work for us on Ubuntu 14.04 LTS. We were copying over /etc/ldap.conf via config mgmt for years now. Pretty big impact for a missing newline at the end of a file ... Thanks a lot for pointing out, Graham!

kay (kay-diam) wrote :

libnss-ldapd doesn't work if the user doesn't exist in the system.
Once the user is created, ssh authentication works fine.

Here are ssh debug logs for: libnss-ldap, libnss-ldapd without the user, and libnss-ldapd with the user

kay (kay-diam) wrote :

nss_initgroups_ignoreusers with users with id < 500 works for me.
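
One way to build such a list is sketched below. It assumes that "id < 500" refers to the numeric UID field in /etc/passwd; the function name system_users_below is made up for illustration.

```shell
# Hypothetical helper: system_users_below PASSWD_FILE LIMIT
# Prints a sorted, comma-separated list of account names whose UID is below LIMIT.
system_users_below() {
    awk -F: -v limit="$2" '$3 < limit { print $1 }' "$1" | sort | paste -s -d, -
}

# Example: print a directive line for review before putting it in /etc/ldap.conf
#   printf 'nss_initgroups_ignoreusers %s\n' "$(system_users_below /etc/passwd 500)"
```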

Jack O'Quin (jack-oquin) wrote :

I recently experienced this problem on Trusty, and tried several of the recommended work-arounds, without success until I tried switching to libnss-ldapd in place of libnss-ldap.

That worked. Now I can successfully reboot my LDAP clients.

Amazing that something so fundamental is still broken several years later.

Richard Hansen (rhansen) on 2015-08-10
summary: - libnss-ldap causes boot hang on Ubuntu 12.04 Precise
+ libnss-ldap causes boot hang on 12.04 precise, 14.04 trusty
tags: added: precise trusty

Had the same problem, boot did hang for approximately 100s, bootchart etc... showed "interesting" processes like hostname hanging.

Replacing libnss-ldap with libnss-ldapd finally fixed the problem.

kay (kay-diam) wrote :

This stuff should be fixed by this init:

/etc/init.d/libnss-ldap

it runs special "/usr/sbin/nssldap-update-ignoreusers" script only on reboot.

This script adds exceptions for system users. Hangs appear when your system was not rebooted correctly or went down through power loss. I recommend adding this script to a cron job.

kay (kay-diam) wrote :

s/only on reboot/only on correct reboot or shutdown/

gpothier (gpothier) wrote :

I also confirm that replacing libnss-ldap with libnss-ldapd fixes the problem. Regarding comment #25, either this is a bug that has already been fixed, or it is a configuration issue: in my case I have no problem logging in with users that are not created locally.

Karlheinz Salver (ksalver) wrote :

On my 14.04.4, I hit the same problem after installing the metapackage ldap-auth-client, because the metapackage pulls in libnss-ldap (auth-client-config, ldap-auth-config, libnss-ldap and libpam-ldap).
I will try to install libnss-ldapd.

The package libnss-ldap is old and contains
    the symlink libnss_ldap.so.2 -> libnss_ldap-2.15.so (in /lib/i386-linux-gnu/)
    libnss_ldap-2.15.so is dated 2012-07-18

The package libnss-ldapd is newer and contains
    libnss_ldap.so.2 (in /lib/i386-linux-gnu/); no symlink
    libnss_ldap.so.2 is dated 2014-07-29

I think it is advisable to install libnss-ldapd immediately after the metapackage ldap-auth-client.

kay (kay-diam) wrote :

k-salver, it is not necessary. Please read my comment above: https://bugs.launchpad.net/ubuntu/+source/libnss-ldap/+bug/1024475/comments/32

Karlheinz Salver (ksalver) wrote :

I've also posted the problem about the metapackage ldap-auth-client:
      https://bugs.launchpad.net/ubuntu/+source/ldap-auth-client/+bug/1593073
@kay
 I'll also try your proposal.

Karlheinz Salver (ksalver) wrote :

the solution #18 did not work

Karlheinz Salver (ksalver) wrote :

For me, solution #16 together with replacing libnss_ldap.so.2 with the one from the package libnss-ldapd worked fine. My machine reboots!

ben-Nabiy Derush (bennabiy) wrote :

In 16.04.1, replacing libnss-ldap with libnss-ldapd worked to get the machine to boot, and show me my ldap users at login screen, but when I go to log in, it just cycles back to login immediately. I can still log in from the tty, but not graphical.

summary: - libnss-ldap causes boot hang on 12.04 precise, 14.04 trusty
+ libnss-ldap causes boot hang on 12.04 precise, 14.04 trusty, 16.04 xenial
tags: added: xenial
Jonathan Gutow (gutow) wrote :

I may have encountered a related problem. My solution may also solve this problem. See Bug #1691826

xennex82 (xennex82) wrote :

I just submitted

#1739833

The postinst script for libnss-ldap invokes "invoke-rc.d" but invoke-rc.d will not do anything unless the Default-Start runlevel of /etc/init.d/libnss-ldap contains runlevel 5.

Because /etc/init.d/libnss-ldap comes without any default runlevels, the action does nothing and the service is not started, causing the service to also not be stopped, and nssldap-update-ignoreusers to not be run on reboot.

This in turn prevents the system users from being added to the ignore list which then causes the boot to fail because it tries to source them from LDAP.
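
Whether a given installation is affected by the empty runlevel list can be checked by reading the LSB header of the init script. A minimal sketch (default_start_levels is a hypothetical helper name):

```shell
# Hypothetical helper: print the runlevels listed on the "# Default-Start:"
# line of an LSB init script. Empty output means no start runlevels are declared.
default_start_levels() {
    sed -n 's/^# Default-Start:[[:space:]]*//p' "$1"
}

# Example:
#   levels=$(default_start_levels /etc/init.d/libnss-ldap)
#   [ -n "$levels" ] || echo "libnss-ldap declares no start runlevels"
```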

This is probably of particular relevance to groups, but I don't remember, this is long ago for me.

(And it still hasn't been fixed, even though I have been sending emails about this too to devel-discuss).

The ostensibly newer package libnss-ldapd contains the option "ALLLOCAL" which generates this "exclusion list" automatically on boot.

xennex82 (xennex82) wrote :

Okay I don't know how to link to bugs. I guess it wants the specific short form URL:

https://bugs.launchpad.net/bugs/1739833

I have upgraded a 16.04 system which worked/booted perfectly with libnss-ldap being used by nsswitch for passwd, shadow and groups to 17.10. The system took a long time to boot, could not bring up networking properly (running dhclient in 90 second intervals, possibly a timeout) and could not start systemd-logind.

After using nss_initgroups_ignoreusers as stated by Graham Eames in #14 and adding a new line as suggested by Thomas Werschlein in #24, the system started behaving normally again.

You can use the following commands, which I stole from stackexchange, to populate the nss_initgroups_ignoreusers parameter automatically:
#NSS_IGNOREUSERS="$(cut -d: -f1 /etc/passwd | sort | tr '\n' ',' | sed 's|,$||')"
#sed -i "s|^nss_initgroups_ignoreusers.*|nss_initgroups_ignoreusers ${NSS_IGNOREUSERS}|" /etc/ldap.conf

However you will have to add a new line afterwards!
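
Note that the sed command above only rewrites an existing nss_initgroups_ignoreusers line; if the file has none, nothing changes. A sketch that covers both cases and also takes care of the trailing newline might look like this. The function name update_ignoreusers is hypothetical, the file paths are passed in as arguments, and sed -i is assumed to be GNU sed.

```shell
# Hypothetical helper: update_ignoreusers PASSWD_FILE LDAP_CONF
# Puts every account name from PASSWD_FILE into nss_initgroups_ignoreusers.
update_ignoreusers() {
    passwd_file=$1
    ldap_conf=$2
    # Sorted, comma-separated list of all local account names.
    users=$(cut -d: -f1 "$passwd_file" | sort | tr '\n' ',' | sed 's/,$//')
    if grep -q '^nss_initgroups_ignoreusers' "$ldap_conf"; then
        # Directive exists: replace its value in place.
        sed -i "s|^nss_initgroups_ignoreusers.*|nss_initgroups_ignoreusers ${users}|" "$ldap_conf"
    else
        # Directive missing: terminate the last line if needed, then append.
        if [ -s "$ldap_conf" ] && [ -n "$(tail -c 1 "$ldap_conf")" ]; then
            printf '\n' >> "$ldap_conf"
        fi
        printf 'nss_initgroups_ignoreusers %s\n' "$users" >> "$ldap_conf"
    fi
}

# Example (as root): update_ignoreusers /etc/passwd /etc/ldap.conf
```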

In short: This issue affects 17.10, too.

Suggestion: libnss-ldap should have a parameter that makes it check the passwd/group files and automatically treat the names found there as the nss_initgroups_ignoreusers list. This should also be the default configuration, since systemd is the default as well.
