concurrent nss_updatedb processes corrupt database

Bug #250806 reported by Heiko
Affects | Status | Importance | Assigned to | Milestone
nss-updatedb (Ubuntu) | New | Undecided | Unassigned |

Bug Description

Binary package hint: nss-updatedb

### Ubuntu version
bash# lsb_release -rd
Description: Ubuntu 8.04.1
Release: 8.04

### Package versions
bash# apt-cache show nss-updatedb | grep '^Version\|^Depends'
Version: 8-2ubuntu1
Depends: libc6 (>= 2.7-1), libdb4.6

bash# apt-cache show libdb4.6 | grep ^Version
Version: 4.6.21-6ubuntu1

The problem:
When more than one instance of nss_updatedb is running, database corruption occurs easily. It happened 2 out of 2 times when I ran 4 instances. In the single test I ran with only 2 instances, it did not happen.

Rationale:
Though running multiple instances may seem to make no sense in the first place, in my case it is not unlikely to happen. I'd like to cache a large number of users and groups (9000+ at the moment, during testing; up to 40000 later, in production), so I would rather not update too often, to keep down the load on the LDAP server. Also, accounts are mostly added once a day, during the night. So I thought it was a good idea to run nss_updatedb from cron.daily as well as once at boot time. This way, two instances of nss_updatedb may end up running at the same time.
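For a setup like the one described above (one run from cron.daily, one at boot), overlapping runs could in principle be avoided by taking an exclusive lock around each invocation. The following is only a sketch of that idea and has not been tested against this bug: it assumes flock(1) from util-linux is installed, and both the `run_locked` function name and the lock file path are hypothetical, not something nss-updatedb provides. It only serializes callers that go through the wrapper and does not fix the underlying problem.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command while holding an exclusive lock, so
# that a cron.daily run and a boot-time run cannot overlap.
# LOCKFILE is an assumed path; nss-updatedb itself does not define it.
LOCKFILE="${LOCKFILE:-/var/lock/nss_updatedb.lock}"

run_locked() {
    # -n: fail immediately (non-zero exit) if another process already
    # holds the lock, instead of waiting for it to be released.
    flock -n "$LOCKFILE" "$@"
}

# Intended use from both the cron job and the boot script:
#   run_locked nss_updatedb ldap
```

With `-n`, a second invocation that finds the lock held simply exits with a non-zero status rather than queueing. Dropping `-n` would make it wait for the first run to finish, at the cost of a second full fetch from the LDAP server.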

Also, the bug *may* be in the libdb(?) locking mechanisms. In that case the bug could be quite important. I do not know enough about this to tell whether that is the case, but I assume it is not.

To reproduce:
- Have an LDAP server reachable with more than just a few posixAccount and posixGroup objects.
- Install the package libnss-ldap and configure it by editing /etc/ldap.conf.
- Install the package nss-updatedb (obviously).
- Run 3+ instances of nss_updatedb (as root):
      bash# nss_updatedb ldap & nss_updatedb ldap & nss_updatedb ldap & nss_updatedb ldap
- This prints some error messages, as one might expect. The bad part is that the database is now effectively broken: running nss_updatedb normally (i.e. a single instance) can no longer use or update the database:

      bash# nss_updatedb ldap
DB_LOGC->get: LSN 325/232777: invalid log record header
Log file corrupt at LSN: [325][232696]
PANIC: Invalid argument
unable to join the environment
Failed to open backing cache in /var/lib/misc/passwd.db: Unknown error 4294936321
passwd... nameservice unavailable.

If it does not happen the first time you try to reproduce it, please don't conclude too early that nothing is wrong. This may depend on the number of objects to fetch (9000+ users plus 9000+ groups in my setup), the hardware (including that of the LDAP server), the number of concurrent instances trying to run, etc.
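Because the result can vary with the number of instances, the reproduction step can be wrapped in a small helper that starts N copies at once and counts how many fail. This is only a sketch: `run_concurrent` is a hypothetical name, not part of nss-updatedb, and running it with nss_updatedb as root may corrupt /var/lib/misc/passwd.db as described above, so use it only on a test machine.

```shell
#!/bin/sh
# Hypothetical helper: start N copies of a command at the same time and
# report how many of them exited with a non-zero status.
run_concurrent() {
    n="$1"; shift
    pids=""
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" &                  # start one instance in the background
        pids="$pids $!"         # remember its PID
        i=$((i + 1))
    done
    failed=0
    for pid in $pids; do
        # wait returns the exit status of that particular instance
        wait "$pid" || failed=$((failed + 1))
    done
    echo "$failed of $n instances failed"
}

# Example (as root, on a test machine only):
#   run_concurrent 4 nss_updatedb ldap
```

Varying the first argument makes it easier to compare runs with 2, 3 and 4 instances under otherwise identical conditions.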

regards, Heiko

Heiko (h-e-noordhof)
description: updated