slapd/slapcat hang in endless loops

Bug #15270 reported by Debian Bug Importer
Affects               Status        Importance  Assigned to  Milestone
openldap2.2 (Debian)  Fix Released  Unknown
openldap2.2 (Ubuntu)  Fix Released  High        Adam Conrad

Bug Description

Automatically imported from Debian bug report #255276 http://bugs.debian.org/255276

Revision history for this message
In , Roland Bauerschmidt (roland-hbg-bremen) wrote : Re: [debian-openldap] Bug#255276: slapd/slapcat hang in endless loops

Hadmut Danisch wrote:
> since I've upgraded my machine to 2.6.7 slapd
> (and even slapcat) hang. When tracing with strace,
> they are caught in an endless loop of
>
> sched_yield() = 0

Does running db4.2_recover in your database directory get your slapd
working again?

Roland

Revision history for this message
In , Hadmut Danisch (hadmut) wrote :

Roland Bauerschmidt wrote:

>
>Does running db4.2_recover in your database directory get your slapd
>working again?
>
>
>

It seems so. At least slapcat was working again after db4.2_recover.
slapd started again and seems to be working properly. But it'll take
some time of use to be sure. Nevertheless, good hint. Thanks! :-)

regards
Hadmut

Revision history for this message
In , Stephen Frost (sfrost) wrote :

* Hadmut Danisch (<email address hidden>) wrote:
> since I've upgraded my machine to 2.6.7 slapd
> (and even slapcat) hang. When tracing with strace,
> they are caught in an endless loop of

Others have complained of similar problems. If running db*_recover
doesn't fix the problem then it may be the case that the new TLS-enabled
glibc is the problem. Try setting LD_ASSUME_KERNEL=2.4.1 prior to
starting slapd. That should cause it to link against the old
non-TLS-enable version of glibc. (NOTE: TLS in this case is
'thread-local storage', nothing to do with 'transport-layer security').

 Stephen

Revision history for this message
In , Marcus Better (marcus-better-abc) wrote : slapd: Similar problem

Package: slapd
Version: 2.1.30-3
Followup-For: Bug #255276

I am using LDAP for authentication with libnss_ldap. I also use
webmin-ldap-useradmin 1.160-1 for managing the LDAP database.

I am trying to add Samba login information to a user account using
webmin, but when I set "Samba login?" to "Yes" in the webmin Edit User
screen and press Save, I get
  "Failed to save user : Failed to modify user in LDAP database : I/O
  Error"
and subsequent attempts to access the LDAP database through webmin fail with
  "Failed to connect to LDAP server 127.0.0.1 port 389."

Moreover the LDAP server stops accepting connections:

~$ ldapsearch
ldap_sasl_interactive_bind_s: Can't contact LDAP server (81)
~$ ldapsearch -W -x
Enter LDAP Password:
ldap_bind: Can't contact LDAP server (81)

As a result, it is no longer possible to log in to the system.

Listing the database with slapcat works sometimes, but I have also
experienced that slapcat just hangs, giving no output at all.

Restarting slapd does not help, but if I run db4.2_recover and then
restart slapd, it works normally again.

The LDAP database is very small, containing only 3-4 user entries.

I tried putting the line
  export LD_ASSUME_KERNEL=2.4.1
in /etc/default/slapd and restarting slapd, but it did not help.

-- System Information:
Debian Release: 3.1
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.8custom
Locale: LANG=sv_SE.UTF-8, LC_CTYPE=sv_SE.UTF-8

Versions of packages slapd depends on:
ii coreutils [fileutils] 5.2.1-2 The GNU core utilities
ii debconf 1.4.30.5 Debian configuration management sy
ii libc6 2.3.2.ds1-16 GNU C Library: Shared libraries an
pi libdb4.2 4.2.52-17 Berkeley v4.2 Database Libraries [
ii libgcrypt11 1.2.0-4 LGPL Crypto library - runtime libr
ii libgnutls11 1.0.16-7 GNU TLS library - runtime library
ii libgpg-error0 1.0-1 library for common error values an
ii libiodbc2 3.51.2-5 iODBC Driver Manager
ii libldap2 2.1.30-3 OpenLDAP libraries
ii libltdl3 1.5.6-2 A system independent dlopen wrappe
ii libsasl2 2.1.19-1.1 Authentication abstraction library
ii libslp1 1.0.11-7 OpenSLP libraries
ii libwrap0 7.6.dbs-6 Wietse Venema's TCP wrappers libra
ii perl [libmime-base64-perl] 5.8.4-2.2 Larry Wall's Practical Extraction
ii psmisc 21.5-1 Utilities that use the proc filesy
ii zlib1g 1:1.2.1.1-7 compression library - runtime

-- debconf information:
  slapd/fix_directory: true
* shared/organization: Better Home
  slapd/upgrade_slapcat_failure:
  slapd/backend: BDB
* slapd/allow_ldap_v2: false
  slapd/no_configuration: false
  slapd/move_old_database: true
  slapd/suffix_change: false
  slapd/slave_databases_require_updateref:
  slapd/autoconf_modules: true
* slapd/domain: home.better.se
  slapd/password_mismatch:
  slapd/invalid_config: true
  slapd/upgrade_slap...


Revision history for this message
In , Olivier Berger (olivierberger) wrote : slapd: Same here : slapcat hangs, and db4.2_verify too

Package: slapd
Version: 2.1.30-3
Followup-For: Bug #255276

Hi. I wanted to report the same thing here.

slapcat is also stuck in an endless loop on sched_yield()

And db4.2_verify gets stuck too.

straceing db4.2_verify gives endless messages like :
select(0, NULL, NULL, NULL, {0, 25000}) = 0 (Timeout)

db4.2_recover was successful though...

Probably a bug in db4.2_verify too ?

After the recover, slapcat works again as expected...

There is a problem somewhere still...

Hope this helps.

-- System Information:
Debian Release: 3.1
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Kernel: Linux 2.6.8-1-686-smp
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages slapd depends on:
ii coreutils [fileutils] 5.2.1-2 The GNU core utilities
ii debconf 1.4.30.11 Debian configuration management sy
ii fileutils 5.2.1-2 The GNU file management utilities
ii libc6 2.3.2.ds1-18 GNU C Library: Shared libraries an
ii libdb4.2 4.2.52-17 Berkeley v4.2 Database Libraries [
ii libgcrypt11 1.2.0-4 LGPL Crypto library - runtime libr
ii libgnutls11 1.0.16-9 GNU TLS library - runtime library
ii libgpg-error0 1.0-1 library for common error values an
ii libiodbc2 3.52.1-2 iODBC Driver Manager
ii libldap2 2.1.30-3 OpenLDAP libraries
ii libltdl3 1.5.6-3 A system independent dlopen wrappe
ii libsasl2 2.1.19-1.5 Authentication abstraction library
ii libslp1 1.0.11-7 OpenSLP libraries
ii libwrap0 7.6.dbs-6 Wietse Venema's TCP wrappers libra
ii perl [libmime-base64-perl] 5.8.4-3 Larry Wall's Practical Extraction
ii psmisc 21.5-1 Utilities that use the proc filesy
ii zlib1g 1:1.2.2-3 compression library - runtime

-- debconf information:
* slapd/password2: (password omitted)
  slapd/internal/adminpw: (password omitted)
* slapd/password1: (password omitted)
  slapd/password_mismatch:
  slapd/fix_directory: true
  slapd/invalid_config: true
* shared/organization: nodomain
  slapd/upgrade_slapcat_failure:
  slapd/upgrade_slapadd_failure:
  slapd/backend: BDB
* slapd/allow_ldap_v2: false
  slapd/no_configuration: false
  slapd/move_old_database: true
  slapd/suffix_change: false
  slapd/slave_databases_require_updateref:
  slapd/autoconf_modules: true
  slapd/purge_database: false
  slapd/admin:
* slapd/domain: nodomain

Revision history for this message
In , Torsten Landschoff (torsten) wrote : Re: Bug#255276: slapd: Same here : slapcat hangs, and db4.2_verify too

Hi Olivier,

On Wed, Jan 05, 2005 at 10:49:17AM +0100, Olivier Berger wrote:

> slapcat is also stuck in an endless loop on sched_yield()
> And db4.2_verify gets stuck too.
>
> straceing db4.2_verify gives endless messages like :
> select(0, NULL, NULL, NULL, {0, 25000}) = 0 (Timeout)
>
> db4.2_recover was successful though...

Always the same question: Still applicable? I was never able to
reproduce these hefty problems...

Greetings

 Torsten

Revision history for this message
In , Torsten Landschoff (t-landschoff) wrote : tagging 255276

tags 255276 unreproducible

Revision history for this message
In , Matthew Hawkins (matthew-intology) wrote : sched_yield() loop

Hi Torsten,

My ldap box suffered two power outages within 24 hours this week (one
caused by the power company, the other by a well-intentioned but unclued
cow-orker). On inspection slapd was stuck in a sched_yield() loop.
Doing the db4.2_recover fixed it (I actually came across that hint
elsewhere, I'll try to remember to go to b.d.o first next time ;)

If there is no note, perhaps add one to the README.Debian that if slapd
appears to be consuming all cpu power and this is unexpected, and strace
shows it's stuck in sched_yield(), then the bdb database is probably
corrupt and requires recovery using db4.2_recover.

Then you can close this bug and any others like it ;)

Cheers,

--
Matt
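
The symptom described above (strace showing little besides sched_yield())
can be checked mechanically. What follows is only a minimal sketch: the
function name is_yield_spin and the 90% threshold are illustrative
assumptions and are not part of the slapd package.

```shell
# Hypothetical helper: succeed if nearly all traced syscalls are sched_yield().
# Reads strace output on stdin. The 90% threshold is an arbitrary assumption.
is_yield_spin() {
    awk '
        /sched_yield\(\)/ { yields++ }   # count lines showing the yield call
        NF                { total++ }    # count every non-empty line
        END { exit !(total > 0 && yields * 10 >= total * 9) }
    '
}
```

One possible use is to pipe in a few hundred lines of strace output
captured from the suspect slapd process. A positive result only suggests
that running db4.2_recover is worth trying; it does not prove corruption.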

Revision history for this message
In , Torsten Landschoff (t-landschoff) wrote : merging 302992 255276

merge 302992 255276

Revision history for this message
In , Torsten Landschoff (t-landschoff) wrote : merging 303057 302992

merge 303057 302992

Revision history for this message
In , Torsten Landschoff (t-landschoff) wrote : severity of 303057 is serious

severity 303057 serious

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-Id: <email address hidden>
Date: Sun, 20 Jun 2004 01:23:37 +0200
From: Hadmut Danisch <email address hidden>
To: Debian Bug Tracking System <email address hidden>
Subject: slapd/slapcat hang in endless loops

Package: slapd
Version: 2.1.30-1
Severity: normal

Hi,

since I've upgraded my machine to 2.6.7 slapd
(and even slapcat) hang. When tracing with strace,
they are caught in an endless loop of

sched_yield() = 0

calls.

regards
Hadmut

-- System Information:
Debian Release: testing/unstable
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.7-danisch-p4-intel
Locale: LANG=C, LC_CTYPE=de_DE

Versions of packages slapd depends on:
ii coreutils [fileutils] 5.0.91-2 The GNU core utilities
ii debconf 1.4.25 Debian configuration management sy
ii libc6 2.3.2.ds1-13 GNU C Library: Shared libraries an
ii libdb4.2 4.2.52-16 Berkeley v4.2 Database Libraries [
ii libgcrypt7 1.1.90-1.1 LGPL Crypto library - runtime libr
ii libgnutls10 1.0.4-3 GNU TLS library - runtime library
ii libgpg-error0 0.7-1 library for common error values an
ii libiodbc2 3.51.2-2 iODBC Driver Manager
ii libldap2 2.1.30-1 OpenLDAP libraries
ii libltdl3 1.5.6-1 A system independent dlopen wrappe
ii libsasl2 2.1.18-4.1 Authentication abstraction library
ii libslp1 1.0.11-7 OpenSLP libraries
ii libtasn1-2 0.2.7-2 Manage ASN.1 structures (runtime)
ii libwrap0 7.6.dbs-4 Wietse Venema's TCP wrappers libra
ii perl [libmime-base64-perl] 5.8.4-2 Larry Wall's Practical Extraction
ii psmisc 21.5-1 Utilities that use the proc filesy
ii zlib1g 1:1.2.1.1-3 compression library - runtime

-- debconf information:
  slapd/fix_directory: true
* shared/organization: nodomain
  slapd/upgrade_slapcat_failure:
  slapd/backend: BDB
* slapd/allow_ldap_v2: false
  slapd/no_configuration: false
  slapd/move_old_database: true
  slapd/suffix_change: false
  slapd/slave_databases_require_updateref:
  slapd/autoconf_modules: true
* slapd/domain: nodomain
  slapd/password_mismatch:
  slapd/invalid_config: true
  slapd/upgrade_slapadd_failure:
  slapd/purge_database: false
  slapd/admin:

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Sun, 20 Jun 2004 12:45:02 +0200
From: Roland Bauerschmidt <email address hidden>
To: Hadmut Danisch <email address hidden>, <email address hidden>
Subject: Re: [debian-openldap] Bug#255276: slapd/slapcat hang in endless loops

Hadmut Danisch wrote:
> since I've upgraded my machine to 2.6.7 slapd
> (and even slapcat) hang. When tracing with strace,
> they are caught in an endless loop of
>
> sched_yield() = 0

Does running db4.2_recover in your database directory get your slapd
working again?

Roland

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Sun, 20 Jun 2004 16:48:35 +0200
From: Hadmut Danisch <email address hidden>
To: Roland Bauerschmidt <email address hidden>
CC: <email address hidden>
Subject: Re: [debian-openldap] Bug#255276: slapd/slapcat hang in endless loops

Roland Bauerschmidt wrote:

>
>Does running db4.2_recover in your database directory get your slapd
>working again?
>
>
>

It seems so. At least slapcat was working again after db4.2_recover.
slapd started again and seems to be working properly. But it'll take
some time of use to be sure. Nevertheless, good hint. Thanks! :-)

regards
Hadmut

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Mon, 21 Jun 2004 08:09:32 -0400
From: Stephen Frost <email address hidden>
To: Hadmut Danisch <email address hidden>, <email address hidden>
Subject: Re: [debian-openldap] Bug#255276: slapd/slapcat hang in endless loops

* Hadmut Danisch (<email address hidden>) wrote:
> since I've upgraded my machine to 2.6.7 slapd
> (and even slapcat) hang. When tracing with strace,
> they are caught in an endless loop of

Others have complained of similar problems. If running db*_recover
doesn't fix the problem then it may be the case that the new TLS-enabled
glibc is the problem. Try setting LD_ASSUME_KERNEL=2.4.1 prior to
starting slapd. That should cause it to link against the old
non-TLS-enable version of glibc. (NOTE: TLS in this case is
'thread-local storage', nothing to do with 'transport-layer security').

 Stephen

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-Id: <email address hidden>
Date: Fri, 24 Sep 2004 10:22:13 +0200
From: Marcus Better <email address hidden>
To: Debian Bug Tracking System <email address hidden>
Subject: slapd: Similar problem

Package: slapd
Version: 2.1.30-3
Followup-For: Bug #255276

I am using LDAP for authentication with libnss_ldap. I also use
webmin-ldap-useradmin 1.160-1 for managing the LDAP database.

I am trying to add Samba login information to a user account using
webmin, but when I set "Samba login?" to "Yes" in the webmin Edit User
screen and press Save, I get
  "Failed to save user : Failed to modify user in LDAP database : I/O
  Error"
and subsequent attempts to access the LDAP database through webmin fail with
  "Failed to connect to LDAP server 127.0.0.1 port 389."

Moreover the LDAP server stops accepting connections:

~$ ldapsearch
ldap_sasl_interactive_bind_s: Can't contact LDAP server (81)
~$ ldapsearch -W -x
Enter LDAP Password:
ldap_bind: Can't contact LDAP server (81)

As a result, it is no longer possible to log in to the system.

Listing the database with slapcat works sometimes, but I have also
experienced that slapcat just hangs, giving no output at all.

Restarting slapd does not help, but if I run db4.2_recover and then
restart slapd, it works normally again.

The LDAP database is very small, containing only 3-4 user entries.

I tried putting the line
  export LD_ASSUME_KERNEL=2.4.1
in /etc/default/slapd and restarting slapd, but it did not help.

-- System Information:
Debian Release: 3.1
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'unstable')
Architecture: i386 (i686)
Kernel: Linux 2.6.8custom
Locale: LANG=sv_SE.UTF-8, LC_CTYPE=sv_SE.UTF-8

Versions of packages slapd depends on:
ii coreutils [fileutils] 5.2.1-2 The GNU core utilities
ii debconf 1.4.30.5 Debian configuration management sy
ii libc6 2.3.2.ds1-16 GNU C Library: Shared libraries an
pi libdb4.2 4.2.52-17 Berkeley v4.2 Database Libraries [
ii libgcrypt11 1.2.0-4 LGPL Crypto library - runtime libr
ii libgnutls11 1.0.16-7 GNU TLS library - runtime library
ii libgpg-error0 1.0-1 library for common error values an
ii libiodbc2 3.51.2-5 iODBC Driver Manager
ii libldap2 2.1.30-3 OpenLDAP libraries
ii libltdl3 1.5.6-2 A system independent dlopen wrappe
ii libsasl2 2.1.19-1.1 Authentication abstraction library
ii libslp1 1.0.11-7 OpenSLP libraries
ii libwrap0 7.6.dbs-6 Wietse Venema's TCP wrappers libra
ii perl [libmime-base64-perl] 5.8.4-2.2 Larry Wall's Practical Extraction
ii psmisc 21.5-1 Utilities that use the proc filesy
ii zlib1g 1:1.2.1.1-7 compression library - runtime

-- debconf information:
  slapd/fix_directory: true
* shared/organization: Better Home
  slapd/upgrade_slapcat_failure:
  slapd/backend: BDB
* slapd/allow_ldap_v2: false
  slapd/no_configuration: false
  slapd/...


Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-Id: <email address hidden>
Date: Wed, 05 Jan 2005 10:49:17 +0100
From: Olivier Berger <email address hidden>
To: Debian Bug Tracking System <email address hidden>
Subject: slapd: Same here : slapcat hangs, and db4.2_verify too

Package: slapd
Version: 2.1.30-3
Followup-For: Bug #255276

Hi. I wanted to report the same thing here.

slapcat is also stuck in an endless loop on sched_yield()

And db4.2_verify gets stuck too.

straceing db4.2_verify gives endless messages like :
select(0, NULL, NULL, NULL, {0, 25000}) = 0 (Timeout)

db4.2_recover was successful though...

Probably a bug in db4.2_verify too ?

After the recover, slapcat works again as expected...

There is a problem somewhere still...

Hope this helps.

-- System Information:
Debian Release: 3.1
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Kernel: Linux 2.6.8-1-686-smp
Locale: LANG=C, LC_CTYPE=C (charmap=ANSI_X3.4-1968)

Versions of packages slapd depends on:
ii coreutils [fileutils] 5.2.1-2 The GNU core utilities
ii debconf 1.4.30.11 Debian configuration management sy
ii fileutils 5.2.1-2 The GNU file management utilities
ii libc6 2.3.2.ds1-18 GNU C Library: Shared libraries an
ii libdb4.2 4.2.52-17 Berkeley v4.2 Database Libraries [
ii libgcrypt11 1.2.0-4 LGPL Crypto library - runtime libr
ii libgnutls11 1.0.16-9 GNU TLS library - runtime library
ii libgpg-error0 1.0-1 library for common error values an
ii libiodbc2 3.52.1-2 iODBC Driver Manager
ii libldap2 2.1.30-3 OpenLDAP libraries
ii libltdl3 1.5.6-3 A system independent dlopen wrappe
ii libsasl2 2.1.19-1.5 Authentication abstraction library
ii libslp1 1.0.11-7 OpenSLP libraries
ii libwrap0 7.6.dbs-6 Wietse Venema's TCP wrappers libra
ii perl [libmime-base64-perl] 5.8.4-3 Larry Wall's Practical Extraction
ii psmisc 21.5-1 Utilities that use the proc filesy
ii zlib1g 1:1.2.2-3 compression library - runtime

-- debconf information:
* slapd/password2: (password omitted)
  slapd/internal/adminpw: (password omitted)
* slapd/password1: (password omitted)
  slapd/password_mismatch:
  slapd/fix_directory: true
  slapd/invalid_config: true
* shared/organization: nodomain
  slapd/upgrade_slapcat_failure:
  slapd/upgrade_slapadd_failure:
  slapd/backend: BDB
* slapd/allow_ldap_v2: false
  slapd/no_configuration: false
  slapd/move_old_database: true
  slapd/suffix_change: false
  slapd/slave_databases_require_updateref:
  slapd/autoconf_modules: true
  slapd/purge_database: false
  slapd/admin:
* slapd/domain: nodomain

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Wed, 9 Mar 2005 13:58:59 +0100
From: Torsten Landschoff <email address hidden>
To: Olivier Berger <email address hidden>,
 <email address hidden>
Subject: Re: Bug#255276: slapd: Same here : slapcat hangs, and db4.2_verify too

Hi Olivier,

On Wed, Jan 05, 2005 at 10:49:17AM +0100, Olivier Berger wrote:

> slapcat is also stuck in an endless loop on sched_yield()
> And db4.2_verify gets stuck too.
>
> straceing db4.2_verify gives endless messages like :
> select(0, NULL, NULL, NULL, {0, 25000}) = 0 (Timeout)
>
> db4.2_recover was successful though...

Always the same question: Still applicable? I was never able to
reproduce these hefty problems...

Greetings

 Torsten

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-Id: <email address hidden>
Date: Wed, 9 Mar 2005 13:59:12 +0100 (CET)
From: <email address hidden> (Torsten Landschoff)
To: <email address hidden>
Subject: tagging 255276

tags 255276 unreproducible

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-ID: <email address hidden>
Date: Wed, 23 Mar 2005 11:39:37 +1100
From: Matthew Hawkins <email address hidden>
To: <email address hidden>
Subject: sched_yield() loop

Hi Torsten,

My ldap box suffered two power outages within 24 hours this week (one
caused by the power company, the other by a well-intentioned but unclued
cow-orker). On inspection slapd was stuck in a sched_yield() loop.
Doing the db4.2_recover fixed it (I actually came across that hint
elsewhere, I'll try to remember to go to b.d.o first next time ;)

If there is no note, perhaps add one to the README.Debian that if slapd
appears to be consuming all cpu power and this is unexpected, and strace
shows it's stuck in sched_yield(), then the bdb database is probably
corrupt and requires recovery using db4.2_recover.

Then you can close this bug and any others like it ;)

Cheers,

--
Matt

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-Id: <email address hidden>
Date: Mon, 4 Apr 2005 08:27:47 +0200 (CEST)
From: <email address hidden> (Torsten Landschoff)
To: <email address hidden>
Subject: merging 302992 255276

merge 302992 255276

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-Id: <email address hidden>
Date: Mon, 4 Apr 2005 18:32:29 +0200 (CEST)
From: <email address hidden> (Torsten Landschoff)
To: <email address hidden>
Subject: merging 303057 302992

merge 303057 302992

Revision history for this message
Debian Bug Importer (debzilla) wrote :

Message-Id: <email address hidden>
Date: Mon, 11 Apr 2005 23:19:19 +0200 (CEST)
From: <email address hidden> (Torsten Landschoff)
To: <email address hidden>
Subject: severity of 303057 is serious

severity 303057 serious

Revision history for this message
Debian Bug Importer (debzilla) wrote :

*** Bug 15271 has been marked as a duplicate of this bug. ***

Revision history for this message
Debian Bug Importer (debzilla) wrote :

*** Bug 15272 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Christian Hammers (ch) wrote :

Hello

A month is too long for a RC bug so I dare to ask if there's progress
with this one :)

The bug was marked unreproducible and the last comment from the
submitter was "you can close this bug and any others like it",
so maybe downgrade it a bit and leave it just for reference?

Regarding the issue, if slapd reacts this strange, maybe add this
db_verify command to /etc/init.d/slapd :) (if it does not take too
long).

bye,

-christian-

Revision history for this message
In , Florian Weimer (fw) wrote : Fixed in upstream CVS

See this message below. Recent OpenLDAP versions will recover
automatically, as needed.

Rumor has it that Berkeley DB 4.4 will offer similar functionality,
too.

From: Howard Chu <email address hidden>
Subject: Re: Force single thread of control during recovery
Newsgroups: comp.databases.berkeley-db
Date: Sun, 15 May 2005 16:42:07 -0700
Message-ID: <email address hidden>

oleksandr kalinin wrote:
> Hello All,
> In my environment, there is a command-line database maintenance tool
> for
> users to display, add, modify records etc. During recovery, I would have to
> make sure this tool does not attempt to open the environment so that there
> is only single thread of control accessing it. Seems like I have to
> implement own locking mechanism to handle this, or is there some "smarter"
> way, e.g. in the library? Locking is easy in this case, but just to make
> sure I haven't missed something important in documentation... Many thanks
> for your help.

The current releases of the BDB library don't offer any support here. We
wrote our own locking mechanism for OpenLDAP to mediate access to the
environment. We record the process IDs of all processes with a valid
handle to the environment, so we can detect unclean exits and force a
recovery only when needed. And of course, only one process can trigger a
recovery. The code is in OpenLDAP's CVS if you're interested,
servers/slapd/back-bdb/alock.c

--
  -- Howard Chu
  Chief Architect, Symas Corp. Director, Highland Sun
  http://www.symas.com http://highlandsun.com/hyc
  Symas: Premier OpenSource Development and Support

Revision history for this message
In , Eugene Konev (ejka) wrote :

After experimenting with different kinds of slapd database corruption on
2.2.23, I've found that this behavior happens when the db
environment files (/var/lib/ldap/__db.00[1-5]) are missing or severely
corrupted.
ltracing slapd shows that it gets stuck in dbenv_open, so the problem is in
Berkeley DB, not slapd itself.
Most times (when the logs were not severely corrupted, as far as I can tell)
it was possible to recover the situation by running db4.2_recover. So I'd
also suggest adding a call to db4.2_recover somewhere in the init script.

Revision history for this message
In , Torsten Landschoff (torsten) wrote : Re: Bug#255276: slapd/slapcat hang in endless loops

On Mon, May 23, 2005 at 01:40:11AM +0800, Eugene Konev wrote:
> After experimenting with different kinds of slapd database corruption on
> 2.2.23, I've found that this behavior happens when the db
> environment files (/var/lib/ldap/__db.00[1-5]) are missing or severely
> corrupted.
> ltracing slapd shows that it gets stuck in dbenv_open, so the problem is in
> Berkeley DB, not slapd itself.
> Most times (when the logs were not severely corrupted, as far as I can tell)
> it was possible to recover the situation by running db4.2_recover. So I'd
> also suggest adding a call to db4.2_recover somewhere in the init script.

Looks like we should do that after bazillions of people requested it.
Any suggestions how to implement this correctly?

Greetings

 Torsten

Revision history for this message
In , Eugene Konev (ejka) wrote :

Hello Torsten.

 On Mon, 23 May 2005 17:11:44 +0200
 you wrote:

 TL> Looks like we should do that after bazillions of people requested it.
 TL> Any suggestions how to implement this correctly?

The attached patch adds calling db4.2_recover to slapd.init on every
slapd startup.

Revision history for this message
In , Steve Langasek (vorlon) wrote :

On Wed, May 25, 2005 at 11:26:03PM +0800, Eugene Konev wrote:

> On Mon, 23 May 2005 17:11:44 +0200
> you wrote:

> TL> Looks like we should do that after bazillions of people requested it.
> TL> Any suggestions how to implement this correctly?

> The attached patch adds calling db4.2_recover to slapd.init on every
> slapd startup.

Are there any objections to applying this patch for sarge?

--
Steve Langasek
postmodern programmer

> diff -Nru openldap2.2-2.2.23/debian/slapd.init openldap2.2-hack/debian/slapd.init
> --- openldap2.2-2.2.23/debian/slapd.init 2005-05-24 19:42:21.000000000 +0800
> +++ openldap2.2-hack/debian/slapd.init 2005-05-25 23:15:35.000000000 +0800
> @@ -48,6 +48,10 @@
> "$SLAPD_CONF"`
> fi
>
> +# Find out slapd db directories
> +SLAPD_DBDIRS=`sed -ne 's/^directory[[:space:]]\+"*\([^"]\+\).*/\1/p' \
> + "$SLAPD_CONF" `
> +
> # XXX: Breaks upgrading if there is no pidfile (invoke-rc.d stop will fail)
> # -- Torsten
> if [ -z "$SLAPD_PIDFILE" ]; then
> @@ -107,6 +111,24 @@
> }
>
>
> +# Try to recover slapd database
> +try_fix_db() {
> + if [ "$SLAPD_TRYFIXDB" != yes -o \
> + -z "$SLAPD_DBDIRS" ]; then
> + return 0
> + fi
> + echo -n " (possibly) fixing db,"
> + for DBDIR in $SLAPD_DBDIRS; do
> + if [ -d "$DBDIR" -a -f "$DBDIR/objectClass.bdb" ]; then
> + db4.2_recover -eh $DBDIR 2>&1
> + if [ $? -ne 0 ]; then
> + reason="Automatic recovery of slapd database failed. You will need to perform recovery by hand, possibly from backup."
> + exit 1
> + fi
> + fi
> + done
> +}
> +
> # Start the slapd daemon and capture the error message if any to
> # $reason.
> start_slapd() {
> @@ -157,6 +179,7 @@
> start() {
> echo -n "Starting OpenLDAP:"
> trap 'report_failure' 0
> + try_fix_db
> start_slapd
> start_slurpd
> trap "-" 0

Revision history for this message
In , Torsten Landschoff (torsten) wrote :

Hi Steve,

On Wed, May 25, 2005 at 05:02:42PM -0700, Steve Langasek wrote:
> > The attached patch adds calling db4.2_recover to slapd.init on every
> > slapd startup.
>
> Are there any objections to applying this patch for sarge?

I'll apply it with some adaptations. Most prominently it should check for
the database type before running db_recover.

Greetings

 Torsten

Revision history for this message
In , Torsten Landschoff (torsten) wrote :

Hi Eugene,

On Wed, May 25, 2005 at 11:26:03PM +0800, Eugene Konev wrote:

> The attached patch adds calling db4.2_recover to slapd.init on every
> slapd startup.

Thanks again for the patch. I am currently working to apply your
changes. A few remarks while I am working on it:

> diff -Nru openldap2.2-2.2.23/debian/slapd.init openldap2.2-hack/debian/slapd.init
> --- openldap2.2-2.2.23/debian/slapd.init 2005-05-24 19:42:21.000000000 +0800
> +++ openldap2.2-hack/debian/slapd.init 2005-05-25 23:15:35.000000000 +0800
> @@ -48,6 +48,10 @@
> "$SLAPD_CONF"`
> fi
>
> +# Find out slapd db directories
> +SLAPD_DBDIRS=`sed -ne 's/^directory[[:space:]]\+"*\([^"]\+\).*/\1/p' \
> + "$SLAPD_CONF" `
> +

I'd rather gather this list at the time when it is needed. Apart from
that I don't really grok that sed expression :)

> +# Try to recover slapd database
> +try_fix_db() {
...
> + if [ "$SLAPD_TRYFIXDB" != yes -o \
That switch makes no sense inside fix_db I'd say. I used to do it like
this but stumbled across it later because I was using that function
later and it did not do anything because it checked some magic variable.
Not going to happen here I'd say, but for consistency I'll move that
check.

> + echo -n " (possibly) fixing db,"
I don't like that message - as a user I am not going to understand
what's going on there. Changed to "running BDB recovery".

> + for DBDIR in $SLAPD_DBDIRS; do
> + if [ -d "$DBDIR" -a -f "$DBDIR/objectClass.bdb" ]; then
Did not see that check at the first glance. Moving it to the point where
the list of BDB directories is collected.

> + db4.2_recover -eh $DBDIR 2>&1
What is that -e option about? I do not really understand the meaning.
Guess I'll keep it enabled. Perhaps it would be a good idea to install
a DB_CONFIG during upgrade and run db_recover without -e to get the new
settings!?

@Steve: What would you think about such a change?

> + if [ $? -ne 0 ]; then
That will not work because of the "set -e" up in that file. The script
will bail out if db_recover fails and will never get to that check. It's
better to run

 db4.2_recover ... || do stuff in case of failure

or run db_recover inside an if, but that will hide the real functionality
of db_recover as a side effect of the if.

I attached my current patch (currently building slapd to test it).
Suggestions welcome :)

Greetings

 Torsten
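
The "set -e" point above can be sketched as follows. This is a minimal
illustration, not the code from the attached patch: fake_recover is a
hypothetical stand-in for db4.2_recover so the snippet runs without
Berkeley DB installed, and run_recovery is an illustrative name.

```sh
set -e

# Hypothetical stand-in for db4.2_recover that always fails.
fake_recover() {
    return 1
}

# Run the given recovery command. Because the command sits on the left
# side of "||", its failure does not trigger the "set -e" exit, and the
# failure branch is actually reached.
run_recovery() {
    "$@" || {
        echo "recovery failed"
        return 0
    }
    echo "recovery ok"
}

run_recovery fake_recover
```

With a separate "if [ $? -ne 0 ]" test after the command, the script
would already have exited under "set -e" before reaching the check.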

Revision history for this message
In , Torsten Landschoff (t-landschoff) wrote : setting package to openldap2.2 slapd ldap-utils libldap-2.2-7, tagging 255276

package openldap2.2 slapd ldap-utils libldap-2.2-7
tags 255276 + pending

Revision history for this message
In , Eugene Konev (ejka) wrote : Re: Bug#255276: slapd/slapcat hang in endless loops

Hello Torsten.

 On Thu, 26 May 2005 16:53:17 +0200
 you wrote:

 >> +# Find out slapd db directories
 >> +SLAPD_DBDIRS=`sed -ne 's/^directory[[:space:]]\+"*\([^"]\+\).*/\1/p' \
 >> + "$SLAPD_CONF" `
 >> +

 TL> I'd rather gather this list at the time when it is needed. Apart from
 TL> that I don't really grok that sed expression :)

Its only possible failure is when the path contains double quotes, which
is a very unlikely situation. And this:

 TL> +# Find bdb environment dirs
 TL> +find_bdb_envs() {
 TL> + local d
 TL> + for d in `awk '/directory/ {print $2}' < "$SLAPD_CONF"`; do
 TL> + if [ -d "$d" -a -f "$d/objectClass.bdb" ]; then
 TL> + echo $d
 TL> + fi
 TL> + done
 TL> +}

will happily skip entries like:
directory "/var/lib/ldap"
(note the quotes), which are the default in a sarge install.
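
One way to handle both quoted and unquoted entries might look like the
sketch below. It is neither of the two versions quoted above:
list_db_dirs is a hypothetical helper name, and the sketch assumes paths
contain no whitespace and no embedded double quotes.

```sh
# Print the value of every "directory" directive in a slapd.conf-style
# file, stripping surrounding double quotes. Matching on the first field
# (rather than anywhere in the line) skips comments that merely mention
# the word "directory".
list_db_dirs() {
    awk '$1 == "directory" { gsub(/"/, "", $2); print $2 }' "$1"
}

# Example use (path is an assumption):
#   for d in `list_db_dirs /etc/ldap/slapd.conf`; do
#       [ -f "$d/objectClass.bdb" ] && echo "$d"
#   done
```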

Revision history for this message
In , Florian Weimer (fw) wrote : Patch to run database recovery on startup

The patch is risky. After it's been applied, invoking
"/etc/init.d/slapd start" while slapd is running can (and most
probably will) result in data loss.

"db4.2_recover -e" will pick up new DB_CONFIG settings, so there's no
need to special-case it for updates.

Revision history for this message
In , Torsten Landschoff (torsten) wrote : Re: Bug#255276: Patch to run database recovery on startup

Hi Florian,

On Fri, May 27, 2005 at 08:27:47AM +0200, Florian Weimer wrote:
> The patch is risky. After it's been applied, invoking
> "/etc/init.d/slapd start" while slapd is running can (and most
> probably will) result in data loss.

Yep, that is creating headaches for me too :(

> "db4.2_recover -e" will pick up new DB_CONFIG settings, so there's no
> need to special-case it for updates.

Are you sure? I thought "-e" was to retain the old settings?

Greetings

 Torsten
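
The risk discussed above suggests guarding recovery behind a check that
no slapd is running. The following is only a sketch under assumptions:
the pidfile-based check, the function names, and the messages are
hypothetical and not taken from the patch; a stale pidfile or a race
between the check and the recovery run is not handled.

```sh
# Return 0 if the pid stored in the given pidfile refers to a live
# process. "kill -0" sends no signal; it only tests whether the process
# exists (and whether we may signal it, which holds when run as root).
slapd_running() {
    pidfile=$1
    [ -f "$pidfile" ] || return 1
    pid=$(cat "$pidfile")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Decide whether recovery may run for a database directory.
# Arguments: pidfile, database directory.
maybe_recover() {
    if slapd_running "$1"; then
        echo "slapd running, skipping recovery"
    else
        echo "would run recovery in $2"
    fi
}
```

In a real init script the second branch would invoke db4.2_recover
instead of printing a message.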

Revision history for this message
In , Torsten Landschoff (torsten) wrote : Re: Bug#255276: slapd/slapcat hang in endless loops

On Fri, May 27, 2005 at 09:10:50AM +0800, Eugene Konev wrote:

> >> +# Find out slapd db directories
> >> +SLAPD_DBDIRS=`sed -ne 's/^directory[[:space:]]\+"*\([^"]\+\).*/\1/p' \
> >> + "$SLAPD_CONF" `
> >> +
>
> TL> I'd rather gather this list at the time when it is needed. Apart from
> TL> that I don't really grok that sed expression :)
>
> Its only possible failure is when the path contains double quotes, which
> is a very unlikely situation. And this:

Yes, I was able to parse it finally :)

> TL> +# Find bdb environment dirs
> TL> +find_bdb_envs() {
> TL> + local d
> TL> + for d in `awk '/directory/ {print $2}' < "$SLAPD_CONF"`; do
> TL> + if [ -d "$d" -a -f "$d/objectClass.bdb" ]; then
> TL> + echo $d
> TL> + fi
> TL> + done
> TL> +}
>
> will happily skip entries like:
> directory "/var/lib/ldap"
> (note the quotes), which are the default in a sarge install.

Yep. I was using eval to strip them, but removed that before sending the
email because it can have really serious consequences depending on your
slapd.conf.

Greetings

 Torsten

Revision history for this message
In , Florian Weimer (fw) wrote : Re: Bug#255276: Patch to run database recovery on startup

* Torsten Landschoff:

>> "db4.2_recover -e" will pick up new DB_CONFIG settings, so there's no
>> need to special-case it for updates.
>
> Are you sure?

Yes, I regularly use "-e" to recreate the environment after tweaking
DB_CONFIG.

> I thought "-e" was to retain the old settings?

"-e" does not retain any settings, it retains the environment. The
settings are taken from DB_CONFIG and the compiled-in default values,
not from the previous environment configuration.

Revision history for this message
In , Florian Weimer (fw) wrote :

* Florian Weimer:

>> I thought "-e" was to retain the old settings?
>
> "-e" does not retain any settings, it retains the environment.

It retains it by removing and recreating it (sorry for being unclear).

Revision history for this message
In , Torsten Landschoff (torsten) wrote : Bug#255276: fixed in openldap2.2 2.2.23-6

Source: openldap2.2
Source-Version: 2.2.23-6

We believe that the bug you reported is fixed in the latest version of
openldap2.2, which is due to be installed in the Debian FTP archive:

ldap-utils_2.2.23-6_i386.deb
  to pool/main/o/openldap2.2/ldap-utils_2.2.23-6_i386.deb
libldap-2.2-7_2.2.23-6_i386.deb
  to pool/main/o/openldap2.2/libldap-2.2-7_2.2.23-6_i386.deb
openldap2.2_2.2.23-6.diff.gz
  to pool/main/o/openldap2.2/openldap2.2_2.2.23-6.diff.gz
openldap2.2_2.2.23-6.dsc
  to pool/main/o/openldap2.2/openldap2.2_2.2.23-6.dsc
slapd_2.2.23-6_i386.deb
  to pool/main/o/openldap2.2/slapd_2.2.23-6_i386.deb

A summary of the changes between this version and the previous one is
attached.

Thank you for reporting the bug, which will now be closed. If you
have further comments please address them to <email address hidden>,
and the maintainer will reopen the bug report if appropriate.

Debian distribution maintenance software
pp.
Torsten Landschoff <email address hidden> (supplier of updated openldap2.2 package)

(This message was generated automatically at their request; if you
believe that there is a problem with it please contact the archive
administrators by mailing <email address hidden>)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Format: 1.7
Date: Sun, 29 May 2005 18:23:20 +0200
Source: openldap2.2
Binary: slapd ldap-utils libldap-2.2-7
Architecture: source i386
Version: 2.2.23-6
Distribution: unstable
Urgency: low
Maintainer: Torsten Landschoff <email address hidden>
Changed-By: Torsten Landschoff <email address hidden>
Description:
 ldap-utils - OpenLDAP utilities
 libldap-2.2-7 - OpenLDAP libraries
 slapd - OpenLDAP server (slapd)
Closes: 255276 303505 306229 308234 310422
Changes:
 openldap2.2 (2.2.23-6) unstable; urgency=low
 .
   Torsten Landschoff <email address hidden>:
   * debian/po/ja.po: Merge updates from Kenshi Muto (closes: #303505).
   * debian/po/fr.po: Merge updates from Christian Perrier (closes: #306229).
   * debian/slapd.scripts-common: If the user enters the empty value for
     the database dumping directory use the default value. Seems like the
     readline interface does not care about the default value
     (closes: #308234).
   * debian/slapd.postinst: Make sure the debhelper commands are executed
     in all cases (closes: #310422).
   * Merged suggested changes by Eugene Konev to automatically run
     db_recover before starting slapd (closes: #255276).
     + debian/slapd.init: Run db_recover if enabled and available and no
       slapd process running.
     + debian/slapd.default: Add configuration option to disable it.
   * Applied and improved patch by Matthijs Mohlmann to support migration
     from ldbm to bdb backend.
     + debian/slapd.config: Ask if migration is wanted.
     + debian/slapd.postinst: Update configuration from ldbm to bdb if yes.
     + debian/slapd.scripts-common: Implemented some parts in their own
       functions.
   * Add a README.DB_CONFIG.gz and reference it where referring to BDB
     configuration.
   * Update default DB_CONFIG with some senseful values.
 .
   Steve Langasek <email address hidden>:
   * libraries/libldap_r/Makefile.in: make sure the ximian-connector ntlm
     pa...

Revision history for this message
Adam Conrad (adconrad) wrote :

These should all be dealt with now that we've synced with the latest version
from sid (2.2.23-8).

Changed in openldap2.2:
status: Unknown → Fix Released