Reading /proc/$(pidof slapd)/exe fails inside a docker container

Bug #1376548 reported by Paul Bickerstaff on 2014-10-02
This bug affects 1 person
Affects: docker.io (Ubuntu); Importance: Undecided; Assigned to: Unassigned
Affects: openldap (Ubuntu); Importance: Undecided; Assigned to: Unassigned

Bug Description

In "Ubuntu 14.04.1 LTS" amd64 with slapd package version "2.4.31-1+nmu2ubuntu8", "OpenLDAP server (slapd)", executing the following standard service command fails to have effect.

# service slapd stop
 * Stopping OpenLDAP slapd [ OK ]
# ps -ef | grep slapd | grep -v grep
openldap 196 1 0 02:00 ? 00:00:00 /usr/sbin/slapd -h ldap:/// ldapi:/// -g openldap -u openldap -F /etc/ldap/slapd.d

That is, it reports that all is OK, but it fails to stop the running process, which continues with the same PID.

The problem is clouded by the --oknodo option in /etc/init.d/slapd. This is responsible for the erroneous report.

stop_slapd() {
        reason="`start-stop-daemon --stop --quiet --oknodo --retry TERM/10 \
                --pidfile "$SLAPD_PIDFILE" \
                --exec $SLAPD 2>&1`"
}

Removing --oknodo demonstrates a failure with exit code 1. The role of oknodo should be reconsidered here.

Further experimentation shows that the --exec option is not working.

Since the init script is checking for $SLAPD_PIDFILE and exiting if empty, I suggest just dropping "--exec $SLAPD" from the init script. It is superfluous and the "service slapd stop" command will work after its removal.

SLAPD_PIDFILE is correctly identified on my system.

Mine is a stock standard fresh slapd install.

Hi Paul,

Thanks for the report.

On Wed, Oct 1, 2014 at 8:03 PM, Paul Bickerstaff
<email address hidden> wrote:
> In "Ubuntu 14.04.1 LTS" amd64 with slapd package version
> "2.4.31-1+nmu2ubuntu8", "OpenLDAP server (slapd)", executing the
> following standard service command fails to have effect.

Is there any output from slapd in /var/log/syslog that might indicate
why it didn't stop? Is it still responding normally to connections
after that?

Is this happening consistently for you, or only intermittently? If the
latter, can you see any pattern in when it happens?

> The problem is clouded by the --oknodo option in /etc/init.d/slapd. This
> is responsible for the erroneous report.

JFTR: the intent of --oknodo is to provide idempotence, per the
examples in the start-stop-daemon(8) man page.

> stop_slapd() {
> reason="`start-stop-daemon --stop --quiet --oknodo --retry TERM/10 \
> --pidfile "$SLAPD_PIDFILE" \
> --exec $SLAPD 2>&1`"
> }
>
> Removing --oknodo demonstrates a failure with exit code 1. The role of
> oknodo should be reconsidered here.
>
> Further experimentation shows that the --exec option is not working.

That agrees with the return codes; 0 with --oknodo and 1 without it
means that start-stop-daemon(8) thinks no action needs to be taken.

However, your ps output above shows the command as /usr/sbin/slapd,
which (assuming you haven't modified the init script) is exactly what
--exec should be checking for. So I don't understand why this wouldn't
be working for you.

It definitely doesn't seem that slapd is failing to stop (which
answers some of my questions above); I'd expect s-s-d to return 2 in
that case.
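
For reference, the exit statuses being discussed can be summarized in a small shell helper. This is only a sketch based on the EXIT STATUS section of start-stop-daemon(8) as I understand it; the function name describe_ssd_status is made up for illustration and is not part of any package.

```shell
# Map a start-stop-daemon exit status to a short description, following
# the EXIT STATUS section of start-stop-daemon(8).
# describe_ssd_status is a hypothetical helper name.
describe_ssd_status() {
    case "$1" in
        0) echo "action performed (or nothing to do, with --oknodo)" ;;
        1) echo "nothing done (--oknodo not given)" ;;
        2) echo "processes still running after --retry schedule" ;;
        3) echo "other error" ;;
        *) echo "unexpected status" ;;
    esac
}
```

In an init script one might call it right after the stop attempt, for example `describe_ssd_status $?`, to make a silent "nothing to do" visible in the output.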

Can you verify that /proc/$(pidof slapd)/exe does point to /usr/sbin/slapd?

> Since the init script is checking for $SLAPD_PIDFILE and exiting if
> empty, I suggest just dropping "--exec $SLAPD" from the init script. It
> is superfluous and the "service slapd stop" command will work after its
> removal.

As I understand it, the --exec test is there to protect against the
case where the daemon has already died but the pidfile is still present
(for example, if it crashed), and some other unrelated process has
already taken over the PID. My larger concern is *why* --exec isn't
working properly on your system -- this could be a symptom of
something more subtle.
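
The kind of check described above can be approximated by hand, which may help when diagnosing a system where --exec misbehaves. The following is a minimal sketch, assuming a Linux /proc filesystem; pidfile_matches_exe is a hypothetical helper, and the plain string comparison is a simplification, not a description of how start-stop-daemon itself performs the test.

```shell
# Return 0 only if the PID recorded in a pidfile belongs to a process whose
# /proc/<pid>/exe link resolves to the expected binary path.
# pidfile_matches_exe is a hypothetical helper, not start-stop-daemon code.
pidfile_matches_exe() {
    pidfile="$1"
    expected="$2"
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile") || return 1
    # Reject empty or non-numeric pidfile content.
    case "$pid" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # readlink fails if the process is gone or the link is unreadable
    # (for example "Permission denied"); both count as "no match" here.
    actual=$(readlink "/proc/$pid/exe" 2>/dev/null) || return 1
    [ "$actual" = "$expected" ]
}
```

For example, `pidfile_matches_exe "$SLAPD_PIDFILE" /usr/sbin/slapd && echo match` would print "match" only when the pidfile points at a running /usr/sbin/slapd. Note that an unreadable /proc/<pid>/exe is indistinguishable from a mismatch in this sketch.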

cheers,
Ryan


The failure to stop was consistent.

There was no logging, consistent with the successful exit code triggered by --oknodo.

Experimentation showed that --exec was failing and it was because /proc/$(pidof slapd)/exe could not be read ("Permission denied" to root).

It has occurred to me belatedly that this is because I'm running slapd inside a docker container (Docker version 1.2). I apologize for not being alert enough to recognize this earlier.

The container is running with various capabilities (NET_ADMIN, SYS_ADMIN, SYSLOG, DAC_OVERRIDE, NET_BIND_SERVICE, SETGID, SETUID). It will not run in privileged mode, due to (flaws in the profile for) AppArmor. So /proc is a protected area: read-only, for example, when not in privileged mode, and even more limited for security reasons.

While I now understand what is causing the problem, and can edit the init.d script when building the docker image, I believe that the logic in the stop_slapd function is flawed.

The slapd daemon is not stopping, due to a failure, but the stop function is ending with exit code 0. The fundamental flaw may well be in start-stop-daemon, but this init script tests for the existence of SLAPD_PIDFILE and then assumes erroneously that "--exec $SLAPD" is functional.

I admit I don't grasp why --oknodo is not recognizing a failure (which is evident if this option is dropped) and interpreting the situation as nothing to do.

I suggest that the environment I am running in will become increasingly common, and I plead for a fix to be made.

The scenario of concern driving the current script, i.e. a pidfile that still exists after the daemon has died, could be tested for explicitly. If that is not the case, then --pidfile alone should be sufficient. If the daemon has already stopped, then it would be OK for the stop function to exit gracefully, possibly with a warning about the pidfile. If there is no pidfile, then I think the script is already exiting. If stopping via the pidfile fails to stop a running daemon, one could then attempt the --exec option. One could also try a brute-force stop without using either option if both fail.
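
The first step of that idea, telling apart "no pidfile", "stale pidfile" and "daemon running" without touching /proc/<pid>/exe, might look roughly like the sketch below. It is an illustration only: pidfile_state is a hypothetical helper, it is not taken from the slapd init script, and it assumes the caller is allowed to signal the daemon (as root would be).

```shell
# Classify the state behind a pidfile without reading /proc/<pid>/exe.
# pidfile_state is a hypothetical helper; it prints one of:
#   no-pidfile  (pidfile missing or empty)
#   stale       (pidfile present but no signalable process with that PID)
#   running     (a process with that PID exists)
pidfile_state() {
    pidfile="$1"
    [ -s "$pidfile" ] || { echo "no-pidfile"; return; }
    pid=$(cat "$pidfile")
    case "$pid" in
        ''|*[!0-9]*) echo "stale"; return ;;
    esac
    # kill -0 sends no signal; it succeeds only if the PID exists
    # and the caller is permitted to signal it.
    if kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "stale"
    fi
}
```

A stop function could branch on the result, using --pidfile alone in the "running" case and warning about the leftover file in the "stale" case. The caveat is that "running" only shows that some process has that PID, not that it is slapd, which is the gap the --exec test is meant to close.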

Cheers
Paul Bickerstaff
DevOps, Portland Software Services
Mobile: +6421390266


Ryan Tandy (rtandy) wrote :

On Fri, Oct 3, 2014 at 5:32 AM, Paul Bickerstaff
<email address hidden> wrote:
> Experimentation showed that --exec was failing and it was because
> /proc/$(pidof slapd)/exe could not be read ("Permission denied" to
> root).
>
> It has occurred to me belatedly that this is because I'm running slapd
> inside a docker container (Docker version 1.2). I apologize for not
> being alert enough to recognize this earlier.

Thanks, that's the key piece of info. I believe this is a bug in
Docker, then; either an upstream bug such as
https://github.com/docker/docker/issues/6800 which has been shown to
affect multiple Debian packages
(https://github.com/docker/docker/issues/6800#issuecomment-49685466 is
especially similar to your situation) or something more like
https://bugs.launchpad.net/ubuntu/+source/docker.io/+bug/1320869 which
mentions apparmor.

I'm going to mark this as a Docker bug, and leave it up to its
maintainers to decide whether it should be marked as a duplicate or
triaged individually.

> While I now understand what is causing the problem, and can edit the
> init.d script when building the docker image, I believe that the logic
> in the stop_slapd function is flawed.

We might have to agree to disagree on that. The same pattern is used
in plenty of other packages and is expected to work.

> The slapd function is not stopping, due to a failure, but the stop
> function is ending with exit code 0. The fundamental flaw may well be in
> start-stop-daemon but this init script tests for the existence of
> SLAPD_PIDFILE but assumes erroneously that "--exec $SLAPD" is
> functional.

I don't agree at all with "erroneously". --exec is expected to work.

> I admit I don't grasp why --oknodo is not recognizing a failure (which is evident if this option is dropped) and interpreting the situation as nothing to do.

That should probably be reported as a bug in start-stop-daemon. I
would expect the failure to read /proc/N/exe to be reported as an
error, even under --oknodo.

Changed in openldap (Ubuntu):
status: New → Invalid
Robie Basak (racb) on 2014-10-06
summary: - service slapd stop fails
+ Reading /proc/$(pidof slapd)/exe fails inside a docker container