Stopping resolvconf doesn't disable updates because Upstart doesn't run the pre-stop script

Reported by Thomas Hood on 2012-02-16
This bug affects 1 person
Affects               Status        Importance  Assigned to  Milestone
resolvconf (Ubuntu)   Fix Released  High        Unassigned
upstart (Ubuntu)      Confirmed     Medium      Unassigned

Bug Description

Stopping the resolvconf job should disable updates by means of deleting the enable-updates flag file, but this does not happen.

# ls -l /run/resolvconf
total 4
-rw-r--r-- 1 root root 0 2012-02-16 17:02 enable-updates
drwxr-xr-x 2 root root 60 2012-02-16 17:02 interface
-rw-r--r-- 1 root root 177 2012-02-16 17:02 resolv.conf
# stop resolvconf
stop: Unknown instance:
# ls -l /run/resolvconf
total 4
-rw-r--r-- 1 root root 0 2012-02-16 17:02 enable-updates
drwxr-xr-x 2 root root 60 2012-02-16 17:02 interface
-rw-r--r-- 1 root root 177 2012-02-16 17:02 resolv.conf
# start resolvconf
resolvconf start/running
# stop resolvconf
resolvconf stop/waiting
# ls -l /run/resolvconf
total 4
-rw-r--r-- 1 root root 0 2012-02-16 17:06 enable-updates
drwxr-xr-x 2 root root 60 2012-02-16 17:02 interface
-rw-r--r-- 1 root root 177 2012-02-16 17:02 resolv.conf

The following change seems to fix this.

--- /etc/init/resolvconf.conf_ORIG 2012-02-16 17:09:22.313489458 +0100
+++ /etc/init/resolvconf.conf 2012-02-16 17:08:31.398957282 +0100
@@ -14,6 +14,6 @@
  resolvconf --enable-updates
 end script

-pre-stop script
+post-stop script
  resolvconf --disable-updates
 end script

# stop resolvconf
resolvconf stop/waiting
# ls -l /run/resolvconf
total 4
drwxr-xr-x 2 root root 60 2012-02-16 17:02 interface
-rw-r--r-- 1 root root 177 2012-02-16 17:02 resolv.conf

Thomas Hood (jdthood) wrote :

I wrote:
> The following change seems to fix this.

But don't make that change, because then updates are not enabled after reboot!

(rebooted)
# ls -l /run/resolvconf
total 4
drwxr-xr-x 2 root root 60 2012-02-16 17:12 interface/
-rw-r--r-- 1 root root 0 2012-02-16 17:12 postponed-update
-rw-r--r-- 1 root root 151 2012-02-16 17:12 resolv.conf

Combined with the curious fact that, after boot, the status of resolvconf is "stop/waiting" and not "running" (as I earlier mentioned here: https://bugs.launchpad.net/ubuntu/+source/resolvconf/+bug/929552/comments/16), it looks to me as if resolvconf is both started and stopped on boot, except that pre-stop scripts are not run. Or something. I don't understand Upstart.

Steve Langasek (vorlon) wrote :

> But don't make that change, because then updates are not enabled after
> reboot!

Well, it looks like the change may be per se correct, and you simply have
something amiss on your system that's causing the resolvconf job to fail at
startup. It doesn't fail for me.

> it looks to me as if resolvconf is both started and stopped on boot,
> except that pre-stop scripts are not run. Or something. I don't
> understand Upstart.

This looks to me like a previously unreported bug in upstart. resolvconf is the only job on my system which tries to use pre-start + pre-stop without a main process. Conceptually, I think there's no reason not to use post-stop since the post-stop script is run before the stopped event is emitted, so I'll go ahead with committing this fix (but not uploading until we know why resolvconf is failing to start for you on boot).

Also raising a task on upstart, since I don't think the pre-stop script should be silently skipped here.

Changed in resolvconf (Ubuntu):
status: New → Triaged
importance: Undecided → High
Changed in upstart (Ubuntu):
importance: Undecided → Medium
Thomas Hood (jdthood) wrote :

Here's a simpler test case which shows the pre-stop script not being run.

# status foo
foo stop/waiting
# cat /etc/init/foo.conf
description "Foo"

pre-start script
 touch /run/foo
end script

pre-stop script
 rm -f /run/foo
end script
# ls -l /run/foo
ls: cannot access /run/foo: No such file or directory
# start foo
foo start/running
# ls -l /run/foo
-rw-r--r-- 1 root root 0 2012-02-16 22:08 /run/foo
# stop foo
foo stop/waiting
# ls -l /run/foo
-rw-r--r-- 1 root root 0 2012-02-16 22:08 /run/foo

(The file /run/foo should have been deleted but wasn't.)

Steve Langasek (vorlon) on 2012-02-16
Changed in resolvconf (Ubuntu):
status: Triaged → Fix Committed
Thomas Hood (jdthood) wrote :

> until we know why resolvconf is failing to start for you on boot

We know that the resolvconf job starts, otherwise /run/resolvconf wouldn't exist. The problem is that the job also gets stopped.

I added touch commands to the scripts which show this.

--------------------------------------------------
pre-start script
 touch /run/resolvconf-started
 mkdir -p /run/resolvconf/interface
 # Request a postponed update (needed in case the base file has content).
 touch /run/resolvconf/postponed-update
 # Enable updates and perform the postponed update.
 resolvconf --enable-updates
end script

post-stop script
 touch /run/resolvconf-stopped
 resolvconf --disable-updates
end script
-------------------------------------------------

After boot:

# status resolvconf
resolvconf stop/waiting
# ls -l /run/resolvconf*
-rw-r--r-- 1 root root 0 2012-02-16 22:37 /run/resolvconf-started
-rw-r--r-- 1 root root 0 2012-02-16 22:37 /run/resolvconf-stopped

/run/resolvconf:
total 4
drwxr-xr-x 2 root root 60 2012-02-16 22:38 interface
-rw-r--r-- 1 root root 0 2012-02-16 22:38 postponed-update
-rw-r--r-- 1 root root 151 2012-02-16 22:37 resolv.conf

Interestingly, another job very much like resolvconf doesn't suffer from this problem:

# cat /etc/init/foo.conf
# upstart script for foo

description "Initialize or finalize foo"

start on mounted MOUNTPOINT=/run

stop on runlevel [06]

pre-start script
 mkdir -p /run/foo
 touch /run/foo/foofile
end script

post-stop script
 mv /run/foo/foofile /run/foo/foofile2
end script

# status foo
foo start/running
# ls -l /run/foo
total 0
-rw-r--r-- 1 root root 0 2012-02-16 22:37 foofile

Steve Langasek (vorlon) wrote :

On Thu, Feb 16, 2012 at 09:43:44PM -0000, Thomas Hood wrote:
> > until we know why resolvconf is failing to start for you on boot

> We know that the resolvconf job starts, otherwise /run/resolvconf
> wouldn't exist. The problem is that the job also get stopped.

No, the job *tries* to start. It is not "started" unless the pre-start
script completes successfully. When the pre-start script instead exits
non-zero (because it's run under sh -e and one of the commands has failed -
presumably the last), the job is then unwound to a "stopped" state.

So since the mkdir/touch commands are obviously succeeding, we need to see
the output of the resolvconf command to see what's going on here.
Per
<https://bugs.launchpad.net/ubuntu/+source/resolvconf/+bug/929552/comments/18>,
current precise upstart + a reboot should give you this in
/var/log/upstart/resolvconf.log.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>
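
The "sh -e" behaviour described above can be illustrated outside Upstart with a minimal sketch. It assumes nothing about Upstart itself: the run_pre_start function, the temporary directory and the "never-reached" file are hypothetical stand-ins, and "false" stands in for a failing "resolvconf --enable-updates".

```shell
# Simulate a pre-start body executed under "sh -e".
# $1 is a scratch directory standing in for /run/resolvconf.
run_pre_start() {
    sh -e -c '
        mkdir -p "$1/interface"
        touch "$1/postponed-update"
        false   # stand-in for a failing "resolvconf --enable-updates"
        touch "$1/never-reached"
    ' sh "$1"
}

tmp=$(mktemp -d)
if run_pre_start "$tmp"; then
    status=0
else
    status=$?
fi
echo "pre-start exit status: $status"
```

The mkdir and touch commands succeed and leave their files behind, but the script as a whole exits non-zero at the first failing command, which is the condition under which the job is unwound to "stopped".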

Thomas Hood (jdthood) wrote :

I actually upgraded to 1.4-0ubuntu7 yesterday, so when I look in /var/log/upstart/resolvconf.log now I see the results of several reboots. Yes, there they are, repeated instances of:

    cp: cannot create regular file `/var/spool/postfix/etc/resolv.conf': Read-only file system
    run-parts: /etc/resolvconf/update-libc.d/postfix exited with return code 1
    run-parts: /etc/resolvconf/update.d/libc exited with return code 1

I uninstalled postfix and now, after boot,

    $ status resolvconf
    resolvconf start/running

with this Upstart job definition:

    $ cat /etc/init/resolvconf.conf
    # upstart script for resolvconf
    description "Initialize or finalize resolvconf"
    start on mounted MOUNTPOINT=/run
    stop on runlevel [06]
    pre-start script
        mkdir -p /run/resolvconf/interface
        # Request a postponed update (needed in case the base file has content).
        touch /run/resolvconf/postponed-update
        # Enable updates and perform the postponed update.
        resolvconf --enable-updates
    end script
    post-stop script
        resolvconf --disable-updates
    end script

Furthermore, starting and stopping the resolvconf job does the right thing.

# status resolvconf
resolvconf start/running
# ls -l /run/resolvconf
total 4
-rw-r--r-- 1 root root 0 2012-02-17 12:17 enable-updates
drwxr-xr-x 2 root root 60 2012-02-17 11:58 interface
-rw-r--r-- 1 root root 177 2012-02-17 11:58 resolv.conf
# stop resolvconf
resolvconf stop/waiting
# ls -l /run/resolvconf
total 4
drwxr-xr-x 2 root root 60 2012-02-17 11:58 interface
-rw-r--r-- 1 root root 177 2012-02-17 11:58 resolv.conf
# start resolvconf
resolvconf start/running
# ls -l /run/resolvconf
total 4
-rw-r--r-- 1 root root 0 2012-02-17 12:17 enable-updates
drwxr-xr-x 2 root root 60 2012-02-17 11:58 interface
-rw-r--r-- 1 root root 177 2012-02-17 11:58 resolv.conf

--
Thomas

Thomas Hood (jdthood) wrote :

Steve Langasek wrote:
> Conceptually, I think there's no reason not to use
> post-stop since the post-stop script is run before
> the stopped event is emitted, so I'll go ahead with
> committing this fix [...]

Using post-stop has the consequence that any error during the resolvconf update run in the upstart job causes resolvconf updates to be disabled. That is undesirable IMHO. Using pre-stop does not have this consequence.

But we can't use pre-stop until Upstart is changed so that it actually runs pre-stop. Ideally we'd just fix Upstart and not change the resolvconf Upstart job definition. How fast can we get Upstart fixed?

If we can't get Upstart fixed in an acceptable time frame then I'd propose a different workaround. Create a resolvconf-stop Upstart job that *starts* on runlevel [06] and that disables updates in its pre-start.

Thomas Hood (jdthood) wrote :

An alternative might be to use post-start (whose exit status is ignored) so long as we use post-stop.

Steve Langasek (vorlon) wrote :

On Fri, Feb 17, 2012 at 02:05:02PM -0000, Thomas Hood wrote:

> Using post-stop has the consequence that any error during the resolvconf
> update run in the upstart job causes resolvconf updates to be disabled.
> That is undesirable IMHO.

This is entirely by design. If the errors in the hook scripts should not
cause resolvconf to redisable itself, then resolvconf should be changed to
not return a non-zero exit code here. Otherwise, it's impossible for the
upstart job to distinguish between a fatal error setting up resolvconf
(which should result in the job being marked as "stopped") and an ignorable
hook failure.

We *want* to know if resolvconf has failed to be set up, so that the job's
state reflects reality, and so that when things are broken, 'service
resolvconf start' does the right thing. The unwinding in the post-stop is
an intended side-effect of the job not being considered successfully
started. So the root bug is that resolvconf is returning an error in a case
where it apparently doesn't need to.

Should we adjust resolvconf to not take hook failures into consideration for
its return code?

> Using pre-stop does not have this consequence.

Only because of a bug that prevents the pre-stop script from being run at
all. If and when that bug is fixed, the pre-stop script will have the exact
same effect.

--
Steve Langasek Give me a lever long enough and a Free OS
Debian Developer to set it on, and I can move the world.
Ubuntu Developer http://www.debian.org/
<email address hidden> <email address hidden>

Upstart sees resolvconf as a persistent job and keeps track of its state, running or stopped. One significance of this is that Upstart won't start resolvconf if it is already "running" or stop it if it is already "stopped".

The part of resolvconf that conforms to this model is the enabling and disabling of updates. We want to make use of this at shutdown time.

So long as resolvconf is treated as an Upstart job there should be a one-to-one correlation:

    resolvconf updates enabled == resolvconf Upstart job running
    resolvconf updates disabled == resolvconf Upstart job stopped

In order to ensure this, "resolvconf --enable-updates" should return a nonzero error code if and only if it failed to enable updates, i.e., failed to create the /run/resolvconf/enable-updates flag file.

Likewise "resolvconf --disable-updates" should return a nonzero error code if and only if it failed to disable updates. Fortunately it already does so.
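
The one-to-one correlation above could be checked with a small helper along the following lines. This is only a sketch: check_consistency is a hypothetical function, not part of resolvconf or Upstart, and the example uses a scratch directory instead of the real /run/resolvconf.

```shell
# Succeed if and only if the job state and the flag file agree.
# $1: a status line such as "resolvconf start/running"
# $2: path to the enable-updates flag file
check_consistency() {
    case "$1" in
        *start/running*) job_running=yes ;;
        *)               job_running=no ;;
    esac
    if [ -e "$2" ]; then
        updates_enabled=yes
    else
        updates_enabled=no
    fi
    [ "$job_running" = "$updates_enabled" ]
}

# Example with a scratch flag file
flagdir=$(mktemp -d)
: > "$flagdir/enable-updates"
if check_consistency "resolvconf start/running" "$flagdir/enable-updates"; then
    echo "consistent"
else
    echo "inconsistent"
fi
```

On a real system the arguments would presumably be "$(status resolvconf)" and /run/resolvconf/enable-updates.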

Steve wrote in #9:
> Should we adjust resolvconf to not take hook failures into consideration for its return code?

The smallest modification would be to make the suggested change only to the enable-updates case, as follows.

$ diff -u resolvconf_ORIG resolvconf
--- resolvconf_ORIG 2012-02-19 16:49:46.725254960 +0100
+++ resolvconf 2012-02-19 16:51:39.277257572 +0100
@@ -113,9 +113,9 @@
  fi
  ;;
   --enable-updates)
- : >| "$ENABLE_UPDATES_FLAGFILE"
+ : >| "$ENABLE_UPDATES_FLAGFILE" || exit 1
  if [ -e "$POSTPONED_UPDATE_FLAGFILE" ] ; then
- update_and_exit -u
+ (update_and_exit -u) || :
  fi
  exit 0
  ;;

With this, other callers of resolvconf won't see any change in exit code semantics.
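
The pattern used in the diff above can be tried in isolation. In the sketch below, the scratch directory, the enable_updates wrapper and the always-failing update_and_exit stub are assumptions made for illustration; only the two variable names and the two changed lines are taken from the diff. The wrapper uses "return" where the real script uses "exit".

```shell
# Scratch locations standing in for the real flag files under /run/resolvconf
FLAGDIR=$(mktemp -d)
ENABLE_UPDATES_FLAGFILE="$FLAGDIR/enable-updates"
POSTPONED_UPDATE_FLAGFILE="$FLAGDIR/postponed-update"

# Stub: pretend that a hook script failed during the update
update_and_exit() {
    exit 1
}

enable_updates() {
    # Failing to create the flag file is the only fatal error
    : >| "$ENABLE_UPDATES_FLAGFILE" || return 1
    if [ -e "$POSTPONED_UPDATE_FLAGFILE" ]; then
        # The subshell confines the stub's "exit"; "|| :" discards its status
        (update_and_exit -u) || :
    fi
    return 0
}

: > "$POSTPONED_UPDATE_FLAGFILE"   # request a postponed update
enable_updates
rc=$?
echo "enable_updates returned $rc"
```

Even though the stubbed update fails, the flag file exists afterwards and the return code is zero, so a pre-start script run under "sh -e" would not abort at this point.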

Thomas Hood (jdthood) wrote :

The original issue here (#933566), "Stopping resolvconf doesn't disable updates", has already been addressed (by switching to post-stop from pre-stop) so I will open a new bug report (#936835) about the additional measure of ignoring errors on enabling/disabling updates.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package resolvconf - 1.63ubuntu8

---------------
resolvconf (1.63ubuntu8) precise; urgency=low

  * Use a post-stop script for disabling resolvconf, not a pre-stop script,
    since upstart seems to be silently ignoring pre-stop when there's no
    main process. LP: #933566.
  * debian/postinst: mkdir -p /run/resolvconf/interface again, just in case
    there's been a reboot between the preinst and postinst which would wipe
    out /run. May or may not address LP 933035.
  * debian/config, debian/templates, debian/postinst: if we don't know that
    /etc/resolv.conf was being dynamically managed before install (in at
    least some cases), link the original contents of /etc/resolv.conf to
    /etc/resolvconf/resolv.conf.d/tail so that any statically configured
    nameservers aren't lost. LP: #923685.
  * when called with --enable-updates, ignore failures from the hooks.
    LP: #933723.
 -- Steve Langasek <email address hidden> Mon, 20 Feb 2012 19:17:32 +0000

Changed in resolvconf (Ubuntu):
status: Fix Committed → Fix Released
Thomas Hood (jdthood) wrote :

> If and when that bug is fixed, the pre-stop script will
> have the exact same effect.

If and when the bug is fixed the wiki documentation at
http://upstart.ubuntu.com/wiki/Stanzas#pre-stop
should also be updated so that it no longer implies
(by omission, compared with post-stop) that the pre-stop
command isn't executed if the job fails to launch the
main process or the pre-start command fails.

summary: - Stopping resolvconf doesn't disable updates
+ Stopping resolvconf doesn't disable updates because Upstart doesn't run
+ the pre-stop script
Changed in upstart (Ubuntu):
status: New → Confirmed
Thomas Hood (jdthood) wrote :

I wrote above in #7:
> If we can't get Upstart fixed in an acceptable time frame then
> I'd propose a different workaround. Create a resolvconf-stop
> Upstart job that *starts* on runlevel [06] and that disables
> updates in its pre-start.

I have been re-reading the Upstart documentation. The other solution (continuing to treat resolvconf itself as an Upstart service job but changing the exit codes of "/sbin/resolvconf --enable-updates") has been adopted and works. Even so, I am now in a position to re-describe my alternative, quoted above, using Upstart terminology: replace the service job with a "resolvconf-initialize" task job and a "resolvconf-finalize" task job. This is closer to how things work in Debian, where resolvconf has scripts that run only in runlevels S and 06. It also has the advantage that Upstart doesn't have to track resolvconf's enabledness (= presence of the flag file /run/resolvconf/enable-updates) with its own concept of startedness, which carries the risk of being wrong.
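
A rough sketch of what the two task jobs might look like follows. The file names and descriptions are hypothetical, the start conditions and commands are borrowed from the service job quoted earlier in this report, and none of this was tested or adopted.

```
# /etc/init/resolvconf-initialize.conf (hypothetical)
description "Enable resolvconf updates"
start on mounted MOUNTPOINT=/run
task
script
    mkdir -p /run/resolvconf/interface
    # Request a postponed update (needed in case the base file has content).
    touch /run/resolvconf/postponed-update
    # Enable updates and perform the postponed update.
    resolvconf --enable-updates
end script

# /etc/init/resolvconf-finalize.conf (hypothetical)
description "Disable resolvconf updates"
start on runlevel [06]
task
exec resolvconf --disable-updates
```

Because each job is a task that runs once and finishes, Upstart would keep no long-lived "running" state that could disagree with the flag file.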
