Broken logic in /etc/init/libvirt-cgred-wait.conf

Bug #946737 reported by Loïc Minier
This bug affects 1 person
Affects: libcgroup (Ubuntu)
Status: New
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

Hi,

Since some time, libvirt-bin fails upgrading at the time "start libvirt-bin" is run during upgrades; I have start_libvirtd="no" in /etc/default/libvirt-bin.

Looking a bit into this, I tried understanding the structure of the upstart jobs around cgred/cgconfig/*-wait etc.

I've added some "logger"-based debugging to /etc/init/libvirt-cgred-wait.conf, which appears to have some errors:
        # If already started, just exit
        status cgred | grep -q "start/running"
        [ $? -ne 0 ] || { stop; exit 0; }
        [...]

The line after "grep" is never reached when cgred isn't considered "running" by upstart. I think this is because the script is run under sh -e (set -e), but I don't know why testing for $? doesn't prevent the set -e behavior.
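A probable explanation for the behaviour described above: under set -e the shell exits as soon as the pipeline `status cgred | grep -q ...` returns non-zero, so the `[ $? -ne 0 ]` test on the following line is never evaluated. A command is exempt from set -e only when it is itself the condition of an if/while or part of an && / || list. The sketch below reproduces this without upstart; fake_status is a hypothetical stand-in for `status cgred`, and the string it prints is an assumption.

```shell
#!/bin/sh
# Variant 1: the pattern from the job file, run in a child shell under `sh -e`.
# fake_status is a hypothetical stub standing in for `status cgred`.
out1=$(sh -e -c '
    fake_status() { echo "cgred stop/waiting"; }
    fake_status | grep -q "start/running"
    # With -e the shell has already exited here, because grep returned 1.
    echo "reached the line after grep"
' 2>&1) || true

# Variant 2: the same check used as an `if` condition, which set -e ignores.
out2=$(sh -e -c '
    fake_status() { echo "cgred stop/waiting"; }
    if fake_status | grep -q "start/running"; then
        echo "already running"
    else
        echo "not running yet, carrying on"
    fi
')

echo "variant 1 printed: [$out1]"
echo "variant 2 printed: [$out2]"
```

In the first variant nothing is printed because the child shell exits right after grep; in the second the script continues past the check.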

I changed the code to this slightly nicer syntax:
        # If already started, just exit
        [ `status cgred` = "start/running" ] && { stop; exit 0; }

and this time saw the logging statements go through, so I would recommend changing the script to use this syntax.
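One caveat that may be worth checking before adopting the bracket form: assuming `status cgred` prints a full line such as "cgred start/running, process 1234" (upstart's usual format; the PID here is made up), the unquoted backticks expand to several words, so the comparison can never succeed. That would also let the logging statements through, but for the wrong reason. A minimal sketch with a stubbed status function (hypothetical, so the example runs without upstart) compares the bracket form with a pattern match on the whole output:

```shell
#!/bin/sh
# Hypothetical stub for upstart's `status`; the output format is an assumption.
status() { echo "$1 start/running, process 1234"; }

# Bracket form: the output splits into several words, `[` gets too many
# arguments and returns an error status, so the branch is never taken.
if [ `status cgred` = "start/running" ] 2>/dev/null; then
    bracket_result="matched"
else
    bracket_result="not matched"
fi

# Pattern match on the full, quoted output. `case` is also safe under set -e.
case "$(status cgred)" in
    *start/running*) case_result="matched" ;;
    *)               case_result="not matched" ;;
esac

echo "bracket form: $bracket_result"
echo "case form:    $case_result"
```

If this holds for the real `status` output, a form such as `if status cgred | grep -q "start/running"; then stop; exit 0; fi` would keep the original intent while remaining safe under set -e.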

However, I'm worried by "sleep 3600"; this seems to be a 60-minute time bomb.

Also, for some reason cgroups weren't listed in my /proc/mounts; starting cgconfig manually I saw this in syslog:
Mar 5 01:03:13 localhost CGRE[17054]: Started the CGroup Rules Engine Daemon.
and then cgroups were mounted, however this:
/usr/sbin/cgconfigparser -l /etc/cgconfig.conf
fails every second run and leaves /sys/fs/cgroup empty; I suspect this might play a role in breaking the startup expectations.

(I'm not sure how all these scripts are supposed to play, so I'm reporting all the weird things I'm seeing; the main point of this specific bug report is to mention the syntax issues with /etc/init/libvirt-cgred-wait.conf.)

Cheers,

Loïc Minier (lool) wrote :

Aha, during my last upgrade libvirt-bin started after two minutes, which seems to be exactly the TIMEOUT set in /etc/init/libvirt-cgconfig-wait.conf.

Loïc Minier (lool) wrote :

/etc/init/libvirt-cgconfig-wait.conf will immediately exit if cgconfig is already considered "running"; however, if that's not the case, it will go into a "sleep" case which won't be interrupted even when cgconfig actually starts: AIUI, the stop statement only prevents startup, but won't kill a running task.

For instance, /etc/init/rc.conf won't start /etc/init.d/rc $RUNLEVEL if reboot is called before reaching runlevel 2, but if /etc/init.d/rc has already been called and someone then calls reboot, it's too late to kill it.

So once libvirt-bin is waiting for libvirt-cgconfig-wait which is in sleep, we're stuck in a 120s sleep and that means libvirt-bin's startup is delayed by 120s, right?

Loïc Minier (lool) wrote :

Actually, my last comment was wrong: testing confirms that sleep *gets* killed when the stop condition is met.
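For illustration, here is a rough sketch of why a long sleep need not delay anything once it is signalled. It assumes that upstart stops a job by sending SIGTERM to its process, which is imitated here with a plain kill; the variable names are made up for the example and nothing below is taken from the actual job files.

```shell
#!/bin/sh
# Stand-in for the wait job's long sleep (120s, as in the timeout discussed).
sleep 120 &
sleep_pid=$!

start=$(date +%s)

# Imitate the job being stopped (assumption: stopping delivers SIGTERM).
kill -TERM "$sleep_pid"

# wait returns the child's exit status; 143 = 128 + 15 (SIGTERM).
rc=0
wait "$sleep_pid" || rc=$?
elapsed=$(( $(date +%s) - start ))

echo "sleep exited with status $rc after ${elapsed}s"
```

The sleep ends as soon as the signal arrives rather than after the full timeout.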

I still don't know what happens with my libvirt-bin startup issue, but at least the cgred-wait changes seem correct since the other -wait is also using a different syntax.

Serge Hallyn (serge-hallyn) wrote :

Hi Loic,

> Since some time, libvirt-bin fails upgrading at the time "start libvirt-bin" is run during upgrades; I have start_libvirtd="no" in /etc/default/libvirt-bin.

This should be independent of the libvirt-cgconfig-wait job. That doesn't actually explicitly start libvirt-bin. Are you on precise?

Note that really libcgroup and libvirt do not work together anyway, due to a race at startup. We recommend cgroup-lite (which is in main) be used in its place.

But libvirt starting after upgrades, when you have start_libvirtd="no", is puzzling.

Changed in libcgroup (Ubuntu):
importance: Undecided → Medium
Loïc Minier (lool) wrote : Re: [Bug 946737] Re: Broken logic in /etc/init/libvirt-cgred-wait.conf

On Mon, Mar 05, 2012, Serge Hallyn wrote:
> This should be independent of the libvirt-cgconfig-wait job. That
> doesn't actually explicitly start libvirt-bin. Are you on precise?

 Yes, I'm on precise

> Note that really libcgroup and libvirt do not work together anyway, due
> to a race at startup. We recommend cgroup-lite (which is in main) be
> used in its place.
>
> But libvirt starting after upgrades, when you have start_libvirtd="no", is
> puzzling.

 Right now the daemon is running, but the main issue is that the upgrade
 itself is stuck at starting libvirt-bin when the package's postinst
 runs, and I don't understand why.

 I'll switch to cgroup-lite and see if that helps. It's still worth merging
 the proposed shell syntax fix if that's OK with you, but I'm happy to
 upload it if you +1 it.

--
Loïc Minier

Serge Hallyn (serge-hallyn) wrote :

Yes, I agree your fix is nice - and needed. Please do upload it (as I'm out most of this week).

Thanks!

Note that Jon will do a new libcgroup release in Debian this week. I'll merge and test it next week.
