upgrades fail to complete

Bug #1762680 reported by Dimitri John Ledkov
This bug affects 1 person
Affects        Status         Importance   Assigned to   Milestone
lxd (Ubuntu)   Fix Released   Undecided    Unassigned

Bug Description

Setting up lxd (3.0.0-0ubuntu3) ...
Job for lxd-containers.service failed because the control process exited with error code.
See "systemctl status lxd-containers.service" and "journalctl -xe" for details.

Upgrades hang while trying to restart lxd-containers.service, which blocks apt full-upgrade indefinitely.

root 2315 0.1 0.0 39372 21112 pts/6 Ss+ 09:47 0:02 | | \_ /usr/bin/dpkg --status-fd 27 --configure --pending
root 24530 0.0 0.0 15860 3204 pts/6 S+ 10:01 0:00 | | \_ /bin/sh /var/lib/dpkg/info/lxd.postinst configure 3.0.0-0ubuntu2
root 25096 0.0 0.0 67876 5764 pts/6 S+ 10:02 0:00 | | \_ /bin/systemctl restart lxd-containers.service
root 25099 0.0 0.0 61812 3120 pts/6 S+ 10:02 0:00 | | \_ /bin/systemd-tty-ask-password-agent --watch

Should the maintainer script use the non-blocking variant of restarting lxd-containers.service, e.g. --no-block?

Christian Brauner (cbrauner) wrote :

The first thing we need to see is why this is blocking. Do you have any additional logs?

David Britton (dpb)
Changed in lxd (Ubuntu):
status: New → Incomplete
Stéphane Graber (stgraber) wrote :

And to answer the initial question: no, we should not be using the non-blocking systemctl calls here; that would just paper over broken systems.

I'm not sure why lxd-containers would be re-triggered on upgrade, though. In theory, lxd-containers should be left as it is (normally running), and the lxd unit is the one that should be brought down and back up.

Dimitri John Ledkov (xnox) wrote :

# Automatically added by dh_systemd_start/11.1.6ubuntu1
if [ "$1" = "configure" ] || [ "$1" = "abort-upgrade" ] || [ "$1" = "abort-deconfigure" ] || [ "$1" = "abort-remove" ] ; then
        if [ -d /run/systemd/system ]; then
                systemctl --system daemon-reload >/dev/null || true
                if [ -n "$2" ]; then
                        _dh_action=restart
                else
                        _dh_action=start
                fi
                deb-systemd-invoke $_dh_action 'lxd-containers.service' >/dev/null || true
        fi
fi

I see the above in the maintainer script, so on upgrades (when [ -n "$2" ] is true) a restart of the lxd-containers job is attempted. Maybe we need dh_systemd --no-restart --name lxd-containers.service or some such?
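
For readers unfamiliar with maintainer-script arguments, here is a minimal sketch of the decision the generated snippet makes. The function name dh_action_for is hypothetical and not part of debhelper; the sketch leaves out the /run/systemd/system check and the actual deb-systemd-invoke call, and only reports which action would be chosen.

```shell
#!/bin/sh
# Hypothetical helper (not part of debhelper): report the systemd action
# that the generated postinst snippet would pick for the given arguments.
#   $1: maintainer script action (e.g. "configure")
#   $2: previously configured version; empty on a fresh install
dh_action_for() {
    case "$1" in
        configure|abort-upgrade|abort-deconfigure|abort-remove)
            if [ -n "${2:-}" ]; then
                # A previous version is present, so this is an upgrade.
                echo restart
            else
                echo start
            fi
            ;;
        *)
            # The snippet does nothing for other actions.
            echo none
            ;;
    esac
}

dh_action_for configure 3.0.0-0ubuntu2   # upgrade case from this report
dh_action_for configure                  # fresh install
```

In the process listing from the bug description, lxd.postinst is called as "configure 3.0.0-0ubuntu2", so $2 is non-empty and the restart branch is the one taken.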

Christian Brauner (cbrauner) wrote :

I'm not super familiar with this but that sounds reasonable to me. The unit shouldn't be re-triggered.

David Britton (dpb)
Changed in lxd (Ubuntu):
status: Incomplete → New
Stéphane Graber (stgraber) wrote :

I'm kinda confused as to how this even makes it in, considering we have:

override_dh_systemd_start:
	echo "Not running dh_systemd_start"

Stéphane Graber (stgraber) wrote :

Found a way to make the restart call go away.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxd - 3.0.0-0ubuntu4

---------------
lxd (3.0.0-0ubuntu4) bionic; urgency=medium

  * Cherry-pick upstream fixes:
    - lxc/alias: Fix example in help message
    - i18n: Update translation templates
    - client: Introduce LXD_SOCKET
    - Makefile: Add a manifest
    - containers: fix snapshot deletion
    - lxc/init: Add missing --no-profiles
    - i18n: Update translations
    - lxc/file: Fix pull target logic
    - doc: Fix example in userns-idmap
    - devices: fail if Nvidia device minor is missing
    - Add db.ContainersNodeList
    - storage: createContainerMountpoint() fix perms
    - ceph: s/0755/0711/g
    - lvm: s/0755/0711/g
    - storage utils: s/0755/0711/g
    - zfs: s/0755/0711/g
    - patches: add "storage_api_path_permissions"
    - sys/fs: s/MkdirAll/Mkdir/g
    - btrfs: fix permissions
    - Pass a logger to raft-http
    - Add new cluster.Promote function
    - Add new cluster.Rebalance function
    - Notify the cluster leader after a node removal, so it can rebalance
    - Add integration test
    - doc: Tweak backup.md
    - lxd/init: Require root for interactive cluster join
    - Disable flaky unit tests for now
    - Log the error that made Daemon.Init() fail
    - client: Expose http URL in ConnectionInfo
    - lxc/query: Add support for non-JSON endpoints
    - lxd/containers: Fix lxc.net check

  * Really don't restart lxd-containers on upgrades (LP: #1762680)
  * Bump standards to 4.1.4

 -- Stéphane Graber <email address hidden> Tue, 17 Apr 2018 13:35:28 -0400

Changed in lxd (Ubuntu):
status: New → Fix Released