Installing lxd leaves /var/lib/lxd/unix.socket with wrong group ownership

Bug #1577001 reported by Dan Kegel on 2016-04-30
This bug affects 3 people
Affects           Importance  Assigned to
lxd (Ubuntu)      Low         Unassigned
systemd (Ubuntu)  Undecided   Martin Pitt

Bug Description

On Ubuntu 16.04, running
  sudo apt-get install lxd
sometimes leaves the file
  /var/lib/lxd/unix.socket
with group root, but it should have group lxd. Doing
  sudo systemctl restart lxd.socket
rescues the file and gives it the right group ownership.
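
One way to check for the symptom described above is to compare the socket's group with the expected one. The following is a minimal sketch, assuming GNU coreutils `stat`; the helper name `check_socket_group` is made up for illustration and is not part of lxd or its packaging.

```shell
#!/bin/sh
# check_socket_group PATH EXPECTED_GROUP
# Exit 0 if PATH's group name equals EXPECTED_GROUP, 1 if it differs,
# and 2 if PATH cannot be examined (for example, it does not exist).
# Hypothetical helper; relies on GNU stat's -c '%G' format.
check_socket_group() {
    path="$1"
    expected="$2"
    actual=$(stat -c '%G' "$path" 2>/dev/null) || return 2
    [ "$actual" = "$expected" ]
}
```

With the function loaded, a check after installing might look like:
  check_socket_group /var/lib/lxd/unix.socket lxd || echo "unix.socket has the wrong group"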

Adding logging to the package's postinst shows that, if /var/lib/lxd/unix.socket did not already exist, it is created by the line
 deb-systemd-helper enable lxd.service
with the wrong group ownership.
If the socket already existed with the correct group ownership, that command breaks it and sets the group to root.

It's about 90% repeatable on one machine here (with an SSD main disk).

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: lxd 2.0.0-0ubuntu4
ProcVersionSignature: Ubuntu 4.4.0-21.37-generic 4.4.6
Uname: Linux 4.4.0-21-generic x86_64
ApportVersion: 2.20.1-0ubuntu2
Architecture: amd64
CurrentDesktop: Unity
Date: Sat Apr 30 08:24:53 2016
InstallationDate: Installed on 2016-03-26 (35 days ago)
InstallationMedia: Ubuntu 16.04 LTS "Xenial Xerus" - Beta amd64 (20160323)
SourcePackage: lxd
UpgradeStatus: No upgrade log present (probably fresh install)

Dan Kegel (dank) wrote :

See also later comments in https://github.com/lxc/lxd/issues/1635

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in lxd (Ubuntu):
status: New → Confirmed
Tessa (unit3) wrote :

Had this issue on a fresh install of lxd on Xenial as well.

Stéphane Graber (stgraber) wrote :

I'm moving this over to systemd: our unit is very clear about the expected owner of the socket. The needed group is created in preinst, so it is guaranteed to exist by the time systemd is poked in postinst. LXD itself doesn't change socket ownership when socket-activated, and if it did, it would honor the --group option and chown to the right group.
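
To see which owner a socket unit actually requests, the unit text can be inspected for its `SocketGroup=` and `SocketMode=` settings. The sketch below is a hypothetical helper (`unit_setting` is an invented name) that reads unit-file text on stdin. It prints the last assignment of a key, on the assumption that a later assignment overrides an earlier one for single-valued settings; it does not parse section headers.

```shell
#!/bin/sh
# unit_setting KEY
# Reads unit-file text on stdin and prints the value of the last "KEY=" line.
# Hypothetical helper for illustration; KEY must be a plain setting name.
unit_setting() {
    key="$1"
    sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*//p" | tail -n 1
}
```

Assuming the unit is installed, usage could be:
  systemctl cat lxd.socket | unit_setting SocketGroup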

affects: lxd (Ubuntu) → systemd (Ubuntu)

Martin Pitt (pitti) wrote :

I can reproduce this bug in a small (autopkgtest) Xenial VM with

  dpkg -P lxd; rm /var/lib/lxd/unix.socket; apt-get install -y lxd; ls -l /var/lib/lxd/unix.socket

But I cannot reproduce this as long as the lxd package is already installed; all of these work fine:

  systemctl stop lxd.{service,socket} lxd-containers; rm /var/lib/lxd/unix.socket; systemctl reset-failed lxd.{service,socket} lxd-containers; DEBIAN_FRONTEND=noninteractive dpkg-reconfigure lxd; ls -l /var/lib/lxd/unix.socket

  systemctl stop lxd.{service,socket} lxd-containers; rm /var/lib/lxd/unix.socket; systemctl reset-failed lxd.{service,socket} lxd-containers; DEBIAN_FRONTEND=noninteractive apt-get install --reinstall lxd; ls -l /var/lib/lxd/unix.socket

  systemctl stop lxd.{service,socket} lxd-containers; rm /var/lib/lxd/unix.socket; systemctl reset-failed lxd.{service,socket} lxd-containers; export DPKG_MAINTSCRIPT_PACKAGE=lxd; deb-systemd-helper enable lxd.service; deb-systemd-helper enable lxd.socket; deb-systemd-helper enable lxd-containers.service; deb-systemd-invoke start lxd-containers.service; deb-systemd-invoke start lxd.socket; ls -l /var/lib/lxd/unix.socket

(The reset-failed is there to avoid the "unit restarted too often" rate limit when running these repeatedly.)

More interestingly, I also cannot reproduce the bug with the first command if I stop the socket unit before or after purging:

  systemctl stop lxd.socket; dpkg -P lxd; rm /var/lib/lxd/unix.socket; apt-get install -y lxd; ls -l /var/lib/lxd/unix.socket

  dpkg -P lxd; systemctl stop lxd.socket; rm /var/lib/lxd/unix.socket; apt-get install -y lxd; ls -l /var/lib/lxd/unix.socket

This exposes a bug in lxd's maintainer scripts: purging lxd still leaves lxd.socket running. I'm re-adding an lxd task for this; the package needs the counterpart of the postinst's start of lxd.socket, i.e. stopping the unit on removal.

> Adding logging to the package's postinst shows that, if /var/lib/lxd/unix.socket did not already exist, it is created by the line
> deb-systemd-helper enable lxd.service

Most of "deb-systemd-helper enable" shouldn't affect the permissions of unix.socket at all, as this is just creating symlinks in /etc/systemd/system/ (without even calling systemctl for that). So I figure the "systemctl daemon-reload" at the very end triggers this. And indeed:

  mv /lib/systemd/system/lxd.socket{,.disabled}; systemctl daemon-reload; sleep 0.5; mv /lib/systemd/system/lxd.socket{.disabled,}; systemctl daemon-reload; sleep 0.5; ls -l /var/lib/lxd/unix.socket
  srw-rw---- 1 root root 0 May 1 18:22 /var/lib/lxd/unix.socket

This is inconsistent -- either unix.socket should not be created at all (as the unit is still running) or it should be created with the correct permissions.

@Dan: Is this reproducible for you on a 16.04 install that has lxd purged, no /var/lib/lxd/unix.socket, and lxd.socket *not* running? (Note that 16.04 comes with lxd preinstalled on server and cloud images.) I.e., do you see this only on reinstall, or also on a clean install of lxd? If you see it on a clean install too, we have another bug; but if "systemctl status lxd.socket" shows the unit running before the reinstall, it's the issue I described above.
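
The precondition asked about here, whether lxd.socket is still active before the reinstall, can be tested with `systemctl is-active`. This is a minimal sketch; the function name `socket_left_running` and the `SYSTEMCTL` override variable are assumptions added so the logic can be exercised on a machine without systemd.

```shell
#!/bin/sh
# socket_left_running [UNIT]
# Exit 0 if UNIT (default: lxd.socket) is currently active, non-zero otherwise.
# SYSTEMCTL is a hypothetical override for the systemctl command, used for testing.
socket_left_running() {
    unit="${1:-lxd.socket}"
    ${SYSTEMCTL:-systemctl} is-active --quiet "$unit"
}
```

Run before "apt-get install lxd", it distinguishes the two cases:
  socket_left_running && echo "lxd.socket is still active from a previous install"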

Martin Pitt (pitti) wrote :

I created https://github.com/systemd/systemd/issues/3171 about the socket group being changed on reload while the unit file is absent. This will no longer be an issue once lxd's prerm stops `lxd.socket` on removal, but (1) this sounds worth fixing anyway, and (2) I have a feeling that the above reproducer is not the complete story yet. Waiting on Dan's answer for that.

Dan Kegel (dank) wrote :

@martin: I'm on desktop, fwiw. I think you nailed it:

apt-cache policy lxd systemd
# lxd:
# Candidate: 2.0.0-0ubuntu4
# systemd:
# Installed: 229-4ubuntu4

# Clean initial conditions
sudo apt purge -y lxd || true
sudo systemctl stop lxd.socket || true
sudo rm -f /var/lib/lxd/unix.socket

echo "Demonstrate proper functioning on first install"
sudo apt install -y lxd
ls -l /var/lib/lxd/unix.socket
# srw-rw---- 1 root lxd 0 May 2 10:10 /var/lib/lxd/unix.socket

echo "Purge (bug: leaves lxd.socket running, leaves unix.socket present!)"
sudo apt purge -y lxd
ls -l /var/lib/lxd/unix.socket
# srw-rw---- 1 root lxd 0 May 2 10:10 /var/lib/lxd/unix.socket

echo "Reinstall (bug: breaks group ownership of unix.socket!)"
sudo apt install -y lxd
ls -l /var/lib/lxd/unix.socket
# srw-rw---- 1 root root 0 May 2 10:10 /var/lib/lxd/unix.socket

Martin Pitt (pitti) wrote :

@Dan: I know that my reproducer works. I just wondered if you can reproduce this with a clean install of lxd when it was *not* previously installed (or at least lxd.socket is not running before apt-get install lxd).

Dan Kegel (dank) wrote :

@Martin: the problem doesn't reproduce for me if lxd.socket isn't running before installing lxd.
That's why I posted the script along with output for the ls -l, to show what happens for me in the two cases.

Purging lxd should stop lxd and lxd.socket, shouldn't it?

Thanks!

Martin Pitt (pitti) wrote :

> @Martin: the problem doesn't reproduce for me if lxd.socket isn't running before installing lxd.

Thanks for confirming; it seems we nailed this indeed.

> Purging lxd should stop lxd and lxd.socket, shouldn't it?

Correct, that's what the lxd task is for.
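
For illustration, stopping the socket on removal could look roughly like the sketch below. It is not the change that was eventually made in the lxd package; it only assumes that dpkg passes "remove" as the first argument to the removal scripts and that the deb-systemd-invoke helper is available. `maybe_stop_socket`, `STOP_CMD`, and `SYSTEMD_RUNDIR` are hypothetical names, the latter two added only so the logic can be tried without systemd.

```shell
#!/bin/sh
# maybe_stop_socket ACTION
# ACTION is the first argument dpkg passes to the maintainer script.
# Stops lxd.socket only on "remove" and only when systemd is the running init.
# STOP_CMD and SYSTEMD_RUNDIR are hypothetical test overrides.
maybe_stop_socket() {
    [ "$1" = "remove" ] || return 0
    # /run/systemd/system exists only when the system was booted with systemd.
    [ -d "${SYSTEMD_RUNDIR:-/run/systemd/system}" ] || return 0
    # Never fail the removal because the unit could not be stopped.
    ${STOP_CMD:-deb-systemd-invoke stop} lxd.socket >/dev/null || true
}
```

In a maintainer script the function would be called with the script's own first argument:
  maybe_stop_socket "$1"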

Martin Pitt (pitti) wrote :

On the systemd side this was fixed upstream in https://github.com/systemd/systemd/commit/01a8b4675. However, lxd's postrm still needs to be fixed to stop lxd.socket on removal.

Changed in systemd (Ubuntu):
status: Confirmed → Fix Committed
assignee: nobody → Martin Pitt (pitti)
Changed in lxd (Ubuntu):
status: New → Triaged
importance: Undecided → High
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 230-1git1

---------------
systemd (230-1git1) yakkety; urgency=medium

  * Don't add a Breaks: against usb-modeswitch when building on Ubuntu; there
    it does not use hotplug.functions and is a lower version.
  * boot-and-services autopkgtest: Add missing xserver-xorg and
    lightdm-greeter test dependencies, so that lightdm can start.
    (See LP #1581106)

 -- Martin Pitt <email address hidden> Wed, 25 May 2016 09:37:41 +0200

Changed in systemd (Ubuntu):
status: Fix Committed → Fix Released
Changed in lxd (Ubuntu):
importance: High → Low
Changed in lxd (Ubuntu):
status: Triaged → In Progress
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxd - 2.1-0ubuntu2

---------------
lxd (2.1-0ubuntu2) yakkety; urgency=medium

  * Properly handle purge. (LP: #1614621)
  * Configure system-wide dnsmasq not to touch lxdbr0. (LP: #1613820)
  * Stop lxd.socket on package remove. (LP: #1577001)

 -- Stéphane Graber <email address hidden> Mon, 29 Aug 2016 17:29:27 -0400

Changed in lxd (Ubuntu):
status: In Progress → Fix Released