Can't purge snapd in LXD: rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/...': Function not implemented

Bug #1989603 reported by Paride Legovini
This bug affects 1 person
Affects         Status        Importance  Assigned to  Milestone
lxd             Fix Released  Unknown
snapd (Ubuntu)  Confirmed     Low         Unassigned

Bug Description

In LXD containers snapd can't be purge-removed. One tool affected by this bug is autopkgtest-build-lxd, which attempts to purge several packages (including snapd) to create a base image to run autopkgtests on.

Steps to reproduce:

  lxc launch ubuntu:jammy paride-j
  lxc exec paride-j -- apt-get -y remove --purge snapd

Kinetic is also affected; I didn't test older releases. To see how this affects autopkgtest-build-lxd, just install autopkgtest and run:

  autopkgtest-build-lxd ubuntu:jammy

This reminds me of LP: #1903967, but it's likely a different issue.

--- relevant log excerpt ---

[...]
Purging configuration files for snapd (2.56.2+22.04ubuntu1) ...
Stopping snap.lxd.activate.service
Stopping unit snap.lxd.activate.service
Waiting until unit snap.lxd.activate.service is stopped [attempt 1]
snap.lxd.activate.service is stopped.
Removing snap.lxd.activate.service
Stopping snap.lxd.daemon.service
Stopping unit snap.lxd.daemon.service
Waiting until unit snap.lxd.daemon.service is stopped [attempt 1]
snap.lxd.daemon.service is stopped.
Removing snap.lxd.daemon.service
Stopping snap.lxd.user-daemon.service
Stopping unit snap.lxd.user-daemon.service
Waiting until unit snap.lxd.user-daemon.service is stopped [attempt 1]
snap.lxd.user-daemon.service is stopped.
Removing snap.lxd.user-daemon.service
Stopping snap-core20-1611.mount
Stopping unit snap-core20-1611.mount
Waiting until unit snap-core20-1611.mount is stopped [attempt 1]
snap-core20-1611.mount is stopped.
Removing snap core20 and revision 1611
Removing snap-core20-1611.mount
Stopping snap-lxd-23541.mount
Stopping unit snap-lxd-23541.mount
Waiting until unit snap-lxd-23541.mount is stopped [attempt 1]
snap-lxd-23541.mount is stopped.
Removing snap lxd and revision 23541
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/proc/cpuinfo': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/proc/meminfo': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/proc/stat': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/proc/uptime': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/proc/diskstats': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/proc/swaps': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/proc/loadavg': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/proc/slabinfo': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/sys/devices/system/cpu/cpu0/cpuidle/state5/disable': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/sys/devices/system/cpu/cpu0/cpuidle/state5/above': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/sys/devices/system/cpu/cpu0/cpuidle/state5/time': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/sys/devices/system/cpu/cpu0/cpuidle/state5/rejected': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/sys/devices/system/cpu/cpu0/cpuidle/state5/power': Function not implemented
rm: cannot remove '/var/snap/lxd/common/var/lib/lxcfs/sys/devices/system/cpu/cpu0/cpuidle/state5/residency': Function not implemented
[...]


Revision history for this message
Alberto Mardegan (mardy) wrote :

It looks like the LXD snap has mounted "proc" and "sys" over /var/snap/lxd/common/var/lib/lxcfs/{proc,sys}/ but it's not unmounting them.

In the "remove" hook, the LXD snap should unmount any mounted filesystems; not doing so can have disastrous effects if there are also user partitions mounted in there.
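A minimal sketch of how one could check whether anything is still mounted below the LXD snap's common directory before purging. `list_mounts_under` is a hypothetical helper, not part of snapd or LXD; it only reads the kernel's mount table (`/proc/self/mounts` by default) and changes nothing.

```shell
#!/bin/sh
# list_mounts_under PREFIX [MOUNTS_FILE]
# Print "mountpoint fstype" for PREFIX itself and for every mount point
# below it. MOUNTS_FILE defaults to /proc/self/mounts; the second argument
# is an assumption added so the function can be tried on a saved copy.
list_mounts_under() {
    prefix=${1%/}
    mounts_file=${2:-/proc/self/mounts}
    awk -v p="$prefix" '
        $2 == p || index($2, p "/") == 1 { print $2, $3 }
    ' "$mounts_file"
}

# Example (inside the container):
#   list_mounts_under /var/snap/lxd/common
```

If this prints nothing, no file system is mounted under the prefix; any line it does print is a mount that a plain recursive `rm` would run into.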

affects: snapd (Ubuntu) → lxd (Ubuntu)
Revision history for this message
Paride Legovini (paride) wrote :

I tried purging lxd (snap remove --purge lxd) and it worked just fine, and after that an `apt purge snapd` also worked, so perhaps the umount of .../lxcfs/{proc,sys} is racy?

Uninstalling and reinstalling lxd also works:

 - [jump into *fresh* kinetic lxd container]
 - snap remove lxd
 - snap install lxd
 - apt purge snapd # works!

While refreshing lxd does not work:

 - [jump into *fresh* kinetic lxd container]
 - snap refresh lxd --edge
 - apt purge snapd # FAIL

Revision history for this message
Paride Legovini (paride) wrote :
no longer affects: lxd
Paride Legovini (paride)
no longer affects: lxd (Ubuntu)
Changed in lxd:
status: Unknown → Fix Released
Revision history for this message
Paride Legovini (paride) wrote :

This is still reproducible at least on Focal and Jammy. The link to the (not really) Fix Released upstream bug is dead.

Changed in snapd (Ubuntu):
status: New → Confirmed
Revision history for this message
Paride Legovini (paride) wrote :

Stumbled on this again today, and I still believe it's a snapd bug and not a lxd bug: snapd should be cleanly uninstallable no matter what individual snaps do.

Revision history for this message
Ernest Lotter (ernestl) wrote :

Hi, had a first look at this,

THE LINE OF CODE THAT CAUSES THE FAILURE:
-------------------------------------------------------------------------
$ ls /var/lib/dpkg/info/snapd.*
> snapd.prerm snapd.postrm ...

See: https://github.com/canonical/snapd/blob/master/packaging/ubuntu-16.04/snapd.postrm

As per log: "Removing snap lxd and revision 23541" =>
https://github.com/canonical/snapd/blob/master/packaging/ubuntu-16.04/snapd.postrm#L74

which is within the "purge" condition https://github.com/canonical/snapd/blob/master/packaging/ubuntu-16.04/snapd.postrm#L30

This is the line that fails:
rm -rf --one-file-system "/var/snap/$snap/common"

https://github.com/canonical/snapd/blob/master/packaging/ubuntu-16.04/snapd.postrm#L90
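As an aside, `--one-file-system` makes `rm` skip any directory that lives on a different file system than the path given on the command line. A minimal sketch of how one could test for that condition ahead of time follows. `same_filesystem` is a hypothetical helper, not something snapd.postrm contains, and it assumes GNU `stat` (the `-c %d` format prints the device number).

```shell
#!/bin/sh
# same_filesystem PATH_A PATH_B
# Returns 0 when both paths are on the same device, 1 when they are not,
# and 2 when either path cannot be examined.
same_filesystem() {
    dev_a=$(stat -c %d -- "$1") || return 2
    dev_b=$(stat -c %d -- "$2") || return 2
    [ "$dev_a" = "$dev_b" ]
}

# Example (inside the container):
#   same_filesystem /var/snap/lxd/common /var/snap/lxd/common/var/lib/lxcfs \
#       || echo "lxcfs path is on another file system (or is missing)"
```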

WITH SNAPD, LXD INSTALLED
----------------------------------------------

After installing lxd and running lxd init, there is a symlink
/var/snap/lxd/common/var/lib/lxcfs -> /var/snap/lxd/common/shmounts/lxcfs
but its target, /var/snap/lxd/common/shmounts/lxcfs, does not exist.

See: https://paste.ubuntu.com/p/Jg4Z2dbfgd/

So it seems the issue is that the postrm script tries to remove something that it did not add.
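Whether the lxcfs path is a dangling symlink or a real mount point can be told apart with plain `test` operators. This is a minimal sketch; `is_dangling_symlink` is a hypothetical name used only for illustration.

```shell
#!/bin/sh
# is_dangling_symlink PATH
# Succeeds only when PATH is a symlink whose target does not exist.
# -L tests the link itself; -e follows the link and tests the target.
is_dangling_symlink() {
    [ -L "$1" ] && [ ! -e "$1" ]
}

# Example (inside the container):
#   if is_dangling_symlink /var/snap/lxd/common/var/lib/lxcfs; then
#       echo "broken symlink: nothing is mounted there"
#   fi
```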

QUESTIONS
-------------------

1) @Paride: in your test, does the symlink actually point to anything? If not, the test is not representative of the issue.

2) @LXD team: What in lxd is responsible for populating "/var/snap/lxd/common/shmounts/lxcfs"?

3) @LXD team: Does lxd snap use a remove hook?
   https://snapcraft.io/docs/supported-snap-hooks#heading--remove

Revision history for this message
Paride Legovini (paride) wrote :

The answer is different depending on the system: /var/snap/lxd/common/var/lib/lxcfs is a broken symlink on Noble, but not on Focal. And: snapd purge+remove works on Noble, but not on Focal. See below for reproducers.

## NOBLE ##

$ lxc launch ubuntu:noble paride-n
$ lxc exec paride-n bash
root@paride-n:~# lxd init --auto
Installing LXD snap, please be patient.

root@paride-n:~# ls -l /var/snap/lxd/common/var/lib/lxcfs
lrwxrwxrwx 1 root root 35 Dec 3 13:43 /var/snap/lxd/common/var/lib/lxcfs -> /var/snap/lxd/common/shmounts/lxcfs

root@paride-n:~# ls -l /var/snap/lxd/common/shmounts/lxcfs
ls: cannot access '/var/snap/lxd/common/shmounts/lxcfs': No such file or directory

root@paride-n:~# apt-get --yes --purge remove snapd
[...]
root@paride-n:~# echo $?
0

^^ Success

## FOCAL ##

$ lxc launch ubuntu:focal paride-f
$ lxc exec paride-f bash
root@paride-f:~# lxd init --auto

root@paride-f:~# ls -l /var/snap/lxd/common/var/lib/lxcfs
total 0
dr-xr-xr-x 2 nobody nogroup 0 Dec 3 13:53 proc
dr-xr-xr-x 2 nobody nogroup 0 Dec 3 13:53 sys

root@paride-f:~# apt-get --yes --purge remove snapd
[...]
Removing snap lxd and revision 29619
rm: skipping '/var/snap/lxd/common/var/lib/lxcfs', since it's on a different device
dpkg: error processing package snapd (--purge):
 installed snapd package post-removal script subprocess returned error exit status 1
dmesg: read kernel buffer failed: Operation not permitted
Errors were encountered while processing:
 snapd
E: Sub-process /usr/bin/dpkg returned an error code (1)
root@paride-f:~# echo $?
100

^^ Fail

Revision history for this message
Simon Déziel (sdeziel) wrote :

@ernestl re 3) @LXD team: Does lxd snap use a remove hook?

Yes: https://github.com/canonical/lxd-pkg-snap/blob/latest-edge/snapcraft/hooks/remove

Revision history for this message
Aleksandr Mikhalitsyn (mihalicyn) wrote :

@ernestl 2) @LXD team: What in lxd is responsible for populating "/var/snap/lxd/common/shmounts/lxcfs"?

https://github.com/canonical/lxd-pkg-snap/blob/0b6e8e9a61c48f87244ccb9b34fb4cd35a007ae6/snapcraft/commands/daemon.start#L149C68-L149C79
and
https://github.com/canonical/lxd-pkg-snap/blob/0b6e8e9a61c48f87244ccb9b34fb4cd35a007ae6/snapcraft/commands/daemon.start#L630

It's daemon.start. It looks like the LXCFS daemon wasn't stopped for some reason, or ${SNAP_COMMON}/var/lib/lxcfs was not unmounted.
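If leftover mounts have to be torn down by hand, nested mount points need to go before their parents. The sketch below only computes that order; `unmount_order` is a hypothetical helper and is not taken from the LXD snap's hooks. It relies on the fact that, in the C locale, a child path always sorts after its parent, so a reverse sort lists children first.

```shell
#!/bin/sh
# unmount_order
# Read mount points on stdin (one per line) and print them so that every
# nested mount point comes before the directory that contains it.
unmount_order() {
    LC_ALL=C sort -r
}

# Example (needs root inside the container; the awk filter is illustrative):
#   awk 'index($2, "/var/snap/lxd/common/") == 1 { print $2 }' /proc/self/mounts \
#       | unmount_order \
#       | while IFS= read -r m; do umount "$m"; done
```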

Revision history for this message
Ernest Lotter (ernestl) wrote :

@Simon, @Aleksandr, thanks for the info.

I see unmount of "${SNAP_COMMON}/var/lib/lxcfs/" "${SNAP_COMMON}/shmounts" is only available in edge!

It was fixed by Snapd team (Maciek B): https://github.com/canonical/lxd-pkg-snap/commit/3fdcce095d0e96b1ae38709e95ffbd0e79ada3db#diff-c35102df858da3e525aa91c703cf456892f1a7325cdcbbb1e1cedba5d1929a88R13

@Paride, would you mind re-testing with the edge version installed to confirm the problem is fixed?

@LXD, would it not be preferable to have daemon_start and daemon_stop symmetrical, to avoid the need to address the issue in the remove hook? (I am assuming they are not, based on a very superficial scan.)

Revision history for this message
Ernest Lotter (ernestl) wrote :

Conclusion: This is not a snapd issue, but a lxd issue.

Revision history for this message
Thomas Parrott (tomparrott) wrote :

Which LXD version were you using when this issue was seen?

Revision history for this message
Ernest Lotter (ernestl) wrote :

I had another look...

@Paride,

With reference to .../comments/2:
Uninstall/reinstall works because it gets rid of the mounts, unlike refresh.

With reference to .../comments/7:
The reason why ubuntu:noble works is because there are no snaps installed.

The reality is that `apt[-get] --purge snapd` does not run remove hooks, so if there are mounts, as is the case on Focal, Jammy, etc., then that's trouble.

Running `snap remove lxd` first works. I will discuss with the snapd team what is expected of `apt --purge snapd`.
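The workaround can be sketched as a small wrapper that removes the snaps first and only then purges the snapd package. `purge_snapd_after_snaps` and the `DRY_RUN` switch are hypothetical names introduced for illustration; the two commands it runs are the ones already used in this thread.

```shell
#!/bin/sh
# purge_snapd_after_snaps SNAP...
# Remove each named snap via snapd (so the snap's own remove hook gets a
# chance to run), then purge the snapd deb. With DRY_RUN=1 the commands
# are printed instead of executed.
purge_snapd_after_snaps() {
    run=
    if [ "${DRY_RUN:-0}" = 1 ]; then
        run=echo
    fi
    for snap_name in "$@"; do
        $run snap remove --purge "$snap_name" || return 1
    done
    $run apt-get -y remove --purge snapd
}

# Example (as root inside the container):
#   purge_snapd_after_snaps lxd
```

Stopping at the first failed `snap remove` is a deliberate choice here: purging snapd while a snap's mounts are still in place is exactly the situation that fails.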

Revision history for this message
Paride Legovini (paride) wrote :

Thanks for the investigation, let me know if you need any further testing from me.

@Thomas, I checked the snap version of my reproducer. As of today, when I `lxc launch ubuntu:focal`, I get:

root@paride-f:~# snap list
Name    Version         Rev    Tracking       Publisher   Notes
core20  20240911        2434   latest/stable  canonical✓  base
lxd     4.0.10-e664786  29619  4.0/stable/…   canonical✓  -
snapd   2.66.1          23258  latest/stable  canonical✓  snapd

A `snap info lxd` shows the full track:

snap-id: J60k4JY0HppjwOjW8dZdYc8obXKxujRu
tracking: 4.0/stable/ubuntu-20.04
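For scripted reproducers, the tracked channel can be pulled out of that output with a one-line filter. This is a sketch that assumes the `tracking:` line looks as shown above; `snap_tracking` is a hypothetical helper name.

```shell
#!/bin/sh
# snap_tracking
# Read `snap info <snap>` output on stdin and print the value of the
# first "tracking:" line.
snap_tracking() {
    awk '$1 == "tracking:" { print $2; exit }'
}

# Example:
#   snap info lxd | snap_tracking
```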

Revision history for this message
Ernest Lotter (ernestl) wrote :

For the longer term, I created a Jira ticket to make the snapd purge more robust when dealing with mounts introduced by snaps: https://warthogs.atlassian.net/browse/SNAPDENG-34339

This is low priority, and for this specific case it is still recommended to add a workaround in the test.
See: https://bugs.launchpad.net/ubuntu/+source/snapd/+bug/1989603/comments/14

Changed in snapd (Ubuntu):
importance: Undecided → Low
Revision history for this message
Simon Déziel (sdeziel) wrote :

A fix for this landed in `4.0/candidate` and `4.0/edge` channels. It should make its way into `4.0/stable`.
