lxcfs startup failures because fuse directory not empty

Bug #1756785 reported by Jon Watte


This bug affects 2 people
Affects: lxcfs (Ubuntu)
Status: Incomplete
Importance: Undecided
Assigned to: Unassigned

Bug Description

During start-up, I get about 8 complaints that LXCFS is starting and stopping, until it goes red and says it failed.

The lxcfs.service unit journal contains a number of these:

Mar 18 21:12:36 ripper systemd[1]: lxcfs.service: Unit entered failed state.
Mar 18 21:12:36 ripper systemd[1]: lxcfs.service: Failed with result 'exit-code'.
Mar 18 21:12:36 ripper systemd[1]: lxcfs.service: Service hold-off time over, scheduling restart.
Mar 18 21:12:36 ripper systemd[1]: Stopped FUSE filesystem for LXC.
Mar 18 21:12:36 ripper systemd[1]: Started FUSE filesystem for LXC.
Mar 18 21:12:37 ripper lxcfs[1539]: hierarchies:
Mar 18 21:12:37 ripper lxcfs[1539]: 0: fd: 5: hugetlb
Mar 18 21:12:37 ripper lxcfs[1539]: 1: fd: 6: cpu,cpuacct
Mar 18 21:12:37 ripper lxcfs[1539]: 2: fd: 7: cpuset
Mar 18 21:12:37 ripper lxcfs[1539]: 3: fd: 8: blkio
Mar 18 21:12:37 ripper lxcfs[1539]: 4: fd: 9: rdma
Mar 18 21:12:37 ripper lxcfs[1539]: 5: fd: 10: pids
Mar 18 21:12:37 ripper lxcfs[1539]: 6: fd: 11: memory
Mar 18 21:12:37 ripper lxcfs[1539]: 7: fd: 12: freezer
Mar 18 21:12:37 ripper lxcfs[1539]: 8: fd: 13: perf_event
Mar 18 21:12:37 ripper lxcfs[1539]: 9: fd: 14: net_cls,net_prio
Mar 18 21:12:37 ripper lxcfs[1539]: 10: fd: 15: devices
Mar 18 21:12:37 ripper lxcfs[1539]: 11: fd: 16: name=systemd
Mar 18 21:12:37 ripper lxcfs[1539]: 12: fd: 17: unified
Mar 18 21:12:37 ripper lxcfs[1539]: fuse: mountpoint is not empty
Mar 18 21:12:37 ripper lxcfs[1539]: fuse: if you are sure this is safe, use the 'nonempty' mount option
Mar 18 21:12:37 ripper systemd[1]: lxcfs.service: Main process exited, code=exited, status=1/FAILURE
Mar 18 21:12:37 ripper fusermount[1580]: /bin/fusermount: failed to unmount /var/lib/lxcfs: Invalid argument
Mar 18 21:12:37 ripper systemd[1]: lxcfs.service: Unit entered failed state.

Looking at /var/lib/lxcfs, it does, in fact, contain two directories, "cgroup" and "proc".
So, either:
1) this should only start once, but it is being started multiple times, or
2) this directory should be empty, but the script fails to make it so.

ls -la /var/lib/lxcfs shows:

[21:26] jwatte@ripper:~$ ls -la /var/lib/lxcfs/
total 16
drwxr-xr-x 4 root root 4096 Aug 26 2017 ./
drwxr-xr-x 69 root root 4096 Mar 18 21:10 ../
drwxr-xr-x 13 root root 4096 Aug 26 2017 cgroup/
dr-xr-xr-x 2 root root 4096 Aug 26 2017 proc/

So, something somewhere made this directory non-empty, and now it fails every time on start. (I sure didn't put anything here!) August 26 matches when I first installed this system, at the time running 17.04.

I recommend making the startup script simply nuke the contents of this directory.

ProblemType: Bug
DistroRelease: Ubuntu 17.10
Package: lxcfs 2.0.8-0ubuntu1~17.10.2
ProcVersionSignature: Ubuntu 4.13.0-37.42-generic 4.13.13
Uname: Linux 4.13.0-37-generic x86_64
NonfreeKernelModules: nvidia_modeset nvidia
ApportVersion: 2.20.7-0ubuntu3.7
Architecture: amd64
CurrentDesktop: ubuntu:GNOME
Date: Sun Mar 18 21:23:52 2018
InstallationDate: Installed on 2017-08-26 (204 days ago)
InstallationMedia: Ubuntu-Server 17.04 "Zesty Zapus" - Release amd64 (20170412)
SourcePackage: lxcfs
UpgradeStatus: Upgraded to artful on 2018-02-05 (41 days ago)

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Thanks for filing this bug in Ubuntu.

I don't think it's safe for the service simply to nuke the contents of /var/lib/lxcfs prior to mounting/starting.

It does sound like a local configuration problem, or maybe a bug that existed in 17.04 (which is what you first installed). It will be hard to reproduce now if it depends on 17.04, as that release is end-of-life.

In your case, since the contents are cgroup and proc with the fuse system NOT MOUNTED (please make sure), you can delete those and then restart the service.

Something like this:

```shell
sudo systemctl stop lxcfs.service

# confirm with mount that it is unmounted
mount -t fuse.lxcfs

# delete or move away the contents of /var/lib/lxcfs, your choice

# start the service
sudo systemctl start lxcfs.service

# confirm it's mounted
mount -t fuse.lxcfs
```

If you come across a way to reproduce the problem from the beginning, like a fresh install, or an upgrade from a previous release of ubuntu, then please update this bug. In the meantime, I'll mark it as incomplete.

Changed in lxcfs (Ubuntu):
status: New → Incomplete
Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for lxcfs (Ubuntu) because there has been no activity for 60 days.]

Changed in lxcfs (Ubuntu):
status: Incomplete → Expired
Revision history for this message
Felipe Santos (felipecassiors) wrote :

I've got to say I stumble upon this issue every now and then. And in my case, the /var/lib/lxcfs directory is dedicated to lxcfs.

I don't know how it ends up populated with files, but every time I just need to delete the folder's contents to make it work again.

Given that the directory is exclusive to lxcfs, I totally agree it would be nice if lxcfs had an option like --clean-up-if-not-empty, so that I don't need to intervene manually every time this issue happens, for whatever reason.
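No such lxcfs option exists today, but the proposed behaviour can be approximated locally with a systemd drop-in. A sketch follows; the drop-in path and the guard command are my assumptions, not anything shipped by the package. It deletes only empty leftover directory trees, and only when the path is not currently a live mountpoint:

```ini
# /etc/systemd/system/lxcfs.service.d/clean-mountpoint.conf (hypothetical drop-in)
[Service]
# Skip cleanup entirely if /var/lib/lxcfs is already a live FUSE mount;
# otherwise delete leftover empty directory trees before lxcfs starts.
ExecStartPre=/bin/sh -c 'mountpoint -q /var/lib/lxcfs || find /var/lib/lxcfs -mindepth 1 -type d -empty -delete'
```

After creating the file, run `sudo systemctl daemon-reload` before restarting the service. Restricting the cleanup to empty directories keeps it from destroying real data if something unexpected lands in the path.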

Revision history for this message
Athos Ribeiro (athos-ribeiro) wrote :

Hi Felipe,

Thanks for reporting it.

I will reset this as incomplete based on the last comment from Andreas:

> If you come across a way to reproduce the problem from the beginning, like a fresh install, or an upgrade from a previous release of ubuntu, then please update this bug. In the meantime, I'll mark it as incomplete.

Would you be able to provide such reproducer?

Changed in lxcfs (Ubuntu):
status: Expired → Incomplete
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Hi Felipe,

Can you report which files you are finding under /var/lib/lxcfs?

The original report was specifically about the cgroup and proc directories, which was suspicious, as they are also the directories which lxcfs creates.

Revision history for this message
Felipe Santos (felipecassiors) wrote :

I have no way to reproduce this issue reliably. Next time I notice it, I will make sure to get the file list before deleting them. It can take days or months. :(

For example, I wonder if this issue happens when the OS crashes and reboots ungracefully, or when the VM goes down abruptly for whatever reason and doesn't give lxcfs the chance to clean up its stuff.

But since it's a mountpoint, I expect that lxcfs would in fact never write anything to /var/lib/lxcfs (before it's mounted).

So, I'm afraid I can't provide any more clues to help fix this issue at this point (but will update if I get more data). I just wanted to cast my support for the initial suggestion of "nuking" the directory in case it's not empty.
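Before deleting anything, it helps to confirm the directory really is stale leftovers rather than a live mount. A small sketch along those lines (the function name and the output labels are mine, not part of lxcfs):

```shell
# Sketch: classify the state of an lxcfs mountpoint directory before
# deciding whether it is safe to clean it up.
check_lxcfs_dir() {
    dir="$1"
    if mountpoint -q "$dir" 2>/dev/null; then
        echo "mounted"   # live FUSE mount: do not touch
    elif [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
        echo "stale"     # unmounted but non-empty: leftovers, e.g. from a crash
    elif [ -d "$dir" ]; then
        echo "empty"     # ready for lxcfs to mount
    else
        echo "missing"
    fi
}

check_lxcfs_dir /var/lib/lxcfs
```

`mountpoint -q` exits 0 only for an active mount, so "stale" here means the FUSE filesystem is not mounted but leftover entries are present.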

Revision history for this message
Mitchell Dzurick (mitchdz) wrote :

Thanks Felipe. This bug will remain in an incomplete state then, and will expire in 53 days. Please give us an update if you have any new information, or leads on this issue.

summary: - lxcfs startup failues because fuse directory not empty
+ lxcfs startup failures because fuse directory not empty
Revision history for this message
Felipe Santos (felipecassiors) wrote :

OK, the issue happened again in one of my VMs. It's currently in the state where the lxcfs process can't start because the folder isn't empty.

Here is the file list:

```shell-session
$ tree /var/lib/lxcfs-on-kubernetes
/var/lib/lxcfs-on-kubernetes
├── proc
│   ├── cpuinfo
│   ├── diskstats
│   ├── loadavg
│   ├── meminfo
│   ├── stat
│   ├── swaps
│   └── uptime
└── sys
    └── devices
        └── system
            └── cpu
                └── online

13 directories, 0 files
```

Every single item there seems to be an empty directory, like this one:

```shell-session
$ stat /var/lib/lxcfs-on-kubernetes/sys/devices/system/cpu/online
  File: /var/lib/lxcfs-on-kubernetes/sys/devices/system/cpu/online
  Size: 4096 Blocks: 8 IO Block: 4096 directory
Device: fd00h/64768d Inode: 11687079 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2024-08-21 03:58:24.794466255 +0200
Modify: 2024-08-19 18:00:11.223066540 +0200
Change: 2024-08-19 18:00:11.223066540 +0200
 Birth: 2024-08-19 18:00:11.223066540 +0200
```

```shell-session
$ find /var/lib/lxcfs-on-kubernetes -type f

$ find /var/lib/lxcfs-on-kubernetes -type d
/var/lib/lxcfs-on-kubernetes
/var/lib/lxcfs-on-kubernetes/proc
/var/lib/lxcfs-on-kubernetes/proc/swaps
/var/lib/lxcfs-on-kubernetes/proc/cpuinfo
/var/lib/lxcfs-on-kubernetes/proc/uptime
/var/lib/lxcfs-on-kubernetes/proc/loadavg
/var/lib/lxcfs-on-kubernetes/proc/meminfo
/var/lib/lxcfs-on-kubernetes/proc/diskstats
/var/lib/lxcfs-on-kubernetes/proc/stat
/var/lib/lxcfs-on-kubernetes/sys
/var/lib/lxcfs-on-kubernetes/sys/devices
/var/lib/lxcfs-on-kubernetes/sys/devices/system
/var/lib/lxcfs-on-kubernetes/sys/devices/system/cpu
/var/lib/lxcfs-on-kubernetes/sys/devices/system/cpu/online
```
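One hypothesis consistent with that listing (an assumption on my part, not a confirmed cause): container runtimes create missing bind-mount source paths as directories, so if something bind-mounts lxcfs's virtual files while the FUSE filesystem is unmounted, the backing directory ends up looking exactly like the state above, i.e. empty directories named after regular files. A minimal sketch of that state:

```shell
# Recreate the observed on-disk state: mkdir -p on paths that should be
# regular files (as a runtime resolving bind-mount sources would do)
# yields a tree of empty directories and no regular files at all.
tmp=$(mktemp -d)
for f in proc/cpuinfo proc/meminfo proc/uptime sys/devices/system/cpu/online; do
    mkdir -p "$tmp/$f"
done
find "$tmp" -type f | wc -l          # no regular files, only directories
find "$tmp" -mindepth 1 -type d | sort
```

If that is the mechanism, the leftovers would appear precisely after an unclean stop, which matches the earlier guess about crashes and abrupt VM shutdowns.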
