segfault at 0 ip 00007fe70ae4e3b2 sp 00007fe70884fb70 error 4 in liblxcfs.so[7fe70ae46000+f000]

Bug #1807628 reported by Haw Loeung
This bug affects 2 people
Affects         Status        Importance  Assigned to        Milestone
lxcfs (Ubuntu)  Fix Released  Undecided   Christian Brauner
Bionic          Fix Released  Undecided   Kellen Renshaw
Focal           Fix Released  Undecided   Unassigned
Jammy           Fix Released  Undecided   Unassigned
Kinetic         Fix Released  Undecided   Christian Brauner

Bug Description

SRU template
[Impact]

 * lxcfs on Bionic will segfault if there are no non-directory files in a cgroup. This necessitates restarting running containers.

[Test Plan]

 * Install lxcfs on an Ubuntu Bionic machine. "sudo apt install lxcfs"
 * Open 3 terminals to the machine, each with a root prompt.
 * Prepare a mount directory in terminal 1:
   mkdir /mnt/lxcfs
 * In terminal 1, execute:
   while true ; do mkdir /sys/fs/cgroup/systemd/test ; rmdir /sys/fs/cgroup/systemd/test ; done
 * In terminal 2, execute:
   lxcfs -p /tmp/lxcfs.pid /mnt/lxcfs
 * In terminal 3, execute:
   while true; do ls /mnt/lxcfs/cgroup/name\=systemd/test > /dev/null ;done
 * Segfault should not occur with patched version.

[Where problems could occur]

 * Correcting the null pointer dereference could allow previously undetected bugs masked by the segfault to be encountered.

[Other Info]

 * Proposed fix is upstream since version 3.0.4 with no negative impacts.
 * Proposed fix is a minimal cherry-pick of the fix, without other functional changes.

###
Original bug text
###

Hi,

lxcfs crashed earlier today requiring us to restart a bunch of LXC containers. I'm not able to upload using apport-bug but here's the attached crash report.

I commented on https://github.com/lxc/lxcfs/issues/73#issuecomment-445598111 and am repeating what I said here:

| Dec 8 06:25:03 orlo kernel: [25247258.665022] lxcfs[3871]: segfault at 0 ip 00007fe70ae4e3b2 sp 00007fe70884fb70 error 4 in liblxcfs.so[7fe70ae46000+f000]
| Dec 8 06:25:09 orlo systemd[1]: lxcfs.service: Main process exited, code=killed, status=11/SEGV
| Dec 8 06:25:09 orlo systemd[1]: lxcfs.service: Unit entered failed state.
| Dec 8 06:25:09 orlo systemd[1]: lxcfs.service: Failed with result 'signal'.
| Dec 8 06:25:10 orlo systemd[1]: lxcfs.service: Service hold-off time over, scheduling restart.
| Dec 8 06:25:10 orlo lxcfs[10839]: hierarchies:
| Dec 8 06:25:10 orlo lxcfs[10839]: 0: fd: 5: perf_event
| Dec 8 06:25:10 orlo lxcfs[10839]: 1: fd: 6: blkio
| Dec 8 06:25:10 orlo lxcfs[10839]: 2: fd: 7: freezer
| Dec 8 06:25:10 orlo lxcfs[10839]: 3: fd: 8: devices
| Dec 8 06:25:10 orlo lxcfs[10839]: 4: fd: 9: cpuset
| Dec 8 06:25:10 orlo lxcfs[10839]: 5: fd: 10: cpu,cpuacct
| Dec 8 06:25:10 orlo lxcfs[10839]: 6: fd: 11: pids
| Dec 8 06:25:10 orlo lxcfs[10839]: 7: fd: 12: memory
| Dec 8 06:25:10 orlo lxcfs[10839]: 8: fd: 13: net_cls,net_prio
| Dec 8 06:25:10 orlo lxcfs[10839]: 9: fd: 14: hugetlb
| Dec 8 06:25:10 orlo lxcfs[10839]: 10: fd: 15: name=systemd
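The kernel "segfault at" line quoted above can be decoded by hand. The following is a minimal sketch, assuming the x86 page-fault error-code layout (bit 0: protection fault, bit 1: write access, bit 2: user mode). The struct and function names are hypothetical and exist only for this illustration; they are not part of lxcfs or the kernel.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: fields copied by hand from a kernel
 * "segfault at ADDR ip IP sp SP error N in OBJ[BASE+SIZE]" line. */
struct segv_info {
    uint64_t addr;     /* faulting address ("segfault at 0") */
    uint64_t ip;       /* instruction pointer at the time of the fault */
    uint64_t map_base; /* start of the mapping, the BASE in OBJ[BASE+SIZE] */
    unsigned error;    /* x86 page-fault error code */
};

/* Offset of the faulting instruction relative to the mapping base. */
static uint64_t segv_offset(const struct segv_info *s)
{
    return s->ip - s->map_base;
}

/* Bit 0 set: protection violation; clear: page was not present. */
static int segv_is_protection(const struct segv_info *s)
{
    return (s->error & 0x1) != 0;
}

/* Bit 1 set: write access; clear: read access. */
static int segv_is_write(const struct segv_info *s)
{
    return (s->error & 0x2) != 0;
}

/* Bit 2 set: fault happened in user mode. */
static int segv_is_user(const struct segv_info *s)
{
    return (s->error & 0x4) != 0;
}

/* Print a one-line summary of the decoded fields. */
static void segv_describe(const struct segv_info *s)
{
    printf("%s-mode %s of 0x%llx (%s), ip at mapping offset 0x%llx\n",
           segv_is_user(s) ? "user" : "kernel",
           segv_is_write(s) ? "write" : "read",
           (unsigned long long)s->addr,
           segv_is_protection(s) ? "protection fault" : "page not present",
           (unsigned long long)segv_offset(s));
}
```

Filling the struct with the values from the log (addr 0, ip 0x00007fe70ae4e3b2, base 0x7fe70ae46000, error 4) gives a user-mode read of address 0 on a page that is not present, with the instruction pointer 0x83b2 bytes past the mapping base. That matches a NULL pointer read inside liblxcfs.so.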

So now after restarting it, the containers are showing this:

| Error: /proc must be mounted
| To mount /proc at boot you need an /etc/fstab line like:
| proc /proc proc defaults
| In the meantime, run "mount proc /proc -t proc"

Package version:

| ubuntu@orlo:~$ apt-cache policy lxcfs
| lxcfs:
| Installed: 2.0.8-0ubuntu1~16.04.2
| Candidate: 2.0.8-0ubuntu1~16.04.2
| Version table:
| 3.0.2-0ubuntu1~16.04.1 100
| 100 http://archive.ubuntu.com/ubuntu xenial-backports/main amd64 Packages
| *** 2.0.8-0ubuntu1~16.04.2 500
| 500 http://archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
| 100 /var/lib/dpkg/status
| 2.0.0-0ubuntu2 500
| 500 http://archive.ubuntu.com/ubuntu xenial/main amd64 Packages

Revision history for this message
Haw Loeung (hloeung) wrote :
description: updated
Stéphane Graber (stgraber) wrote :

Trace:

Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x00007fe70ae4e3b2 in cg_readdir (path=<optimized out>, buf=0x7fe690003610, filler=0x7fe70b850ce0 <fill_dir>, offset=<optimized out>, fi=<optimized out>)
    at bindings.c:1793
1793 bindings.c: No such file or directory.
[Current thread is 1 (Thread 0x7fe708850700 (LWP 3871))]
(gdb) bt full
#0 0x00007fe70ae4e3b2 in cg_readdir (path=<optimized out>, buf=0x7fe690003610, filler=0x7fe70b850ce0 <fill_dir>, offset=<optimized out>, fi=<optimized out>)
    at bindings.c:1793
        d = 0x7fe690000940
        list = 0x0
        i = 0
        ret = <optimized out>
        nextcg = 0x0
        fc = 0x7fe6500240d0
        clist = 0x0
        __func__ = "cg_readdir"
        initpid = <optimized out>
#1 0x000055cfc579a411 in do_proc_readdir (fi=<optimized out>, offset=<optimized out>, filler=<optimized out>, buf=<optimized out>, path=<optimized out>)
    at lxcfs.c:307
        proc_readdir = <optimized out>
        error = <optimized out>
#2 lxcfs_readdir (path=<optimized out>, buf=0x7fe690003610, filler=0x7fe70b850ce0 <fill_dir>, offset=0, fi=0x7fe70884fc80) at lxcfs.c:504
No locals.
#3 0x00007fe70b856232 in fuse_fs_readdir (fs=0x55cfc69e2fd0,
    path=0x7fe650024170 "/cgroup/name=systemd/lxc/juju-c5f7d5-1-lxd-1/user.slice/user-113.slice/session-463849.scope", buf=0x7fe690003610,
    filler=0x7fe70b850ce0 <fill_dir>, off=0, fi=0x7fe70884fc80) at fuse.c:2044
No locals.
#4 0x00007fe70b8563bc in readdir_fill (fi=0x7fe70884fc80, dh=0x7fe690003610, off=0, size=4096, ino=3274299, req=0x7fe650024ad0, f=0x55cfc69e2aa0)
    at fuse.c:3502
        d = {id = 472446402651, cond = {__data = {__lock = 0, __futex = 0, __total_seq = 0, __wakeup_seq = 532575944823, __woken_seq = 0,
              __mutex = 0x5c0b63bf, __nwaiters = 918218053, __broadcast_seq = 0},
            __size = '\000' <repeats 16 times>, "w\000\000\000|", '\000' <repeats 11 times>, "\277c\v\\\000\000\000\000E\345\272\066\000\000\000",
            __align = 0}, finished = 1544250303}
        path = 0x7fe650024170 "/cgroup/name=systemd/lxc/juju-c5f7d5-1-lxd-1/user.slice/user-113.slice/session-463849.scope"
        err = <optimized out>
#5 fuse_lib_readdir (req=0x7fe650024ad0, ino=3274299, size=4096, off=0, llfi=<optimized out>) at fuse.c:3528
        err = 0
        f = 0x55cfc69e2aa0
        fi = {flags = 0, fh_old = 140628235127104, writepage = 0, direct_io = 0, keep_cache = 0, flush = 0, nonseekable = 0, flock_release = 0, padding = 0,
          fh = 140628235127104, lock_owner = 0}
        dh = 0x7fe690003610
#6 0x00007fe70b85d0f6 in do_readdir (req=<optimized out>, nodeid=<optimized out>, inarg=<optimized out>) at fuse_lowlevel.c:1390
        arg = <optimized out>
        fi = {flags = 0, fh_old = 140628235138576, writepage = 0, direct_io = 0, keep_cache = 0, flush = 0, nonseekable = 0, flock_release = 0, padding = 0,
          fh = 140628235138576, lock_owner = 0}
#7 0x00007fe70b85e679 in fuse_ll_process_buf (data=0x55cfc69e3160, buf=0x7fe70884ff00, ch=<optimized out>) at fuse_lowlevel.c:2442
        f = 0x55cfc69e3160
        bufv = {count = 1, idx = 0, off = 0, buf = {{size = 80, flags = (unknown: 0), mem = ...


Changed in lxcfs (Ubuntu):
assignee: nobody → Christian Brauner (cbrauner)
Christian Brauner (cbrauner) wrote :

I sent a fix: https://github.com/lxc/lxcfs/pull/262

I think the issue is

 at bindings.c:1793
        d = 0x7fe690000940
        list = 0x0
        i = 0
        ret = <optimized out>
        nextcg = 0x0
        fc = 0x7fe6500240d0
        clist = 0x0
        __func__ = "cg_readdir"
        initpid = <optimized out>

list == NULL

The other variables are fine.
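The failure mode described here (a list pointer that is NULL while the loop index is still 0) can be illustrated with a small sketch. This is not the lxcfs source and not the upstream patch; `struct key_entry` and `emit_entries` are hypothetical names, and the only assumption is that the list is a NULL-terminated array of pointers that may itself be NULL when no files were found.

```c
#include <stddef.h>

/* Hypothetical stand-in for a cgroup file entry; not the lxcfs struct. */
struct key_entry {
    const char *name;
};

/*
 * Copy the names from a NULL-terminated array of entries into out[],
 * writing at most cap names, and return how many were written.
 *
 * The array pointer itself may be NULL when the directory held no
 * regular files. Without the guard below, evaluating list[0] would read
 * from address 0 and raise SIGSEGV.
 */
static size_t emit_entries(struct key_entry **list, const char **out, size_t cap)
{
    size_t n = 0;

    if (!list)
        return 0; /* empty listing: nothing to emit, but not an error */

    for (size_t i = 0; list[i] != NULL && n < cap; i++)
        out[n++] = list[i]->name;

    return n;
}
```

Calling `emit_entries(NULL, out, 4)` returns 0 instead of crashing, while a populated list such as `{&a, &b, NULL}` yields 2 names. Treating a NULL list as an empty directory is a design choice made for this sketch.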

Changed in lxcfs (Ubuntu):
status: New → In Progress
Haw Loeung (hloeung) wrote :

Any updates on this one? This hit us again (see latest internal incident report).

Changed in lxcfs (Ubuntu):
status: In Progress → Fix Released
Changed in lxcfs (Ubuntu Jammy):
status: New → Fix Released
Changed in lxcfs (Ubuntu Focal):
status: New → Fix Released
Changed in lxcfs (Ubuntu Bionic):
status: New → In Progress
assignee: nobody → Kellen Renshaw (krenshaw)
description: updated
description: updated
Kellen Renshaw (krenshaw) wrote :

Moved SRU info over from bug 1977870. Reproduction and debdiff (with newer, incorrect bug number in changelog) testing info is below:
Was able to successfully reproduce the issue on 3.0.3-0ubuntu1~18.04.2 on a Bionic VM. Updated the Test Plan with the reproduction commands to verify that the issue is fixed.

(gdb) bt full
#0 0x00007f162ca854ea in cg_readdir (path=<optimized out>, buf=0x7f1628004740, filler=0x7f162d4b1d00, offset=<optimized out>,
    fi=<optimized out>) at bindings.c:1800
        d = 0x7f1618001760
        list = 0x0
        i = 0
        ret = <optimized out>
        nextcg = 0x0
        fc = <optimized out>
        clist = 0x0
        __func__ = "cg_readdir"
        initpid = <optimized out>
#1 0x000055a4a90425c3 in ?? ()
No symbol table info available.
#2 0x00007f162d4b7292 in ?? ()
No symbol table info available.
#3 0x00007f1626ffcc00 in ?? ()
No symbol table info available.
#4 0x9cce25ca93392700 in ?? ()
No symbol table info available.
#5 0x00007f161c00a010 in ?? ()
No symbol table info available.
#6 0x9cce25ca93392700 in ?? ()
No symbol table info available.
#7 0x0000000000000000 in ?? ()
No symbol table info available.

[Reproduction]

 * Install lxcfs on an Ubuntu Bionic machine. "sudo apt install lxcfs"
 * Open 3 terminals to the machine, each with a root prompt.
 * Prepare a mount directory in terminal 1:
   mkdir /mnt/lxcfs
 * In terminal 1, execute:
   while true ; do mkdir /sys/fs/cgroup/systemd/test ; rmdir /sys/fs/cgroup/systemd/test ; done
 * In terminal 2, execute:
   lxcfs -p /tmp/lxcfs.pid /mnt/lxcfs
 * In terminal 3, execute:
   while true; do ls /mnt/lxcfs/cgroup/name\=systemd/test > /dev/null ;done
 * Segfault should occur within 1 minute.

[Testing of fix]
 Using package from PPA:
 https://launchpad.net/~krenshaw/+archive/ubuntu/lp1977870-lxcfs

 Created using debdiff from bug 1977870 and uploading the .changes file after debuild -S.

 The issue did not recur in several minutes of testing; the unpatched version fails within seconds.

tags: added: sts
Kellen Renshaw (krenshaw) wrote :

Adjusted the debdiff from bug 1977870; the only change is the LP bug number, which is now 1807628.

Kellen Renshaw (krenshaw) wrote :

The reproducer has been running for around 2 days with the patch from the debdiff applied. Previously this would segfault within seconds for me.

tags: added: patch
Sergio Durigan Junior (sergiodj) wrote :

Hi Kellen,

Apologies for taking so long to reply; I've been traveling and then on medical leave. Anyway, I've finally uploaded the changes (after taking the liberty of adjusting the Bug-Ubuntu DEP-3 header):

$ dput lxcfs_3.0.3-0ubuntu1~18.04.3_source.changes
Trying to upload package to ubuntu
Checking signature on .changes
gpg: /home/sergio/work/lxcfs/lxcfs_3.0.3-0ubuntu1~18.04.3_source.changes: Valid signature from 106DA1C8C3CBBF14
Checking signature on .dsc
gpg: /home/sergio/work/lxcfs/lxcfs_3.0.3-0ubuntu1~18.04.3.dsc: Valid signature from 106DA1C8C3CBBF14
Uploading to ubuntu (via ftp to upload.ubuntu.com):
  Uploading lxcfs_3.0.3-0ubuntu1~18.04.3.dsc: done.
  Uploading lxcfs_3.0.3-0ubuntu1~18.04.3.debian.tar.xz: done.
  Uploading lxcfs_3.0.3-0ubuntu1~18.04.3_source.buildinfo: done.
  Uploading lxcfs_3.0.3-0ubuntu1~18.04.3_source.changes: done.
Successfully uploaded packages.

Hopefully the SRU team will review & accept it soon.

Kellen Renshaw (krenshaw) wrote :

Hi Sergio,

Thanks for catching that and for uploading the package.

Robie Basak (racb) wrote : Please test proposed package

Hello Haw, or anyone else affected,

Accepted lxcfs into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/lxcfs/3.0.3-0ubuntu1~18.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in lxcfs (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
Kellen Renshaw (krenshaw) wrote :

Using documented test procedure, replicated segfault on lxcfs 3.0.3-0ubuntu1~18.04.2.

Enabled -proposed, performed test procedure on lxcfs 3.0.3-0ubuntu1~18.04.3, segfault did not occur.

### Testing for lxcfs 3.0.3-0ubuntu1~18.04.2:
root@z-rotomvm29:~# time lxcfs -p /tmp/lxcfs.pid /mnt/lxcfs
mount namespace: 5
hierarchies:
  0: fd: 6: cpuset
  1: fd: 7: cpu,cpuacct
  2: fd: 8: devices
  3: fd: 9: rdma
  4: fd: 10: perf_event
  5: fd: 11: hugetlb
  6: fd: 12: memory
  7: fd: 13: pids
  8: fd: 14: net_cls,net_prio
  9: fd: 15: freezer
 10: fd: 16: blkio
 11: fd: 17: name=systemd
 12: fd: 18: unified
bindings.c: 943: make_key_list_entry: Error getting files under name=systemd:test
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/cgroup.clone_children: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/tasks: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/notify_on_release: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/cgroup.clone_children: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/cgroup.clone_children: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/notify_on_release: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/cgroup.clone_children: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/tasks: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/notify_on_release: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/cgroup.clone_children: No such file or directory
bindings.c: 943: make_key_list_entry: Error getting files under name=systemd:test
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/notify_on_release: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/cgroup.clone_children: No such file or directory
bindings.c: 943: make_key_list_entry: Error getting files under name=systemd:test
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/tasks: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/notify_on_release: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/cgroup.clone_children: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/cgroup.procs: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/tasks: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/notify_on_release: No such file or directory
bindings.c: 803: cgfs_iterate_cgroup: Failed to stat test/cgroup.clone_children: No such file or directory
Segmentation fault (core dumped)

real 0m7.074s
user 0m0.050s
sys 0m0.058s

### Testing of lxcfs 3.0.3-0ubuntu1~18.04.3 package from -proposed:
root@z-rotomvm29:~# cat <<EOF >/etc/apt/sources.list.d/ubuntu-$(lsb_release -cs)-proposed.list
> # Enable Ubuntu proposed archive
> deb http://archive.ubuntu.com/ubuntu/ $(lsb_release -cs)-proposed restricted main multiverse universe
> EOF
root@z-rot...


tags: added: verification-done-bionic
removed: verification-needed-bionic
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxcfs - 3.0.3-0ubuntu1~18.04.3

---------------
lxcfs (3.0.3-0ubuntu1~18.04.3) bionic; urgency=medium

  * d/p/0002-bindings-prevent-NULL-pointer-dereference.patch: (LP: #1807628)
    - bindings: Prevent NULL pointer dereference.

 -- Kellen Renshaw <email address hidden> Tue, 07 Jun 2022 16:31:21 +0000

Changed in lxcfs (Ubuntu Bionic):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for lxcfs has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
