[MIR] lxcfs

Bug #1413405 reported by Stéphane Graber
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
lxcfs (Ubuntu)
Fix Released
Undecided
Serge Hallyn

Bug Description

lxcfs is a dependency of the LXC for containers running systemd.
As a result, it is required to run modern Ubuntu or Debian (or any systemd powered distro) within LXC

Availability: currently in universe (version 0.3)
Rationale: required for systemd containers in LXC
Security: no open bugs in LP, actively maintained, no CVE. Ships a daemon running as root, mounts a fuse filesystem and uses cgmanager for access checks.
Quality: actively maintained upstream, though very young project
Dependencies: all depends are in main
Standards: compliant
Maintenance: actively maintained by upstream

The ~ubuntu-lxc team is subscribed to bug mail for this package.

Michael Terry (mterry)
Changed in lxcfs (Ubuntu):
assignee: nobody → Jamie Strandboge (jdstrand)
Changed in lxcfs (Ubuntu):
assignee: Jamie Strandboge (jdstrand) → Seth Arnold (seth-arnold)
Tyler Hicks (tyhicks)
Changed in lxcfs (Ubuntu):
assignee: Seth Arnold (seth-arnold) → Tyler Hicks (tyhicks)
status: New → In Progress
Revision history for this message
Tyler Hicks (tyhicks) wrote :

I've reviewed lxcfs 0.7-0ubuntu2 in Vivid. Due to time-constraints, this should not be considered a full audit. I focused on common mistakes in C and any interesting pieces of code that caught my eye. Here are some of the more important pieces from my notes:

 * lxcfs provides "A cgroupfs-like tree which is container aware and works using CGManager." and "A set of files which can be bind-mounted over their /proc originals to provide CGroup-aware values."

* No CVE history (the project is very young)

* Minimal build deps (libcmanager and libfuse are the only notables)

* lxcfs is a root owned fuse daemon
  - Uses libfuse to daemonize

* Test suite consists of two simple tests (one for the cgroup subdir and one for the proc subdir) that are packaged as autopkgtests

* Packaging is clean and simple

* The build is clean

There is one issue that could be addressed:

* Technically, the memory pointed to by the 'd' pointer is never freed from
   main() in lxcfs.c. This is not an issue in practice but would be nice to
   silence the warning from cppcheck and probably other checkers.

There is one issue that must be addressed:

* In many of the lxcfs_ops functions, the matching of the /cgroup path
   component is a little off. The strncmp() is limited to only the first 7 chars
   and then there's nothing in pick_controller_from_path() verifying that the
   8th char is a '/'. This results in "/cgroup@freezer/a/b" being treated as a
   valid path.

There is one open question that I have:

* In pid_to_ns_wrapper(), you access /proc/<PID>/ns/pid, where <PID> comes from
   the struct fuse_context that is initially passed into lxcfs_read(). Is that
   process pinned for the lifetime of the lxcfs_read() or could it be recycled
   in the middle of the lxcfs_read()? We need to be sure that
   pid_to_ns_wrapper() is not accessing the ns of a recycled pid.

Once the "/cgroup" strncmp() matching issue is fixed and pid_to_ns_wrapper() is deemed safe, lxcfs gets a Security Team ack for main. It is a complex solution for a complicated problem but I'm very confident that the containers team will quickly address any issues discovered in the future. Thanks!

Changed in lxcfs (Ubuntu):
assignee: Tyler Hicks (tyhicks) → Stéphane Graber (stgraber)
Changed in lxcfs (Ubuntu):
assignee: Stéphane Graber (stgraber) → Serge Hallyn (serge-hallyn)
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Thanks for hte review, Tyler. I've pushed a new package freeing the fuse data and ensuring that "cgroup" and the controller are separated by '/'.

The pid passed to pid_to_ns_wrapper() comes from the fuse context - it is the pid of 'current' which is doing the read or write operation. If that task were to exit, the operation would presumably be aborted. I have not yet completely verified (in the kernel and fuse sources) or tested that killing the writing task is not possible (or that the fuse operation would be canceled). If it were not that would appear to be a serious kernel bug which should be fixed there. This should be test-able with a small fuse library which sleeps during write, and a pair of sibling tasks where one writes while the other kills the first. I can test that on monday, but probably not before. Note that in this case I would consider that a serious kernel bug - not a bug in lxcfs. The uid/gid/pid of the acting task is (AIUI) meant to be trusted.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxcfs - 0.7-0ubuntu3

---------------
lxcfs (0.7-0ubuntu3) vivid; urgency=medium

  * 0002-Make-sure-that-that-cgroup-and-the-controller-are-se.patch: when
    looking a controller under /cgroup, make sure that '/' separates them,
    so that '/cgroup@freezer' is not a valid path. (LP: #1413405)
  * 0003-free-d-at-program-end.patch: free the lxcfs_state before exiting.
 -- Serge Hallyn <email address hidden> Sun, 19 Apr 2015 08:44:06 -0500

Changed in lxcfs (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Stéphane Graber (stgraber) wrote :

Re-opened and re-assigned to Tyler.

Changed in lxcfs (Ubuntu):
status: Fix Released → Triaged
assignee: Serge Hallyn (serge-hallyn) → Tyler Hicks (tyhicks)
Revision history for this message
Tyler Hicks (tyhicks) wrote :

Hi Serge - Thanks for confirming that the pid that pid_to_ns_wrapper() is using is the pid of 'current'. With that being the case, I can't see how a race condition would occur but a test would be nice if you can find the time.

The 0003-free-d-at-program-end.patch patch isn't quite right since 'd' is not freed when cgm_get_controllers() fails. cppcheck still complains about the leak. I'm not concerned about it but thought I'd mention it.

However, I did spot two new things while (re-)reviewing the new lxcfs package:

1) The sscanf() in proc_diskstats_read() doesn't place a limit on the length of the string that is copied into the dev_name buffer. This could result in a buffer overflow if the device name exceeds 72 characters (I'm not sure if that's actually possible). The sscanf man page says, "String input conversions store a terminating null byte ('\0') to mark the end of the input; the maximum field width does not include this terminator." That means that the format string should be "%u %u %71s".

2) The sprintf(fnam, ...) calls in pid_to_ns_wrapper(), pid_from_ns_wrapper(), and get_pid1_time() should be changed to snprintf(fname, sizeof(fnam), ...). There's no way to overflow those buffers today but it'll prevent future issues if someone decreases the size of the fnam buffers in the future.

I trust that you (or Stéphane) will fix the issues mentioned above correctly so, in interest of time, I don't need to re-review your fixes before lxcfs is promoted. Thanks! :)

Changed in lxcfs (Ubuntu):
status: Triaged → In Progress
assignee: Tyler Hicks (tyhicks) → Serge Hallyn (serge-hallyn)
Revision history for this message
Serge Hallyn (serge-hallyn) wrote :

Hi Tyler,

I just confirmed that a root task doing a read on the fuse fs, if another root task kills it (SIGTERM or SIGKILL), does not in fact go away (wait doesn't return) until after the fuse hook returns.

I'll push another package fixing the other findings, thanks again.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxcfs - 0.7-0ubuntu4

---------------
lxcfs (0.7-0ubuntu4) vivid; urgency=medium

  * Add some more sanity checks (LP: #1413405)
 -- Serge Hallyn <email address hidden> Mon, 20 Apr 2015 08:42:18 -0500

Changed in lxcfs (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Stéphane Graber (stgraber) wrote :

Promoted 0ubuntu4 to main.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.