Ephemeral containers have "/rootfs" prefix in /proc/self/maps entries

Bug #959352 reported by Benji York on 2012-03-19
This bug affects 2 people
Affects            Importance  Assigned to
linux (Ubuntu)     Medium      Bryan Wu
  Precise          Undecided   Unassigned
  Quantal          Medium      Bryan Wu
lxc (Ubuntu)       High        Serge Hallyn
  Precise          High        Serge Hallyn
  Quantal          High        Serge Hallyn

Bug Description

===================================
SRU Justification (for lxc workaround)
1. Impact: /proc/self/maps and /proc/self/fd entries in ephemeral containers
   are prefixed with '/rootfs'. Software which uses these paths to find
   plugins or other files will break.
2. Development fix (workaround): The prepended paths lead to the root dentry
   of the overlayfs mount. So lxc-start-ephemeral is changed to make the
   container rootfs / the root of the overlay mount.
3. Stable fix: same as development fix.
4. Test case:
   sudo lxc-create -t ubuntu -n q1
   sudo lxc-start-ephemeral -o q1
   In another terminal, follow the instructions to open a console to the
   ephemeral container. Therein log in as ubuntu/ubuntu, and do:
      cat /proc/self/maps
      ls -l /proc/self/fd/
   and check whether entries are prefixed with '/rootfs'
5. Regression potential: Customized containers (especially those with
   custom-made directories under /var/lib/lxc/<container>) may break. If this
   becomes a problem we could place all of /var/lib/lxc/<container>-temp-XXXXX
   in another empty tmpfs; however, that is not free.
===================================
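
The manual check in step 4 of the test case could be scripted roughly as follows. This is only a sketch: the function name `prefixed_map_paths` is made up for illustration, and the sample lines are copied from the overlayfs output quoted later in this report.

```python
def prefixed_map_paths(maps_text, prefix="/rootfs"):
    """Return the distinct file paths in /proc/<pid>/maps text that start with prefix."""
    hits = []
    for line in maps_text.splitlines():
        # Fields: address perms offset dev inode [pathname]
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no pathname column
        path = fields[5].strip()
        if (path == prefix or path.startswith(prefix + "/")) and path not in hits:
            hits.append(path)
    return hits

sample = """\
08048000-08054000 r-xp 00000000 08:12 414657 /rootfs/bin/cat
09081000-090a2000 rw-p 00000000 00:00 0 [heap]
f7586000-f7587000 rw-p 00000000 00:00 0
f7587000-f76da000 r-xp 00000000 08:12 395454 /rootfs/lib/tls/i686/cmov/libc-2.11.1.so
"""
print(prefixed_map_paths(sample))
```

Inside a container, the same function could be fed `open("/proc/self/maps").read()`; an empty list would suggest the prefix is gone.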
Ephemeral containers (but not non-ephemeral ones) have all of their /proc/*/maps entries prefixed with "/rootfs". One problem this causes is that graphviz uses /proc/self/maps to locate its plugins, so some of the plugins cannot be loaded.

To reproduce the problem with dot, run this command:

    dot -Tcmapx < /dev/null

No output is expected; however, because of the bug, this output is produced:

    Format: "cmapx" not recognized. Use one of: dia hpgl mif mp pcl pic vtx

A workaround for the problem with graphviz is to make the plugins available at the path it is expecting:

    mkdir -p /rootfs/usr/lib
    ln -s /usr/lib/graphviz /rootfs/usr/lib/graphviz
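
To illustrate why the extra prefix matters, here is a rough sketch of how a program might derive a plugin directory from its own maps entry. Everything in it is an assumption made for illustration: the marker "/libgvc.", the lookup rule, and the sample maps lines are hypothetical and are not taken from the graphviz source.

```python
import posixpath

def guess_plugin_dir(maps_text, lib_marker="/libgvc.", subdir="graphviz"):
    """Hypothetical lookup: find the mapped library, return <its directory>/<subdir>."""
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) == 6 and lib_marker in fields[5]:
            lib_path = fields[5].strip()
            return posixpath.join(posixpath.dirname(lib_path), subdir)
    return None

# Made-up maps lines, differing only in the "/rootfs" prefix.
normal = "b7500000-b7580000 r-xp 00000000 08:01 123456 /usr/lib/libgvc.so.5\n"
broken = "b7500000-b7580000 r-xp 00000000 08:01 123456 /rootfs/usr/lib/libgvc.so.5\n"

print(guess_plugin_dir(normal))  # /usr/lib/graphviz
print(guess_plugin_dir(broken))  # /rootfs/usr/lib/graphviz
```

Under these assumptions the second result is exactly the path that the symlink workaround above makes available.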
---
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
ApportVersion: 1.95-0ubuntu1
Architecture: i386
DistroRelease: Ubuntu 12.04
HibernationDevice: RESUME=UUID=2c5f282a-e713-4ae0-a940-87a40efd050f
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Release i386 (20110427.1)
MachineType: LENOVO 4313CTO
Package: lxc
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-20-generic-pae root=UUID=8469c78f-d0fc-4564-a009-eed59bd1fdff ro quiet splash vt.handoff=7
ProcVersionSignature: Ubuntu 3.2.0-20.33-generic-pae 3.2.12
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
StagingDrivers: mei
Tags: precise staging
Uname: Linux 3.2.0-20-generic-pae i686
UpgradeStatus: Upgraded to precise on 2012-01-24 (63 days ago)
UserGroups: adm admin cdrom dialout libvirtd lpadmin plugdev sambashare
dmi.bios.date: 10/26/2010
dmi.bios.vendor: LENOVO
dmi.bios.version: 6MET81WW (1.41 )
dmi.board.name: 4313CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr6MET81WW(1.41):bd10/26/2010:svnLENOVO:pn4313CTO:pvrThinkPadT510:rvnLENOVO:rn4313CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 4313CTO
dmi.product.version: ThinkPad T510
dmi.sys.vendor: LENOVO

Serge Hallyn (serge-hallyn) wrote :

Thanks. That's curious. I'm marking this low priority as it has a workaround; please raise it if you feel that's appropriate.

Changed in lxc (Ubuntu):
importance: Undecided → Low
Gary Poster (gary) wrote :

We are discovering that this, or something very similar to it, is serious. For our own use, at least, I'd raise this to high (we don't have privileges to do so).

According to lsof, many, many processes (sshd, upstart, init, ntpd, etc., as well as processes we start directly such as xvfb-run and our own code) are looking in the wrong place (/rootfs/...) for important and sometimes essential files, including /rootfs/dev/random, /rootfs/dev/null, /rootfs/lib/tls ... , /rootfs/lib/security... , /rootfs/lib/libdbus-1.so.3.4.0, /rootfs/lib/libcom_err.so.2.1, /rootfs/usr/lib/libkrb5.so.3.3, and so on.

We could maybe make symlinks for /rootfs/usr, /rootfs/lib, and /rootfs/dev but that feels like we're going too far, and that we really ought to address this.

I'm not sure where to look to fix this. Is there something else important we are missing from lxc-clone? Help would be very appreciated.

Thank you
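
For anyone wanting to cross-check the lsof observation from inside a container, something like the following might help. It is a sketch only: the helper names are invented, and the demo uses a fake existence check so that it does not depend on the machine it runs on.

```python
import os

def read_fd_targets(pid="self"):
    """Map fd number -> symlink target as listed under /proc/<pid>/fd (Linux only)."""
    fd_dir = "/proc/{}/fd".format(pid)
    targets = {}
    for name in os.listdir(fd_dir):
        try:
            targets[name] = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            continue  # the fd was closed between listdir() and readlink()
    return targets

def unresolvable(targets, exists=os.path.exists):
    """Return the absolute-path targets that cannot be found by name."""
    paths = {t for t in targets.values() if t.startswith("/")}
    return sorted(p for p in paths if not exists(p))

# Demo with hand-written targets and a stand-in for os.path.exists.
demo = {"0": "/rootfs/dev/null", "1": "/dev/null", "3": "socket:[12345]"}
fake_exists = lambda path: not path.startswith("/rootfs/")
print(unresolvable(demo, exists=fake_exists))  # ['/rootfs/dev/null']
```

On a real system, `unresolvable(read_fd_targets())` would list the open files of the current process whose reported names do not resolve.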

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in lxc (Ubuntu):
status: New → Confirmed
Serge Hallyn (serge-hallyn) wrote :

Gary,

you were able to create aufs ephemeral containers with lp:~serge-hallyn/ubuntu/precise/lxc/lxcshutdownv2. Can you create one and check /proc/self/mounts? I'm wondering whether this is only with overlayfs.

Changed in lxc (Ubuntu):
importance: Low → High
Gary Poster (gary) wrote :

Hi Serge. Thanks for looking at this.

On an aufs-mounted ephemeral:

gary@lpdev-temp-tKJg9Ss:~$ cat /proc/self/maps
08048000-08054000 r-xp 00000000 08:12 414657 /var/lib/lxc/lpdev/rootfs/bin/cat
08054000-08055000 r--p 0000b000 08:12 414657 /var/lib/lxc/lpdev/rootfs/bin/cat
08055000-08056000 rw-p 0000c000 08:12 414657 /var/lib/lxc/lpdev/rootfs/bin/cat
09aba000-09adb000 rw-p 00000000 00:00 0 [heap]
f7629000-f762a000 rw-p 00000000 00:00 0
f762a000-f777d000 r-xp 00000000 08:12 395454 /var/lib/lxc/lpdev/rootfs/lib/tls/i686/cmov/libc-2.11.1.so
f777d000-f777f000 r--p 00153000 08:12 395454 /var/lib/lxc/lpdev/rootfs/lib/tls/i686/cmov/libc-2.11.1.so
f777f000-f7780000 rw-p 00155000 08:12 395454 /var/lib/lxc/lpdev/rootfs/lib/tls/i686/cmov/libc-2.11.1.so
f7780000-f7783000 rw-p 00000000 00:00 0
f778d000-f778f000 rw-p 00000000 00:00 0
f778f000-f7790000 r-xp 00000000 00:00 0 [vdso]
f7790000-f77ab000 r-xp 00000000 08:12 393068 /var/lib/lxc/lpdev/rootfs/lib/ld-2.11.1.so
f77ab000-f77ac000 r--p 0001a000 08:12 393068 /var/lib/lxc/lpdev/rootfs/lib/ld-2.11.1.so
f77ac000-f77ad000 rw-p 0001b000 08:12 393068 /var/lib/lxc/lpdev/rootfs/lib/ld-2.11.1.so
ffd7f000-ffda0000 rw-p 00000000 00:00 0 [stack]

gary@lpdev-temp-tKJg9Ss:~$ cat /proc/self/mounts
rootfs / rootfs rw 0 0
none / aufs rw,relatime,si=bf91329ce378bd10,noplink 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
sysfs /sys sysfs rw,relatime 0 0
none /home/gary aufs rw,relatime,si=bf91329dec9efd10,noplink 0 0
devpts /dev/console devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
devpts /dev/tty1 devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
devpts /dev/tty2 devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
devpts /dev/tty3 devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
devpts /dev/tty4 devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
devpts /dev/pts devpts rw,relatime,mode=600,ptmxmode=666 0 0
devpts /dev/ptmx devpts rw,relatime,mode=600,ptmxmode=666 0 0
none /lib/init/fstab aufs rw,relatime,si=bf91329ce378bd10,noplink 0 0

On an overlayfs-mounted ephemeral:

gary@lpdev-temp-P4zdc8P:~$ cat /proc/self/maps
08048000-08054000 r-xp 00000000 08:12 414657 /rootfs/bin/cat
08054000-08055000 r--p 0000b000 08:12 414657 /rootfs/bin/cat
08055000-08056000 rw-p 0000c000 08:12 414657 /rootfs/bin/cat
09081000-090a2000 rw-p 00000000 00:00 0 [heap]
f7586000-f7587000 rw-p 00000000 00:00 0
f7587000-f76da000 r-xp 00000000 08:12 395454 /rootfs/lib/tls/i686/cmov/libc-2.11.1.so
f76da000-f76dc000 r--p 00153000 08:12 395454 /rootfs/lib/tls/i686/cmov/libc-2.11.1.so
f76dc000-f76dd000 rw-p 00155000 08:12 395454 ...


Serge Hallyn (serge-hallyn) wrote :

This may be a kernel regression.

Using an aufs ephemeral container with the new precise lxc package, on oneiric kernel and userspace, doesn't show the /var/lib/lxc/<container>/rootfs prefix.

(I can't test overlayfs on oneiric kernel as it's not supported there)

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 959352

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

apport information

tags: added: apport-collected precise staging
description: updated

Gary Poster (gary) on 2012-03-28
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.3 kernel[1] (not a kernel in the daily directory). Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[1] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.3-precise/

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: needs-upstream-testing
Gary Poster (gary) wrote :

I have duped the problem Benji reported, and did the kernel test in order to let him work on other tasks.

The given kernel (reported in /proc/version as "Linux version 3.3.0-030300-generic (apw@gomeisa) (gcc version 4.4.3 (Ubuntu 4.4.3-4ubuntu5.1) ) #201203182135 SMP Mon Mar 19 01:36:20 UTC 2012") does not appear to have overlayfs or aufs compiled in, so I was unable to test as requested (http://pastebin.ubuntu.com/905562/). I'm happy to try again if given updated instructions.

tags: added: kernel-unable-to-test-upstream

Thank you for taking the time to file a bug report on this issue.

However, given the number of bugs that the Kernel Team receives during any development cycle it is impossible for us to review them all. Therefore, we occasionally resort to using automated bots to request further testing. This is such a request.

We have noted that there is a newer version of the development kernel than the one you last tested when this issue was found. Please test again with the newer kernel and indicate in the bug if this issue still exists or not.

You can update to the latest development kernel by simply running the following commands in a terminal window:

    sudo apt-get update
    sudo apt-get dist-upgrade

If the bug still exists, change the bug status from Incomplete to Confirmed. If the bug no longer exists, change the bug status from Incomplete to Fix Released.

If you want this bot to quit automatically requesting kernel tests, add a tag named: bot-stop-nagging.

 Thank you for your help, we really do appreciate it.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-request-3.2.0-21.34
Stéphane Graber (stgraber) wrote :

Added bot-stop-nagging. AFAIK aufs and overlayfs are SAUCE changes to the kernel, so it's pretty unlikely that any upstream kernel testing would work. This really needs to be looked into by a kernel team member.

tags: added: bot-stop-nagging
Changed in linux (Ubuntu):
status: Incomplete → Confirmed

It would be helpful if anyone could confirm if this issue remains with the latest Quantal 3.5.0-3.3 kernel. We do have the Quantal kernel available for testing in Precise via the q-lts-backport PPA:

https://launchpad.net/~ubuntu-x-swat/+archive/q-lts-backport

sudo add-apt-repository ppa:ubuntu-x-swat/q-lts-backport
sudo apt-get update
sudo apt-get install linux-image-generic-lts-quantal linux-headers-generic-lts-quantal

Stéphane Graber (stgraber) wrote :

It's still happening on up-to-date Quantal:

root@lantea:~# cat /proc/version
Linux version 3.5.0-3-generic (buildd@akateko) (gcc version 4.7.1 (Ubuntu/Linaro 4.7.1-2ubuntu1) ) #3-Ubuntu SMP Mon Jul 2 16:51:39 UTC 2012
root@lantea:~# dpkg -l | grep linux-image-3.5
ii linux-image-3.5.0-2-generic 3.5.0-2.2 Linux kernel image for version 3.5.0 on 32 bit x86 SMP
ii linux-image-3.5.0-3-generic 3.5.0-3.3 Linux kernel image for version 3.5.0 on 32 bit x86 SMP
root@lantea:~# lxc-start-ephemeral -o p1 -u ubuntu -- cat /proc/self/maps
Setting up ephemeral container...
Starting up the container...
Warning: Permanently added '10.0.3.112' (ECDSA) to the list of known hosts.
ubuntu@10.0.3.112's password:
08048000-08053000 r-xp 00000000 08:01 415243 /rootfs/bin/cat
08053000-08054000 r--p 0000a000 08:01 415243 /rootfs/bin/cat
08054000-08055000 rw-p 0000b000 08:01 415243 /rootfs/bin/cat
09d5a000-09d7b000 rw-p 00000000 00:00 0 [heap]
b7584000-b7585000 rw-p 00000000 00:00 0
b7585000-b7724000 r-xp 00000000 08:01 669268 /rootfs/lib/i386-linux-gnu/libc-2.15.so
b7724000-b7726000 r--p 0019f000 08:01 669268 /rootfs/lib/i386-linux-gnu/libc-2.15.so
b7726000-b7727000 rw-p 001a1000 08:01 669268 /rootfs/lib/i386-linux-gnu/libc-2.15.so
b7727000-b772a000 rw-p 00000000 00:00 0
b772d000-b772f000 rw-p 00000000 00:00 0
b772f000-b7730000 r-xp 00000000 00:00 0 [vdso]
b7730000-b7750000 r-xp 00000000 08:01 669259 /rootfs/lib/i386-linux-gnu/ld-2.15.so
b7750000-b7751000 r--p 0001f000 08:01 669259 /rootfs/lib/i386-linux-gnu/ld-2.15.so
b7751000-b7752000 rw-p 00020000 08:01 669259 /rootfs/lib/i386-linux-gnu/ld-2.15.so
bfdbe000-bfddf000 rw-p 00000000 00:00 0 [stack]
Stopping lxc
root@lantea:~#

Stéphane Graber (stgraber) wrote :

Reproducing the trace above is possible by doing:
 1) apt-get install lxc
 2) lxc-create -t ubuntu -n p1
 3) lxc-start-ephemeral -o p1 -u ubuntu -- cat /proc/self/maps

When prompted for the ubuntu account's password, the password is "ubuntu".

Quoting Leann Ogasawara (<email address hidden>):
> It would be helpful if anyone could confirm if this issue remains with
> the latest Quantal 3.5.0-3.3 kernel. We do have the Quantal kernel

Confirmed it is still happening.

Changed in linux (Ubuntu):
assignee: nobody → Bryan Wu (cooloney)
Andy Whitcroft (apw) wrote :

Can this not be worked around very simply by symlinking rootfs in the container to / as this prefix is common and consistent in all the affected files?

    ln -s . rootfs

Benji York (benji) wrote :

On Fri, Aug 10, 2012 at 2:58 AM, Andy Whitcroft <email address hidden> wrote:
> Can this not be worked around very simply by symlinking rootfs in the
> container to / as this prefix is common and consistent in all the
> affected files?
>
> ln -s . rootfs

I believe so. The above is a generalization of the workaround given in
the original report. At report time we were only having a problem with
a particular piece of software (graphviz) because of the way it searches
for plugins.
--
Benji York

Serge Hallyn (serge-hallyn) wrote :

I tried doing 'ln -s . rootfs' in container q1, then did lxc-start-ephemeral -o q1, logged in, and checked /proc/self/maps. It still listed /rootfs/. Of course the paths will now resolve. I'm not sure that suffices. If it does, we could temporarily create that symbolic link in the ubuntu templates.
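
The "paths will now resolve" part can be demonstrated outside a container with a throwaway directory. This is a sketch under the assumption that a temporary directory stands in for the container's root; it does not touch /proc or lxc at all.

```python
import os
import tempfile

def resolves_through_symlink():
    """Create <root>/rootfs -> . and check that <root>/rootfs/bin/cat is <root>/bin/cat."""
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "bin"))
        real = os.path.join(root, "bin", "cat")
        open(real, "w").close()  # empty stand-in for the real binary
        # Equivalent of running 'ln -s . rootfs' in the container's root directory.
        os.symlink(".", os.path.join(root, "rootfs"))
        prefixed = os.path.join(root, "rootfs", "bin", "cat")
        return os.path.samefile(prefixed, real)

print(resolves_through_symlink())  # True
```

As noted above, this only makes the prefixed names resolvable; it does not change what /proc/self/maps reports.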

Stéphane Graber (stgraber) wrote :

I'm quite clearly against adding this hack to lxc-ubuntu and lxc-ubuntu-cloud, since with 12.04 we're saying that a chroot filesystem is identical to a stock Ubuntu filesystem, which would no longer be true with the workaround.

Instead, I guess we could get that hack into lxc-start-ephemeral with the proper checks (in case the rootfs is read-only for example), but I still consider that a pretty ugly hack and would rather the source of the problem be fixed...

Bryan Wu (cooloney) wrote :

Based on the email from overlayfs maintainer Miklos Szeredi [1], I think this issue probably won't be fixed in overlayfs any time soon, so I agree with adding this workaround to lxc-start-ephemeral. Please find my patch attached for review. It is against the latest lxc-start-ephemeral in Quantal.

[1]: https://lists.ubuntu.com/archives/kernel-team/2012-August/021588.html

Andy Whitcroft (apw) wrote :

@Serge -- I am not familiar with how the chroot is made, but given that the full directory is not included, only the rootfs component, we might be able to change where the union is mounted to avoid this, if it's really life-ending.

Serge Hallyn (serge-hallyn) wrote :

Ok, trying to work around this by changing how the ephemeral container is set up.

Changed in lxc (Ubuntu):
assignee: nobody → Serge Hallyn (serge-hallyn)
Serge Hallyn (serge-hallyn) wrote :

The following actually seems to work. Instead of using an overlayfs for all of /var/lib/lxc/q1-tmp-XXXXXX, it only uses the overlayfs mount for /var/lib/lxc/q1-tmp-XXXXXX/rootfs. That way, as per Miklos' comment, /proc/self/fd and /proc/self/maps contents are resolved relative to the overlayfs root the same way as relative to the container rootfs.

Won't help for userspace inside the container which does chroot/pivot_root, but for a simple container it solves the bug for me.
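
A toy model of the reasoning, using plain string paths only. It assumes, per the comment above, that the reported name is the file's path relative to the overlayfs mount root; it is not a description of the actual kernel code, and `reported_path` is a made-up helper.

```python
import posixpath

def reported_path(host_path, overlay_root):
    """Toy model: name of host_path as seen relative to the overlay mount root."""
    rel = posixpath.relpath(host_path, overlay_root)
    return "/" if rel == "." else "/" + rel

container_dir = "/var/lib/lxc/q1-tmp-XXXXXX"
file_on_host = container_dir + "/rootfs/bin/cat"

# Overlay mounted over the whole container directory (old behaviour).
print(reported_path(file_on_host, container_dir))              # /rootfs/bin/cat
# Overlay mounted over the rootfs only (the change described above).
print(reported_path(file_on_host, container_dir + "/rootfs"))  # /bin/cat
```

In the second case the overlay root and the container root coincide, so the reported name matches the name the container itself would use.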

Serge Hallyn (serge-hallyn) wrote :

Here is a version which properly cleans up.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxc - 0.8.0~rc1-4ubuntu24

---------------
lxc (0.8.0~rc1-4ubuntu24) quantal; urgency=low

  * lxc-start-ephemeral: use unionfs only for the rootfs itself
    (LP: #959352)
  * allow config files to include other config files.
 -- Serge Hallyn <email address hidden> Tue, 14 Aug 2012 13:11:24 +0000

Changed in lxc (Ubuntu Quantal):
status: Confirmed → Fix Released
description: updated
Changed in lxc (Ubuntu Precise):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Serge Hallyn (serge-hallyn)
Changed in lxc (Ubuntu Precise):
status: Triaged → In Progress

Hello Benji, or anyone else affected,

Accepted lxc into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/lxc/0.7.5-3ubuntu63 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in lxc (Ubuntu Precise):
status: In Progress → Fix Committed
tags: added: verification-needed
Benji York (benji) wrote :

With 0.7.5-3ubuntu63 I can no longer reproduce the bad behavior and an strace of an affected binary shows correct paths being used.

This looks fixed to me.

Stéphane Graber (stgraber) wrote :

Based on Benji's comment, marking verification-done.

tags: added: verification-done
removed: verification-needed

On Wed, Sep 12, 2012 at 7:54 PM, Stéphane Graber <email address hidden> wrote:
> Based on Benji's comment, marking verification-done.

Thank you. I forgot to do that.

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxc - 0.7.5-3ubuntu63

---------------
lxc (0.7.5-3ubuntu63) precise-proposed; urgency=low

  * lxc.lxc-net.upstart: replace the check for USE_LXC_BRIDGE (which could be
    changed from true to false after starting lxc-net) with one for the
    existence /var/run/lxc. (LP: #1019290)
  * lxc-start-ephemeral: use unionfs only for the rootfs itself
    (LP: #959352)
 -- Serge Hallyn <email address hidden> Tue, 14 Aug 2012 11:38:25 -0500

Changed in lxc (Ubuntu Precise):
status: Fix Committed → Fix Released
Tim Gardner (timg-tpi) wrote :

Got fixed in lxc

Changed in linux (Ubuntu Quantal):
status: Confirmed → Invalid
Changed in linux (Ubuntu Precise):
status: New → Invalid