Bug #1551854 “LXD bootstrap issues on xenial” : Bugs : lxd package : Ubuntu

Revision history for this message

Casey Marshall (cmars) wrote on 2016-03-01:

#1

AlsaInfo.txt (38.3 KiB, text/plain; charset="utf-8")
CRDA.txt (392 bytes, text/plain; charset="utf-8")
CurrentDmesg.txt (75.9 KiB, text/plain; charset="utf-8")
Dependencies.txt (2.2 KiB, text/plain; charset="utf-8")
IwConfig.txt (951 bytes, text/plain; charset="utf-8")
JournalErrors.txt (266.1 KiB, text/plain; charset="utf-8")
Lspci.txt (9.8 KiB, text/plain; charset="utf-8")
Lsusb.txt (468 bytes, text/plain; charset="utf-8")
ProcCpuinfo.txt (3.8 KiB, text/plain; charset="utf-8")
ProcEnviron.txt (103 bytes, text/plain; charset="utf-8")
ProcInterrupts.txt (2.6 KiB, text/plain; charset="utf-8")
ProcModules.txt (6.8 KiB, text/plain; charset="utf-8")
PulseList.txt (27.4 KiB, text/plain; charset="utf-8")
UdevDb.txt (158.6 KiB, text/plain; charset="utf-8")
WifiSyslog.txt (190.6 KiB, text/plain; charset="utf-8")

Casey Marshall (cmars) wrote on 2016-03-01:

#2

I also confirmed that the mountall error message was duplicated every time I restarted the machine-0 container -- until remounting on the host.

Brad Figg (brad-figg) wrote on 2016-03-01: Status changed to Confirmed

#3

This change was made by a bot.

Changed in linux (Ubuntu):
status:	New → Confirmed

Serge Hallyn (serge-hallyn) wrote on 2016-03-01:

#4

I'm on the same kernel

Linux sl 4.4.0-8-generic #23-Ubuntu SMP Wed Feb 24 20:45:30 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

and also have the tracefs mounted

0 ✓ serge@sl ~ $ grep debug /proc/self/mountinfo
74 19 0:7 / /sys/kernel/debug rw,relatime shared:26 - debugfs debugfs rw
44 74 0:9 / /sys/kernel/debug/tracing rw,relatime shared:29 - tracefs tracefs rw

but trusty (upstart-based) containers start fine for me, using lxc version 2.0.0~rc4+master~20160229-0647-0ubuntu1~xenial and lxd from git HEAD.

Very odd therefore that unmounting and re-mounting debugfs works for you...

Will try in a fresh vm.

Serge Hallyn (serge-hallyn) wrote on 2016-03-01:

#5

marking confirmed because two people have reported it, but I cannot reproduce it yet.

Serge Hallyn (serge-hallyn) wrote on 2016-03-01:

#6

Also cannot reproduce in a clean VM, so I have to assume juju is tweaking something.

Can you show output of 'lxc config show <container>' where <container> is the container which fails?

Casey Marshall (cmars) wrote on 2016-03-01:

#7

FWIW I've observed the bug outside of Juju. Launching a trusty container, sshd did not start until I remounted debugfs on the host. The main reason it's been observed with Juju is that Juju tries to SSH into the instance right after cloud-init, but upstart in the container isn't starting sshd, so bootstrap hangs.

Casey Marshall (cmars) wrote on 2016-03-01:

#8

This is the config from the container that had the issue this morning:

c@mawhrin-skel:~/omnibus-layers$ lxc config show juju-145a3177-d1c0-4974-89f6-feaebb3ca87d-machine-0
name: juju-145a3177-d1c0-4974-89f6-feaebb3ca87d-machine-0
profiles:
- default
- juju-lxd
config:
  user.juju-model-uuid: "true"
  user.user-data: |
    #cloud-config
    output:
      all: '| tee -a /var/log/cloud-init-output.log'
    runcmd:
    - set -xe
    - install -D -m 644 /dev/null '/etc/init/juju-clean-shutdown.conf'
    - |-
      printf '%s\n' '
      author "Juju Team <email address hidden>"
      description "Stop all network interfaces on shutdown"
      start on runlevel [016]
      task
      console output

      exec /sbin/ifdown -a -v --force
      ' > '/etc/init/juju-clean-shutdown.conf'
    - install -D -m 644 /dev/null '/var/lib/juju/nonce.txt'
    - printf '%s\n' 'user-admin:bootstrap' > '/var/lib/juju/nonce.txt'
    users:
    - groups:
      - adm
      - audio
      - cdrom
      - dialout
      - dip
      - floppy
      - netdev
      - plugdev
      - sudo
      - video
      lock_passwd: true
      name: ubuntu
      shell: /bin/bash
      ssh-authorized-keys:
      - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDNt6t7py1b0vwYVobsx490piX1LrjtCJrcmOH49EKOtTzxxiv1aTRqVOD38pKR8WPWUc6ZTjYtGetqbwhvma8FLWeTjIaPyw8QzKAS963/KNzZRqE+iALtcdA9sJgrp5hxxl00zZ7cD7b2OD5SOzSjyRHJkBxGDnkzE07g+/qXekkPzVHKvAMbaBU+OwnuW3KSy20/y2D/qlWkLfF7FWfeEvb6P8KwIFZagv/yt+QeLONq4FLwowdBIwMDHBKFA3H+dKzld5bs3hGvLNhlFYUdeKs/F+swkYwwi5ycWj7N7clu0wvP9ZZhXlUJ2Fog39GrXznnekPqr4pAwL8m3vr9
        Juju:juju-client-key
      sudo:
      - ALL=(ALL) NOPASSWD:ALL
  volatile.base_image: 510c27eb5e30ac53c6cf8b423d4e145bd2e40b8845e89bd66a5d78e2a087727a
  volatile.eth0.hwaddr: 00:16:3e:9a:00:f9
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":165536,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":165536,"Nsid":0,"Maprange":65536}]'
  volatile.lo.hwaddr: 00:16:3e:3d:f5:18
devices:
  root:
    path: /
    type: disk
ephemeral: false

Adam Stokes (adam-stokes) wrote on 2016-03-02:

#9

Here is my config:

name: juju-078fe32d-4080-4f11-83e2-e579ead11df8-machine-0
profiles:
- default
- juju-myish
config:
  user.juju-model-uuid: "true"
  user.user-data: |
    #cloud-config
    output:
      all: '| tee -a /var/log/cloud-init-output.log'
    runcmd:
    - set -xe
    - install -D -m 644 /dev/null '/etc/init/juju-clean-shutdown.conf'
    - |-
      printf '%s\n' '
      author "Juju Team <email address hidden>"
      description "Stop all network interfaces on shutdown"
      start on runlevel [016]
      task
      console output

      exec /sbin/ifdown -a -v --force
      ' > '/etc/init/juju-clean-shutdown.conf'
    - install -D -m 644 /dev/null '/var/lib/juju/nonce.txt'
    - printf '%s\n' 'user-admin:bootstrap' > '/var/lib/juju/nonce.txt'
    users:
    - groups:
      - adm
      - audio
      - cdrom
      - dialout
      - dip
      - floppy
      - netdev
      - plugdev
      - sudo
      - video
      lock_passwd: true
      name: ubuntu
      shell: /bin/bash
      ssh-authorized-keys:
      - ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDr6xIZdawDhRLDTARbf1TO1FAIcEBLbqh50B82zosRs2T0WsQX00c6NvtBLkpkvuAwqFBZA4/zVr4xY52cDbJ+cB49HW9Z+LgLPa/VQV/Z4XpSHXJxILAeEFY+eSgMRneUKtpNzlW6dnKArBCa+egAGKan6TGTaAjZonNJsd+7LOvoPDAmmSR5AYsXrUZfzEdo5rfwKquZdZRnxZjR41nhezr14deWUjCPAgCH22Is+GNDOHadCUi0nqbcZDBWUC69BptmvdL02HQJgrz3HuPnseTWEqdFYmfhmIwnXO/43/oIvR26dOUq2Y+S5KDH2rOsdp2B6UQbZOiT279GX8gx
        Juju:juju-client-key
      sudo:
      - ALL=(ALL) NOPASSWD:ALL
  volatile.base_image: 510c27eb5e30ac53c6cf8b423d4e145bd2e40b8845e89bd66a5d78e2a087727a
  volatile.eth0.hwaddr: 00:16:3e:f6:c5:98
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":false,"Hostid":100000,"Nsid":0,"Maprange":65536},{"Isuid":false,"Isgid":true,"Hostid":100000,"Nsid":0,"Maprange":65536}]'
  volatile.lo.hwaddr: 00:16:3e:b6:76:cb
devices:
  root:
    path: /
    type: disk
ephemeral: false

I am also not using zfs

Adam Stokes (adam-stokes) wrote on 2016-03-02:

#10

Also, wrt Juju, if I do the following:

umount /sys/kernel/debug
mount -t debugfs none /sys/kernel/debug

And then reissue a juju bootstrap, it will complete successfully :\ whereas before I was running into this error: http://paste.ubuntu.com/15267564/

Alberto Salvia Novella (es20490446e) on 2016-03-03

Changed in linux (Ubuntu):
importance:	Undecided → High

Seth Forshee (sforshee) wrote on 2016-03-03:

#11

Serge: Why do we need to mount debugfs in containers? Even in the host we restrict access to root.

Serge Hallyn (serge-hallyn) wrote on 2016-03-04:

#12

@sforshee,

Because in the past mountall would fail if we didn't.

Serge Hallyn (serge-hallyn) wrote on 2016-03-04:

#13

@sforshee - are you saying that removing the debugfs line from /usr/share/lxc/config/ubuntu-common.conf fixes this for you?

Serge Hallyn (serge-hallyn) wrote on 2016-03-04:

#14

Note - I am not actively looking at this bug as I've not managed to reproduce it. Hopefully the kernel team has it under control, please shout if I'm needed.

If using juju first is a prerequisite to reproducing this, I can try that, but my impression from previous reports has been that this is not supposed to be a requirement, so I think something else is triggering it which I'm missing.

Seth Forshee (sforshee) wrote on 2016-03-04: Re: [Bug 1551854] Re: LXD bootstrap issues on xenial

#15

On Fri, Mar 04, 2016 at 05:36:28PM -0000, Serge Hallyn wrote:
> @sforshee - are you saying that removing the debugfs line from
> /usr/share/lxc/config/ubuntu-common.conf fixes this for you?

I haven't reproduced it. Just wondering as it should be impossible to
actually use debugfs from within the container.

Casey Marshall (cmars) wrote on 2016-03-04:

#16

Interesting. I removed the /sys/kernel/debug mount and containers seem to start up just fine:

c@mawhrin-skel:~$ grep kernel/debug /usr/share/lxc/config/ubuntu.common.conf 
# lxc.mount.entry = /sys/kernel/debug sys/kernel/debug none bind,optional 0 0
c@mawhrin-skel:~$ lxc launch ubuntu-trusty t2
Creating t2
Starting t2
c@mawhrin-skel:~$ lxc exec t2 -- ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  4 19:02 ?        00:00:00 /sbin/init
root       453     1  0 19:02 ?        00:00:00 upstart-socket-bridge --daemon
root      1507     1  0 19:02 ?        00:00:00 upstart-udev-bridge --daemon
root      1513     1  0 19:02 ?        00:00:00 /lib/systemd/systemd-udevd --dae
root      1583     1  0 19:02 ?        00:00:00 dhclient -1 -v -pf /run/dhclient
root      1606     1  0 19:02 ?        00:00:00 /bin/sh /etc/network/if-up.d/ntp
root      1610  1606  0 19:02 ?        00:00:00 lockfile-touch /var/lock/ntpdate
root      1619  1606  0 19:02 ?        00:00:00 /usr/sbin/ntpdate -s ntp.ubuntu.
root      1772     1  0 19:02 ?        00:00:00 /bin/sh /etc/network/if-up.d/ntp
message+  1773     1  0 19:02 ?        00:00:00 dbus-daemon --system --fork
root      1775  1772  0 19:02 ?        00:00:00 lockfile-create /var/lock/ntpdat
root      1812     1  0 19:02 ?        00:00:00 /lib/systemd/systemd-logind
root      1864     1  0 19:02 ?        00:00:00 cron
daemon    1866     1  0 19:02 ?        00:00:00 atd
root      1867     1  0 19:02 ?        00:00:00 acpid -c /etc/acpi/events -s /va
root      1870     1  0 19:02 ?        00:00:00 /usr/sbin/irqbalance
root      1886     1  0 19:02 ?        00:00:00 /usr/sbin/sshd -D
root      1910     1  0 19:02 ?        00:00:00 upstart-file-bridge --daemon
root      1925     1  0 19:02 ?        00:00:00 /bin/sh /etc/init.d/ondemand bac
root      1937  1925  0 19:02 ?        00:00:00 sleep 60
syslog    1943     1  2 19:02 ?        00:00:00 rsyslogd
root      2023     1  0 19:02 ?        00:00:00 /usr/bin/python /usr/bin/cloud-i
root      2025     0  0 19:02 ?        00:00:00 ps -ef

Seth Forshee (sforshee) wrote on 2016-03-04:

#17

I'm getting something kind of similar without juju. If I remount debugfs ro in the host then start the container I get this in /var/log/upstart/mountall.log:

mount: cannot remount block device debugfs read-write, is write-protected
mountall: mount /sys/kernel/debug [143] terminated with status 32
mountall: Event failed

and services don't start in the container. If I completely unmount debugfs in the host, however, everything is happy, although debugfs is then not mounted in the container.

Casey/Adam: Can one of you confirm that debugfs is not mounted in the host when you get the failures? If it is mounted can you paste the output of 'mount | grep debugfs' in the host?
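
One way to answer this kind of question without eyeballing mount output is to read the state straight from /proc/self/mountinfo. The following is a minimal sketch, assuming the standard mountinfo layout (field 5 is the mount point, field 6 the per-mount options, and the superblock options are the third field after the "-" separator). mount_state is a hypothetical helper name, not part of any existing tool.

```shell
#!/bin/sh
# Report how a mount point is mounted, reading mountinfo-format lines on
# stdin. Prints "rw", "ro", or "unmounted". Defaults to /sys/kernel/debug.
# mount_state is a hypothetical helper, not part of LXD or util-linux.
mount_state() {
    awk -v mp="${1:-/sys/kernel/debug}" '
        # True if the comma-separated option list contains "ro".
        function has_ro(list,    n, i, opts) {
            n = split(list, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "ro") return 1
            return 0
        }
        $5 == mp {
            found = 1
            super = ""
            # Superblock options are the third field after the "-" separator.
            for (i = 7; i <= NF; i++) if ($i == "-") { super = $(i + 3); break }
            # Read-only on either the mount or the superblock counts as "ro".
            state = (has_ro($6) || has_ro(super)) ? "ro" : "rw"
        }
        END { print (found ? state : "unmounted") }
    '
}

# Usage on a live host:
#   mount_state /sys/kernel/debug < /proc/self/mountinfo
```

If the mount point appears more than once, the last matching line wins, on the assumption that later entries sit on top of earlier ones.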

@hallyn: I didn't find the line you were referring to in /usr/share/lxc/config/ubuntu.common.conf; in fact I didn't find any reference to debugfs in any of the template files. And debugfs is not a ns-mountable filesystem, so I guess it must be a bind mount? Getting EACCES makes sense if the container tries to mount debugfs. I'm just not sure why their containers are trying to mount debugfs if it is not mounted in the host while mine does not, which is what I assume must be going on.

Maybe it has something to do with that juju-lxd profile. Can someone paste in its contents (lxc profile show juju-lxd) or point me to where I can find it?

At this point I don't really think this is a kernel bug. debugfs is _not_ namespace mountable, nor should it be.

Seth Forshee (sforshee) wrote on 2016-03-04:

#18

@Casey: I must have been typing my comment when you posted yours. So you've answered one of my questions, but I have no idea what's leading to the EACCES error. Can you provide the output of 'mount | grep debugfs' in the host when you're seeing the failure?

Casey Marshall (cmars) wrote on 2016-03-04:

#19

@stforshee I'll uncomment the debugfs mount in my /usr/share/lxc/config/ubuntu.common.conf (putting it back the way it was), reboot, and see if I can reproduce it again.

My juju-lxd profile shows:

name: juju-lxd
config:
  boot.autostart: "true"
  security.nesting: "true"
description: ""
devices: {}

Seth Forshee (sforshee) wrote on 2016-03-04:

#20

On Fri, Mar 04, 2016 at 09:17:01PM -0000, Casey Marshall wrote:
> @stforshee I'll uncomment the debugfs mount in my
> /usr/share/lxc/config/ubuntu.common.conf (putting it back the way it
> was), reboot, and see if I can reproduce it again.

Oh there it is, I was grepping for debugfs and not debug, d'oh.

Seth Forshee (sforshee) wrote on 2016-03-07:

#21

Casey: Any luck reproducing? I'd still like to see what 'mount | grep debugfs' in the host shows when this is happening.

Stéphane Graber (stgraber) wrote on 2016-03-07:

#22

I'd like to note that LXD works differently from LXC here.

In LXC we mount debugfs through ubuntu.common.conf whereas with lxd, we simply bind-mount /sys/kernel/debug from the host if it exists.

LXD doesn't use any of the /usr/share/lxc/* files. If it does on your system, then you most definitely aren't running LXD 2.0.

Stéphane Graber (stgraber) wrote on 2016-03-07:

#23

Please can everyone affected by this issue post the output of: dpkg -l lxc liblxc1 lxd lxd-client lxcfs

It's very difficult to figure out what's wrong when we don't even know the version being used.

Changed in linux (Ubuntu):
status:	Confirmed → Incomplete

Stéphane Graber (stgraber) wrote on 2016-03-07:

#24

Oh and "lxc info" too for good measure (just in case lxd wasn't restarted post-upgrade).

Casey Marshall (cmars) wrote on 2016-03-07:

#25

Haven't seen it in a few days. I'll reboot and see if I can reproduce it. It usually happens after rebooting the host, when launching new containers or existing ones would autostart.

Info you requested. I think the /usr/share/lxc/... might have been a red herring. I'm exclusively using LXD on this xenial host, not messing with the old lxc commands.

$ dpkg -l lxc liblxc1 lxd lxd-client lxcfs
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name                            Version              Architecture         Description
+++-===============================-====================-====================-====================================================================
ii  liblxc1                         2.0.0~rc5-0ubuntu1   amd64                Linux Containers userspace tools (library)
ii  lxc                             2.0.0~rc5-0ubuntu1   all                  Transitional package for lxc1
ii  lxcfs                           2.0.0~rc2-0ubuntu2   amd64                FUSE based filesystem for LXC
ii  lxd                             2.0.0~rc1-0ubuntu3   amd64                Container hypervisor based on LXC - daemon
ii  lxd-client                      2.0.0~rc1-0ubuntu3   amd64                Container hypervisor based on LXC - client

$ lxc info
apicompat: 0
auth: trusted
environment:
  addresses:
  - 192.168.88.234:8443
  - 192.168.122.1:8443
  - 10.0.3.1:8443
  architectures:
  - x86_64
  - i686
  certificate: |
    -----BEGIN CERTIFICATE-----
    MIIFqjCCA5KgAwIBAgIQXpne6Qjwhg8de+RmyV1+mTANBgkqhkiG9w0BAQsFADA6
    MRwwGgYDVQQKExNsaW51eGNvbnRhaW5lcnMub3JnMRowGAYDVQQDDBFyb290QG1h
    d2hyaW4tc2tlbDAeFw0xNjAyMjgwMzEyMzdaFw0yNjAyMjUwMzEyMzdaMDoxHDAa
    BgNVBAoTE2xpbnV4Y29udGFpbmVycy5vcmcxGjAYBgNVBAMMEXJvb3RAbWF3aHJp
    bi1za2VsMIICIjANBgkqhkiG9w0BAQEFAAOCAg8AMIICCgKCAgEAzv/3uX3JWduq
    wmtbyTABmfJkup6Z5Lh4lKPXgL/H2gQ/mlccORKm1eDZhAGmv9UuQGeMRHEneJqD
    V9c3f7/9cJBwvz2loKlppWj0ohAzTP91L8paeUSfP4X9EAr702Qjyb2ig+xWv5tM
    cJxbdl0zGpjYO1P+xmUdthSidsFrzWQPXptOlcZvak7n0QL5GlkVXdqX+she2Pbs
    ONtyTaBSpF3zEYv6cM9ZeJYL4Hl7LEQ1/p8ojpOyaxO8B1Cn/gIbuDqgzRwmei90
    Aca06YDF4SHVcl8qFajrwkPF3jWW5pgS8sAJlYoq2+ROhl0CnpdBl4AiJrvfFsIs
    RL8dKSuFA6AcLhYooWgMy6UWR8mLbmYHp04ThuBDoRaTt0uGLDlTAfMg7e8Gwpz+
    aEwxSzzQhvJYr4e6TSP1C4zXNqS5mUHfm6RtfEccVFmq6vyqZGELAyREce76J88V
    FMf/V+KYlQYUxo0JH2k+BYMO4Iigar0+8p8o8drqh6Lks5zTP6idsKa0LSWA4rwm
    3+5hjnVJtFa9CVCItJW3r0+nczwtAbjvRojkr7Yb2Kifufhilb+I695qSk+Toug3
    HKOODTqc9sbH+urmfr7jBBexACtWVX/tMptWFnBmcqoN4ptSZutY7CPf+KR//9p0
    8fFRQP+ItmElJHRFO+madGOBMVrC7j8CAwEAAaOBqzCBqDAOBgNVHQ8BAf8EBAMC
    BaAwEwYDVR0lBAwwCgYIKwYBBQUHAwEwDAYDVR0TAQH/BAIwADBzBgNVHREEbDBq
    ggxtYXdocmluLXNrZWyCETE5Mi4xNjguODguMjM0LzI0ghxmZTgwOjozZWE5OmY0
    ZmY6ZmU1Njo0N2RjLzY0ggsxMC4wLjMuMS8yNIIcZmU4MDo6NThkNTpjMmZmOmZl
    ZTQ6NzJiYy82NDANBgkqhkiG9w0BAQsFAAOCAgEAB6aFItuxlZm5+2/IB2eCVAM0
    eQbO6dfvfF2khfiEbWWaKPtkZSYKlDIcoOph35obnNMQjT+y4zlnF/fepvjq8P1R
    yGd+Q+GMcXWVRht3uIMW2ZqwNqOujunyn9+Hl1SYi1dV1g/CH9lJt8I7FKIvyieh
    siZRe5Zp7TdPREkIJveuz8qB3X87WVh9bqvMpoX91Mgrjzd3qATef/tN0HP+b26Y
    XjtRz9rhOkUfJx9b5HsyBLuGOQS979twl9LPKc1WsIrJaavY1JBT7SIwX3W7u7CU
    UWueEQoVvA9pkgeL8NXYYoHOkkrfW3oqExkuA1pqucMt8iRjn88njV57Vk576tjZ
    +bNqLL6R4xoUvnr9VIwq7tMLvodK/E5WvsmP5GrdMru2sa/xd+fS0137oOYqP4Qb
    T296jMcehHg5Gg1ZR1DbTqzWFPKb+CtXU+bqI68Hplq5Lemufsman2C2dthmKwcm
    OumpykWiH9vNKauY52VR8iffppBsH7jup1+yVtIOiT4Whp5LGBtZdYri7GE77qv+
    qtBL7HdeBJxsBadRIfjdeLZlZPlMX7hV+v8Bzxz9fGuPKSnxDdpoPw1XF+zP1YAn
    VeDCRo7oCFV40nVFyb5wCreDoegtSFuQt7/Guv9u/gQSfmGoY1QyjlaHh2pJAAgS
    CPRGObnrdrQMMEqLDaI=
    -----END CERTIFICATE-----
  driver: lxc
  driverversion: 2.0.0.rc5
  kernel: Linux
  kernelarchitecture: x86_64
  kernelversion: 4.4.0-10-generic
  server: lxd
  serverpid: 2382
  serverversion: 2.0.0.rc1
  storage: zfs
  storageversion: "5"
config:
  core.https_address: '[::]'
  storage.zfs_pool_name: lxdpool
public: false

Stéphane Graber (stgraber) wrote on 2016-03-07:

#26

Ok, so investigation shows that:

- LXD bind-mounts all that stuff, it doesn't have a choice as it's not privileged enough to mount things itself
- mountall fails to run if its "optional" filesystems fail to mount (because that makes a lot of sense...)
- systemd sets up the host filesystems, on a clean boot they all seem fine
- "something" apparently remounts debugfs ro sometimes, this breaks containers
- "something" apparently makes the /proc/sys/fs/binfmt_misc autofs go nuts (loop of symlinks) which also breaks containers

We could try to teach mountall to do the right thing with optional mounts and ignore their failures, however we'd need to SRU that to trusty and precise and then nag other distros into doing the same (centos, oracle, rhel, ...) before we can get rid of our workaround.

As a clean Xenial system does work properly, I think it would be best to figure out what's messing with debugfs and binfmt_misc post-boot and fix whatever it is to stop doing that.

Would be useful if the bug reporters could document exactly what they did on their system between the time it worked fine and the time it stopped working so we can figure out what's messing with those mounts.

Stéphane Graber (stgraber) on 2016-03-08

affects:	linux (Ubuntu) → lxd (Ubuntu)
Changed in lxd (Ubuntu):
assignee:	nobody → Stéphane Graber (stgraber)
status:	Incomplete → Fix Committed

Stéphane Graber (stgraber) wrote on 2016-03-08:

#27

So the cause of all this was /sys/kernel/debug/tracing which is a weird auto-mounted kernel path. That is, the sole action of listing that directory will cause it to get mounted for you by the kernel.

That means that any number of things could accidentally cause it to mount.

Once it's mounted, the kernel considers /sys/kernel/debug to have a directory that's hidden through overmounting and so will not allow unprivileged users to bind-mount the underlying directory, which means /sys/kernel/debug isn't mounted in the container and causes mountall to fail.
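
To check whether a host is in this state, one can look for mounts nested under /sys/kernel/debug in /proc/self/mountinfo. The following is a minimal sketch, not something taken from LXD or the kernel tools. submounts_of is a hypothetical helper name, and it assumes the standard mountinfo layout in which the fifth field is the mount point.

```shell
#!/bin/sh
# List mount points strictly below a parent path, reading mountinfo-format
# lines on stdin (field 5 is the mount point).
# submounts_of is a hypothetical helper, not part of LXD or util-linux.
submounts_of() {
    parent=${1%/}   # drop one trailing slash so "/a/" and "/a" behave alike
    awk -v p="$parent/" '$5 != p && index($5, p) == 1 { print $5 }'
}

# Usage on a live host:
#   submounts_of /sys/kernel/debug < /proc/self/mountinfo
```

Fed the two mountinfo lines from comment #4, this prints /sys/kernel/debug/tracing. Any output at all means something is mounted over part of debugfs.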

There are quite a few ways to fix this.
The best would be to not have the kernel do this weird auto-mount thing. Sadly, fixing that would be a userspace regression, so as weird and inconsistent (trying to remain polite) as the current design is, reverting it is unlikely.

As mentioned before, we could also fix mountall not to be so picky and not die when mounts that it knows as "optional" fail to mount. Unfortunately there are a lot of images out there using mountall, so we can't really rely on being able to push a fix to all of them.

A third option and the one we'll be using for now is to have LXD recursively bind-mount paths, therefore not exposing the container to any more information than would be visible on the host and so avoiding the kernel security feature entirely.

The fix in LXD is a one character change (bind to rbind) and I've sent a pull request upstream to do just that.
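
As an illustration of what the bind to rbind change amounts to, the sketch below rewrites the option list of an LXC-style mount entry like the one quoted in comment #16. It is only an assumption that the entry LXD generates internally has this shape, since the actual pull request is not shown in this thread. to_rbind is a hypothetical helper.

```shell
#!/bin/sh
# Turn a non-recursive "bind" into "rbind" in a comma-separated mount
# option list, leaving every other option untouched.
# to_rbind is a hypothetical helper; the real LXD change is not shown here.
to_rbind() {
    printf '%s\n' "$1" | awk -F, -v OFS=, '{
        for (i = 1; i <= NF; i++) if ($i == "bind") $i = "rbind"
        print
    }'
}

# Assumed entry shape, modelled on the line quoted in comment #16.
opts=$(to_rbind "bind,optional")
echo "lxc.mount.entry = /sys/kernel/debug sys/kernel/debug none $opts 0 0"
```

With rbind, mounts nested under the source (such as /sys/kernel/debug/tracing) are carried into the container along with it, so the container sees the same tree as the host.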

I'd just like to stress that I think the kernel behavior here is absolutely ridiculous: we have a security feature which triggers when it shouldn't (the path doesn't exist so can't be "hidden") combined with a crazy feature that's been added to be "user friendly" and causes automatic mounting of a filesystem by simply accessing a path inside another filesystem. The combination of both results in this bug... But the fact is, it's way easier and faster for us to work around this in LXD than to try and fix the source of the problem...