lxc-start fails with Out of memory reading cgroups

Bug #1271000 reported by David Favor
This bug affects 2 people
Affects        Status         Importance   Assigned to   Milestone
lxc (Ubuntu)   Fix Released   Wishlist     Unassigned

Bug Description

Package might really be libcgroup1.

This seems to have stopped working with the latest kernel upgrade.

To replicate failure...

net1# lxc-create -t ubuntu -n test
Checking cache download in /var/cache/lxc/saucy/rootfs-amd64 ...
Copy /var/cache/lxc/saucy/rootfs-amd64 to /usr/lib/x86_64-linux-gnu/lxc ...
Copying rootfs to /usr/lib/x86_64-linux-gnu/lxc ...
Generating locales...
  en_US.UTF-8... up-to-date
Generation complete.
Creating SSH2 RSA key; this may take some time ...
Creating SSH2 DSA key; this may take some time ...
Creating SSH2 ECDSA key; this may take some time ...
invoke-rc.d: policy-rc.d denied execution of start.

##
# The default user is 'ubuntu' with password 'ubuntu'!
# Use the 'sudo' command to run tasks as root in the container.
##

net1# lxc-start -n test
lxc-start: Out of memory reading cgroups
lxc-start: failed to spawn 'test'

Also, cgroup namespaces no longer show up as enabled. I believe they did show up as enabled under the previous kernel. From reading other bug reports, this seems to be okay + will be fixed upstream. I mention it here because it's a difference.

net1# lxc-checkconfig | grep -v enabled
Kernel configuration not found at /proc/config.gz; searching...
Kernel configuration found at /boot/config-3.11.0-15-generic
--- Namespaces ---
User namespace: missing

--- Control groups ---
Cgroup namespace: required

The cgroup mount point is showing up...

net1# mount | grep cgroup
none on /sys/fs/cgroup type tmpfs (rw)
systemd on /sys/fs/cgroup/systemd type cgroup (rw,noexec,nosuid,nodev,none,name=systemd)
none on /sys/fs/cgroup type tmpfs (rw)

And there is no filesystem entry for the container named test under /sys/fs/cgroup, which may also suggest a problem.

net1# ls /sys/fs/cgroup
net1#

David Favor (davidfavor) wrote :

Hum... I set the package to lxc + it changed to libcgroup.

David Favor (davidfavor) wrote :

net1# uname -a
Linux net1.bizcooker.com 3.11.0-15-generic #23-Ubuntu SMP Mon Dec 9 18:17:04 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

net1# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 13.10
Release: 13.10
Codename: saucy

net1# lxc-version
lxc version: 1.0.0.alpha1

Serge Hallyn (serge-hallyn) wrote :

If this is still a problem, could you please show the result of

cat /proc/self/mounts

as well as doing

sudo lxc-start -n c1 -l trace -o start.out

and attaching the resulting start.out here?

Note that if you have libcgroup installed, it is recommended to install cgroup-lite instead.

Changed in libcgroup (Ubuntu):
status: New → Incomplete
Changed in lxc (Ubuntu):
status: New → Incomplete

David Favor (davidfavor) wrote :

net1# cat /proc/self/mounts
rootfs / rootfs rw 0 0
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=4071956k,nr_inodes=1017989,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=816440k,mode=755 0 0
/dev/disk/by-uuid/849acaae-87ce-407a-b01c-80910dbad24e / ext4 rw,noatime,nodiratime,errors=remount-ro,data=ordered 0 0
none /sys/fs/cgroup tmpfs rw,relatime,size=4k,mode=755 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
tmpfs /tmp tmpfs rw,noatime,nodiratime,size=816440k 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0
none /sys/fs/pstore pstore rw,relatime 0 0
tmpfs /var/tmp tmpfs rw,noatime,nodiratime,size=816440k 0 0
/dev/sda1 /boot ext3 rw,noatime,nodiratime,errors=continue,user_xattr,acl,barrier=1,data=ordered 0 0
systemd /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,name=systemd 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,relatime,cpuset,clone_children 0 0
cgroup /sys/fs/cgroup/cpu cgroup rw,relatime,cpu 0 0
cgroup /sys/fs/cgroup/cpuacct cgroup rw,relatime,cpuacct 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,relatime,memory 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,relatime,devices 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,relatime,freezer 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,relatime,blkio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,relatime,hugetlb 0 0
none /sys/fs/cgroup tmpfs rw,relatime 0 0
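
The mount table above can be narrowed down to just the cgroup hierarchies, which makes it easier to see whether controllers such as memory and freezer are mounted. This is only a sketch: the function name list_cgroup_mounts is made up for illustration, and it assumes the usual mounts format where the third field is the filesystem type.

```shell
# Print the mount point of every entry whose filesystem type is "cgroup".
# Usage: list_cgroup_mounts [FILE]   (FILE defaults to /proc/self/mounts)
# list_cgroup_mounts is a hypothetical helper, not part of lxc or cgroup-lite.
list_cgroup_mounts() {
    # Field 2 is the mount point, field 3 the filesystem type.
    awk '$3 == "cgroup" { print $2 }' "${1:-/proc/self/mounts}"
}
```

An empty result would match the empty ls /sys/fs/cgroup shown in the bug description, while the table above should list one line per controller.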

net1# lxc-start -n c1 -l trace -o start.out
lxc-start: Executing '/sbin/init' with no configuration file may crash the host

net1# cat start.out
      lxc-start 1390511075.386 INFO lxc_start_ui - using rcfile /var/lib/lxc/c1/config
      lxc-start 1390511075.386 WARN lxc_log - lxc_log_init called with log already initialized
      lxc-start 1390511075.386 ERROR lxc_start_ui - Executing '/sbin/init' with no configuration file may crash the host

Hum... Looks like I have both libcgroup1 + cgroup-lite installed.

I'll try uninstalling libcgroup1... Nope... Did apt-get purge libcgroup1 + still get the same error.

Other suggestions?

David Favor (davidfavor) wrote :

Ugh... uninstalled + reinstalled cgroup-lite + now additional errors occur.

Hum... more errors + start.log looks like I'm actually making progress.

Attaching start.log file for entire log...

net1# lxc-destroy -n test
net1# lxc-create -n test
net1# lxc-start -n test -l trace -o start.out
lxc-start: No such file or directory - failed to create symlink for kmsg
lxc-start: failed to setup kmsg for 'test'
lxc-start: No such file or directory - failed to mount /proc in the container.
lxc-start: failed to setup the container
lxc-start: invalid sequence number 1. expected 2
lxc-start: failed to spawn 'test'

David Favor (davidfavor) wrote :

Hum... Something is seriously amiss...

The log file says /var/lib/lxc/c1/config is being used, which is incorrect... as /var/lib/lxc/c1 does not exist.

I'm guessing the config file associated with the container (test) should be used, which is...

/var/lib/lxc/test/config (test instead of c1) + even trying to supply this config file manually, it's ignored... as in...

lxc-start -n test -f /var/lib/lxc/test/config -l trace -o start.out

So it appears the problems are...

1) The config file associated with the container is incorrectly determined.

2) If the correct config file is specified, it's quietly ignored.

Suggestions?

David Favor (davidfavor) wrote :

Ah... this works...

lxc-create -t ubuntu -n test
lxc-start -n test

My guess is the default container template is supposed to be the template matching the default OS, which in this case is ubuntu.

Somehow this isn't being done.

Serge Hallyn (serge-hallyn) wrote :

Sorry, I'm confused at this point. If I understand correctly, if you do

lxc-create -n test

the container fails to start, and if you do

lxc-create -t ubuntu -n test

then it succeeds?

I agree that it would seem sensible to have the first case default to the ubuntu or ubuntu-cloud template, however the non-template behavior is actually used by some people. We could discuss upstream adding a '-t none' option to emulate the legacy no-template option, however this could meet some resistance. If my reading of the situation is correct, then please feel free to change the bug status to Triaged (and importance should be low) and I'll ask for comments on the mailing list.

David Favor (davidfavor) wrote :

Correct. It appears the default template processing (default meaning no template specified) seems like it could use a bit of work.

Seems like specifying no template should either throw an error or create some form of container that will start.

Serge Hallyn (serge-hallyn) wrote : Re: [Bug 1271000] Re: lxc-start fails with Out of memory reading cgroups

Quoting David Favor (<email address hidden>):
> Correct. It appears the default template processing (default meaning no
> template specified) seems like it could use a bit of work.

Ok, thanks.

> Seems like specifying no template should either throw an error or create
> some form of container that will start.

For anyone reading this report who may not be clear on it, the current
behavior is: a template exists to fill in a rootfs. If no template
is specified, then it is assumed that a rootfs has already been created,
and only the basic metadata (rootfs location etc) are to be filled in.
People do rely on this behavior. So the question is how to best avoid
the (more common imo) case where a user accidentally does not specify a
template. We could force the existing no-template users to switch to
specifying '-t none'. We could simply document the current behavior.
We could print out a warning about there being no template, but not
change the default behavior. Not sure if there are any other ideas.

So the question is how to best handle this without causing trouble
for people who rely on this default behavior.
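
As a rough illustration of the behavior described above, a quick check before lxc-start can tell whether a container's rootfs was ever filled in. This is a sketch under assumptions: the helper name rootfs_has_init is invented here, and it assumes the rootfs lives in a rootfs directory inside the container directory (for example /var/lib/lxc/test/rootfs), which may not hold for every configuration.

```shell
# Succeed if the container directory holds a rootfs with an init program.
# Usage: rootfs_has_init /var/lib/lxc/test
# rootfs_has_init is a hypothetical helper; the <dir>/rootfs layout is assumed.
rootfs_has_init() {
    init="$1/rootfs/sbin/init"
    # Accept an executable file, or a symlink (whose target may only
    # resolve from inside the container).
    [ -x "$init" ] || [ -L "$init" ]
}
```

A container created without a template would be expected to fail this check until its rootfs is populated by hand.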

 subject: change default template handling
 status: triaged
 importance: wishlist

Changed in libcgroup (Ubuntu):
importance: Undecided → Wishlist
status: Incomplete → Triaged

David Favor (davidfavor) wrote :

Ah... now I understand...

So this behavior is actually desirable. For example, to generate an empty rootfs when creating a new OS template or App template. This way the rootfs can be manually populated + tested before committing details to the template.

Seems like having '-t none' might be good to roll into the code, so -t template is always required.

Then when '-t none' is specified, maybe issue a simple informational message... something like...

"Creating container with an empty rootfs. Be sure to manually create rootfs to meet your needs."

Or similar.

Thanks for the info.

This makes perfect sense now.

David Favor (davidfavor) wrote :

Or maybe keep -t optional (so it works like always).

And generate the informational message if -t is missing or '-t none' is specified.

As more + more people use LXC, having this simple message seems like a life saver for newbies, like me.

Thanks for the info.
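
The suggestion above could be prototyped as a small wrapper check, sketched here. It is not how lxc-create behaves: needs_template_note is a hypothetical function name, and the handling of '-t none' only mirrors the proposal in this thread. It scans the arguments that would be passed to lxc-create and succeeds when the informational message should be shown.

```shell
# Succeed (exit 0) when no template is given or the template is "none".
# Hypothetical helper illustrating the proposal; not part of lxc.
# The body runs in a subshell so the variable does not leak to the caller.
needs_template_note() (
    template=""
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -t)
                # Only consume a value if one actually follows -t.
                if [ "$#" -ge 2 ]; then template="$2"; shift; fi
                ;;
        esac
        shift
    done
    [ -z "$template" ] || [ "$template" = "none" ]
)

# Example use inside a wrapper script:
# if needs_template_note "$@"; then
#     echo "Creating container with an empty rootfs. Be sure to manually create rootfs to meet your needs." >&2
# fi
```

Keeping the check separate from the message leaves the existing no-template behavior untouched, which is the concern raised earlier in the thread.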

anavarre (aurelien.navarre) wrote :

I cannot confirm the above behavior.

$ sudo lxc-create -t ubuntu -n myserver -P $HOME/lxc
Checking cache download in /usr/local/var/cache/lxc/trusty/rootfs-amd64 ...
Copy /usr/local/var/cache/lxc/trusty/rootfs-amd64 to /home/anavarre/lxc/myserver/rootfs ...
Copying rootfs to /home/username/lxc/myserver/rootfs ...
Generating locales...
  en_US.UTF-8... done
Generation complete.
Creating SSH2 RSA key; this may take some time ...
Creating SSH2 DSA key; this may take some time ...
Creating SSH2 ECDSA key; this may take some time ...
Creating SSH2 ED25519 key; this may take some time ...
update-rc.d: warning: default stop runlevel arguments (0 1 6) do not match ssh Default-Stop values (none)
invoke-rc.d: policy-rc.d denied execution of start.

Current default time zone: 'Europe/Paris'
Local time is now: Sat Mar 15 10:08:51 CET 2014.
Universal Time is now: Sat Mar 15 09:08:51 UTC 2014.

##
# The default user is 'ubuntu' with password 'ubuntu'!
# Use the 'sudo' command to run tasks as root in the container.
##

$ sudo lxc-start -n myserver
lxc-start: Executing '/sbin/init' with no configuration file may crash the host

As you can see above I'm running Trusty (beta 1) and it's always failing to start the container for me, no matter what flags I'm passing.

Serge Hallyn (serge-hallyn) wrote :

@anavarre

if you pass -P to lxc-create, then you must also pass -P to lxc-start.
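
A minimal sketch of that advice, reusing the container name and path from the comment above. Keeping the path in one variable makes it harder for the two commands to disagree. The variable names are illustrative, and the commands are only printed here because running them needs root and an LXC installation.

```shell
# Custom container path; the same value must be given to both commands.
LXC_PATH="$HOME/lxc"
NAME="myserver"

# Build both command lines with the same -P value.
create_cmd="sudo lxc-create -t ubuntu -n $NAME -P $LXC_PATH"
start_cmd="sudo lxc-start -n $NAME -P $LXC_PATH"

# Print rather than execute, so the sketch is safe to run anywhere.
printf '%s\n' "$create_cmd" "$start_cmd"
```

Without -P on lxc-start, the container is looked up under the default path instead, which would be consistent with the "no configuration file" error shown above.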

Changed in lxc (Ubuntu):
importance: Undecided → Wishlist
status: Incomplete → Confirmed
status: Confirmed → Fix Committed
no longer affects: libcgroup (Ubuntu)

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxc - 1.1.0~alpha2-0ubuntu2

---------------
lxc (1.1.0~alpha2-0ubuntu2) utopic; urgency=medium

  * Cherry-pick usptream bugfix for lxc-usernic test.
 -- Stephane Graber <email address hidden> Thu, 02 Oct 2014 15:01:56 -0400

Changed in lxc (Ubuntu):
status: Fix Committed → Fix Released