Persistent container cannot be started

Bug #1716738 reported by Kyle Fazzari
This bug affects 2 people
Affects: Snapcraft
Status: Fix Released
Importance: Undecided
Assigned to: Cris Dywan
Milestone: (none listed)

Bug Description

First of all, note that lxd works fine on my system; I use it every day. However, Snapcraft's persistent containers don't work. Here is what happens (running out of Snapcraft master):

$ SNAPCRAFT_CONTAINER_BUILDS=1 snapcraft
"grade" property not specified: defaulting to "stable"
Using default LXD remote because SNAPCRAFT_CONTAINER_BUILDS is set to 1
Creating snapcraft-opencv-example
Device fuse added to snapcraft-opencv-example
error: Error calling 'lxd forkstart snapcraft-opencv-example /var/lib/lxd/containers /var/log/lxd/snapcraft-opencv-example/lxc.conf': err='Failed to run: /usr/bin/lxd forkstart snapcraft-opencv-example /var/lib/lxd/containers /var/log/lxd/snapcraft-opencv-example/lxc.conf: '
  lxc 20170912175704.321 ERROR lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1328 - Path "/sys/fs/cgroup/systemd//lxc/snapcraft-opencv-example" already existed.
  lxc 20170912175704.321 ERROR lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1385 - No such file or directory - Failed to create /sys/fs/cgroup/systemd//lxc/snapcraft-opencv-example: No such file or directory
  lxc 20170912175704.321 ERROR lxc_cgfsng - cgroups/cgfsng.c:create_path_for_hierarchy:1328 - Path "/sys/fs/cgroup/systemd//lxc/snapcraft-opencv-example-1" already existed.
  lxc 20170912175704.321 ERROR lxc_cgfsng - cgroups/cgfsng.c:cgfsng_create:1385 - No such file or directory - Failed to create /sys/fs/cgroup/systemd//lxc/snapcraft-opencv-example-1: No such file or directory
  lxc 20170912175704.333 ERROR lxc_start - start.c:lxc_spawn:1186 - Failed to set up id mapping.
  lxc 20170912175704.401 ERROR lxc_start - start.c:__lxc_start:1358 - Failed to spawn container "snapcraft-opencv-example".
  lxc 20170912175704.953 ERROR lxc_conf - conf.c:run_buffer:416 - Script exited with status 1.
  lxc 20170912175704.953 ERROR lxc_start - start.c:lxc_fini:546 - Failed to run lxc.hook.post-stop for container "snapcraft-opencv-example".
  lxc 20170912175704.955 ERROR lxc_conf - conf.c:userns_exec_1:4608 - Error setting up child mappings
  lxc 20170912175704.955 ERROR lxc_cgfsng - cgroups/cgfsng.c:recursive_destroy:1288 - Error destroying /sys/fs/cgroup/systemd//lxc/snapcraft-opencv-example-2
  lxc 20170912175704.957 ERROR lxc_conf - conf.c:userns_exec_1:4608 - Error setting up child mappings
  lxc 20170912175704.957 ERROR lxc_cgfsng - cgroups/cgfsng.c:recursive_destroy:1288 - Error destroying /sys/fs/cgroup/devices//lxc/snapcraft-opencv-example-2
  lxc 20170912175704.959 ERROR lxc_conf - conf.c:userns_exec_1:4608 - Error setting up child mappings
  lxc 20170912175704.959 ERROR lxc_cgfsng - cgroups/cgfsng.c:recursive_destroy:1288 - Error destroying /sys/fs/cgroup/pids//lxc/snapcraft-opencv-example-2
  lxc 20170912175704.961 ERROR lxc_conf - conf.c:userns_exec_1:4608 - Error setting up child mappings
  lxc 20170912175704.961 ERROR lxc_cgfsng - cgroups/cgfsng.c:recursive_destroy:1288 - Error destroying /sys/fs/cgroup/cpu//lxc/snapcraft-opencv-example-2
  lxc 20170912175704.963 ERROR lxc_conf - conf.c:userns_exec_1:4608 - Error setting up child mappings
  lxc 20170912175704.963 ERROR lxc_cgfsng - cgroups/cgfsng.c:recursive_destroy:1288 - Error destroying /sys/fs/cgroup/perf_event//lxc/snapcraft-opencv-example-2
  lxc 20170912175704.965 ERROR lxc_conf - conf.c:userns_exec_1:4608 - Error setting up child mappings
  lxc 20170912175704.965 ERROR lxc_cgfsng - cgroups/cgfsng.c:recursive_destroy:1288 - Error destroying /sys/fs/cgroup/blkio//lxc/snapcraft-opencv-example-2
  lxc 20170912175704.966 ERROR lxc_conf - conf.c:userns_exec_1:4608 - Error setting up child mappings
  lxc 20170912175704.966 ERROR lxc_cgfsng - cgroups/cgfsng.c:recursive_destroy:1288 - Error destroying /sys/fs/cgroup/hugetlb//lxc/snapcraft-opencv-example-2
  lxc 20170912175704.968 ERROR lxc_conf - conf.c:userns_exec_1:4608 - Error setting up child mappings
  lxc 20170912175704.968 ERROR lxc_cgfsng - cgroups/cgfsng.c:recursive_destroy:1288 - Error destroying /sys/fs/cgroup/net_cls//lxc/snapcraft-opencv-example-2
  lxc 20170912175704.970 ERROR lxc_conf - conf.c:userns_exec_1:4608 - Error setting up child mappings
  lxc 20170912175704.970 ERROR lxc_cgfsng - cgroups/cgfsng.c:recursive_destroy:1288 - Error destroying /sys/fs/cgroup/freezer//lxc/snapcraft-opencv-example-2
  lxc 20170912175704.972 ERROR lxc_conf - conf.c:userns_exec_1:4608 - Error setting up child mappings
  lxc 20170912175704.972 ERROR lxc_cgfsng - cgroups/cgfsng.c:recursive_destroy:1288 - Error destroying /sys/fs/cgroup/cpuset//lxc/snapcraft-opencv-example-2
  lxc 20170912175704.974 ERROR lxc_conf - conf.c:userns_exec_1:4608 - Error setting up child mappings
  lxc 20170912175704.974 ERROR lxc_cgfsng - cgroups/cgfsng.c:recursive_destroy:1288 - Error destroying /sys/fs/cgroup/memory//lxc/snapcraft-opencv-example-2

Try `lxc info --show-log local:snapcraft-opencv-example` for more info
Stopping local:snapcraft-opencv-example
error: The container is already stopped
Try `lxc info --show-log local:snapcraft-opencv-example` for more info
Traceback (most recent call last):
  File "/home/kyrofa/src/snapcraft/snapcraft/internal/lxd.py", line 130, in _ensure_started
    self._ensure_container()
  File "/home/kyrofa/src/snapcraft/snapcraft/internal/lxd.py", line 331, in _ensure_container
    'lxc', 'start', self._container_name])
  File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['lxc', 'start', 'local:snapcraft-opencv-example']' returned non-zero exit status 1

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/kyrofa/src/snapcraft/bin/snapcraft", line 36, in <module>
    obj=dict(project=snapcraft.ProjectOptions()))
  File "/usr/lib/python3/dist-packages/click/core.py", line 716, in __call__
    return self.main(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 696, in main
    rv = self.invoke(ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 1037, in invoke
    return Command.invoke(self, ctx)
  File "/usr/lib/python3/dist-packages/click/core.py", line 889, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/lib/python3/dist-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/usr/lib/python3/dist-packages/click/decorators.py", line 17, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/kyrofa/src/snapcraft/snapcraft/cli/__init__.py", line 110, in run
    ctx.forward(lifecyclecli.commands['snap'])
  File "/usr/lib/python3/dist-packages/click/core.py", line 552, in forward
    return self.invoke(cmd, **kwargs)
  File "/usr/lib/python3/dist-packages/click/core.py", line 534, in invoke
    return callback(*args, **kwargs)
  File "/home/kyrofa/src/snapcraft/snapcraft/cli/lifecycle.py", line 132, in snap
    container_config, output, directory)
  File "/home/kyrofa/src/snapcraft/snapcraft/internal/lifecycle.py", line 322, in containerbuild
    metadata=config.get_metadata()).execute(step, args)
  File "/home/kyrofa/src/snapcraft/snapcraft/internal/lxd.py", line 404, in execute
    super().execute(step, args)
  File "/home/kyrofa/src/snapcraft/snapcraft/internal/lxd.py", line 146, in execute
    with self._ensure_started():
  File "/usr/lib/python3.5/contextlib.py", line 59, in __enter__
    return next(self.gen)
  File "/home/kyrofa/src/snapcraft/snapcraft/internal/lxd.py", line 136, in _ensure_started
    check_call(['lxc', 'stop', '-f', self._container_name])
  File "/usr/lib/python3.5/subprocess.py", line 581, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['lxc', 'stop', '-f', 'local:snapcraft-opencv-example']' returned non-zero exit status 1

I get the same errors if I run `lxc start snapcraft-opencv-example` myself, and other containers work fine. Compare the two configs:

$ lxc config show container-that-works
architecture: x86_64
config:
  volatile.base_image: 58f90cbf68927c3fc43e6ee1386446a04f3d8068c1a75a291339cb2be01dec08
  volatile.eth0.hwaddr: 00:16:3e:02:b3:34
  volatile.idmap.base: "0"
  volatile.idmap.next: '[]'
  volatile.last_state.idmap: '[]'
  volatile.last_state.power: STOPPED
devices:
  root:
    path: /
    type: disk
ephemeral: false
profiles:
- snapcraft-src
stateful: false

$ lxc config show snapcraft-opencv-example
architecture: x86_64
config:
  environment.LC_ALL: C.UTF-8
  environment.SNAPCRAFT_SETUP_CORE: "1"
  raw.idmap: both 1000 0
  volatile.base_image: 58f90cbf68927c3fc43e6ee1386446a04f3d8068c1a75a291339cb2be01dec08
  volatile.eth0.hwaddr: 00:16:3e:6f:57:da
  volatile.idmap.base: "0"
  volatile.idmap.next: '[{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":0,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":165537,"Nsid":1,"Maprange":65535},{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":0,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":165537,"Nsid":1,"Maprange":65535}]'
  volatile.last_state.idmap: '[{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":0,"Maprange":1},{"Isuid":true,"Isgid":false,"Hostid":165537,"Nsid":1,"Maprange":65535},{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":0,"Maprange":1},{"Isuid":false,"Isgid":true,"Hostid":165537,"Nsid":1,"Maprange":65535}]'
  volatile.last_state.power: RUNNING
devices:
  fuse:
    path: /dev/fuse
    type: unix-char
  root:
    path: /
    type: disk
ephemeral: false
profiles:
- default
stateful: false

Since the errors are about mappings, I suspect something is wrong in the idmap.
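
To make that suspicion easier to check, here is a minimal sketch (not part of Snapcraft) that decodes a `volatile.idmap.next` value like the one in the config above and returns one readable line per entry. The helper name `describe_idmap` is hypothetical; the JSON keys are the ones that appear in the `lxc config show` output.

```python
import json

# Value copied from the `volatile.idmap.next` key of snapcraft-opencv-example above.
IDMAP_NEXT = (
    '[{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":0,"Maprange":1},'
    '{"Isuid":true,"Isgid":false,"Hostid":165537,"Nsid":1,"Maprange":65535},'
    '{"Isuid":true,"Isgid":true,"Hostid":1000,"Nsid":0,"Maprange":1},'
    '{"Isuid":false,"Isgid":true,"Hostid":165537,"Nsid":1,"Maprange":65535}]'
)


def describe_idmap(raw):
    """Return one human-readable line per idmap entry (hypothetical helper)."""
    lines = []
    for entry in json.loads(raw):
        # An entry can apply to uids, gids, or both.
        kinds = [name for name, flag in (("uid", entry["Isuid"]),
                                         ("gid", entry["Isgid"])) if flag]
        lines.append("{}: host {} -> container {} (range {})".format(
            "+".join(kinds), entry["Hostid"], entry["Nsid"], entry["Maprange"]))
    return lines


if __name__ == "__main__":
    for line in describe_idmap(IDMAP_NEXT):
        print(line)
```

The entries that map host id 1000 to container id 0 correspond to the `raw.idmap: both 1000 0` setting, which the working container does not have.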

Tags: container
Kyle Fazzari (kyrofa)
description: updated
Leo Arias (elopio)
tags: added: container
Cris Dywan (kalikiana) wrote :

At first I couldn't reproduce this, but eventually I managed to reproduce it from a VPS I hadn't used LXD on. Some investigation led me to find this on my regular build host:

    /etc/sub{u,g}id
    root:1000:1
    [...]

Removing that entry, which I must've added at one point, lets me reproduce the failure on both machines. I did not think it was required for the feature to work. I'm now investigating whether the container setup done by Snapcraft needs changes to work out of the box (since there are multiple ways to do id mapping).

Changed in snapcraft:
status: New → In Progress
assignee: nobody → Christian Dywan (kalikiana)
Cris Dywan (kalikiana) wrote :

Also: I can only reproduce this on a LXD 2.0.10 remote running the deb (not with a remote using the snap with LXD 2.17).

Kyle Fazzari (kyrofa) wrote :

Indeed, I should have specified: I'm using lxd v2.0.10 (the latest in the default Xenial archives).

Cris Dywan (kalikiana) wrote :

After some discussion with Stephane, it seems my dogfooding of the lxd snap made me unaware that setting /etc/sub{u,g}id is required for non-snap lxd versions (the snap is, surprisingly, less restricted, since it works off the core's filesystem).
The obvious solution is to show an error message with instructions on how to add the missing configuration.
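
As a rough illustration of that approach, the sketch below checks the contents of /etc/subuid or /etc/subgid for a `root:<start>:<count>` entry covering a given host id, and builds an error message when none is found. This is not the code from Snapcraft: the helper names and the message wording are hypothetical, and the `root:1000:1` entry format is taken from the earlier comment.

```python
import os


def has_root_mapping(subid_text, host_id):
    """Return True if a `root:<start>:<count>` line in subid_text covers host_id."""
    for line in subid_text.splitlines():
        parts = line.strip().split(":")
        if len(parts) != 3 or parts[0] != "root":
            continue
        try:
            start, count = int(parts[1]), int(parts[2])
        except ValueError:
            continue  # Skip lines whose range fields are not numeric.
        if start <= host_id < start + count:
            return True
    return False


def missing_mapping_message(path, host_id):
    """Build hypothetical instructions for adding the missing entry."""
    return ("{path} has no entry allowing root to map id {host_id}. "
            "Add the line 'root:{host_id}:1' to {path}."
            .format(path=path, host_id=host_id))


if __name__ == "__main__":
    checks = (("/etc/subuid", os.getuid()), ("/etc/subgid", os.getgid()))
    for path, host_id in checks:
        try:
            with open(path) as subid_file:
                text = subid_file.read()
        except OSError:
            text = ""  # Treat an unreadable or missing file as having no entries.
        if not has_root_mapping(text, host_id):
            print(missing_mapping_message(path, host_id))
```

Keeping the parsing separate from the file access makes the check easy to exercise with sample text such as `root:1000:1`.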

Cris Dywan (kalikiana) wrote :

https://github.com/snapcore/snapcraft/pull/1553 implements what I called the "obvious solution" above. There does not seem to be a way to make this work without touching the host, unless the lxd snap is used or we consider using sshfs for local remotes.

Cris Dywan (kalikiana)
Changed in snapcraft:
status: In Progress → Fix Released