Comment 1 for bug 1953563

Maciej Borzecki (maciek-borzecki) wrote :

I see no evidence of this being caused by cgroups v2. In fact all services are up.

Nested Ubuntu 20.04 container on a 21.10 host:

root@my-ubuntu-confined:~# snap-store-proxy status
Store ID: not registered
Internal Service Status:
  memcached: running
  nginx: running
  snapauth: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8005/_status/check
  snapdevicegw: not running: getresponse() got an unexpected keyword argument 'buffering'
  snapdevicegw-local: not running: [Errno 111] Connection refused
  snapproxy: not running: [Errno 111] Connection refused
  snaprevs: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8002/_status/check
root@my-ubuntu-confined:~# snap services snap-store-proxy
Service Startup Current Notes
snap-store-proxy.memcached disabled active -
snap-store-proxy.nginx disabled active -
snap-store-proxy.snapassert disabled inactive -
snap-store-proxy.snapauth disabled active -
snap-store-proxy.snapdevicegw disabled active -
snap-store-proxy.snapident disabled inactive -
snap-store-proxy.snapproxy disabled active -
snap-store-proxy.snaprevs disabled active -

The services are up, though some of them appear to be retrying operations and logging the retries. At the same time I observe AppArmor denials on the host:

Dec 08 08:25:58 dec080806-781058 kernel: audit: type=1400 audit(1638951958.783:675): apparmor="DENIED" operation="capable" namespace="root//lxd-my-ubuntu-confined_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=10881 comm="python3" capability=0 capname="chown"

Dec 08 08:25:59 dec080806-781058 audit[10856]: AVC apparmor="DENIED" operation="capable" namespace="root//lxd-my-ubuntu-confined_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapproxy" pid=10856 comm="python3" capability=0 capname="chown"
Dec 08 08:25:59 dec080806-781058 kernel: audit: type=1400 audit(1638951959.639:676): apparmor="DENIED" operation="capable" namespace="root//lxd-my-ubuntu-confined_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapproxy" pid=10856 comm="python3" capability=0 capname="chown"
Dec 08 08:25:59 dec080806-781058 audit[10881]: AVC apparmor="DENIED" operation="capable" namespace="root//lxd-my-ubuntu-confined_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=10881 comm="python3" capability=0 capname="chown"
Dec 08 08:25:59 dec080806-781058 kernel: audit: type=1400 audit(1638951959.787:677): apparmor="DENIED" operation="capable" namespace="root//lxd-my-ubuntu-confined_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=10881 comm="python3" capability=0 capname="chown"
Dec 08 08:26:00 dec080806-781058 audit[10856]: AVC apparmor="DENIED" operation="capable" namespace="root//lxd-my-ubuntu-confined_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapproxy" pid=10856 comm="python3" capability=0 capname="chown"
Dec 08 08:26:00 dec080806-781058 kernel: audit: type=1400 audit(1638951960.643:678): apparmor="DENIED" operation="capable" namespace="root//lxd-my-ubuntu-confined_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapproxy" pid=10856 comm="python3" capability=0 capname="chown"
Dec 08 08:26:00 dec080806-781058 audit[10881]: AVC apparmor="DENIED" operation="capable" namespace="root//lxd-my-ubuntu-confined_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=10881 comm="python3" capability=0 capname="chown"
Dec 08 08:26:00 dec080806-781058 kernel: audit: type=1400 audit(1638951960.787:679): apparmor="DENIED" operation="capable" namespace="root//lxd-my-ubuntu-confined_<var-snap-lxd-common-lxd>" profile="snap.snap-store-proxy.snapdevicegw" pid=10881 comm="python3" capability=0 capname="chown"
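
To make the pattern in these denials easier to see, here is a minimal sketch (not part of snapd or snap-store-proxy) that groups AppArmor DENIED records by profile, operation and capability name. The function names parse_denial and summarize_denials are made up for illustration, and the two sample lines are copied from the log above. Each event seems to be logged twice in the journal (once as "audit[pid]: AVC" and once as "kernel: audit:"), so the raw counts may be roughly double the number of actual denials.

```python
import re
from collections import Counter

# Matches key="value" and key=value pairs in an audit record.
FIELD_RE = re.compile(r'(\w+)=(?:"([^"]*)"|(\S+))')


def parse_denial(line):
    """Return the fields of an AppArmor DENIED record as a dict, or None."""
    if 'apparmor="DENIED"' not in line:
        return None
    fields = {}
    for key, quoted, bare in FIELD_RE.findall(line):
        fields[key] = quoted if quoted else bare
    return fields


def summarize_denials(lines):
    """Count denials per (profile, operation, capname)."""
    counts = Counter()
    for line in lines:
        fields = parse_denial(line)
        if fields is None:
            continue
        key = (fields.get("profile"), fields.get("operation"), fields.get("capname"))
        counts[key] += 1
    return counts


# Two records copied from the host journal above.
SAMPLE = [
    'Dec 08 08:25:58 dec080806-781058 kernel: audit: type=1400 '
    'audit(1638951958.783:675): apparmor="DENIED" operation="capable" '
    'namespace="root//lxd-my-ubuntu-confined_<var-snap-lxd-common-lxd>" '
    'profile="snap.snap-store-proxy.snapdevicegw" pid=10881 comm="python3" '
    'capability=0 capname="chown"',
    'Dec 08 08:25:59 dec080806-781058 audit[10856]: AVC apparmor="DENIED" '
    'operation="capable" '
    'namespace="root//lxd-my-ubuntu-confined_<var-snap-lxd-common-lxd>" '
    'profile="snap.snap-store-proxy.snapproxy" pid=10856 comm="python3" '
    'capability=0 capname="chown"',
]

if __name__ == "__main__":
    for (profile, operation, capname), n in sorted(summarize_denials(SAMPLE).items()):
        print(profile, operation, capname, n)
```

Run over the full excerpt, this should show only two profiles involved (snapdevicegw and snapproxy), both denied the chown capability. Those are also the services that report "running" once confinement is lifted.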

Ubuntu 21.10 container on a 21.10 host; status reports many services as not running:

root@my-ubuntu:~# snap-store-proxy status
Store ID: not registered
Internal Service Status:
  memcached: running
  nginx: running
  snapauth: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8005/_status/check
  snapdevicegw: not running: getresponse() got an unexpected keyword argument 'buffering'
  snapdevicegw-local: not running: [Errno 111] Connection refused
  snapproxy: not running: [Errno 111] Connection refused
  snaprevs: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8002/_status/check

But snap services snap-store-proxy reports they are up:
root@my-ubuntu:~# snap services snap-store-proxy
Service Startup Current Notes
snap-store-proxy.memcached disabled active -
snap-store-proxy.nginx disabled active -
snap-store-proxy.snapassert disabled inactive -
snap-store-proxy.snapauth disabled active -
snap-store-proxy.snapdevicegw disabled active -
snap-store-proxy.snapident disabled inactive -
snap-store-proxy.snapproxy disabled active -
snap-store-proxy.snaprevs disabled active -

So the failure mode appears to be similar.
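
The mismatch between the two views can be checked mechanically. The following is a rough sketch, assuming the output formats shown above stay as they are; parse_status, parse_services and health_mismatches are hypothetical helper names, not part of any snap tooling. It lists services that snapd considers active but that fail the snap-store-proxy health check.

```python
def parse_status(text):
    """Map service name -> health string from `snap-store-proxy status` output."""
    health = {}
    in_section = False
    for line in text.splitlines():
        if line.strip() == "Internal Service Status:":
            in_section = True
            continue
        if in_section and ": " in line:
            name, _, state = line.strip().partition(": ")
            health[name] = state
    return health


def parse_services(text, snap="snap-store-proxy"):
    """Map service name -> Current column from `snap services <snap>` output."""
    current = {}
    prefix = snap + "."
    for line in text.splitlines():
        parts = line.split()
        if len(parts) >= 3 and parts[0].startswith(prefix):
            current[parts[0][len(prefix):]] = parts[2]
    return current


def health_mismatches(health, current):
    """Services that snapd reports as active but whose health check fails."""
    return sorted(
        name for name, state in health.items()
        if state != "running" and current.get(name) == "active"
    )


# Output copied from the 21.10-on-21.10 case above.
STATUS = """\
Store ID: not registered
Internal Service Status:
  memcached: running
  nginx: running
  snapauth: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8005/_status/check
  snapdevicegw: not running: getresponse() got an unexpected keyword argument 'buffering'
  snapdevicegw-local: not running: [Errno 111] Connection refused
  snapproxy: not running: [Errno 111] Connection refused
  snaprevs: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8002/_status/check
"""

SERVICES = """\
Service Startup Current Notes
snap-store-proxy.memcached disabled active -
snap-store-proxy.nginx disabled active -
snap-store-proxy.snapassert disabled inactive -
snap-store-proxy.snapauth disabled active -
snap-store-proxy.snapdevicegw disabled active -
snap-store-proxy.snapident disabled inactive -
snap-store-proxy.snapproxy disabled active -
snap-store-proxy.snaprevs disabled active -
"""

if __name__ == "__main__":
    print(health_mismatches(parse_status(STATUS), parse_services(SERVICES)))
```

snapdevicegw-local has no entry of its own in the snap services listing, so this sketch leaves it out rather than guessing which unit backs it.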

Disabling AppArmor in the nested containers and making them privileged:
lxc config set my-ubuntu raw.lxc 'lxc.apparmor.profile=unconfined'
lxc config set my-ubuntu security.privileged true

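This makes the problem go away for snapdevicegw, snapdevicegw-local and snapproxy; snapauth and snaprevs still fail their status checks with a 500 error: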

root@my-ubuntu:~# snap-store-proxy status
Store ID: not registered
Internal Service Status:
  memcached: running
  nginx: running
  snapauth: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8005/_status/check
  snapdevicegw: running
  snapdevicegw-local: running
  snapproxy: running
  snaprevs: not running: 500 Server Error: INTERNAL SERVER ERROR for url: http://127.0.0.1:8002/_status/check
root@my-ubuntu:~# snap services snap-store-proxy
Service Startup Current Notes
snap-store-proxy.memcached disabled active -
snap-store-proxy.nginx disabled active -
snap-store-proxy.snapassert disabled inactive -
snap-store-proxy.snapauth disabled active -
snap-store-proxy.snapdevicegw disabled active -
snap-store-proxy.snapident disabled inactive -
snap-store-proxy.snapproxy disabled active -
snap-store-proxy.snaprevs disabled active -

There are no more denials at this point. I suspect something is wrong with AppArmor 3 handling here (stacking, maybe?) that only shows up on hosts with newer kernels.