Comment 20 for bug 1576341

Nish Aravamudan (nacc) wrote :

Wanted to level-set (and am subscribing pitti and hallyn for their advice):

1) LXD unprivileged containers:

Four services in the Zesty daily are in a failed state after boot:

1.a) iscsid.service

  This is because iscsid needs CAP_IPC_LOCK to call mlockall(); in unprivileged containers the call is rejected by the host kernel.

  I believe the right way to go about this is to move the change hallyn made to open-iscsi.service into iscsid.service, and to make open-iscsi.service properly depend on iscsid.service. I also think hallyn's change is too broad: it means even privileged containers cannot use iSCSI, which does not seem to be strictly true.

1.a.1) Proposed solution: http://paste.ubuntu.com/24196051/

  Effectively, only run iscsid if not in a user namespace (which is where the capabilities get dropped, AIUI). The open-iscsi service gains conditions (adapted from Fedora's service file) to check that nodes are defined (which would imply some configuration has been done) and that a session exists (which I think means /etc/iscsi/iscsid.conf contains node.startup=automatic and iscsid has therefore started a session).

  If we are worried about potential breakage (I of course need to test all of this in the various iSCSI configurations), we could consider making just the first change (ConditionVirtualization=!private-users) in both .service files. But that feels mostly like a workaround for not being able to express the dependency between the two services cleanly: open-iscsi.service can only run if iscsid.service is running, but if iscsid.service is not running because of a Condition, open-iscsi.service should not end up in a failed state. A rough sketch of the unit changes is below.
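
  To make the proposal concrete, here is a rough sketch of what the unit changes might look like. The paste above is authoritative; the Fedora-derived condition paths in particular are assumptions on my part:

    # iscsid.service (excerpt)
    [Unit]
    # skip iscsid entirely inside user namespaces, where CAP_IPC_LOCK
    # is not effective against the host kernel
    ConditionVirtualization=!private-users

    # open-iscsi.service (excerpt; conditions adapted from Fedora's
    # unit, exact paths are assumptions)
    [Unit]
    # only run if nodes have been configured and a session exists
    ConditionDirectoryNotEmpty=/etc/iscsi/nodes
    ConditionDirectoryNotEmpty=/sys/class/iscsi_session
    Requires=iscsid.service
    After=iscsid.service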

1.b) systemd-remount-fs.service

  z2 systemd-remount-fs[50]: mount: can't find LABEL=cloudimg-rootfs

  /etc/fstab:
    LABEL=cloudimg-rootfs / ext4 defaults 0 0

  AFAICT this doesn't really make sense in the context of LXD containers: they don't necessarily have a /dev/disk/by-label, and in practice / is configured by LXD, not by how the cloud image is configured.

1.b.1) Proposed solution: comment out the entry in /etc/fstab in the LXD images, as sketched below.
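
  A minimal sketch of the resulting /etc/fstab entry in the LXD images:

    # / is managed by LXD, not by the image's fstab
    # LABEL=cloudimg-rootfs / ext4 defaults 0 0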

1.c) lvm2-lvmetad.socket

  lvm[61]: Daemon lvmetad returned error 104
  lvm[61]: WARNING: Failed to connect to lvmetad. Falling back to device scanning.
  ...
  lvm2-lvmetad.socket: Trigger limit hit, refusing further activation.

  But manually running `systemctl start lvm2-lvmetad.socket` from an `lxc exec z1 bash` shell works. That seems confusing and implies some sort of ordering issue? (Note that, confusingly, `systemctl restart lvm2-lvmetad.socket` does *not* work.)

1.c.1) I don't have a proposed solution for this; a few diagnostic commands that might help narrow down the suspected ordering issue are sketched below.
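
  A couple of commands that might help pin down where in the boot ordering the trigger limit is being hit (container name z1 as above; output not shown):

    # show when the socket became active relative to the rest of boot
    lxc exec z1 -- systemd-analyze critical-chain lvm2-lvmetad.socket
    # collect this boot's messages for the socket and the service it activates
    lxc exec z1 -- journalctl -b -u lvm2-lvmetad.socket -u lvm2-lvmetad.service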

1.d) systemd-journal-audit.socket

  I found this older thread on the topic: https://lists.freedesktop.org/archives/systemd-devel/2015-May/032113.html. Specifically, https://lists.freedesktop.org/archives/systemd-devel/2015-May/032126.html.

  Looking at the socket file, though, I see:

  ConditionCapability=CAP_AUDIT_READ

  which I do not believe is the same as CAP_ADMIN_READ. I don't know whether the ML post or the change is incorrect, but I did verify that using CAP_ADMIN_READ in the container instead of CAP_AUDIT_READ correctly conditionalized the socket start, while CAP_AUDIT_READ does not.

1.d.1) Proposed solution: change the ConditionCapability to CAP_ADMIN_READ, e.g. via the drop-in sketched below.
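
  A minimal sketch of testing this with a drop-in (via `systemctl edit systemd-journal-audit.socket`) rather than editing the shipped unit. Condition*= settings are list-valued, so the empty assignment is needed to reset the stock condition first. (Caveat: CAP_ADMIN_READ does not appear in capabilities(7), so it is worth confirming whether this behaves as a real capability check or simply as a condition that can never pass.)

    [Unit]
    # reset the shipped ConditionCapability=CAP_AUDIT_READ, then apply the proposed one
    ConditionCapability=
    ConditionCapability=CAP_ADMIN_READ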

2) Privileged containers

2.a) systemd-remount-fs.service

  Same as 1.b) above.

2.b) lvm2-lvmetad.socket

  Same as 1.c) above.

With the changes from 1.a.1), 1.b.1) and 1.d.1) applied:

3) Unprivileged container

3.a) Only 1.c) remains, and after issuing `systemctl start lvm2-lvmetad.socket`, `systemctl status` reports 'running'.

4) Privileged container

4.a) Same as 3.a).