Comment 0 for bug 1892895

Revision history for this message
Dmitrii Shcherbakov (dmitriis) wrote :

I found a race condition which can be avoided by using wildcard rules in device cgroups, however, I do not see a way to enable that in an interface.

There is a use-case for MicroStack where iSCSI targets are added to the host kernel as block devices via iscsid + the iscsi-tcp kernel module.

An immediate idea is to:

* add block-devices interface to nova-compute and libvirtd apps;
* as a result, get major and minor devices of the hot-plugged devices added to device cgroups of Nova and libvirtd (/sys/fs/cgroup/devices/snap.microstack.{nova-compute, libvirtd}/devices.list).
  * This part of the interface makes sure of that: https://github.com/snapcore/snapd/blob/2.46/interfaces/builtin/block_devices.go#L97

As it turns out, this approach is racy since the device is attempted to be used prior to its major and minor number being added to the relevant device cgroup via: /sys/fs/cgroup/devices/snap.microstack.{nova-compute, libvirtd}/devices.allow

snap-device-helper is responsible for that https://github.com/snapcore/snapd/blob/2.46/cmd/snap-confine/snap-device-helper#L73

In essence, the block special file is created and used prior to the time when snapd runs snap-device-helper and confined applications are not synchronized with the operation of the helper in any way.

In the failure mode I observe consistently, I get "Operation not permitted" which is the EPERM returned from the kernel when it enforces accesses based on what is present in the device cgroup:
https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/focal/tree/security/device_cgroup.c?h=Ubuntu-5.4.0-44.48#n823

Specific to my use-case, what I see is that Nova tells libvirt to use a block device which fails with EPERM. Then Nova tries to remove the volume it just tried to attach and do `blockdev --flushbufs` in the process which fails as well:

* try: virt_driver.attach_volume (Nova) -> virStorageFileReportBrokenChain (libvirt) -> Cannot access storage file '/dev/sde': Operation not permitted -> libvirt.libvirtError Cannot access storage file '/dev/sde': Operation not permitted
* except: "Driver failed to attach volume..." -> volume_api.attachment_delete -> ... -> flush_device_io -> blockdev --flushbufs /dev/sde -> blockdev: cannot open /dev/sde: Operation not permitted
https://opendev.org/openstack/nova/src/branch/stable/ussuri/nova/virt/block_device.py#L498-L510 (Nova code)
https://git.launchpad.net/ubuntu/+source/libvirt/tree/src/util/virstoragefile.c?h=applied/ubuntu/focal#n4877 ("Cannot access storage" in libvirt)

https://paste.ubuntu.com/p/RTgq8XkzY6/ (logs)

If I add a wildcard rule to allow devices with any minor number and a certain major number to be used, this race condition is avoided.

sudo bash -c 'echo b 8:* rwm > /sys/fs/cgroup/devices/snap.microstack.libvirtd/devices.allow'
sudo bash -c 'echo b 8:* rwm > /sys/fs/cgroup/devices/snap.microstack.nova-compute/devices.allow'