unable to allow an app to access all devices with a certain major number via a <majordev>:* device cgroup rule
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
snapd | Confirmed | High | Unassigned | | |
Bug Description
I found a race condition that can be avoided by using wildcard rules in device cgroups. However, I do not see a way to enable such rules from an interface.
There is a use-case for MicroStack where iSCSI targets are added to the host kernel as block devices via iscsid + the iscsi-tcp kernel module.
An immediate idea is to:
* add the block-devices interface to the nova-compute and libvirtd apps;
* as a result, get major and minor devices of the hot-plugged devices added to device cgroups of Nova and libvirtd (/sys/fs/
* This part of the interface makes sure of that: https:/
As it turns out, this approach is racy: the application attempts to use the device before its major and minor numbers have been added to the relevant device cgroup via: /sys/fs/
snap-device-helper is responsible for that https:/
In essence, the block special file is created and used before snapd runs snap-device-helper, and confined applications are not synchronized with the operation of the helper in any way.
In the failure mode I observe consistently, I get "Operation not permitted", which is the EPERM the kernel returns when it checks an access against the entries present in the device cgroup:
https:/
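The device cgroup check works on the device type plus its major and minor numbers, not on the path of the device node. As a minimal sketch (the function names `device_key` and `device_key_for_path` are hypothetical and only for illustration), this is one way to derive that triple for a node, for example to compare it by hand against the entries in the cgroup's device list:

```python
import os
import stat


def device_key(mode, rdev):
    """Return (type, major, minor) for a device, given st_mode and st_rdev.

    type is "b" for block devices and "c" for character devices, which is
    the notation used in device cgroup entries such as "b 8:* rwm".
    """
    if stat.S_ISBLK(mode):
        dev_type = "b"
    elif stat.S_ISCHR(mode):
        dev_type = "c"
    else:
        raise ValueError("not a device special file")
    return dev_type, os.major(rdev), os.minor(rdev)


def device_key_for_path(path):
    """Hypothetical convenience wrapper: stat a node such as /dev/sdb."""
    st = os.stat(path)
    return device_key(st.st_mode, st.st_rdev)
```

Calling `device_key_for_path` on a freshly hot-plugged device node shows which major:minor pair would have to be present in the device cgroup before the first access succeeds.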
Specific to my use-case, Nova tells libvirt to use a block device, and that fails with EPERM. Nova then tries to remove the volume it just tried to attach and runs `blockdev --flushbufs` in the process, which fails as well:
* try: virt_driver.
* except: "Driver failed to attach volume..." -> volume_
https:/
https:/
https:/
If I add a wildcard rule to allow devices with any minor number and a certain major number to be used, this race condition is avoided.
sudo bash -c 'echo b 8:* rwm > /sys/fs/
sudo bash -c 'echo b 8:* rwm > /sys/fs/
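To illustrate why the wildcard entry sidesteps the race, here is a simplified model of how a `type major:minor access` entry is matched. It is a sketch for reasoning about the rules, not the kernel's implementation, and the names `parse_rule` and `rule_allows` are hypothetical:

```python
def parse_rule(line):
    """Split an entry such as 'b 8:* rwm' into type, major, minor, access."""
    dev_type, numbers, access = line.split()
    major, minor = numbers.split(":")
    return dev_type, major, minor, access


def rule_allows(line, dev_type, major, minor, access):
    """Return True if this single entry permits `access` to the device.

    `access` is a string made of the letters r, w and m. A '*' in the
    entry matches any major or minor number, and type 'a' matches both
    block and character devices.
    """
    r_type, r_major, r_minor, r_access = parse_rule(line)
    type_ok = r_type in ("a", dev_type)
    major_ok = r_major in ("*", str(major))
    minor_ok = r_minor in ("*", str(minor))
    access_ok = set(access) <= set(r_access)
    return type_ok and major_ok and minor_ok and access_ok


if __name__ == "__main__":
    # Entry written for one specific device (8:0) vs. a new device at 8:16.
    print(rule_allows("b 8:0 rwm", "b", 8, 16, "rw"))
    # Wildcard entry for the whole major number vs. the same new device.
    print(rule_allows("b 8:* rwm", "b", 8, 16, "rw"))
```

With per-device entries, a newly hot-plugged device at 8:16 is not covered until its own entry is written. With `b 8:*`, it is covered from the moment it appears, so there is no window in which the access can fail.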
-------
Another, simpler use-case where this applies is working with loop devices.
If I have this in an interface:
const connectedPlugAp
/dev/loop-control rw,
/dev/loop[0-9]* rw,
`
var microStackConne
`SUBSYSTEM=
`SUBSYSTEM=
}
And try to use `losetup -f` when no free loop devices are available:
fallocate -l $loop_file_size $loop_file
losetup -f $loop_file
I will get "Operation not permitted" during the losetup invocation since the device cgroup entry is not added fast enough.
This is a much simpler reproducer than the one with iSCSI.
-------
Update (09-09-2020):
Found one more use-case which is LV activation after reboot:
* reboot -> LV Status NOT available;
* lvchange -a y <vgname-for-lvs> -> device-mapper: reload ioctl on (253:3) failed: Operation not permitted
description: | updated |
Changed in snapd: | |
assignee: | nobody → Zygmunt Krynicki (zyga) |
description: | updated |
Changed in snapd: | |
importance: | Undecided → Medium |
importance: | Medium → High |
Changed in snapd: | |
assignee: | Zygmunt Krynicki (zyga) → nobody |
I've analyzed the problem and I need to discuss my findings with the rest of the snapd team. I have several ideas on how to avoid this problem, in addition to the suggestion provided by the reporter.