Cannot remove detached devices from container

Bug #1690299 reported by Michal Kovarik
This bug affects 2 people
Affects: lxd (Ubuntu)
Status: Fix Released
Importance: Medium
Assigned to: Stéphane Graber

Bug Description

When multiple unix-char or unix-block devices are attached to a container and then detached from the system, it's not possible to remove those missing devices from the container.

Steps to reproduce:
1) plug two USB drives into the system (in my case sdb, sdc)
2) create container: lxc launch ubuntu: test
3) attach devices:
lxc config device add test sdb unix-block path=/dev/sdb
lxc config device add test sdc unix-block path=/dev/sdc
4) stop the container: lxc stop test
5) remove usb drives physically
6) try to start container
7) try to remove the sdb and sdc devices

Current results:
$ lxc start test
error: Missing source '/dev/sdb' for device 'sdb'
Try `lxc info --show-log test` for more info
$ lxc config device remove test sdb
error: The device path doesn't exist on the host and major/minor wasn't specified.
$ lxc config device list test
sdb: unix-block
sdc: unix-block

Michal Kovarik (michkov) wrote :

Version - 2.12-0ubuntu1~ubuntu16.04.1

Michal Kovarik (michkov)
description: updated
Stéphane Graber (stgraber) wrote :

Okay, so I figured out what's going on here and while not ideal, it actually makes sense.

unix-char and unix-block devices don't have an "optional" property, therefore LXD will fail to start any container which lists such a device without the host path existing.

LXD also validates the container configuration whenever a change is made to it.
When removing one of the devices, the client tool gives LXD an updated container config which still includes the remaining device. That device is invalid, so LXD fails the config validation and returns the error.

The way around the issue is to use "lxc config edit" to remove both devices in one shot. That way the LXD configuration will be valid.
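
The behaviour described above can be sketched with a toy model. This is not LXD's actual code: `validate`, `update_config` and `host_paths` are hypothetical names, and the set of host paths merely stands in for the host's /dev. It only illustrates why removing a single device is rejected while removing both in one update is accepted.

```python
# Toy model of the validation behaviour; not LXD's implementation.
# All function and variable names here are hypothetical.

def validate(devices, host_paths):
    """Reject the whole config if any unix-block/unix-char source is missing."""
    for name, dev in devices.items():
        if dev["type"] in ("unix-block", "unix-char") and dev["path"] not in host_paths:
            raise ValueError(f"missing source {dev['path']!r} for device {name!r}")


def update_config(devices, host_paths):
    """Accept a full replacement config only if every remaining device is valid."""
    validate(devices, host_paths)
    return devices


host_paths = set()  # both USB drives have been unplugged
config = {
    "sdb": {"type": "unix-block", "path": "/dev/sdb"},
    "sdc": {"type": "unix-block", "path": "/dev/sdc"},
}

# Removing only sdb still submits sdc, whose source is missing, so validation fails.
without_sdb = {k: v for k, v in config.items() if k != "sdb"}
try:
    update_config(without_sdb, host_paths)
    single_removal_ok = True
except ValueError as err:
    single_removal_ok = False
    single_removal_error = str(err)

# Removing both at once (what "lxc config edit" allows) leaves nothing to reject.
result = update_config({}, host_paths)
```

In this model the failure comes from the device that stays in the config, not from the one being removed, which matches the explanation above.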

There are a couple of things we could do to make this easier:
 - Add an "optional" flag to unix-char and unix-block, allowing you to specify devices which may not yet exist. The container would then start fine and the config would be considered valid by the validator even if those devices are missing. When the device shows up again, a restart would be needed to have it show up in the container again.

 - Allow for multiple device names to be passed to "lxc config device remove". This would let you remove both devices in one shot, resulting in a valid configuration.

Let me know if either/both of those would be useful to you. Until then, use "lxc config edit" to remove both devices.

Changed in lxd (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
Michal Kovarik (michkov) wrote :

Both proposals are useful.

Stéphane Graber (stgraber) wrote :

The removal of multiple devices at once, I'll do as part of my new-client branch as the new client library makes that much easier. I spent 30min trying to hook things up with the old client library, but all our helper functions are meant to interact with a single device at once...

Adding the "optional" property isn't blocked on the port to the new client library though, so hopefully I'll get to that over the next few days.

Stéphane Graber (stgraber) wrote :

stgraber@castiana:~/data/code/lxc/lxd (stgraber/master)$ lxc config device remove c1 test1 test2
Device test1, test2 removed from c1

So I have that part implemented now, it should make it to LXD 2.16.
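
As a rough illustration of the batch removal, here is a minimal sketch of a helper that takes several device names, drops them all, and produces a single updated device map. The function name `remove_devices`, its signature, and the error text are assumptions made for this example and are not taken from the LXD client; only the success message format mirrors the output shown above.

```python
# Hypothetical helper; not the actual LXD client code.

def remove_devices(container, devices, names):
    """Remove every named device in one step and return the new device map."""
    missing = [n for n in names if n not in devices]
    if missing:
        # Error wording is an assumption for illustration only.
        raise KeyError(f"unknown device(s): {', '.join(missing)}")
    # Build the final config once, so it is submitted as a single update.
    remaining = {k: v for k, v in devices.items() if k not in names}
    message = f"Device {', '.join(names)} removed from {container}"
    return remaining, message


devices = {
    "test1": {"type": "unix-block", "path": "/dev/sdb"},
    "test2": {"type": "unix-block", "path": "/dev/sdc"},
    "eth0": {"type": "nic"},
}
remaining, message = remove_devices("c1", devices, ["test1", "test2"])
```

Because the updated device map is built once with all named devices already gone, the server would only ever see the final state, never an intermediate config that still lists a missing device.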

Stéphane Graber (stgraber) wrote :

Just spent an hour attempting to implement the "optional" property and I'm afraid it can't be done in a sane way due to:
 - We rely on the path existing on the host at the time the device is added to read a number of flags from it for config validation
 - We rely on the bind-mount device existing during removal so we can unconfigure the cgroup limits.

I had a commit which attempted to work around most of those, but it ended up making "optional" such a special case that half the validation code would be bypassed, and cgroup configuration might have been left behind after device removal, which could be a security risk for privileged containers...

So I'll only be submitting the batch removal fix for this issue.

Changed in lxd (Ubuntu):
status: Triaged → Fix Committed
Changed in lxd (Ubuntu):
assignee: nobody → Stéphane Graber (stgraber)
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lxd - 2.15-0ubuntu5

---------------
lxd (2.15-0ubuntu5) artful; urgency=medium

  * Cherry-pick upstream fixes:
    - 0006-lxc-config-Removal-of-multiple-devices-at-once.patch (LP: #1690299)
    - 0007-network-Don-t-fail-on-non-process-PIDs.patch (LP: #1698712)
    - 0008-config-Try-to-be-clever-about-in-snapshots.patch (LP: #1694855)
    - 0009-Fix-readonly-mode-for-directory-mount.patch
    - 0010-client-Fix-race-condition-in-operation-handling.patch
    - 0011-import-keep-volatile-keys.patch
    - 0012-import-remove-last-dependency-on-symlink.patch
    - 0013-Better-handle-errors-in-memory-reporting.patch
    - 0014-client-Don-t-live-migrate-stopped-containers.patch

 -- Stéphane Graber <email address hidden> Mon, 03 Jul 2017 18:19:16 -0400

Changed in lxd (Ubuntu):
status: Fix Committed → Fix Released