lxd should avoid spurious mounting of ZFS datasets on existing pools

Bug #1629824 reported by Richard Laager
8
This bug affects 1 person
Affects Status Importance Assigned to Milestone
lxd (Ubuntu)
Won't Fix
Wishlist
Unassigned

Bug Description

lxd should ensure that, when using an existing pool, its datasets are not spuriously mounted.

--- Original Description:

Currently, lxd is not explicitly creating the intermediate "containers" and "images" datasets. It is allowing them to be created implicitly by calling "zfs create -p".

I would like to see lxd always (regardless of whether lxd is creating the pool) create these datasets explicitly (e.g. in lxd/main.go) and set "canmount=off" or "mountpoint=none" on both. Otherwise, regardless of which existing dataset I choose, I end up with "/something/containers" and "/something/images" mounted for no reason. Setting "canmount=off" or "mountpoint=none" are the normal approaches for ZFS datasets that only exist to contain other datasets; which one to use depends on whether the children need to inherit a mountpoint. Currently, the children of "containers" and "images" do not need to inherit a mountpoint, as lxd always sets a mountpoint. (This is because lxd wants the mountpoints to be "foo.zfs" rather than just "foo".) So either option would be fine.

I think it would be wise to also set "exec=on" and "setuid=on" on the "containers" dataset. If the children of the "containers" dataset are mounted with noexec ("exec=off"), the container fails to start, for obvious reasons. Likewise, "setuid=on" is necessary for proper operation of setuid executables inside the container. If lxd doesn't set these options itself, it is at the mercy of whatever options might be inherited. This was a problem for me; see below.

Since "devices=off" is not a problem, I don't see any other options to be concerned with. The other options are either irrelevant, or might be something the admin actually wants to control. It's just these two that break the container.

Setting the options as described above would fix the following scenario, plus many other scenarios:

# Simulate a root pool setup that matches the upstream ZFS root HOWTO (which I maintain):
truncate -s 10G /var/lib/rpool.img
zfs create rpool /var/lib/rpool.img
zfs set -o devices=off rpool
zfs create -o canmount=off -o setuid=off -o exec=off rpool/var

# Try to use lxd, following the same dataset convention:
zfs create -o canmount=off rpool/var/lib
zfs create rpool/var/lib/lxd
apt install lxd
lxd init
# Specify "no" to creating a pool, and "rpool/var/lib/lxd" as the existing dataset.
# Accept the rest of the defaults.
lxd launch ubuntu:

The "containers" and "images" datasets are created too late, after data has already been written to /var/lib/lxd/containers and /var/lib/lxd/images. Creating them earlier solves this, as would setting "canmount=off" or "mountpoint=none" regardless of when they are created. The latter has the advantage of also fixing the unnecessary mounting for other cases.

Revision history for this message
Stéphane Graber (stgraber) wrote :

We may consider doing some of that when reworking the way we handle storage in a few months.

For now, users will typically use "lxd init" which will create a LXD pool for them that's marked with mounts disabled from the start (and a clean config for everything else).

If you opt into using a subset of an existing pool, we assume you know what you're doing and that you will properly configure it prior to giving it to LXD (LXD won't create it for you so you kinda have to).

The fact that LXD doesn't put /var/lib/lxd itself in the ZFS pool is intentional. LXD will eventually support multiple pools and storage backends being used in parallel, which would directly conflict with this approach.

Changed in lxd (Ubuntu):
status: New → Triaged
importance: Undecided → Wishlist
Revision history for this message
Stéphane Graber (stgraber) wrote :

Marking as won't fix in the current state of LXD as it's not yet clear whether this is in scope or not for our zfs work.

Even when we support multiple storage pools, I'd still be opposed to overriding any configuration that the user made already, possibly with the exception of mountpoint which we certainly should set to none regardless.

If the user has a pool which they marked as noexec or nosuid, I would actually expect the containers to fail and require them to adjust their settings, rather than alter security configuration for them automatically (which may conflict with security certifications or company policies).

Changed in lxd (Ubuntu):
status: Triaged → Won't Fix
Revision history for this message
Richard Laager (rlaager) wrote :

> The fact that LXD doesn't put /var/lib/lxd itself in the ZFS pool is
> intentional.

I wasn't suggesting that in the general case. Creating a dataset at /var/lib/lxd has more to do with the root-on-ZFS setup, and is independent of where to place the lxd storage pool(s). I can see how that's distracting. Let's ignore that.

The mountpoint problem exists just the same in this scenario:
truncate -s 10G /var/lib/pool.img
zpool create pool /var/lib/pool.img
apt install lxd
lxd init
# Specify "no" to creating a pool, and "pool" as the existing dataset.
# Accept the rest of the defaults.
lxd launch ubuntu:

You'll end up with /pool/containers and /pool/images mounted.

> If you opt into using a subset of an existing pool, we assume you know
> what you're doing and that you will properly configure it prior to
> giving it to LXD (LXD won't create it for you so you kinda have to).

It's true that the admin has to create the pool first. But they do not have to create the datasets for lxd to work. lxd will implicitly create the containers and images datasets through the use of `zfs create -p POOL/containers/foo".

I'm suggesting this change (untested):
https://github.com/rlaager/lxd/commit/7c9f79c9451fad646c533fc6bb5ea3e67535eab8

Revision history for this message
Stéphane Graber (stgraber) wrote : Re: [Bug 1629824] Re: lxd should explicitly create "containers" and "images" datasets on ZFS

On Tue, Oct 04, 2016 at 02:22:45AM -0000, Richard Laager wrote:
> > The fact that LXD doesn't put /var/lib/lxd itself in the ZFS pool is
> > intentional.
>
> I wasn't suggesting that in the general case. Creating a dataset at
> /var/lib/lxd has more to do with the root-on-ZFS setup, and is
> independent of where to place the lxd storage pool(s). I can see how
> that's distracting. Let's ignore that.
>
> The mountpoint problem exists just the same in this scenario:
> truncate -s 10G /var/lib/pool.img
> zpool create pool /var/lib/pool.img
> apt install lxd
> lxd init
> # Specify "no" to creating a pool, and "pool" as the existing dataset.
> # Accept the rest of the defaults.
> lxd launch ubuntu:
>
> You'll end up with /pool/containers and /pool/images mounted.

Sure, but had you created the exact same thing through "lxd init" you'd
have been fine.

If the user manually creates a pool, it's their responsability to pass -m none.

>
> > If you opt into using a subset of an existing pool, we assume you know
> > what you're doing and that you will properly configure it prior to
> > giving it to LXD (LXD won't create it for you so you kinda have to).
>
> It's true that the admin has to create the pool first. But they do not
> have to create the datasets for lxd to work. lxd will implicitly create
> the containers and images datasets through the use of `zfs create -p
> POOL/containers/foo".
>
> I'm suggesting this change (untested):
> https://github.com/rlaager/lxd/commit/7c9f79c9451fad646c533fc6bb5ea3e67535eab8

I'd probably just have LXD force set mountpoint=none for whatever
dataset its root is at and let the rest inherit.

That wouldn't be quite enough, LXD creates a number of other paths when
archiving images, mounting snapshots, doing live migration, ...

--
Stéphane Graber
Ubuntu developer
http://www.ubuntu.com

Revision history for this message
Richard Laager (rlaager) wrote : Re: lxd should explicitly create "containers" and "images" datasets on ZFS

What in particular is problematic about the solution I have proposed?

The code needs testing, and it should be converted to a loop, and any other datasets need to be added to the list. But what is wrong in concept?

A sufficiently educated admin could create everything perfectly for lxd, but why should they have to when lxd can trivially do it for them?

> If the user manually creates a pool, it's their responsability to
> pass -m none.

It's unreasonable to expect the admin to make their pool use mountpoint=none. That's not a normal ZFS configuration. It's not the ZFS default. It's not how root pools work on any OS. The only place I've seen that approach is with pools lxd creates.

> I'd probably just have LXD force set mountpoint=none for whatever
> dataset its root is at and let the rest inherit.

It's more unreasonable to force their pool to mountpoint=none. If someone runs `lxd init`, gives it their existing pool, and lxd *breaks their system*, that's not okay.

If the practice of lxd was to create another dataset level, then it could force that level to mountpoint=none. That would be a reasonable solution. For example, if I pass in "pool", then lxd could create "pool/lxd" with mountpoint=none and then create "pool/lxd/containers/..." and "pool/lxd/images/..." under there.

However, while that is a fine choice, that's not how lxd is doing things now, and it would be inconsistent with dedicated lxd pools. The solution I'm proposing works in all three supported scenarios and does not break existing users.

> That wouldn't be quite enough, LXD creates a number of other paths when
> archiving images, mounting snapshots, doing live migration, ...

This proves my point. How does the admin have any idea which datasets lxd is going to create, such that they can pre-create all those datasets with the right settings? I've spent several hours testing lxd, and multiple back-and-forth conversations with you, and I'm still apparently unaware of all the datasets lxd can use.

Even if an admin is fully aware of everything lxd uses today, at any time the next version of lxd might create something new.

Revision history for this message
Stéphane Graber (stgraber) wrote :

LXD expects the pool or part of the pool that it's provided to be owned by it with no other data there. This is actually enforced in a few places in the LXD code.

So as an acceptable change to LXD, we could just have it enforce that the provided zfs_pool_name (which can be a pool or dataset) exists but is empty. Right now we only check that it's empty when you attempt to unset zfs_pool_name which is a bit weird.

And then we can have it always set mountpoint=none on that given that it's then guaranteed to be owned only by LXD.

Revision history for this message
Stéphane Graber (stgraber) wrote :
Revision history for this message
Richard Laager (rlaager) wrote :

That fixes the mounting problem. You've WONTFIXed the security options (and I understand your reasoning there).

One last question: Since this requires the pool to be empty at `lxd init` time, is that going to cause a problem for reinstalls? That is, if I have a working system, then I reinstall it (e.g. for an OS upgrade or repair), would I normally re-run `lxd init`? If not, I think this is done.

Thanks for bearing with me on this!

description: updated
summary: - lxd should explicitly create "containers" and "images" datasets on ZFS
+ lxd should avoid spurious mounting of ZFS datasets on existing pools
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.