Migration and evacuation fails with encrypted volumes

Bug #1895848 reported by Mark Goddard
This bug affects 3 people
Affects: OpenStack Compute (nova)
Status: Confirmed
Importance: Medium
Assigned to: Lee Yarwood
Milestone: (none)

Bug Description

# Description

Migration and evacuation fail with encrypted volumes when the user performing the operation is in a different project from the instance creator, even if that user is an admin. This is a common use case, since operators typically need to move instances between hosts. It also occurs with Masakari during failover events.

# Steps to reproduce

As user 1 in project X:

* Enable volume encryption via barbican (https://docs.openstack.org/cinder/latest/configuration/block-storage/volume-encryption.html)
* Create an instance with an encrypted volume

As admin user in admin project:

* Migrate or evacuate instance created by user 1

# Expected results

Instance is migrated successfully.

# Actual results

Instance fails to migrate.

# Environment

CentOS 8
Kolla CentOS source containers
Train release

# Logs

We see the following in barbican API logs:

Secret retrieval attempt not allowed - please review your user/project privileges: oslo_policy.policy.PolicyNotAuthorized: secret:get is disallowed by policy

This is because barbican secrets, in this case the volume encryption key, are scoped to a project.
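The scoping problem can be illustrated with a toy model. This is only a sketch, not Barbican's policy engine: the `Caller`, `Secret`, and `may_read` names are hypothetical, and the check deliberately ignores ACLs, non-private reads, and the finer per-role rules. It only shows that a role held in another project does not help.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    user_id: str
    project_id: str
    roles: frozenset


@dataclass(frozen=True)
class Secret:
    creator_id: str
    project_id: str


def may_read(caller: Caller, secret: Secret) -> bool:
    # Simplified assumption: a role only counts inside the secret's own project.
    return caller.project_id == secret.project_id and bool(caller.roles)


# The volume encryption key was created by user 1 in project X.
key = Secret(creator_id="user-1", project_id="project-x")
owner = Caller("user-1", "project-x", frozenset({"creator"}))
operator = Caller("admin-user", "admin-project", frozenset({"admin"}))

print(may_read(owner, key))     # True
print(may_read(operator, key))  # False: admin, but in a different project
```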

# Workaround

I added the following policy.json:

{
    "secret:get": "rule:secret_non_private_read or rule:secret_project_creator or rule:secret_project_admin or rule:secret_acl_read or role:key-manager:migrator",
    "secret:decrypt": "rule:secret_decrypt_non_private_read or rule:secret_project_creator or rule:secret_project_admin or rule:secret_acl_read or role:key-manager:migrator"
}

I then assigned the migrating user the key-manager:migrator role in their project. This allows migration and evacuation to succeed.
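For anyone who wants to generate the override rather than edit it by hand, here is a minimal sketch. The base rule names are copied from the policy.json above; the `build_policy` helper is hypothetical, and the role name is simply the one chosen for this workaround. The role still has to be created and assigned in Keystone separately.

```python
import json

# Base checks copied from the policy.json in this workaround.
BASE_RULES = {
    "secret:get": [
        "rule:secret_non_private_read",
        "rule:secret_project_creator",
        "rule:secret_project_admin",
        "rule:secret_acl_read",
    ],
    "secret:decrypt": [
        "rule:secret_decrypt_non_private_read",
        "rule:secret_project_creator",
        "rule:secret_project_admin",
        "rule:secret_acl_read",
    ],
}

MIGRATOR_ROLE = "key-manager:migrator"


def build_policy(base_rules, role):
    """Return policy overrides with `role:<role>` appended to each rule."""
    return {
        name: " or ".join(checks + [f"role:{role}"])
        for name, checks in base_rules.items()
    }


policy = build_policy(BASE_RULES, MIGRATOR_ROLE)
print(json.dumps(policy, indent=4))
```

Because the role check is not tied to a project match, anyone holding the role can read and decrypt secrets in any project, so it should be granted sparingly.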

tags: added: barbican encryption volumes
Lee Yarwood (lyarwood) wrote :

Just to be clear, you're referring to live migration and not cold migration here, right? AFAIK cold migration by an admin of an instance with encrypted volumes attached should work.

I don't think we can support any live move operations by admins of instances with encrypted volumes attached, given the default policy within Barbican. As you've pointed out, this blocks access to the secrets by anyone but the owner by default.

As for evacuation, this came up recently downstream and I think we should introduce support for cold evacuations of instances from down compute nodes in W. This would allow admins to move instances with attached encrypted volumes while leaving it up to the end users to restart them later.

Changed in nova:
assignee: nobody → Lee Yarwood (lyarwood)
status: New → Confirmed
importance: Undecided → Medium
Mark Goddard (mgoddard) wrote :

Hi Lee. This does happen with cold migration. I believe also with live migration, although now we have the policy in place it is difficult to confirm that.

I was wondering if there is some ACL that could be applied by Nova to the secret, although it would need to know the user in advance.

Lee Yarwood (lyarwood) wrote :

Apologies, I wasn't aware that it was possible to update the ACLs for an existing secret through the Barbican API:

https://docs.openstack.org/barbican/latest/api/reference/acls.html#patch-v1-containers-uuid-acl

I've added an item to the PTG agenda to discuss using this to enable admin only move operations:

https://etherpad.opendev.org/p/nova-wallaby-ptg

Mark Goddard (mgoddard) wrote :

Well one can't know everything :) Feel free to ping me for that topic (I noted that in the pad).

Lee Yarwood (lyarwood) wrote :

Dumping notes from the PTG etherpad here for context:

https://etherpad.opendev.org/p/nova-wallaby-ptg

(lyarwood) Enabling admin only move operations for instances with associated barbican secrets

- https://bugs.launchpad.net/nova/+bug/1895848

- https://docs.openstack.org/barbican/latest/api/reference/acls.html#patch-v1-containers-uuid-acl

- mgoddard: Feel free to ping me for this one, since I raised the bug.

- Q: Should we try to work around this in code, or just document the suggested workaround from the
  bug (using a migrator role that can read secrets), as Cinder does for other issues during the
  initial creation of an encrypted volume by a user?
  https://docs.openstack.org/cinder/latest/configuration/block-storage/volume-encryption.html#key-management-access-control

- there are things that try to do things with an admin context without a user token
  + resize auto confirm periodic task <- if the guest is running in resize verify, this should not
    fail, right?
  + rebooting instances at compute startup due to the resume_guests_state_on_host_boot config

AGREED:

- add a new user to nova conf for barbican

- when nova creates the secret in barbican with the user's token, nova then needs to add an ACL so
  that nova's barbican user can read the secret later

- alternative: service user token used in a similar way alongside the user admin token

- lyarwood to write up a spec for this in W
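
As a rough illustration of the ACL idea agreed above, the sketch below builds (but does not send) a request that would grant a dedicated service user read access to a secret. It assumes the secret ACL endpoint and body shape described in the Barbican ACL API reference; the endpoint URL, token, and IDs are placeholders, and `build_acl_request` is a hypothetical helper, not Nova code. How an existing user list would be merged is not covered here.

```python
import json
import urllib.request


def build_acl_request(barbican_url, secret_id, user_token, reader_user_id):
    """Build a PATCH request adding a user to a secret's read ACL (not sent)."""
    body = {"read": {"users": [reader_user_id]}}
    return urllib.request.Request(
        url=f"{barbican_url}/v1/secrets/{secret_id}/acl",
        data=json.dumps(body).encode("utf-8"),
        method="PATCH",
        headers={
            "Content-Type": "application/json",
            # The request is made with the secret owner's token.
            "X-Auth-Token": user_token,
        },
    )


# All values below are placeholders for illustration only.
req = build_acl_request(
    "http://barbican.example:9311",
    "SECRET_UUID",
    "USER_TOKEN",
    "NOVA_BARBICAN_USER_ID",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
```

The request has to be issued while the owner's token is still available, which is why the notes tie it to the moment nova creates the secret.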

Mark Goddard (mgoddard) wrote :

I just raised related bug 1917498.

Tobias Gurtzick (wzrdtales) wrote :

Is this just about the instance, or also about the volume? I did not manage to get volume migration to work at all, even with policies applied. See https://bugs.launchpad.net/cinder/+bug/1929128

Mark Goddard (mgoddard) wrote :

This is about instance migration.

Tobias Gurtzick (wzrdtales) wrote :

OK, thanks. I figured out that much: moving encrypted volumes attached to running VMs doesn't seem to be supported at all right now...
