Re-attaching an encrypted(Barbican) Cinder volume to an instance fails

Bug #1764125 reported by Tzach Shefi
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| Cinder | Fix Released | High | Eric Harney | |

Bug Description

Description of problem:
An encrypted (Barbican) RBD Cinder volume was attached to an instance and data was written to it.
The volume was then detached. Trying to reattach the volume to the same instance fails, with odd errors in nova-compute.log:

2018-04-15 13:14:06.274 1 ERROR nova.compute.manager [instance: 923c5318-8502-4f85-a215-78afc4fd641b] uuid=managed_object_id)
2018-04-15 13:14:06.274 1 ERROR nova.compute.manager [instance: 923c5318-8502-4f85-a215-78afc4fd641b] ManagedObjectNotFoundError: Key not found, uuid: 7912eac8-2652-4c92-b53f-3db4ecca7bc7

2018-04-15 13:14:06.523 1 ERROR oslo_messaging.rpc.server File "/usr/lib/python2.7/site-packages/cinderclient/client.py", line 177, in request
2018-04-15 13:14:06.523 1 ERROR oslo_messaging.rpc.server raise exceptions.from_response(resp, body)
2018-04-15 13:14:06.523 1 ERROR oslo_messaging.rpc.server VolumeAttachmentNotFound: Volume attachment c17e2b89-5a36-4e7e-8c71-b975f2f5ccb3 could not be found.
2018-04-15 13:14:06.523 1 ERROR oslo_messaging.rpc.server

How reproducible:
Unsure, but it looks like it happens every time I try to re-attach.

Steps to Reproduce:
1. Boot an instance
2. Create an encrypted (Barbican) RBD-backed Cinder volume, attach it to the instance, and write data to it.
3. Detach the volume from the instance.
4. Try to reattach the same volume to the same instance.

$nova volume-attach 923c5318-8502-4f85-a215-78afc4fd641b 16584072-ef78-4a80-91ab-cbd47e9bc70d auto

5. The volume fails to attach.
No error is reported, but the volume remains unattached:
$ cinder list
+--------------------------------------+-----------+-------------+------+----------------------------+----------+--------------------------------------+
| ID | Status | Name | Size | Volume Type | Bootable | Attached to |
+--------------------------------------+-----------+-------------+------+----------------------------+----------+--------------------------------------+
| 16584072-ef78-4a80-91ab-cbd47e9bc70d | available | 2-Encrypted | 1 | LuksEncryptor-Template-256 | false | |
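
The attach / detach / re-attach cycle from the steps above can be sketched as a small Python helper that only builds the command lines. It uses the `nova volume-attach`, `nova volume-detach` and `cinder list` invocations quoted in this report; the function name `reattach_commands` is an assumption made for illustration, and nothing is executed against a cloud.

```python
import shlex


def reattach_commands(server_id, volume_id, device="auto"):
    """Return argv lists for attach, detach, re-attach, then a status check.

    Hypothetical helper: it only assembles the CLI calls quoted in this
    report and does not run them.
    """
    attach = ["nova", "volume-attach", server_id, volume_id, device]
    detach = ["nova", "volume-detach", server_id, volume_id]
    # The second attach is the step that failed in this bug.
    return [attach, detach, list(attach), ["cinder", "list"]]


if __name__ == "__main__":
    server = "923c5318-8502-4f85-a215-78afc4fd641b"
    volume = "16584072-ef78-4a80-91ab-cbd47e9bc70d"
    for argv in reattach_commands(server, volume):
        # Quote each argument so the line can be pasted into a shell.
        print(" ".join(shlex.quote(arg) for arg in argv))
```

Writing data to the volume between the first attach and the detach (step 2) is left out, since that happens inside the guest.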

Actual results:
Volume fails to attach.

Expected results:
Volume should successfully reattach.

Environment / Version-Release number of selected component (if applicable):
rhel7.5
openstack-nova-conductor-17.0.2-0.20180323024604.0390d5f.el7ost.noarch
python-nova-17.0.2-0.20180323024604.0390d5f.el7ost.noarch
python-novaclient-9.1.1-1.el7ost.noarch
openstack-cinder-12.0.1-0.20180326201852.46c4ec1.el7ost.noarch
openstack-nova-scheduler-17.0.2-0.20180323024604.0390d5f.el7ost.noarch
openstack-nova-console-17.0.2-0.20180323024604.0390d5f.el7ost.noarch
puppet-cinder-12.3.1-0.20180222074326.18152ac.el7ost.noarch
openstack-nova-compute-17.0.2-0.20180323024604.0390d5f.el7ost.noarch
python2-cinderclient-3.5.0-1.el7ost.noarch
python-cinder-12.0.1-0.20180326201852.46c4ec1.el7ost.noarch
openstack-nova-api-17.0.2-0.20180323024604.0390d5f.el7ost.noarch
openstack-nova-novncproxy-17.0.2-0.20180323024604.0390d5f.el7ost.noarch
puppet-nova-12.3.1-0.20180319062741.9db79a6.el7ost.noarch
openstack-nova-common-17.0.2-0.20180323024604.0390d5f.el7ost.noarch
openstack-nova-migration-17.0.2-0.20180323024604.0390d5f.el7ost.noarch
openstack-nova-placement-api-17.0.2-0.20180323024604.0390d5f.el7ost.noarch

Libvirt + KVM
Neutron networking
Cinder volume is RBD backed and encrypted via Barbican.

Revision history for this message
Tzach Shefi (tshefi) wrote :

The following doesn't show up in the logs, as I tested again after I uploaded them.

A new encrypted RBD volume was created:
#cinder create 1 --volume-type LuksEncryptor-Template-256 --name NewEncVol

It successfully attached:

| ac8045f2-a8d6-4567-8476-40f7e0f63dcf | in-use | NewEncVol | 1 | LuksEncryptor-Template-256 | false | 923c5318-8502-4f85-a215-78afc4fd641b |

Volume detach works
#nova volume-detach 923c5318-8502-4f85-a215-78afc4fd641b ac8045f2-a8d6-4567-8476-40f7e0f63dcf

Cinder list ->
| ac8045f2-a8d6-4567-8476-40f7e0f63dcf | available | NewEncVol | 1 | LuksEncryptor-Template-256 | false | |

Re-attaching this new volume now worked:
#nova volume-attach 923c5318-8502-4f85-a215-78afc4fd641b ac8045f2-a8d6-4567-8476-40f7e0f63dcf auto

| ac8045f2-a8d6-4567-8476-40f7e0f63dcf | in-use | NewEncVol | 1 | LuksEncryptor-Template-256 | false | 923c5318-8502-4f85-a215-78afc4fd641b |

Now I'm not sure whether this bug is a one-off fluke, or whether only the original volume had an issue.

tags: added: cinder volumes
Lee Yarwood (lyarwood) wrote :

FWIW we did see something like this downstream [1] where Barbican actually lost track of the secret associated with the volume. We've been unable to reproduce the issue since then but there are some useful comments in there from Ade Lee around what to look for on the Barbican side.

Can you confirm that there are no attempts to delete the 7912eac8-2652-4c92-b53f-3db4ecca7bc7 secret from Cinder or Nova before we move this to Barbican?

[1] https://bugzilla.redhat.com/show_bug.cgi?id=1567614

Lee Yarwood (lyarwood) wrote :

Thanks to Tzach I was able to get access to an env downstream and confirm what's going on here.

c-vol appears to be creating a fresh secret for the new volume that isn't capable of unlocking the volume. IMHO c-vol should just copy the associated secret UUID during the creation process from an image with one already associated to it.

Additionally, the create flow here is really weird: I can see that we download the image twice and try to import it into rbd twice. The first import appears to be a fresh LUKS encrypted image, the second a raw to raw conversion that does nothing to the original LUKS encryption of the image.

Anyway, I'm removing nova from this bug and adding cinder. More detailed notes can be found below.

[ notes ]

I can see multiple keys used by n-cpu :

2018-05-17 11:47:47.382 1 DEBUG barbicanclient.v1.secrets [req-6c45d622-ecf1-4cbb-a038-b8eaaf776818 ea26e0f59cf44f909a0dbe86f1f21078 3d16a4daf99042d5adbc4f0d55dbf322 - default default] Getting secret - Secret href: http://172.17.1.12:9311/v1/secrets/a3c400ce-8b94-4ee5-90e9-564bab6c823b get /usr/lib/python2.7/site-packages/barbicanclient/v1/secrets.py:457

2018-05-17 11:52:26.413 1 DEBUG barbicanclient.v1.secrets [req-dfe882de-0b11-4a70-b527-78b47a7faf2e ea26e0f59cf44f909a0dbe86f1f21078 3d16a4daf99042d5adbc4f0d55dbf322 - default default] Getting secret - Secret href: http://172.17.1.12:9311/v1/secrets/3b88eedc-813e-4e01-bec7-d8d2b7d2ef42 get /usr/lib/python2.7/site-packages/barbicanclient/v1/secrets.py:457
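
As a rough aid for spotting this in a longer n-cpu log, the distinct secret UUIDs can be pulled out of DEBUG lines like the two above. This is a minimal sketch that assumes the "Secret href: .../v1/secrets/<uuid>" format shown here; the function name `secret_ids` is made up for illustration.

```python
import re

# Matches the secret UUID at the end of a "Secret href:" URL as it appears
# in the barbicanclient DEBUG lines quoted in this report.
SECRET_HREF = re.compile(
    r"Secret href: \S+/v1/secrets/"
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
)


def secret_ids(lines):
    """Return the distinct secret UUIDs found in lines, in first-seen order."""
    seen = []
    for line in lines:
        match = SECRET_HREF.search(line)
        if match and match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


if __name__ == "__main__":
    import sys

    # Usage: python secret_ids.py < nova-compute.log
    for secret_id in secret_ids(sys.stdin):
        print(secret_id)
```

More than one UUID for what should be the same volume is the hint that different keys are in play.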

Fetching these we can see that they are not the same :

$ curl -vv -H "X-Auth-Token: $TOKEN" -H 'Accept: application/octet-stream' -o a3c400ce-8b94-4ee5-90e9-564bab6c823b http://10.0.0.106:9311/v1/secrets/a3c400ce-8b94-4ee5-90e9-564bab6c823b
* About to connect() to 10.0.0.106 port 9311 (#0)
* Trying 10.0.0.106...
  % Total % Received % Xferd Average Speed Time Time Time Current
                                 Dload Upload Total Spent Left Speed
  0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0* Connected to 10.0.0.106 (10.0.0.106) port 9311 (#0)
> GET /v1/secrets/a3c400ce-8b94-4ee5-90e9-564bab6c823b HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 10.0.0.106:9311
> X-Auth-Token: gAAAAABa_W8gLDSMIfr7hzDC385Qpjewpy2awYIrqyO0O8U5VceB4YX_xyDlH7zBBPyR68L5krAEvCzkJq-b335TbGGeqQ_EDFNa9pclZo7Qm3m0_E8ofv0W9Ny8XWwhKERNK-3BxuUUMf1N7CgexHnkIgFye23EpzZF8lcxAKWmNCIiY_p2h9g
> Accept: application/octet-stream
>
< HTTP/1.1 200 OK
< Date: Thu, 17 May 2018 12:12:16 GMT
< Server: Apache
< x-openstack-request-id: req-e32e0e58-8234-4fd3-90d8-50f9f72d617c
< Content-Length: 32
< Content-Type: application/octet-stream
<
{ [data not shown]
100 32 100 32 0 0 115 0 --:--:-- --:--:-- --:--:-- 115
* Connection #0 to host 10.0.0.106 left intact

$ curl -vv -H "X-Auth-Token: $TOKEN" -H 'Accept: application/octet-stream' -o 3b88eedc-813e-4e01-bec7-d8d2b7d2ef42 http://10.0.0.106:9311/v1/secrets/3b88eedc-813e-4e01-bec7-d8d2b7d2ef42
* About to connect() to 10.0.0.106 port 9311 (#0)
* Trying 10.0.0.106...
  % Total % Received % Xferd Average Speed Time Time Time Curren...
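
Once both payloads are on disk (the curl `-o` options above name each file after its secret UUID), the comparison itself can be sketched in a few lines of Python. This only illustrates the "they are not the same" check; the helper names `files_match` and `fingerprint` are assumptions and not part of any OpenStack tool.

```python
import hashlib
import hmac
from pathlib import Path


def files_match(path_a, path_b):
    """Return True when the two files hold identical bytes."""
    data_a = Path(path_a).read_bytes()
    data_b = Path(path_b).read_bytes()
    # compare_digest avoids leaking where the first difference is via timing.
    return hmac.compare_digest(data_a, data_b)


def fingerprint(path):
    """Return a SHA-256 hex digest, so key material need not be printed."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


if __name__ == "__main__":
    first = "a3c400ce-8b94-4ee5-90e9-564bab6c823b"
    second = "3b88eedc-813e-4e01-bec7-d8d2b7d2ef42"
    print(first, fingerprint(first))
    print(second, fingerprint(second))
    print("identical" if files_match(first, second) else "different")
```

Printing digests rather than the raw 32-byte payloads keeps the key material out of terminals and bug comments.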


affects: nova → cinder
Eric Harney (eharney)
Changed in cinder:
importance: Undecided → High
assignee: nobody → Eric Harney (eharney)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (master)

Fix proposed to branch: master
Review: https://review.openstack.org/571010

Changed in cinder:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (master)

Reviewed: https://review.openstack.org/571010
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=2792be3ffced9b7faeb69fa100b3c50b76d00dc3
Submitter: Zuul
Branch: master

commit 2792be3ffced9b7faeb69fa100b3c50b76d00dc3
Author: Eric Harney <email address hidden>
Date: Tue May 29 12:12:13 2018 -0400

    Fix handling of 'cinder_encryption_key_id' image metadata

    The Cinder code that processes Glance image metadata
    is a bit confused about whether this particular field
    is a Glance property or metadata.

    Since it isn't a defined a Glance property and is stored
    in image metadata, ensure that Cinder also tracks it
    metadata and not as a property.

    This mismatch prior to this fix causes Cinder to create
    volumes with the wrong encryption key when creating a
    volume from an encrypted image, which results in an
    unreadable volume.

    Closes-Bug: #1764125
    Change-Id: Ie5af3703eaa82d23b50127f611235d86e4104369
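
The kind of mismatch the commit message describes can be shown with a toy model. This is not Cinder's code: the dictionary layout with separate "metadata" and "properties" sub-dicts, and every function name below, are assumptions made purely for illustration. The point is only that a reader looking in a different place from where the writer stored the key finds nothing and falls back to a fresh key, which cannot unlock the existing LUKS data.

```python
import uuid

KEY_FIELD = "cinder_encryption_key_id"


def read_from_properties(image):
    """Hypothetical mismatched reader: looks where the key was not stored."""
    return image.get("properties", {}).get(KEY_FIELD)


def read_from_metadata(image):
    """Hypothetical matching reader: looks where the key was stored."""
    return image.get("metadata", {}).get(KEY_FIELD)


def choose_key_id(image, reader, new_key=lambda: str(uuid.uuid4())):
    """Reuse the image's key id when the reader finds one, else make a new one."""
    existing = reader(image)
    return existing if existing is not None else new_key()


if __name__ == "__main__":
    # Toy image record: the key id lives under "metadata" only.
    image = {
        "metadata": {KEY_FIELD: "a3c400ce-8b94-4ee5-90e9-564bab6c823b"},
        "properties": {},
    }
    print("mismatched reader:", choose_key_id(image, read_from_properties))
    print("matching reader:  ", choose_key_id(image, read_from_metadata))
```

With the mismatched reader the volume ends up tied to a newly generated key id; with the matching reader it keeps the key id carried by the image.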

Changed in cinder:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/571345

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/queens)

Reviewed: https://review.openstack.org/571345
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=55edbdd8ca1455ff46fd3343d6ca2f34679efb3e
Submitter: Zuul
Branch: stable/queens

commit 55edbdd8ca1455ff46fd3343d6ca2f34679efb3e
Author: Eric Harney <email address hidden>
Date: Tue May 29 12:12:13 2018 -0400

    Fix handling of 'cinder_encryption_key_id' image metadata

    The Cinder code that processes Glance image metadata
    is a bit confused about whether this particular field
    is a Glance property or metadata.

    Since it isn't a defined a Glance property and is stored
    in image metadata, ensure that Cinder also tracks it
    metadata and not as a property.

    This mismatch prior to this fix causes Cinder to create
    volumes with the wrong encryption key when creating a
    volume from an encrypted image, which results in an
    unreadable volume.

    Closes-Bug: #1764125
    Change-Id: Ie5af3703eaa82d23b50127f611235d86e4104369
    (cherry picked from commit 2792be3ffced9b7faeb69fa100b3c50b76d00dc3)

tags: added: in-stable-queens
Eric Harney (eharney)
summary: - Re-attaching an encrypted(Barbican) Cinder (RBD) volume to an instance
- fails
+ Re-attaching an encrypted(Barbican) Cinder volume to an instance fails
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 13.0.0.0b2

This issue was fixed in the openstack/cinder 13.0.0.0b2 development milestone.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/cinder 12.0.3

This issue was fixed in the openstack/cinder 12.0.3 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to cinder (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.opendev.org/688723

OpenStack Infra (hudson-openstack) wrote : Fix merged to cinder (stable/pike)

Reviewed: https://review.opendev.org/688723
Committed: https://git.openstack.org/cgit/openstack/cinder/commit/?id=26ae52cf0b843951d0eea3c3b8a8e8c0b1bbd0ba
Submitter: Zuul
Branch: stable/pike

commit 26ae52cf0b843951d0eea3c3b8a8e8c0b1bbd0ba
Author: Eric Harney <email address hidden>
Date: Tue May 29 12:12:13 2018 -0400

    Fix handling of 'cinder_encryption_key_id' image metadata

    The Cinder code that processes Glance image metadata
    is a bit confused about whether this particular field
    is a Glance property or metadata.

    Since it isn't a defined a Glance property and is stored
    in image metadata, ensure that Cinder also tracks it
    metadata and not as a property.

    This mismatch prior to this fix causes Cinder to create
    volumes with the wrong encryption key when creating a
    volume from an encrypted image, which results in an
    unreadable volume.

    Conflicts:
        cinder/image/glance.py
    Note(elod.illes): conflict is due to not having patch
    Ice379db9ae83420bacf9e96e242c7515930eae86 in stable pike.

    Closes-Bug: #1764125
    Change-Id: Ie5af3703eaa82d23b50127f611235d86e4104369
    (cherry picked from commit 2792be3ffced9b7faeb69fa100b3c50b76d00dc3)
    (cherry picked from commit 55edbdd8ca1455ff46fd3343d6ca2f34679efb3e)

tags: added: in-stable-pike