Encryption broken for the nvmeof connector

Bug #1964379 reported by Gorka Eguileor
This bug affects 1 person
Affects: os-brick
Status: Fix Released
Importance: Medium
Assigned to: Gorka Eguileor

Bug Description

For encryption formats other than LUKS v1, the os-brick nvmeof connector is broken.

Attach and detach both work, but once the device has been detached we can no longer attach another nvme device on the same host. The node logs errors like the following (this example is from a compute node, but the same thing would happen on controllers):

Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server raise retry_exc.reraise()
Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python3.6/site-packages/tenacity/__init__.py", line 189, in reraise
Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server raise self.last_attempt.result()
Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 425, in result
Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server return self.__get_result()
Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server File "/usr/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server raise self._exception
Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server File "/usr/local/lib/python3.6/site-packages/tenacity/__init__.py", line 426, in __call__
Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server result = fn(*args, **kwargs)
Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server File "/opt/remote_brick/os_brick/initiator/connectors/nvmeof.py", line 186, in _get_device_path
Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server raise exception.VolumePathsNotFound()
Mar 09 17:03:12 localhost.localdomain nova-compute[1015947]: ERROR oslo_messaging.rpc.server os_brick.exception.VolumePathsNotFound: Could not find any paths for the volume.

The reason is that the connector returns a real path instead of a symlink, in the very same way as the iSCSI one did before (https://bugs.launchpad.net/os-brick/+bug/1703954).

The problem is that the os-brick encryptor needs to replace the device with a symlink. Because the connector returned the real device, the device path itself ends up being a symlink that nobody cleans up on detach, so the next attach cannot use that device name.

After the disconnect we can see this:

$ ls -l /dev/nvme*
crw-------. 1 root root 10, 61 Mar 1 16:57 /dev/nvme-fabrics
crw-------. 1 root root 244, 0 Mar 9 17:03 /dev/nvme0
lrwxrwxrwx. 1 root root 25 Mar 9 16:54 /dev/nvme0n1 -> /dev/mapper/crypt-nvme0n1
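
To check a host for this leftover state, a small script along these lines can help. It is only a sketch and not part of os-brick: the function name, the `nvme*n*` pattern, and the idea of scanning the directory are assumptions made for illustration. It lists entries that sit where an NVMe namespace device node is expected but are in fact symlinks, as /dev/nvme0n1 is in the listing above.

```python
import fnmatch
import os


def find_replaced_device_nodes(dev_dir="/dev", pattern="nvme*n*"):
    """Return (path, target) pairs for matching entries that are symlinks.

    A healthy NVMe namespace such as /dev/nvme0n1 is a block device node.
    If it shows up here, something (for example an encryptor) replaced it
    with a symlink and never restored it.
    """
    replaced = []
    for name in sorted(os.listdir(dev_dir)):
        path = os.path.join(dev_dir, name)
        # islink() does not follow the link, so dangling links are found too.
        if fnmatch.fnmatch(name, pattern) and os.path.islink(path):
            replaced.append((path, os.readlink(path)))
    return replaced


if __name__ == "__main__":
    for path, target in find_replaced_device_nodes():
        print(f"{path} -> {target}")
```

Run against the state shown above, this would be expected to report /dev/nvme0n1 and its /dev/mapper target, while /dev/nvme0 and /dev/nvme-fabrics do not match the pattern.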

Changed in os-brick:
importance: Undecided → Medium
Changed in os-brick:
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/os-brick/+/836061

Changed in os-brick:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/os-brick/+/836391

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (master)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/836391
Committed: https://opendev.org/openstack/os-brick/commit/1583a5038d34fd560b554e0eee5c1c5a7612722f
Submitter: "Zuul (22348)"
Branch: master

commit 1583a5038d34fd560b554e0eee5c1c5a7612722f
Author: Gorka Eguileor <email address hidden>
Date: Mon Apr 4 20:01:39 2022 +0200

    Fix encryption symlink issues

    This patch fixes 2 issues related to the symlinks, or lack thereof,
    that connectors' connect_volume methods return.

    Some connectors always return the block device instead of a symlink for
    encrypted volumes, and other connectors return a symlink that is owned
    by the system's udev rules. Both cases are problematic.

    Returning the real device can prevent the encryptor's connect_volume
    from completing successfully, and in other cases (such as nvmeof) it
    completes, but the connector's disconnect_volume will leave the device
    behind (e.g., /dev/nvme0n1), preventing new connections that would use
    that same device name.

    Returning a symlink owned by the system's udev rules means that they can
    be reclaimed by those rules at any time. This can happen with
    cryptsetup, because when it creates a new mapping it triggers udev rules
    for the device that can reclaim the symlink after os-brick has replaced
    it.

    This patch creates a couple of decorators to facilitate this for all
    connectors. These decorators transform the paths so that the callers
    get the expected symlink, but the connector doesn't need to worry about
    it and will always see the value it returned, regardless of what
    symlink the caller gets.

    From this moment onwards we use our own custom symlink that starts with
    "/dev/disk/by-id/os-brick".

    The patch fixes bugs in other connectors (such as the RBD local
    connection), but since there are no open bugs they have not been
    reported.

    Closes-Bug: #1964379
    Closes-Bug: #1967790
    Change-Id: Ie373ab050dcc0a35c749d9a53b6cf5ca060bcb58
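
The decorator approach described in the commit message can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the code that was merged: the decorator names, the `+` based link naming scheme, and the shape of the returned dict are hypothetical. Only the "/dev/disk/by-id/os-brick" prefix comes from the commit message.

```python
import functools
import os

# Prefix quoted in the commit message; the rest of the link name is made up.
LINK_PREFIX = "/dev/disk/by-id/os-brick"


def _link_name(real_path, prefix):
    # Hypothetical naming scheme: flatten the real path into one file name.
    return prefix + real_path.replace("/", "+")


def connect_returns_symlink(prefix=LINK_PREFIX):
    """Wrap a connect function so the caller receives a private symlink."""
    def decorator(connect):
        @functools.wraps(connect)
        def wrapper(*args, **kwargs):
            result = connect(*args, **kwargs)
            real = os.path.realpath(result["path"])
            link = _link_name(real, prefix)
            if os.path.lexists(link):
                os.unlink(link)  # drop a stale link from an earlier attach
            os.symlink(real, link)
            # The connector's own return value is left untouched.
            return dict(result, path=link)
        return wrapper
    return decorator


def disconnect_accepts_symlink(prefix=LINK_PREFIX):
    """Wrap a disconnect function so it always sees the real device path."""
    def decorator(disconnect):
        @functools.wraps(disconnect)
        def wrapper(path, *args, **kwargs):
            if path.startswith(prefix):
                # Recover the real path from the name rather than the link
                # target, since an encryptor may have repointed the link.
                real = path[len(prefix):].replace("+", "/")
                if os.path.lexists(path):
                    os.unlink(path)
                path = real
            return disconnect(path, *args, **kwargs)
        return wrapper
    return decorator
```

With wrappers like these, an encryptor that swaps the returned path for a link to its mapped device only ever touches the private symlink, so the real device node (for example /dev/nvme0n1) is never replaced. Using the default prefix requires root and an existing /dev/disk/by-id directory.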

Changed in os-brick:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/yoga)

Fix proposed to branch: stable/yoga
Review: https://review.opendev.org/c/openstack/os-brick/+/845845

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/yoga)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/845845
Committed: https://opendev.org/openstack/os-brick/commit/0bd5dc99152261d1d59ecb788faf4aea3e299edd
Submitter: "Zuul (22348)"
Branch: stable/yoga

commit 0bd5dc99152261d1d59ecb788faf4aea3e299edd
Author: Gorka Eguileor <email address hidden>
Date: Mon Apr 4 20:01:39 2022 +0200

    Fix encryption symlink issues

    Closes-Bug: #1964379
    Closes-Bug: #1967790
    Change-Id: Ie373ab050dcc0a35c749d9a53b6cf5ca060bcb58
    (cherry picked from commit 1583a5038d34fd560b554e0eee5c1c5a7612722f)

tags: added: in-stable-yoga
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/os-brick/+/846343

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/846343
Committed: https://opendev.org/openstack/os-brick/commit/b31b109f0f95c0102300bbda15b011f3623a2873
Submitter: "Zuul (22348)"
Branch: stable/xena

commit b31b109f0f95c0102300bbda15b011f3623a2873
Author: Gorka Eguileor <email address hidden>
Date: Mon Apr 4 20:01:39 2022 +0200

    Fix encryption symlink issues

    Closes-Bug: #1964379
    Closes-Bug: #1967790
    Change-Id: Ie373ab050dcc0a35c749d9a53b6cf5ca060bcb58
    (cherry picked from commit 1583a5038d34fd560b554e0eee5c1c5a7612722f)
    (cherry picked from commit 0bd5dc99152261d1d59ecb788faf4aea3e299edd)
    Conflicts:
            os_brick/initiator/connectors/lightos.py
            os_brick/utils.py

tags: added: in-stable-xena
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 5.2.1

This issue was fixed in the openstack/os-brick 5.2.1 release.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on os-brick (master)

Change abandoned by "Gorka Eguileor <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/os-brick/+/836061
Reason: Fixed in https://review.opendev.org/c/openstack/os-brick/+/836391

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 6.0.0

This issue was fixed in the openstack/os-brick 6.0.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 5.0.3

This issue was fixed in the openstack/os-brick 5.0.3 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/os-brick/+/856576

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/856576
Committed: https://opendev.org/openstack/os-brick/commit/cf69f9247061bd0a945a6fb6bf688acff617eb2c
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit cf69f9247061bd0a945a6fb6bf688acff617eb2c
Author: Gorka Eguileor <email address hidden>
Date: Mon Apr 4 20:01:39 2022 +0200

    Fix encryption symlink issues

    Closes-Bug: #1964379
    Closes-Bug: #1967790
    Change-Id: Ie373ab050dcc0a35c749d9a53b6cf5ca060bcb58
    (cherry picked from commit 1583a5038d34fd560b554e0eee5c1c5a7612722f)
    (cherry picked from commit 0bd5dc99152261d1d59ecb788faf4aea3e299edd)
    (cherry picked from commit b31b109f0f95c0102300bbda15b011f3623a2873)
    Conflicts:
            os_brick/initiator/connectors/iscsi.py

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 4.3.4

This issue was fixed in the openstack/os-brick 4.3.4 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/os-brick/+/866461

OpenStack Infra (hudson-openstack) wrote : Change abandoned on os-brick (stable/victoria)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/victoria
Review: https://review.opendev.org/c/openstack/os-brick/+/866461
Reason: stable/victoria branch of openstack/os-brick is about to be deleted. To be able to do that, all open patches need to be abandoned. Please cherry pick the patch to unmaintained/victoria if you want to further work on this patch.
