NVMe-oF connector returning the wrong nqn

Bug #1928944 reported by Gorka Eguileor
This bug affects 2 people
Affects    Status        Importance  Assigned to     Milestone
os-brick   Fix Released  Medium      Gorka Eguileor

Bug Description

The NVMe-oF connector does not return the host's correct NQN from the get_connector_properties method when '/etc/nvme/hostnqn' doesn't exist.

The issues around this are threefold:

- os-brick's automatic creation of the hostnqn file is not working
- when the hostnqn file doesn't exist, get_connector_properties returns a different NQN value on each call
- even if automatic creation of the hostnqn file worked, it could be problematic: because we use "nvme gen-hostnqn", it could replace the NQN already in use by other NVMe-oF connections

This is currently a problem for any Cinder driver that uses the connector's NQN for volume ACLs or for any other purpose.
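
The second point can be reproduced without nvme-cli. The following is a minimal sketch, not os-brick code: fake_gen_hostnqn is a hypothetical stand-in for "nvme gen-hostnqn" (which emits a random UUID-based NQN), and both getter functions and the temporary path are illustrative assumptions.

```python
import os
import tempfile
import uuid


def fake_gen_hostnqn():
    # Hypothetical stand-in for "nvme gen-hostnqn": a fresh random
    # UUID-based NQN on every call.
    return "nqn.2014-08.org.nvmexpress:uuid:%s" % uuid.uuid4()


def get_nqn_not_persisting(path):
    # Mirrors the reported behavior: the file is missing and the generated
    # value is never written, so every call generates a new NQN.
    if os.path.exists(path):
        with open(path) as f:
            return f.read().strip()
    return fake_gen_hostnqn()


def get_nqn_persisting(path):
    # Generates the NQN once, stores it, and always reads it back.
    if not os.path.exists(path):
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(fake_gen_hostnqn() + "\n")
    with open(path) as f:
        return f.read().strip()


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nvme", "hostnqn")
        unstable = (get_nqn_not_persisting(path), get_nqn_not_persisting(path))
        stable = (get_nqn_persisting(path), get_nqn_persisting(path))
    return unstable, stable
```

Calling demo() yields two different values for the non-persisting variant and the same value twice for the persisting one, which is why a driver that builds ACLs from the connector's NQN cannot rely on the former.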

Changed in os-brick:
importance: Undecided → Medium
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/os-brick/+/792237

Changed in os-brick:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (master)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/792237
Committed: https://opendev.org/openstack/os-brick/commit/37d57c4306acc59dfe8f025040b7701dea2d304c
Submitter: "Zuul (22348)"
Branch: master

commit 37d57c4306acc59dfe8f025040b7701dea2d304c
Author: Gorka Eguileor <email address hidden>
Date: Wed May 19 16:57:30 2021 +0200

    NVMe-oF: Return right nqn when missing hostnqn

    This patch fixes a problem where we don't return the right nqn value
    when the /etc/nvme/hostnqn file doesn't exist.

    Currently get_connector_properties returns a new nqn value every time it
    gets called if the hostnqn file is not present in the system.

    The reason for this is that we are calling "nvme gen-hostnqn" and piping
    it to tee, but for some reason tee is failing to write the file, so on
    each call we'll generate a new nqn.

    Even if we were successfully creating the hostnqn file, that could
    create problems if we have already attached some volumes, since those
    would have used the existing nqn.

    This patch replaces the series of command executions (mkdir, nvme,
    chmod) with a privsep method that then leverages the equivalent python
    methods when possible and also tries to use the show-hostnqn nvme
    subcommand first to get the existing nqn before defaulting to generating
    a new one with gen-hostnqn. This way we don't replace the current nqn
    value that may be already being used by other nvme connections.

    Subcommand show-hostnqn returns the host NQN configured for the system.
    First looks for /etc/nvme/hostnqn, if it's not present (our case when we
    call the new create_hostnqn method) tries to construct it from dmi
    (/sys/firmware/dmi/entries) or from systemd's application-specific
    machine IDs for the system [1].

    It's important to differentiate between the ENOENT returned as OSError
    and the same value returned by nvme [2] when show-hostnqn cannot return
    anything.

    [1]: https://github.com/linux-nvme/nvme-cli/blob/ed9538622ac6fc9e0094dd6804fc3b2ab46477b9/fabrics.c#L858-L875
    [2]: https://github.com/linux-nvme/nvme-cli/blob/5b8b065b1d036d8e9050adc49ffeb3b7adad1dbf/nvme.c#L5642-L5643

    Closes-Bug: #1928944
    Change-Id: I252dd958767dcdd4f9a2767b362aaf675edb79c4
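
The lookup order described in this commit message can be sketched roughly as follows. This is not the os-brick implementation: it uses subprocess instead of privsep, and run_nvme, the run parameter, the permission bits, and returning an empty string when nvme is missing are all assumptions made for illustration. Only the "show-hostnqn" and "gen-hostnqn" subcommands and the ENOENT distinction come from the text above.

```python
import errno
import os
import subprocess
import tempfile


def run_nvme(subcommand):
    # Raises FileNotFoundError (an OSError with errno ENOENT) when the nvme
    # binary is missing, and CalledProcessError on a non-zero exit status.
    result = subprocess.run(["nvme", subcommand], check=True,
                            capture_output=True, text=True)
    return result.stdout


def create_hostnqn(path="/etc/nvme/hostnqn", run=run_nvme):
    """Return the host NQN and store it in path; '' if nvme is missing."""
    try:
        try:
            # Prefer the NQN the system already has.
            nqn = run("show-hostnqn")
        except subprocess.CalledProcessError as exc:
            # nvme ran but exited with ENOENT: it has nothing to show,
            # so only now generate a new NQN.
            if exc.returncode != errno.ENOENT:
                raise
            nqn = run("gen-hostnqn")
    except OSError as exc:
        # ENOENT as an OSError means nvme itself could not be executed.
        if exc.errno != errno.ENOENT:
            raise
        return ""

    nqn = nqn.strip()
    # Assumed permission bits, chosen only for this sketch.
    os.makedirs(os.path.dirname(path), mode=0o755, exist_ok=True)
    with open(path, "w") as f:
        f.write(nqn + "\n")
    os.chmod(path, 0o644)
    return nqn


def demo():
    # Fake runner: show-hostnqn exits with ENOENT, gen-hostnqn succeeds.
    def fake_run(subcommand):
        if subcommand == "show-hostnqn":
            raise subprocess.CalledProcessError(errno.ENOENT,
                                                ["nvme", subcommand])
        return "nqn.2014-08.org.nvmexpress:uuid:generated\n"

    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nvme", "hostnqn")
        nqn = create_hostnqn(path, run=fake_run)
        with open(path) as f:
            return nqn, f.read()
```

The two ENOENT cases are kept apart by exception type: a non-zero exit status from nvme surfaces as CalledProcessError, while a missing binary surfaces as an OSError, so the fallback to gen-hostnqn only happens when nvme actually ran.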

Changed in os-brick:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 5.0.0

This issue was fixed in the openstack/os-brick 5.0.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/os-brick/+/834604

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/834604
Committed: https://opendev.org/openstack/os-brick/commit/a5f53fd3fcbc35952f0e890beef7fbb8c18a45c5
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit a5f53fd3fcbc35952f0e890beef7fbb8c18a45c5
Author: Gorka Eguileor <email address hidden>
Date: Wed May 19 16:57:30 2021 +0200

    NVMe-oF: Return right nqn when missing hostnqn

    Closes-Bug: #1928944
    Change-Id: I252dd958767dcdd4f9a2767b362aaf675edb79c4
    (cherry picked from commit 37d57c4306acc59dfe8f025040b7701dea2d304c)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 4.3.4

This issue was fixed in the openstack/os-brick 4.3.4 release.
