NVMe-oF cannot connect

Bug #2035695 reported by Gorka Eguileor
This bug affects 1 person
Affects: os-brick
Status: Fix Released
Importance: Undecided
Assigned to: Gorka Eguileor

Bug Description

When an nvme subsystem has all of its portals in 'connecting' state and we try to attach a new volume to that same subsystem, the attach will fail.

One way to reproduce this issue is to configure LVM+nvmet with:
  use_multipath_for_image_xfer = true
  iscsi_secondary_ip_addresses = 127.0.0.1
  target_secondary_ip_addresses = 127.0.0.1
  lvm_share_target = true
  nvmeof_conn_info_version = 2

Then attach 2 volumes to an instance and delete the instance (this leaves the subsystem in 'connecting' state).

Next, create another instance and try to attach a volume; the attach will fail.
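
One way to confirm that precondition is to look at the controller states the Linux nvme driver exposes under sysfs; after deleting the first instance, every controller of the shared subsystem should be stuck in 'connecting'. A minimal sketch, assuming only the standard /sys/class/nvme layout (this helper is not part of os-brick):

  # Hypothetical helper, not part of os-brick: list NVMe controller states
  # from sysfs to confirm every portal of a subsystem is stuck 'connecting'.
  import glob
  import os

  def nvme_controller_states(subsysnqn=None):
      """Return {controller_name: state}, optionally filtered by subsystem NQN."""
      states = {}
      for ctrl in glob.glob('/sys/class/nvme/nvme*'):
          try:
              with open(os.path.join(ctrl, 'subsysnqn')) as f:
                  nqn = f.read().strip()
              with open(os.path.join(ctrl, 'state')) as f:
                  state = f.read().strip()
          except OSError:
              continue  # controller went away while we were reading
          if subsysnqn is None or nqn == subsysnqn:
              states[os.path.basename(ctrl)] = state
      return states

  if __name__ == '__main__':
      for name, state in sorted(nvme_controller_states().items()):
          print(f'{name}: {state}')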

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/os-brick/+/895193

Changed in os-brick:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (master)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/895193
Committed: https://opendev.org/openstack/os-brick/commit/ec22c32de6820184d7737c5af70e573c0634cd38
Submitter: "Zuul (22348)"
Branch: master

commit ec22c32de6820184d7737c5af70e573c0634cd38
Author: Gorka Eguileor <email address hidden>
Date: Thu Sep 14 12:19:26 2023 +0200

    NVMe-oF: Fix attach when reconnecting

    When an nvme subsystem has all its portals in 'connecting' state and we
    try to attach a new volume to that same subsystem, the attach will fail.

    We can reproduce it with LVM+nvmet if we configure it to share targets
    and then:
    - Create instance
    - Attach 2 volumes
    - Delete instance (this leaves the subsystem in connecting state [1])
    - Create instance
    - Attach volume <== FAILS

    The problem comes from the '_connect_target' method, which ignores
    subsystems in 'connecting' state, so if they are all in that state it
    treats this as equivalent to all portals being inaccessible.

    This patch changes that behavior: if we cannot connect to a target but
    we have portals in 'connecting' state, we wait for the next retry of
    the nvme Linux driver. Specifically, we wait 10 seconds more than the
    interval between retries.

    [1]: https://bugs.launchpad.net/nova/+bug/2035375

    Closes-Bug: #2035695
    Change-Id: Ife710f52c339d67f2dcb160c20ad0d75480a1f48
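
The behavior the commit describes can be sketched roughly as follows; the names, structure, and interval constant below are illustrative assumptions, not the actual os-brick '_connect_target' code:

  # Rough sketch of the retry behavior described above; not the real os-brick code.
  import time

  NVME_RECONNECT_INTERVAL = 10  # assumed driver retry interval, in seconds
  EXTRA_WAIT = 10               # the commit waits 10 seconds more than that

  def connect_target(portals, try_connect):
      """Retry while any portal is still 'connecting' instead of failing fast."""
      while True:
          any_connecting = False
          for portal in portals:
              state = try_connect(portal)  # e.g. 'live', 'connecting', or None
              if state == 'live':
                  return portal            # connected, done
              any_connecting = any_connecting or state == 'connecting'
          if not any_connecting:
              # Nothing is even trying to reconnect: the target really is
              # inaccessible, so give up as before.
              raise RuntimeError('Unable to reach any portal of the target')
          # Some portal is still reconnecting in the kernel, so wait for the
          # nvme Linux driver's next retry plus a safety margin and recheck.
          time.sleep(NVME_RECONNECT_INTERVAL + EXTRA_WAIT)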

Changed in os-brick:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/2023.2)

Fix proposed to branch: stable/2023.2
Review: https://review.opendev.org/c/openstack/os-brick/+/905230

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 6.6.0

This issue was fixed in the openstack/os-brick 6.6.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/2023.2)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/905230
Committed: https://opendev.org/openstack/os-brick/commit/7419306d2669568b8ef1aac6283e680da841d82f
Submitter: "Zuul (22348)"
Branch: stable/2023.2

commit 7419306d2669568b8ef1aac6283e680da841d82f
Author: Gorka Eguileor <email address hidden>
Date: Thu Sep 14 12:19:26 2023 +0200

    NVMe-oF: Fix attach when reconnecting

    When an nvme subsystem has all its portals in 'connecting' state and we
    try to attach a new volume to that same subsystem, the attach will fail.

    We can reproduce it with LVM+nvmet if we configure it to share targets
    and then:
    - Create instance
    - Attach 2 volumes
    - Delete instance (this leaves the subsystem in connecting state [1])
    - Create instance
    - Attach volume <== FAILS

    The problem comes from the '_connect_target' method, which ignores
    subsystems in 'connecting' state, so if they are all in that state it
    treats this as equivalent to all portals being inaccessible.

    This patch changes that behavior: if we cannot connect to a target but
    we have portals in 'connecting' state, we wait for the next retry of
    the nvme Linux driver. Specifically, we wait 10 seconds more than the
    interval between retries.

    [1]: https://bugs.launchpad.net/nova/+bug/2035375

    Closes-Bug: #2035695
    Change-Id: Ife710f52c339d67f2dcb160c20ad0d75480a1f48
    (cherry picked from commit ec22c32de6820184d7737c5af70e573c0634cd38)

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/2023.1)

Fix proposed to branch: stable/2023.1
Review: https://review.opendev.org/c/openstack/os-brick/+/905991

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/2023.1)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/905991
Committed: https://opendev.org/openstack/os-brick/commit/c0fded9fcd6bd58f883868840df5d932a68b6bad
Submitter: "Zuul (22348)"
Branch: stable/2023.1

commit c0fded9fcd6bd58f883868840df5d932a68b6bad
Author: Gorka Eguileor <email address hidden>
Date: Thu Sep 14 12:19:26 2023 +0200

    NVMe-oF: Fix attach when reconnecting

    When an nvme subsystem has all its portals in 'connecting' state and we
    try to attach a new volume to that same subsystem, the attach will fail.

    We can reproduce it with LVM+nvmet if we configure it to share targets
    and then:
    - Create instance
    - Attach 2 volumes
    - Delete instance (this leaves the subsystem in connecting state [1])
    - Create instance
    - Attach volume <== FAILS

    The problem comes from the '_connect_target' method, which ignores
    subsystems in 'connecting' state, so if they are all in that state it
    treats this as equivalent to all portals being inaccessible.

    This patch changes that behavior: if we cannot connect to a target but
    we have portals in 'connecting' state, we wait for the next retry of
    the nvme Linux driver. Specifically, we wait 10 seconds more than the
    interval between retries.

    [1]: https://bugs.launchpad.net/nova/+bug/2035375

    Closes-Bug: #2035695
    Change-Id: Ife710f52c339d67f2dcb160c20ad0d75480a1f48
    (cherry picked from commit ec22c32de6820184d7737c5af70e573c0634cd38)
    (cherry picked from commit 7419306d2669568b8ef1aac6283e680da841d82f)

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 6.4.1

This issue was fixed in the openstack/os-brick 6.4.1 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 6.2.3

This issue was fixed in the openstack/os-brick 6.2.3 release.
