multipath: multipathd doesn't remove path devices in a timely manner, and orphan paths can prevent a multipath device from being created later

Bug #1924652 reported by Takashi Kajinami
This bug affects 1 person
Affects: os-brick
Status: Fix Released
Importance: Undecided
Assigned to: Takashi Kajinami
Milestone: (none listed)

Bug Description

This issue was initially reported in the following downstream bug.
 https://bugzilla.redhat.com/show_bug.cgi?id=1950172

We noticed that after hard rebooting an instance, the instance sometimes uses a single iSCSI device instead of a multipath device.
We confirmed that the iSCSI devices attach without errors, but the multipath device (dm-X) is not created even though the "multipathd add" command succeeds. Because dm-X is not available, os-brick falls back to a single path device.

After investigating and discussing with the engineers who cover multipathd, we found the following:
 - Recent versions of multipathd delay path removal when they receive a burst of udev events.
 - When os-brick detaches a multipath volume, it flushes the multipath device and then removes the path devices directly within a short time. This is likely to cause a burst of udev events.
 - While multipathd is still delaying path removal, the next volume attachment starts. os-brick asks for the multipath device to be created again, but because the old orphan paths have not yet been removed, multipathd refuses to create it.

Because os-brick needs path devices to be removed promptly, it should not rely on multipathd to remove them in response to udev events. Instead, it should explicitly ask multipathd to remove the paths when detaching a device.
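
The explicit removal proposed above can be sketched as follows. This is a minimal illustration, not the code merged in os-brick: the helper names (del_path_command, remove_paths) and the injectable runner are hypothetical, and it assumes the "multipathd del path <device>" command is available and run with root privileges. It treats a non-zero exit status as failure, which may not catch every case in which multipathd declines the request.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence

# A runner takes a command line and returns its exit status.
Runner = Callable[[Sequence[str]], int]


def _run(cmd: Sequence[str]) -> int:
    """Run a command and return its exit status (multipathd needs root)."""
    return subprocess.run(cmd, check=False).returncode


def del_path_command(device: str) -> List[str]:
    """Build the command asking multipathd to stop monitoring one path."""
    return ["multipathd", "del", "path", device]


def remove_paths(devices: Iterable[str], runner: Runner = _run) -> List[str]:
    """Ask multipathd to drop each path device.

    Returns the devices whose request exited with a non-zero status, so the
    caller can log or retry them instead of waiting for udev-driven cleanup.
    """
    failed = []
    for dev in devices:
        # Best effort: keep going so one failing path does not leave the
        # remaining paths registered in multipathd.
        if runner(del_path_command(dev)) != 0:
            failed.append(dev)
    return failed
```

A caller would invoke remove_paths(["sda", "sdb"]) for the path devices of the volume being detached, before the devices themselves are deleted, so that no orphan paths remain when the next attach starts.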

Changed in os-brick:
status: New → In Progress
Takashi Kajinami (kajinamit) wrote :
Changed in os-brick:
assignee: nobody → Takashi Kajinami (kajinamit)
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (master)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/785818
Committed: https://opendev.org/openstack/os-brick/commit/1b2e2295421615847d86508dcd487ec51fa45f25
Submitter: "Zuul (22348)"
Branch: master

commit 1b2e2295421615847d86508dcd487ec51fa45f25
Author: Takashi Kajinami <email address hidden>
Date: Mon Apr 12 14:07:46 2021 +0900

    multipath/iscsi: remove devices from multipath monitoring

    Recent multipathd doesn't remove path devices timely when it receives
    burst of udev events but wait for a while to start actual removal.
    Because os-brick removes path devices in a short time during detaching
    a multipath device, it is likely to hit this burst limit and sometimes
    path devices are not removed before a subsequent operation is started.

    This change ensures that os-brick tells multipathd to remove path
    devices when the devices should be deleted, so that orphan paths are
    not left when starting a subsequent attach operation.

    Closes-Bug: #1924652
    Change-Id: I65204aa7495740dc1545bff2c5c485a8041e7930

Changed in os-brick:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/os-brick/+/788599

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/os-brick/+/788793

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/os-brick/+/788794

OpenStack Infra (hudson-openstack) wrote : Fix proposed to os-brick (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/os-brick/+/788795

OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/788599
Committed: https://opendev.org/openstack/os-brick/commit/0cd58a9b0a5520e184a7d3ef45f03ed8cda93732
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 0cd58a9b0a5520e184a7d3ef45f03ed8cda93732
Author: Takashi Kajinami <email address hidden>
Date: Mon Apr 12 14:07:46 2021 +0900

    multipath/iscsi: remove devices from multipath monitoring

    Closes-Bug: #1924652
    Change-Id: I65204aa7495740dc1545bff2c5c485a8041e7930
    (cherry picked from commit 1b2e2295421615847d86508dcd487ec51fa45f25)

tags: added: in-stable-wallaby
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/788793
Committed: https://opendev.org/openstack/os-brick/commit/a0e995b1308b9a188e8719fdc6d6bad15508dba6
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit a0e995b1308b9a188e8719fdc6d6bad15508dba6
Author: Takashi Kajinami <email address hidden>
Date: Mon Apr 12 14:07:46 2021 +0900

    multipath/iscsi: remove devices from multipath monitoring

    Closes-Bug: #1924652
    Change-Id: I65204aa7495740dc1545bff2c5c485a8041e7930
    (cherry picked from commit 1b2e2295421615847d86508dcd487ec51fa45f25)
    (cherry picked from commit 0cd58a9b0a5520e184a7d3ef45f03ed8cda93732)

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/788794
Committed: https://opendev.org/openstack/os-brick/commit/5c031cb2d9e9b10a33c82586d11139594c96a25f
Submitter: "Zuul (22348)"
Branch: stable/ussuri

commit 5c031cb2d9e9b10a33c82586d11139594c96a25f
Author: Takashi Kajinami <email address hidden>
Date: Mon Apr 12 14:07:46 2021 +0900

    multipath/iscsi: remove devices from multipath monitoring

    Closes-Bug: #1924652
    Change-Id: I65204aa7495740dc1545bff2c5c485a8041e7930
    (cherry picked from commit 1b2e2295421615847d86508dcd487ec51fa45f25)
    (cherry picked from commit 0cd58a9b0a5520e184a7d3ef45f03ed8cda93732)
    (cherry picked from commit a0e995b1308b9a188e8719fdc6d6bad15508dba6)

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix merged to os-brick (stable/train)

Reviewed: https://review.opendev.org/c/openstack/os-brick/+/788795
Committed: https://opendev.org/openstack/os-brick/commit/30244b175cac6e643915d762a80b8c8f5848f256
Submitter: "Zuul (22348)"
Branch: stable/train

commit 30244b175cac6e643915d762a80b8c8f5848f256
Author: Takashi Kajinami <email address hidden>
Date: Mon Apr 12 14:07:46 2021 +0900

    multipath/iscsi: remove devices from multipath monitoring

    Closes-Bug: #1924652
    Change-Id: I65204aa7495740dc1545bff2c5c485a8041e7930
    (cherry picked from commit 1b2e2295421615847d86508dcd487ec51fa45f25)
    (cherry picked from commit 0cd58a9b0a5520e184a7d3ef45f03ed8cda93732)
    (cherry picked from commit a0e995b1308b9a188e8719fdc6d6bad15508dba6)
    (cherry picked from commit 5c031cb2d9e9b10a33c82586d11139594c96a25f)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 2.10.7

This issue was fixed in the openstack/os-brick 2.10.7 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 4.4.0

This issue was fixed in the openstack/os-brick 4.4.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 3.0.7

This issue was fixed in the openstack/os-brick 3.0.7 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 4.3.2

This issue was fixed in the openstack/os-brick 4.3.2 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/os-brick 4.0.4

This issue was fixed in the openstack/os-brick 4.0.4 release.
