Bug #1838392 “BDMNotFound raised and stale block devices left ov...” : Bugs : OpenStack Compute (nova)

OpenStack Infra (hudson-openstack) on 2019-07-30

Changed in nova:
assignee:	nobody → Lee Yarwood (lyarwood)
status:	New → In Progress

OpenStack Infra (hudson-openstack) wrote on 2019-11-26: Fix merged to nova (master)

#1

Reviewed: https://review.opendev.org/673463
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9ad54f3dacbd372271f441baea5380f913072dde
Submitter: Zuul
Branch: master

commit 9ad54f3dacbd372271f441baea5380f913072dde
Author: Lee Yarwood <email address hidden>
Date: Mon Jul 29 16:25:45 2019 +0100

    compute: Take an instance.uuid lock when rebooting

    Previously simultaneous requests to reboot and delete an instance could
    race as only the latter took a lock against the uuid of the instance.

    With the Libvirt driver this race could potentially result in attempts
    being made to reconnect previously disconnected volumes on the host.
    Depending on the volume backend being used this could then result in
    stale block devices pointing to unmapped volumes being left on the host
    that in turn could cause failures later on when connecting newly mapped
    volumes.

    This change avoids this race by ensuring any request to reboot an
    instance takes an instance.uuid lock within the compute manager,
    serialising requests to reboot and then delete the instance.

    Closes-Bug: #1838392
    Change-Id: Ieb59de10c63bb067f92ec054535766cdd722dae2
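
For readers unfamiliar with the locking pattern the commit message describes, here is a minimal sketch of how a per-instance lock serialises a reboot and a delete. It is not the nova code: `synchronized`, `ToyComputeManager`, `run_concurrently` and the event log are hypothetical names made up for illustration, and the lock registry is a plain in-process dictionary standing in for whatever locking helper the compute manager actually uses.

```python
import threading
import time

# Hypothetical in-process lock registry, keyed by lock name (here: instance uuid).
_registry_guard = threading.Lock()
_locks = {}


def synchronized(name):
    """Run the decorated function while holding the lock registered for name."""
    def decorator(func):
        def wrapper(*args, **kwargs):
            # Look up (or create) the lock for this name under a guard so two
            # threads never create separate locks for the same name.
            with _registry_guard:
                lock = _locks.setdefault(name, threading.Lock())
            with lock:
                return func(*args, **kwargs)
        return wrapper
    return decorator


class ToyComputeManager:
    """Illustrative stand-in for a compute manager; not the real class."""

    def __init__(self):
        self.events = []

    def reboot_instance(self, instance_uuid):
        @synchronized(instance_uuid)
        def do_reboot_instance():
            self.events.append(("reboot", "start"))
            time.sleep(0.05)  # pretend to reconnect volumes
            self.events.append(("reboot", "end"))
        do_reboot_instance()

    def terminate_instance(self, instance_uuid):
        @synchronized(instance_uuid)
        def do_terminate_instance():
            self.events.append(("delete", "start"))
            time.sleep(0.05)  # pretend to disconnect volumes
            self.events.append(("delete", "end"))
        do_terminate_instance()


def run_concurrently(manager, instance_uuid):
    """Issue a reboot and a delete at the same time and return the event log."""
    threads = [
        threading.Thread(target=manager.reboot_instance, args=(instance_uuid,)),
        threading.Thread(target=manager.terminate_instance, args=(instance_uuid,)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return manager.events
```

Because both operations take the lock for the same uuid, whichever request acquires it first runs to completion before the other starts, so the "start" and "end" events of one operation are never interleaved with the other's. If only the delete path took the lock, as the commit message says was previously the case, the reboot could run in the middle of the delete.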

Changed in nova:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2019-11-26: Fix proposed to nova (stable/train)

#2

Fix proposed to branch: stable/train
Review: https://review.opendev.org/696151

OpenStack Infra (hudson-openstack) wrote on 2019-11-26: Fix proposed to nova (stable/stein)

#3

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/696152

OpenStack Infra (hudson-openstack) wrote on 2019-11-26: Fix proposed to nova (stable/rocky)

#4

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/696153

OpenStack Infra (hudson-openstack) wrote on 2019-11-26: Fix proposed to nova (stable/queens)

#5

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/696154

OpenStack Infra (hudson-openstack) wrote on 2019-12-06: Fix merged to nova (stable/train)

#6

Reviewed: https://review.opendev.org/696151
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=939cd9b177db8f12952e72145a5c00a0574959eb
Submitter: Zuul
Branch: stable/train

commit 939cd9b177db8f12952e72145a5c00a0574959eb
Author: Lee Yarwood <email address hidden>
Date: Mon Jul 29 16:25:45 2019 +0100

    compute: Take an instance.uuid lock when rebooting

    Previously simultaneous requests to reboot and delete an instance could
    race as only the latter took a lock against the uuid of the instance.

    With the Libvirt driver this race could potentially result in attempts
    being made to reconnect previously disconnected volumes on the host.
    Depending on the volume backend being used this could then result in
    stale block devices pointing to unmapped volumes being left on the host
    that in turn could cause failures later on when connecting newly mapped
    volumes.

    This change avoids this race by ensuring any request to reboot an
    instance takes an instance.uuid lock within the compute manager,
    serialising requests to reboot and then delete the instance.

    Closes-Bug: #1838392
    Change-Id: Ieb59de10c63bb067f92ec054535766cdd722dae2
    (cherry picked from commit 9ad54f3dacbd372271f441baea5380f913072dde)

OpenStack Infra (hudson-openstack) wrote on 2019-12-06: Fix merged to nova (stable/stein)

#7

Reviewed: https://review.opendev.org/696152
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=304d3f62a4e3bdbaab6fe7dd665174bc5b696d08
Submitter: Zuul
Branch: stable/stein

commit 304d3f62a4e3bdbaab6fe7dd665174bc5b696d08
Author: Lee Yarwood <email address hidden>
Date: Mon Jul 29 16:25:45 2019 +0100

    compute: Take an instance.uuid lock when rebooting

    Previously simultaneous requests to reboot and delete an instance could
    race as only the latter took a lock against the uuid of the instance.

    With the Libvirt driver this race could potentially result in attempts
    being made to reconnect previously disconnected volumes on the host.
    Depending on the volume backend being used this could then result in
    stale block devices pointing to unmapped volumes being left on the host
    that in turn could cause failures later on when connecting newly mapped
    volumes.

    This change avoids this race by ensuring any request to reboot an
    instance takes an instance.uuid lock within the compute manager,
    serialising requests to reboot and then delete the instance.

    Closes-Bug: #1838392
    Change-Id: Ieb59de10c63bb067f92ec054535766cdd722dae2
    (cherry picked from commit 9ad54f3dacbd372271f441baea5380f913072dde)
    (cherry picked from commit 939cd9b177db8f12952e72145a5c00a0574959eb)

OpenStack Infra (hudson-openstack) wrote on 2019-12-07: Fix merged to nova (stable/rocky)

#8

Reviewed: https://review.opendev.org/696153
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7d14b6a5170821c65e55d7b39ccf4419a81640f8
Submitter: Zuul
Branch: stable/rocky

commit 7d14b6a5170821c65e55d7b39ccf4419a81640f8
Author: Lee Yarwood <email address hidden>
Date: Mon Jul 29 16:25:45 2019 +0100

    compute: Take an instance.uuid lock when rebooting

    Previously simultaneous requests to reboot and delete an instance could
    race as only the latter took a lock against the uuid of the instance.

    With the Libvirt driver this race could potentially result in attempts
    being made to reconnect previously disconnected volumes on the host.
    Depending on the volume backend being used this could then result in
    stale block devices pointing to unmapped volumes being left on the host
    that in turn could cause failures later on when connecting newly mapped
    volumes.

    This change avoids this race by ensuring any request to reboot an
    instance takes an instance.uuid lock within the compute manager,
    serialising requests to reboot and then delete the instance.

    Closes-Bug: #1838392
    Change-Id: Ieb59de10c63bb067f92ec054535766cdd722dae2
    (cherry picked from commit 9ad54f3dacbd372271f441baea5380f913072dde)
    (cherry picked from commit 939cd9b177db8f12952e72145a5c00a0574959eb)
    (cherry picked from commit 304d3f62a4e3bdbaab6fe7dd665174bc5b696d08)

OpenStack Infra (hudson-openstack) wrote on 2019-12-07: Fix merged to nova (stable/queens)

#9

Reviewed: https://review.opendev.org/696154
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=16fb8ac3f4c2fe94ae83d65fbcf6f49665a0dd60
Submitter: Zuul
Branch: stable/queens

commit 16fb8ac3f4c2fe94ae83d65fbcf6f49665a0dd60
Author: Lee Yarwood <email address hidden>
Date: Mon Jul 29 16:25:45 2019 +0100

    compute: Take an instance.uuid lock when rebooting

    Previously simultaneous requests to reboot and delete an instance could
    race as only the latter took a lock against the uuid of the instance.

    With the Libvirt driver this race could potentially result in attempts
    being made to reconnect previously disconnected volumes on the host.
    Depending on the volume backend being used this could then result in
    stale block devices pointing to unmapped volumes being left on the host
    that in turn could cause failures later on when connecting newly mapped
    volumes.

    This change avoids this race by ensuring any request to reboot an
    instance takes an instance.uuid lock within the compute manager,
    serialising requests to reboot and then delete the instance.

    Closes-Bug: #1838392
    Change-Id: Ieb59de10c63bb067f92ec054535766cdd722dae2
    (cherry picked from commit 9ad54f3dacbd372271f441baea5380f913072dde)
    (cherry picked from commit 939cd9b177db8f12952e72145a5c00a0574959eb)
    (cherry picked from commit 304d3f62a4e3bdbaab6fe7dd665174bc5b696d08)
    (cherry picked from commit 7d14b6a5170821c65e55d7b39ccf4419a81640f8)

OpenStack Infra (hudson-openstack) wrote on 2020-02-06: Fix included in openstack/nova 20.1.0

#10

This issue was fixed in the openstack/nova 20.1.0 release.

OpenStack Infra (hudson-openstack) wrote on 2020-02-06: Fix included in openstack/nova 19.1.0

#11

This issue was fixed in the openstack/nova 19.1.0 release.

OpenStack Infra (hudson-openstack) wrote on 2020-02-25: Fix included in openstack/nova 18.3.0

#12

This issue was fixed in the openstack/nova 18.3.0 release.

OpenStack Infra (hudson-openstack) wrote on 2022-06-28: Related fix proposed to nova (master)

#13

Related fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/847965

OpenStack Infra (hudson-openstack) wrote on 2022-11-11: Fix included in openstack/nova queens-eol

#14

This issue was fixed in the openstack/nova queens-eol release.

Affects	Status	Importance	Assigned to
OpenStack Compute (nova)	Fix Released	Undecided	Lee Yarwood
Queens	Fix Released	Undecided	Lee Yarwood
Rocky	Fix Committed	Undecided	Lee Yarwood
Stein	Fix Committed	Undecided	Lee Yarwood
Train	Fix Committed	Undecided	Lee Yarwood

OpenStack Compute (nova)

BDMNotFound raised and stale block devices left over when simultaneously rebooting and deleting an instance
