Bug #1414895 “failed compute node didn't delete instance's path ...” : Bugs : OpenStack Compute (nova)

Revision history for this message

Sean Dague (sdague) wrote on 2015-03-24:

#1

This seems to be a ceph specific issue, marking as low

Changed in nova:
status:	New → Confirmed
importance:	Undecided → Low
tags:	added: ceph

Bartosz Fic (bartosz-fic) on 2015-03-31

Changed in nova:
assignee:	nobody → Bartosz Fic (bartosz-fic)

Pawel Koniszewski (pawel-koniszewski) on 2015-06-01

tags:

added: live-migrate

Pawel Koniszewski (pawel-koniszewski) on 2015-10-02

Changed in nova:
assignee:	Bartosz Fic (bartosz-fic) → nobody

Paul Murray (pmurray) on 2015-11-06

tags:

added: live-migration
removed: live-migrate

lvmxh (shaohef) on 2015-11-17

Changed in nova:
assignee:	nobody → lvmxh (shaohef)

lvmxh (shaohef) on 2015-12-03

Changed in nova:
assignee:	lvmxh (shaohef) → nobody

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2016-02-18: Fix proposed to nova (master)

#2

Fix proposed to branch: master
Review: https://review.openstack.org/281913

Changed in nova:
assignee:	nobody → Lee Yarwood (lyarwood)
status:	Confirmed → In Progress

Revision history for this message

Timofey Durakov (tdurakov) wrote on 2016-02-20:

#3

This problem isn't affect live migration because there are 2 similar methods for checking shared storage, one described here is used during revert resize and cleanup compute after fail. So removing live-migrate tag.
P.S. this check duplication should be refactored.

tags:

removed: live-migration

Revision history for this message

Timofey Durakov (tdurakov) wrote on 2016-02-20:

#4

oh, side effect for live-migration process.

tags:

added: live-migration

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2016-03-15: Change abandoned on nova (master)

#5

Change abandoned by Lee Yarwood (<email address hidden>) on branch: master
Review: https://review.openstack.org/281913

Revision history for this message

Markus Zoeller (markus_z) (mzoeller) wrote on 2016-04-05:

#6

@Lee Yarwood: Are you still working on this?

Revision history for this message

Lee Yarwood (lyarwood) wrote on 2016-04-06:

#7

Yes but I see that @tdurakov also has a review up re-factoring much of this code [1]. I'll review this and reassign if required.

[1] https://review.openstack.org/#/c/280653/

Revision history for this message

Augustina Ragwitz (auggy) wrote on 2016-06-16:

#8

I left a comment on that review to include this bug in that "closes bug" list. If that's inaccurate then please feel free to pipe up and clarify.

Revision history for this message

Sean Dague (sdague) wrote on 2017-06-23:

#9

There are no currently open reviews on this bug, changing
the status back to the previous state and unassigning. If
there are active reviews related to this bug, please include
links in comments.

Changed in nova:
status:	In Progress → Confirmed
assignee:	Lee Yarwood (lyarwood) → nobody

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2017-06-27: Fix proposed to nova (master)

#10

Fix proposed to branch: master
Review: https://review.openstack.org/477780

Changed in nova:
assignee:	nobody → Jeffrey Zhang (jeffrey4l)
status:	Confirmed → In Progress

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2018-11-16:

#11

Fix proposed to branch: master
Review: https://review.openstack.org/618478

Changed in nova:
assignee:	Jeffrey Zhang (jeffrey4l) → Lee Yarwood (lyarwood)

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-01-02: Fix proposed to nova (stable/rocky)

#12

Fix proposed to branch: stable/rocky
Review: https://review.openstack.org/627958

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-01-04: Fix merged to nova (master)

#13

Reviewed: https://review.openstack.org/618478
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d6c1f6a1032ed2ea99f3d8b70ccf38065163d785
Submitter: Zuul
Branch: master

commit d6c1f6a1032ed2ea99f3d8b70ccf38065163d785
Author: Lee Yarwood <email address hidden>
Date: Mon Dec 3 09:03:26 2018 +0000

libvirt: Add workaround to cleanup instance dir when using rbd

    At present all virt drivers provide a cleanup method that takes a single
    destroy_disks boolean to indicate when the underlying storage of an
    instance should be destroyed.

    When cleaning up after an evacuation or revert resize the value of
    destroy_disks is determined by the compute layer calling down both into
    the check_instance_shared_storage_local method of the local virt driver
    and remote check_instance_shared_storage method of the virt driver on
    the host now running the instance.

    For the Libvirt driver the initial local call will return None when
    using the shared block RBD imagebackend as it is assumed all instance
    storage is shared resulting in destroy_disks always being False when
    cleaning up. This behaviour is wrong as the instance disks are stored
    separately to the instance directory that still needs to be cleaned up
    on the host. Additionally this directory could also be shared
    independently of the disks on a NFS share for example and would need to
    also be checked before removal.

    This change introduces a backportable workaround configurable for the
    Libvirt driver with which operators can ensure that the instance
    directory is always removed during cleanup when using the RBD
    imagebackend. When enabling this workaround operators will need to
    ensure that the instance directories are not shared between computes.

    Future work will allow for the removal of this workaround by separating
    the shared storage checks from the compute to virt layers between the
    actual instance disks and any additional storage required by the
    specific virt backend.

    Related-Bug: #1761062
    Partial-Bug: #1414895
    Change-Id: I8fd6b9f857a1c4919c3365951e2652d2d477df77

Reviewed:  https://review.openstack.org/618478
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d6c1f6a1032ed2ea99f3d8b70ccf38065163d785
Submitter: Zuul
Branch:    master

commit d6c1f6a1032ed2ea99f3d8b70ccf38065163d785
Author: Lee Yarwood <lyarwood@redhat.com>
Date:   Mon Dec 3 09:03:26 2018 +0000

libvirt: Add workaround to cleanup instance dir when using rbd
    
    At present all virt drivers provide a cleanup method that takes a single
    destroy_disks boolean to indicate when the underlying storage of an
    instance should be destroyed.
    
    When cleaning up after an evacuation or revert resize the value of
    destroy_disks is determined by the compute layer calling down both into
    the check_instance_shared_storage_local method of the local virt driver
    and remote check_instance_shared_storage method of the virt driver on
    the host now running the instance.
    
    For the Libvirt driver the initial local call will return None when
    using the shared block RBD imagebackend as it is assumed all instance
    storage is shared resulting in destroy_disks always being False when
    cleaning up. This behaviour is wrong as the instance disks are stored
    separately to the instance directory that still needs to be cleaned up
    on the host. Additionally this directory could also be shared
    independently of the disks on a NFS share for example and would need to
    also be checked before removal.
    
    This change introduces a backportable workaround configurable for the
    Libvirt driver with which operators can ensure that the instance
    directory is always removed during cleanup when using the RBD
    imagebackend. When enabling this workaround operators will need to
    ensure that the instance directories are not shared between computes.
    
    Future work will allow for the removal of this workaround by separating
    the shared storage checks from the compute to virt layers between the
    actual instance disks and any additional storage required by the
    specific virt backend.
    
    Related-Bug: #1761062
    Partial-Bug: #1414895
    Change-Id: I8fd6b9f857a1c4919c3365951e2652d2d477df77

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-01-05: Fix proposed to nova (stable/queens)

#14

Fix proposed to branch: stable/queens
Review: https://review.openstack.org/628726

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-01-16: Fix merged to nova (stable/rocky)

#15

Reviewed: https://review.openstack.org/627958
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8c678ae57299076a5013f0be985621e064acfee0
Submitter: Zuul
Branch: stable/rocky

commit 8c678ae57299076a5013f0be985621e064acfee0
Author: Lee Yarwood <email address hidden>
Date: Mon Dec 3 09:03:26 2018 +0000

libvirt: Add workaround to cleanup instance dir when using rbd

    At present all virt drivers provide a cleanup method that takes a single
    destroy_disks boolean to indicate when the underlying storage of an
    instance should be destroyed.

    When cleaning up after an evacuation or revert resize the value of
    destroy_disks is determined by the compute layer calling down both into
    the check_instance_shared_storage_local method of the local virt driver
    and remote check_instance_shared_storage method of the virt driver on
    the host now running the instance.

    For the Libvirt driver the initial local call will return None when
    using the shared block RBD imagebackend as it is assumed all instance
    storage is shared resulting in destroy_disks always being False when
    cleaning up. This behaviour is wrong as the instance disks are stored
    separately to the instance directory that still needs to be cleaned up
    on the host. Additionally this directory could also be shared
    independently of the disks on a NFS share for example and would need to
    also be checked before removal.

    This change introduces a backportable workaround configurable for the
    Libvirt driver with which operators can ensure that the instance
    directory is always removed during cleanup when using the RBD
    imagebackend. When enabling this workaround operators will need to
    ensure that the instance directories are not shared between computes.

    Future work will allow for the removal of this workaround by separating
    the shared storage checks from the compute to virt layers between the
    actual instance disks and any additional storage required by the
    specific virt backend.

    Related-Bug: #1761062
    Partial-Bug: #1414895
    Change-Id: I8fd6b9f857a1c4919c3365951e2652d2d477df77
    (cherry picked from commit d6c1f6a1032ed2ea99f3d8b70ccf38065163d785)

Reviewed:  https://review.openstack.org/627958
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8c678ae57299076a5013f0be985621e064acfee0
Submitter: Zuul
Branch:    stable/rocky

commit 8c678ae57299076a5013f0be985621e064acfee0
Author: Lee Yarwood <lyarwood@redhat.com>
Date:   Mon Dec 3 09:03:26 2018 +0000

libvirt: Add workaround to cleanup instance dir when using rbd
    
    At present all virt drivers provide a cleanup method that takes a single
    destroy_disks boolean to indicate when the underlying storage of an
    instance should be destroyed.
    
    When cleaning up after an evacuation or revert resize the value of
    destroy_disks is determined by the compute layer calling down both into
    the check_instance_shared_storage_local method of the local virt driver
    and remote check_instance_shared_storage method of the virt driver on
    the host now running the instance.
    
    For the Libvirt driver the initial local call will return None when
    using the shared block RBD imagebackend as it is assumed all instance
    storage is shared resulting in destroy_disks always being False when
    cleaning up. This behaviour is wrong as the instance disks are stored
    separately to the instance directory that still needs to be cleaned up
    on the host. Additionally this directory could also be shared
    independently of the disks on a NFS share for example and would need to
    also be checked before removal.
    
    This change introduces a backportable workaround configurable for the
    Libvirt driver with which operators can ensure that the instance
    directory is always removed during cleanup when using the RBD
    imagebackend. When enabling this workaround operators will need to
    ensure that the instance directories are not shared between computes.
    
    Future work will allow for the removal of this workaround by separating
    the shared storage checks from the compute to virt layers between the
    actual instance disks and any additional storage required by the
    specific virt backend.
    
    Related-Bug: #1761062
    Partial-Bug: #1414895
    Change-Id: I8fd6b9f857a1c4919c3365951e2652d2d477df77
    (cherry picked from commit d6c1f6a1032ed2ea99f3d8b70ccf38065163d785)

tags:

added: in-stable-rocky

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-03-22: Fix merged to nova (stable/queens)

#16

Reviewed: https://review.openstack.org/628726
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b7bf1fbe4917c285f7bb635e791204d67b809049
Submitter: Zuul
Branch: stable/queens

commit b7bf1fbe4917c285f7bb635e791204d67b809049
Author: Lee Yarwood <email address hidden>
Date: Mon Dec 3 09:03:26 2018 +0000

libvirt: Add workaround to cleanup instance dir when using rbd

    At present all virt drivers provide a cleanup method that takes a single
    destroy_disks boolean to indicate when the underlying storage of an
    instance should be destroyed.

    When cleaning up after an evacuation or revert resize the value of
    destroy_disks is determined by the compute layer calling down both into
    the check_instance_shared_storage_local method of the local virt driver
    and remote check_instance_shared_storage method of the virt driver on
    the host now running the instance.

    For the Libvirt driver the initial local call will return None when
    using the shared block RBD imagebackend as it is assumed all instance
    storage is shared resulting in destroy_disks always being False when
    cleaning up. This behaviour is wrong as the instance disks are stored
    separately to the instance directory that still needs to be cleaned up
    on the host. Additionally this directory could also be shared
    independently of the disks on a NFS share for example and would need to
    also be checked before removal.

    This change introduces a backportable workaround configurable for the
    Libvirt driver with which operators can ensure that the instance
    directory is always removed during cleanup when using the RBD
    imagebackend. When enabling this workaround operators will need to
    ensure that the instance directories are not shared between computes.

    Future work will allow for the removal of this workaround by separating
    the shared storage checks from the compute to virt layers between the
    actual instance disks and any additional storage required by the
    specific virt backend.

    NOTE(lyarwood): Conflicts as If1b6e5f20d2ea82d94f5f0550f13189fc9bc16c4
    only merged in Rocky and the backports of
    Id3c74c019da29070811ffc368351e2238b3f6da5 and
    I217fba9138132b107e9d62895d699d238392e761 have yet to land on
    stable/queens from stable/rocky.

Conflicts:
nova/conf/workarounds.py

    Related-Bug: #1761062
    Partial-Bug: #1414895
    Change-Id: I8fd6b9f857a1c4919c3365951e2652d2d477df77
    (cherry picked from commit d6c1f6a1032ed2ea99f3d8b70ccf38065163d785)
    (cherry picked from commit 8c678ae57299076a5013f0be985621e064acfee0)

Reviewed:  https://review.openstack.org/628726
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b7bf1fbe4917c285f7bb635e791204d67b809049
Submitter: Zuul
Branch:    stable/queens

commit b7bf1fbe4917c285f7bb635e791204d67b809049
Author: Lee Yarwood <lyarwood@redhat.com>
Date:   Mon Dec 3 09:03:26 2018 +0000

libvirt: Add workaround to cleanup instance dir when using rbd
    
    At present all virt drivers provide a cleanup method that takes a single
    destroy_disks boolean to indicate when the underlying storage of an
    instance should be destroyed.
    
    When cleaning up after an evacuation or revert resize the value of
    destroy_disks is determined by the compute layer calling down both into
    the check_instance_shared_storage_local method of the local virt driver
    and remote check_instance_shared_storage method of the virt driver on
    the host now running the instance.
    
    For the Libvirt driver the initial local call will return None when
    using the shared block RBD imagebackend as it is assumed all instance
    storage is shared resulting in destroy_disks always being False when
    cleaning up. This behaviour is wrong as the instance disks are stored
    separately to the instance directory that still needs to be cleaned up
    on the host. Additionally this directory could also be shared
    independently of the disks on a NFS share for example and would need to
    also be checked before removal.
    
    This change introduces a backportable workaround configurable for the
    Libvirt driver with which operators can ensure that the instance
    directory is always removed during cleanup when using the RBD
    imagebackend. When enabling this workaround operators will need to
    ensure that the instance directories are not shared between computes.
    
    Future work will allow for the removal of this workaround by separating
    the shared storage checks from the compute to virt layers between the
    actual instance disks and any additional storage required by the
    specific virt backend.
    
    NOTE(lyarwood): Conflicts as If1b6e5f20d2ea82d94f5f0550f13189fc9bc16c4
    only merged in Rocky and the backports of
    Id3c74c019da29070811ffc368351e2238b3f6da5 and
    I217fba9138132b107e9d62895d699d238392e761 have yet to land on
    stable/queens from stable/rocky.
    
    Conflicts:
            nova/conf/workarounds.py
    
    Related-Bug: #1761062
    Partial-Bug: #1414895
    Change-Id: I8fd6b9f857a1c4919c3365951e2652d2d477df77
    (cherry picked from commit d6c1f6a1032ed2ea99f3d8b70ccf38065163d785)
    (cherry picked from commit 8c678ae57299076a5013f0be985621e064acfee0)

tags:

added: in-stable-queens

Revision history for this message

OpenStack Infra (hudson-openstack) wrote on 2019-07-02: Change abandoned on nova (master)

#17

Change abandoned by Jeffrey Zhang (<email address hidden>) on branch: master
Review: https://review.opendev.org/477780

Lee Yarwood (lyarwood) on 2020-09-15

Changed in nova:
status:	In Progress → Fix Released

OpenStack Compute (nova)

failed compute node didn't delete instance's path directory in init_host

Bug Description

Duplicates of this bug

Other bug subscribers

Remote bug watches

	Status	Importance	Assigned to
OpenStack Compute (nova)	Fix Released	Low	Lee Yarwood
Queens	Fix Released	Low	Lee Yarwood
Rocky	Fix Released	Low	Lee Yarwood