Resize a boot-from-volume instance with NFS destroys instance
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Compute (nova) | | High | Matt Riedemann | | |
Ocata | | High | Matt Riedemann | | |
Pike | | High | Matt Riedemann | | |
Bug Description
Turns out that the fix for https:/
https:/
if os.path.
try:
except OSError as e:
if e.errno != errno.ENOENT:
Causes the instance base directory, which includes the instance's libvirt.xml file, to be deleted.
The above needs to be changed to this in order to prevent BFV instances from being destroyed on resize...
if os.path.
This bug was reported and the fix confirmed by Joris S'heeren
Matt Riedemann (mriedem) wrote : | #1 |
Matt Riedemann (mriedem) wrote : | #2 |
Note that we do run an NFS-based job in the nova experimental CI queue but I'd have to check if it runs any resize tests - it probably should, and those are likely broken.
Matt Riedemann (mriedem) wrote : | #3 |
So we do run resize tests in the NFS job:
But they don't fail on the regression, I'm assuming because the Tempest test doesn't actually try to do anything with the instance once the resize is confirmed? Or because it's not a boot from volume scenario?
tags: | added: libvirt |
Jay Pipes (jaypipes) wrote : | #4 |
Yeah, it's the boot-from-volume thing that's the difference I believe.
description: | updated |
Matt Riedemann (mriedem) wrote : | #5 |
I can't say I understand this really - why would the instance base directory that contains the libvirt domain xml files for the guest, which are ephemeral per compute host, be the same location as the root disk that is backed by the Cinder volume? Is the local ephemeral disk on the compute host the same NFS storage as the Cinder volumes?
Matt Riedemann (mriedem) wrote : | #6 |
Oh nevermind I think I understand, the root disk in the volume isn't the same storage, the problem is that because the root disk is in a cinder volume, "root_disk.
Fix proposed to branch: master
Review: https:/
Changed in nova: | |
assignee: | nobody → Matt Riedemann (mriedem) |
status: | Confirmed → In Progress |
Matt Riedemann (mriedem) wrote : | #8 |
Reproduced with a Tempest test in the NFS job here:
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit f02afc6569bd930
Author: Matt Riedemann <email address hidden>
Date: Mon Oct 30 12:49:31 2017 -0400
libvirt: do not remove inst_base when volume-backed during resize
When confirming a resize, the libvirt driver on the source host checks
to see if the instance base directory (which contains the domain xml
files, etc) exists and if the root disk image does not, it removes the
instance base directory.
However, the root image disk won't exist on local storage for a
volume-backed instance and if the instance base directory is on shared
storage, e.g. NFS or Ceph, between the source and destination host, the
instance base directory is incorrectly deleted.
This adds a check to see if the instance is volume-backed when checking
to see if the instance base directory should be removed from the source
host when confirming a resize.
Change-Id: I29fac80d08baf6
Closes-Bug: #1728603
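The check described in this commit message can be sketched roughly as follows. This is a minimal illustration and not the actual nova code: the helper name cleanup_source_inst_base and its parameters are hypothetical, and the shared storage is simulated with a temporary directory.

```python
import os
import shutil
import tempfile


def cleanup_source_inst_base(inst_base, root_disk_exists, volume_backed):
    """Return True if the instance base directory was removed.

    Hypothetical helper; the name and parameters are illustrative only.
    """
    # Before the fix, the directory was removed whenever it existed and the
    # local root disk image did not. The fix sketched here adds one more
    # condition: a volume-backed instance never has a local root disk, so a
    # missing root disk must not trigger removal in that case.
    if (os.path.exists(inst_base)
            and not root_disk_exists
            and not volume_backed):
        shutil.rmtree(inst_base, ignore_errors=True)
        return True
    return False


# Simulate an instance directory on shared storage holding the domain XML.
shared = tempfile.mkdtemp()
inst_base = os.path.join(shared, "instance-uuid")
os.makedirs(inst_base)
open(os.path.join(inst_base, "libvirt.xml"), "w").close()

# Volume-backed instance: no local root disk, directory must be kept.
removed = cleanup_source_inst_base(
    inst_base, root_disk_exists=False, volume_backed=True)
```

With volume_backed=False and the same missing root disk, the helper would remove the directory, which matches the behavior before the fix.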
Changed in nova: | |
status: | In Progress → Fix Released |
Fix proposed to branch: stable/pike
Review: https:/
Fix proposed to branch: stable/ocata
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/pike
commit d5d81a29050966f
Author: Matt Riedemann <email address hidden>
Date: Mon Oct 30 12:49:31 2017 -0400
(cherry picked from commit f02afc6569bd930
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/ocata
commit 64f773ab96619b9
Author: Matt Riedemann <email address hidden>
Date: Mon Oct 30 12:49:31 2017 -0400
(cherry picked from commit f02afc6569bd930
(cherry picked from commit d5d81a29050966f
This issue was fixed in the openstack/nova 15.0.8 release.
This issue was fixed in the openstack/nova 16.0.3 release.
This issue was fixed in the openstack/nova 17.0.0.0b2 development milestone.
Related fix proposed to branch: master
Review: https:/
Related fix proposed to branch: stable/queens
Review: https:/
Related fix proposed to branch: stable/pike
Review: https:/
Related fix proposed to branch: stable/ocata
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit 8e3385707cb1ced
Author: Matt Riedemann <email address hidden>
Date: Fri May 4 12:58:07 2018 -0400
libvirt: check image type before removing snapshots in _cleanup_resize
Change Ic683f83e428106
cleanup logic to _cleanup_resize because some image backends (Qcow2,
Flat and Ploop) will re-create the instance directory and disk.info
file when initializing the image backend object.
However, that change did not take into account volume-backed instances
being resized will not have a root disk *and* if the local disk is
shared storage, removing the instance directory effectively deletes
the instance files, like the console.log, on the destination host
as well. Change I29fac80d08baf6
to resolve that issue.
However (see the pattern?), if you're doing a resize of a
volume-backed instance that is not on shared storage, we won't remove
the instance directory from the source host in _cleanup_resize. If the
admin then later tries to live migrate the instance back to that host,
it will fail with DestinationDisk
method.
This change is essentially a revert of
I29fac80d08
Ic683f83e42
is that creating certain imagebackend objects will recreate the
instance directory and disk.info on the source host, we simply need
to avoid creating the imagebackend object. The only reason we are
getting an imagebackend object in _cleanup_resize is to remove
image snapshot clones, which is only implemented by the Rbd image
backend. Therefore, we can check to see if the image type supports
clones and if not, don't go through the imagebackend init routine
that, for some, will recreate the disk.
Change-Id: Ib10081150e1259
Closes-Bug: #1769131
Related-Bug: #1666831
Related-Bug: #1728603
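The idea in this commit message, skipping the image backend init routine unless the image type supports clones, can be sketched as below. This is an assumption-laden illustration rather than the nova implementation: the SUPPORTS_CLONE table, init_image_backend and cleanup_resize_snapshots are hypothetical names, and the backend is a stand-in that only records that it was initialized.

```python
# Hypothetical capability table. Per the commit message, snapshot clone
# removal is only implemented by the Rbd image backend.
SUPPORTS_CLONE = {"rbd": True, "qcow2": False, "flat": False, "ploop": False}

# Records which image types had their backend initialized.
init_calls = []


def init_image_backend(image_type):
    """Stand-in for the backend init routine.

    In the scenario described above, this step is what recreates the
    instance directory and disk.info for some image types.
    """
    init_calls.append(image_type)
    return {"type": image_type}


def cleanup_resize_snapshots(image_type):
    """Return True if snapshot cleanup was attempted for this image type."""
    # Check the capability first so that backends without clone support
    # are never initialized during cleanup.
    if not SUPPORTS_CLONE.get(image_type, False):
        return False
    backend = init_image_backend(image_type)
    # Snapshot clone removal on `backend` would happen here.
    return backend is not None


qcow2_result = cleanup_resize_snapshots("qcow2")
rbd_result = cleanup_resize_snapshots("rbd")
```

Checking the capability before constructing the backend keeps the side effect away from image types that have nothing to clean up.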
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/queens
commit 174764340d3c965
Author: Matt Riedemann <email address hidden>
Date: Fri May 4 12:58:07 2018 -0400
Conflicts:
NOTE(mriedem): The conflict is due to not having change
Icdd039bb43
Change-Id: Ib10081150e1259
Closes-Bug: #1769131
Related-Bug: #1666831
Related-Bug: #1728603
(cherry picked from commit 8e3385707cb1ced
tags: | added: in-stable-queens |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/pike
commit c72a0a7665e9621
Author: Matt Riedemann <email address hidden>
Date: Fri May 4 12:58:07 2018 -0400
(cherry picked from commit 8e3385707cb1ced
(cherry picked from commit 174764340d3c965
tags: | added: in-stable-pike |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/ocata
commit a16fa14ce47bd2d
Author: Matt Riedemann <email address hidden>
Date: Fri May 4 12:58:07 2018 -0400
(cherry picked from commit 8e3385707cb1ced
(cherry picked from commit 174764340d3c965
(cherry picked from commit c72a0a7665e9621
tags: | added: in-stable-ocata |
The referenced change was backported to ocata so marking this as affecting pike and ocata:
https://review.openstack.org/#/c/441037/