Ceph VM images leak on instance deletion if there are snapshots of that image

Bug #1975637 reported by Andrew Bogott
Affects: OpenStack Compute (nova)
Status: In Progress
Importance: Undecided
Assigned to: Unassigned

Bug Description

Description
===========

I'm using backy2 to back up instance images. For the sake of incremental backups, we keep a pending last-backed-up snapshot associated with each instance at all times.

When an instance is deleted, the rbd delete call fails silently because the image still has a snapshot, and both the VM image and its snapshot are left behind forever.

Ideally two things would be different:

1) The logs would reflect this failure
2) A config option would allow me to demand that all associated snaps are purged on instance deletion.
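To make the two requests concrete, here is a minimal sketch of the behaviour I have in mind. It is not Nova's code: `remove_image` and the `purge_snaps` flag are hypothetical names, and the `rbd` module is passed in as `rbd_mod` only so the function can be exercised without a cluster. The calls themselves (`RBD().remove`, `Image.list_snaps`, `is_protected_snap`, `unprotect_snap`, `remove_snap`, `ImageHasSnapshots`) are from the Ceph Python bindings.

```python
import logging

LOG = logging.getLogger(__name__)


def remove_image(rbd_mod, ioctx, name, purge_snaps=False):
    """Remove an RBD image; return True if deleted, False if left behind.

    rbd_mod is the imported `rbd` module. purge_snaps is a hypothetical
    opt-in flag standing in for the requested config option.
    """
    try:
        rbd_mod.RBD().remove(ioctx, name)
        return True
    except rbd_mod.ImageHasSnapshots:
        if not purge_snaps:
            # Request 1: make the failure visible instead of swallowing it.
            LOG.warning("rbd image %s still has snapshots; not deleted", name)
            return False

    # Request 2 (opt-in): drop every snapshot, then retry the removal once.
    with rbd_mod.Image(ioctx, name) as image:
        for snap in list(image.list_snaps()):
            snap_name = snap["name"]
            if image.is_protected_snap(snap_name):
                # Unprotecting raises if the snapshot still has clones;
                # this sketch lets that error propagate.
                image.unprotect_snap(snap_name)
            image.remove_snap(snap_name)
    rbd_mod.RBD().remove(ioctx, name)
    return True
```

With a real cluster this would be called as `remove_image(rbd, ioctx, "<image name>", purge_snaps=True)` after `import rbd`. Purging is destructive, so it should stay off by default.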

I thought I had a weirdo edge use-case but I see at least one other user encountering this same leakage, here: https://heiterbiswolkig.blogs.nde.ag/2019/03/07/orphaned-instances-part-2/

(I'm running Wallaby, but the code is the same on the current git head.)

Steps to reproduce
==================

* Create an instance backed with ceph/rbd
* Take a snapshot of that instance (in my case, I'm doing this out of band using rbd commands, not a nova API)
* Delete the instance
* Note that the volume is still present on the storage backend

Expected result
===============

* Log messages should announce a failure to delete, or
* (optionally) server volume is actually deleted

Actual result
=============

* Logfile is silent about failure to delete
* server volume is leaked and lives on forever, invisible to Nova
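Since the leaked images are invisible to Nova, they have to be found by comparing the pool contents against the list of live instances. The following is a rough sketch of that comparison. It assumes that instance disks in the pool are named `<instance-uuid>_disk` (optionally with a suffix such as `.config`); verify that against your own pool before relying on it. `find_leaked_images` is a hypothetical helper, not part of Nova or Ceph.

```python
import re

# Assumption: instance disks are named "<instance-uuid>_disk", optionally
# followed by a suffix such as ".config". Other images are ignored.
_DISK_RE = re.compile(
    r"^([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})"
    r"_disk(\..+)?$"
)


def find_leaked_images(image_names, live_instance_uuids):
    """Return the disk images whose instance UUID is not a live instance."""
    live = {uuid.lower() for uuid in live_instance_uuids}
    leaked = []
    for name in image_names:
        match = _DISK_RE.match(name)
        if match and match.group(1) not in live:
            leaked.append(name)
    return sorted(leaked)
```

The image names could come from `rbd.RBD().list(ioctx)` and the UUIDs from `openstack server list --all-projects`. Treat the result as a list of candidates to inspect by hand, not as a list that is safe to delete.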

Environment
===========

OpenStack Wallaby installed from Debian backports (BPO) on Debian Bullseye

# dpkg --list | grep nova
ii nova-common 2:23.1.0-2~bpo11+1 all OpenStack Compute - common files
ii nova-compute 2:23.1.0-2~bpo11+1 all OpenStack Compute - compute node
ii nova-compute-kvm 2:23.1.0-2~bpo11+1 all OpenStack Compute - compute node (KVM)
ii python3-nova 2:23.1.0-2~bpo11+1 all OpenStack Compute - libraries
ii python3-novaclient 2:17.4.0-1~bpo11+1 all client library for OpenStack Compute API - 3.x

The same issue is present in the latest git code.

OpenStack Infra (hudson-openstack) wrote: Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/843228

Changed in nova:
status: New → In Progress