Volume not removed on instance deletion

Bug #1834659 reported by François Palin
Affects                   Status        Importance  Assigned to       Milestone
OpenStack Compute (nova)  Fix Released  Medium      François Palin
Queens                    Fix Released  Undecided   Unassigned
Rocky                     Fix Released  Undecided   Unassigned
Stein                     Fix Released  Undecided   Unassigned
Train                     Fix Released  Undecided   Unassigned

Bug Description

Description
===========
When we deploy a non-ephemeral instance (i.e. creating a new volume), set
"Delete Volume on Instance Delete" to "YES", and then delete the instance,
the volume is not removed if the volume driver's terminate connection call
in cinder takes too long to return.

The volume status remains "In-use" and "Attached to None on /dev/vda".
For example:
abcfa1db-1748-4f04-9a29-128cf22efcc5 - 130GiB In-use - Attached to None on /dev/vda

Steps to reproduce
==================
Please refer to comment #2 below.

Expected result
===============
Volume gets removed

Actual result
=============
Volume remains attached

Environment
===========
Issue was initially reported downstream against the Newton release (see
comment #1 below). The customer was using the hitachi volume driver:
   volume_driver = cinder.volume.drivers.hitachi.hbsd.hbsd_fc.HBSDFCDriver
As a note, the hitachi drivers are unsupported as of Pike (see cinder
commit 595c8d3f8523a9612ccc64ff4147eab993493892).

Issue was reproduced in a devstack environment running the Stein release.
The volume driver used was lvm (the devstack default).

Revision history for this message
François Palin (francois.palin) wrote :

This is a clone of the following downstream bug: https://bugzilla.redhat.com/show_bug.cgi?id=1622072

Revision history for this message
François Palin (francois.palin) wrote :

I was able to reproduce the issue in a Stein devstack setup by putting a
time.sleep(180) call in cinder.volume.drivers.lvm, method terminate_connection
(see the sketch below).
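For reference, a minimal sketch of that injected delay (assuming the devstack
lvm driver in cinder/volume/drivers/lvm.py; everything except the added sleep
is the driver's existing logic, elided here):

    import time

    # Patch sketch: inside cinder/volume/drivers/lvm.py, class LVMVolumeDriver
    def terminate_connection(self, volume, connector, **kwargs):
        # Added for reproduction only: stall long enough for nova's
        # cinder API timeout to expire before the driver returns.
        time.sleep(180)
        # ... the driver's original terminate_connection body follows ...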

stack@fpalin-devstack:~$ openstack server list
+--------------------------------------+----------+--------+---------------------------------------------------------+-------+---------+
| ID                                   | Name     | Status | Networks                                                | Image | Flavor  |
+--------------------------------------+----------+--------+---------------------------------------------------------+-------+---------+
| aa68525e-6541-404a-bde1-5d21dbc3b1fb | test_VM3 | ACTIVE | private=fd3a:ef1e:3d05:0:f816:3eff:fe92:4f1d, 10.0.0.20 |       | m1.nano |
+--------------------------------------+----------+--------+---------------------------------------------------------+-------+---------+
stack@fpalin-devstack:~$
stack@fpalin-devstack:~$ openstack volume list
+--------------------------------------+------+--------+------+-----------------------------------+
| ID                                   | Name | Status | Size | Attached to                       |
+--------------------------------------+------+--------+------+-----------------------------------+
| ece0973b-9940-4849-a99a-71e6b62e46f7 |      | in-use |    1 | Attached to test_VM3 on /dev/vda  |
+--------------------------------------+------+--------+------+-----------------------------------+
stack@fpalin-devstack:~$

stack@fpalin-devstack:~$ openstack server delete test_VM3
stack@fpalin-devstack:~$
stack@fpalin-devstack:~$ openstack server list
+--------------------------------------+----------+--------+----------+-------+---------+
| ID                                   | Name     | Status | Networks | Image | Flavor  |
+--------------------------------------+----------+--------+----------+-------+---------+
| aa68525e-6541-404a-bde1-5d21dbc3b1fb | test_VM3 | ACTIVE |          |       | m1.nano |
+--------------------------------------+----------+--------+----------+-------+---------+
stack@fpalin-devstack:~$ openstack volume list
+--------------------------------------+------+--------+------+-----------------------------------+
| ID                                   | Name | Status | Size | Attached to                       |
+--------------------------------------+------+--------+------+-----------------------------------+
| ece0973b-9940-4849-a99a-71e6b62e46f7 |      | in-use |    1 | Attached to test_VM3 on /dev/vda  |
+--------------------------------------+------+--------+------+-----------------------------------+
stack@fpalin-devstack:~$

############## Then, after waiting about 3 minutes, the volume remains attached, and stays that way:
stack@fpalin-devstack:~$ openstack server list

stack@fpalin-devstack:~$ openstack volume list
+--------------------------------------+------+--------+------+---------------------------------------------------------------+
| ID                                   | Name | Status | Size | Attached to                                                   |
+--------------------------------------+------+--------+------+---------------------------------------------...


Revision history for this message
François Palin (francois.palin) wrote :

For nova, we need to improve our error handling here and potentially call detach even if terminate_connection fails.
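A rough sketch of the shape of that error handling (names here are
illustrative, not nova's actual internals):

    from oslo_log import log as logging

    LOG = logging.getLogger(__name__)

    def _detach_and_cleanup(volume_api, context, volume_id, connector):
        # Hypothetical helper: even if terminate_connection fails (e.g. on
        # a cinder RPC timeout), still attempt the detach so the attachment
        # record does not leak and the volume remains deletable.
        try:
            volume_api.terminate_connection(context, volume_id, connector)
        except Exception:
            LOG.exception('terminate_connection failed for volume %s; '
                          'continuing with detach', volume_id)
        volume_api.detach(context, volume_id)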

tags: added: cinder volumes
description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/669674

Changed in nova:
assignee: nobody → François Palin (francois.palin)
status: New → In Progress
Revision history for this message
sean mooney (sean-k-mooney) wrote :

Triaging as medium: this is irritating and may cause quota failures, but it
can be worked around via the API, without admin privileges, by calling cinder
to delete the volume manually.
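For the record, a hedged sketch of that manual workaround via
python-cinderclient (the volume ID, credentials, and auth URL below are
placeholders; the equivalent cinder/openstack CLI commands work as well):

    from keystoneauth1 import identity, session
    from cinderclient import client

    auth = identity.Password(auth_url='http://controller/identity/v3',
                             username='demo', password='secret',
                             project_name='demo',
                             user_domain_id='default',
                             project_domain_id='default')
    cinder = client.Client('3', session=session.Session(auth=auth))

    # Clear the stale attachment record left behind by the failed detach,
    # then delete the volume to release the quota.
    vol = cinder.volumes.get('ece0973b-9940-4849-a99a-71e6b62e46f7')
    cinder.volumes.detach(vol)
    cinder.volumes.delete(vol)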

Changed in nova:
importance: Undecided → Medium
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/669674
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=01c334cbdd859f4e486ac2c369a4bdb3ec7709cc
Submitter: Zuul
Branch: master

commit 01c334cbdd859f4e486ac2c369a4bdb3ec7709cc
Author: Francois Palin <email address hidden>
Date: Mon Jul 8 10:12:25 2019 -0400

    Add retry to cinder API calls related to volume detach

    When shutting down an instance for which volume needs to be
    deleted, if cinder RPC timeout expires before cinder volume
    driver terminates connection, then an unknown cinder exception
    is received and the volume is not removed.

    This fix adds a retry mechanism directly in cinder API calls
    attachment_delete, terminate_connection, and detach.

    Change-Id: I3c9ae47d0ceb64fa3082a01cb7df27faa4f5a00d
    Closes-Bug: #1834659
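
A minimal sketch of that retry shape (an illustrative stand-in, not the
merged code; the actual patch wraps nova's cinder API methods and retries
only on specific cinder errors):

    import functools
    import time

    def retry_on_cinder_error(attempts=3, interval=1,
                              exceptions=(Exception,)):
        # Re-invoke the wrapped cinder call a few times before giving up,
        # so a transient failure (such as a 500 caused by an RPC timeout
        # inside cinder) does not leak the volume.
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                for attempt in range(1, attempts + 1):
                    try:
                        return func(*args, **kwargs)
                    except exceptions:
                        if attempt == attempts:
                            raise
                        time.sleep(interval)
            return wrapper
        return decorator

    @retry_on_cinder_error(attempts=3)
    def terminate_connection(context, volume_id, connector):
        ...  # the real call into cinder; retried on failure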

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/722142

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/train)

Reviewed: https://review.opendev.org/722142
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=118ee682571a4bd41c8009dbe2e47fdd1f85a630
Submitter: Zuul
Branch: stable/train

commit 118ee682571a4bd41c8009dbe2e47fdd1f85a630
Author: Francois Palin <email address hidden>
Date: Mon Jul 8 10:12:25 2019 -0400

    Add retry to cinder API calls related to volume detach

    When shutting down an instance for which volume needs to be
    deleted, if cinder RPC timeout expires before cinder volume
    driver terminates connection, then an unknown cinder exception
    is received and the volume is not removed.

    This fix adds a retry mechanism directly in cinder API calls
    attachment_delete, terminate_connection, and detach.

    Change-Id: I3c9ae47d0ceb64fa3082a01cb7df27faa4f5a00d
    Closes-Bug: #1834659
    (cherry picked from commit 01c334cbdd859f4e486ac2c369a4bdb3ec7709cc)

tags: added: in-stable-train
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/stein)

Fix proposed to branch: stable/stein
Review: https://review.opendev.org/722783

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/stein)

Reviewed: https://review.opendev.org/722783
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=1d66884af7b945dca2831c2b2ade534d87c934c9
Submitter: Zuul
Branch: stable/stein

commit 1d66884af7b945dca2831c2b2ade534d87c934c9
Author: Francois Palin <email address hidden>
Date: Mon Jul 8 10:12:25 2019 -0400

    Add retry to cinder API calls related to volume detach

    When shutting down an instance for which volume needs to be
    deleted, if cinder RPC timeout expires before cinder volume
    driver terminates connection, then an unknown cinder exception
    is received and the volume is not removed.

    This fix adds a retry mechanism directly in cinder API calls
    attachment_delete, terminate_connection, and detach.

    Change-Id: I3c9ae47d0ceb64fa3082a01cb7df27faa4f5a00d
    Closes-Bug: #1834659
    (cherry picked from commit 01c334cbdd859f4e486ac2c369a4bdb3ec7709cc)
    (cherry picked from commit 118ee682571a4bd41c8009dbe2e47fdd1f85a630)

tags: added: in-stable-stein
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/rocky)

Fix proposed to branch: stable/rocky
Review: https://review.opendev.org/725272

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/rocky)

Reviewed: https://review.opendev.org/725272
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=10bd369589ab18094b1b96afe743686473294ef5
Submitter: Zuul
Branch: stable/rocky

commit 10bd369589ab18094b1b96afe743686473294ef5
Author: Francois Palin <email address hidden>
Date: Mon Jul 8 10:12:25 2019 -0400

    Add retry to cinder API calls related to volume detach

    When shutting down an instance for which volume needs to be
    deleted, if cinder RPC timeout expires before cinder volume
    driver terminates connection, then an unknown cinder exception
    is received and the volume is not removed.

    This fix adds a retry mechanism directly in cinder API calls
    attachment_delete, terminate_connection, and detach.

    Change-Id: I3c9ae47d0ceb64fa3082a01cb7df27faa4f5a00d
    Closes-Bug: #1834659
    (cherry picked from commit 01c334cbdd859f4e486ac2c369a4bdb3ec7709cc)
    (cherry picked from commit 118ee682571a4bd41c8009dbe2e47fdd1f85a630)
    (cherry picked from commit 1d66884af7b945dca2831c2b2ade534d87c934c9)

tags: added: in-stable-rocky
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/queens)

Fix proposed to branch: stable/queens
Review: https://review.opendev.org/726508

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/queens)

Reviewed: https://review.opendev.org/726508
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0f46dbb6e413249a5bae39dc2010c928d3fd2f40
Submitter: Zuul
Branch: stable/queens

commit 0f46dbb6e413249a5bae39dc2010c928d3fd2f40
Author: Francois Palin <email address hidden>
Date: Mon Jul 8 10:12:25 2019 -0400

    Add retry to cinder API calls related to volume detach

    When shutting down an instance for which volume needs to be
    deleted, if cinder RPC timeout expires before cinder volume
    driver terminates connection, then an unknown cinder exception
    is received and the volume is not removed.

    This fix adds a retry mechanism directly in cinder API calls
    attachment_delete, terminate_connection, and detach.

    Change-Id: I3c9ae47d0ceb64fa3082a01cb7df27faa4f5a00d
    Closes-Bug: #1834659
    (cherry picked from commit 01c334cbdd859f4e486ac2c369a4bdb3ec7709cc)
    (cherry picked from commit 118ee682571a4bd41c8009dbe2e47fdd1f85a630)
    (cherry picked from commit 1d66884af7b945dca2831c2b2ade534d87c934c9)
    (cherry picked from commit 10bd369589ab18094b1b96afe743686473294ef5)

tags: added: in-stable-queens
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova queens-eol

This issue was fixed in the openstack/nova queens-eol release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova rocky-eol

This issue was fixed in the openstack/nova rocky-eol release.
