A shelved_offload VM's volumes are still attached to a host

Bug #1547142 reported by Shoham Peller
This bug affects 1 person
Affects                   Status         Importance  Assigned to     Milestone
OpenStack Compute (nova)  Fix Released   Medium      Andrea Rosa
Ocata                     Fix Committed  Medium      Matt Riedemann
Pike                      Fix Committed  Medium      Matt Riedemann

Bug Description

When shelve_offloading a VM, the VM loses its connection to a host.
However, the host's connections to the VM's volumes are not terminated, so the volumes are still attached to that host.

Afterwards, when the VM is unshelved, nova calls initialize_connection on the new host for its volumes, and they are now connected to 2 hosts.

The correct behaviour is to call terminate_connection on the VM's volumes when it is being shelved_offloaded.
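
The sequence described above can be illustrated with a small toy model. This is only a sketch: ToyVolumeService, shelve_offload, unshelve and run are hypothetical names invented for illustration, and the two methods merely borrow the names initialize_connection and terminate_connection from the report; none of this is the real nova or cinder code.

```python
from collections import defaultdict


class ToyVolumeService:
    """Toy stand-in for the volume service (assumption, not the Cinder API)."""

    def __init__(self):
        # volume id -> set of hosts that currently hold a connection
        self.connections = defaultdict(set)

    def initialize_connection(self, volume_id, host):
        self.connections[volume_id].add(host)

    def terminate_connection(self, volume_id, host):
        self.connections[volume_id].discard(host)


def shelve_offload(service, volume_ids, host, terminate):
    """Take the instance off its host; optionally close volume connections."""
    if terminate:
        for volume_id in volume_ids:
            service.terminate_connection(volume_id, host)


def unshelve(service, volume_ids, new_host):
    """Place the instance on a new host and connect its volumes there."""
    for volume_id in volume_ids:
        service.initialize_connection(volume_id, new_host)


def run(terminate):
    service = ToyVolumeService()
    service.initialize_connection("vol-1", "host-a")  # initial boot on host-a
    shelve_offload(service, ["vol-1"], "host-a", terminate=terminate)
    unshelve(service, ["vol-1"], "host-b")
    return sorted(service.connections["vol-1"])


print(run(terminate=False))  # reported behaviour: ['host-a', 'host-b']
print(run(terminate=True))   # proposed behaviour: ['host-b']
```

With terminate=False the volume stays connected to the old host after the unshelve, which is the situation this bug describes.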

Tags: shelve volumes
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/281990

Changed in nova:
assignee: nobody → Shoham Peller (shoham-peller)
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Shoham Peller (<email address hidden>) on branch: master
Review: https://review.openstack.org/281990
Reason: Duplicates https://review.openstack.org/#/c/257275

Revision history for this message
Andrea Rosa (andrea-rosa-m) wrote :

To properly fix this bug we depend on the resolution of a cinder bug, specifically:
https://bugs.launchpad.net/cinder/+bug/1527278
It is not enough to call terminate_connection in Nova: that call doesn't correctly close the open connections on the Cinder side. We need to call a different method, "remove_export", which is not exposed by the API yet.

Changed in nova:
assignee: Shoham Peller (shoham-peller) → Andrea Rosa (andrea-rosa-m)
Revision history for this message
Andrea Rosa (andrea-rosa-m) wrote :

The resolution of this bug depends on a fix in cinder. I am not actively working on the cinder fix so this bug is stuck at the moment.

Changed in nova:
assignee: Andrea Rosa (andrea-rosa-m) → nobody
tags: added: volumes
Revision history for this message
Sivasathurappan Radhakrishnan (siva-radhakrishnan) wrote :

changing it to confirmed to avoid any confusion on the status of the bug

Changed in nova:
status: In Progress → Confirmed
Changed in nova:
importance: Undecided → Medium
Revision history for this message
Shoham Peller (shoham-peller) wrote :

@andrea-rosa-m
Isn't calling terminate_connection better than nothing? As things stand, volumes end up connected to more than 1 node in cinder.
Why not add the terminate-connection, and then add the remove_export, once it is available?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Michael Still (<email address hidden>) on branch: master
Review: https://review.openstack.org/257275
Reason: This patch has been sitting unchanged for more than 12 weeks. I am therefore going to abandon it to keep the nova review queue sane. Please feel free to restore the change if you're still working on it.

aishwarya (bkaishwarya)
Changed in nova:
assignee: nobody → aishwarya (bkaishwarya)
assignee: aishwarya (bkaishwarya) → nobody
Changed in nova:
assignee: nobody → puja (pujachowdhary)
Changed in nova:
assignee: Puja (pujachowdhary) → nobody
Revision history for this message
Sean Dague (sdague) wrote :

Found open reviews for this bug in gerrit, setting to In Progress.

review: https://review.openstack.org/257275 in branch: master

Changed in nova:
status: Confirmed → In Progress
assignee: nobody → PujaChowdhary (pujachowdhary)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Sean Dague (<email address hidden>) on branch: master
Review: https://review.openstack.org/257275
Reason: This review is > 4 weeks without comment, and is not mergeable in its current state. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Revision history for this message
Sean Dague (sdague) wrote :

There are no currently open reviews on this bug, changing the status back to the previous state and unassigning. If there are active reviews related to this bug, please include links in comments.

Changed in nova:
status: In Progress → Confirmed
assignee: PujaChowdhary (pujachowdhary) → nobody
Matt Riedemann (mriedem)
tags: added: shelve
Changed in nova:
assignee: nobody → Matt Riedemann (mriedem)
status: Confirmed → In Progress
Matt Riedemann (mriedem)
Changed in nova:
assignee: Matt Riedemann (mriedem) → Andrea Rosa (andrea-rosa-m)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/257275
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=e89e1bdc60211622440c964f8be8563da89341ac
Submitter: Jenkins
Branch: master

commit e89e1bdc60211622440c964f8be8563da89341ac
Author: Andrea Rosa <email address hidden>
Date: Thu Sep 14 13:47:06 2017 -0400

    Call terminate_connection when shelve_offloading

    When nova performs a shelve offload for an instance, it needs to terminate
    all the volume connections for that instance as with the shelve offload
    it is not guaranteed that the instance will be placed on the same host once
    it gets unshelved.
    This change adds the call to the terminate_volume_connections on the
    _shelve_offload_instance method in the compute manager.

    Closes-Bug: #1547142

    Change-Id: I8849ae0f54605e003d5b294ca3d66dcef89d7d27

Changed in nova:
status: In Progress → Fix Released
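
As a rough illustration of what the commit message above describes, the sketch below terminates the connection of every volume-backed device during a shelve offload. It is a simplified model under stated assumptions: BlockDeviceMapping, RecordingVolumeAPI, the connector dict and the function signatures are hypothetical stand-ins, and the real methods in the nova compute manager take different arguments.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class BlockDeviceMapping:
    """Hypothetical stand-in for a block device mapping record."""
    volume_id: Optional[str]  # None for non-volume devices

    @property
    def is_volume(self) -> bool:
        return self.volume_id is not None


@dataclass
class RecordingVolumeAPI:
    """Hypothetical volume API that only records the calls it receives."""
    calls: List[tuple] = field(default_factory=list)

    def terminate_connection(self, volume_id, connector):
        self.calls.append((volume_id, connector["host"]))


def terminate_volume_connections(volume_api, connector, bdms):
    # Only volume-backed devices have a connection to terminate.
    for bdm in bdms:
        if bdm.is_volume:
            volume_api.terminate_connection(bdm.volume_id, connector)


def shelve_offload_instance(volume_api, host, bdms):
    connector = {"host": host}  # assumed minimal connector description
    # The guest would be removed from the hypervisor before this point.
    terminate_volume_connections(volume_api, connector, bdms)
    # After the offload the instance is no longer bound to any host.
    return None


api = RecordingVolumeAPI()
bdms = [
    BlockDeviceMapping("vol-1"),
    BlockDeviceMapping(None),
    BlockDeviceMapping("vol-2"),
]
new_host = shelve_offload_instance(api, "host-a", bdms)
print(api.calls)
```

The connections are closed during the offload because, as the commit message notes, the instance is not guaranteed to land on the same host when it is unshelved.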
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/504270

Revision history for this message
Matt Riedemann (mriedem) wrote :

To come back to some earlier discussion about this bug: we discussed this with the cinder team at the Queens PTG, and it is OK to call terminate connection during shelve offload. Notes are here:

https://etherpad.openstack.org/p/cinder-ptg-queens

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/504273

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/504270
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8365eb6cb987c834b1f35c04be13aa97db36a4a1
Submitter: Jenkins
Branch: stable/pike

commit 8365eb6cb987c834b1f35c04be13aa97db36a4a1
Author: Andrea Rosa <email address hidden>
Date: Thu Sep 14 13:47:06 2017 -0400

    Call terminate_connection when shelve_offloading

    When nova performs a shelve offload for an instance, it needs to terminate
    all the volume connections for that instance as with the shelve offload
    it is not guaranteed that the instance will be placed on the same host once
    it gets unshelved.
    This change adds the call to the terminate_volume_connections on the
    _shelve_offload_instance method in the compute manager.

    Closes-Bug: #1547142

    Change-Id: I8849ae0f54605e003d5b294ca3d66dcef89d7d27
    (cherry picked from commit e89e1bdc60211622440c964f8be8563da89341ac)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.1

This issue was fixed in the openstack/nova 16.0.1 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.0.0b1

This issue was fixed in the openstack/nova 17.0.0.0b1 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/ocata)

Reviewed: https://review.openstack.org/504273
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=286fa12ad128ad22d2e9c5002c54dfdb54faac16
Submitter: Zuul
Branch: stable/ocata

commit 286fa12ad128ad22d2e9c5002c54dfdb54faac16
Author: Andrea Rosa <email address hidden>
Date: Thu Sep 14 13:47:06 2017 -0400

    Call terminate_connection when shelve_offloading

    When nova performs a shelve offload for an instance, it needs to terminate
    all the volume connections for that instance as with the shelve offload
    it is not guaranteed that the instance will be placed on the same host once
    it gets unshelved.
    This change adds the call to the terminate_volume_connections on the
    _shelve_offload_instance method in the compute manager.

    Closes-Bug: #1547142

    Conflicts:
          nova/tests/unit/compute/test_shelve.py

    NOTE(mriedem): The conflicts in the test are just due to not having
    resource allocation cleanup for placement in shelve offload in Ocata.

    Change-Id: I8849ae0f54605e003d5b294ca3d66dcef89d7d27
    (cherry picked from commit e89e1bdc60211622440c964f8be8563da89341ac)
    (cherry picked from commit 8365eb6cb987c834b1f35c04be13aa97db36a4a1)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 15.1.0

This issue was fixed in the openstack/nova 15.1.0 release.
