OpenStack Compute (nova)

cleanup running deleted instance with reap failed with none token context

Bug #1734025 reported by Li Xipeng on 2017-11-23

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
OpenStack Compute (nova)	Fix Released	Medium	Li Xipeng
Pike	Fix Committed	Medium	huanhongda

Bug Description

Description

When zombie instances appear (see also bug https://bugs.launchpad.net/nova/+bug/911366) and nova-compute is configured with running_deleted_instance_poll_interval = 60 and running_deleted_instance_action = reap, the nova-compute service cleans up those zombie instances. However, if an instance was booted from a volume or had volumes attached, the cleanup removes the instance but leaves its volumes in the attached state. Bootable volumes that were used to boot the instance with delete_on_termination=True also remain, still marked as attached, even though the instance no longer exists.

Steps to reproduce

1. Set running_deleted_instance_poll_interval = 60 and running_deleted_instance_action = reap.
2. Update the status of a running instance to deleted.
3. Restart the nova-compute service and wait 60 seconds.
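
As an illustration of step 1, the following minimal sketch builds the configuration fragment with Python's standard configparser module. The two option names and values come from the steps above; placing them in the [DEFAULT] section of nova.conf is an assumption of this sketch, so check it against your deployment before applying it.

```python
import configparser
import io

# Option names and values are taken from the reproduction steps.
# Assumption: both options live in the [DEFAULT] section of nova.conf.
config = configparser.ConfigParser()
config["DEFAULT"] = {
    "running_deleted_instance_poll_interval": "60",
    "running_deleted_instance_action": "reap",
}

# Render the fragment to a string instead of touching a real nova.conf.
buffer = io.StringIO()
config.write(buffer)
nova_conf_fragment = buffer.getvalue()

print(nova_conf_fragment)
```

The printed fragment can be merged by hand into the nova.conf of the compute host used for the test.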

Expected result

The bootable volume used in the test is deleted, and the volumes attached to the zombie instances are detached.

Actual result

The bootable volume used in the test remains attached and in-use, and the volumes attached to the zombie instances remain in-use and attached to those instances, which no longer exist.
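
One way to confirm the actual result is to look for volumes that are in-use but attached only to instances that no longer exist. The sketch below works on plain in-memory records; the `Volume` class, its fields, and `find_orphaned_volumes` are hypothetical names for illustration and are not part of the nova or cinder APIs.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Volume:
    """Hypothetical, simplified volume record (not a cinder object)."""
    volume_id: str
    status: str
    attached_to: List[str] = field(default_factory=list)


def find_orphaned_volumes(volumes: List[Volume],
                          existing_instance_ids: Set[str]) -> List[Volume]:
    """Return in-use volumes whose attachments all point at missing instances."""
    orphaned = []
    for volume in volumes:
        if volume.status != "in-use" or not volume.attached_to:
            continue
        # Orphaned only if none of the attached instances still exists.
        if not any(i in existing_instance_ids for i in volume.attached_to):
            orphaned.append(volume)
    return orphaned


# Made-up sample data: "zombie-1" was reaped, "server-2" still exists.
volumes = [
    Volume("vol-boot", "in-use", ["zombie-1"]),
    Volume("vol-data", "in-use", ["server-2"]),
    Volume("vol-free", "available"),
]
orphaned_ids = [v.volume_id for v in find_orphaned_volumes(volumes, {"server-2"})]
print(orphaned_ids)
```

In a real deployment the volume list and the set of existing instance IDs would have to come from the cinder and nova APIs; that lookup is deliberately left out here.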

Li Xipeng (lixipeng) on 2017-11-23

Changed in nova:
status:	New → In Progress
assignee:	nobody → Li Xipeng (lixipeng)

OpenStack Infra (hudson-openstack) on 2018-01-25

Changed in nova:
assignee:	Li Xipeng (lixipeng) → Matt Riedemann (mriedem)

Jay Pipes (jaypipes) on 2018-01-31

summary:

- clearup running deleted instance with reap failed with none token
+ cleanup running deleted instance with reap failed with none token
context

Jay Pipes (jaypipes) wrote on 2018-01-31:

does this actually happen for non-boot-from-volume instances that have volumes attached?

OpenStack Infra (hudson-openstack) wrote on 2018-02-01: Fix merged to nova (master)

Reviewed: https://review.openstack.org/522112
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ca6daf148debb9c9646fcf6db9660c830da5a594
Submitter: Zuul
Branch: master

commit ca6daf148debb9c9646fcf6db9660c830da5a594
Author: lixipeng <email address hidden>
Date: Wed Nov 22 12:03:58 2017 +0800

Fix bug case by none token context

    When set reclaim_instance_interval > 0, and then delete an
    instance which booted from volume with `delete_on_termination`
    set as true. After reclaim_instance_interval time pass,
    all volumes boot instance will with state: attached and in-use,
    but attached instances was deleted.

    This bug case as admin context from
    `nova.compute.manager._reclaim_queued_deletes` did not have
    any token info, then call cinder api would be failed.

    So add user/project CONF with admin role at cinder group,
    and when determine context is_admin and without token, do
    authenticaion with user/project info to call cinder api.

    Change-Id: I3c35bba43fee81baebe8261f546c1424ce3a3383
    Closes-Bug: #1733736
    Closes-Bug: #1734025
    Partial-Bug: #1736773
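
The decision described in the commit message can be sketched roughly as follows. This is not the code from the nova change: `Context`, `ServiceCredentials`, and `select_cinder_auth` are hypothetical names, and the sketch only models the choice between a request token and configured admin credentials, assuming that the periodic task's admin context carries no token.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass
class Context:
    """Hypothetical stand-in for a request context."""
    is_admin: bool
    auth_token: Optional[str] = None


@dataclass
class ServiceCredentials:
    """Hypothetical user/project pair read from configuration."""
    username: str
    project_name: str


def select_cinder_auth(
    context: Context,
    service_credentials: Optional[ServiceCredentials],
) -> Tuple[str, Union[str, ServiceCredentials]]:
    """Pick how to authenticate a call to the volume service."""
    if context.auth_token:
        # Normal API request: reuse the caller's token.
        return ("token", context.auth_token)
    if context.is_admin and service_credentials is not None:
        # Admin context without a token (for example from a periodic task):
        # fall back to the configured user/project credentials.
        return ("service_credentials", service_credentials)
    raise PermissionError("no token and no admin credentials available")


periodic_context = Context(is_admin=True)
creds = ServiceCredentials(username="nova", project_name="service")
auth_kind, _ = select_cinder_auth(periodic_context, creds)
print(auth_kind)
```

The fallback applies only to admin contexts without a token, so regular user requests keep using their own token and a non-admin context without a token is still rejected.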

Changed in nova:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2018-02-09: Fix included in openstack/nova 17.0.0.0rc1

This issue was fixed in the openstack/nova 17.0.0.0rc1 release candidate.

OpenStack Infra (hudson-openstack) wrote on 2018-09-17: Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/603044

Matt Riedemann (mriedem) on 2019-03-12

Changed in nova:
assignee:	Matt Riedemann (mriedem) → Li Xipeng (lixipeng)
importance:	Undecided → Medium

OpenStack Infra (hudson-openstack) wrote on 2019-03-22: Fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/603044
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=4d7148709c5de098141fbee12ad2e78c61e3b174
Submitter: Zuul
Branch: stable/pike

commit 4d7148709c5de098141fbee12ad2e78c61e3b174
Author: lixipeng <email address hidden>
Date: Wed Nov 22 12:03:58 2017 +0800

Fix bug case by none token context

    This bug case as admin context from
    `nova.compute.manager._reclaim_queued_deletes` did not have
    any token info, then call cinder api would be failed.

    So add user/project CONF with admin role at cinder group,
    and when determine context is_admin and without token, do
    authenticaion with user/project info to call cinder api.

    Conflicts:
        nova/volume/cinder.py
        nova/tests/unit/test_cinder.py

    NOTE(huanhongda): The conflict is due to not having change
    Ifc01dbf98545104c998ab96f65ff8623a6db0f28 in Pike.

    Change-Id: I3c35bba43fee81baebe8261f546c1424ce3a3383
    Closes-Bug: #1733736
    Closes-Bug: #1734025
    Partial-Bug: #1736773
    (cherry picked from commit ca6daf148debb9c9646fcf6db9660c830da5a594)

OpenStack Infra (hudson-openstack) wrote on 2019-04-26: Fix included in openstack/nova 16.1.8

This issue was fixed in the openstack/nova 16.1.8 release.
