Concurrent snapshot and restore operations can result in duplicate / orphan VMs

Bug #1412952 reported by Patrick Crews
This bug affects 1 person
Affects: OpenStack Heat
Status: Incomplete
Importance: Medium
Assigned to: Qiming Teng
Milestone: no-priority-tag-bugs

Bug Description

Testing on devstack with 2 concurrent users performing random snapshot and restore operations on a single stack can result in one or more 'ghost' / orphan / duplicate VMs.

I have not isolated the exact combination of events yet, but it seems that while the system is restoring a stack from a snapshot, the 'main' / current VM in the stack can itself be snapshotted. This might then prevent the auto-deletion of that VM once the restored VM is ready (just a theory).

Command line for test + further info follows.

Patrick Crews (patrick-crews) wrote :

NOTE: the test stack can be anything. This is a basic test stack created from this template:
http://git.openstack.org/cgit/openstack/heat-templates/plain/hot/F20/WordPress_Native.yaml

$ heat stack-list
+--------------------------------------+--------------+-----------------+----------------------+
| id                                   | stack_name   | stack_status    | creation_time        |
+--------------------------------------+--------------+-----------------+----------------------+
| 1a83b2e8-bb00-4b72-883a-8c84fc75264e | test-stack-0 | CREATE_COMPLETE | 2015-01-20T20:01:29Z |
+--------------------------------------+--------------+-----------------+----------------------+
$ nova list
+--------------------------------------+----------------------------------------------+--------+------------+-------------+------------------+
| ID                                   | Name                                         | Status | Task State | Power State | Networks         |
+--------------------------------------+----------------------------------------------+--------+------------+-------------+------------------+
| be5f8c3d-bb27-4536-916c-685ed686ea59 | test-stack-0-wordpress_instance-zxjcxj3xvvlz | ACTIVE | -          | Running     | private=10.0.0.2 |
+--------------------------------------+----------------------------------------------+--------+------------+-------------+------------------+
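After a test run, a check for duplicate servers in `nova list` output like the one above could be sketched roughly as follows. This is only an illustration: `find_duplicate_servers` is a hypothetical helper, and it assumes server names follow the `<stack>-<resource>-<random suffix>` pattern seen in the single row shown here.

```python
from collections import defaultdict


def find_duplicate_servers(nova_list_output):
    """Group server names from `nova list` table text by name prefix.

    Assumption: names look like '<stack>-<resource>-<random suffix>'.
    Returns only the prefixes that have more than one server.
    """
    groups = defaultdict(list)
    for line in nova_list_output.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip border lines, blank lines, and the header row.
        if len(cells) < 2 or cells[0] in ("", "ID"):
            continue
        name = cells[1]
        prefix = name.rsplit("-", 1)[0]  # drop the random suffix
        groups[prefix].append(name)
    return {p: names for p, names in groups.items() if len(names) > 1}
```

An empty result would mean each stack resource has a single server; any entry lists the candidate 'ghost' VMs for that resource.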

# START RANDOM TESTING
# clone https://github.com/pcrews/rannsaka
# cd rannsaka/rannsaka and run the command line
# 2 'users', 500 total requests - simple set of randomized heat api calls
$ time python rannsaka.py --host=http://192.168.0.5 --requests 500 -w 2 --test-file=locust_files/heat_basic_stress.py
# POST TESTING
 Name                                                  # reqs      # fails    Avg    Min     Max |  Median   req/s
--------------------------------------------------------------------------------------------------------------------------------------------
 GET /v1/81e7409a431d4974a0e5ce6130859aea/stacks          235     0(0.00%)    951     32   11595 |      45    0.80
 POST /v2.0/tokens                                         12     0(0.00%)    373    167    2098 |     180    0.00
 GET stacks/[name]/[id]/resources                          46     0(0.00%)    396     40   15616 |      47    0.10
 GET stacks/[name]/[id]/snapshots                          99     0(0.00%)    461     39   13461 |      91    0.50
 POST stacks/[name]/[id]/snapshots                         29   15(34.09%)   1986    120    8277 |     470    0.10
 POST stacks/[name]/[id]/snapshots/[restore_snapshot]      53   11(17.19%)    596     76    5653 |     180    0.10
--------------------------------------------------------------------------------------------------------------------------------------------
 Total                                                    474    26(5.49%)                                    1.60

$ heat stack-list
+--------------------------------------+--------------+-------------------+----------------------+
| id | stack_name ...


Qiming Teng (tengqim)
Changed in heat:
assignee: nobody → Qiming Teng (tengqim)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to heat (master)

Fix proposed to branch: master
Review: https://review.openstack.org/150665

Changed in heat:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to heat (master)

Reviewed: https://review.openstack.org/150665
Committed: https://git.openstack.org/cgit/openstack/heat/commit/?id=1c88c182f9bd3a5ee249fe6ca26657f9782bdcbf
Submitter: Jenkins
Branch: master

commit 1c88c182f9bd3a5ee249fe6ca26657f9782bdcbf
Author: tengqm <email address hidden>
Date: Wed Jan 28 11:17:39 2015 +0800

    Don't do snapshot when other action in progress

    According to the referenced bug report, snapshot and restore operations
    are not checking the stack status. The result is that some duplicate or
    orphan VMs left. This patch adds a check in snapshot operation that will
    reject a snapshot request when it seems unsafe.

    partial-bug: 1412952
    Change-Id: I5b5da312ac08fd5f15418dd6cf9670463180d7ad
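
The kind of check the commit message describes might look roughly like the sketch below. It is not the merged Heat code: `ActionInProgress`, `ensure_snapshot_allowed`, and the plain-string action/status values are hypothetical stand-ins, and it assumes "unsafe" simply means that the stack's current action has not finished.

```python
class ActionInProgress(Exception):
    """Raised when a stack already has an action running (hypothetical name)."""


IN_PROGRESS = "IN_PROGRESS"  # assumed status value for an unfinished action


def ensure_snapshot_allowed(stack_name, action, status):
    """Reject a snapshot request while another stack action is still running."""
    if status == IN_PROGRESS:
        raise ActionInProgress(
            "Stack %s already has an action (%s) in progress."
            % (stack_name, action)
        )


# An idle stack (e.g. CREATE / COMPLETE) passes the check silently.
ensure_snapshot_allowed("test-stack-0", "CREATE", "COMPLETE")
```

With a guard like this, a snapshot requested while a restore is still running would be refused instead of capturing the VM that the restore is about to replace.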

Steve Baker (steve-stevebaker) wrote :

Can there be an update on this bug about what is remaining to be done?

Changed in heat:
importance: Undecided → Medium
status: In Progress → Incomplete
Rico Lin (rico-lin)
Changed in heat:
milestone: none → no-priority-tag-bugs