Reviewed: https://review.openstack.org/103161
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=9a01e62693a28a73120544b27ee2104558e0250e
Submitter: Jenkins
Branch: master
commit 9a01e62693a28a73120544b27ee2104558e0250e
Author: Matt Riedemann <email address hidden>
Date: Fri Jun 27 06:46:10 2014 -0700
Enforce task_state is None in ec2 create_image stop instance wait loop
We're hitting races in the gate where the instance.vm_state is STOPPED
but the task_state is POWERING_OFF so when the compute_api.start method
is called, we're in an invalid task state and fail.
The compute manager's stop_instance method is correctly setting the
vm_state to STOPPED and the task_state to None when the instance is
powered off via the virt driver, so we must be hitting this from races
in the ec2 API as noted in the TODO above the method definition.
This change simply checks the task_state in addition to the vm_state
in the wait loop before continuing. The error message is also updated
to include the instance uuid, vm_state and task_state for context, and
it no longer reports the timeout value, since that value was in
milliseconds, not seconds, to begin with.
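
For illustration only, a wait loop that enforces both conditions might look
roughly like the sketch below. This is not the code from the change itself:
the function name, the dict-based instance, the state string values, the
default timeout and the injectable clock/sleep parameters are all assumptions
made to keep the example self-contained.

```python
import time

# Assumed stand-ins for the vm_state / task_state values named in the text.
STOPPED = "stopped"
POWERING_OFF = "powering-off"


class StopTimeout(Exception):
    """Raised when the instance does not settle before the deadline."""


def wait_for_stopped(get_instance, timeout=3600.0, interval=1.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll until vm_state is STOPPED *and* task_state is None.

    get_instance is a hypothetical callable that returns a fresh dict with
    'uuid', 'vm_state' and 'task_state' keys on every call.
    """
    deadline = clock() + timeout
    while True:
        instance = get_instance()
        # Checking vm_state alone is the race: the instance can already be
        # STOPPED while task_state is still POWERING_OFF.
        if (instance["vm_state"] == STOPPED
                and instance["task_state"] is None):
            return instance
        if clock() >= deadline:
            # Report the identifiers and states rather than the timeout.
            raise StopTimeout(
                "Instance %(uuid)s did not stop in time: "
                "vm_state=%(vm_state)s, task_state=%(task_state)s"
                % instance)
        sleep(interval)
```

Only once such a loop returns would it be safe for the caller to go on and
start the instance again, since the task_state is then known to be None.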
There is already a unit test that covers this change (which was racing,
hence the bug). There are no changes to that unit test since it's really
an integration test that's running through the compute API and compute
manager code, so the fix tests itself.
Closes-Bug: #1334345
Change-Id: I13f0c743cadda6439ae15607a9ef6e4e4985626d