resizing a stopped server fails with xenapi

Bug #1308064 reported by Matt Riedemann
Affects                   Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)  Fix Released  High        Unassigned

Bug Description

I wrote a Tempest test for making sure that resizing a stopped server works since the API supports that scenario, and Jenkins passes (with the libvirt driver) but XenServer CI failed on the new tests:

http://dd6b71949550285df7dc-dda4e480e005aaa13ec303551d2d8155.r49.cf1.rackcdn.com/12/87312/3/testr_results.html.gz

Traceback (most recent call last):
  File "tempest/api/compute/servers/test_server_actions.py", line 220, in test_resize_server_confirm_from_stopped
    self._test_resize_server_confirm(stop=True)
  File "tempest/api/compute/servers/test_server_actions.py", line 201, in _test_resize_server_confirm
    self.client.wait_for_server_status(self.server_id, 'VERIFY_RESIZE')
  File "tempest/services/compute/json/servers_client.py", line 168, in wait_for_server_status
    raise_on_error=raise_on_error)
  File "tempest/common/waiters.py", line 89, in wait_for_server_status
    raise exceptions.TimeoutException(message)
TimeoutException: Request timed out
Details: Server a3bdc31f-c8db-4751-af96-db9e17ce744c failed to reach VERIFY_RESIZE status and task state "None" within the required time (196 s). Current status: ACTIVE. Current task state: None.
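
For readers unfamiliar with how this kind of timeout arises, here is a minimal sketch of a status-polling waiter. It is not Tempest's actual wait_for_server_status; the function name, the get_server callable, and the injectable clock and sleep parameters are all assumptions made for illustration. It shows why a resize that quietly goes back to ACTIVE with no task state surfaces only as a timeout.

```python
import time


class TimeoutException(Exception):
    """Raised when the server does not reach the wanted status in time."""


def wait_for_status(get_server, wanted, timeout=196.0, interval=1.0,
                    clock=time.monotonic, sleep=time.sleep):
    """Poll until the server reports `wanted` with no task in flight.

    `get_server` is a hypothetical callable returning a dict with
    'status' and 'task_state' keys; it stands in for an API call.
    """
    start = clock()
    while True:
        server = get_server()
        status = server["status"]
        task_state = server.get("task_state")
        if status == wanted and task_state is None:
            return server
        if clock() - start >= timeout:
            # The waiter cannot tell "still working" from "rolled back";
            # it only reports the last status it saw.
            raise TimeoutException(
                f"Server failed to reach {wanted} status within "
                f"{timeout} s. Current status: {status}. "
                f"Current task state: {task_state}."
            )
        sleep(interval)
```

A server that stays ACTIVE with task state None never satisfies the condition, so the loop runs until the deadline and raises, which matches the shape of the failure above.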

Tags: xenserver
Changed in nova:
status: New → Incomplete
status: Incomplete → Triaged
importance: Undecided → High
John Garbutt (johngarbutt) wrote :

Hmm, looks like this could be an environment issue with the XenServer CI.

Either way, looks like it should be investigated a little more from the XenServer CI point of view.

Bob Ball (bob-ball) wrote :

Why do you suspect the CI rather than a nova issue for this?

Matt Riedemann (mriedem) wrote :

I am seeing this in the n-cpu log:

"VM already halted, skipping shutdown"

That message is logged when confirm_migration is called on the driver and it tries to shut down the VM. In this case the VM is already shut down, which shouldn't cause any failures, though it probably shouldn't be logged as a warning here.
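
As a rough illustration of that suggestion, the sketch below treats shutdown as idempotent and lowers the log level when the caller already expects the VM to be halted. This is not the XenAPI driver's code: the clean_shutdown name, the expected_halted flag, and the dict-based VM record are all hypothetical.

```python
import logging

LOG = logging.getLogger(__name__)


def clean_shutdown(vm, expected_halted=False):
    """Shut down `vm` unless it is already halted.

    `vm` is a hypothetical dict with a 'power_state' key. Returns True
    if a shutdown was performed, False if it was skipped.
    """
    if vm["power_state"] == "Halted":
        # Resizing a stopped server makes this case normal, so only
        # warn when the caller did not expect a halted VM.
        level = logging.DEBUG if expected_halted else logging.WARNING
        LOG.log(level, "VM already halted, skipping shutdown")
        return False
    vm["power_state"] = "Halted"
    return True
```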

Bob Ball (bob-ball) wrote :

I'm seeing http://paste.openstack.org/show/76070/ as a possible cause here - although that's only being seen in my local environment with https://review.openstack.org/#/c/87958/ applied.

Introduced by https://github.com/openstack/nova/blob/master/nova/network/api.py#L473 (from https://review.openstack.org/#/c/87019/)

Matt Riedemann (mriedem) wrote :

Bob, that was fixed in rc2 with this: https://review.openstack.org/86194

Openstack Gerrit (openstack-gerrit) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/87958
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=b634bf3a0b3dbccec6c3d7b26ab2a67d8eb1a6e5
Submitter: Jenkins
Branch: master

commit b634bf3a0b3dbccec6c3d7b26ab2a67d8eb1a6e5
Author: Bob Ball <email address hidden>
Date: Wed Apr 16 13:23:45 2014 +0100

    XenAPI: Use local rsync rather than remote if possible

    Using ssh depends on the host being set up to be able to SSH into
    itself which is not a common scenario. While this is unavoidable for
    the current implementation of resize across multiple hosts, if there
    is a single host (i.e. a test scenario) or the resize is restricted
    to the same host then we can succeed without SSH access

    Dependency on Ia310e31d31aaf5c979e41c64af8223202a18e03a is because the
    tests will always fail without Ia310 therefore this fix cannot be tested
    without taking Ia310 as well.

    Closes-bug: 1308064

    Change-Id: I15802a1d97d380b1c5b74fc9f92ece2494fe789a
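
The idea in the commit message can be sketched roughly as follows. This is not the code from the merged change; the function name, parameters, and example paths are assumptions for illustration. It only shows the decision described above: copy locally when source and destination are the same host, and fall back to rsync over SSH otherwise.

```python
def build_rsync_command(src_dir, dest_dir, dest_host, local_host,
                        user="root"):
    """Return an rsync argv list for copying src_dir to dest_dir.

    All names here are hypothetical. When the destination is the host
    we are already running on, no SSH transport is requested.
    """
    if dest_host == local_host:
        # Same-host resize: a plain local copy, no SSH access needed.
        return ["rsync", "-av", src_dir, dest_dir]
    # Cross-host resize: still depends on SSH access to the destination.
    return ["rsync", "-av", "-e", "ssh", src_dir,
            f"{user}@{dest_host}:{dest_dir}"]
```

With dest_host equal to local_host the resulting command contains no SSH option at all, so a single-host test environment does not need to be able to SSH into itself.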

Changed in nova:
status: Triaged → Fix Committed
Matt Riedemann (mriedem) wrote :

Resize from stopped still doesn't seem to be working for XenServer, or else there are other infra issues:

https://review.openstack.org/#/c/87312/

Changed in nova:
status: Fix Committed → New
Matt Riedemann (mriedem) wrote :

Re-opening since this is still failing in XenServer CI with resize from stopped tests.

Bob Ball (bob-ball) wrote :

I think this bug is fixed, but I've raised bug 1317792 as a new race condition I have identified which can affect resize test cases.

Matt Riedemann (mriedem) wrote :

Bob, thanks. Yeah, it looks like the resize from stopped test I added in that one patch is passing XenServer CI now, but it is hitting another race with shelve/unshelve:

http://dd6b71949550285df7dc-dda4e480e005aaa13ec303551d2d8155.r49.cf1.rackcdn.com/12/87312/7/7848/testr_results.html.gz

So a different bug - this one is good, thanks!

Matt Riedemann (mriedem)
Changed in nova:
status: New → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-1 → 2014.2