ServerMovingTests.test_evacuate sometimes fails but not always

Bug #1710509 reported by Chris Dent
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Undecided
Assigned to: Chris Dent

Bug Description

The newly added test_evacuate test in ServerMovingTests is slightly racy. It seems to fail about 1 time in 10. A recent failure is at http://logs.openstack.org/72/489772/2/gate/gate-nova-tox-functional-py35-ubuntu-xenial/07f4a29/console.html#_2017-08-12_12_51_52_867765

Will look into this more closely tomorrow when I've got time, and add elastic recheck entry etc, but wanted to get it written down.

Chris Dent (cdent) wrote :

It looks like the most likely thing is that the loop with a sleep in _wait_for_state_change doesn't wait long enough.

jichenjc (jichenjc) wrote :

I am *guessing* that between the two lines linked below, the instance is ACTIVE but the host has not changed yet,
so we could lengthen the loop time, but would it be better to add another check?

https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2791

https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L2927

OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/493448

Changed in nova:
assignee: nobody → jichenjc (jichenjc)
status: New → In Progress
Changed in nova:
assignee: jichenjc (jichenjc) → Chris Dent (cdent)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/493448
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=548e93c4b45df05bb0352cc3ef6554d0f65870bd
Submitter: Jenkins
Branch: master

commit 548e93c4b45df05bb0352cc3ef6554d0f65870bd
Author: jichenjc <email address hidden>
Date: Tue Aug 15 01:16:16 2017 +0800

    Avoid race in test_evacuate

    We want the host to be the destination and the status to be active so
    use _wait_for_server_parameter to wait on both of those, rather than
    waiting only for the server to change to ACTIVE.

    Without this, we sometimes see the server go ACTIVE but the host has
    not yet changed.

    Co-Authored-By: Balazs Gibizer <email address hidden>
    Co-Authored-By: Chris Dent <email address hidden>
    Change-Id: I273998ebc03f3a832cc44787a5c2396da58e5e25
    Closes-Bug: 1710509
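
The race described in the commit message can be illustrated with a small, self-contained sketch. This is not nova's test code: `FakeServerAPI`, `wait_for_server_parameters`, the plain `host` key, and the host names are all hypothetical stand-ins chosen for illustration, and the real `_wait_for_server_parameter` helper may have a different signature. The sketch only shows why polling on status alone can return too early, while polling on status and host together does not.

```python
import time


class FakeServerAPI:
    """Hypothetical stand-in for the compute API used by a functional test.

    Each get_server() call returns the next snapshot, mimicking an
    evacuation where status becomes ACTIVE before the host is updated.
    """

    def __init__(self, snapshots):
        self._snapshots = list(snapshots)
        self._index = 0

    def get_server(self, server_id):
        # Stay on the last snapshot once the sequence is exhausted.
        position = min(self._index, len(self._snapshots) - 1)
        self._index += 1
        return dict(self._snapshots[position], id=server_id)


def wait_for_server_parameters(api, server_id, expected,
                               max_retries=10, delay=0.01):
    """Poll until every key in `expected` matches, or fail after max_retries."""
    for _ in range(max_retries):
        server = api.get_server(server_id)
        if all(server.get(key) == value for key, value in expected.items()):
            return server
        time.sleep(delay)
    raise AssertionError(
        "server %s never reached %r" % (server_id, expected))


# Assumed sequence of states during an evacuation (illustrative only).
snapshots = [
    {"status": "REBUILD", "host": "host1"},
    {"status": "ACTIVE", "host": "host1"},  # racy window: ACTIVE, old host
    {"status": "ACTIVE", "host": "host2"},
]

# Waiting on status alone can return during the racy window.
racy = wait_for_server_parameters(
    FakeServerAPI(snapshots), "fake-id", {"status": "ACTIVE"})

# Waiting on both keys keeps polling until the host has moved as well.
safe = wait_for_server_parameters(
    FakeServerAPI(snapshots), "fake-id",
    {"status": "ACTIVE", "host": "host2"})
```

In this sketch `racy` still reports the source host, while `safe` reports the destination. Waiting on the full set of expected values also avoids tuning the sleep interval, which would only make the race less likely rather than remove it.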

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/494624

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/pike)

Reviewed: https://review.openstack.org/494624
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3c9980468383434ce7a1ded44e6125c44c2cde24
Submitter: Jenkins
Branch: stable/pike

commit 3c9980468383434ce7a1ded44e6125c44c2cde24
Author: jichenjc <email address hidden>
Date: Tue Aug 15 01:16:16 2017 +0800

    Avoid race in test_evacuate

    We want the host to be the destination and the status to be active so
    use _wait_for_server_parameter to wait on both of those, rather than
    waiting only for the server to change to ACTIVE.

    Without this, we sometimes see the server go ACTIVE but the host has
    not yet changed.

    Co-Authored-By: Balazs Gibizer <email address hidden>
    Co-Authored-By: Chris Dent <email address hidden>
    Change-Id: I273998ebc03f3a832cc44787a5c2396da58e5e25
    Closes-Bug: 1710509
    (cherry picked from commit 548e93c4b45df05bb0352cc3ef6554d0f65870bd)

tags: added: in-stable-pike
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 16.0.0.0rc2

This issue was fixed in the openstack/nova 16.0.0.0rc2 release candidate.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.0.0b1

This issue was fixed in the openstack/nova 17.0.0.0b1 development milestone.

OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Balazs Gibizer (<email address hidden>) on branch: master
Review: https://review.opendev.org/494458
Reason: This patch is stale. Feel free to restore it (or ask gibi on IRC to do so) if you are still working on this.
