evacuate test fails due to timeout waiting for evacuate to complete
| Affects | Status | Importance | Assigned to | Milestone | |
|---|---|---|---|---|---|
| OpenStack Compute (nova) | Fix Released | High | Matt Riedemann | | |
Bug Description
In the post-test hook in the nova-live-migration job where we test evacuate, we're doing the following:
1. create an image-backed server and a volume-backed server on the subnode
2. stop libvirtd on the local node
3. run evacuate to see it fail because nova-compute is disabled on the local node
4. restart libvirtd, wait for the local nova-compute service to be enabled, and then evacuate each server
In this failure, the evacuate times out because libvirtd is still unavailable on the local node after we started the evacuate:
2018-12-05 10:05:50.130 | + /opt/stack/
nova-compute on the local host is back up here:
Dec 05 10:05:49.341595 ubuntu-
The evacuate starts here:
Dec 05 10:05:54.156579 ubuntu-
After that I don't see any failures, but the evacuation doesn't complete within the 30 second timeout - maybe the timeout isn't long enough?
It looks like at the point where we time out, we're still waiting for the network-vif-plugged event from neutron:
Dec 05 10:06:04.554322 ubuntu-
The VIF is plugged here:
Dec 05 10:06:04.620986 ubuntu-
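To put the timestamps quoted above on one timeline, here is a small sketch that computes each event's offset from the start of the evacuate. The event labels are my own shorthand, and the year 2018 is assumed from the post-test hook log line, since the journal timestamps carry no year.

```python
from datetime import datetime

# Journal-style timestamps copied from the log excerpts above.
# The labels are shorthand for this sketch, not log content.
EVENTS = {
    "nova-compute back up": "Dec 05 10:05:49.341595",
    "evacuate starts": "Dec 05 10:05:54.156579",
    "waiting for network-vif-plugged": "Dec 05 10:06:04.554322",
    "VIF plugged": "Dec 05 10:06:04.620986",
}


def parse(ts: str) -> datetime:
    # Assumption: the year is 2018, taken from the hook log line.
    return datetime.strptime(f"2018 {ts}", "%Y %b %d %H:%M:%S.%f")


start = parse(EVENTS["evacuate starts"])

# Offset of each event, in seconds, relative to the start of the evacuate.
offsets = {
    name: (parse(ts) - start).total_seconds() for name, ts in EVENTS.items()
}

for name, seconds in offsets.items():
    print(f"{seconds:+8.3f}s  {name}")
```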
And we time out about a second or so later, but VIF plugging usually takes about 5 seconds to get the event back from neutron, and this is a slower OVH node, so our timeout is likely just not long enough. For comparison, tempest's compute build_timeout is 300 seconds.
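As an illustration of why the timeout value matters, here is a minimal sketch of a poll-until-ACTIVE loop with a configurable timeout. It is not the actual test_evacuate.sh code: `wait_for_status`, its parameters, and the `REBUILD` placeholder status in the usage example are assumptions made for this sketch. The 30 and 300 second values come from the description above.

```python
import time
from typing import Callable


def wait_for_status(
    get_status: Callable[[], str],
    target: str = "ACTIVE",
    timeout: float = 300.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll get_status() until it returns target or timeout seconds pass.

    Hypothetical helper: get_status stands in for whatever call reports
    the server status. Returns True on success, False on timeout.
    """
    deadline = clock() + timeout
    while True:
        if get_status() == target:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Simulated server that needs 40 seconds to become ACTIVE, driven by
    # a fake clock so the example finishes instantly.
    now = [0.0]

    def fake_sleep(seconds: float) -> None:
        now[0] += seconds

    def status() -> str:
        return "ACTIVE" if now[0] >= 40 else "REBUILD"

    for limit in (30, 300):
        now[0] = 0.0
        ok = wait_for_status(
            status, timeout=limit, sleep=fake_sleep, clock=lambda: now[0]
        )
        print(f"timeout={limit}s -> {'ACTIVE' if ok else 'timed out'}")
```

With the simulated 40 second evacuation, the 30 second limit reports a timeout while the 300 second limit succeeds, which is the kind of margin a slower node would need.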

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22test_evacuate.sh%3Aevacuate_and_wait_for_active%5C%22%20AND%20message%3A%5C%22echo%20'Timed%20out%20waiting%20for%20server%20to%20go%20to%20ACTIVE%20status'%5C%22%20AND%20filename%3A%5C%22logs%2Fdevstack-gate-post_test_hook.txt%5C%22&from=7d