Timeout during VM resize
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
StarlingX |
Fix Released
|
Medium
|
Lucas |
Bug Description
Brief Description
-----------------
Our VM resize/migrate tests are failing
Severity
--------
<Major: System/Feature is usable but degraded>. Running a test suite this might affect the tests executed after.
Steps to Reproduce
------------------
Try to resize VM
Expected Behavior
------------------
It should work fine
Actual Behavior
----------------
Timeout during VM resize
Reproducibility
---------------
This happens pretty often, I guess it might be 100%
System Configuration
-------
One node system, Two node system, Multi-node system, Dedicated storage
Branch/Pull Time/Commit
-------
master
Last Pass
---------
-
Timestamp/Logs
--------------
The collected logs will be attached.
[2022-01-01 08:59:51,126] 54 DEBUG MainThread conftest.
***Details: launch_instances = 'b07d5807-
create_
@mark.
def test_resize_
> vm_helper.
testcases/
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _.
keywords/
timeout=300, con_ssh=con_ssh)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _.
vm_id = 'b07d5807-
status = ['VERIFY_RESIZE', 'ERROR'], timeout = 300, check_interval = 3
fail_ok = False, con_ssh = None
auth_info = {'domain': 'Default', 'password': 'St4rlingX*', 'tenant': 'admin', 'user': 'admin'}
def wait_for_
"""
....
Args:
vm_id:
status (list|str):
fail_ok (bool):
....
Returns: The Status of the vm_id depend on what Status it is looking for
....
"""
end_time = time.time() + timeout
if isinstance(status, str):
status = [status]
....
while time.time() < end_time:
for expected_status in status:
if current_status == expected_status:
....
....
err_msg = "Timed out waiting for vm status: {}. Actual vm status: " \
if fail_ok:
return None
else:
> raise exceptions.
E utils.exception
E Details: Timed out waiting for vm status: ['VERIFY_RESIZE', 'ERROR']. Actual vm status: ACTIVE
keywords/
Test Activity
-------------
Sanity
Workaround
----------
-
tags: | added: stx.distro.openstack |
Changed in starlingx: | |
importance: | Undecided → Medium |
Changed in starlingx: | |
status: | Triaged → In Progress |
tags: |
added: in-r-stx60 removed: stx.cherrypickneeded |
I observed this behavior too, yesterday. My comments below:
The cold migrate that is failing logs errors like these on nova-compute: /paste. opendev. org/show/ 811898/
https:/
And because it fails, it returns the instance to ACTIVE state, so that it never gets on VERIFY_RESIZE and causes the tests to fail.
Resize (not as part of cold migrate) seems to fail intermittently.