Guest VM is stuck in RESIZE state randomly, never going to VERIFY_RESIZE as expected; Connection aborted, Remote end closed connection without response
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
OpenStack Nova Cloud Controller Charm | Invalid | Undecided | Unassigned | | |
Bug Description
== Environment
focal/ussuri + ovn, latest stable charms
juju status: https:/
Hardware: Huawei CH121 V5 with MZ532,4*25GE Mezzanine Card,PCIE 3.0 X16 NICs
External storage: Huawei OceanStor Dorado 6000 V6
juju crashdump: https:/
== Problem
As a part of the cloud testing, we are launching a set of Rally tests, including NovaServers.
Currently, after a server resize is invoked, the guest sometimes remains in the RESIZE state and never reaches VERIFY_RESIZE as the test expects [1]. Another theory is that the guest does reach VERIFY_RESIZE but the test cannot reach ncc to verify it. The test fails with the following traceback:
Traceback (most recent call last):
File "/snap/
getattr(
File "/snap/
self.
File "/snap/
f = func(self, *args, **kwargs)
File "/snap/
check_
File "/snap/
resource = update_
File "/snap/
raise exceptions.
rally.exception
[0] https:/
[1] https:/
The Rally log is provided for timestamp reference; the crashdump will be attached as soon as we get it from the environment.
2021-09-21 19:54:08.331 1051375 INFO rally.task.runner [-] Task 084a8dd3-
t the resource <Server: s_rally_
8-4869-
2021-09-21 19:54:08.333 1051375 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 19:55:27.800 1051377 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 19:55:27.802 1051377 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 19:56:37.150 1051375 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 19:56:37.152 1051375 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 19:57:41.926 1051377 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 19:57:41.928 1051377 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 19:58:24.976 1051375 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 19:58:24.978 1051375 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 19:59:30.661 1051375 INFO rally.task.runner [-] Task 084a8dd3-
t the resource <Server: s_rally_
9-475e-
2021-09-21 19:59:30.663 1051375 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 20:00:09.757 1051377 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 20:00:09.759 1051377 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 20:00:38.332 1051375 INFO rally.task.runner [-] Task 084a8dd3-
t the resource <Server: s_rally_
7-49ea-
2021-09-21 20:00:38.334 1051375 INFO rally.task.runner [-] Task 084a8dd3-
2021-09-21 20:01:12.398 1051377 INFO rally.task.runner [-] Task 084a8dd3-
t the resource <Server: s_rally_
8-4bd5-
2021-09-21 20:01:37.845 1051375 INFO rally.task.runner [-] Task 084a8dd3-
et the resource <Server: s_rally_
e9-4582-
2021-09-21 20:01:37.862 1050774 INFO rally.task.context [-] Task 084a8dd3-
== Steps tried to reproduce the issue manually (note: I was unable to reproduce it by hand; I will try replicating Rally's behaviour)
openstack server create --boot-from-volume 30 --image auto-sync/
for x in $(seq 1 20); do openstack server resize test-resize-$x --flavor m1.large; done
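The loop above only issues the resize requests; it does not check where each guest ends up. A minimal sketch of a polling helper is given below, assuming the same `test-resize-$x` server names and a working `openstack` CLI session. The function names `get_status` and `wait_for_status`, the attempt count and the interval are hypothetical choices for illustration, not Rally's actual implementation. It reports failed API calls separately from a guest that stays in RESIZE, which might help tell the two theories from the problem description apart.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fetch the current status of one server via the CLI.
get_status() {
  openstack server show "$1" -f value -c status
}

# wait_for_status NAME WANT [ATTEMPTS] [INTERVAL_SECONDS]
# Returns 0 once NAME reports WANT, 2 if it reports ERROR,
# and 1 if the attempts run out (stuck guest or unreachable API).
wait_for_status() {
  local name=$1 want=$2 attempts=${3:-60} interval=${4:-5}
  local i status="UNKNOWN"
  for (( i = 1; i <= attempts; i++ )); do
    if status=$(get_status "$name" 2>/dev/null); then
      [[ $status == "$want" ]] && return 0
      [[ $status == "ERROR" ]] && { echo "$name: ERROR" >&2; return 2; }
    else
      # The API call itself failed (e.g. connection aborted); log it
      # with a timestamp so it can be matched against the service logs.
      status="API_UNREACHABLE"
      echo "$(date -u +%FT%TZ) $name: API call failed on attempt $i" >&2
    fi
    sleep "$interval"
  done
  echo "$name: still $status after $attempts attempts" >&2
  return 1
}

# Example usage (assumes the servers created above):
#   for x in $(seq 1 20); do wait_for_status "test-resize-$x" VERIFY_RESIZE; done
```

If the final message shows `RESIZE`, the guest never left that state; if it shows `API_UNREACHABLE`, the last poll could not reach the API at all.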
crashdump: https://drive.google.com/file/d/1oZ6K8PW2euEKgKyaqtq2bdr8gIpznvX5/view?usp=sharing