We just upgraded nova to wallaby (23.2.1) in our test environment, and after the upgrade our tempest tests fail for the revert-resize tests (e.g. tempest.api.compute.servers.test_server_actions.ServerActionsTestJSON.test_resize_server_revert).
After confirming by manual testing that resize revert no longer works, we searched and found this bug report.
I can confirm exactly what Will Szumski states in this bug report; it is the exact same issue for us:
-> Resize instance to bigger flavor
-> Revert the resize
-> Wait for log event (Timeout waiting for Neutron events: [('network-vif-plugged', '4aa5c785-816b-4ccc-bd4c-54c9cf408ba4')])
-> Instance is back in VERIFY_RESIZE state
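The reproduction steps above can be sketched with the OpenStack CLI. Server, image, network, and flavor names below are placeholders, and the exact `server resize revert` subcommand form may vary with the python-openstackclient version:

```shell
# Boot a test server (names here are placeholders for your environment).
openstack server create --flavor m1.small --image cirros \
    --network private --wait test-vm

# Resize the instance to a bigger flavor; it should reach VERIFY_RESIZE.
openstack server resize --flavor m1.medium test-vm
openstack server show test-vm -f value -c status

# Revert the resize. With this bug, nova-compute logs
# "Timeout waiting for Neutron events: [('network-vif-plugged', ...)]"
# and the server ends up back in VERIFY_RESIZE instead of ACTIVE.
openstack server resize revert test-vm
openstack server show test-vm -f value -c status
```

From that point neither `openstack server resize confirm test-vm` nor another revert succeeds; both return HTTP 400.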
As Will Szumski also stated, the instance does not seem to be recoverable after that: both a confirm and a revert of the resize result in HTTP 400 (Instance has not been resized) afterwards.
At first we thought the problem might be that we were still running neutron victoria in our test environment, but I can confirm the issue persists with both nova and neutron running wallaby together. Since the nova upgrade to wallaby triggered this issue, I assume there is a bug somewhere in nova.
We are running nova with libvirt 8.0.0 / qemu 4.2.1, and neutron with the OVS / hybrid-iptables firewall driver.
This is a complete showstopper for any further upgrade on our side, as users can effectively put VMs into a state they cannot recover from.
Is there anything we can provide to help find the root cause of this issue?
Because at least three cloud operators in this bug report are hitting this issue independently, I have moved it back from Incomplete to New.