I managed to create a stable reproduction locally. \o/
1) duplicate the failing tests
diff --git a/nova/tests/functional/test_servers_resource_request.py b/nova/tests/functional/test_servers_resource_request.py
index e2746b3669..d678067c18 100644
--- a/nova/tests/functional/test_servers_resource_request.py
+++ b/nova/tests/functional/test_servers_resource_request.py
@@ -2572,6 +2572,18 @@ class ServerMoveWithPortResourceRequestTest(
         self._delete_server_and_check_allocations(
             server, qos_normal_port, qos_sriov_port)
+    def test_unshelve_offloaded_server_with_qos_port_pci_update_fails0(self):
+        self.test_unshelve_offloaded_server_with_qos_port_pci_update_fails()
+
+    def test_unshelve_offloaded_server_with_qos_port_pci_update_fails1(self):
+        self.test_unshelve_offloaded_server_with_qos_port_pci_update_fails()
+
+    def test_unshelve_offloaded_server_with_qos_port_pci_update_fails2(self):
+        self.test_unshelve_offloaded_server_with_qos_port_pci_update_fails()
+
+    def test_unshelve_offloaded_server_with_qos_port_pci_update_fails3(self):
+        self.test_unshelve_offloaded_server_with_qos_port_pci_update_fails()
+
2) run tox with tox -e functional-py38 "test_description_errors|test_unshelve_offloaded_server_with_qos_port_pci_update_fails" -- --serial
This makes one of the test_unshelve_offloaded_server_with_qos_port_pci_update_fails test cases fail with the reported error.
What I know so far:
* nova.tests.functional.test_servers.ServersTestV219.test_description_errors() starts a new instance but does not wait for it to become ACTIVE. The test case passes and finishes.
* But the build_and_run_instance RPC call still runs in the compute service, in a greenlet that is building the instance. The service.kill at the end of the test case does not kill the running / waiting greenlets. I proved this by dumping a GMR at the end of the test run, after service.kill was called by the Fixture.cleanup. The build_and_run_instance greenlet was visible there.
* Meanwhile, other fixture cleanup drops the database from under the compute service. This leads to a ComputeNodeNotFound error. You can see this by simply adding a self.fail() at the end of the test_description_errors test case and running it. There will be ComputeNodeNotFound in the logs.
* After test_description_errors passes, the test executor runs the next test cases.
* After ~60 seconds an RPC timeout happens in the original greenlet from test_description_errors() and it *somehow* interferes with the currently running test, making it fail.
What I don't know yet is how exactly the interference happens.