Noticed here:
https://logs.opendev.org/32/634832/43/check/nova-tox-functional-py36/d4f3be5/testr_results.html.gz
With this test:
nova.tests.functional.notification_sample_tests.test_service.TestServiceUpdateNotificationSampleLatest.test_service_disabled
That's a simple test which disables a service and then asserts that exactly one service.update notification is emitted, but another notification is showing up as well:
Traceback (most recent call last):
  File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/notification_sample_tests/test_service.py", line 122, in test_service_disabled
    'uuid': self.service_uuid})
  File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/notification_sample_tests/test_service.py", line 37, in _verify_notification
    base._verify_notification(sample_file_name, replacements, actual)
  File "/home/zuul/src/opendev.org/openstack/nova/nova/tests/functional/notification_sample_tests/notification_sample_base.py", line 148, in _verify_notification
    self.assertEqual(1, len(fake_notifier.VERSIONED_NOTIFICATIONS))
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/testtools/testcase.py", line 411, in assertEqual
    self.assertThat(observed, matcher, message)
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/testtools/testcase.py", line 498, in assertThat
    raise mismatch_error
testtools.matchers._impl.MismatchError: 1 != 2
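For context on how an unrelated event can trip that assertion: the fake notifier captures versioned notifications into module-global state, so every emitter in the worker process shares the same list. A minimal illustrative sketch of that mechanic (not nova's actual fake_notifier code; the names here are just for illustration):

    # fake_notifier-style capture: one module-global list per process
    VERSIONED_NOTIFICATIONS = []

    def capture(notification):
        # Every service running in this test process appends here, so a
        # compute service leaked by another test can add entries that
        # test_service_disabled then counts in its assertEqual(1, ...).
        VERSIONED_NOTIFICATIONS.append(notification)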
And in the error output, we can see this weird traceback of a resize revert failure because the NeutronFixture isn't being used:
2019-08-07 23:22:23,621 ERROR [nova.network.neutronv2.api] The [neutron] section of your nova configuration file must be configured for authentication with the networking service endpoint. See the networking service install guide for details: https://docs.openstack.org/neutron/latest/install/
2019-08-07 23:22:23,634 ERROR [nova.compute.manager] Setting instance vm_state to ERROR
Traceback (most recent call last):
  File "/home/zuul/src/opendev.org/openstack/nova/nova/compute/manager.py", line 8656, in _error_out_instance_on_exception
    yield
  File "/home/zuul/src/opendev.org/openstack/nova/nova/compute/manager.py", line 4830, in _resize_instance
    migration_p)
  File "/home/zuul/src/opendev.org/openstack/nova/nova/network/neutronv2/api.py", line 2697, in migrate_instance_start
    client = _get_ksa_client(context, admin=True)
  File "/home/zuul/src/opendev.org/openstack/nova/nova/network/neutronv2/api.py", line 215, in _get_ksa_client
    auth_plugin = _get_auth_plugin(context, admin=admin)
  File "/home/zuul/src/opendev.org/openstack/nova/nova/network/neutronv2/api.py", line 151, in _get_auth_plugin
    _ADMIN_AUTH = _load_auth_plugin(CONF)
  File "/home/zuul/src/opendev.org/openstack/nova/nova/network/neutronv2/api.py", line 82, in _load_auth_plugin
    raise neutron_client_exc.Unauthorized(message=err_msg)
neutronclient.common.exceptions.Unauthorized: Unknown auth type: None
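That "Unknown auth type: None" is what the neutronv2 code raises when nothing has stubbed it out. Functional tests that touch networking normally install the fixture in setUp, roughly like this (a sketch; the fixture's module path is my assumption about where it lived at the time):

    import testtools

    from nova.tests import fixtures as nova_fixtures

    class MyFunctionalTest(testtools.TestCase):
        def setUp(self):
            super(MyFunctionalTest, self).setUp()
            # NeutronFixture stubs the nova.network.neutronv2.api calls
            # so the test never reaches _load_auth_plugin and the real
            # [neutron] auth config; without it we get the Unauthorized
            # error above.
            self.neutron = self.useFixture(
                nova_fixtures.NeutronFixture(self))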
According to logstash this started showing up around 8/5:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22ERROR%20%5Bnova.network.neutronv2.api%5D%20The%20%5Bneutron%5D%20section%20of%20your%20nova%20configuration%20file%20must%20be%20configured%20for%20authentication%20with%20the%20networking%20service%20endpoint.%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d
Which makes me think this change, which restarts a compute service and sleeps in a stub:
https://review.opendev.org/#/c/670393/
might be screwing up concurrently running tests.
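The mechanism that would let a sleep in one test affect another: the functional tests run eventlet-monkey-patched in a shared process per worker, so time.sleep in a stub doesn't block anything; it yields, and a restarted compute service keeps servicing RPC and emitting notifications while other tests on the same worker run. A tiny illustration of that yielding behavior (not the actual patch):

    import eventlet
    eventlet.monkey_patch()

    import time

    def stubbed_driver_method(*args, **kwargs):
        # Under eventlet monkey patching this sleep does not block the
        # process; it parks this greenthread and lets every other
        # greenthread (including services started by other tests in the
        # same worker) keep running for the duration.
        time.sleep(30)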
Looking at when that test runs and when the one that fails runs:
2019-08-07 23:21:54.157918 | ubuntu-bionic | {4} nova.tests.functional.compute.test_init_host.ComputeManagerInitHostTestCase.test_migrate_disk_and_power_off_crash_finish_revert_migration [4.063814s] ... ok
2019-08-07 23:25:00.073443 | ubuntu-bionic | {4} nova.tests.functional.notification_sample_tests.test_service.TestServiceUpdateNotificationSampleLatest.test_service_disabled [160.155643s] ... FAILED
We can see they are on the same worker process and run at about the same time.
Furthermore, we can see that TestServiceUpdateNotificationSampleLatest.test_service_disabled eventually times out after 160 seconds, with this in the error output:
2019-08-07 23:24:59,911 ERROR [nova.compute.api] An error occurred while updating the COMPUTE_STATUS_DISABLED trait on compute node resource providers managed by host host1. The trait will be synchronized automatically by the compute service when the update_available_resource periodic task runs.
Traceback (most recent call last):
  File "/home/zuul/src/opendev.org/openstack/nova/nova/compute/api.py", line 5034, in _update_compute_provider_status
    self.rpcapi.set_host_enabled(context, service.host, enabled)
  File "/home/zuul/src/opendev.org/openstack/nova/nova/compute/rpcapi.py", line 996, in set_host_enabled
    return cctxt.call(ctxt, 'set_host_enabled', enabled=enabled)
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_messaging/rpc/client.py", line 181, in call
    transport_options=self.transport_options)
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_messaging/transport.py", line 129, in _send
    transport_options=transport_options)
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_fake.py", line 224, in send
    transport_options)
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/oslo_messaging/_drivers/impl_fake.py", line 208, in _send
    reply, failure = reply_q.get(timeout=timeout)
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/eventlet/queue.py", line 322, in get
    return waiter.wait()
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/eventlet/queue.py", line 141, in wait
    return get_hub().switch()
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 298, in switch
    return self.greenlet.switch()
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 350, in run
    self.wait(sleep_time)
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/eventlet/hubs/poll.py", line 77, in wait
    time.sleep(seconds)
  File "/home/zuul/src/opendev.org/openstack/nova/.tox/functional-py36/lib/python3.6/site-packages/fixtures/_fixtures/timeout.py", line 52, in signal_handler
    raise TimeoutException()
fixtures._fixtures.timeout.TimeoutException
So test_migrate_disk_and_power_off_crash_finish_revert_migration is probably not cleaning up properly.
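If so, the fix is probably for that test to tie the restarted compute service's lifetime to the test, e.g. with addCleanup. A hedged sketch of the shape of such a fix (the restart helper and its stop() handle are stand-ins, not nova's actual helpers):

    import testtools

    class _FakeServiceHandle(object):
        # Stand-in for whatever a restart helper would return.
        def stop(self):
            pass

    class CleanupSketch(testtools.TestCase):
        def _restart_compute_service(self):
            # Assumption: the helper returns something stoppable.
            return _FakeServiceHandle()

        def test_shape_of_the_fix(self):
            restarted = self._restart_compute_service()
            # The key line: guarantee the restarted service is stopped
            # when the test ends, so it can't keep handling RPC or
            # emitting notifications into state shared with concurrent
            # tests on the same worker.
            self.addCleanup(restarted.stop)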
This is a better query that gets more hits, including in the gate:
http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22ERROR%20%5Bnova.compute.manager%5D%20Did%20not%20find%20resource%20allocations%20for%20migration%5C%22%20AND%20message%3A%5C%22on%20source%20node%5C%22%20AND%20message%3A%5C%22Unable%20to%20revert%20source%20node%20allocations%20back%20to%20the%20instance.%5C%22%20AND%20tags%3A%5C%22console%5C%22%20AND%20(build_name%3A%5C%22nova-tox-functional%5C%22%20OR%20build_name%3A%5C%22nova-tox-functional-py36%5C%22)&from=7d