test_bfv_quota_race_local_delete intermittently fails with testtools.matchers._impl.MismatchError: ['bfv'] != []

Bug #1820337 reported by Matt Riedemann on 2019-03-15
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
High
Matthew Booth
Pike
High
Matt Riedemann
Queens
High
Matt Riedemann
Rocky
High
Matt Riedemann
Stein
High
Matt Riedemann

Bug Description

Seen here:

http://logs.openstack.org/72/640772/1/gate/nova-tox-functional-py35/44118f7/job-output.txt.gz#_2019-03-13_18_36_04_685027

2019-03-13 18:36:04.685027 | ubuntu-xenial | {2} nova.tests.functional.regressions.test_bug_1806064.BootFromVolumeOverQuotaRaceDeleteTest.test_bfv_quota_race_local_delete [2.015433s] ... FAILED
2019-03-13 18:36:04.685120 | ubuntu-xenial |
2019-03-13 18:36:04.685187 | ubuntu-xenial | Captured traceback:
2019-03-13 18:36:04.685251 | ubuntu-xenial | ~~~~~~~~~~~~~~~~~~~
2019-03-13 18:36:04.685350 | ubuntu-xenial | b'Traceback (most recent call last):'
2019-03-13 18:36:04.685658 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/nova/tests/functional/regressions/test_bug_1806064.py", line 129, in test_bfv_quota_race_local_delete'
2019-03-13 18:36:04.685776 | ubuntu-xenial | b" self.assertEqual(['bfv'], server['tags'])"
2019-03-13 18:36:04.686080 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/.tox/py35/lib/python3.5/site-packages/testtools/testcase.py", line 411, in assertEqual'
2019-03-13 18:36:04.686207 | ubuntu-xenial | b' self.assertThat(observed, matcher, message)'
2019-03-13 18:36:04.686487 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/.tox/py35/lib/python3.5/site-packages/testtools/testcase.py", line 498, in assertThat'
2019-03-13 18:36:04.686569 | ubuntu-xenial | b' raise mismatch_error'
2019-03-13 18:36:04.686699 | ubuntu-xenial | b"testtools.matchers._impl.MismatchError: ['bfv'] != []"
2019-03-13 18:36:04.686740 | ubuntu-xenial | b''
2019-03-13 18:36:04.686770 | ubuntu-xenial |
2019-03-13 18:36:04.686838 | ubuntu-xenial | Captured pythonlogging:
2019-03-13 18:36:04.686905 | ubuntu-xenial | ~~~~~~~~~~~~~~~~~~~~~~~
2019-03-13 18:36:04.695625 | ubuntu-xenial | b"2019-03-13 18:36:02,849 INFO [nova.api.openstack.placement.objects.resource_provider] Synced traits from os_traits into API DB: {'HW_NIC_SRIOV_QOS_RX', 'HW_CPU_X86_SVM', 'HW_CPU_X86_AVX512VL', 'HW_GPU_API_DIRECT3D_V9_0', 'HW_NIC_OFFLOAD_SG', 'HW_NIC_OFFLOAD_TSO', 'HW_NIC_OFFLOAD_GSO', 'HW_GPU_API_CUDA_V2_0', 'HW_GPU_RESOLUTION_W3840H2160', 'HW_GPU_API_OPENGL_V1_2', 'HW_CPU_X86_AVX2', 'HW_GPU_API_CUDA_V7_0', 'HW_CPU_X86_AVX512BW', 'HW_CPU_HYPERTHREADING', 'HW_NIC_SRIOV_TRUSTED', 'HW_GPU_RESOLUTION_W2560H1600', 'HW_CPU_X86_TBM', 'HW_GPU_API_OPENGL_V4_2', 'HW_NIC_OFFLOAD_RX', 'HW_NIC_OFFLOAD_TXVLAN', 'HW_NIC_ACCEL_ECC', 'HW_CPU_X86_AVX512CD', 'HW_NIC_VMDQ', 'HW_GPU_MAX_DISPLAY_HEADS_1', 'HW_CPU_AARCH64_SHA512', 'HW_GPU_API_OPENGL_V3_3', 'HW_GPU_RESOLUTION_W800H600', 'HW_CPU_AARCH64_SM4', 'COMPUTE_VOLUME_ATTACH_WITH_TAG', 'HW_CPU_AARCH64_ASIMDHP', 'HW_CPU_AARCH64_DCPOP', 'HW_CPU_AARCH64_FCMA', 'HW_GPU_API_OPENCL_V2_0', 'HW_NIC_ACCEL_RSA', 'HW_GPU_API_VULKAN', 'HW_GPU_MAX_DISPLAY_HEADS_2', 'HW_NIC_DCB_PFC', 'HW_GPU_API_DIRECT3D_V9_0L', 'HW_NIC_MULTIQUEUE', 'HW_GPU_RESOLUTION_W1600H1200', 'HW_CPU_X86_CLMUL', 'HW_GPU_API_CUDA_V7_1', 'HW_CPU_AARCH64_FPHP', 'HW_GPU_RESOLUTION_W640H480', 'HW_CPU_AARCH64_PMULL', 'HW_NIC_OFFLOAD_TXUDP', 'HW_GPU_MAX_DISPLAY_HEADS_6', 'HW_NIC_ACCEL_SSL', 'HW_CPU_X86_3DNOW', 'STORAGE_DISK_SSD', 'HW_GPU_RESOLUTION_W1680H1050', 'HW_NIC_DCB_ETS', 'HW_CPU_X86_SSE2', 'HW_NIC_OFFLOAD_RXHASH', 'HW_GPU_API_CUDA_V6_0', 'MISC_SHARES_VIA_AGGREGATE', 'HW_GPU_RESOLUTION_W1152H864', 'HW_CPU_X86_MPX', 'HW_GPU_RESOLUTION_W1360H768', 'HW_GPU_API_OPENGL_V1_5', 'HW_CPU_X86_F16C', 'HW_GPU_RESOLUTION_W1024H600', 'HW_CPU_X86_AVX512F', 'HW_GPU_API_DIRECT3D_V9_0B', 'HW_NIC_OFFLOAD_QINQ', 'HW_GPU_RESOLUTION_W1024H768', 'HW_GPU_API_OPENGL_V4_3', 'HW_GPU_API_OPENGL_V2_0', 'HW_GPU_API_DIRECT3D_V8_1', 'HW_NIC_OFFLOAD_GRO', 'HW_GPU_RESOLUTION_W1366H768', 'HW_CPU_X86_AVX512ER', 'HW_NIC_OFFLOAD_UFO', 'HW_CPU_AARCH64_SVE', 'HW_GPU_RESOLUTION_W1280H800', 'HW_NIC_ACCEL_LZS', 'COMPUTE_DEVICE_TAGGING', 'HW_CPU_AARCH64_ATOMICS', 'HW_NIC_OFFLOAD_RXVLAN', 'HW_GPU_API_CUDA_V5_0', 'HW_GPU_API_OPENCL_V1_0', 'HW_CPU_X86_BMI2', 'HW_CPU_X86_MMX', 'COMPUTE_VOLUME_MULTI_ATTACH', 'HW_NIC_OFFLOAD_LRO', 'HW_GPU_API_CUDA_V3_2', 'HW_NIC_OFFLOAD_UCS', 'COMPUTE_TRUSTED_CERTS', 'HW_CPU_AARCH64_ASIMDRDM', 'HW_NIC_OFFLOAD_SCS', 'COMPUTE_NET_ATTACH_INTERFACE', 'HW_CPU_AARCH64_SHA3', 'HW_CPU_X86_SSSE3', 'COMPUTE_VOLUME_EXTEND', 'HW_GPU_RESOLUTION_W1600H900', 'HW_GPU_API_OPENCL_V2_1', 'HW_CPU_AARCH64_CRC32', 'HW_GPU_API_OPENGL_V4_4', 'HW_GPU_API_DIRECTX_V10', 'HW_GPU_API_CUDA_V6_2', 'HW_GPU_API_OPENCL_V1_2', 'HW_GPU_API_OPENGL_V2_1', 'HW_GPU_API_DIRECT3D_V10_1', 'HW_NIC_ACCEL_IPSEC', 'HW_CPU_AARCH64_EVTSTRM', 'HW_CPU_X86_FMA3', 'HW_GPU_API_OPENGL_V4_0', 'HW_NIC_ACCEL_DEFLATE', 'HW_NIC_OFFLOAD_TX', 'HW_CPU_X86_SSE41', 'HW_GPU_API_OPENCL_V1_1', 'HW_NIC_OFFLOAD_SWITCHDEV', 'HW_GPU_API_OPENGL_V3_1', 'HW_NIC_OFFLOAD_GRE', 'HW_CPU_AARCH64_FP', 'HW_GPU_RESOLUTION_W1440H900', 'HW_GPU_API_DIRECT3D_V6_0', 'HW_NIC_SRIOV', 'HW_CPU_AARCH64_ASIMDDP', 'HW_GPU_API_OPENGL_V4_5', 'HW_CPU_X86_SSE', 'HW_CPU_X86_SSE4A', 'HW_CPU_X86_XOP', 'HW_NIC_SRIOV_MULTIQUEUE', 'HW_CPU_X86_SHA', 'HW_CPU_AARCH64_SM3', 'HW_GPU_RESOLUTION_W1280H768', 'HW_GPU_RESOLUTION_W1920H1080', 'HW_NIC_OFFLOAD_LSO', 'HW_GPU_RESOLUTION_W320H240', 'COMPUTE_VOLUME_ATTACH', 'HW_GPU_API_CUDA_V6_1', 'HW_GPU_API_DIRECT3D_V12_0', 'HW_CPU_AARCH64_ASIMD', 'HW_CPU_X86_ABM', 'HW_CPU_X86_AVX512DQ', 'HW_GPU_RESOLUTION_W7680H4320', 'HW_GPU_API_CUDA_V5_2', 'HW_CPU_X86_VMX', 'HW_GPU_API_DIRECT3D_V8_0', 'HW_GPU_API_CUDA_V1_0', 'HW_GPU_RESOLUTION_W2560H1440', 'HW_CPU_X86_SSE3', 'HW_GPU_API_DIRECT3D_V9_0C', 'HW_NIC_OFFLOAD_GENEVE', 'HW_GPU_API_CUDA_V3_7', 'HW_CPU_X86_FMA4', 'HW_GPU_API_DIRECT2D', 'HW_CPU_AARCH64_CPUID', 'HW_NIC_SRIOV_QOS_TX', 'HW_CPU_X86_AVX', 'HW_CPU_X86_AVX512PF', 'HW_CPU_AARCH64_AES', 'HW_GPU_API_CUDA_V1_1', 'COMPUTE_NET_ATTACH_INTERFACE_WITH_TAG', 'HW_GPU_MAX_DISPLAY_HEADS_4', 'HW_CPU_AARCH64_LRCPC', 'HW_GPU_API_OPENGL_V3_2', 'HW_GPU_API_OPENGL_V1_3', 'HW_CPU_X86_ASF', 'HW_CPU_X86_TSX', 'HW_GPU_API_CUDA_V3_5', 'HW_GPU_API_DIRECT3D_V11_2', 'HW_GPU_API_CUDA_V1_2', 'STORAGE_DISK_HDD', 'HW_CPU_AARCH64_SHA1', 'HW_GPU_API_DIRECT3D_V11_0', 'HW_CPU_X86_SGX', 'HW_NIC_PROGRAMMABLE_PIPELINE', 'HW_GPU_API_DIRECTX_V12', 'HW_NIC_OFFLOAD_FDF', 'HW_NIC_DCB_QCN', 'HW_GPU_API_DXVA', 'HW_CPU_X86_BMI', 'HW_NIC_OFFLOAD_L2CRC', 'HW_GPU_MAX_DISPLAY_HEADS_8', 'HW_GPU_API_OPENGL_V1_4', 'HW_GPU_API_OPENGL_V3_0', 'HW_GPU_API_CUDA_V1_3', 'HW_GPU_API_OPENGL_V4_1', 'HW_CPU_AARCH64_JSCVT', 'HW_GPU_API_OPENCL_V2_2', 'HW_NIC_ACCEL_TLS', 'HW_GPU_RESOLUTION_W1920H1200', 'HW_GPU_API_DIRECT3D_V11_3', 'HW_GPU_API_CUDA_V2_1', 'HW_GPU_API_CUDA_V5_3', 'HW_NIC_OFFLOAD_RDMA', 'HW_GPU_RESOLUTION_W1280H1024', 'HW_GPU_API_DIRECT3D_V11_1', 'HW_GPU_RESOLUTION_W1280H720', 'HW_GPU_API_CUDA_V3_0', 'HW_GPU_API_DIRECTX_V11', 'HW_GPU_API_DIRECT3D_V10_0', 'HW_NIC_OFFLOAD_TCS', 'HW_CPU_AARCH64_SHA2', 'HW_NIC_OFFLOAD_VXLAN', 'HW_CPU_X86_AESNI', 'HW_GPU_API_OPENGL_V1_1', 'HW_NIC_ACCEL_DIFFIEH', 'HW_CPU_X86_SSE42', 'HW_GPU_API_DIRECT3D_V7_0'}"
2019-03-13 18:36:04.695837 | ubuntu-xenial | b'2019-03-13 18:36:03,394 INFO [nova.service] Starting conductor node (version 18.1.1)'
2019-03-13 18:36:04.696023 | ubuntu-xenial | b'2019-03-13 18:36:03,418 INFO [nova.service] Starting scheduler node (version 18.1.1)'
2019-03-13 18:36:04.696222 | ubuntu-xenial | b"2019-03-13 18:36:03,447 INFO [nova.virt.driver] Loading compute driver 'fake.SmallFakeDriver'"
2019-03-13 18:36:04.696401 | ubuntu-xenial | b'2019-03-13 18:36:03,447 INFO [nova.service] Starting compute node (version 18.1.1)'
2019-03-13 18:36:04.696782 | ubuntu-xenial | b'2019-03-13 18:36:03,478 WARNING [nova.compute.manager] No compute node record found for host compute. If this is the first time this service is starting on this host, then you can ignore this warning.'
2019-03-13 18:36:04.697117 | ubuntu-xenial | b'2019-03-13 18:36:03,478 WARNING [nova.compute.monitors] Excluding nova.compute.monitors.cpu monitor virt_driver. Not in the list of enabled monitors (CONF.compute_monitors).'
2019-03-13 18:36:04.697340 | ubuntu-xenial | b'2019-03-13 18:36:03,483 WARNING [nova.compute.resource_tracker] No compute node record for compute:fake-mini'
2019-03-13 18:36:04.697649 | ubuntu-xenial | b'2019-03-13 18:36:03,488 INFO [nova.compute.resource_tracker] Compute node record created for compute:fake-mini with uuid: 01b19de8-1ab7-4e7b-ab4d-e075de00ccc8'
2019-03-13 18:36:04.705131 | ubuntu-xenial | b'2019-03-13 18:36:03,523 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/01b19de8-1ab7-4e7b-ab4d-e075de00ccc8/allocations" status: 404 len: 303 microversion: 1.0'
2019-03-13 18:36:04.705527 | ubuntu-xenial | b'2019-03-13 18:36:03,524 INFO [nova.compute.resource_tracker] Final resource view: name=fake-mini phys_ram=8192MB used_ram=512MB phys_disk=1028GB used_disk=0GB total_vcpus=2 used_vcpus=0 pci_stats=[]'
2019-03-13 18:36:04.705934 | ubuntu-xenial | b'2019-03-13 18:36:03,545 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers?in_tree=01b19de8-1ab7-4e7b-ab4d-e075de00ccc8" status: 200 len: 26 microversion: 1.14'
2019-03-13 18:36:04.706247 | ubuntu-xenial | b'2019-03-13 18:36:03,555 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "POST /placement/resource_providers" status: 200 len: 832 microversion: 1.20'
2019-03-13 18:36:04.706700 | ubuntu-xenial | b'2019-03-13 18:36:03,556 INFO [nova.scheduler.client.report] [req-c1e03ec8-485a-4fd2-8447-521ee1d3b2e1] Created resource provider record via placement API for resource provider with UUID 01b19de8-1ab7-4e7b-ab4d-e075de00ccc8 and name fake-mini.'
2019-03-13 18:36:04.707086 | ubuntu-xenial | b'2019-03-13 18:36:03,567 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/resource_providers/01b19de8-1ab7-4e7b-ab4d-e075de00ccc8/inventories" status: 200 len: 54 microversion: 1.0'
2019-03-13 18:36:04.707475 | ubuntu-xenial | b'2019-03-13 18:36:03,585 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "PUT /placement/resource_providers/01b19de8-1ab7-4e7b-ab4d-e075de00ccc8/inventories" status: 200 len: 399 microversion: 1.26'
2019-03-13 18:36:04.707738 | ubuntu-xenial | b'2019-03-13 18:36:03,592 INFO [nova.api.openstack.requestlog] 127.0.0.1 "GET /v2.1" status: 204 len: 0 microversion: - time: 0.000460'
2019-03-13 18:36:04.708095 | ubuntu-xenial | b'2019-03-13 18:36:03,632 INFO [nova.api.openstack.requestlog] 127.0.0.1 "GET /v2.1/6f70656e737461636b20342065766572/flavors/detail" status: 200 len: 2747 microversion: 2.52 time: 0.034817'
2019-03-13 18:36:04.708440 | ubuntu-xenial | b'2019-03-13 18:36:03,770 INFO [nova.api.openstack.requestlog] 127.0.0.1 "POST /v2.1/6f70656e737461636b20342065766572/servers" status: 202 len: 480 microversion: 2.52 time: 0.132783'
2019-03-13 18:36:04.708847 | ubuntu-xenial | b'2019-03-13 18:36:03,814 INFO [nova.api.openstack.requestlog] 127.0.0.1 "GET /v2.1/6f70656e737461636b20342065766572/servers/8dba4237-4817-4f31-a930-35a8ab7e7469" status: 200 len: 1183 microversion: 2.52 time: 0.039646'
2019-03-13 18:36:04.709248 | ubuntu-xenial | b'2019-03-13 18:36:03,872 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/allocation_candidates?limit=1000&resources=MEMORY_MB%3A512%2CVCPU%3A1" status: 200 len: 307 microversion: 1.25'
2019-03-13 18:36:04.709610 | ubuntu-xenial | b'2019-03-13 18:36:04,231 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "GET /placement/allocations/8dba4237-4817-4f31-a930-35a8ab7e7469" status: 200 len: 19 microversion: 1.0'
2019-03-13 18:36:04.709987 | ubuntu-xenial | b'2019-03-13 18:36:04,263 INFO [nova.api.openstack.placement.requestlog] 127.0.0.1 "PUT /placement/allocations/8dba4237-4817-4f31-a930-35a8ab7e7469" status: 204 len: 0 microversion: 1.25'
2019-03-13 18:36:04.710248 | ubuntu-xenial | b'2019-03-13 18:36:04,315 WARNING [nova.scheduler.utils] Failed to compute_task_build_instances: test_bfv_quota_race_local_delete'
2019-03-13 18:36:04.710436 | ubuntu-xenial | b'2019-03-13 18:36:04,316 WARNING [nova.scheduler.utils] Setting instance to ERROR state.'
2019-03-13 18:36:04.710629 | ubuntu-xenial | b'2019-03-13 18:36:04,470 ERROR [oslo_messaging.rpc.server] Exception during message handling'
2019-03-13 18:36:04.710727 | ubuntu-xenial | b'Traceback (most recent call last):'
2019-03-13 18:36:04.711030 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/.tox/py35/lib/python3.5/site-packages/oslo_messaging/rpc/server.py", line 163, in _process_incoming'
2019-03-13 18:36:04.711144 | ubuntu-xenial | b' res = self.dispatcher.dispatch(message)'
2019-03-13 18:36:04.711438 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/.tox/py35/lib/python3.5/site-packages/oslo_messaging/rpc/dispatcher.py", line 265, in dispatch'
2019-03-13 18:36:04.711577 | ubuntu-xenial | b' return self._do_dispatch(endpoint, method, ctxt, args)'
2019-03-13 18:36:04.711879 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/.tox/py35/lib/python3.5/site-packages/oslo_messaging/rpc/dispatcher.py", line 194, in _do_dispatch'
2019-03-13 18:36:04.711979 | ubuntu-xenial | b' result = func(ctxt, **new_args)'
2019-03-13 18:36:04.712232 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/nova/conductor/manager.py", line 1323, in schedule_and_build_instances'
2019-03-13 18:36:04.712312 | ubuntu-xenial | b' cell_mapping_cache)'
2019-03-13 18:36:04.712589 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/.tox/py35/lib/python3.5/site-packages/oslo_utils/excutils.py", line 220, in __exit__'
2019-03-13 18:36:04.712670 | ubuntu-xenial | b' self.force_reraise()'
2019-03-13 18:36:04.712955 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/.tox/py35/lib/python3.5/site-packages/oslo_utils/excutils.py", line 196, in force_reraise'
2019-03-13 18:36:04.713077 | ubuntu-xenial | b' six.reraise(self.type_, self.value, self.tb)'
2019-03-13 18:36:04.713325 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/.tox/py35/lib/python3.5/site-packages/six.py", line 693, in reraise'
2019-03-13 18:36:04.713391 | ubuntu-xenial | b' raise value'
2019-03-13 18:36:04.713645 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/nova/conductor/manager.py", line 1316, in schedule_and_build_instances'
2019-03-13 18:36:04.713748 | ubuntu-xenial | b' orig_num_req=len(build_requests))'
2019-03-13 18:36:04.714063 | ubuntu-xenial | b' File "/home/zuul/src/git.openstack.org/openstack/nova/nova/tests/functional/regressions/test_bug_1806064.py", line 96, in stub_check_num_instances_quota'
2019-03-13 18:36:04.714174 | ubuntu-xenial | b" 'test_bfv_quota_race_local_delete')"
2019-03-13 18:36:04.714324 | ubuntu-xenial | b'nova.exception.TooManyInstances: test_bfv_quota_race_local_delete'
2019-03-13 18:36:04.714731 | ubuntu-xenial | b'2019-03-13 18:36:04,475 INFO [nova.api.openstack.requestlog] 127.0.0.1 "GET /v2.1/6f70656e737461636b20342065766572/servers/8dba4237-4817-4f31-a930-35a8ab7e7469" status: 200 len: 1340 microversion: 2.52 time: 0.109376'
2019-03-13 18:36:04.714777 | ubuntu-xenial | b''

Interestingly this seems to be only happening in stable branches:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22testtools.matchers._impl.MismatchError%3A%20%5B'bfv'%5D%20!%3D%20%5B%5D%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d

Matthew Booth (mbooth-9) wrote :

This is a genuine race which is still present in master, but for some reason I haven't managed to work out I can only reproduce on Rocky. I can reproduce it reliably on Rocky by adding a pdb breakpoint (a sleep doesn't work) to stub_check_num_instances_quota in test_bfv_quota_race_local_delete. The same does not reproduce on master... although the bug is still present.

The issue is that in conductor manager when the check_num_instances_quota check raises _cleanup_build_artifacts we create the instance mapping before creating BDMs and tags in the target cell, i.e. before the instance record is complete in that cell. When the functional test races, the fetch in _wait_for_state_change pulls the instance from the cell before it has finished being written, and therefore before the tags are present. See compute.API._get_instance().

I don't understand why this race occurs only in Rocky, but in master the test's GET is just never scheduled there. This is highly likely to be some really subtle eventlet thing which isn't important. However, if it was it would fail in the same way.

Fix proposed to branch: master
Review: https://review.openstack.org/645546

Changed in nova:
assignee: nobody → Matthew Booth (mbooth-9)
status: Confirmed → In Progress
Elod Illes (elod-illes) wrote :

Just a note, that according to Logstash the problem is there in Queens and Pike, too.

Changed in nova:
assignee: Matthew Booth (mbooth-9) → Matt Riedemann (mriedem)
Matt Riedemann (mriedem) on 2019-03-26
Changed in nova:
assignee: Matt Riedemann (mriedem) → Matthew Booth (mbooth-9)

Reviewed: https://review.openstack.org/645546
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=64f6cbc9120e3c288f312eddc59452dee4998f93
Submitter: Zuul
Branch: master

commit 64f6cbc9120e3c288f312eddc59452dee4998f93
Author: Matthew Booth <email address hidden>
Date: Fri Mar 22 11:47:32 2019 +0000

    Fix incomplete instance data returned after build failure

    This change fixes a race in _cleanup_build_artifacts. We were updating
    the instance mapping to point at the cell in which the instance was
    created before the instance record was complete, i.e. before the related
    BDMs and tags were created in the cell DB. Updating the instance mapping
    exposes the cell's version of the instance to the API. If the API happened
    to fetch it before it was finished being created it would return
    incomplete data.

    Co-Authored-By: Matt Riedemann <email address hidden>

    Closes-Bug: #1820337
    Change-Id: If966eb1161c842ff49aa530e4482dbca87b61a3e

Changed in nova:
status: In Progress → Fix Released

Reviewed: https://review.openstack.org/647902
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=eec39762cf5343b354e2ac06a6682a7c3b4230c2
Submitter: Zuul
Branch: stable/stein

commit eec39762cf5343b354e2ac06a6682a7c3b4230c2
Author: Matthew Booth <email address hidden>
Date: Fri Mar 22 11:47:32 2019 +0000

    Fix incomplete instance data returned after build failure

    This change fixes a race in _cleanup_build_artifacts. We were updating
    the instance mapping to point at the cell in which the instance was
    created before the instance record was complete, i.e. before the related
    BDMs and tags were created in the cell DB. Updating the instance mapping
    exposes the cell's version of the instance to the API. If the API happened
    to fetch it before it was finished being created it would return
    incomplete data.

    Co-Authored-By: Matt Riedemann <email address hidden>

    Closes-Bug: #1820337
    Change-Id: If966eb1161c842ff49aa530e4482dbca87b61a3e
    (cherry picked from commit 64f6cbc9120e3c288f312eddc59452dee4998f93)

Reviewed: https://review.openstack.org/647911
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c8cb120964cb33950a80c9775ffbfe4eb5818153
Submitter: Zuul
Branch: stable/rocky

commit c8cb120964cb33950a80c9775ffbfe4eb5818153
Author: Matthew Booth <email address hidden>
Date: Fri Mar 22 11:47:32 2019 +0000

    Fix incomplete instance data returned after build failure

    This change fixes a race in _cleanup_build_artifacts. We were updating
    the instance mapping to point at the cell in which the instance was
    created before the instance record was complete, i.e. before the related
    BDMs and tags were created in the cell DB. Updating the instance mapping
    exposes the cell's version of the instance to the API. If the API happened
    to fetch it before it was finished being created it would return
    incomplete data.

    Co-Authored-By: Matt Riedemann <email address hidden>

    Conflicts:
          nova/tests/unit/conductor/test_conductor.py

    NOTE(mriedem): The conflict is due to not having change
    I0c764e441993e32aafef0b18049a425c3c832a50 in Rocky.

    Closes-Bug: #1820337
    Change-Id: If966eb1161c842ff49aa530e4482dbca87b61a3e
    (cherry picked from commit 64f6cbc9120e3c288f312eddc59452dee4998f93)
    (cherry picked from commit eec39762cf5343b354e2ac06a6682a7c3b4230c2)

Reviewed: https://review.opendev.org/647913
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=ddc2c9118a2907b005c06ce30949c8c3fd7a7aef
Submitter: Zuul
Branch: stable/queens

commit ddc2c9118a2907b005c06ce30949c8c3fd7a7aef
Author: Matthew Booth <email address hidden>
Date: Fri Mar 22 11:47:32 2019 +0000

    Fix incomplete instance data returned after build failure

    This change fixes a race in _cleanup_build_artifacts. We were updating
    the instance mapping to point at the cell in which the instance was
    created before the instance record was complete, i.e. before the related
    BDMs and tags were created in the cell DB. Updating the instance mapping
    exposes the cell's version of the instance to the API. If the API happened
    to fetch it before it was finished being created it would return
    incomplete data.

    Co-Authored-By: Matt Riedemann <email address hidden>

    Closes-Bug: #1820337
    Change-Id: If966eb1161c842ff49aa530e4482dbca87b61a3e
    (cherry picked from commit 64f6cbc9120e3c288f312eddc59452dee4998f93)
    (cherry picked from commit eec39762cf5343b354e2ac06a6682a7c3b4230c2)
    (cherry picked from commit c8cb120964cb33950a80c9775ffbfe4eb5818153)

Reviewed: https://review.opendev.org/647916
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=3d85f7e58197300473676980366369b29d14e187
Submitter: Zuul
Branch: stable/pike

commit 3d85f7e58197300473676980366369b29d14e187
Author: Matthew Booth <email address hidden>
Date: Fri Mar 22 11:47:32 2019 +0000

    Fix incomplete instance data returned after build failure

    This change fixes a race in _cleanup_build_artifacts. We were updating
    the instance mapping to point at the cell in which the instance was
    created before the instance record was complete, i.e. before the related
    BDMs and tags were created in the cell DB. Updating the instance mapping
    exposes the cell's version of the instance to the API. If the API happened
    to fetch it before it was finished being created it would return
    incomplete data.

    Co-Authored-By: Matt Riedemann <email address hidden>

    NOTE(mriedem): The unit test had to be modified slightly due to
    not having change Ibc44e3b2261b314bb92062a88ca9ee6b81298dc3 in Pike.

    Closes-Bug: #1820337
    Change-Id: If966eb1161c842ff49aa530e4482dbca87b61a3e
    (cherry picked from commit 64f6cbc9120e3c288f312eddc59452dee4998f93)
    (cherry picked from commit eec39762cf5343b354e2ac06a6682a7c3b4230c2)
    (cherry picked from commit c8cb120964cb33950a80c9775ffbfe4eb5818153)
    (cherry picked from commit ddc2c9118a2907b005c06ce30949c8c3fd7a7aef)

This issue was fixed in the openstack/nova 16.1.8 release.

This issue was fixed in the openstack/nova 19.0.1 release.

This issue was fixed in the openstack/nova 18.2.1 release.

This issue was fixed in the openstack/nova 17.0.11 release.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers