I believe I am also seeing this issue.

First, a little about the environment. The control plane is containerised, and deployed using an Ocata release of kolla-ansible. The base OS and container OS are both CentOS 7.3. The RDO nova compute package is openstack-nova-compute-15.0.6-2.el7.noarch. There are 3 OpenStack controllers, each with a nova compute service for ironic. There are 4 ironic baremetal nodes.

I have seen the issue twice now, and as Hironori described, the main user visible symptom is that one of the ironic nodes becomes unschedulable. Digging into the logs, the compute service to which the ironic node has been mapped shows the following messages occurring every minute:

2017-09-13 09:49:42.618 7 INFO nova.scheduler.client.report [req-569e86cc-a2c6-4043-8efa-ea31e14d86dc - - - - -] Another thread already created a resource provider with the UUID 22787651-ab4a-4c8b-b72b-5e20bb3fad2c. Grabbing that record from the placement API.
2017-09-13 09:49:42.631 7 WARNING nova.scheduler.client.report [req-569e86cc-a2c6-4043-8efa-ea31e14d86dc - - - - -] Unable to refresh my resource provider record
2017-09-13 09:49:42.689 7 DEBUG nova.compute.resource_tracker [req-569e86cc-a2c6-4043-8efa-ea31e14d86dc - - - - -] Total usable vcpus: 64, total allocated vcpus: 0 _report_final_resource_view /usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py:688
2017-09-13 09:49:42.690 7 INFO nova.compute.resource_tracker [req-569e86cc-a2c6-4043-8efa-ea31e14d86dc - - - - -] Final resource view: name=5d1535b1-0984-42b3-a574-a62afddd9307 phys_ram=262144MB used_ram=0MB phys_disk=222GB used_disk=0GB total_vcpus=64 used_vcpus=0 pci_stats=[]
2017-09-13 09:49:42.691 7 DEBUG nova.compute.resource_tracker [req-569e86cc-a2c6-4043-8efa-ea31e14d86dc - - - - -] Compute_service record updated for kef1p-phycon0003-ironic:5d1535b1-0984-42b3-a574-a62afddd9307 _update_available_resource /usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py:626

The placement logs are fairly
lacking in useful information, even with logging set to debug. Picking out the relevant logs:

2017-09-13 09:51:43.604 20 DEBUG nova.api.openstack.placement.requestlog [req-298e44a2-5944-4322-87b2-e1b28d9fbc6a ac342c8d47c8416580ec6f3affcd287f 4970f0b152ca41dc968b4473bb8a48d9 - default default] Starting request: 10.105.1.3 "POST /resource_providers" __call__ /usr/lib/python2.7/site-packages/nova/api/openstack/placement/requestlog.py:38
2017-09-13 09:51:43.612 20 INFO nova.api.openstack.placement.requestlog [req-298e44a2-5944-4322-87b2-e1b28d9fbc6a ac342c8d47c8416580ec6f3affcd287f 4970f0b152ca41dc968b4473bb8a48d9 - default default] 10.105.1.3 "POST /resource_providers" status: 409 len: 675 microversion: 1.0

We can see here that the scheduler client first tries to GET the resource_provider for compute node 22787651-ab4a-4c8b-b72b-5e20bb3fad2c, but fails with a 404 not found. Following this, it tries to create a resource provider for the compute node, but fails with a 409, presumably because a resource provider exists with the same name (the ironic node UUID) but a different UUID.

Looking at the DB for further info, here's the troublesome RP:

+---------------------+---------------------+-------+--------------------------------------+--------------------------------------+------------+----------+
| created_at          | updated_at          | id    | uuid                                 | name                                 | generation | can_host |
+---------------------+---------------------+-------+--------------------------------------+--------------------------------------+------------+----------+
| 2017-09-01 18:10:43 | 2017-09-12 15:44:53 |    88 | 2f786d5d-169f-49b7-880f-d63cea9e4906 | 5d1535b1-0984-42b3-a574-a62afddd9307 |         19 |        0 |
+---------------------+---------------------+-------+--------------------------------------+--------------------------------------+------------+----------+

The compute node with UUID 2f786d5d-169f-49b7-880f-d63cea9e4906 was actually deleted about the same time as the 'Another thread...'
log started appearing:

created_at: 2017-09-01 18:10:43
updated_at: 2017-09-12 15:51:17
deleted_at: 2017-09-12 15:51:23
id: 70
service_id: NULL
vcpus: 0
memory_mb: 0
local_gb: 0
vcpus_used: 128
memory_mb_used: 524288
local_gb_used: 444
hypervisor_type: ironic
hypervisor_version: 1
cpu_info:
disk_available_least: -222
free_ram_mb: -524288
free_disk_gb: -444
current_workload: 0
running_vms: 2
hypervisor_hostname: 5d1535b1-0984-42b3-a574-a62afddd9307
deleted: 70
host_ip: 10.105.1.6
supported_instances: [["x86_64", "baremetal", "hvm"]]
pci_stats: {"nova_object.version": "1.1", "nova_object.changes": ["objects"], "nova_object.name": "PciDevicePoolList", "nova_object.data": {"objects": []}, "nova_object.namespace": "nova"}
metrics: []
extra_resources: NULL
stats: {"num_task_None": "2", "cpu_arch": "x86_64", "io_workload": "1", "num_instances": "2", "num_proj_12539bef11ff48d2a04e9cf8c13ac7c3": "2", "cpu_txt": "true", "num_vm_active": "1", "num_vm_building": "1", "cpu_hugepages": "true", "cpu_vt": "true", "boot_option": "local", "num_os_type_None": "2", "cpu_hugepages_1g": "true", "cpu_aes": "true"}
numa_topology: NULL
host: kef1p-phycon0003-ironic
ram_allocation_ratio: 1
cpu_allocation_ratio: 0
uuid: 2f786d5d-169f-49b7-880f-d63cea9e4906
disk_allocation_ratio: 0

The active compute node entry is:
created_at: 2017-09-12 15:53:26
updated_at: 2017-09-13 10:26:26
deleted_at: NULL
id: 101
service_id: NULL
vcpus: 64
memory_mb: 262144
local_gb: 222
vcpus_used: 0
memory_mb_used: 0
local_gb_used: 0
hypervisor_type: ironic
hypervisor_version: 1
cpu_info:
disk_available_least: 222
free_ram_mb: 262144
free_disk_gb: 222
current_workload: 0
running_vms: 0
hypervisor_hostname: 5d1535b1-0984-42b3-a574-a62afddd9307
deleted: 0
host_ip: 10.105.1.6
supported_instances: [["x86_64", "baremetal", "hvm"]]
pci_stats: {"nova_object.version": "1.1", "nova_object.changes": ["objects"], "nova_object.name": "PciDevicePoolList", "nova_object.data": {"objects": []}, "nova_object.namespace": "nova"}
metrics: []
extra_resources: NULL
stats: {"cpu_vt": "true", "cpu_arch": "x86_64", "cpu_hugepages": "true", "boot_option": "local", "cpu_txt": "true", "cpu_aes": "true", "cpu_hugepages_1g": "true"}
numa_topology: NULL
host: kef1p-phycon0003-ironic
ram_allocation_ratio: 1
cpu_allocation_ratio: 0
uuid: 22787651-ab4a-4c8b-b72b-5e20bb3fad2c
disk_allocation_ratio: 0

They are both mapped to the same nova compute service, kef1p-phycon0003-ironic.

Picking out some events from the logs in the run up to the compute node switchover:

* Creation of an instance on the ironic node under question was aborted.
* Cleaning up the instance failed due to maxing out the ironic API retries.
  2017-09-12 15:51:17.846 7 WARNING nova.compute.manager [req-3dbcf213-dddd-4462-a1aa-6cfa697449fd 7dc9cbb312404469b3d2ea983387181f 12539bef11ff48d2a04e9cf8c13ac7c3 - - -] Could not clean up failed build, not rescheduling. Error: Node 5d1535b1-0984-42b3-a574-a62afddd9307 is locked by host kef1p-phycon0002, please retry after the current operation is completed. (HTTP 409)
* Nova compute deletes the compute node, claiming it is orphaned.
  2017-09-12 15:51:23.894 7 INFO nova.compute.manager [req-569e86cc-a2c6-4043-8efa-ea31e14d86dc - - - - -] Deleting orphan compute node 70 hypervisor host is 5d1535b1-0984-42b3-a574-a62afddd9307, nodes are set([u'60c1ee36-b49d-4350-9e4e-e1995a289b2b', u'eb9e200f-0702-4668-849c-6e46a2864e9c'])
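
For anyone wanting to check whether they are in the same state, the mismatch between the resource provider and the active compute node can be spotted mechanically. Below is a minimal Python sketch, not nova code. It assumes the rows from the resource_providers and compute_nodes tables have been exported as plain dicts keyed by the column names shown in this comment. The helper name find_stale_providers is hypothetical. It reports any resource provider whose name matches an active compute node's hypervisor_hostname but whose UUID differs.

```python
def find_stale_providers(resource_providers, compute_nodes):
    """Return (provider, node) pairs whose names match but whose UUIDs differ.

    Hypothetical helper for illustration only. Rows are plain dicts that use
    the column names of the resource_providers and compute_nodes tables.
    """
    # Index active (not soft-deleted) compute nodes by hypervisor_hostname.
    active = {
        node["hypervisor_hostname"]: node
        for node in compute_nodes
        if node["deleted"] == 0
    }

    stale = []
    for rp in resource_providers:
        node = active.get(rp["name"])
        # Same name, different UUID: the provider belongs to an older
        # compute node record and blocks creation of a new provider.
        if node is not None and node["uuid"] != rp["uuid"]:
            stale.append((rp, node))
    return stale


# Values copied from the rows shown in this comment.
resource_providers = [
    {
        "id": 88,
        "uuid": "2f786d5d-169f-49b7-880f-d63cea9e4906",
        "name": "5d1535b1-0984-42b3-a574-a62afddd9307",
    },
]
compute_nodes = [
    {
        "id": 70,
        "deleted": 70,  # soft-deleted record
        "uuid": "2f786d5d-169f-49b7-880f-d63cea9e4906",
        "hypervisor_hostname": "5d1535b1-0984-42b3-a574-a62afddd9307",
    },
    {
        "id": 101,
        "deleted": 0,  # active record
        "uuid": "22787651-ab4a-4c8b-b72b-5e20bb3fad2c",
        "hypervisor_hostname": "5d1535b1-0984-42b3-a574-a62afddd9307",
    },
]

for rp, node in find_stale_providers(resource_providers, compute_nodes):
    print(
        f"resource provider id={rp['id']} uuid={rp['uuid']} is named "
        f"{rp['name']}, but active compute node id={node['id']} "
        f"has uuid={node['uuid']}"
    )
```

With the rows above, the sketch flags resource provider 88 against compute node 101, which matches the 409 seen on POST /resource_providers.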