Failed to create resource provider

Bug #1876772 reported by YG Kumar
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Invalid
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Hi,

We have an OpenStack-Ansible Rocky 18.1.2 setup. It was working fine. As part of an upgrade
we reinstalled it. Now we are unable to create any VMs. We have four compute nodes.
When we try to create a VM on a specific compute node it fails, showing "no host found" in the nova-conductor and nova-scheduler logs. Looking at the nova-compute log on that compute node, we found the following errors:

-------------
2020-05-04 12:59:55.800 11245 ERROR nova.scheduler.client.report [req-40cd6d9d-3fb5-4a85-b348-53705f22148e - - - - -] [req-68c57117-a4b0-46b2-a75b-36e8d665da38] Failed to create resource provider record in placement API for UUID 7e6fb27c-ed5b-4c2c-8373-c99e98da7bcc. Got 409: {"errors": [{"status": 409, "request_id": "req-68c57117-a4b0-46b2-a75b-36e8d665da38", "detail": "There was a conflict when trying to complete your request.\n\n Conflicting resource provider name: b2b.blr.example.cloud already exists. ", "title": "Conflict"}]}.

2020-05-04 12:59:55.801 11245 ERROR nova.compute.manager raise exception.ResourceProviderCreationFailed(name=name)
2020-05-04 12:59:55.801 11245 ERROR nova.compute.manager ResourceProviderCreationFailed: Failed to create resource provider b2b.blr.example.cloud
2020-05-04 12:59:55.801 11245 ERROR nova.compute.manager
----------------

Can you help us solve this error?

Thanks
Kumar

Revision history for this message
Robin Cernin (rcernin) wrote :

Hi,

Adding the missing traceback you shared with me:

2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager [req-34c9d727-3683-45bd-824c-569a9dd8124d - - - - -] Error updating resources for node b1b.blr.example.cloud.: ResourceProviderCreationFailed: Failed to create resource provider b1b.blr.example.cloud
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager Traceback (most recent call last):
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager File "/openstack/venvs/nova-18.1.2/lib/python2.7/site-packages/nova/compute/manager.py", line 7778, in _update_available_resource_for_node
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager rt.update_available_resource(context, nodename)
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager File "/openstack/venvs/nova-18.1.2/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 721, in update_available_resource
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager self._update_available_resource(context, resources)
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager File "/openstack/venvs/nova-18.1.2/lib/python2.7/site-packages/oslo_concurrency/lockutils.py", line 274, in inner
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager return f(*args, **kwargs)
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager File "/openstack/venvs/nova-18.1.2/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 798, in _update_available_resource
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager self._update(context, cn)
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager File "/openstack/venvs/nova-18.1.2/lib/python2.7/site-packages/retrying.py", line 49, in wrapped_f
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager return Retrying(*dargs, **dkw).call(f, *args, **kw)
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager File "/openstack/venvs/nova-18.1.2/lib/python2.7/site-packages/retrying.py", line 206, in call
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager return attempt.get(self._wrap_exception)
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager File "/openstack/venvs/nova-18.1.2/lib/python2.7/site-packages/retrying.py", line 247, in get
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager six.reraise(self.value[0], self.value[1], self.value[2])
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager File "/openstack/venvs/nova-18.1.2/lib/python2.7/site-packages/retrying.py", line 200, in call
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager File "/openstack/venvs/nova-18.1.2/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 960, in _update
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager self._update_to_placement(context, compute_node)
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager File "/openstack/venvs/nova-18.1.2/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 912, in _update_to_placement
2020-05-05 00:56:53.160 29284 ERROR nova.compute.manager context, compute_nod...


Revision history for this message
Robin Cernin (rcernin) wrote :

It is also highly recommended to take a backup of the DB before making any changes to your resource providers, so you still have an easy recovery option if nothing else works for you.

Revision history for this message
YG Kumar (ygk-kmr) wrote :

I have removed the entries from the resource_providers table in the nova_api DB and it picked up the computes, so the issue seems to be gone for now. Now there is a different AMQP issue with nova-conductor: the nova user's authentication from nova-conductor to the RabbitMQ server fails only when launching an instance, but the same authentication works fine from the conductor at other times while the service is running.
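
For reference, a less invasive way to do the same cleanup is through the placement API rather than direct SQL. A rough sketch (the auth URL, credentials and microversion below are placeholders, not values from this deployment; assumes keystoneauth1 is installed):

-------------
# Sketch: remove the stale provider via the placement API instead of editing
# the nova_api tables directly. Placeholder credentials/URLs, adjust to taste.
from keystoneauth1.identity import v3
from keystoneauth1 import session

auth = v3.Password(auth_url='http://keystone.example.cloud:5000/v3',
                   username='admin', password='secret', project_name='admin',
                   user_domain_name='Default', project_domain_name='Default')
sess = session.Session(auth=auth)

placement = {'service_type': 'placement'}
headers = {'OpenStack-API-Version': 'placement 1.17'}

# Find the provider that conflicts on name.
resp = sess.get('/resource_providers?name=b2b.blr.example.cloud',
                endpoint_filter=placement, headers=headers)

# Delete the stale record; note placement also answers 409 to the DELETE if
# the provider still has allocations, in which case remove those first.
for rp in resp.json()['resource_providers']:
    sess.delete('/resource_providers/%s' % rp['uuid'],
                endpoint_filter=placement, headers=headers)
-------------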

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :

"As part of an upgrade we have reinstalled it."

What does "reinstalled" mean here?

From the attached error it seems that during the reinstall you purged some of the DBs (like nova) but did not purge others (like placement). So nova-compute re-creates the compute node and then the compute resource provider (RP) in placement. The new compute node UUID becomes the UUID of the RP and the hypervisor hostname becomes the name of the RP. Since the compute node is re-created it gets a new UUID while the hostname stays the same, so placement rejects the RP creation because both the UUID and the name must be unique.
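
One way to confirm this would be to compare the two UUIDs directly; a rough sketch (auth URL and credentials are placeholders, compute microversion 2.53+ is assumed so the hypervisor "id" field is the compute node UUID):

-------------
# Sketch: compare the UUID nova now has for the compute node with the UUID
# placement still holds for the provider of the same name.
from keystoneauth1.identity import v3
from keystoneauth1 import session

auth = v3.Password(auth_url='http://keystone.example.cloud:5000/v3',
                   username='admin', password='secret', project_name='admin',
                   user_domain_name='Default', project_domain_name='Default')
sess = session.Session(auth=auth)

host = 'b2b.blr.example.cloud'

nova = sess.get('/os-hypervisors?hypervisor_hostname_pattern=%s' % host,
                endpoint_filter={'service_type': 'compute'},
                headers={'OpenStack-API-Version': 'compute 2.53'}).json()
plc = sess.get('/resource_providers?name=%s' % host,
               endpoint_filter={'service_type': 'placement'},
               headers={'OpenStack-API-Version': 'placement 1.17'}).json()

print('compute node uuid:', nova['hypervisors'][0]['id'])
print('provider uuid    :', plc['resource_providers'][0]['uuid'])
# If the two differ, the provider record predates the reinstall and the
# name conflict in the log above is expected.
-------------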

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :

I talked to the reporter over IRC and he said that the RP problem is resolved now. I still think it was a reinstall cleanup problem. The rabbit problem was a reinstall problem as well: old data was left in the cell_mappings table in the nova_api database. Closing this ticket as not a bug.
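
For anyone hitting the same rabbit symptom: the stale data in question is the transport_url stored per cell. A quick way to inspect it (sketch; the connection string is a placeholder, assumes SQLAlchemy and PyMySQL are available):

-------------
# Sketch: list the AMQP transport_url that each cell mapping points at.
from sqlalchemy import create_engine, text

engine = create_engine('mysql+pymysql://nova_api:secret@db.example.cloud/nova_api')
with engine.connect() as conn:
    rows = conn.execute(text('SELECT uuid, name, transport_url FROM cell_mappings'))
    for cell_uuid, name, transport_url in rows:
        # A transport_url still carrying the pre-reinstall RabbitMQ
        # user/password is what makes nova-conductor's AMQP auth fail.
        print(cell_uuid, name, transport_url)
-------------

If the transport_url is stale, refreshing it (e.g. with nova-manage cell_v2 update_cell, which re-reads it from the current nova.conf) and restarting the conductors should clear the AMQP authentication failures.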

Changed in nova:
status: New → Invalid