removing a unit leaves a stale provider entry in the nova_api.resource_providers table

Bug #1873521 reported by Andrea Ieri
Affects: OpenStack Nova Compute Charm
Status: Triaged
Importance: Low
Assigned to: Unassigned

Bug Description

In bionic-queens, with nova-compute charm v269, removing a nova-compute unit leaves a stale entry in the nova_api.resource_providers table.

Redeploying a new nova-compute unit onto the same machine (or possibly onto any new machine reusing the same hostname) will then yield a non-functional hypervisor.

Although charm installation will succeed (aside from other reinstallation-related bugs), no instance will be scheduled on the new compute, and targeted instantiations (e.g. `openstack server create --availability-zone nova:<hostname>`) will fail with a NoValidHost error.

The following will then be seen in nova-compute.log on the redeployed unit:

2020-04-17 19:05:29.637 1353458 ERROR nova.scheduler.client.report [req-16e06901-8cfa-4078-8647-a33a96fbd63c - - - - -] [req-2a8e8172-8d35-49ec-b1c7-7a0213b11486] Failed to create resource provider record in placement API for UUID e4a8a74e-db7c-4c66-84ca-92aa61e015e9. Got 409: {"errors": [{"status": 409, "request_id": "req-2a8e8172-8d35-49ec-b1c7-7a0213b11486", "detail": "There was a conflict when trying to complete your request.\n\n Conflicting resource provider name: <REDACTED> already exists. ", "title": "Conflict"}]}.
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager [req-16e06901-8cfa-4078-8647-a33a96fbd63c - - - - -] Error updating resources for node <REDACTED>.: ResourceProviderCreationFailed: Failed to create resource provider <REDACTED>
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager Traceback (most recent call last):
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/compute/manager.py", line 7540, in update_available_resource_for_node
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager rt.update_available_resource(context, nodename)
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py", line 706, in update_available_resource
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager self._update_available_resource(context, resources)
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/oslo_concurrency/lockutils.py", line 277, in inner
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager return f(*args, **kwargs)
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py", line 782, in _update_available_resource
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager self._update(context, cn)
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/compute/resource_tracker.py", line 904, in _update
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager inv_data,
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/__init__.py", line 68, in set_inventory_for_provider
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager parent_provider_uuid=parent_provider_uuid,
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/__init__.py", line 37, in __run_method
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager return getattr(self.instance, __name)(*args, **kwargs)
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/report.py", line 1104, in set_inventory_for_provider
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager parent_provider_uuid=parent_provider_uuid)
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/report.py", line 665, in _ensure_resource_provider
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager parent_provider_uuid=parent_provider_uuid)
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/report.py", line 64, in wrapper
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager return f(self, *a, **k)
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager File "/usr/lib/python2.7/dist-packages/nova/scheduler/client/report.py", line 612, in _create_resource_provider
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager raise exception.ResourceProviderCreationFailed(name=name)
2020-04-17 19:05:29.638 1353458 ERROR nova.compute.manager ResourceProviderCreationFailed: Failed to create resource provider <REDACTED>
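
One way to confirm the stale entry before touching anything (a quick check, assuming the osc-placement client plugin is installed and the provider is named after the hypervisor FQDN, e.g. <hostname>.maas):

# Placement still lists a provider under the hypervisor's name; its UUID will not
# match the UUID the redeployed nova-compute is trying to create (e4a8a74e-... above),
# hence the 409 Conflict.
$ openstack resource provider list --name <hostname>.maas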

Workaround:
Manually delete the stale provider (requires a recent osc-placement client):
name=<hostname of the redeployed unit>.maas
openstack resource provider delete "$(openstack resource provider list --name $name -cuuid -fvalue)"
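
A slightly more defensive variant of the same workaround (just a sketch; it assumes the name filter matches exactly one provider):

$ name=<hostname of the redeployed unit>.maas
$ uuid=$(openstack resource provider list --name "$name" -c uuid -f value)
# Refuse to delete unless exactly one UUID came back.
$ [ "$(printf '%s\n' "$uuid" | grep -c .)" -eq 1 ] && openstack resource provider delete "$uuid"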

Tags: scaleback sts
tags: added: scaleback
Changed in charm-nova-compute:
status: New → Triaged
importance: Undecided → Medium
Revision history for this message
Andrea Ieri (aieri) wrote :

Here are more details about the workaround. If the provider deletion fails because it still has allocations, you need to clean those up first.

Connect to the nova_api db, then:

Find the resource provider id
mysql> select * from resource_providers where name='<hypervisor>';

Find allocations for that provider id (or use a join)
mysql> select * from allocations where resource_provider_id='<id>';
+---------------------+------------+--------+----------------------+--------------------------------------+-------------------+------+
| created_at | updated_at | id | resource_provider_id | consumer_id | resource_class_id | used |
+---------------------+------------+--------+----------------------+--------------------------------------+-------------------+------+
| 2019-10-09 08:59:27 | NULL | 124441 | 17 | 0b48fced-2ed2-4cbc-8ac7-78c71227924c | 0 | 1 |
| 2019-10-09 09:00:01 | NULL | 124459 | 17 | fddeeafa-b460-4103-b561-86dff5f78125 | 0 | 1 |
| 2019-10-09 09:00:25 | NULL | 124477 | 17 | 47c5c56e-c901-40ea-9df3-293c30d9721f | 0 | 1 |
| 2019-10-09 08:59:27 | NULL | 124444 | 17 | 0b48fced-2ed2-4cbc-8ac7-78c71227924c | 1 | 512 |
| 2019-10-09 09:00:01 | NULL | 124462 | 17 | fddeeafa-b460-4103-b561-86dff5f78125 | 1 | 512 |
| 2019-10-09 09:00:25 | NULL | 124480 | 17 | 47c5c56e-c901-40ea-9df3-293c30d9721f | 1 | 512 |
| 2019-10-09 08:59:27 | NULL | 124447 | 17 | 0b48fced-2ed2-4cbc-8ac7-78c71227924c | 2 | 20 |
| 2019-10-09 09:00:01 | NULL | 124465 | 17 | fddeeafa-b460-4103-b561-86dff5f78125 | 2 | 20 |
| 2019-10-09 09:00:25 | NULL | 124483 | 17 | 47c5c56e-c901-40ea-9df3-293c30d9721f | 2 | 20 |
+---------------------+------------+--------+----------------------+--------------------------------------+-------------------+------+
9 rows in set (0.00 sec)

Now you can delete the allocations with `openstack resource provider allocation delete <consumer_id>`
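
For reference, the same cleanup expressed as a small shell loop (a sketch; the consumer UUIDs are the distinct consumer_id values from the query above, and a recent osc-placement client is assumed):

# Drop the allocations held by each consumer (instance) on the stale provider,
# then retry the provider deletion.
$ for consumer in 0b48fced-2ed2-4cbc-8ac7-78c71227924c \
                  fddeeafa-b460-4103-b561-86dff5f78125 \
                  47c5c56e-c901-40ea-9df3-293c30d9721f; do
      openstack resource provider allocation delete "$consumer"
  done
$ openstack resource provider delete "$(openstack resource provider list --name "$name" -c uuid -f value)"

As the later comments note, doing this while the instances are still running can leave nova's accounting (running_vms) inconsistent.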

Revision history for this message
Andrea Ieri (aieri) wrote :

NOTE: after applying the workaround above, I was able to boot instances on that compute, but I've now noticed that the number of reported running VMs is incorrect:

$ openstack server list --host <host> --all -fvalue | wc -l
63
$ openstack hypervisor show -crunning_vms -fvalue <hypervisor>
5

I suspect this might confuse the placement API and cause oversubscription, so use the workaround at your own risk...

Revision history for this message
Andrea Ieri (aieri) wrote :

Adding field-critical because my workaround has proven only partially successful, and a fully functional one will be needed even after this bug is fixed.

Revision history for this message
James Page (james-page) wrote :

I suspect that the resource provider record needs to be purged as part of the removal of the unit from the deployment but I've not been able to validate that yet.

Revision history for this message
James Page (james-page) wrote :

@aieri - were all of the instances running on the hypervisor destroyed/migrated before the machine was recycled in this way?

Revision history for this message
Andrea Ieri (aieri) wrote :

@james-page - it's very likely that in this second instance they were not.

I filed this bug in April, when I worked on the redeployment of two computes that had been fully evacuated. I applied the workaround, and things seemed (and still seem) to be working correctly. The running_vms count, server list, and virsh list all agree.

I then encountered this bug a second time a few days ago, under a different scenario:
* running_vms being reported as None
* new VMs cannot be started due to the provider conflict
* a good 50 VMs or so running on this compute

When I applied the workaround, running_vms suddenly jumped to a seemingly random number (the 5 you see above). New VMs booted on that compute appeared to correctly +1 the counter, but most of the old ones were, and still are, unaccounted for.

Although I can't find specific records of what was done to this machine, the above seems to imply that the charm redeployment was performed before evacuating the running instances.

Also please note that we have a second compute in this cloud exhibiting this behavior: running_vms=None, server list reports 60 VMs, new instance creation fails. I have not "fixed" it yet because of these uncertainties around the workaround. Let me know if I can use it to gather useful information to help you fix this bug.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

I've confirmed that the resource provider for a compute node still exists after removing a nova-compute unit. So per James' comment #4, it seems the charm should carefully handle that cleanup, perhaps blocking the unit's removal if the resource provider has allocations.

e.g. a CLI attempt to delete the resource provider - the charm would block in this case:
$ openstack resource provider delete 6e5346ae-6f25-4da0-a5c1-06d07331e2e7
Unable to delete resource provider 6e5346ae-6f25-4da0-a5c1-06d07331e2e7: Resource provider has allocations. (HTTP 409)

That would prevent future users from getting into the situation that Andrea appears to have gotten into, where a compute is removed prior to evacuating instances, and would also clean up the resource provider and (presumably; needs testing) enable a new unit to be deployed to the same machine.
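
Purely as an illustration of that check (not actual charm code), the logic could look roughly like this; a failing delete corresponds to the HTTP 409 above and would translate into blocking the unit removal:

$ uuid=$(openstack resource provider list --name <hostname>.maas -c uuid -f value)
$ if openstack resource provider delete "$uuid"; then
      echo "resource provider removed; safe to finish removing the unit"
  else
      echo "resource provider still has allocations; migrate instances first" >&2
  fi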

With that said, if instances didn't get migrated before removing the compute node, I don't think any charm fixes are going to help Andrea's current situation. There may be some manual DB cleanup required to get rid of the lost instances.

I'm attaching some related testing notes.

Revision history for this message
Andrea Ieri (aieri) wrote :

Just a quick note: there are no lost instances to get rid of. It's rather the opposite: the running_vms counter is under-representing how many VMs are actually running on this compute, even though the nova DB seems to know about every one of them.

root@<host>:~# virsh list --all --name | grep -cvE '^$'
70

$ openstack server list --host <host> --all -fvalue | wc -l
70

$ openstack hypervisor show -crunning_vms -fvalue <host>
12

Is there a way to force a recalculation of running_vms?

Revision history for this message
Corey Bryant (corey.bryant) wrote :

It seems that a compute node has been removed before evacuating it, and you want to re-introduce those virtual machines back into OpenStack. I don't think OpenStack is designed to be used this way. If you remove a compute node that has instances on it, the database will still have those instances and virsh will still list them, but I don't think there is any expectation that you can re-deploy a nova-compute with the same hostname and have those instances be available again via nova. You are likely seeing confusing DB counts because the nova database is not designed to be used this way, so it may recognize instances with one command but not another.

Revision history for this message
Corey Bryant (corey.bryant) wrote :

I tried recreating this and it actually worked ok for me. I did have to wait a bit for nova host discovery after deploying the new nova-compute to the existing machine. After discovery I was able to deploy a new instance on the new compute.

Steps taken:

1) deploy bionic-queens bundle with converged nova-compute/ceph-osd (3 units each)
2) create 2 instances on nova-compute/0
3) juju remove-unit nova-compute/0 (from machine 1)
4) juju add-unit nova-compute --to 1
5) wait for host discovery (see the note below)
6) create a new instance on nova-compute/0

Note that I didn't delete the resource provider.
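
For step 5, rather than waiting for the periodic task, discovery can also be triggered by hand from the cloud controller (a sketch; the nova-cloud-controller unit name is assumed):

$ juju ssh nova-cloud-controller/0 'sudo nova-manage cell_v2 discover_hosts --verbose'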

Revision history for this message
Corey Bryant (corey.bryant) wrote :

@Andrea, I'm going to move this to Low because removing a unit does leave a provider entry in the nova_api.resource_providers table. That can be seen with the following:

openstack resource provider list

mysql> use nova_api;
mysql> select * from resource_providers;

The charm could potentially clean up the resource provider on unit removal. I think that requires more consideration, specifically considering 2 options:

1) In comment #10, when attempting to recreate this, removing a nova-compute unit without cleaning up the corresponding resource provider still allowed the unit (with instances on it) to be removed, and those instances were recovered by adding a nova-compute unit back to the same machine.

2) Having the charm clean up the resource provider would require the charm to block on unit removal until all instances are migrated, i.e. block until something like [1] succeeds (sketched below). Deletion should succeed once the instances (i.e. allocations) are migrated to a new compute.

[1]
$ openstack resource provider delete ea271f8f-d0db-43ba-a5e6-6271f35382ea
Unable to delete resource provider ea271f8f-d0db-43ba-a5e6-6271f35382ea: Resource provider has allocations. (HTTP 409)
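
To illustrate option 2, the removal path would effectively have to wait for the allocations to drain before the delete can succeed; roughly (a sketch, reusing the provider UUID from [1]):

# Keep retrying until the instances (allocations) have been migrated off;
# only then does the provider delete succeed and the unit removal proceed.
$ until openstack resource provider delete ea271f8f-d0db-43ba-a5e6-6271f35382ea; do
      echo "resource provider still has allocations; waiting for migrations..."
      sleep 60
  done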

Changed in charm-nova-compute:
importance: Medium → Low
Revision history for this message
Adam Dyess (addyess) wrote :

I encountered a situation where I had to apply this workaround, and doing so successfully allowed the compute to begin hosting instances again.

There was a host happily running compute and ceph when a user turned the machine off for maintenance without warning. RAM was upgraded, firmware applied, and a week later the machine was turned back on. In one day the machine was rebooted and shut down several times outside my control. Finally, about 2 weeks later we were able to remove the units from the device and ceph out the storage. We even removed the machine from Juju so that it was no longer deployed in MAAS.

Skip forward 2 more weeks, and the host is repaired and ready to be re-integrated into the cloud.
We deployed nova-compute-kvm to the machine but could not enable the compute service:

---------------------------------------------------------
$ openstack compute service set --debug --enable my-broken-host.maas nova-compute
....
RESP BODY: {"itemNotFound": {"code": 404, "message": "Host 'my-broken-host.maas' is not mapped to any cell"}}
osc_lib.exceptions.CommandError: Compute service nova-compute of host my-broken-host.maas failed to set.
--------------------------------------------------------

This was failing because the nova db was in a confused state about whether or not this host was in service.

--------------------------------------------------------
$ openstack host list | grep my-broken-host # shows that host isn't there.
$ juju ssh n-c-c/1
$ sudo nova-manage cell_v2 discover_hosts --verbose
# doesn't find any unmapped cells
# searching in the mysql compute_nodes table -- I found this: so it IS mapped? Wonder when it was mapped
mysql> select host,host_ip,hypervisor_hostname,mapped from compute_nodes where host='my-broken-host';
+----------------+---------------+--------------------------------------+--------+
| host | host_ip | hypervisor_hostname | mapped |
+----------------+---------------+--------------------------------------+--------+
| my-broken-host | 10.20.175.229 | my-broken-host.maas | 1 |
+----------------+---------------+--------------------------------------+--------+

back in n-c-c/1
$ nova-manage cell_v2 delete_host --cell_uuid bf52f53c-81d1-403a-b966-cf5f676a26c9 --host my-broken-host && echo $?
0 # this means the host was found and deleted successfully
$ nova-manage cell_v2 discover_hosts --verbose
Deprecated: Option "logdir" from group "DEFAULT" is deprecated. Use option "log-dir" from group "DEFAULT".
Found 2 cell mappings.
Skipping cell0 since it does not contain hosts.
Getting computes from cell 'cell1': bf52f53c-81d1-403a-b966-cf5f676a26c9
Found 0 unmapped computes in cell: bf52f53c-81d1-403a-b966-cf5f676a26c9
# this isn't great -- where's my new compute host?
--------------------------------------------------------

More work

--------------------------------------------------------
infra $ openstack server list --all-projects --host my-broken-host
infra $ juju remove-unit nova-compute-kvm/80
infra $ #wait -- ok the unit and its subordinates are gone!
infra $ openstack compute service list | grep my-broken-host
infra $ openstack host list |...

Read more...

Revision history for this message
Adam Dyess (addyess) wrote :

I encountered the same stack trace for a number of days on a host (the logs have since rotated off, so I cannot determine the point of origin), but applying the same workaround resolved the issue.

Changed in charm-nova-compute:
assignee: nobody → Hemanth Nakkina (hemanth-n)
tags: added: sts
Revision history for this message
Edward Hope-Morley (hopem) wrote :

Deleting the compute service via the API should fix this as long as you have the fix from bug 1756179. Of course, the question now is when and where to do that delete (safely).
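
For reference, the manual form of that API deletion (a sketch; placeholder hostname assumed), which with the bug 1756179 fix in place should also clean up the matching resource provider:

$ openstack compute service list --service nova-compute --host <hostname>
$ openstack compute service delete <service id>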

Revision history for this message
Hemanth Nakkina (hemanth-n) wrote :

Deletion of the compute service via the API (or of the resource provider/allocations) cannot be done in the stop/remove hook, since the nova-compute service is already broken by that time (its configuration has been modified because the relations are broken).

Another way to achieve the deletion is to make use of the nova-compute peer relation-departed hook and perform the compute service deletion from a peer nova-compute unit (or make use of the n-c-c relation-departed hooks).
However, blocking unit removal when compute service deletion fails (for example because instances are still running on the compute node) cannot be achieved with that approach.

I raised an RFE on Juju for a new hook to perform application checks on unit before *-relation-departed. https://bugs.launchpad.net/juju/+bug/1884475

Any further thoughts are welcome!

Revision history for this message
Michael Skalka (mskalka) wrote :

crit -> high as this has a workaround

Changed in charm-nova-compute:
assignee: Hemanth Nakkina (hemanth-n) → nobody