Failed to delete compute service

Bug #1964137 reported by josemiguel.espadero
This bug affects 3 people
Affects: OpenStack Compute (nova)
Status: Expired
Importance: Undecided
Assigned to: Unassigned

Bug Description

I temporarily added a new compute host to my OpenStack Wallaby cloud. This is not the first time I have done this, but it is the first time it has failed.

To add the compute host, I installed the nova-compute and neutron-linuxbridge services on it, and then updated the cloud by running "nova-manage cell_v2 discover_hosts" on my controller.

It worked well, but then I decided to remove the host, so I looked up the ID of its compute service with the command

openstack compute service list

I then ran this command to delete the host from the cloud:

openstack compute service delete 43

and got this error:

Failed to delete compute service with ID '43': Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'ValueError'> (HTTP 500) (Request-ID: req-82511bfa-660d-41de-ab8a-c1b2dc78d623)
1 of 1 compute services failed to delete.

What can I do?

The last lines of my /var/log/nova/nova-api.log file are:

2022-03-08 15:23:31.136 2569818 INFO nova.osapi_compute.wsgi.server [req-b0f02718-bbbc-4cd5-a936-312aa47cca97 d2fba82966a44a82b2624f75f65f9ad0 3de2426941b943fd95633b0dbaedfd3f - default default] 10.100.133.11 "GET /v2.1/3de2426941b943fd95633b0dbaedfd3f/os-services HTTP/1.1" status: 200 len: 3113 time: 0.1815200
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi [req-82511bfa-660d-41de-ab8a-c1b2dc78d623 d2fba82966a44a82b2624f75f65f9ad0 3de2426941b943fd95633b0dbaedfd3f - default default] Unexpected exception in API method: ValueError: No such provider e224cb92-c659-4fed-81e9-5ce7ee2832fd
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi Traceback (most recent call last):
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi File "/usr/lib/python3/dist-packages/nova/api/openstack/wsgi.py", line 658, in wrapped
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi return f(*args, **kwargs)
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi File "/usr/lib/python3/dist-packages/nova/api/openstack/compute/services.py", line 286, in delete
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi self.placementclient.delete_resource_provider(
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi File "/usr/lib/python3/dist-packages/nova/scheduler/client/report.py", line 2257, in delete_resource_provider
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi provider_uuids = self._provider_tree.get_provider_uuids_in_tree(
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi File "/usr/lib/python3/dist-packages/nova/compute/provider_tree.py", line 288, in get_provider_uuids_in_tree
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi return self._find_with_lock(
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi File "/usr/lib/python3/dist-packages/nova/compute/provider_tree.py", line 439, in _find_with_lock
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi raise ValueError(_("No such provider %s") % name_or_uuid)
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi ValueError: No such provider e224cb92-c659-4fed-81e9-5ce7ee2832fd
2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi
2022-03-08 15:23:40.546 2569811 INFO nova.api.openstack.wsgi [req-82511bfa-660d-41de-ab8a-c1b2dc78d623 d2fba82966a44a82b2624f75f65f9ad0 3de2426941b943fd95633b0dbaedfd3f - default default] HTTP exception thrown: Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'ValueError'>
2022-03-08 15:23:40.548 2569811 INFO nova.osapi_compute.wsgi.server [req-82511bfa-660d-41de-ab8a-c1b2dc78d623 d2fba82966a44a82b2624f75f65f9ad0 3de2426941b943fd95633b0dbaedfd3f - default default] 10.100.133.11 "DELETE /v2.1/3de2426941b943fd95633b0dbaedfd3f/os-services/43 HTTP/1.1" status: 500 len: 617 time: 1.2662396
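
The traceback ends in a lookup of the compute node's UUID in Nova's in-memory provider tree. The following is a simplified stand-in for that lookup, not Nova's actual code: the class `TinyProviderTree` and the helper `describe_lookup` are hypothetical names used only for illustration. It sketches why an unknown UUID surfaces as a bare ValueError, which the log above shows being reported as an HTTP 500.

```python
class TinyProviderTree:
    """Hypothetical, simplified stand-in for a provider tree keyed by UUID."""

    def __init__(self):
        # Maps a root provider UUID to the UUIDs of its child providers.
        self._children = {}

    def add_root(self, uuid):
        self._children.setdefault(uuid, [])

    def get_provider_uuids_in_tree(self, uuid):
        # An unknown root is reported with a ValueError, mirroring the
        # message seen in the log ("No such provider <uuid>").
        if uuid not in self._children:
            raise ValueError("No such provider %s" % uuid)
        return [uuid] + self._children[uuid]


def describe_lookup(tree, uuid):
    """Return the UUIDs in the tree, or the error text if the root is unknown."""
    try:
        return tree.get_provider_uuids_in_tree(uuid)
    except ValueError as exc:
        return str(exc)
```

With only `5b9853dc-a3be-466c-8ea2-2cea2e27d16c` registered as a root, looking up `e224cb92-c659-4fed-81e9-5ce7ee2832fd` returns the same "No such provider" text that appears in the log.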

The output of the command "openstack --debug compute service delete 43" is (I have omitted the first lines, which seem OK):

Instantiating compute client for API Version Major: 2, Minor: 1
Instantiating compute api: <class 'openstackclient.api.compute_v2.APIv2'>
REQ: curl -g -i -X DELETE http://10.100.133.11:8774/v2.1/3de2426941b943fd95633b0dbaedfd3f/os-services/43 -H "Accept: application/json" -H "User-Agent: python-novaclient" -H "X-Auth-Token: {SHA256}8487771e5f65fa77e5306e561573afd56b1810bea648198c2f05e70e6bae3a9c" -H "X-OpenStack-Nova-API-Version: 2.1"
Starting new HTTP connection (1): 10.100.133.11:8774
http://10.100.133.11:8774 "DELETE /v2.1/3de2426941b943fd95633b0dbaedfd3f/os-services/43 HTTP/1.1" 500 184
RESP: [500] Connection: keep-alive Content-Length: 184 Content-Type: application/json; charset=UTF-8 Date: Tue, 08 Mar 2022 14:42:22 GMT Openstack-Api-Version: compute 2.1 Vary: OpenStack-API-Version, X-OpenStack-Nova-API-Version X-Compute-Request-Id: req-01d6510b-7299-4713-92cc-8ec0315fc4fb X-Openstack-Nova-Api-Version: 2.1 X-Openstack-Request-Id: req-01d6510b-7299-4713-92cc-8ec0315fc4fb
RESP BODY: {"computeFault": {"code": 500, "message": "Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.\n<class 'ValueError'>"}}
DELETE call to compute for http://10.100.133.11:8774/v2.1/3de2426941b943fd95633b0dbaedfd3f/os-services/43 used request id req-01d6510b-7299-4713-92cc-8ec0315fc4fb
Failed to delete compute service with ID '43': Unexpected API Error. Please report this at http://bugs.launchpad.net/nova/ and attach the Nova API log if possible.
<class 'ValueError'> (HTTP 500) (Request-ID: req-01d6510b-7299-4713-92cc-8ec0315fc4fb)
1 of 1 compute services failed to delete.
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/cliff/app.py", line 401, in run_subcommand
    result = cmd.run(parsed_args)
  File "/usr/lib/python3/dist-packages/osc_lib/command/command.py", line 39, in run
    return super(Command, self).run(parsed_args)
  File "/usr/lib/python3/dist-packages/cliff/command.py", line 185, in run
    return_code = self.take_action(parsed_args) or 0
  File "/usr/lib/python3/dist-packages/openstackclient/compute/v2/service.py", line 62, in take_action
    raise exceptions.CommandError(msg)
osc_lib.exceptions.CommandError: 1 of 1 compute services failed to delete.
clean_up DeleteService: 1 of 1 compute services failed to delete.
END return value: 1

The output of the command "dpkg -l | grep nova" is:

ii nova-api 3:23.0.0-0ubuntu1~cloud0 OpenStack Compute - API frontend
ii nova-common 3:23.0.0-0ubuntu1~cloud0 OpenStack Compute - common files
ii nova-compute 3:23.0.0-0ubuntu1~cloud0 OpenStack Compute - compute node base
ii nova-compute-kvm 3:23.0.0-0ubuntu1~cloud0 OpenStack Compute - compute node (KVM)
ii nova-compute-libvirt 3:23.0.0-0ubuntu1~cloud0 OpenStack Compute - compute node libvirt support
ii nova-conductor 3:23.0.0-0ubuntu1~cloud0 OpenStack Compute - conductor service
ii nova-novncproxy 3:23.0.0-0ubuntu1~cloud0 OpenStack Compute - NoVNC proxy
ii nova-scheduler 3:23.0.0-0ubuntu1~cloud0 OpenStack Compute - virtual machine scheduler
ii python3-nova 3:23.0.0-0ubuntu1~cloud0 OpenStack Compute Python 3 libraries
ii python3-novaclient 2:17.4.0-0ubuntu1~cloud0 client library for OpenStack Compute API - 3.x

Sylvain Bauza (sylvain-bauza) wrote :

Are you sure placement-api was running correctly?

Could you please give us the output of "openstack resource provider list"?
It seems that you no longer have a Resource Provider for this compute node; if so, please restart the nova-compute service.

Changed in nova:
status: New → Incomplete
josemiguel.espadero (jespa2) wrote :

The command "openstack resource provider list" returns: 'resource provider list' is not an openstack command
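
The "resource provider" commands are not part of the base openstack client; they come from the separate osc-placement plugin, so this error usually means the plugin is not installed in the environment where the CLI runs. A quick check from Python might look like the sketch below. It assumes the plugin's importable module is named `osc_placement`; the helper name `module_available` is made up for this example.

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Assumption: the osc-placement plugin installs a module called osc_placement.
    print("osc_placement installed:", module_available("osc_placement"))
```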

The command "openstack service provider list" returns an empty string.

The output of the command "openstack service list" is:

+----------------------------------+-----------+-----------+
| ID                               | Name      | Type      |
+----------------------------------+-----------+-----------+
| 0266a71a5d5642b6ae7d117f52dfcdde | keystone  | identity  |
| 6158181eed464dbda74fdd5121df033d | cinderv3  | volumev3  |
| b091f756876d4238ae078ee8e9f39d40 | glance    | image     |
| ccb41370c5fe47d29c813e71d0b03e4d | cinderv2  | volumev2  |
| d38d0ce070f1479d9a4afe992dc5bdee | neutron   | network   |
| d78575109d5c43d297508b8d83b8a4ac | placement | placement |
| e6acc50fb7064df0b4b2b1b5ebff8f76 | nova      | compute   |
+----------------------------------+-----------+-----------+

The output of the "openstack service show placement" command is:
+-------------+-------------------------------------+
| Field       | Value                               |
+-------------+-------------------------------------+
| description | OpenStack Compute Placement service |
| enabled     | True                                |
| id          | d78575109d5c43d297508b8d83b8a4ac    |
| name        | placement                           |
| type        | placement                           |
+-------------+-------------------------------------+

josemiguel.espadero (jespa2) wrote :

Also, I restarted every service on the controller node using this script before filing the bug:

# Restart the web server, image, compute and volume services.
for service in apache2 glance-api nova-api nova-conductor nova-scheduler nova-novncproxy cinder-scheduler
do
    echo "systemctl restart $service"
    systemctl restart "$service"
done

# Restart the networking services, then the Nova services once more.
for service in neutron-server neutron-l3-agent neutron-dhcp-agent neutron-metadata-agent neutron-linuxbridge-agent nova-api nova-compute nova-conductor
do
    echo "systemctl restart $service"
    systemctl restart "$service"
done

josemiguel.espadero (jespa2) wrote :

And /var/log/placement directory is empty...

josemiguel.espadero (jespa2) wrote :

I'm debugging a dump of the openstack database generated with the command
mysqldump --databases nova

Database: nova
Table: compute_nodes

('2022-03-07 15:28:50', ... ,'catgpa01',1.5,16,'7727a631-86ba-4319-8bd3-f7e33dc0eb7a',1,1),
('2022-03-07 17:56:59', ... ,'catgpa01',1.5,16,'5b9853dc-a3be-466c-8ea2-2cea2e27d16c',1,1),
('2022-03-08 08:25:16', ... ,'catgpa01',1.5,16,'e224cb92-c659-4fed-81e9-5ce7ee2832fd',1,1);

The last row contains the UUID "e224cb92-c659-4fed-81e9-5ce7ee2832fd" listed in the error message that I got in the /var/log/nova/nova-api.log file:

2022-03-08 15:23:40.545 2569811 ERROR nova.api.openstack.wsgi [req-82511bfa-660d-41de-ab8a-c1b2dc78d623 d2fba82966a44a82b2624f75f65f9ad0 3de2426941b943fd95633b0dbaedfd3f - default default] Unexpected exception in API method: ValueError: No such provider e224cb92-c659-4fed-81e9-5ce7ee2832fd

Is it normal that the "catgpa01" node is listed several times in the nova.compute_nodes table?

Keep in mind that I added this node in the past and deleted it successfully with the command "openstack compute service delete 42".

However, the UUID in the last compute_nodes row does not match the identifier 5b9853dc-a3be-466c-8ea2-2cea2e27d16c that currently appears in the "placement" database:

Database: placement
Table: resource_providers

('2022-03-07 17:57:00',... ,17,'5b9853dc-a3be-466c-8ea2-2cea2e27d16c','catgpa01',6,17,NULL);

As you can see, the placement database contains the UUID assigned the previous time I added the catgpa01 host (the previous day).

Have I found a bug?

Is it normal that a host appears several times in the nova.compute_nodes table?

Is it normal that the UUID of a host in placement.resource_providers is not the one from its latest entry in the nova.compute_nodes table?

Is it safe to manually modify one of the databases to sync the values and then run the "openstack compute service delete" command?
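
The comparison between the two tables can be sketched as a small check. It assumes the UUID columns have already been pulled out of the two dumps into plain Python lists; the function name and the data layout are illustrative and not part of Nova or Placement. The UUIDs are the ones quoted in this comment.

```python
def uuids_missing_from_placement(compute_node_uuids, provider_uuids):
    """Return compute node UUIDs that have no matching resource provider."""
    known = set(provider_uuids)
    return [uuid for uuid in compute_node_uuids if uuid not in known]


# UUIDs of the three catgpa01 rows in nova.compute_nodes, oldest first.
compute_nodes = [
    "7727a631-86ba-4319-8bd3-f7e33dc0eb7a",
    "5b9853dc-a3be-466c-8ea2-2cea2e27d16c",
    "e224cb92-c659-4fed-81e9-5ce7ee2832fd",
]
# UUID of the single catgpa01 row in placement.resource_providers.
resource_providers = ["5b9853dc-a3be-466c-8ea2-2cea2e27d16c"]

missing = uuids_missing_from_placement(compute_nodes, resource_providers)
```

Here `missing` includes the newest UUID, e224cb92-c659-4fed-81e9-5ce7ee2832fd, which is the one named in the ValueError. The oldest UUID is reported as well; whether that row is soft-deleted cannot be seen from the truncated dump.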

josemiguel.espadero (jespa2) wrote :

I finally solved it by manually editing the database. I am posting my solution in case somebody runs into the same problem.

1. SSH to the controller node and launch mariadb

2. Remove the entries in the database "nova_api", table "host_mappings"

   USE nova_api;
   DELETE FROM host_mappings WHERE host='catgpa01';

3. Remove the entries in the database "placement", tables "resource_provider_traits" and "resource_providers"

   USE placement;
   SELECT id FROM resource_providers WHERE name='catgpa01'; /* got 17 */
   DELETE FROM resource_provider_traits WHERE resource_provider_id=17; /* 17, from prev query */
   SET FOREIGN_KEY_CHECKS=0;
   DELETE FROM resource_providers WHERE name='catgpa01';
   SET FOREIGN_KEY_CHECKS=1;

4. Remove the entries in the database "nova", tables "pci_devices", "compute_nodes" and "services"

   USE nova;
   /* NULL must be tested with IS NULL; "= NULL" never matches a row */
   SELECT id FROM compute_nodes WHERE hypervisor_hostname='catgpa01' AND deleted_at IS NULL; /* got 17 */
   DELETE FROM pci_devices WHERE compute_node_id=17; /* 17, from prev query */
   DELETE FROM compute_nodes WHERE id=17; /* 17, from prev query */
   DELETE FROM services WHERE host='catgpa01' AND deleted_at IS NULL;
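
One detail worth checking when adapting these statements: in SQL, a test against NULL has to be written with IS NULL, because a predicate of the form "column = NULL" never evaluates to true and so matches no rows. The sketch below demonstrates this with Python's built-in sqlite3 module as a stand-in for MariaDB. That substitution is an assumption made to keep the example self-contained, and the table is a toy with two columns and invented rows, not the real nova.services schema.

```python
import sqlite3


def count_rows(conn, where_clause):
    """Count rows of the toy services table matching a trusted WHERE clause."""
    cur = conn.execute("SELECT COUNT(*) FROM services WHERE " + where_clause)
    return cur.fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE services (host TEXT, deleted_at TEXT)")
# Toy data: one live row (deleted_at is NULL) and one soft-deleted row.
conn.executemany(
    "INSERT INTO services VALUES (?, ?)",
    [("catgpa01", None), ("catgpa01", "2022-03-07 17:00:00")],
)

# "= NULL" yields NULL rather than true, so no row is selected.
equals_null = count_rows(conn, "host = 'catgpa01' AND deleted_at = NULL")
# "IS NULL" selects the live row.
is_null = count_rows(conn, "host = 'catgpa01' AND deleted_at IS NULL")
```

In this toy table, `equals_null` is 0 and `is_null` is 1, so a DELETE filtered with "= NULL" would leave the live row in place.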

Launchpad Janitor (janitor) wrote :

[Expired for OpenStack Compute (nova) because there has been no activity for 60 days.]

Changed in nova:
status: Incomplete → Expired
Robert Varjasi (robert.varjasi) wrote :

I am also affected.

Robert Varjasi (robert.varjasi) wrote :

My solution was simply to update the compute ID and the UUID to the current ones in the resource_providers table of the placement database.

yosef (yosex) wrote :

I was also affected and created a temporary resource provider with the following command:
```
openstack resource provider create --uuid=<missing uuid> temp_provider
```
