nova compute fails to update inventory in placement after changing [DEFAULT]/host

Bug #1853587 reported by Balazs Gibizer
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Triaged
Importance: Undecided
Assigned to: Harshavardhan Metla
Milestone: (none)

Bug Description

* build a single-node devstack (hostname=aio)
* once everything is up and running, change [DEFAULT]/host in /etc/nova/nova-cpu.conf to something other than the compute host's hostname
* restart the nova-compute service
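
The second step can be scripted on a throwaway devstack. The following is only a sketch: the helper name set_default_host is hypothetical, and it uses Python's configparser rather than oslo.config, so comments in the file are lost and repeated (multi-valued) options are not supported.

```python
import configparser
import io


def set_default_host(conf_text: str, new_host: str) -> str:
    """Return conf_text with [DEFAULT]/host set to new_host.

    Hypothetical helper for a test environment only: configparser
    drops comments and rejects duplicate option names.
    """
    # Disable interpolation so literal '%' characters in values are kept.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    parser["DEFAULT"]["host"] = new_host
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    sample = "[DEFAULT]\nhost = aio\n\n[libvirt]\nvirt_type = kvm\n"
    print(set_default_host(sample, "renamed"))
```

After writing the result back to /etc/nova/nova-cpu.conf, the nova-compute service still has to be restarted as in the third step.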

The following exception then appears periodically in the nova-compute logs:

Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager [None req-46df956c-7a08-4217-a658-e44a1a8edb28 None None] Error updating resources for node aio.: nova.exception.R
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager Traceback (most recent call last):
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/opt/stack/nova/nova/compute/manager.py", line 9152, in _update_available_resource_for_node
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager startup=startup)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/opt/stack/nova/nova/compute/resource_tracker.py", line 844, in update_available_resource
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager self._update_available_resource(context, resources, startup=startup)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/usr/local/lib/python3.6/dist-packages/oslo_concurrency/lockutils.py", line 328, in inner
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager return f(*args, **kwargs)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/opt/stack/nova/nova/compute/resource_tracker.py", line 929, in _update_available_resource
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager self._update(context, cn, startup=startup)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1178, in _update
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager self._update_to_placement(context, compute_node, startup)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 49, in wrapped_f
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager return Retrying(*dargs, **dkw).call(f, *args, **kw)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 206, in call
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager return attempt.get(self._wrap_exception)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 247, in get
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager six.reraise(self.value[0], self.value[1], self.value[2])
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/usr/local/lib/python3.6/dist-packages/six.py", line 696, in reraise
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager raise value
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/usr/local/lib/python3.6/dist-packages/retrying.py", line 200, in call
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/opt/stack/nova/nova/compute/resource_tracker.py", line 1112, in _update_to_placement
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager context, compute_node.uuid, name=compute_node.hypervisor_hostname)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/opt/stack/nova/nova/scheduler/client/report.py", line 857, in get_provider_tree_and_ensure_root
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager parent_provider_uuid=parent_provider_uuid)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/opt/stack/nova/nova/scheduler/client/report.py", line 644, in _ensure_resource_provider
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager parent_provider_uuid=parent_provider_uuid)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/opt/stack/nova/nova/scheduler/client/report.py", line 72, in wrapper
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager return f(self, *a, **k)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager File "/opt/stack/nova/nova/scheduler/client/report.py", line 574, in _create_resource_provider
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager raise exception.ResourceProviderCreationFailed(name=name)
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager nova.exception.ResourceProviderCreationFailed: Failed to create resource provider aio
Nov 22 11:50:52 aio nova-compute[15324]: ERROR nova.compute.manager

Problems:
Nova fails to create the RP and update the inventory in placement, but the nova-compute service status is still reported as UP. I would argue that in this case the nova-compute service is not operational, as it cannot keep its resource view consistent in placement.

What happens:
* during the first start of the nova-compute service (before the reconfiguration) nova created a service object with hostname=aio and generated a uuid for it. It then used this uuid, together with the hypervisor_hostname returned by the virt driver (libvirt returns the result of gethostname, which was also aio in this case), to create the compute RP in placement.

* when nova-compute is restarted after [DEFAULT]/host has been changed to something other than 'aio', nova creates a new service object and generates a new uuid for it. The libvirt driver still returns 'aio' as the hypervisor hostname, so nova tries to create a new compute RP with the new uuid but the old 'aio' name. Because RP names must be unique in placement, this RP creation fails.
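
The sequence above can be modelled with a small toy simulation. This is not nova or placement code: FakePlacement, NameConflict and start_compute are hypothetical names, and the only behaviour modelled is what is described above (a uuid generated per [DEFAULT]/host value, and unique RP names in placement).

```python
import uuid


class NameConflict(Exception):
    """Raised when an RP name is already used by another provider."""


class FakePlacement:
    """Toy stand-in for placement: RP names must be unique."""

    def __init__(self):
        self.providers = {}  # RP uuid -> RP name

    def create_provider(self, rp_uuid, name):
        if rp_uuid in self.providers:
            return  # this RP already exists, nothing to create
        if name in self.providers.values():
            raise NameConflict(name)
        self.providers[rp_uuid] = name


def start_compute(placement, known_hosts, conf_host, hypervisor_hostname):
    """Mimic one nova-compute start in this toy model.

    known_hosts maps a [DEFAULT]/host value to the uuid generated for
    it; an unknown host value gets a fresh uuid.
    """
    if conf_host not in known_hosts:
        known_hosts[conf_host] = str(uuid.uuid4())
    rp_uuid = known_hosts[conf_host]
    # The RP name comes from the virt driver, not from [DEFAULT]/host.
    placement.create_provider(rp_uuid, hypervisor_hostname)
    return rp_uuid


if __name__ == "__main__":
    placement = FakePlacement()
    known_hosts = {}
    start_compute(placement, known_hosts, "aio", "aio")  # first start
    start_compute(placement, known_hosts, "aio", "aio")  # plain restart
    try:
        # Restart after [DEFAULT]/host was changed: new uuid, old name.
        start_compute(placement, known_hosts, "renamed", "aio")
    except NameConflict as exc:
        print("RP creation failed for name:", exc)
```

In this model a plain restart reuses the existing uuid and succeeds, while the restart with a changed host value produces a new uuid for the same name and is rejected.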

Balazs Gibizer (balazs-gibizer) wrote :

I guess at the very least we need to document in the [DEFAULT]/host config option that changing its value after the initial deployment needs special care (like cleaning up the old service object and its placement records).

tags: added: compute
tags: added: placement resource-tracker
Matt Riedemann (mriedem)
Changed in nova:
status: New → Triaged
Changed in nova:
assignee: nobody → Harshavardhan Metla (harsha24)
Balazs Gibizer (balazs-gibizer) wrote :

@Harshavardhan: I saw you ping me via the launchpad contact page:

> Hi Balazs
> I have assigned this bug which you have reported.Can you help on how to
> reproduce the issue

The basic reproduction steps are as noted in the bug report:

* build a single-node devstack (hostname=aio)
* once everything is up and running, change [DEFAULT]/host in /etc/nova/nova-cpu.conf to something other than the compute host's hostname
* restart the nova-compute service

If you need further help, it is easier to contact me on IRC (my nick is gibi) in the #openstack-nova channel on the freenode IRC server. [1]

[1] https://docs.openstack.org/contributors/common/irc.html
