Compute manager fails to cleanup compute_nodes not reported by driver
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
OpenStack Compute (nova) | Fix Released | Low | David Peraza | |
Grizzly | Fix Released | Low | Unassigned | |
Bug Description
When a virt driver supports multiple nodes and one node is removed from the driver's support, the compute_nodes records in the DB are not synced with the driver's list. This causes the scheduler to pick a bad host, resulting in this error:
| fault | {u'message': u'NovaException', u'code': 500, u'details': u'helium51 is not a valid node managed by this compute host. |
| | File "/usr/lib/
| | return function(self, context, *args, **kwargs) |
| | File "/usr/lib/
| | do_run_instance() |
| | File "/usr/lib/
| | retval = f(*args, **kwargs) |
| | File "/usr/lib/
| | admin_password, is_first_time, node, instance) |
| | File "/usr/lib/
| | self._set_
| | File "/usr/lib64/
| | self.gen.next() |
| | File "/usr/lib/
| | rt = self._get_
| | File "/usr/lib/
| | raise exception.
| | ', u'created': u'2013-
Two things I see in the code:
First, the list of known nodes does not reflect the DB list; it is built from the driver's list:
known_nodes = set(self.
As a result, this statement will never yield orphan compute_nodes:
for nodename in known_nodes - nodenames
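A minimal, self-contained sketch of why that set difference comes up empty. The helper `find_orphans` and the node names other than `helium51` are hypothetical and not part of nova; the two sets simply stand in for the driver's node list and the rows in the compute_nodes table.

```python
def find_orphans(known_nodes, driver_nodes):
    """Return node names that are known but no longer reported by the driver."""
    return set(known_nodes) - set(driver_nodes)


# Hypothetical data: the driver dropped helium51, but the DB still has a row for it.
driver_nodes = {"helium50", "helium52"}
db_nodes = {"helium50", "helium51", "helium52"}

# If the "known" set is itself derived from the driver, the difference is always empty.
orphans_from_driver = find_orphans(driver_nodes, driver_nodes)

# If the "known" set comes from the DB, the stale record shows up.
orphans_from_db = find_orphans(db_nodes, driver_nodes)

print(orphans_from_driver)  # set()
print(orphans_from_db)      # {'helium51'}
```

Only the second call detects the stale record, which is why the source of known_nodes matters here.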
Second, even if we fix this to get known_nodes from the DB through the conductor, this code will always raise an exception:
for nodename in known_nodes - nodenames:
rt = self._get_
rt.
because _get_resource_
To replicate this, you can change your hypervisor_hostname, which creates a new record in the nova.compute_nodes table and leaves the old record behind. This simulates a compute node that is no longer supported in a multi-node scenario.
Suggestion:
Remove logic to delete orphan compute_nodes from compute.manager and move to compute.
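A rough sketch of the idea, assuming the cleanup deletes stale records directly instead of first looking up a tracker for each orphan. Everything here is hypothetical and not nova's actual API: `NodeNotManaged` stands in for the NovaException in the traceback, `get_tracker` mimics a lookup that validates node names against the driver, and a plain dict stands in for the compute_nodes table.

```python
class NodeNotManaged(Exception):
    """Hypothetical stand-in for the NovaException shown in the traceback."""


def get_tracker(nodename, driver_nodes):
    # Mimics a lookup that validates the node against the driver's list,
    # so it can never succeed for an orphaned node.
    if nodename not in driver_nodes:
        raise NodeNotManaged(
            "%s is not a valid node managed by this compute host." % nodename
        )
    return {"node": nodename}


def cleanup_orphans(db_records, driver_nodes):
    """Delete records for nodes the driver no longer reports.

    Works on the records directly, without a per-node tracker lookup.
    Returns the sorted list of deleted node names.
    """
    orphans = sorted(set(db_records) - set(driver_nodes))
    for nodename in orphans:
        del db_records[nodename]
    return orphans


# Hypothetical state: the DB still holds helium51, the driver no longer does.
db_records = {"helium50": {"id": 1}, "helium51": {"id": 2}}
driver_nodes = {"helium50"}

try:
    get_tracker("helium51", driver_nodes)
    lookup_failed = False
except NodeNotManaged:
    lookup_failed = True  # the validating lookup rejects the orphan

deleted = cleanup_orphans(db_records, driver_nodes)
print(lookup_failed, deleted)
```

The point of the sketch is only that orphan removal cannot depend on a lookup that rejects orphans by design; where that logic should live in nova is the reporter's suggestion above.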
Changed in nova: | |
assignee: | nobody → David Peraza (dperaza) |
Changed in nova: | |
status: | New → Triaged |
importance: | Undecided → Low |
tags: | added: baremetal |
Changed in nova: | |
milestone: | none → havana-1 |
status: | Fix Committed → Fix Released |
Changed in nova: | |
milestone: | havana-1 → 2013.2 |
Fix proposed to branch: master
Review: https://review.openstack.org/25592