No host-to-cell mapping found for selected host

Bug #1660160 reported by Emilien Macchi
This bug affects 4 people
Affects                   Status        Importance  Assigned to   Milestone
OpenStack Compute (nova)  Invalid       Undecided   Unassigned
tripleo                   Fix Released  Critical    Oliver Walsh

Bug Description

This report may not be a bug, but I found it useful to share what has been happening in TripleO since this commit:
https://review.openstack.org/#/c/319379/

We are unable to deploy the overcloud nodes anymore (in other words, create servers with Nova / Ironic).

Nova Conductor sends this message:
"No host-to-cell mapping found for selected host"
http://logs.openstack.org/31/426231/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/915aeba/logs/undercloud/var/log/nova/nova-conductor.txt.gz#_2017-01-27_19_21_56_348

And it sounds like the compute host is not registered:
http://logs.openstack.org/31/426231/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/915aeba/logs/undercloud/var/log/nova/nova-compute.txt.gz#_2017-01-27_18_56_56_543

Nova Config is available here:
http://logs.openstack.org/31/426231/1/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/915aeba/logs/etc/nova/nova.conf.txt.gz

Those are all the details I have for now; feel free to ask for more if needed.

Emilien Macchi (emilienm) wrote :

Note: reverting https://review.openstack.org/#/c/319379/ worked for TripleO, and we are able to spawn the overcloud again.

Changed in tripleo:
status: New → Triaged
importance: Undecided → Critical
milestone: none → ocata-rc1
tags: added: promotion-blocker
Dan Smith (danms) wrote :

Something in your config has been preventing compute nodes from creating their compute node records for much longer than the referenced patch has been in place. I picked a random older run and found the same compute node record create failure:

 http://logs.openstack.org/95/422795/4/check/gate-tripleo-ci-centos-7-undercloud/9d4dda4/logs/var/log/nova/nova-compute.txt.gz#_2017-01-20_15_58_59_030

The referenced patch does require those compute node records, just like many other pieces of nova (your resource tracking will be wrong without them), but it is only related inasmuch as it requires them to be there in order to boot an instance. The ComputeNode records are very fundamental to Nova and have been for years, before cellsv2 was even a thing.

Without the compute node records, the discover_hosts step will not be able to create HostMapping records for the compute nodes, which is what the "No host-to-cell mapping" message is about.
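
To illustrate the dependency described above, here is a toy Python model. It is a sketch only, not Nova's actual code: the names Cell, discover_hosts and select_cell_for_host are hypothetical stand-ins chosen for this example, and the data structures are plain sets and dicts rather than database records.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """Toy stand-in for one Nova cell; not Nova's real data model."""
    name: str
    # Hosts that managed to create a compute node record in this cell.
    compute_nodes: set = field(default_factory=set)


def discover_hosts(cells, host_mappings):
    """Create a host mapping for every host that has a compute node record.

    Hosts without a compute node record are invisible to this step,
    so they never get a mapping.
    """
    added = []
    for cell in cells:
        for host in sorted(cell.compute_nodes):
            if host not in host_mappings:
                host_mappings[host] = cell.name
                added.append(host)
    return added


def select_cell_for_host(host, host_mappings):
    """Mimic the lookup that fails with the message from this bug."""
    try:
        return host_mappings[host]
    except KeyError:
        raise LookupError(
            "No host-to-cell mapping found for selected host " + host
        ) from None


cell1 = Cell("cell1")
mappings = {}

# nova-compute failed to create its record, so discovery finds nothing.
first_pass = discover_hosts([cell1], mappings)

# Once the record exists, a second discovery run creates the mapping.
cell1.compute_nodes.add("undercloud")
second_pass = discover_hosts([cell1], mappings)
```

In this model the first discovery pass maps nothing, so any lookup for the host fails; only a discovery run that happens after the compute node record exists can succeed.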

So, this is, IMHO, not a Nova bug but just something config-related on the tripleo side. I'm not sure what exactly would cause that compute node record create failure, but I expect it's something minor.

Changed in nova:
status: New → Invalid
Matt Riedemann (mriedem) wrote :

From that linked CI run, it looks like nova-compute can't connect to Ironic, so we can't get nodes and thus don't create compute_nodes records in the main nova database:

http://logs.openstack.org/95/422795/4/check/gate-tripleo-ci-centos-7-undercloud/9d4dda4/logs/var/log/nova/nova-compute.txt.gz#_2017-01-20_15_59_49_174

Valeriy Ponomaryov (vponomaryov) wrote :

Dan Smith,

In this case Nova VMs "hang" in the "scheduling" state, so Nova does have a bug. It should never leave its resources in a transitional state forever.

Valeriy Ponomaryov (vponomaryov) wrote :

Dan Smith,

You can also search the CI logs for the message "No compute node record for" here:

http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22No%20compute%20node%20record%20for%5C%22

So a tremendous number of CI jobs have this "misconfiguration": about 3000 hits per 6 hours when I searched.

So there is definitely a problem here.

Alan Pevec (apevec) wrote :

From Dan Smith: likely this will fix this issue: https://review.openstack.org/#/c/425273

Valeriy Ponomaryov (vponomaryov) wrote :

Change [1] for the Manila project solved the problem; the proof is in [2].

The problem is the very late creation of cell0 [3]. If a devstack plugin tries to create a Nova VM while Nova cell services are disabled, we get this error. For now, it can be worked around by creating Nova VMs in the "test-config" section of the devstack run.

[1] https://review.openstack.org/#/c/426737/4/devstack/plugin.sh
[2] http://logs.openstack.org/37/426737/4/check/gate-manila-tempest-dsvm-generic-no-share-servers-ubuntu-xenial-nv/8e008c9/logs/devstacklog.txt.gz
[3] https://github.com/openstack-dev/devstack/blob/3bdeed06/stack.sh#L1371

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.openstack.org/427534

Changed in tripleo:
assignee: nobody → Oliver Walsh (owalsh)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/427534
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=f6c286dbe882111b8de3b4f53391f1e96ad2d120
Submitter: Jenkins
Branch: master

commit f6c286dbe882111b8de3b4f53391f1e96ad2d120
Author: Oliver Walsh <email address hidden>
Date: Wed Feb 1 02:51:15 2017 +0000

    Fix race in undercloud cell_v2 host discovery

    Ensure that the ironic nodes have been picked up by the nova resource tracker
    before running nova-manage cell_v2 host discovery.

    Also adds logging of the verbose command output to mistral engine log.

    Change-Id: I4cc67935df8f37cdb2d8b0bfd96cf90eb7a6ce25
    Closes-Bug: #1660160
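
The ordering the commit message describes (wait until the nodes are visible, then run host discovery) can be sketched in Python as below. This is not the merged tripleo-common patch: wait_for_hypervisors and discover_hosts_when_ready are hypothetical helpers, count_hypervisors is a caller-supplied function assumed to return how many hypervisors Nova currently reports, and the timeout values are arbitrary. The only real command used is nova-manage cell_v2 discover_hosts --verbose.

```python
import subprocess
import time


def wait_for_hypervisors(count_hypervisors, expected, timeout=300.0,
                         interval=5.0, sleep=time.sleep,
                         clock=time.monotonic):
    """Poll until at least `expected` hypervisors are reported.

    Returns True on success and False if `timeout` seconds pass first.
    """
    deadline = clock() + timeout
    while True:
        if count_hypervisors() >= expected:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def discover_hosts_when_ready(count_hypervisors, expected,
                              run=subprocess.run, **wait_kwargs):
    """Run host discovery only after the expected nodes have appeared."""
    if not wait_for_hypervisors(count_hypervisors, expected, **wait_kwargs):
        raise TimeoutError(
            "expected %d hypervisors before host discovery" % expected
        )
    result = run(
        ["nova-manage", "cell_v2", "discover_hosts", "--verbose"],
        check=True, capture_output=True, text=True,
    )
    # Return the verbose output so the caller can log it.
    return result.stdout
```

Polling with a deadline avoids running discovery against an empty set of compute node records, which is the race the commit title refers to; returning the command output leaves the logging decision to the caller.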

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 5.8.0

This issue was fixed in the openstack/tripleo-common 5.8.0 release.
