Baremetal node id not supplied to driver

Bug #1346424 reported by Tomas Sedovic
This bug affects 4 people
Affects                   Status        Importance  Assigned to    Milestone
OpenStack Compute (nova)  Fix Released  Critical    Chris Behrens
tripleo                   Fix Released  Critical    Dan Prince

Bug Description

During the check-tripleo-overcloud-f20 job, a random overcloud baremetal node intermittently fails to boot.

Full logs:

http://logs.openstack.org/26/105326/4/check-tripleo/check-tripleo-overcloud-f20/9292247/
http://logs.openstack.org/81/106381/2/check-tripleo/check-tripleo-overcloud-f20/ca8a59b/
http://logs.openstack.org/08/106908/2/check-tripleo/check-tripleo-overcloud-f20/e9894ca/

The seed's nova-compute log shows this exception:

Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 ERROR oslo.messaging.rpc.dispatcher [req-9f090bea-a974-4f3c-ab06-ebd2b7a5c9e6 ] Exception during message handling: Baremetal node id not supplied to driver for 'e13f2660-b72d-4a97-afac-64ff0eecc448'
Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 133, in _dispatch_and_reply
Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher incoming.message))
Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 176, in _dispatch
Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/oslo/messaging/rpc/dispatcher.py", line 122, in _do_dispatch
Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args)
Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/exception.py", line 88, in wrapped
Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher payload)
Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__
Jul 21 13:46:07 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/exception.py", line 71, in wrapped
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher return f(self, context, *args, **kw)
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/compute/manager.py", line 291, in decorated_function
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher pass
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/compute/manager.py", line 277, in decorated_function
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/compute/manager.py", line 341, in decorated_function
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/compute/manager.py", line 319, in decorated_function
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher kwargs['instance'], e, sys.exc_info())
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/openstack/common/excutils.py", line 82, in __exit__
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/compute/manager.py", line 307, in decorated_function
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/compute/manager.py", line 1950, in build_and_run_instance
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher node, limits)
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/openstack/common/lockutils.py", line 325, in inner
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher return f(*args, **kwargs)
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/compute/manager.py", line 1917, in do_build_and_run_instance
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher if self.driver.macs_for_instance(instance):
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/virt/baremetal/driver.py", line 223, in macs_for_instance
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher node_uuid = self._require_node(instance)
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/venvs/nova/lib/python2.7/site-packages/nova/virt/baremetal/driver.py", line 188, in _require_node
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher % instance['uuid'])
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher NovaException: Baremetal node id not supplied to driver for 'e13f2660-b72d-4a97-afac-64ff0eecc448'
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.981 3608 TRACE oslo.messaging.rpc.dispatcher
Jul 21 13:46:08 host-192-168-1-236 nova-compute[3608]: 2014-07-21 13:46:07.999 3608 ERROR oslo.messaging._drivers.common [req-9f090bea-a974-4f3c-ab06-ebd2b7a5c9e6 ] Returning exception Baremetal node id not supplied to driver for 'e13f2660-b72d-4a97-afac-64ff0eecc448' to caller
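
The last frames of the traceback point at a guard in the baremetal driver that refuses to continue when the instance carries no node id. The following is a minimal sketch of such a guard, not the actual Nova source: the `NovaException` class is a stand-in, and reading the id from a `node` key on the instance is an assumption made for illustration.

```python
class NovaException(Exception):
    """Stand-in for nova.exception.NovaException (assumption for this sketch)."""


def require_node(instance):
    """Return the baremetal node id recorded on the instance, or raise.

    Assumes the node id lives under the 'node' key of the instance dict.
    """
    node_uuid = instance.get("node")
    if not node_uuid:
        # Same message text as in the log above; %r adds the quotes.
        raise NovaException(
            "Baremetal node id not supplied to driver for %r" % instance["uuid"]
        )
    return node_uuid


# An instance that reached the driver without a node assigned.
instance = {"uuid": "e13f2660-b72d-4a97-afac-64ff0eecc448", "node": None}
try:
    require_node(instance)
except NovaException as exc:
    message = str(exc)
```

Under that assumption, any code path that hands the driver an instance whose node field was never set (or was cleared) would surface as this exception from macs_for_instance.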

Tomas Sedovic (tsedovic)
Changed in tripleo:
status: New → Triaged
importance: Undecided → High
Giulio Fidente (gfidente) wrote :

I have mitigated this by setting scheduler_host_subset_size = 2 in the hosting cloud.
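
For context, my understanding of scheduler_host_subset_size is that the Nova filter scheduler chooses at random among the N best-weighted hosts instead of always taking the single best one. The thread does not say why this helps here, so the sketch below only illustrates the selection behaviour itself. The function name `pick_host` and the host names are hypothetical.

```python
import random


def pick_host(weighed_hosts, subset_size, rng=random):
    """Pick at random among the `subset_size` best hosts (list is best first)."""
    # Clamp the subset size to the range 1..len(weighed_hosts).
    subset_size = max(1, min(subset_size, len(weighed_hosts)))
    return rng.choice(weighed_hosts[:subset_size])


hosts = ["node-a", "node-b", "node-c"]  # hypothetical names, best first

# With a subset size of 1, every request targets the best host.
only_best = pick_host(hosts, 1)

# With a subset size of 2, requests are spread over the two best hosts.
rng = random.Random(0)
picks = {pick_host(hosts, 2, rng) for _ in range(50)}
```

With a larger subset, concurrent requests are less likely to all land on the same host, which may be why fewer boots needed rescheduling after the change.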

Ben Nemec (bnemec)
Changed in tripleo:
importance: High → Critical
Dan Prince (dan-prince)
Changed in tripleo:
assignee: nobody → Dan Prince (dan-prince)
Dan Prince (dan-prince)
affects: tripleo → nova
Changed in tripleo:
assignee: nobody → Dan Prince (dan-prince)
importance: Undecided → Critical
status: New → In Progress
Changed in nova:
assignee: Dan Prince (dan-prince) → Chris Behrens (cbehrens)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/109317
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7f65d43b0465eb27c638e44395e5ca535574c2a1
Submitter: Jenkins
Branch: master

commit 7f65d43b0465eb27c638e44395e5ca535574c2a1
Author: Dan Prince <email address hidden>
Date: Thu Jul 24 12:41:07 2014 -0400

    Revert "Deallocate the network if rescheduling for Ironic"

    This reverts commit 963ad71af4750e28745b6de262da11816b403801.

    The original fix, targeted towards Nova BM and Ironic tests
    is actually making us have more test failures. Lets go
    back to the original race... and then we can have a bit
    more time to test a proper fix.

    Closes-bug #1346424

    Change-Id: Icbbe16ffef69132177165d21c727d791b62a232f

Changed in nova:
status: In Progress → Fix Committed
Dan Prince (dan-prince)
Changed in tripleo:
status: In Progress → Fix Committed
Changed in tripleo:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: none → juno-3
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-3 → 2014.2