Ironic

Bug #1398128
Comment #8

Adam Gandelman (gandelman-a) wrote on 2014-12-05:

So the devstack patch solves the issue in the grenade migration, where we start n-cpu early and don't enroll nodes until much later. But this error is still showing up frequently in the check-tempest-dsvm-ironic-parallel-nv job. Symptoms, via http://logs.openstack.org/87/139687/1/check/check-tempest-dsvm-ironic-parallel-nv/55ce684/logs/:

Test case fails with the following on the tempest/client side:

Details: {u'code': 500, u'created': u'2014-12-05T18:02:26Z', u'message': u'No valid host was found. There are not enough hosts available.'}

On deployment, devstack waits until nova reports 3 nodes in the hypervisor stats, then finishes:

2014-12-05 18:01:27.371 | ++ '[' 3 -ge 3 ']'
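The wait shown above checks only the node count. As a rough sketch of a stricter readiness check (not what devstack does, and not a proposed patch), the following polls until the count and the aggregate resources are both non-zero. The `fetch_stats` callable is hypothetical and stands in for whatever returns nova's hypervisor statistics as a dict. The keys `count`, `memory_mb`, `local_gb` and `vcpus` are assumed to match the fields nova reports there.

```python
import time
from typing import Callable, Mapping


def resources_ready(stats: Mapping[str, int], expected_nodes: int) -> bool:
    """True once enough nodes are counted AND their resources are non-zero."""
    enough_nodes = stats.get("count", 0) >= expected_nodes
    has_resources = all(
        stats.get(key, 0) > 0 for key in ("memory_mb", "local_gb", "vcpus")
    )
    return enough_nodes and has_resources


def wait_for_resources(
    fetch_stats: Callable[[], Mapping[str, int]],  # hypothetical stats source
    expected_nodes: int,
    timeout: float = 120.0,
    interval: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll fetch_stats until resources_ready() holds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if resources_ready(fetch_stats(), expected_nodes):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval)


if __name__ == "__main__":
    # Simulated polls: the first shows 3 nodes with zero resources (the state
    # described in this comment), the second shows them populated.
    polls = iter([
        {"count": 3, "memory_mb": 0, "local_gb": 0, "vcpus": 0},
        {"count": 3, "memory_mb": 1536, "local_gb": 3, "vcpus": 3},
    ])
    print(wait_for_resources(lambda: next(polls), 3, sleep=lambda _: None))
```

With a check like this, a count of 3 with 0mb/0gb/0cpu would not be treated as ready.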

On n-cpu's first sync of the 3 nodes, ironic reports each node's properties, but for every update nova registers 0mb/0gb/0cpu:

http://logs.openstack.org/87/139687/1/check/check-tempest-dsvm-ironic-parallel-nv/55ce684/logs/screen-n-cpu.txt.gz#_2014-12-05_18_01_26_804

2014-12-05 18:01:26.904 30461 INFO nova.compute.resource_tracker [-] Compute_service record created for devstack-trusty-hpcloud-b3-3341906:7048e6e1-e180-4295-8e0a-81eb0416a10a
2014-12-05 18:01:26.999 30461 INFO nova.compute.resource_tracker [-] Compute_service record created for devstack-trusty-hpcloud-b3-3341906:a4768094-3818-4e04-b55f-6ec7b34ffce3
2014-12-05 18:01:27.088 30461 INFO nova.compute.resource_tracker [-] Compute_service record created for devstack-trusty-hpcloud-b3-3341906:75fe1fde-6f4e-4b31-8b7c-7683d96e9b88

The next set of periodic tasks runs 1 min later, and this time the resources for the nodes are updated appropriately to 512mb/1gb/1cpu:

http://logs.openstack.org/87/139687/1/check/check-tempest-dsvm-ironic-parallel-nv/55ce684/logs/screen-n-cpu.txt.gz#_2014-12-05_18_02_27_875

2014-12-05 18:02:27.958 30461 INFO nova.compute.resource_tracker [-] Compute_service record updated for devstack-trusty-hpcloud-b3-3341906:7048e6e1-e180-4295-8e0a-81eb0416a10a
2014-12-05 18:02:28.077 30461 INFO nova.compute.resource_tracker [-] Compute_service record updated for devstack-trusty-hpcloud-b3-3341906:a4768094-3818-4e04-b55f-6ec7b34ffce3
2014-12-05 18:02:28.162 30461 INFO nova.compute.resource_tracker [-] Compute_service record updated for devstack-trusty-hpcloud-b3-3341906:75fe1fde-6f4e-4b31-8b7c-7683d96e9b88

The first instance that tempest spawns fails just before the second periodic task sync. It looks like the initial resource sync picks up the nodes but does not update nova's resources, only the number of available hypervisors. Node properties (ram/disk/cpu) are associated with nodes when they are enrolled, so that data is being received on the nova side if it's updating its hypervisor count.
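To make the timing concrete, here is a small sketch that computes the gaps from the timestamps quoted above. It assumes the n-cpu log timestamps are in UTC, like the `created` field of the fault, and it uses the first "record created" and first "record updated" lines as stand-ins for the start of each sync.

```python
from datetime import datetime

LOG_FMT = "%Y-%m-%d %H:%M:%S.%f"   # n-cpu log timestamp format
API_FMT = "%Y-%m-%dT%H:%M:%SZ"     # 'created' field of the tempest fault

# Timestamps copied from the log excerpts in this comment.
first_sync = datetime.strptime("2014-12-05 18:01:26.904", LOG_FMT)
second_sync = datetime.strptime("2014-12-05 18:02:27.958", LOG_FMT)
fault_created = datetime.strptime("2014-12-05T18:02:26Z", API_FMT)

# Interval between the two periodic resource syncs.
sync_gap = (second_sync - first_sync).total_seconds()
# How long before the second sync the fault was recorded.
fault_lead = (second_sync - fault_created).total_seconds()

print(f"gap between syncs: {sync_gap:.3f}s")
print(f"fault recorded {fault_lead:.3f}s before the second sync")
```

Under those assumptions the two syncs are about 61 seconds apart, and the fault was recorded roughly 2 seconds before the second sync populated the resources.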
