Check that enough nodes are in "available" state with maintenance mode off error
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Fix Released | High | Alex Schultz |
Bug Description
When trying to deploy the overcloud (minimal deploy, one controller and one compute), I ran into the following error:
(undercloud) [stack@undercloud ~]$ openstack overcloud deploy --templates ~/tripleo-
Started Mistral Workflow tripleo.
Waiting for messages on queue 'a7f0b2ec-
{u'errors': [u'Only 0 nodes are exposed to Nova of 2 requests. Check that enough nodes are in "available" state with maintenance mode off.'], u'result': {u'enough_nodes': False, u'statistics': {u'count': 0, u'vcpus_used': 0, u'local_gb_used': 0, u'manager': {u'api': {u'server_groups': None, u'keypairs': None, u'servers': None, u'server_
ERRORS
[u'Only 0 nodes are exposed to Nova of 2 requests. Check that enough nodes are in "available" state with maintenance mode off.', u'Only 0 nodes are exposed to Nova of 2 requests. Check that enough nodes are in "available" state with maintenance mode off.']
Configuration has 2 errors, fix them before proceeding. Ignoring these errors is likely to lead to a failed deploy.
The Ironic nodes appear to be OK (2 are available), but Nova doesn't know about them.
(undercloud) [stack@undercloud ~]$ ironic node-list
+------
| UUID | Name | Instance UUID | Power State | Provisioning State | Maintenance |
+------
| 51d68c49-
| df8f251e-
+------
(undercloud) [stack@undercloud ~]$ nova hypervisor-stats
+------
| Property | Value |
+------
| count | 0 |
| current_workload | 0 |
| disk_available_
| free_disk_gb | 0 |
| free_ram_mb | 0 |
| local_gb | 0 |
| local_gb_used | 0 |
| memory_mb | 0 |
| memory_mb_used | 0 |
| running_vms | 0 |
| vcpus | 0 |
| vcpus_used | 0 |
+------
Checking the nova services, we see the following:
(undercloud) [stack@undercloud ~]$ nova service-list
+------
| Id | Binary | Host | Zone | Status | State | Updated_at | Disabled Reason | Forced down |
+------
| c49703f6-
| 2c008a8b-
| 85bbdc1c-
+------
So we re-enabled the nova-compute service:
(undercloud) [stack@undercloud ~]$ nova service-enable 85bbdc1c-
+------
| ID | Host | Binary | Status |
+------
| 85bbdc1c-
+------
(undercloud) [stack@undercloud ~]$ nova hypervisor-stats
+------
| Property | Value |
+------
| count | 2 |
| current_workload | 0 |
| disk_available_
| free_disk_gb | 98 |
| free_ram_mb | 16384 |
| local_gb | 98 |
| local_gb_used | 0 |
| memory_mb | 16384 |
| memory_mb_used | 0 |
| running_vms | 0 |
| vcpus | 4 |
| vcpus_used | 0 |
+------
And this solved the issue.
It turns out the undercloud was low on disk space, so ironic builds failed, and nova now automatically disables compute services that fail 10 consecutive builds. Restarting the nova-compute service does the trick, but there is no clear message indicating this.
We should catch this in the validation to avoid a failed overcloud deployment. By "catch" I mean report the actual cause; the validation arguably did detect the problem, but its message does not say why the nodes are missing.
The validation should be added to tripleo-common.
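As a rough illustration of what such a check could report, here is a minimal sketch in Python. It is not the actual tripleo-common code: the function name `find_disabled_computes` is hypothetical, and the dictionary keys (`binary`, `host`, `status`, `disabled_reason`) are assumed to mirror the columns of the `nova service-list` output above.

```python
def find_disabled_computes(services):
    """Return one warning string per disabled nova-compute service.

    `services` is assumed to be a list of dicts whose keys mirror the
    `nova service-list` columns (binary, host, status, disabled_reason).
    """
    warnings = []
    for svc in services:
        # Only nova-compute matters here; other binaries are skipped.
        if svc.get("binary") != "nova-compute":
            continue
        if svc.get("status") == "disabled":
            reason = svc.get("disabled_reason") or "no reason recorded"
            warnings.append(
                "nova-compute on host %s is disabled (%s); its Ironic nodes "
                "will not be exposed to Nova" % (svc.get("host"), reason)
            )
    return warnings
```

A validation could run a check like this before the node-count check, so the operator sees the disabled service named explicitly instead of only "Only 0 nodes are exposed to Nova".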
Changed in tripleo:
milestone: pike-rc1 → pike-rc2
Alternatively, is this something we can configure nova not to do?
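For reference, nova has a `consecutive_build_service_disable_threshold` option in the `[compute]` section that, as far as I know, controls this behavior. The fragment below is a sketch only, assuming the nova release on the undercloud ships the option; whether turning the auto-disable off is the right fix for TripleO is a separate question.

```ini
# nova.conf on the undercloud (sketch; verify the option against your nova release)
[compute]
# Number of consecutive failed builds after which nova-compute disables itself.
# 0 turns the auto-disable behavior off; the default of 10 matches the
# behavior described in this report.
consecutive_build_service_disable_threshold = 0
```

The nova-compute service would need a restart to pick up the change.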