CI jobs failing with "Message: No valid host was found. There are not enough hosts available., Code: 500"
Bug #1599858 reported by Derek Higgins
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
tripleo | Invalid | High | Derek Higgins | ongoing
Bug Description
Ever since the move to rh2, we've been seeing a lot of jobs failing (not all, but too many) with
2016-07-07 11:42:39.204687 | 2016-07-07 11:42:26 [overcloud]: CREATE_FAILED Resource CREATE failed: ResourceInError: resources.
Changed in tripleo:
milestone: none → ongoing
assignee: nobody → Derek Higgins (derekh)
From the postci log, one of the nodes failed to deploy, e.g. http://logs.openstack.org/34/335434/3/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/085ed2a/logs/postci.txt.gz
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| ID                                   | Name                    | Status | Task State | Power State | Networks            |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
| 9a37c72f-eed2-4dfe-a6ef-11d336be716b | overcloud-controller-0  | ACTIVE | -          | Running     | ctlplane=192.0.2.19 |
| b49a7770-c4db-44db-a822-7b3ee631b58a | overcloud-controller-1  | ACTIVE | -          | Running     | ctlplane=192.0.2.13 |
| 233d96e9-5ff4-4e16-b730-5cb740d10d51 | overcloud-controller-2  | ACTIVE | -          | Running     | ctlplane=192.0.2.20 |
| 6eabef07-f41d-4d81-92ae-59da36e828f0 | overcloud-novacompute-0 | ERROR  | -          | NOSTATE     |                     |
+--------------------------------------+-------------------------+--------+------------+-------------+---------------------+
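When triaging these postci logs, the failed rows can be pulled out of the pasted `nova list` table by scanning the Status column. A throwaway helper along these lines (not part of tripleo-ci; the table string is abbreviated from the output above):

```python
# Sample rows in `nova list` table format (abbreviated from the log above).
table = """\
| 9a37c72f-eed2-4dfe-a6ef-11d336be716b | overcloud-controller-0  | ACTIVE | - | Running | ctlplane=192.0.2.19 |
| 6eabef07-f41d-4d81-92ae-59da36e828f0 | overcloud-novacompute-0 | ERROR  | - | NOSTATE |                     |
"""

def failed_nodes(nova_list_output):
    """Return (id, name) for every row whose Status column is ERROR."""
    failed = []
    for line in nova_list_output.splitlines():
        # Drop the outer pipes, then split into the six columns.
        cols = [c.strip() for c in line.strip("|").split("|")]
        if len(cols) >= 3 and cols[2] == "ERROR":
            failed.append((cols[0], cols[1]))
    return failed

print(failed_nodes(table))
# [('6eabef07-f41d-4d81-92ae-59da36e828f0', 'overcloud-novacompute-0')]
```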
The times the various nodes were created:
11:33:54 979f33b0-0df2-4565-b475-25c87296af78
11:34:18 b49a7770-c4db-44db-a822-7b3ee631b58a OK
11:34:28 9a37c72f-eed2-4dfe-a6ef-11d336be716b OK
11:34:36 233d96e9-5ff4-4e16-b730-5cb740d10d51 OK
11:36:17 3bee4016-f6b8-4239-853e-b36d5955667c
11:36:32 37cfeee2-d694-4a39-8421-e84348bd6906
11:36:49 393d0386-c9e1-4549-9917-6803676497a6
11:37:23 d9965837-28fa-4cf9-8d1c-86c656f7810b
11:38:15 6eabef07-f41d-4d81-92ae-59da36e828f0 ERROR
The very first one in this case failed; the nova logs show nova waiting on the ironic node for some time:
nova/nova-compute.log:2016-07-07 11:35:43.132 3716 DEBUG nova.virt.ironic.driver [-] [instance: 979f33b0-0df2-4565-b475-25c87296af78] Still waiting for ironic node 2ebfe30e-df9b-4383-81d1-61b5c3443810 to become ACTIVE: power_state="power off", target_power_state=None, provision_state="deploying", target_provision_state="active" _log_ironic_polling /usr/lib/python2.7/site-packages/nova/virt/ironic/driver.py:118
until eventually:

nova/nova-compute.log:2016-07-07 11:35:45.146 3716 ERROR oslo.service.loopingcall InstanceDeployFailure: Failed to provision instance 979f33b0-0df2-4565-b475-25c87296af78: Timeout reached while waiting for callback for node 2ebfe30e-df9b-4383-81d1-61b5c3443810
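The pattern in these two log lines is a poll-until-deadline loop: nova repeatedly checks the ironic node's provision_state and gives up with InstanceDeployFailure when the callback timeout expires. A minimal sketch of that pattern, assuming invented names and intervals (this is not nova's actual code):

```python
import time


class InstanceDeployFailure(Exception):
    """Raised when the node does not reach the target state in time."""


def wait_for_active(get_provision_state, timeout=60, interval=2):
    """Poll a node's provision_state until it becomes 'active'.

    get_provision_state: callable returning the node's current state.
    Raises InstanceDeployFailure on deadline expiry, mirroring the
    'Timeout reached while waiting for callback' failure above.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_provision_state()
        if state == "active":
            return
        if state == "deploy failed":
            raise InstanceDeployFailure("deploy failed")
        time.sleep(interval)
    raise InstanceDeployFailure("Timeout reached while waiting for callback")


# Simulated node that stays 'deploying' for two polls, then goes active:
states = iter(["deploying", "deploying", "active"])
wait_for_active(lambda: next(states), timeout=5, interval=0)  # returns normally
```

A node that never leaves "deploying" (as 2ebfe30e apparently did) would exhaust the deadline and raise, which nova then surfaces as the provisioning error in the log.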
After this, all of the retries (from heat?) fail. From nova/nova-scheduler.log:

grep "start: 4, end: 0" nova/nova-scheduler.log
2016-07-07 11:36:18.069 1929 INFO nova.filters [req-6e2bcd70-d5e0-4ffe-af60-6bbd9a227eee 620cdc7716ad4b199de8315d3e3199b7 57bbfbf0ad5b4d46af40c681bbf66da9 - - -] Filtering removed all hosts for the request with instance ID '3bee4016-f6b8-4239-853e-b36d5955667c'. Filter results: ['RetryFilter: (start: 4, end: 4)', 'TripleOCapabilitiesFilter: (start: 4, end: 4)', 'ComputeCapabilitiesFilter: (start: 4, end: 4)', 'AvailabilityZoneFilter: (start: 4, end: 4)', 'RamFilter: (start: 4, end: 0)']
2016-07-07 11:36:32.717 1929 INFO nova.filters [req-8f78197c-1676-4d41-9004-6514106914e5 620cdc7716ad4b199de8315d3e3199b7 57bbfbf0ad5b4d46af40c681bbf66da9 - - -] Filtering removed all hosts for...
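The filter results above show each scheduler filter starting with 4 candidate hosts, and RamFilter alone dropping them all (start: 4, end: 0): no host reports enough free RAM for the request, which would fit the theory that the failed deploy's resources were never released. A toy illustration of how such a filter chain narrows candidates (the filter names are from the log; the host data and helper functions are invented, and real nova filters are classes, not plain functions):

```python
def retry_filter(hosts, spec):
    # Drop hosts this request has already been attempted on.
    return [h for h in hosts if h["name"] not in spec["attempted"]]

def ram_filter(hosts, spec):
    # Drop hosts whose free RAM cannot fit the requested flavor.
    return [h for h in hosts if h["free_ram_mb"] >= spec["ram_mb"]]

def run_filters(filters, hosts, spec):
    """Apply each filter in turn, recording start/end counts like nova.filters."""
    results = []
    for name, f in filters:
        start = len(hosts)
        hosts = f(hosts, spec)
        results.append(f"{name}: (start: {start}, end: {len(hosts)})")
    return hosts, results

# Four baremetal hosts whose RAM is still "claimed" by the failed deploy,
# so every one of them fails the RAM check -- matching the log above.
hosts = [{"name": f"node-{i}", "free_ram_mb": 0} for i in range(4)]
spec = {"ram_mb": 8192, "attempted": []}

remaining, results = run_filters(
    [("RetryFilter", retry_filter), ("RamFilter", ram_filter)], hosts, spec)
print(results)    # ['RetryFilter: (start: 4, end: 4)', 'RamFilter: (start: 4, end: 0)']
print(remaining)  # [] -> scheduler raises NoValidHost, surfaced as the 500 above
```

An empty survivor list is what the scheduler turns into "No valid host was found. There are not enough hosts available.", the error this bug report opens with.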