[queens/master promotion] fs001 fails overcloud deploy with 'No valid host was found. , Code: 500'

Bug #1773289 reported by Ronelle Landy on 2018-05-25
Affects: tripleo
Importance: Critical
Assigned to: Ronelle Landy

Bug Description

fs001 failed in the current promotion (2018-05-25). The overcloud fails to deploy:

Master trace:

https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-master/512d712/undercloud/home/jenkins/overcloud_deploy.log.txt.gz#_2018-05-25_01_16_41

 ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. , Code: 500"
 ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. , Code: 500"

Queens trace:

https://logs.rdoproject.org/openstack-periodic/periodic-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens/bdd2d83/undercloud/home/jenkins/overcloud_deploy.log.txt.gz#_2018-05-25_01_04_06

2018-05-25 01:04:06 | ResourceInError: resources.Controller: Went to status ERROR due to "Message: No valid host was found. , Code: 500"
2018-05-25 01:04:06 | overcloud.Compute.0.NovaCompute:
2018-05-25 01:04:06 | resource_type: OS::TripleO::ComputeServer
2018-05-25 01:04:06 | physical_resource_id: 0aab27d8-a9aa-4c7e-8f03-beaef67c7f11
2018-05-25 01:04:06 | status: CREATE_FAILED
2018-05-25 01:04:06 | status_reason: |
2018-05-25 01:04:06 | ResourceInError: resources.NovaCompute: Went to status ERROR due to "Message: No valid host was found. , Code: 500"
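When comparing the master and queens logs, it can help to pull out just the failing resources. The following is a minimal sketch, not part of the CI tooling: the function name `summarize_resource_errors` and the regular expression are assumptions based only on the lines quoted above, and other Heat error formats are not covered.

```python
import re
from collections import Counter

# Matches the Heat failure lines quoted in this report. The timestamp
# prefix ("2018-05-25 01:04:06 | ") is optional because search() is used.
ERROR_RE = re.compile(
    r'ResourceInError: resources\.(?P<resource>\w+): '
    r'Went to status ERROR due to '
    r'"Message: (?P<message>.*?), Code: (?P<code>\d+)"'
)


def summarize_resource_errors(lines):
    """Count (resource, message, code) triples found in deploy-log lines."""
    counts = Counter()
    for line in lines:
        match = ERROR_RE.search(line)
        if match:
            key = (
                match.group("resource"),
                match.group("message").strip(),
                int(match.group("code")),
            )
            counts[key] += 1
    return counts


# Two lines copied from the queens trace above.
sample = [
    '2018-05-25 01:04:06 | ResourceInError: resources.Controller: Went to '
    'status ERROR due to "Message: No valid host was found. , Code: 500"',
    '2018-05-25 01:04:06 | ResourceInError: resources.NovaCompute: Went to '
    'status ERROR due to "Message: No valid host was found. , Code: 500"',
]

for (resource, message, code), count in summarize_resource_errors(sample).items():
    print(f"{resource}: {message} (code {code}) x{count}")
```

For a downloaded overcloud_deploy.log.txt.gz, the same function could be fed the file opened with `gzip.open(path, "rt")`, where `path` is wherever the log was saved locally.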

This is the first fs001 failure we have seen in promotions in some time.

Ronelle Landy (rlandy) on 2018-05-25
Changed in tripleo:
status: New → Triaged
importance: Undecided → Critical
tags: added: ci promotion-blocker quickstart
tags: added: alert
tags: removed: quickstart
Ronelle Landy (rlandy) wrote :

Running a reproducer for queens to check whether the overcloud nodes are created.

Arx Cruz (arxcruz) on 2018-05-25
Changed in tripleo:
milestone: none → rocky-2
Ronelle Landy (rlandy) on 2018-05-25
Changed in tripleo:
assignee: nobody → Ronelle Landy (rlandy)
Alan Pevec (apevec) wrote :

ops team fixed it today around 0900 UTC after ykarel reported it:
<ykarel> kforde, btw what was the issue
<kforde> some patch files we have to reapply didn't get applied properly
<kforde> fixed now ... I expect you will succeed

Ronelle Landy (rlandy) wrote :

ok - I think this is working now:

<kforde> basically there is patch for pxe booting vms on top of rdo-cloud
<kforde> after the minor update it was overwritten and we forgot to re-apply

https://review.rdoproject.org/jenkins/job/gate-tripleo-ci-centos-7-ovb-3ctlr_1comp-featureset001-queens/

waiting for promotion to confirm

Ronelle Landy (rlandy) on 2018-05-27
Changed in tripleo:
status: Triaged → Fix Released