ocata ovb jobs fail with CREATE_FAILED Error: resources.NovaComputeDeployment

Bug #1680996 reported by Alex Schultz
Affects   Status         Importance   Assigned to    Milestone
tripleo   Fix Released   Critical     Steven Hardy

Bug Description

All *Ocata* OVB jobs appear to be failing with CREATE_FAILED Error: resources.NovaComputeDeployment: Deployment to server failed: deploy_status_code : Deployment exited with non-zero status code: 1

Example:
https://review.openstack.org/#/c/454687/

See the tripleoci status page:
http://status-tripleoci.rhcloud.com/#gate-tripleo-ci-centos-7-ovb-ha-ocata
http://status-tripleoci.rhcloud.com/#gate-tripleo-ci-centos-7-ovb-nonha-ocata

Emilien Macchi (emilienm) wrote :

ok so I found the root cause:

Heat fails to create the overcloud stack when trying to bootstrap the compute node, because Nova is waiting for ironic to provide a node, but ironic fails to power off the node:

http://logs.openstack.org/87/454687/3/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/d99feaa/logs/undercloud/var/log/ironic/ironic-conductor.txt.gz#_2017-04-07_23_36_18_186

Emilien Macchi (emilienm) wrote :

I'll continue debugging later, but it's pretty clear that the NovaComputeDeployment stack failed to finish.
Ironic provided a node to Nova, so this is what we need to investigate: why the stack still fails.

http://logs.openstack.org/87/454687/3/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/d99feaa/logs/undercloud/var/log/heat/heat-engine.txt.gz#_2017-04-07_23_38_19_831

tags: added: ocata-backport-potential
Thomas Herve (therve) wrote :

From the compute logs:

Apr 7 23:38:09 localhost os-collect-config: [2017-04-07 23:38:09,872] (heat-config) [INFO] {"deploy_stdout": "", "deploy_stderr": "Legacy hiera hook or data has been detected. Please update all of your interfaces to use the new heat-agents hiera hook before proceeding. See http://lists.openstack.org/pipermail/openstack-dev/2017-January/110922.html or bug 1680006 for more information.", "deploy_status_code": 1}

Introduced by https://review.openstack.org/#/c/454556/
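
For anyone scanning these logs by hand, here is a minimal sketch of pulling the JSON payload out of a heat-config line like the one quoted above. The helper name parse_heat_config_line is made up for illustration, and the sample line is a shortened copy of the real one (the stderr text is cut after its first sentence).

```python
import json


def parse_heat_config_line(line):
    """Return the JSON payload of a heat-config log line, or None.

    Hypothetical helper: it assumes the payload is the first "{" on the
    line through to the end of the line, as in the log quoted above.
    """
    start = line.find("{")
    if start == -1:
        return None
    try:
        return json.loads(line[start:])
    except ValueError:
        # Not valid JSON (json.JSONDecodeError subclasses ValueError).
        return None


# Shortened copy of the quoted log line; deploy_stderr is truncated here.
sample = (
    'Apr 7 23:38:09 localhost os-collect-config: [2017-04-07 23:38:09,872] '
    '(heat-config) [INFO] {"deploy_stdout": "", "deploy_stderr": '
    '"Legacy hiera hook or data has been detected.", "deploy_status_code": 1}'
)

result = parse_heat_config_line(sample)
if result is not None and result["deploy_status_code"] != 0:
    print("deployment failed:", result["deploy_stderr"])
```

The non-zero deploy_status_code is what Heat surfaces as "Deployment exited with non-zero status code: 1"; deploy_stderr carries the actual reason.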

Marios Andreou (marios-b) wrote :

o/ so https://review.openstack.org/#/c/454556/ relaxes the existing check for the 'legacy' hiera hook: it no longer runs os-apply-config (since the deployed 'legacy' hiera data is still around) and just looks for the existing hook and o-a-c template. We did this because N->O upgrades were hitting https://bugs.launchpad.net/tripleo/+bug/1680006

The fact that CI is hitting this means the env has either /usr/libexec/os-apply-config/templates/etc/puppet/hiera.yaml or /usr/libexec/os-refresh-config/configure.d/40-hiera-datafiles...

For N->O we use https://github.com/openstack/tripleo-heat-templates/blob/stable/ocata/environments/major-upgrade-composable-steps.yaml#L6-L16 to remove them. I'm still not clear why they would still be around in this env, though.
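
To confirm which of the two files is present on a node (or in an unpacked image), something like the sketch below would do. It is not the check from review 454556, just an existence test on the two paths named above; the function name and the root parameter are made up for illustration.

```python
import os

# The two paths named in the comment above.
LEGACY_HIERA_PATHS = (
    "/usr/libexec/os-apply-config/templates/etc/puppet/hiera.yaml",
    "/usr/libexec/os-refresh-config/configure.d/40-hiera-datafiles",
)


def find_legacy_hiera(paths=LEGACY_HIERA_PATHS, root="/"):
    """Return the paths from `paths` that exist under `root`.

    `root` is a hypothetical convenience so the same check can be pointed
    at a mounted or extracted image instead of the live filesystem.
    """
    found = []
    for path in paths:
        candidate = os.path.join(root, path.lstrip("/"))
        if os.path.exists(candidate):
            found.append(path)
    return found


if __name__ == "__main__":
    leftovers = find_legacy_hiera()
    if leftovers:
        print("legacy hiera files present:")
        for path in leftovers:
            print("  " + path)
    else:
        print("no legacy hiera files found")
```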

Marios Andreou (marios-b) wrote :

From shardy via IRC just now: we are still including the old hook at https://github.com/openstack/tripleo-common/blob/master/image-yaml/overcloud-images.yaml#L15

11:02 < marios> shardy: like here https://github.com/openstack/tripleo-heat-templates/blob/stable/ocata/puppet/role.role.j2.yaml#L396 - trouble is I don't know how much
                sense an environment file which does
                https://github.com/openstack/tripleo-heat-templates/blob/stable/ocata/environments/major-upgrade-composable-steps.yaml#L14-L16 is but i think it will solve
                the issue in 1680996 well i hope it will anyway
11:02 < jaosorior> shardy: yep, it's somewhere in heat
11:02 < shardy> marios: Hmm, is the problem that we're still building images with the old hiera element perhaps?
11:03 < marios> shardy: ah that would explain the 40-hiera-datafiles
11:03 < shardy> marios: I can't think of another reason that legacy data would be there, since this isn't an upgrade CI job?
11:03 < shardy> marios: ah, yeah, so I guess we need to remove that element and just install the heat hook
11:03 < marios> shardy: right exactly it isn't upgrade like
                http://logs.openstack.org/87/454687/3/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/d99feaa/console.html#_2017-04-07_23_38_24_591760 pointed to from the bug
                as the example
11:04 < shardy> jaosorior: ack, if you can bisect to the exact commit that would be great as we can look at reverting it
11:04 < shardy> https://github.com/openstack/tripleo-common/blob/master/image-yaml/overcloud-images.yaml#L15
11:04 < shardy> marios: ^^
11:04 < jaosorior> shardy: I'
11:04 < jaosorior> *I'm on it
11:04 < shardy> Looks like we forgot to remove that
11:05 < shardy> marios: sec, let me push a patch and we can see how CI likes it
11:05 < marios> shardy: ack nice one
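
As a rough illustration of what "remove that element" means, the sketch below finds and drops an element from an image definition that has already been loaded into a Python dict. The layout (a disk_images list whose entries carry imagename and elements) and the sample data are assumptions made for illustration, not a copy of overcloud-images.yaml.

```python
def images_with_element(image_config, element="hiera"):
    """Return the names of images whose element list includes `element`."""
    names = []
    for image in image_config.get("disk_images", []):
        if element in image.get("elements", []):
            names.append(image.get("imagename", "<unnamed>"))
    return names


def without_element(image_config, element="hiera"):
    """Return a copy of the config with `element` removed from every image."""
    cleaned = []
    for image in image_config.get("disk_images", []):
        image = dict(image)
        image["elements"] = [e for e in image.get("elements", []) if e != element]
        cleaned.append(image)
    return {**image_config, "disk_images": cleaned}


# Hypothetical data in the assumed layout; not the real file contents.
example = {
    "disk_images": [
        {"imagename": "overcloud-full", "elements": ["os-collect-config", "hiera"]},
    ]
}

print(images_with_element(example))
print(images_with_element(without_element(example)))
```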

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (master)

Fix proposed to branch: master
Review: https://review.openstack.org/455149

Changed in tripleo:
assignee: nobody → Steven Hardy (shardy)
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-common (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/455150

Steven Hardy (shardy) wrote :

Actually the upgrade job to master is also failing in a similar way:

http://logs.openstack.org/49/455149/1/check/gate-tripleo-ci-centos-7-multinode-upgrades-nv/adedf62/logs/subnode-2/var/log/messages

And removing the element isn't enough: we'll have to debug what isn't getting cleaned up by the UpgradeInitCommand.

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (master)

Reviewed: https://review.openstack.org/455149
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=f79aaca65f85bbd688a3735dfcf0aa4c9f4d3c5b
Submitter: Jenkins
Branch: master

commit f79aaca65f85bbd688a3735dfcf0aa4c9f4d3c5b
Author: Steven Hardy <email address hidden>
Date: Mon Apr 10 09:09:24 2017 +0100

    Remove legacy hiera element

    We moved to using the heat-config hiera hook in ocata, so the hiera
    element should be removed to avoid any data related to the old approach
    being written to the image.

    Depends-On: I7de5c32c6d9ec689ea0d7716daa9c90234991dfa
    Change-Id: Ia685d06bcd55c2487c9a269aa41fee7c9307f126
    Closes-Bug: #1680996

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-common (stable/ocata)

Reviewed: https://review.openstack.org/455150
Committed: https://git.openstack.org/cgit/openstack/tripleo-common/commit/?id=a720d20c296297607364a5790283f8536bbcabe3
Submitter: Jenkins
Branch: stable/ocata

commit a720d20c296297607364a5790283f8536bbcabe3
Author: Steven Hardy <email address hidden>
Date: Mon Apr 10 09:09:24 2017 +0100

    Remove legacy hiera element

    We moved to using the heat-config hiera hook in ocata, so the hiera
    element should be removed to avoid any data related to the old approach
    being written to the image.

    Change-Id: Ia685d06bcd55c2487c9a269aa41fee7c9307f126
    Closes-Bug: #1680996
    (cherry picked from commit f79aaca65f85bbd688a3735dfcf0aa4c9f4d3c5b)

tags: added: in-stable-ocata
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 7.0.0

This issue was fixed in the openstack/tripleo-common 7.0.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-common 6.1.0

This issue was fixed in the openstack/tripleo-common 6.1.0 release.
