InstanceDeployFailure in overcloud job

Bug #1380782 reported by Ben Nemec
This bug affects 2 people
Affects: tripleo
Status: Expired
Importance: High
Assigned to: Unassigned

Bug Description

We've been seeing this in CI on a pretty regular basis lately. It's not clear exactly what is going on (yay logging!), but one of the overcloud instances goes to ERROR state and you see entries like this in the nova-compute and ironic-conductor logs:

InstanceDeployFailure: Failed to provision instance 395da77b-f125-4d29-a694-df61fad22f4c: Failed to deploy. Error: Failed to execute command via SSH: LC_ALL=C /usr/bin/virsh --connect qemu:///system destroy baremetalbrbm4_1.

I'm not even sure that's related since it's trying to destroy the instance, but it's the only thing I've found so far. I'm mostly opening this to track the problem so we're not all looking at it separately.
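For anyone else tracking this across CI runs, a minimal sketch like the one below could pull the instance UUID and the failing SSH command out of log lines such as the one quoted above. The regular expression is derived only from that single line, so other InstanceDeployFailure messages may not match it, and `parse_deploy_failure` is a hypothetical helper name, not something provided by nova or ironic.

```python
import re

# Pattern based solely on the one log line quoted in this report; other
# InstanceDeployFailure messages may be worded differently.
DEPLOY_FAILURE_RE = re.compile(
    r"InstanceDeployFailure: Failed to provision instance "
    r"(?P<uuid>[0-9a-f-]{36}): Failed to deploy\. "
    r"Error: Failed to execute command via SSH: (?P<command>.+?)\.?$"
)


def parse_deploy_failure(line):
    """Return (instance_uuid, ssh_command) for a matching line, else None."""
    match = DEPLOY_FAILURE_RE.search(line.strip())
    if match is None:
        return None
    return match.group("uuid"), match.group("command")
```

Running this over the nova-compute and ironic-conductor logs and grouping by command would at least show whether the failing call is always a `virsh destroy`, or whether other commands fail the same way.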

Ben Nemec (bnemec) wrote :

Thought this might be related to the ci_commands filter script, but it doesn't appear so:

[fedora@openstack bin]$ SSH_ORIGINAL_COMMAND="LC_ALL=C /usr/bin/virsh --connect qemu:///system destroy baremetalbrbm4_1" ./ci_commands
Calling LC_ALL=C /usr/bin/virsh --connect qemu:///system destroy baremetalbrbm4_1
error: failed to connect to the hypervisor
error: Failed to connect socket to '/var/run/libvirt/libvirt-sock': No such file or directory

libvirt isn't actually running on this VM, so the command itself fails, but the filter script does allow it through.
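For context on what a filter like this does: when sshd runs a forced command (for example one set with `command=` in `authorized_keys`), it puts the command the client asked for in the `SSH_ORIGINAL_COMMAND` environment variable, and the wrapper decides whether to run it. The sketch below only illustrates that idea and is not the actual ci_commands script. The allowlist pattern and the names `is_allowed`, `split_env_prefix` and `main` are assumptions made up for this example.

```python
import os
import re
import shlex
import subprocess
import sys

# Hypothetical allowlist: only virsh calls against the system hypervisor,
# with arguments limited to characters that have no special shell meaning.
ALLOWED = [
    re.compile(r"LC_ALL=C /usr/bin/virsh --connect qemu:///system [\w .:/-]+"),
]


def is_allowed(command):
    """True if the whole command string matches an allowlist pattern."""
    return any(p.fullmatch(command) for p in ALLOWED)


def split_env_prefix(command):
    """Split leading VAR=value assignments from the argv that follows."""
    env, argv = {}, shlex.split(command)
    while argv and re.match(r"[A-Za-z_][A-Za-z0-9_]*=", argv[0]):
        name, _, value = argv.pop(0).partition("=")
        env[name] = value
    return env, argv


def main():
    command = os.environ.get("SSH_ORIGINAL_COMMAND", "")
    if not is_allowed(command):
        print("Rejected: %s" % command, file=sys.stderr)
        return 1
    print("Calling %s" % command)
    env, argv = split_env_prefix(command)
    # Run without a shell; the exit status of the command is passed back.
    return subprocess.call(argv, env=dict(os.environ, **env))
```

With a filter shaped like this, the `virsh destroy` command from the log passes the allowlist check, which is consistent with the test above: the request gets through and any failure comes from virsh itself.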

Steven Hardy (shardy) wrote : potentially eol bug

This bug was reported against an old version of TripleO, and may no longer be valid.

Since it was reported before the start of the liberty cycle (and our oldest stable
branch is stable/liberty), I'm marking this incomplete.

Please reopen this (change the status from incomplete) if the bug is still valid
on a current supported (stable/liberty, stable/mitaka or trunk) version of TripleO,
thanks!

Changed in tripleo:
status: Triaged → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for tripleo because there has been no activity for 60 days.]

Changed in tripleo:
status: Incomplete → Expired