Gate failure: Timed out waiting for Nova hypervisor-stats count >= 1

Bug #1441007 reported by Dmitry Tantsur
This bug affects 3 people
Affects   Status        Importance  Assigned to         Milestone
Ironic    Confirmed     High        Unassigned
devstack  Fix Released  Undecided   Thiago Paiva Brito

Bug Description

See for example http://logs.openstack.org/39/170439/2/gate/gate-tempest-dsvm-ironic-agent_ssh/66849d2/logs/devstacklog.txt.gz

Not sure how often it happens, just opening this bug to track the issue.

Vladyslav Drok (vdrok) wrote :

Looking at the logs, ir-api started at 2015-04-06 16:48:10.051, while the last attempt to do node.list from n-cpu happened at 2015-04-06 16:47:53.497. This also just happened to me locally on a fresh devstack.
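
To make the size of that gap concrete, here is a small sketch that subtracts the two timestamps quoted above. The variable names are only illustrative; the timestamps are the ones from the logs.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps quoted in the comment above.
last_node_list = datetime.strptime("2015-04-06 16:47:53.497", FMT)
ir_api_started = datetime.strptime("2015-04-06 16:48:10.051", FMT)

# How long after n-cpu's last node.list attempt did ir-api come up?
gap_seconds = (ir_api_started - last_node_list).total_seconds()
print(f"ir-api came up {gap_seconds:.3f} s after the last n-cpu attempt")
```

So n-cpu had already stopped retrying roughly 16.5 seconds before ir-api was available.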

Vladyslav Drok (vdrok) wrote :

Some nova x509 requests fail, and each one takes a minute to fail. That seems to be what causes ironic to start this late.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to devstack (master)

Fix proposed to branch: master
Review: https://review.openstack.org/171313

Changed in devstack:
status: New → In Progress
Adam Gandelman (gandelman-a) wrote :

It looks like openstackclient calls are eating up the window of time between n-cpu starting and ir-api coming up. I've opened https://bugs.launchpad.net/python-openstackclient/+bug/1441294 to look at that separately, but there is no harm in adjusting Ironic's API max retries here to cope with this and give devstack more time to do whatever it may need to do (review 171313).

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/173493

Changed in devstack:
assignee: nobody → Thiago Paiva Brito (thiagop)
Thiago Paiva Brito (outbrito) wrote :

n-cpu tries to communicate with Ironic for 60 seconds before giving up. As vdrok said, the nova x509 requests were failing, and the time spent waiting for them to time out caused n-cpu to exceed its maximum number of attempts to reach ir-api.
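
The failure mode described above can be sketched as a bounded retry loop that runs out of attempts before the service is up. This is only a model: the function name, its parameters, and the numbers are hypothetical and are not taken from nova's or Ironic's code or defaults.

```python
def reaches_service(service_ready_at, max_retries, retry_interval):
    """Return the attempt number that succeeds, or None if the budget runs out.

    All times are seconds since the client started. The client makes one
    attempt at t = 0, then one every retry_interval seconds, for at most
    max_retries attempts in total. (Hypothetical model, not nova's code.)
    """
    for attempt in range(1, max_retries + 1):
        attempt_time = (attempt - 1) * retry_interval
        if attempt_time >= service_ready_at:
            return attempt
    return None


# Illustrative numbers only: 30 attempts, 2 s apart (roughly a 60 s window).
on_time = reaches_service(service_ready_at=45, max_retries=30, retry_interval=2)
too_late = reaches_service(service_ready_at=75, max_retries=30, retry_interval=2)
# Doubling the retry budget lets the client outlast the same delay.
more_retries = reaches_service(service_ready_at=75, max_retries=60, retry_interval=2)
print(on_time, too_late, more_retries)
```

In this model, either shortening the startup delay (what the n-crt fix does) or enlarging the retry budget (what review 171313 proposed) lets the client reach the service.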

This was caused by a patch that removed n-crt from the default devstack setup. This patch corrects it.

P.S.: The n-cpu requests were also failing on devstack without Ironic set up, but that went unnoticed because the libvirt driver doesn't need a post-installed service.

OpenStack Infra (hudson-openstack) wrote : Fix merged to devstack (master)

Reviewed: https://review.openstack.org/173493
Committed: https://git.openstack.org/cgit/openstack-dev/devstack/commit/?id=73af846ca064f214828c9833ab83561be53a1be4
Submitter: Jenkins
Branch: master

commit 73af846ca064f214828c9833ab83561be53a1be4
Author: Thiago Paiva <email address hidden>
Date: Tue Apr 14 16:57:22 2015 -0300

    Fixing n-crt removal from stackrc

    The commit 279cfe75198c723519f1fb361b2bff3c641c6cef removed the n-crt
    service from the default devstack setup. As such, the stack.sh script
    began to throw the following error when trying to "nova x509-create-cert":

      ERROR (ClientException): The server has either erred or is incapable of
      performing the requested operation. (HTTP 500)

    This patch reintroduces n-crt as a default service.

    Change-Id: Id9695a37e1c6df567f2c86baa4475225adcfb0ee
    Closes-bug: #1441007

Changed in devstack:
status: In Progress → Fix Released
Zhenzan Zhou (zhenzan-zhou) wrote :

I previously opened a bug for almost the same issue: https://launchpad.net/bugs/1430616. The proposed patch is under review at https://review.openstack.org/#/c/173681/

OpenStack Infra (hudson-openstack) wrote : Change abandoned on devstack (master)

Change abandoned by Adam Gandelman (<email address hidden>) on branch: master
Review: https://review.openstack.org/171313

Matt Riedemann (mriedem) wrote :
Dmitry Tantsur (divius) wrote :
Changed in ironic:
importance: Medium → High