Ironic API deployed with Apache: Exception registering nodes: Timed out waiting for a reply to message ID

Bug #1608252 reported by Emilien Macchi
Affects   Status         Importance   Assigned to         Milestone
Ironic    Fix Released   High         Yuriy Zveryanskyy
tripleo   Fix Released   Critical     Emilien Macchi

Bug Description

When the Ironic API is deployed with Apache, a critical issue prevents nodes from being registered:
http://logs.openstack.org/56/333556/35/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/a097bab/logs/undercloud/var/log/ironic/app.txt.gz#_2016-07-31_15_28_21_830

2016-07-31 15:28:21.830 15571 ERROR wsme.api [req-76b24ee6-5907-47bf-86ae-93b2edc8d608 admin admin - - -] Server-side error: "Timed out waiting for a reply to message ID 50e8351ed91446859faaee9d50d4cf70". Detail:
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/wsmeext/pecan.py", line 84, in callfunction
    result = f(self, *args, **kwargs)

  File "/usr/lib/python2.7/site-packages/ironic/api/controllers/v1/node.py", line 447, in power
    topic)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/rpcapi.py", line 179, in change_node_power_state
    new_state=new_state)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in call
    retry=self.retry)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 96, in _send
    timeout=timeout, retry=retry)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 464, in send
    retry=retry)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 453, in _send
    result = self._waiter.wait(msg_id, timeout)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 336, in wait
    message = self.waiters.get(msg_id, timeout=timeout)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 238, in get
    'to message ID %s' % msg_id)

MessagingTimeout: Timed out waiting for a reply to message ID 50e8351ed91446859faaee9d50d4cf70
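
The last frames of the traceback show the API process blocking in a waiter until a reply arrives or the timeout expires. The pattern can be sketched with the Python standard library alone. This is not oslo.messaging code: `ReplyWaiters`, `call`, `healthy_send` and `stalled_send` are hypothetical names, and `MessagingTimeout` is a local stand-in for the exception named above. The sketch only illustrates that the caller sees this error whenever nothing delivers the reply in time, whatever the underlying reason.

```python
import queue
import threading
import uuid


class MessagingTimeout(Exception):
    """Local stand-in for the exception class named in the traceback."""


class ReplyWaiters:
    """Maps message IDs to queues that some listener fills with replies."""

    def __init__(self):
        self._queues = {}
        self._lock = threading.Lock()

    def add(self, msg_id):
        with self._lock:
            self._queues[msg_id] = queue.Queue()

    def put(self, msg_id, reply):
        with self._lock:
            target = self._queues.get(msg_id)
        if target is not None:
            target.put(reply)

    def get(self, msg_id, timeout):
        try:
            # Block until the listener delivers a reply or the timeout expires.
            return self._queues[msg_id].get(timeout=timeout)
        except queue.Empty:
            raise MessagingTimeout(
                "Timed out waiting for a reply to message ID %s" % msg_id)
        finally:
            with self._lock:
                self._queues.pop(msg_id, None)


def call(waiters, send, timeout):
    """Send a request, then wait for the matching reply."""
    msg_id = uuid.uuid4().hex
    waiters.add(msg_id)  # register before sending so no reply is lost
    send(msg_id)
    return waiters.get(msg_id, timeout)


def healthy_send(waiters):
    """Simulate a peer whose reply is delivered by a background thread."""
    def send(msg_id):
        worker = threading.Thread(
            target=waiters.put, args=(msg_id, "power on accepted"))
        worker.start()
    return send


def stalled_send(msg_id):
    """Simulate a request whose reply is never delivered to the waiter."""


WAITERS = ReplyWaiters()
OK = call(WAITERS, healthy_send(WAITERS), timeout=1.0)
try:
    call(WAITERS, stalled_send, timeout=0.1)
    ERROR = None
except MessagingTimeout as exc:
    ERROR = str(exc)
print(OK)
print(ERROR)
```

In the second call no reply is ever placed on the queue, so the caller gets the same style of error as in the log, even though the request itself was sent.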

Tags: api
tags: added: alert
Changed in tripleo:
status: New → Confirmed
importance: Undecided → Critical
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to instack-undercloud (master)

Fix proposed to branch: master
Review: https://review.openstack.org/349281

Changed in tripleo:
assignee: nobody → Emilien Macchi (emilienm)
status: Confirmed → In Progress
tags: removed: alert
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to instack-undercloud (master)

Reviewed: https://review.openstack.org/349281
Committed: https://git.openstack.org/cgit/openstack/instack-undercloud/commit/?id=b487f96a541fab08d261b69f7e055117be319879
Submitter: Jenkins
Branch: master

commit b487f96a541fab08d261b69f7e055117be319879
Author: Emilien Macchi <email address hidden>
Date: Sun Jul 31 17:00:06 2016 +0000

    Revert "Deploy and Upgrade Ironic to run in mod_wsgi"

    It looks like node registration is failing randomly very often. Let's revert it now.

    This reverts commit 7aefae4e41eb7ffb83675dffd1130dd8c6802287.

    Change-Id: I53febcb01a2806fadaf688101925001da6188899
    Partial-Bug: #1608252

Revision history for this message
Emilien Macchi (emilienm) wrote :

The patch was reverted but I'm trying to enable it again: https://review.openstack.org/#/c/349301/

Feel free to look at the ovb-ha job and see why it randomly fails.

Changed in tripleo:
status: In Progress → Fix Released
Dmitry Tantsur (divius)
Changed in ironic:
status: New → Confirmed
importance: Undecided → High
tags: added: api
Revision history for this message
Lucas Alvares Gomes (lucasagomes) wrote :

I've manually configured the ironic-api to run behind Apache with mod_wsgi following the guide upstream [0] and it works. Logs: http://paste.openstack.org/show/547736/

[0] http://docs.openstack.org/developer/ironic/deploy/install-guide.html#configuring-ironic-api-behind-mod-wsgi

Changed in ironic:
assignee: nobody → Lucas Alvares Gomes (lucasagomes)
Revision history for this message
Lucas Alvares Gomes (lucasagomes) wrote :

For the TripleO case, I haven't dug into the puppet modules to see how things are being set up, but I suspect it may be a misconfiguration there.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to ironic (master)

Reviewed: https://review.openstack.org/350507
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=bacc872b72d18c344d20db1ead13fb6204a82ba7
Submitter: Jenkins
Branch: master

commit bacc872b72d18c344d20db1ead13fb6204a82ba7
Author: Lucas Alvares Gomes <email address hidden>
Date: Wed Aug 3 11:20:29 2016 +0100

    Extend the "configuring ironic-api behind mod_wsgi" guide

    The etc/apache2/ironic configuration has the log file paths pointing to
    /var/log/apache2/, which does not exist on Red Hat systems (the
    equivalent is /var/log/httpd). This patch extends the documentation to
    point that out, so that operators check these paths when setting up
    the ironic-api to run behind Apache mod_wsgi.

    Related-Bug: #1608252
    Change-Id: I591748245af885eeb782df82eaa5f33e123f8e06

Revision history for this message
Lucas Alvares Gomes (lucasagomes) wrote :
Changed in ironic:
assignee: Lucas Alvares Gomes (lucasagomes) → nobody
Revision history for this message
Jay Faulkner (jason-oldos) wrote :

Am I correct in reading that this bug is now resolved, or at least invalid, for Ironic? Can I mark it as such?

Revision history for this message
Oksana Voshchana (ovoshchana) wrote :

Hi, this bug still exists.
I proposed a patch to use ironic with wsgi, and the bug reproduces if we use the default Apache config for ironic with threads > 1:
https://review.openstack.org/#/c/430851/

Changed in ironic:
assignee: nobody → Yuriy Zveryanskyy (yzveryanskyy)
status: Confirmed → In Progress
Changed in tripleo:
status: Fix Released → Triaged
Revision history for this message
Ruby Loo (rloo) wrote :

Yuriy submitted this patch to fix it: https://review.openstack.org/#/c/440292/

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/440292
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=556b1d0871d01ab17715955bef566c99bcf6bedc
Submitter: Jenkins
Branch: master

commit 556b1d0871d01ab17715955bef566c99bcf6bedc
Author: Yuriy Zveryanskyy <email address hidden>
Date: Thu Mar 2 12:01:36 2017 +0200

    Move eventlet monkey patch code

    Eventlet monkey patching is not recommended in the top-level __init__ [1],
    because the Apache WSGI module uses its own concurrency model [2] and the
    API service under Apache should be run without eventlet. This patch
    moves the eventlet monkey patching code to the ironic.cmd module's
    __init__ (as in nova).

    [1] https://specs.openstack.org/openstack/openstack-specs/specs/eventlet-best-practices.html
    [2] http://modwsgi.readthedocs.io/en/develop/user-guides/processes-and-threading.html

    Closes-Bug: 1608252
    Change-Id: I887a06566dcc2f09875f975f1e12ae4ff75fd348
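
The effect of moving the patching code follows from how Python imports packages: a package's top-level `__init__` runs for every importer, including a WSGI application loaded by Apache, while a subpackage's `__init__` runs only when that subpackage is imported. The sketch below demonstrates this with a throwaway package written to a temporary directory. The package name `demo_service`, its modules, and the `PATCHED` flag are all hypothetical; the flag stands in for the real `eventlet.monkey_patch()` call, which is not executed here.

```python
import importlib
import os
import sys
import tempfile

# Hypothetical package name; stands in for a service such as ironic.
PKG = "demo_service"

FILES = {
    # The top-level __init__ stays free of side effects.
    "__init__.py": "PATCHED = False\n",
    # Only command-line entry points import this subpackage. In a real
    # service, this is where eventlet.monkey_patch() would be called.
    "cmd/__init__.py": (
        "import demo_service\n"
        "demo_service.PATCHED = True\n"
    ),
    "cmd/conductor.py": "def main():\n    return 'conductor started'\n",
    # WSGI entry point, as Apache would load it; it never imports cmd.
    "wsgi.py": (
        "import demo_service\n"
        "\n"
        "def application_is_patched():\n"
        "    return demo_service.PATCHED\n"
    ),
}


def build_package(root):
    """Write the demo package files under the given root directory."""
    for rel, body in FILES.items():
        path = os.path.join(root, PKG, rel)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write(body)


def load_demo():
    """Import the WSGI module first, then a command-line module."""
    root = tempfile.mkdtemp()
    build_package(root)
    sys.path.insert(0, root)
    importlib.invalidate_caches()

    wsgi = importlib.import_module(PKG + ".wsgi")
    patched_under_wsgi = wsgi.application_is_patched()

    # Importing any module under cmd runs cmd/__init__.py first.
    importlib.import_module(PKG + ".cmd.conductor")
    patched_after_cmd = wsgi.application_is_patched()
    return patched_under_wsgi, patched_after_cmd


RESULT = load_demo()
print(RESULT)
```

The first value shows that loading only the WSGI module leaves the flag unset. The second shows that it flips as soon as a command-line module is imported. With the patching code in the top-level `__init__` instead, both values would be true.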

Changed in ironic:
status: In Progress → Fix Released
Revision history for this message
Ben Nemec (bnemec) wrote :

It looks like this was fixed in ironic, and tripleo doesn't seem to be failing on it anymore.

Changed in tripleo:
status: Triaged → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/ironic 8.0.0

This issue was fixed in the openstack/ironic 8.0.0 release.
