Ironic API deployed with Apache: Exception registering nodes: Timed out waiting for a reply to message ID

Bug #1608252 reported by Emilien Macchi on 2016-07-31
Affects   Status         Importance   Assigned to
Ironic    Fix Released   High         Yuriy Zveryanskyy
tripleo   Fix Released   Critical     Emilien Macchi

Bug Description

When the Ironic API is deployed with Apache, a critical issue prevents nodes from being registered:
http://logs.openstack.org/56/333556/35/check-tripleo/gate-tripleo-ci-centos-7-ovb-ha/a097bab/logs/undercloud/var/log/ironic/app.txt.gz#_2016-07-31_15_28_21_830

2016-07-31 15:28:21.830 15571 ERROR wsme.api [req-76b24ee6-5907-47bf-86ae-93b2edc8d608 admin admin - - -] Server-side error: "Timed out waiting for a reply to message ID 50e8351ed91446859faaee9d50d4cf70". Detail:
Traceback (most recent call last):

  File "/usr/lib/python2.7/site-packages/wsmeext/pecan.py", line 84, in callfunction
    result = f(self, *args, **kwargs)

  File "/usr/lib/python2.7/site-packages/ironic/api/controllers/v1/node.py", line 447, in power
    topic)

  File "/usr/lib/python2.7/site-packages/ironic/conductor/rpcapi.py", line 179, in change_node_power_state
    new_state=new_state)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/client.py", line 169, in call
    retry=self.retry)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/transport.py", line 96, in _send
    timeout=timeout, retry=retry)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 464, in send
    retry=retry)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 453, in _send
    result = self._waiter.wait(msg_id, timeout)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 336, in wait
    message = self.waiters.get(msg_id, timeout=timeout)

  File "/usr/lib/python2.7/site-packages/oslo_messaging/_drivers/amqpdriver.py", line 238, in get
    'to message ID %s' % msg_id)

MessagingTimeout: Timed out waiting for a reply to message ID 50e8351ed91446859faaee9d50d4cf70
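
The last frames of the traceback show the API process blocked while waiting for an RPC reply keyed by message ID, then giving up. The sketch below illustrates that general wait-with-timeout pattern using only the standard library. It is not the oslo.messaging implementation: the `ReplyWaiters` class and the local `MessagingTimeout` exception are hypothetical stand-ins for illustration.

```python
import queue
import threading


class MessagingTimeout(Exception):
    """Local stand-in for the exception named in the traceback (hypothetical)."""


class ReplyWaiters:
    """Hypothetical, simplified registry of pending replies keyed by message ID."""

    def __init__(self):
        self._queues = {}
        self._lock = threading.Lock()

    def add(self, msg_id):
        # Register interest in a reply before the request is sent.
        with self._lock:
            self._queues[msg_id] = queue.Queue()

    def put(self, msg_id, reply):
        # Called by whichever thread reads replies off the wire.
        with self._lock:
            pending = self._queues.get(msg_id)
        if pending is not None:
            pending.put(reply)

    def get(self, msg_id, timeout):
        # Block until the reply arrives or the timeout expires.
        with self._lock:
            pending = self._queues[msg_id]
        try:
            return pending.get(timeout=timeout)
        except queue.Empty:
            raise MessagingTimeout(
                "Timed out waiting for a reply to message ID %s" % msg_id)
        finally:
            with self._lock:
                self._queues.pop(msg_id, None)
```

If the thread that should call `put()` never gets to run, `get()` can only time out, which is the symptom reported here.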

Tags: api
tags: added: alert
Changed in tripleo:
status: New → Confirmed
importance: Undecided → Critical

Fix proposed to branch: master
Review: https://review.openstack.org/349281

Changed in tripleo:
assignee: nobody → Emilien Macchi (emilienm)
status: Confirmed → In Progress
tags: removed: alert

Reviewed: https://review.openstack.org/349281
Committed: https://git.openstack.org/cgit/openstack/instack-undercloud/commit/?id=b487f96a541fab08d261b69f7e055117be319879
Submitter: Jenkins
Branch: master

commit b487f96a541fab08d261b69f7e055117be319879
Author: Emilien Macchi <email address hidden>
Date: Sun Jul 31 17:00:06 2016 +0000

    Revert "Deploy and Upgrade Ironic to run in mod_wsgi"

    It looks like node registration is failing randomly very often. Let's revert it now.

    This reverts commit 7aefae4e41eb7ffb83675dffd1130dd8c6802287.

    Change-Id: I53febcb01a2806fadaf688101925001da6188899
    Partial-Bug: #1608252

Emilien Macchi (emilienm) wrote :

The patch was reverted but I'm trying to enable it again: https://review.openstack.org/#/c/349301/

Feel free to look at the ovb-ha job and see why it randomly fails.

Changed in tripleo:
status: In Progress → Fix Released
Dmitry Tantsur (divius) on 2016-08-01
Changed in ironic:
status: New → Confirmed
importance: Undecided → High
tags: added: api

I've manually configured the ironic-api to run behind Apache with mod_wsgi following the guide upstream [0] and it works. Logs: http://paste.openstack.org/show/547736/

[0] http://docs.openstack.org/developer/ironic/deploy/install-guide.html#configuring-ironic-api-behind-mod-wsgi

Changed in ironic:
assignee: nobody → Lucas Alvares Gomes (lucasagomes)

For the TripleO case, I haven't dug into the Puppet modules to see how things are being set up, but I suspect a misconfiguration there.

Reviewed: https://review.openstack.org/350507
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=bacc872b72d18c344d20db1ead13fb6204a82ba7
Submitter: Jenkins
Branch: master

commit bacc872b72d18c344d20db1ead13fb6204a82ba7
Author: Lucas Alvares Gomes <email address hidden>
Date: Wed Aug 3 11:20:29 2016 +0100

    Extend the "configuring ironic-api behind mod_wsgi" guide

    The etc/apache2/ironic configuration have the logs file path pointing to
    /var/log/apache2/ which does not exist in the Red Hat systems (the
    equivalent is /var/log/httpd). This patch extend the documentation to
    point that out to the operator to look at these paths when setting up
    the ironic-api to run behing Apache mod_wsgi.

    Related-Bug: #1608252
    Change-Id: I591748245af885eeb782df82eaa5f33e123f8e06
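
The path difference described in this commit message can be checked before copying the sample Apache configuration onto a host. The sketch below is a minimal helper, not part of ironic: the function name `find_apache_log_dir` is hypothetical, and the two candidate paths are the ones named in the commit message.

```python
import os

# Candidate log directories named in the commit message above.
CANDIDATE_LOG_DIRS = ("/var/log/apache2", "/var/log/httpd")


def find_apache_log_dir(candidates=CANDIDATE_LOG_DIRS):
    """Return the first candidate that exists as a directory, else None."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None
```

Calling `find_apache_log_dir()` on the target host indicates which directory the log paths in the configuration should point to, or returns `None` if neither exists.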

Changed in ironic:
assignee: Lucas Alvares Gomes (lucasagomes) → nobody
Jay Faulkner (jason-oldos) wrote :

Am I correct in reading that this bug is now resolved, or at least invalid, for Ironic? Can I mark it as such?

Oksana Voshchana (ovoshchana) wrote :

Hi, this bug still exists.
I proposed a patch to use ironic with WSGI, and the bug reproduces if we use the default Apache config for ironic with threads > 1:
https://review.openstack.org/#/c/430851/

Changed in ironic:
assignee: nobody → Yuriy Zveryanskyy (yzveryanskyy)
status: Confirmed → In Progress
Changed in tripleo:
status: Fix Released → Triaged
Ruby Loo (rloo) wrote :

Yuriy submitted this patch to fix it: https://review.openstack.org/#/c/440292/

Reviewed: https://review.openstack.org/440292
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=556b1d0871d01ab17715955bef566c99bcf6bedc
Submitter: Jenkins
Branch: master

commit 556b1d0871d01ab17715955bef566c99bcf6bedc
Author: Yuriy Zveryanskyy <email address hidden>
Date: Thu Mar 2 12:01:36 2017 +0200

    Move eventlet monkey patch code

    Eventlet monkey patching is not recommended on top level __init__ [1],
    because Apache WSGI module uses own concurrency model [2] and API
    service under Apache should be runned without eventlet. This patch
    moves eventlet monkey patching code to ironic.cmd module __init__
    (like in nova).

    [1] https://specs.openstack.org/openstack/openstack-specs/specs/eventlet-best-practices.html
    [2] http://modwsgi.readthedocs.io/en/develop/user-guides/processes-and-threading.html

    Closes-Bug: 1608252
    Change-Id: I887a06566dcc2f09875f975f1e12ae4ff75fd348
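
The effect of moving the patch from the top-level `__init__` to the `cmd` package `__init__` can be illustrated without eventlet. The sketch below builds a throwaway package and uses a plain flag in place of the real monkey patch. The package name `demo_svc`, its layout, and both helper functions are hypothetical; this is not ironic's code.

```python
import importlib
import os
import sys
import tempfile
import textwrap

# Hypothetical package layout, used only for this illustration.
FILES = {
    "demo_svc/__init__.py": "# top level: deliberately no monkey patching\n",
    "demo_svc/state.py": "patched = False\n",
    "demo_svc/api.py": "from demo_svc import state\n",
    "demo_svc/cmd/__init__.py": textwrap.dedent("""\
        # Only command-line entry points import this package, so the patch
        # runs for standalone services but not for a WSGI import of the API.
        # A real service would call eventlet.monkey_patch() here instead.
        from demo_svc import state
        state.patched = True
        """),
}


def build_demo_package(root):
    """Write the demo package files under the given root directory."""
    for rel_path, source in FILES.items():
        full_path = os.path.join(root, rel_path)
        os.makedirs(os.path.dirname(full_path), exist_ok=True)
        with open(full_path, "w") as handle:
            handle.write(source)


def patched_after_import(module_name):
    """Import module_name from a fresh demo package and report the flag."""
    with tempfile.TemporaryDirectory() as root:
        build_demo_package(root)
        sys.path.insert(0, root)
        try:
            # Drop any previously imported copy so each call starts clean.
            for name in [n for n in sys.modules
                         if n.split(".")[0] == "demo_svc"]:
                del sys.modules[name]
            importlib.invalidate_caches()
            importlib.import_module(module_name)
            return sys.modules["demo_svc.state"].patched
        finally:
            sys.path.remove(root)
```

Importing `demo_svc.api`, as a WSGI server would, leaves the flag unset, while importing `demo_svc.cmd`, as a standalone service entry point would, sets it.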

Changed in ironic:
status: In Progress → Fix Released
Ben Nemec (bnemec) wrote :

It looks like this was fixed in ironic, and tripleo doesn't seem to be failing on it anymore.

Changed in tripleo:
status: Triaged → Fix Released

This issue was fixed in the openstack/ironic 8.0.0 release.
