[LMA] the collector service stops when the local RabbitMQ server is down

Bug #1503251 reported by Simon Pasquier
Affects        Status         Importance   Assigned to      Milestone
Fuel Plugins   Fix Released   High         Simon Pasquier
StackLight     Fix Released   High         Simon Pasquier
  0.8          Fix Released   Undecided    Unassigned

Bug Description

Steps to reproduce
==================

- Deploy an OpenStack environment.
- Log in to one of the controllers and kill the RabbitMQ process.
- Verify that the LMA collector isn't running either.
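
To confirm that the local RabbitMQ endpoint is really down after the second step, a quick TCP probe can help. This is only a minimal sketch: the function name `port_is_open` is hypothetical, and the address 127.0.0.1:5673 is taken from the collector log quoted in this report, so it may differ on other deployments.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises an OSError subclass (for example
        # ConnectionRefusedError) when nothing is listening on the port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_open("127.0.0.1", 5673)` on the controller should return False once the RabbitMQ process has been killed. It does not say anything about the state of the LMA collector itself, which still has to be checked separately.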

Expected result
===============

The LMA collector on the controller should be running.

Tags: lma
Changed in fuel-plugins:
assignee: LMA-Toolchain Fuel Plugins (mos-lma-toolchain) → Simon Pasquier (simon-pasquier)
Revision history for this message
Simon Pasquier (simon-pasquier) wrote :

This is obviously a regression: with LMA 0.7, Heka can cope with RabbitMQ being down after it has initialized. On a 0.7 environment, I get these lines in /var/log/lma_collector.log:

2015/10/07 15:19:57 Input 'openstack_error_amqp' error: dial tcp 127.0.0.1:5673: connection refused
2015/10/07 15:19:58 Input 'openstack_info_amqp' error: dial tcp 127.0.0.1:5673: connection refused
2015/10/07 15:19:58 Input 'openstack_warn_amqp' error: dial tcp 127.0.0.1:5673: connection refused
2015/10/07 15:20:13 Input 'openstack_error_amqp' error: dial tcp 127.0.0.1:5673: connection refused
2015/10/07 15:20:14 Input 'openstack_info_amqp' error: dial tcp 127.0.0.1:5673: connection refused
2015/10/07 15:20:14 Input 'openstack_warn_amqp' error: dial tcp 127.0.0.1:5673: connection refused
2015/10/07 15:20:43 Input 'openstack_error_amqp' error: dial tcp 127.0.0.1:5673: connection refused
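
The behaviour visible in the 0.7 log above (the input logs the error and tries again instead of stopping the service) can be illustrated with a small retry loop. This is a sketch under assumptions and not Heka's actual implementation: `connect_with_retry`, its parameters, and the fixed retry delay are all hypothetical.

```python
import time
from typing import Callable, Optional


def connect_with_retry(connect: Callable[[], None],
                       retry_delay: float = 15.0,
                       max_attempts: Optional[int] = None,
                       sleep: Callable[[float], None] = time.sleep,
                       log: Callable[[str], None] = print) -> bool:
    """Call connect() until it succeeds, logging each refused attempt.

    Returns True once connect() succeeds, or False if max_attempts is
    reached first. With max_attempts=None it retries indefinitely.
    """
    attempts = 0
    while max_attempts is None or attempts < max_attempts:
        attempts += 1
        try:
            connect()
            return True
        except ConnectionRefusedError as exc:
            # Log and wait instead of letting the error stop the process.
            log(f"Input error: {exc}")
            sleep(retry_delay)
    return False
```

The point of the design is that a refused connection is treated as a recoverable condition: the caller keeps running and simply waits for the broker to come back.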

Revision history for this message
Simon Pasquier (simon-pasquier) wrote :

I've filed an issue with Heka [1], but the problem might be in the AMQP library...

[1] https://github.com/mozilla-services/heka/issues/1757

Revision history for this message
Simon Pasquier (simon-pasquier) wrote :
Changed in fuel-plugins:
status: Confirmed → In Progress
Changed in lma-toolchain:
status: New → In Progress
importance: Undecided → High
assignee: nobody → Simon Pasquier (simon-pasquier)
milestone: none → 0.8.0
summary: - [LMA] the collector service stops when the local rabbitmq server is down
+ [LMA] the collector service stops when the local RabbitMQ server is down
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-plugin-lma-collector (master)

Fix proposed to branch: master
Review: https://review.openstack.org/250410

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-plugin-lma-collector (master)

Reviewed: https://review.openstack.org/250410
Committed: https://git.openstack.org/cgit/openstack/fuel-plugin-lma-collector/commit/?id=c5f97a203b5e5c517987be2c6aecc567f2f95656
Submitter: Jenkins
Branch: master

commit c5f97a203b5e5c517987be2c6aecc567f2f95656
Author: Simon Pasquier <email address hidden>
Date: Thu Nov 26 15:03:03 2015 +0100

    Update Heka to 0.10.0b2

    This version will allow us to enable the buffering for the output
    plugins and deal properly with RabbitMQ connection drops.

    Change-Id: I087236ecc7756d005a98cd11d3e5efe8cbdc00cb
    Closes-Bug: #1503251
    Partial-Bug: #1488717

Changed in lma-toolchain:
status: In Progress → Fix Committed
Changed in fuel-plugins:
status: In Progress → Fix Committed
Changed in lma-toolchain:
milestone: 0.8.0 → 0.9.0
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-plugin-lma-collector (stable/0.8)

Fix proposed to branch: stable/0.8
Review: https://review.openstack.org/258465

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-plugin-lma-collector (stable/0.8)

Reviewed: https://review.openstack.org/258465
Committed: https://git.openstack.org/cgit/openstack/fuel-plugin-lma-collector/commit/?id=7206aa0e4b67afb13c22748268fd51a0ca58e139
Submitter: Jenkins
Branch: stable/0.8

commit 7206aa0e4b67afb13c22748268fd51a0ca58e139
Author: Simon Pasquier <email address hidden>
Date: Thu Nov 26 15:03:03 2015 +0100

    Update Heka to 0.10.0b2

    This version will allow us to enable the buffering for the output
    plugins and deal properly with RabbitMQ connection drops.

    Change-Id: I087236ecc7756d005a98cd11d3e5efe8cbdc00cb
    Closes-Bug: #1503251
    Partial-Bug: #1488717
    (cherry picked from commit c5f97a203b5e5c517987be2c6aecc567f2f95656)

Changed in fuel-plugins:
status: Fix Committed → Fix Released
Changed in lma-toolchain:
status: Fix Committed → Fix Released