RabbitMQ OCF stop action fails if no pid file exists

Bug #1446526 reported by Bogdan Dobrelya
This bug affects 1 person
Affects             Status        Importance  Assigned to       Milestone
Fuel for OpenStack  Fix Released  High        Bogdan Dobrelya
5.1.x               Won't Fix     High        Denis Meltsaykin
6.0.x               Won't Fix     High        Denis Meltsaykin

Bug Description

Currently, the stop action fails if the pid file does not exist for some reason, even though the rabbit beam process might still be running.
stop_server_process() (https://github.com/stackforge/fuel-library/blob/master/deployment/puppet/cluster/files/ocf/rabbitmq#L583-593) exits with a generic error and does not try to stop the running process.

How to reproduce:
1) rm -f /var/run/rabbitmq/p_pid
2) ocf_handler_rabbitmq-server -d stop

As a solution, it should instead invoke 'rabbitmqctl stop' and return a generic error only if this operation has failed as well. Note that it should not try to find the beam process in ps and kill it, as this could affect other rabbitmq instances that may be running.
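
The proposed fallback could look roughly like the sketch below. This is an illustration only, not the code from the fuel-library agent: the function name stop_without_pidfile and the RABBITMQCTL override variable are assumptions made for this example, and the log text is invented. Only 'rabbitmqctl stop' and the standard OCF return codes come from real tooling.

```shell
#!/bin/sh
# Sketch only: stop_without_pidfile and RABBITMQCTL are hypothetical names
# introduced for this example; they are not taken from the real OCF script.

OCF_SUCCESS=0        # standard OCF return code: success
OCF_ERR_GENERIC=1    # standard OCF return code: generic error

# Allow the command to be overridden, e.g. for a dry run without a broker.
: "${RABBITMQCTL:=rabbitmqctl}"

# Fallback for the case where the pid file is missing: ask the node to shut
# itself down instead of failing straight away. Deliberately no ps/kill here,
# so other rabbitmq instances on the host are left alone.
stop_without_pidfile() {
    if "$RABBITMQCTL" stop >/dev/null 2>&1; then
        return "$OCF_SUCCESS"
    fi
    # Both the pid-based stop and 'rabbitmqctl stop' are unavailable or failed.
    echo "rabbitmqctl stop failed; the process may need to be stopped manually" >&2
    return "$OCF_ERR_GENERIC"
}
```

A caller would invoke this only when the pid file check fails, for example when [ ! -f /var/run/rabbitmq/p_pid ] holds, and would pass the return code back as the result of the stop action.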

description: updated
Changed in fuel:
milestone: none → 6.1
importance: Undecided → High
assignee: nobody → Bogdan Dobrelya (bogdando)
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/175785

OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/175785
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=bd6470d1683b777c9980fbb23be5dcfcf108707e
Submitter: Jenkins
Branch: master

commit bd6470d1683b777c9980fbb23be5dcfcf108707e
Author: Bogdan Dobrelya <email address hidden>
Date: Tue Apr 21 11:24:18 2015 +0200

    Fix RabbitMQ OCF stop action when no PIDFILE exists

    W/o this patch, RabbitMQ OCF action stop would fail
    if there is no pid file exist. This is an issue as that
    could leave the beam process running and uncontrolled
    by the Pacemaker resource agent.

    The solution is to invoke 'rabbitmqctl stop' and return generic
    error only if this operation have failed as well or timed out.
    For this case, there also should be a note provided in logs about
    additional manual stop actions required.

    Closes-bug: #1446526

    Change-Id: I86272230bd7f0ef8412ccaeb6bbb5dcaa18387bc
    Signed-off-by: Bogdan Dobrelya <email address hidden>

Changed in fuel:
status: In Progress → Fix Committed
Alexander Nevenchannyy (anevenchannyy) wrote :

Verified on MOS 6.1 ISO #429
Steps to Verify:
1) rm -f /var/run/rabbitmq/p_pid
2) ocf_handler_rabbitmq-server -d stop
...
Exit status: -e Success (0)

Changed in fuel:
status: Fix Committed → Fix Released
Denis Meltsaykin (dmeltsaykin) wrote :

Setting this as Won't Fix for 5.1.1-updates and 6.0-updates, as such a complex change cannot be delivered within the scope of a Maintenance Update. Also, a possible solution, backporting the RabbitMQ OCF script, is covered in detail in the Operations Guide in the official product documentation.
