MySQL RA OCF should use a timeout wrapper instead of sleep commands

Bug #1449542 reported by Bogdan Dobrelya
This bug affects 1 person
Affects             Status     Importance  Assigned to      Milestone
Fuel for OpenStack  Invalid    Medium      Unassigned
Mitaka              Won't Fix  Medium      Liubov Efremova
Newton              Invalid    Medium      Unassigned

Bug Description

This comes from the original mail thread http://<email address hidden>/msg51625.html :

"If there is one cloned mysql instance not starting, the
whole pacemaker cluster gets stuck and does not emit any log. On the
host of the failed instance, I can see a mysql resource agent process
calling the sleep command. If I kill that process, the pacemaker comes
back alive and RabbitMQ master gets promoted. In fact this long timeout
is blocking every resource from state transition in pacemaker.

This may be a known problem of pacemaker and there are some discussions
in Linux-HA mailing list [2]. It might not be fixed in the near future.
It seems that in general it's bad to have a long timeout in state transition
actions (start/stop/promote/demote). There may be another way to
implement MySQL-wss resource agent to use a short start timeout and
monitor the wss cluster state using monitor action."

[2] http://lists.linux-ha.org/pipermail/linux-ha/2014-March/047989.html

I believe the start/stop/promote/demote and other commands must be wrapped in a timeout, the same way we did for the MQ RA OCF.
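A minimal sketch of what such a wrapper could look like, assuming GNU coreutils `timeout` is available on the node. The helper name `run_with_timeout` and the 5-second kill grace period are hypothetical and are not taken from the actual MySQL or MQ OCF scripts:

```shell
#!/bin/sh
# Hypothetical helper (not from the real OCF scripts): run a command under a
# hard time limit instead of waiting on it with open-ended sleep loops.
# Usage: run_with_timeout SECONDS COMMAND [ARGS...]
run_with_timeout() {
    limit="$1"
    shift
    # Send TERM once $limit seconds have elapsed, then KILL 5 seconds later
    # if the command is still running (the grace period is an assumption).
    timeout -s TERM -k 5 "$limit" "$@"
    rc=$?
    # GNU timeout exits with 124 when the time limit was reached.
    if [ "$rc" -eq 124 ]; then
        echo "command timed out after ${limit}s: $*" >&2
    fi
    return "$rc"
}

# Hypothetical use inside an RA action:
#   run_with_timeout 30 mysqladmin ping || return 1
```

The point of the design is that each external call has its own bound that is shorter than the operation timeout configured in Pacemaker, so the agent can return an error itself rather than sit in a sleep until lrmd expires the whole action.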

Related bug: https://bugs.launchpad.net/fuel/+bug/1432603

Changed in fuel:
milestone: none → 7.0
assignee: nobody → Fuel Library Team (fuel-library)
importance: Undecided → Medium
status: New → Triaged
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Sergii Golovatiuk (sgolovatiuk)
Bartłomiej Piotrowski (bpiotrowski) wrote :

Moving to 8.0 due to SCF.

Changed in fuel:
status: Triaged → Won't Fix
Matthew Mosesohn (raytrac3r) wrote :

This is a tech debt bug to refactor the MySQL OCF script, not an actual defect.

tags: added: ha tech-debt
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 7.0 → 8.0
status: Won't Fix → Triaged
no longer affects: fuel/8.0.x
Dmitry Pyzhov (dpyzhov)
tags: added: area-library
Dmitry Pyzhov (dpyzhov)
Changed in fuel:
milestone: 8.0 → 9.0
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Liubov Efremova (lefremova)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/287733

Changed in fuel:
status: Triaged → In Progress
Changed in fuel:
assignee: Liubov Efremova (lefremova) → Alex Schultz (alex-schultz)
Changed in fuel:
assignee: Alex Schultz (alex-schultz) → Liubov Efremova (lefremova)
Changed in fuel:
assignee: Liubov Efremova (lefremova) → Sergii Golovatiuk (sgolovatiuk)
Changed in fuel:
assignee: Sergii Golovatiuk (sgolovatiuk) → Liubov Efremova (lefremova)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-library (master)

Change abandoned by Fuel DevOps Robot (<email address hidden>) on branch: master
Review: https://review.openstack.org/287733
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.

Changed in fuel:
assignee: Liubov Efremova (lefremova) → Bogdan Dobrelya (bogdando)
Bogdan Dobrelya (bogdando) wrote :

The proc_stop command will take care of any orphans left running after a start action has been expired by the Pacemaker lrmd.
