Fuel for OpenStack

Add cluster health task with disk monitor

Series mitaka
Bug #1500422 reported by OpenStack Infra on 2015-09-28

This bug affects 5 people

Affects	Status	Importance	Assigned to	Milestone
Fuel for OpenStack	Fix Released	High	Bogdan Dobrelya	Fuel for OpenStack 9.1
8.0.x	Won't Fix	High	Fuel Documentation Team	Fuel for OpenStack 8.0-updates
Mitaka	Fix Committed	High	Fuel Documentation Team	Fuel for OpenStack 9.0

Bug Description

https://review.openstack.org/226062
commit 03e7683381d14c4a9d5da93481b2d5140e7896f0
Author: Alex Schultz <email address hidden>
Date: Mon Sep 21 16:29:56 2015 -0500

Add cluster health task with disk monitor

This change adds a monitor into corosync/pacemaker to migrate services
if the monitored disks drop below 100M free.

Once the operator has resolved the full disk, they must clear the
alarm by running:

crm node status-attr <hostname> delete "#health_disk"

After the alarm has been cleared, the services should be automatically
restarted.

This change is not a replacement for proper monitoring, but it will
properly shut down and migrate services if a controller runs out of disk
space.
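
As a rough illustration of the 100M threshold described above, the following sketch checks free space by hand. It is not the corosync/pacemaker monitor added by this change; the function names, the THRESHOLD_MB variable, and the default path of / are assumptions made for the example.

```shell
#!/bin/sh
# Manual free-space check; NOT the Pacemaker monitor itself.
# THRESHOLD_MB mirrors the 100M figure from the commit message (assumed to mean MiB).
THRESHOLD_MB=100

# Print the free space, in MiB, of the filesystem that holds path $1.
# df -P gives one POSIX-format line per filesystem; column 4 is "Available" in KiB.
free_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Succeed (exit status 0) if the filesystem holding $1 is below the threshold.
is_low() {
    [ "$(free_mb "$1")" -lt "$THRESHOLD_MB" ]
}

# Check the path given as the first argument, or / if none is given.
path=${1:-/}
if is_low "$path"; then
    echo "LOW: less than ${THRESHOLD_MB} MiB free on ${path}"
else
    echo "OK: at least ${THRESHOLD_MB} MiB free on ${path}"
fi
```

Running this on a controller before clearing the alarm gives a quick indication of whether the disk is still below the limit.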

DocImpact
Closes-Bug: 1493520

Change-Id: I8a2cb4bd8d0b6070400d13e25d2310f4777b9faf

Stanislaw Bogatkin (sbogatkin) on 2015-09-28

Changed in fuel:
assignee:	nobody → Fuel Documentation Team (fuel-docs)
milestone:	none → 8.0
importance:	Undecided → Medium
status:	New → Confirmed

Dmitry Pyzhov (dpyzhov) on 2015-10-22

tags:	added: area-docs

Timur Nurlygayanov (tnurlygayanov) wrote on 2016-06-22:

Please also see this one (about the same issue):

https://bugs.launchpad.net/fuel/+bug/1595146

Maksim Malchuk (mmalchuk) on 2016-06-22

no longer affects:	fuel/newton

Timur Nurlygayanov (tnurlygayanov) wrote on 2016-06-22:

We need to add a description of the recovery procedure for the following case:

1. A customer has an HA environment with MOS 8.x-9.x.
2. One partition on one controller becomes full (a "not enough free disk space" error).
3. Pacemaker automatically shuts down all services on this controller.
4. The operator should log in to the controller node, move or remove extra files from the disks, and then run the following command to recover Pacemaker:
crm node status-attr `hostname -f` delete "#health_disk"
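
The step above can be sketched as a small helper that refuses to clear the alarm until enough space has been freed. This is only an illustration: the function names and the MIN_FREE_MB and DRY_RUN variables are hypothetical, the 100 MiB default is taken from the commit message, and only the crm command itself comes from this report. By default the helper prints the command instead of running it.

```shell
#!/bin/sh
# Sketch of step 4: clear "#health_disk" only after disk space has been freed.
# MIN_FREE_MB and DRY_RUN are illustrative names, not Fuel or Pacemaker settings.
MIN_FREE_MB=${MIN_FREE_MB:-100}
DRY_RUN=${DRY_RUN:-1}

# Print the free space, in MiB, of the filesystem that holds path $1.
free_mb() {
    df -Pk "$1" | awk 'NR == 2 { print int($4 / 1024) }'
}

# Clear the disk health attribute for node $1, or only print the command
# when DRY_RUN is 1.
clear_disk_alarm() {
    node=$1
    if [ "$DRY_RUN" = 1 ]; then
        printf 'would run: crm node status-attr %s delete "#health_disk"\n' "$node"
    else
        crm node status-attr "$node" delete "#health_disk"
    fi
}

# recover NODE [PATH]: check free space on PATH (default /) and, if it is
# at or above MIN_FREE_MB, clear the alarm for NODE. Returns 1 otherwise.
recover() {
    node=$1
    path=${2:-/}
    if [ "$(free_mb "$path")" -lt "$MIN_FREE_MB" ]; then
        echo "less than ${MIN_FREE_MB} MiB free on ${path}; free more space first" >&2
        return 1
    fi
    clear_disk_alarm "$node"
}
```

On the affected controller, this could be called as `recover "$(hostname -f)" /`, and then again with `DRY_RUN=0` once the printed command looks right.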

Other possible workarounds:
1. Restart the pacemaker service:
service pacemaker restart
2. Reboot the controller node.

We need to describe the correct recovery workflow for this situation in the documentation for OpenStack operators and the support team.

Please see comments from Vladimir Kuklin here for more detailed information:
https://bugs.launchpad.net/fuel/+bug/1595100

Changed in fuel:
importance:	Medium → High

Fuel Devops McRobotson (fuel-devops-robot) wrote on 2016-06-23: Fix proposed to mos/mos-docs (master)

Fix proposed to branch: master
Change author: Bogdan Dobrelya <email address hidden>
Review: https://review.fuel-infra.org/22506

Bogdan Dobrelya (bogdando) on 2016-06-23

Changed in fuel:
assignee:	Fuel Documentation Team (fuel-docs) → Bogdan Dobrelya (bogdando)
status:	Confirmed → In Progress

Bogdan Dobrelya (bogdando) on 2016-06-23

tags:	removed: fuel-library

Fuel Devops McRobotson (fuel-devops-robot) wrote on 2016-07-18: Fix merged to mos/mos-docs (master)

Reviewed: https://review.fuel-infra.org/22506
Submitter: Svetlana Karslioglu <email address hidden>
Branch: master

Commit: e9f2c8576f064dfd59409c1f59655bb1284077a4
Author: Bogdan Dobrelya <email address hidden>
Date: Mon Jul 11 09:02:14 2016

Add free space monitoring guide

Closes-bug: #1500422

Change-Id: Id31279c4dc5eb7e1102fc8d90d0defb265742d88
Signed-off-by: Bogdan Dobrelya <email address hidden>

Michele Fagan (michelefagan) on 2016-07-25

Changed in fuel:
status:	In Progress → Fix Released
milestone:	10.0 → 9.1
