Fuel for OpenStack

Galera rebuild failed with pacemaker

Bug #1577911 reported by lieni on 2016-05-03

This bug affects 1 person

	Status	Importance	Assigned to	Milestone
Fuel for OpenStack	Invalid	High	Bogdan Dobrelya	Fuel for OpenStack 10.0
8.0.x	Invalid	High	Fuel Sustaining	Fuel for OpenStack 8.0-updates
Mitaka	Invalid	High	Bogdan Dobrelya	Fuel for OpenStack 9.0

Bug Description

Detailed bug description:
Mirantis 8.0 with Ubuntu
HA deployment with 3 nodes running Galera; we simulated a network outage or power loss.
After networking came back online, Pacemaker was not able to recover all resources.
RabbitMQ failed, and MySQL as well.

Steps to reproduce:
Deploy a cluster with 3 controller nodes, power off the switches, then bring the switches back up.

Expected results:
Pacemaker should recover the resources.

Actual result:
RabbitMQ and Galera MySQL are down.
Failed actions:
p_mysql_start_0 on sm4.domain.tld 'unknown error' (1): call=764, status=Timed Out, last-rc-change='Tue May 3 19:04:37 2016', queued=0ms, exec=300003ms

PCSD Status:
  192.168.199.3: Offline
  192.168.199.5: Offline
  192.168.199.6: Offline
What does "PCSD Status: Offline" mean? We found nothing about it on Google.
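
To make the failed action above easier to read, here is a minimal Python sketch that splits that line into its fields and converts the exec time to seconds. The function names `parse_failed_action` and `exec_seconds` are made up for this illustration and are not part of pcs or Pacemaker. The input is the line copied from this report. An exec time of about 300 seconds is consistent with the start operation running into a 300-second timeout, although the configured timeout itself is not shown in the report.

```python
import re

# Failed-action line copied verbatim from the report.
LINE = ("p_mysql_start_0 on sm4.domain.tld 'unknown error' (1): call=764, "
        "status=Timed Out, last-rc-change='Tue May 3 19:04:37 2016', "
        "queued=0ms, exec=300003ms")


def parse_failed_action(line):
    """Split one 'Failed actions' line into a dict of its fields.

    Hypothetical helper for illustration; it only handles lines shaped
    like the one in this report.
    """
    head, _, tail = line.partition(": ")
    match = re.match(
        r"(?P<op>\S+) on (?P<node>\S+) '(?P<reason>[^']*)' \((?P<rc>\d+)\)",
        head,
    )
    if match is None:
        raise ValueError("unrecognised failed-action line")
    fields = match.groupdict()
    # key=value pairs; a value is either quoted or runs up to the next comma.
    pairs = re.findall(r"([\w-]+)=(?:'([^']*)'|([^,]*))", tail)
    for key, quoted, bare in pairs:
        fields[key] = quoted or bare.strip()
    return fields


def exec_seconds(fields):
    """Convert the 'exec' field (for example '300003ms') to seconds."""
    value = fields["exec"]
    if not value.endswith("ms"):
        raise ValueError("exec field is not in milliseconds")
    return int(value[:-2]) / 1000.0


if __name__ == "__main__":
    parsed = parse_failed_action(LINE)
    print(parsed["op"], parsed["node"], parsed["status"], exec_seconds(parsed))
```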

Workaround:
Steps we tried to recover:
pcs resource debug-start clone_p_rabbit-server
RabbitMQ started on the first node; on the other nodes we ran stop_app, joined the cluster, and ran start_app.
RabbitMQ was restored OK.

Galera doesn't start up:
pcs resource debug-start clone_p_mysql
*** Error in `/usr/sbin/crm_resource': free(): invalid pointer: 0x000000000087a5b0 ***
resource clone_p_mysql is NOT running
resource clone_p_mysql is NOT running
resource clone_p_mysql is NOT running
error: resources_action_create: A service action must have a valid standard.

We looked for the node with the newest WSREP state.
We found one; the others reported -1.
We tried to start that one with pcs cleanup and debug-start.
Galera is still down.
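
The manual check for the node with the newest WSREP state can be sketched as follows. This is only an illustration under stated assumptions: it assumes each node's Galera state file (usually /var/lib/mysql/grastate.dat) has already been read into a string, and the node names and the seqno value 1432 in the sample are made up, not taken from this deployment. The helper names are hypothetical as well. A seqno of -1 generally means the node did not shut down cleanly, in which case the real position may have to be recovered first (for example with `mysqld --wsrep-recover`). Fuel's own resource agent may track positions differently.

```python
def parse_seqno(grastate_text):
    """Return the integer seqno found in the text of a grastate.dat file."""
    for line in grastate_text.splitlines():
        line = line.strip()
        if line.startswith("seqno:"):
            return int(line.split(":", 1)[1].strip())
    raise ValueError("no seqno line found")


def pick_bootstrap_candidate(states):
    """Pick the node with the highest seqno.

    `states` maps a node name to the text of its grastate.dat.
    Returns (node, seqno), or None when every node reports a negative
    seqno and no candidate can be chosen from these files alone.
    """
    seqnos = {node: parse_seqno(text) for node, text in states.items()}
    best = max(seqnos, key=seqnos.get)
    if seqnos[best] < 0:
        return None
    return best, seqnos[best]


# Made-up sample data: one node with a saved position, two reporting -1.
SAMPLE = {
    "node-a": "# GALERA saved state\nversion: 2.1\nseqno:   1432\n",
    "node-b": "# GALERA saved state\nversion: 2.1\nseqno:   -1\n",
    "node-c": "# GALERA saved state\nversion: 2.1\nseqno:   -1\n",
}

if __name__ == "__main__":
    print(pick_bootstrap_candidate(SAMPLE))
```

If every node reports -1, the function returns None rather than guessing, which matches the situation where the positions must be recovered before a bootstrap node can be chosen.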

Is this a bug, or did we just miss something?

lieni (oliver-lienhard-b) on 2016-05-03

description:	updated

Oleksiy Molchanov (omolchanov) wrote on 2016-05-04:

Please provide diagnostic snapshot.

tags:	added: area-libr
tags:	added: area-library removed: area-libr

Dmitry Pyzhov (dpyzhov) on 2016-05-12

no longer affects:	fuel/newton
Changed in fuel:
assignee:	Fuel Sustaining (fuel-sustaining-team) → Bogdan Dobrelya (bogdando)

lieni (oliver-lienhard-b) wrote on 2016-05-16:

Hi
Sorry, we don't have a diagnostic snapshot; the deployment was already deleted.
We will try to reproduce it in 1-2 weeks.

Dmitry Pyzhov (dpyzhov) wrote on 2016-06-06:

The bug has been in the Incomplete state for a month. Closing as Invalid. Please reopen if you have more data.

Changed in fuel:
status:	Incomplete → Invalid
