Galera rebuild failed with pacemaker

Bug #1577911 reported by lieni
This bug affects 1 person
Affects             Status   Importance  Assigned to      Milestone
Fuel for OpenStack  Invalid  High        Bogdan Dobrelya
8.0.x               Invalid  High        Fuel Sustaining
Mitaka              Invalid  High        Bogdan Dobrelya

Bug Description

Detailed bug description:
Mirantis 8.0 with Ubuntu.
 HA deployment with a 3-node Galera cluster; we simulated a network outage or power loss.
 After networking was back online, Pacemaker was not able to recover all resources.
 RabbitMQ failed, and MySQL as well.

Steps to reproduce:
 Deploy a 3-node controller cluster, power off the switches, then power the switches back on.

Expected results:
Pacemaker should recover all resources.

Actual result:
RabbitMQ and Galera MySQL are down.
Failed actions:
    p_mysql_start_0 on sm4.domain.tld 'unknown error' (1): call=764, status=Timed Out, last-rc-change='Tue May 3 19:04:37 2016', queued=0ms, exec=300003ms

PCSD Status:
  192.168.199.3: Offline
  192.168.199.5: Offline
  192.168.199.6: Offline
What does "PCSD Status: Offline" mean? We found nothing about it on Google.
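
The failed action line above reports status=Timed Out with exec=300003ms, meaning the p_mysql start operation ran for about 300 seconds before it was abandoned. That is consistent with a start timeout of roughly 300 seconds, although the configured timeout is not stated in this report, so treat that as an inference. A minimal sketch of pulling those fields out of such a line (the function name is hypothetical, and the regular expressions only assume the layout seen in the line quoted here):

```python
import re

# Failed-action line copied verbatim from the report above.
LINE = ("p_mysql_start_0 on sm4.domain.tld 'unknown error' (1): call=764, "
        "status=Timed Out, last-rc-change='Tue May 3 19:04:37 2016', "
        "queued=0ms, exec=300003ms")


def parse_failed_action(line):
    """Return operation, node, status and exec time (in seconds) of a failed action."""
    head = re.match(r"\s*(\S+) on (\S+)", line)
    status = re.search(r"status=([^,]+)", line)
    exec_ms = re.search(r"exec=(\d+)ms", line)
    if not (head and status and exec_ms):
        raise ValueError("unrecognised failed-action line")
    return {
        "operation": head.group(1),
        "node": head.group(2),
        "status": status.group(1).strip(),
        "exec_seconds": int(exec_ms.group(1)) / 1000,
    }


info = parse_failed_action(LINE)
print(info)
```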

Workaround:
 Steps we tried in order to recover:
pcs resource debug-start clone_p_rabbit-server
RabbitMQ started on the first node; on the other nodes we ran stop_app, join_cluster and start_app.
RabbitMQ was restored OK.

Galera does not start up:
pcs resource debug-start clone_p_mysql
*** Error in `/usr/sbin/crm_resource': free(): invalid pointer: 0x000000000087a5b0 ***
resource clone_p_mysql is NOT running
resource clone_p_mysql is NOT running
resource clone_p_mysql is NOT running
   error: resources_action_create: A service action must have a valid standard.

We looked for the node with the newest WSREP state.
We found one; the others reported -1.
We tried to start that one with pcs cleanup and debug-start.
Galera is still down.
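
The "newest WSREP state" check described above can be sketched as follows. This assumes the state was read from each node's grastate.dat file, which stores a `seqno:` field (-1 after an unclean shutdown); the report does not say how the state was actually obtained. The node names, UUID and seqno values below are made-up sample data, and the function names are hypothetical.

```python
import re


def parse_seqno(grastate_text):
    """Extract the seqno field from the text of a grastate.dat file."""
    match = re.search(r"^seqno:\s*(-?\d+)\s*$", grastate_text, re.MULTILINE)
    if match is None:
        raise ValueError("no seqno field found")
    return int(match.group(1))


def pick_bootstrap_node(states):
    """Return the node with the highest seqno, or None if every node reports -1."""
    seqnos = {node: parse_seqno(text) for node, text in states.items()}
    best = max(seqnos, key=seqnos.get)
    return best if seqnos[best] >= 0 else None


# Hypothetical sample data: one node kept its position, the others report -1.
TEMPLATE = ("# GALERA saved state\n"
            "version: 2.1\n"
            "uuid:    11111111-2222-3333-4444-555555555555\n"
            "seqno:   {seqno}\n")
states = {
    "node-1": TEMPLATE.format(seqno=-1),
    "node-2": TEMPLATE.format(seqno=1234),
    "node-3": TEMPLATE.format(seqno=-1),
}

print(pick_bootstrap_node(states))
```

If every node reports -1, the sketch returns None rather than guessing; in that case the position may still be recoverable on each node with `mysqld --wsrep-recover`.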

Is this a bug, or did we just miss something?

Tags: area-library
description: updated
Oleksiy Molchanov (omolchanov) wrote :

Please provide diagnostic snapshot.

tags: added: area-libr
tags: added: area-library
removed: area-libr
Dmitry Pyzhov (dpyzhov)
no longer affects: fuel/newton
Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → Bogdan Dobrelya (bogdando)
lieni (oliver-lienhard-b) wrote :

Hi
Sorry, we don't have a diagnostic snapshot; the deployment was already deleted.
We will try to reproduce it in 1-2 weeks.

Dmitry Pyzhov (dpyzhov) wrote :

This bug has been in the Incomplete state for a month. Closing as Invalid. Please reopen if you have more data.

Changed in fuel:
status: Incomplete → Invalid