Fuel for OpenStack

Controllers scale up fails due to galera epoch divergence when new controller's id is smaller than old ones

Bug #1398378 reported by Tatyanka on 2014-12-02

Affects		Status	Importance	Assigned to	Milestone
	Fuel for OpenStack	Won't Fix	High	Sergii Golovatiuk	Fuel for OpenStack 5.1.1-updates

Bug Description

Scenario:
            1. Create cluster
            2. Add 1 controller node
            3. Deploy the cluster
            4. Add 2 controller nodes
            5. Deploy changes
            6. Run network verification
            7. Add 2 controller nodes
            8. Deploy changes
            9. Run network verification
            10. Run OSTF

Actual result:
Deployment failed at step 5 (deploy changes after adding 2 controllers) with "Failed to call refresh: execution expired".
http://paste.openstack.org/show/143488/ (see nodes 3 and 2 in the snapshot)

Expected:
The cluster is ready and OSTF passes.

VERSION:
  feature_groups:
    - mirantis
  production: "docker"
  release: "5.1.1"
  api: "1.0"
  build_number: "45"
  build_id: "2014-11-27_23-41-13"
  astute_sha: "ef8aa0fd0e3ce20709612906f1f0551b5682a6ce"
  fuellib_sha: "15a387462f7be50c4f87ad986d0c81535025c125"
  ostf_sha: "64cb59c681658a7a55cc2c09d079072a41beb346"
  nailgun_sha: "500e36d08a45dbb389bf2bd97673d9bff48ee84d"
  fuelmain_sha: "51e66db7750e9c856ba128f35cfb6724895bf479"

Revision history for this message

Tatyanka (tatyana-leontovich) wrote on 2014-12-02:

fail_error_ha_flat_scalability-2014_12_02__05_38_53.tar.gz (9.5 MiB, application/x-tar)

Bogdan Dobrelya (bogdando) on 2014-12-04

Changed in fuel:
status:	New → Confirmed

Vladimir Kuklin (vkuklin) wrote on 2014-12-05:

This issue is caused by GTID divergence: when new nodes are added, one of them may become the primary controller because it has a lower ID than the existing controllers.
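The effect described above can be illustrated with a minimal sketch. It assumes, based only on this comment, that the primary controller is the controller with the lowest node ID; `pick_primary` is a hypothetical helper, not a function from Fuel or nailgun.

```python
def pick_primary(controller_ids):
    """Hypothetical selection rule (assumption): the lowest node ID wins."""
    return min(controller_ids)


# One controller is deployed first, with node ID 5.
existing = [5]
print(pick_primary(existing))           # 5

# Two newly added controllers get lower IDs, so the primary role
# moves to a node that has never been part of the Galera cluster.
print(pick_primary(existing + [3, 4]))  # 3
```

Under this assumption, the newly elected primary has no replication history in common with the running database, which matches the divergence reported here.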

The workaround is pretty simple:

When adding controllers, ensure that the nodes you are adding have higher IDs than your current controllers. If you need to increment a node's ID, delete the node from nailgun and rebootstrap it. Evgeniy Li will comment on the details of how to do this in the next comment.
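A pre-flight check for this workaround could look like the sketch below. It reads the workaround as requiring every new controller to have a higher ID than all existing controllers, which is consistent with the bug title. `unsafe_new_nodes` is a hypothetical helper, and the ID lists are assumed to be collected by hand (for example, from the Fuel node list); nothing here calls a real Fuel API.

```python
def unsafe_new_nodes(existing_ids, new_ids):
    """Return the new node IDs that are not higher than every existing controller ID."""
    highest_existing = max(existing_ids)
    return sorted(node_id for node_id in new_ids if node_id <= highest_existing)


# Hypothetical IDs: one deployed controller (5), two candidates (3 and 6).
to_rebootstrap = unsafe_new_nodes([5], [3, 6])
if to_rebootstrap:
    print("Delete and rebootstrap these nodes before deploying:", to_rebootstrap)
```

Any node the check reports would need to be deleted and rediscovered so that it receives a new, higher ID before it is added as a controller.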

tags:	added: release-notes

Vladimir Kuklin (vkuklin) wrote on 2014-12-05:

Related-bug: https://bugs.launchpad.net/fuel/+bug/1390397

Evgeniy L (rustyrobot) wrote on 2014-12-05:

To delete a node, follow the standard flow [1]: select the node, click the Delete button, and then click the Deploy button. After the node is discovered again, it should get a new, incremented ID.

[1] http://docs.mirantis.com/openstack/fuel/fuel-5.1/operations.html?highlight=delete#remove-a-controller-node

Vladimir Kuklin (vkuklin) wrote on 2014-12-05: Re: Controllers scale up fails due to galera epoch divergence

We will also need to fix the system tests for the 5.1.x branch, as they should add only nodes with bigger IDs to the cluster.

summary:
- [system_tests] Scalability tests failed with Failed to call refresh: execution expired
+ Controllers scale up fails due to galera epoch divergence
summary:
- Controllers scale up fails due to galera epoch divergence
+ Controllers scale up fails due to galera epoch divergence when new controller's id is smaller than old ones

Sergii Golovatiuk (sgolovatiuk) wrote on 2014-12-08:

This issue is no longer reproducible with the new OCF script. The MySQL Galera cluster was assembled properly.

Sergii Golovatiuk (sgolovatiuk) wrote on 2014-12-08:

We need to backport the OCF script from 6.0 to 5.1.

Sergii Golovatiuk (sgolovatiuk) on 2014-12-08

Changed in fuel:
status:	Confirmed → Triaged

Sergii Golovatiuk (sgolovatiuk) on 2015-02-14

Changed in fuel:
status:	Triaged → Won't Fix
