Controllers scale up fails due to galera epoch divergence when new controller's id is smaller than old ones
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Fuel for OpenStack | Won't Fix | High | Sergii Golovatiuk |
Bug Description
Scenario:
1. Create cluster
2. Add 1 controller node
3. Deploy the cluster
4. Add 2 controller nodes
5. Deploy changes
6. Run network verification
7. Add 2 controller nodes
8. Deploy changes
9. Run network verification
10. Run OSTF
Actual result:
Deployment failed at step 5 (deploy changes after adding 2 controllers) with the error "Failed to call refresh: execution expired"
http://
Expected:
Cluster ready, ostf passed
VERSION:
feature_groups:
- mirantis
production: "docker"
release: "5.1.1"
api: "1.0"
build_number: "45"
build_id: "2014-11-
astute_sha: "ef8aa0fd0e3ce2
fuellib_sha: "15a387462f7be5
ostf_sha: "64cb59c681658a
nailgun_sha: "500e36d08a45db
fuelmain_sha: "51e66db7750e9c
Changed in fuel: status: New → Confirmed
Changed in fuel: status: Confirmed → Triaged
Changed in fuel: status: Triaged → Won't Fix
This issue is caused by GTID divergence: when new nodes are added, one of them may become the primary controller because it has a lower ID than the existing controllers.
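To make the failure condition concrete, here is a minimal Python sketch. It assumes, based only on the explanation above, that the controller with the lowest node ID is treated as primary. The function names `pick_primary` and `is_safe_scale_up` are hypothetical and are not part of Fuel, Nailgun, or Astute.

```python
def pick_primary(controller_ids):
    """Assumed rule (from the comment above): lowest node ID becomes primary."""
    return min(controller_ids)


def is_safe_scale_up(existing_ids, new_ids):
    """Return True if adding new_ids leaves the current primary unchanged."""
    before = pick_primary(existing_ids)
    after = pick_primary(list(existing_ids) + list(new_ids))
    return before == after


existing = [5]  # hypothetical ID of the already deployed controller
print(is_safe_scale_up(existing, [6, 7]))  # True: new IDs are higher
print(is_safe_scale_up(existing, [3, 4]))  # False: node 3 would become primary
```

A check like this could be run by hand against the node IDs before pressing "Deploy changes"; it only illustrates the ordering problem and does not inspect Galera state.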
The workaround is pretty simple:
If you are adding controllers, ensure that you are not adding nodes with lower IDs than your current controllers. If you need to increase a node's ID, just delete the node from Nailgun and re-bootstrap it. Evgeniy Li will describe how to do this in the next comment: