[library] After locking DB access from primary controller cluster is unable to work

Bug #1326829 reported by Egor Kotko on 2014-06-05
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
Fuel for OpenStack | | High | Sergii Golovatiuk |
4.1.x | | High | Sergii Golovatiuk |
5.0.x | | High | Sergii Golovatiuk |

Bug Description

{"build_id": "2014-06-04_09-16-08", "mirantis": "yes", "build_number": "341", "nailgun_sha": "a828d6b7610f872980d5a2113774f1cda6f6810b", "ostf_sha": "c959aa55f83fe2555cf2d382559271c7a9b17467", "fuelmain_sha": "7ed0f85acc0bab4b9157703a618b8cc9fd7de3e1", "astute_sha": "55df06b2e84fa5d71a1cc0e78dbccab5db29d968", "release": "4.1B", "fuellib_sha": "0e96fc5a340cd57f75c454ea8536471379299494"}

Steps to reproduce:
1. Deploy cluster: CentOS, HA, Neutron VLAN, 3 Controllers, 1 Compute
2. On the primary controller, emulate MySQL non-responsiveness by blocking the MySQL ports via iptables
3. Check the status of Galera:
- on the node with blocked ports:
http://paste.openstack.org/show/82966/
- on another node:
http://paste.openstack.org/show/82965/

Expected result:
The cluster continues to work with 2 controllers.

Actual result:
Cluster functionality is inaccessible.

Vladimir Kuklin (vkuklin) wrote :

This bug is going to be targeted at the 5.1 release.

Changed in fuel:
milestone: 4.1.1 → 4.1.2
importance: Undecided → High
assignee: nobody → Fuel Library Team (fuel-library)
milestone: 4.1.2 → 5.1
status: New → Confirmed
no longer affects: fuel/5.1.x
tags: added: ha
tags: removed: nailgun
Changed in fuel:
assignee: Fuel Library Team (fuel-library) → Sergii Golovatiuk (sgolovatiuk)
Sergii Golovatiuk (sgolovatiuk) wrote :

There should be a good script for HAProxy to shut down the backend if "wsrep_ready" = "OFF". This bug will be addressed in blueprint
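
A minimal sketch of what such a check could look like, assuming it is exposed over HTTP (for example through xinetd) and probed by an HAProxy HTTP health check. The function names `wsrep_check_response` and `read_wsrep_ready` are hypothetical, and this is not the script from the blueprint; credentials are assumed to come from the invoking user's MySQL client configuration.

```shell
#!/bin/sh
# Hypothetical helper: map a wsrep_ready value to an HTTP response
# that an HAProxy HTTP health check can interpret.
wsrep_check_response() {
    case "$1" in
        ON)
            printf 'HTTP/1.1 200 OK\r\n\r\nnode is ready\n'
            ;;
        *)
            # OFF, empty, or anything unexpected is treated as "not ready"
            printf 'HTTP/1.1 503 Service Unavailable\r\n\r\nnode is not ready\n'
            ;;
    esac
}

# Hypothetical helper: read wsrep_ready from the local MySQL server.
# -N drops the column header, -B gives tab-separated batch output.
read_wsrep_ready() {
    mysql -N -B -e "SHOW STATUS LIKE 'wsrep_ready'" 2>/dev/null | awk '{print $2}'
}

# Example wiring (commented out so that sourcing this file has no side effects):
# wsrep_check_response "$(read_wsrep_ready)"
```

Treating every value other than "ON" as a failure means that a node whose MySQL server cannot be queried at all is also taken out of rotation.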

Sergii Golovatiuk (sgolovatiuk) wrote :

This test is very synthetic, though I confirm this case is present.

Changed in fuel:
status: Invalid → Confirmed
importance: High → Medium
Egor Kotko (ykotko) wrote :

Have reproduced on:
{"build_id": "2014-07-10_00-39-56", "mirantis": "yes", "build_number": "112", "ostf_sha": "09b6bccf7d476771ac859bb3c76c9ebec9da9e1f", "nailgun_sha": "f5ff82558f99bb6ca7d5e1617eddddf7142fe857", "production": "docker", "api": "1.0", "fuelmain_sha": "293015843304222ead899270449495af91b06aed", "astute_sha": "5df009e8eab611750309a4c5b5c9b0f7b9d85806", "release": "5.0.1", "fuellib_sha": "364dee37435cbdc85d6b814a61f57800b83bf22d"}

Dmitry Ilyin (idv1985) on 2014-07-15
summary:
- After locking DB access from primary controller cluster is unable to work
+ [library] After locking DB access from primary controller cluster is unable to work
Mike Scherbakov (mihgen) on 2014-07-17
tags: added: release-notes
Nastya Urlapova (aurlapova) wrote :

Reproduced on
{
build_id: "2014-07-17_11-18-10",
mirantis: "yes",
build_number: "135",
ostf_sha: "09b6bccf7d476771ac859bb3c76c9ebec9da9e1f",
nailgun_sha: "1d08d6f80b6514085dd8c0af4d437ef5d37e2802",
production: "docker",
api: "1.0",
fuelmain_sha: "c8e13df4c7de3ce3504c2bcb6d51a165b9aae0b6",
astute_sha: "9a74b788be9a7c5682f1c52a892df36e4766ce3f",
release: "5.0.1",
fuellib_sha: "e8c2bb726be6b78c3a34f75c84337a3a5662bb35"
}

It looks like 40-50% of environments are affected by this issue after failover.

Changed in fuel:
importance: Medium → High
Vladimir Kuklin (vkuklin) wrote :

This requires additional HA checks. Maybe it is worth moving to 6.0.

This test should be considered invalid as it stands. If port 3307 is blocked, Galera still gets updates from its neighbors. Even in this case the unix socket remains responsive, and many services will still be able to access MySQL (the clustercheck script is a good example).
In order to perform a proper test:
1. Disable the Galera port in the INPUT/OUTPUT chains of the filter table
iptables -I OUTPUT 1 -p tcp --dport 4567 -j DROP
iptables -I INPUT 1 -p tcp --dport 4567 -j DROP
2. Check if Galera/MySQL is in sync
/usr/local/bin/clustercheck or telnet localhost 49000
3. Try to create a new database in the mysql client. Run the mysql client without any parameters; in that case it connects to the local MySQL server
4. Try to connect to the HAProxy interface
mysql -h192.168.0.1 -P3306 -uUSER -pPASSWORD
5. Unblock port 4567.
See if the created database appeared on the local MySQL server. Try to delete that database.
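
The blocking, checking and unblocking steps above can be sketched as a small set of shell helpers. This is only an illustration under assumptions: the function names and the `DRY_RUN` switch are hypothetical, the real commands need root, and steps 3 and 4 (creating a database and connecting through HAProxy) are left as manual steps.

```shell
#!/bin/sh
# Sketch of steps 1, 2 and 5. Set DRY_RUN=1 to print commands instead of running them.
GALERA_PORT=4567   # Galera replication port named in the steps above

# Hypothetical wrapper: either echo the command or execute it.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Step 1: drop Galera replication traffic in both directions.
block_galera() {
    run iptables -I OUTPUT 1 -p tcp --dport "$GALERA_PORT" -j DROP
    run iptables -I INPUT 1 -p tcp --dport "$GALERA_PORT" -j DROP
}

# Step 2: ask the local clustercheck script whether the node is in sync.
check_sync() {
    run /usr/local/bin/clustercheck
}

# Step 5: delete the rules that block_galera inserted.
# -D removes the rule matching the same specification.
unblock_galera() {
    run iptables -D OUTPUT -p tcp --dport "$GALERA_PORT" -j DROP
    run iptables -D INPUT -p tcp --dport "$GALERA_PORT" -j DROP
}
```

Running the helpers with `DRY_RUN=1` first shows the exact iptables rules before anything is changed on the controller.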

Changed in fuel:
status: Confirmed → Invalid