It's easy to create inconsistencies as a result of a user error when bootstrapping nodes

Bug #1330953 reported by Kenny Gryp on 2014-06-17
Affects: MySQL patches by Codership (status tracked in 5.6)
  5.5: importance Undecided, Unassigned
  5.6: importance Undecided, Unassigned
Affects: Percona XtraDB Cluster (moved to https://jira.percona.com/projects/PXC; status tracked in 5.6)
  5.5: status Won't Fix, importance Medium, Unassigned
  5.6: status Confirmed, importance Medium, Unassigned

Bug Description

5.6.15-56 Percona XtraDB Cluster (GPL), Release 25.5, Revision 759, wsrep_25.5.r4061

We have a 3 node cluster to start with:

node1 mysql> show global status like 'wsrep_cluster_state%';
+--------------------------+--------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------+
| wsrep_cluster_state_uuid | 62eb8c72-f601-11e3-b42c-ab6847529d86 |
+--------------------------+--------------------------------------+
1 row in set (0.00 sec)

node2 mysql> show global status like 'wsrep_cluster_state%';
+--------------------------+--------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------+
| wsrep_cluster_state_uuid | 62eb8c72-f601-11e3-b42c-ab6847529d86 |
+--------------------------+--------------------------------------+
1 row in set (0.00 sec)

node3 mysql> show global status like 'wsrep_cluster_state%';
+--------------------------+--------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------+
| wsrep_cluster_state_uuid | 62eb8c72-f601-11e3-b42c-ab6847529d86 |
+--------------------------+--------------------------------------+
1 row in set (0.00 sec)

node1 mysql> use test;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
node1 mysql> create table inconsistencytest (text char(32));
Query OK, 0 rows affected (0.01 sec)

node2 mysql> select * from inconsistencytest;
+--------------------------+
| text |
+--------------------------+
| everybody is still happy |
+--------------------------+
1 row in set (0.00 sec)

node1# service mysql stop (clean shutdown!)

node2 and node3 are still primary.

node1# service mysql bootstrap-pxc

node1 becomes primary in its own cluster. Of course, this is something we should not do.
In my tests, both environments still have the same UUID.

node2 and node3 get some blacklist messages:

2014-06-17 11:24:34 3860 [Note] WSREP: (b0143193-f602-11e3-bef6-6f2df541c98a, 'tcp://0.0.0.0:4567') address 'tcp://192.168.70.3:4567' pointing to uuid b0143193-f602-11e3-bef6-6f2df541c98a is blacklisted, skipping
2014-06-17 11:24:34 3872 [Note] WSREP: (c05e9a56-f602-11e3-afad-72eee1ca6827, 'tcp://0.0.0.0:4567') address 'tcp://192.168.70.4:4567' pointing to uuid c05e9a56-f602-11e3-afad-72eee1ca6827 is blacklisted, skipping

Now do something like this:

node3 mysql> insert into inconsistencytest values ("this write came from the original cluster, node3 to be exact");
Query OK, 1 row affected, 1 warning (0.00 sec)
node3 mysql> insert into inconsistencytest values ("another write from node3");
Query OK, 1 row affected (0.00 sec)

node1 mysql> insert into inconsistencytest values ("this write came from node1 which was a cluster on it's own");
Query OK, 1 row affected, 1 warning (0.03 sec)

OK, so both are primary and have the same UUID, but different data.
Nothing special so far; this is a user error.

Let's shut down node1:

node1# service mysql stop

Now let's start node1 again....

[root@node1 ~]# service mysql start
Starting MySQL (Percona XtraDB Cluster)..... SUCCESS!

(IST was performed)

node1 mysql> select * from inconsistencytest;
+----------------------------------+
| text |
+----------------------------------+
| everybody is still happy |
| this write came from node1 which |
| another write from node3 |
+----------------------------------+
3 rows in set (0.00 sec)

node2 mysql> select * from inconsistencytest;
+----------------------------------+
| text |
+----------------------------------+
| everybody is still happy |
| this write came from the origina |
| another write from node3 |
+----------------------------------+
3 rows in set (0.00 sec)

node3 mysql> select * from inconsistencytest;
+----------------------------------+
| text |
+----------------------------------+
| everybody is still happy |
| this write came from the origina |
| another write from node3 |
+----------------------------------+
3 rows in set (0.00 sec)

So we have now created an inconsistency, and PXC did not notice anything. It just assumed node1 was part of the same cluster.
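The divergence shown above can be made visible by comparing a digest of the table contents per node. The following is only a minimal sketch: the rows are copied from the result sets above (already truncated to 32 characters by the column definition), and `table_digest` and `group_by_content` are hypothetical helper names, not part of PXC or Galera.

```python
import hashlib
from collections import defaultdict

def table_digest(rows):
    """Order-independent digest of a table's rows (illustrative helper)."""
    h = hashlib.sha256()
    for row in sorted(rows):
        # A NUL separator keeps adjacent rows from running together.
        h.update(row.encode("utf-8") + b"\x00")
    return h.hexdigest()

def group_by_content(tables):
    """Group node names by identical table content."""
    groups = defaultdict(list)
    for node, rows in tables.items():
        groups[table_digest(rows)].append(node)
    return sorted(groups.values())

# Rows as returned by the SELECT statements in this report.
original = [
    "everybody is still happy",
    "this write came from the origina",
    "another write from node3",
]
tables = {
    "node1": [
        "everybody is still happy",
        "this write came from node1 which",
        "another write from node3",
    ],
    "node2": list(original),
    "node3": list(original),
}

groups = group_by_content(tables)
print(groups)
```

More than one group means the nodes disagree on the table contents even though they report the same wsrep_cluster_state_uuid.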

So a fix would be to generate a new UUID at bootstrap, but according to Teemu & Jay, this breaks the 'all down, all up' use cases.

Alex Yurchenko (ayurchen) wrote :

If I may, this seems to be a duplicate of https://bugs.launchpad.net/galera/+bug/843752

It indeed looks like a duplicate of lp:843752.

Now, the following things have happened here:

a) Node2 and node3 have the right state and received the updates of the PC
(considering the larger component as the 'right' PC).

b) Node1 has correctly received the last update.

c) Node1 has the second row from its 'own' PC rather than the one that came
from the other nodes.

Only c) looks incorrect. When a configuration change occurs and a state
transfer is required, only the sequence numbers are compared, and it is assumed
that when the UUIDs are the same, the sequence numbers are consistent.
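The comparison described above can be illustrated with a small model. This is a simplified sketch and not Galera's actual code: the `Node` class, the `rejoin` function, and the one-write-per-seqno numbering are assumptions made for illustration, and the write labels are shortened.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Node:
    name: str
    state_uuid: str
    # Illustrative write log: index i holds the write with seqno i + 1.
    log: List[str] = field(default_factory=list)

    @property
    def seqno(self) -> int:
        return len(self.log)

def rejoin(joiner: Node, donor: Node) -> str:
    """Simplified join decision: looks at state UUID and seqno only."""
    if joiner.state_uuid != donor.state_uuid:
        # Different history: copy the full state (models SST).
        joiner.log = list(donor.log)
        return "SST"
    # Same UUID: send only the writes after the joiner's seqno (models IST).
    # The content already stored at lower seqnos is never compared.
    joiner.log.extend(donor.log[joiner.seqno:])
    return "IST"

UUID = "62eb8c72-f601-11e3-b42c-ab6847529d86"
shared = ["everybody is still happy"]

# node1 was bootstrapped on its own and accepted one write.
node1 = Node("node1", UUID, shared + ["write from node1"])
# node3 stayed in the original cluster and accepted two writes.
node3 = Node("node3", UUID, shared + ["first write from node3",
                                      "another write from node3"])

method = rejoin(node1, node3)
print(method, node1.log)
```

In this model node1 keeps its own second write and receives only the third one, which matches the rows seen on node1 in the report.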

While changing the UUID on bootstrap can break things, if Galera intends to
support multiple primary components in the future, reconciliation between
multiple PCs is something that will need to be added, for example a PC-id (with
the last PC-id stored). This way, further reconciliation strategies can be devised.
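One way to picture the PC-id idea is sketched below. Everything here is hypothetical: Galera has no `pc_history` field, and `NodeState`, `bootstrap`, and `can_rejoin` are illustrative names. The assumption is that each bootstrap keeps the state UUID but records a new PC-id together with the seqno at which that PC started.

```python
import uuid
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class NodeState:
    state_uuid: str
    seqno: int
    # Hypothetical: every PC this node has been part of, oldest first,
    # stored as (pc_id, seqno at which that PC started).
    pc_history: List[Tuple[str, int]] = field(default_factory=list)

    @property
    def pc_id(self) -> str:
        return self.pc_history[-1][0]

def bootstrap(node: NodeState) -> None:
    """Form a new PC: keep the state UUID, record a fresh PC-id."""
    node.pc_history.append((str(uuid.uuid4()), node.seqno))

def can_rejoin(joiner: NodeState, donor: NodeState) -> bool:
    """Allow an incremental join only if the joiner's PC is in the donor's lineage."""
    if joiner.state_uuid != donor.state_uuid:
        return False
    ids = [pc for pc, _ in donor.pc_history]
    if joiner.pc_id not in ids:
        return False  # the joiner lived in a PC the donor never saw
    i = ids.index(joiner.pc_id)
    # The joiner must not have advanced past the point where the donor's
    # next PC began (or past the donor itself if it is the same PC).
    limit = donor.pc_history[i + 1][1] if i + 1 < len(ids) else donor.seqno
    return joiner.seqno <= limit

UUID = "62eb8c72-f601-11e3-b42c-ab6847529d86"

# Case 1: the scenario from this report.
node3 = NodeState(UUID, seqno=3, pc_history=[("pc-0", 0)])
node1 = NodeState(UUID, seqno=1, pc_history=[("pc-0", 0)])
bootstrap(node1)   # node1 forms its own PC while node2/node3 are still primary
node1.seqno += 1   # the write accepted by node1 alone
diverged_ok = can_rejoin(node1, node3)

# Case 2: 'all down, all up'. All nodes stopped at the same seqno.
first = NodeState(UUID, seqno=5, pc_history=[("pc-0", 0)])
other = NodeState(UUID, seqno=5, pc_history=[("pc-0", 0)])
bootstrap(first)   # the first node bootstraps the cluster again
all_up_ok = can_rejoin(other, first)

print(diverged_ok, all_up_ok)  # prints: False True
```

Under these assumptions the diverged node1 is rejected, while the 'all down, all up' restart still works because the state UUID itself never changes.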

Verified the same behaviour with PXC 5.5 and 5.6 using the steps above. The end result is a data inconsistency between node1 and node2/node3.

mysql> show global status like 'wsrep_cluster_state%';
+--------------------------+--------------------------------------+
| Variable_name | Value |
+--------------------------+--------------------------------------+
| wsrep_cluster_state_uuid | 3671fad1-123e-11e4-8ee5-7febff796263 |
+--------------------------+--------------------------------------+
1 row in set (0.00 sec)

On node1:

mysql> select * from inconsistencytest;
+----------------------------------+
| text |
+----------------------------------+
| everybody is still happy |
| this write came from node1 which |
| another write from node3 |
+----------------------------------+
3 rows in set (0.01 sec)

On node2 and node3:
mysql> select * from inconsistencytest;
+----------------------------------+
| text |
+----------------------------------+
| everybody is still happy |
| this write came from the origina |
| another write from node3 |
+----------------------------------+
3 rows in set (0.00 sec)
