wsrep_start_position does not work unless grastate.dat is parseable
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Galera | | Medium | Unassigned | |
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC | Status tracked in 5.6 | | | |
5.5 | Confirmed | Medium | Unassigned | |
5.6 | Fix Released | Medium | Unassigned | |
Bug Description
I would expect --wsrep_
Submitting a ---wsrep_
130201 12:36:21 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130201 12:36:21 mysqld_safe WSREP: Running position recovery with --log_error=
130201 12:36:26 mysqld_safe WSREP: Recovered position 8d211006-
130201 12:36:26 [Note] WSREP: wsrep_start_
130201 12:36:26 [Note] WSREP: wsrep_start_
130201 12:36:26 [Note] WSREP: Read nil XID from storage engines, skipping position init
130201 12:36:26 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/
130201 12:36:26 [Note] WSREP: wsrep_load(): Galera 2.3(r143) by Codership Oy <email address hidden> loaded succesfully.
130201 12:36:26 [Warning] WSREP: Could not open saved state file for reading: /var/lib/
130201 12:36:26 [Note] WSREP: Found saved state: 00000000-
Further, if I create an empty grastate.dat, it also fails:
[root@node3 lib]# ls -lah mysql/grastate.dat
-rw-r--r--. 1 mysql mysql 0 Feb 1 12:42 mysql/grastate.dat
130201 12:43:41 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130201 12:43:41 mysqld_safe WSREP: Running position recovery with --log_error=
130201 12:43:46 mysqld_safe WSREP: Recovered position 8d211006-
130201 12:43:46 [Note] WSREP: wsrep_start_
130201 12:43:46 [Note] WSREP: wsrep_start_
130201 12:43:46 [Note] WSREP: Read nil XID from storage engines, skipping position init
130201 12:43:46 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/
130201 12:43:46 [Note] WSREP: wsrep_load(): Galera 2.3(r143) by Codership Oy <email address hidden> loaded succesfully.
130201 12:43:46 [Note] WSREP: Found saved state: 00000000-
From what I can tell, wsrep_start_
[root@node3 lib]# cat mysql/grastate.dat
# GALERA saved state
version: 2.1
uuid: 8d211006-
seqno: -1
cert_index:
[root@node3 lib]# service mysql start --wsrep_
130201 12:49:09 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
130201 12:49:09 mysqld_safe WSREP: Running position recovery with --log_error=
130201 12:49:14 mysqld_safe WSREP: Recovered position 8d211006-
130201 12:49:14 [Note] WSREP: wsrep_start_
130201 12:49:14 [Note] WSREP: wsrep_start_
130201 12:49:14 [Note] WSREP: Read nil XID from storage engines, skipping position init
130201 12:49:14 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib/
130201 12:49:14 [Note] WSREP: wsrep_load(): Galera 2.3(r143) by Codership Oy <email address hidden> loaded succesfully.
130201 12:49:15 [Note] WSREP: Found saved state: 8d211006-
...
130201 12:49:15 [Note] WSREP: State transfer required:
Group state: 8d211006-
Local state: 8d211006-
In all other cases it does a zero state reset and forces SST. This will lead to unexpected results.
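For anyone trying to reproduce this, it can help to check what a given grastate.dat actually contains before starting the node. Below is a minimal sketch of a parser for the file layout shown in the `cat` output above. It is not Galera's own parser: the function name `parse_grastate` is made up for illustration, and the UUID in the usage note is a placeholder, not the one from this cluster.

```python
import uuid


def parse_grastate(text):
    """Return (uuid.UUID, seqno) from grastate.dat contents, or None if unparseable.

    Hypothetical helper: it only understands the "key: value" layout shown
    in this report and makes no claim to match Galera's real parsing rules.
    """
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and the "# GALERA saved state" header
        key, sep, value = line.partition(":")
        if not sep:
            return None  # a non-comment line without "key: value" shape
        fields[key.strip()] = value.strip()
    try:
        return uuid.UUID(fields["uuid"]), int(fields["seqno"])
    except (KeyError, ValueError):
        return None  # missing field, malformed UUID, or non-integer seqno
```

An empty file, like the zero-byte grastate.dat in the second test above, yields `None` here, while a file with `uuid: 12345678-1234-1234-1234-123456789abc` (placeholder) and `seqno: -1` yields that UUID paired with -1.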
Jay Janssen (jay-janssen) wrote : | #1 |
Jay Janssen (jay-janssen) wrote : | #2 |
To be more concise:
130302 06:52:32 mysqld_safe WSREP: Running position recovery with --log_error=
130302 06:52:37 mysqld_safe WSREP: Recovered position 8797f811-
130302 6:52:37 [Note] WSREP: wsrep_start_
130302 6:52:37 [Warning] WSREP: Could not open saved state file for reading: /var/lib/
130302 6:52:37 [Note] WSREP: Found saved state: 00000000-
130302 6:52:37 [Note] WSREP: Setting initial position to 00000000-
Shouldn't the wsrep_start_
Fixing this would probably solve: https:/
Alex Yurchenko (ayurchen) wrote : | #3 |
Jay,
traditionally, yes, a command line parameter should take precedence over any defaults, configs, etc.
The problem here is that this option is used in _automatic_ (read: unattended) node recovery: to pass a GTID value found via the --wsrep-recover option from the InnoDB tablespace. So it is not always user-supplied, and hence grastate.dat takes precedence, since InnoDB does not store DDL and other non-transactional GTIDs.
And I don't think that lp:1111706 can ever be fixed without a risk of inconsistency.
Jay Janssen (jay-janssen) wrote : Re: [Bug 1112724] wsrep_start_position does not work unless grastate.dat is parseable | #4 |
On Mar 2, 2013, at 11:52 AM, Alex Yurchenko <email address hidden> wrote:
> The problem here is that this option is used in _automatic_ (read:
> unattended) node recovery: to pass a GTID value found via --wsrep-
> recover option from InnoDB table space. So it is not always user-
> supplied, and hence grastate.dat takes precedence, since InnoDB does not
> store DDL and other non-transactional GTIDs.
It's really not clear at what points grastate takes precedence over --wsrep_
AFAIK, there are three (maybe 4) possible grastate.dat states:
1) UUID set, seqno is >= 0, indicating either a clean shutdown, or someone manually tinkering with the file
2) UUID set, seqno is -1: indicating an unclean shutdown/crash
3) UUID zeroed: wsrep abort (?) (like lp:1111706?, and RBR errors?)
4) grastate.dat missing or unparseable: someone trying to build a node from a backup, someone manually tinkering with the file, or something horrible (filesystem corruption)
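To make the cases in this list easier to refer to, here is a rough sketch that sorts a saved state into one of them. The enum and function names are hypothetical, and treating any negative seqno the same as -1 is an assumption of the sketch, not something stated in this thread.

```python
import uuid
from enum import Enum


class GrastateCase(Enum):
    CLEAN = "uuid set, seqno >= 0"
    UNCLEAN = "uuid set, seqno is -1"
    ZEROED = "uuid zeroed"
    MISSING = "file missing or unparseable"


def classify(state):
    """Classify a saved state.

    `state` is a (uuid.UUID, seqno) tuple, or None when grastate.dat is
    missing or could not be parsed.
    """
    if state is None:
        return GrastateCase.MISSING
    state_uuid, seqno = state
    if state_uuid.int == 0:
        return GrastateCase.ZEROED
    if seqno >= 0:
        return GrastateCase.CLEAN
    # Assumption: any negative seqno is handled like -1 (unclean shutdown).
    return GrastateCase.UNCLEAN
```

For example, `classify(None)` gives `GrastateCase.MISSING`, which is the situation in the original report where the file could not be opened.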
AFAICT, --wsrep_
I can accept that #3 would not accept wsrep_start_
However, I think --wsrep_
Jay Janssen, MySQL Consulting Lead, Percona
http://
Percona Live in Santa Clara, CA April 22nd-25th 2013
http://
Alex Yurchenko (ayurchen) wrote : | #5 |
Jay,
Now that you put it this way, I can't find any more excuses except that we have a pile of other issues with higher priorities ATM :)
affects: | codership-mysql → galera |
Changed in galera: | |
importance: | Undecided → Low |
milestone: | none → 3.0beta |
status: | New → Confirmed |
Changed in galera: | |
importance: | Low → Medium |
I was looking at this from POV of xtrabackup and SST, and
===========
st_.get (uuid, seqno);
if (0 != args->state_uuid &&
seqno == WSREP_SEQNO_
{
/* non-trivial recovery information provided on startup, and db is safe
* so use recovered seqno value */
seqno = args->state_seqno;
}
log_debug << "End state: " << uuid << ':' << seqno << " #################";
update_
cc_seqno_ = seqno; // is it needed here?
apply_
if (co_mode_ != CommitOrder:
cert_
=======
It looks like the provided position (with wsrep-start-
is allowed only when UUID matches the one in grastate.dat and
sequence number is -1 in grastate.dat
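The rule as described in this comment can be sketched in a few lines. This models only the reading given here (supplied position accepted when the UUIDs match and the saved seqno is -1), not the actual Galera source, which is truncated above. `choose_start`, `NIL_UUID` and `SEQNO_UNDEFINED` are names invented for the sketch, and UUIDs are plain strings for brevity.

```python
NIL_UUID = "00000000-0000-0000-0000-000000000000"
SEQNO_UNDEFINED = -1  # hypothetical name for the "-1" marker in grastate.dat


def choose_start(saved, supplied):
    """Pick the initial position under the rule described in this comment.

    `saved` is the (uuid, seqno) read from grastate.dat, with the nil UUID
    standing in for a missing or unreadable file. `supplied` is the
    (uuid, seqno) passed on the command line, or None if nothing was passed.
    """
    saved_uuid, saved_seqno = saved
    if supplied is not None:
        supplied_uuid, supplied_seqno = supplied
        if (
            supplied_uuid == saved_uuid
            and supplied_uuid != NIL_UUID
            and saved_seqno == SEQNO_UNDEFINED
        ):
            # Same cluster UUID and no seqno on disk: trust the supplied seqno.
            return (saved_uuid, supplied_seqno)
    # Every other combination falls back to whatever was found on disk.
    return saved
```

Under this reading, a missing grastate.dat shows up as the nil UUID, the supplied position is dropped, and the node starts from the zero state, which matches the "Found saved state: 00000000-" lines in the logs above.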
Now, from Xtrabackup's perspective, --no-lock is only used when DDL and
non-transactional tables are not in effect (at the moment this needs
to be checked manually). So, doesn't that mean if that is taken care of
(automatically since SST runs unattended on donor) then grastate.dat
won't be needed? Regarding DDL, is it not possible for SST code in WSREP
to acquire a shared MDL lock (MDL_SHARED_READ or MDL_SHARED_
Changed in galera: | |
milestone: | 3.0beta → 3.0 |
Changed in galera: | |
milestone: | 3.0-beta → 3.1 |
Changed in galera: | |
milestone: | 25.3.1 → 25.3.2 |
Changed in galera: | |
milestone: | 25.3.2 → 25.3.3 |
Changed in galera: | |
milestone: | 25.3.3 → 25.3.4 |
no longer affects: | galera/2.x |
Changed in galera: | |
milestone: | 25.3.4 → 25.3.5 |
Changed in galera: | |
milestone: | 25.3.5 → 25.3.6 |
Nilnandan Joshi (nilnandan-joshi) wrote : | #7 |
This is still happening.
[root@percona-
150521 14:58:31 mysqld_safe mysqld from pid file /var/lib/
150521 14:59:55 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql
150521 14:59:55 mysqld_safe Skipping wsrep-recover for empty datadir: /var/lib/mysql
150521 14:59:55 mysqld_safe Assigning 00000000-
2015-05-21 14:59:55 0 [Note] WSREP: wsrep_start_
2015-05-21 14:59:55 0 [Note] WSREP: wsrep_start_
2015-05-21 14:59:55 0 [Warning] TIMESTAMP with implicit DEFAULT value is deprecated. Please use --explicit_
2015-05-21 14:59:55 2307 [Warning] You need to use --log-bin to make --log-slave-updates work.
2015-05-21 14:59:55 2307 [Note] WSREP: Read nil XID from storage engines, skipping position init
2015-05-21 14:59:55 2307 [Note] WSREP: wsrep_load(): loading provider library '/usr/lib64/
2015-05-21 14:59:55 2307 [Note] WSREP: wsrep_load(): Galera 3.9(r93aca2d) by Codership Oy <email address hidden> loaded successfully.
2015-05-21 14:59:55 2307 [Note] WSREP: CRC-32C: using "slicing-by-8" algorithm.
2015-05-21 14:59:55 2307 [Warning] WSREP: Could not open saved state file for reading: /var/lib/
2015-05-21 14:59:55 2307 [Note] WSREP: Found saved state: 00000000-
Hrvoje Matijakovic (hrvojem) wrote : | #8 |
Shahriyar Rzayev (rzayev-sehriyar) wrote : | #9 |
Percona now uses JIRA for bug reports so this bug report is migrated to: https:/
Forgot to mention:
Centos 6.3 version | 2.3(r143) |
Server version: 5.5.29 Percona XtraDB Cluster (GPL), wsrep_23.7.1.r3843
| wsrep_provider_