Drizzle

possible sync issue using commit_id and backup to bring up a slave

Bug #766296 reported by Joe Daly on 2011-04-19

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
Drizzle	Confirmed	Medium	Unassigned	Drizzle 2011-06-06

Bug Description

This came from a discussion with knielsen.

This gets back to the prepare_commit_mutex, which does not exist in the InnoDB replication log. The basics of the bug: depending on where the commit_id is assigned, it is possible for two unrelated transactions T1 and T2 to be written to disk in a different order from their commit_ids (T2, then T1). If you took a backup at precisely the moment T2 had been written to disk but T1 had not, and provisioned a slave with that backup, the slave applier would assume that the transaction to start at is (T2 + 1). This would leave T1 unapplied on the slave.
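
The scenario can be illustrated with a small toy model. This is only a sketch under stated assumptions: the log is modelled as a Python list of commit_ids in on-disk order, and the function names (`snapshot`, `naive_start_id`, `missing_on_slave`) are hypothetical and do not correspond to anything in the Drizzle source.

```python
# Toy model of the race described above. None of these names exist in
# Drizzle; the on-disk log is just a list of commit_ids in write order.

def snapshot(log, upto):
    """Return the set of commit_ids found in the first `upto` log entries."""
    return set(log[:upto])

def naive_start_id(backup_ids):
    """Applier assumption: resume from (highest commit_id in backup) + 1."""
    return max(backup_ids) + 1

def missing_on_slave(master_ids, backup_ids, start_id):
    """commit_ids the slave never receives: absent from the backup and
    below the point where replication resumes."""
    return sorted(i for i in master_ids
                  if i not in backup_ids and i < start_id)

# T1 is assigned commit_id 101 and T2 is assigned 102,
# but T2 happens to reach disk before T1.
disk_order = [100, 102, 101]

backup = snapshot(disk_order, 2)   # backup taken after T2, before T1
start = naive_start_id(backup)     # slave resumes at T2 + 1
lost = missing_on_slave(disk_order, backup, start)
```

In this model `lost` contains commit_id 101: T1 is neither in the backup nor in the range the slave will request.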

Here's some IRC discussion in case my description is confusing:

Apr 19 09:49:47 --> Ori (~Ori@75-149-135-141-Connecticut.hfc.comcastbusiness.net) has joined #drizzle
Apr 19 09:51:02 <knielsen> Shrews: yes, I noticed that in the code. What I didn't understand was if you ensure that this number is allocated in the same sequence as commits are written into the innodb transaction log
Apr 19 09:51:43 <knielsen> Shrews: from what you say, my guess is that you do not ensure this. Which can be the right solution depending on what you want, just trying to figure out how it works
Apr 19 09:52:01 --> HarrisonF (~hfisk@74.74.88.21) has joined #drizzle
Apr 19 09:52:01 <-- HarrisonF has quit (Changing host)
Apr 19 09:52:01 --> HarrisonF (~hfisk@mysql/training/HarrisonF) has joined #drizzle
Apr 19 09:52:13 <knielsen> there are a number of advantages to relaxed ordering (between innodb transaction log and commit_id number)
Apr 19 09:52:16 <jdaly> knielsen: it would be possible that two unrelated commits could not be ordered, but related would be held up at a higher level before assignment of the commit_id
Apr 19 09:53:07 <knielsen> there are some disadvantates as well, of course :-) main ones I can think of is applications seeing state on slave that never existed on master, and xtrabackup taking a server snapshot that does not correspond to any commit_id number
Apr 19 09:53:23 <knielsen> jdaly: yes, I agree
Apr 19 09:53:58 <jdaly> why would xtrabackup take a backup that doesnt correspond to a commit_id number. Im naive in that area
Apr 19 09:54:00 <knielsen> jdaly: if two commits depend on one another, then the second cannot start commit until the first is done and releases row locks
Apr 19 09:54:19 <knielsen> jdaly: the issue is the following
Apr 19 09:55:07 <knielsen> jdaly: xtrabackup basically copies the innodb transaction log up to a certain point X, and the resulting backup gives a snapshot of the server at that point
Apr 19 09:55:41 <knielsen> jdaly: now suppose that we commit independent transactions T1 and T2. T1 is assigned commit_id 101 and T2 commit_id 102
Apr 19 09:56:03 <knielsen> jdaly: then thread scheduling just happens to work so that T2 is written into the innodb transaction log before T1.
Apr 19 09:56:26 <knielsen> jdaly: now if we take an xtrabackup at that exact moment, we may get a snapshot that has T2, but not T1
Apr 19 09:56:34 <knielsen> ... which is fine, as T1 and T2 are independent
Apr 19 09:56:51 <knielsen> jdaly: but now suppose we use this backup to provision a new slave
Apr 19 09:57:23 <jdaly> ok Im seeing what your talking about
Apr 19 09:57:35 <knielsen> jdaly: then we have the problem that we don't know which commit_id to start replication from? If we take 101, then we will duplicate T2. If we take 102 then we will be missing T1
Apr 19 09:58:01 <knielsen> jdaly: this problem is the sole reason (AFAIK) that innodb did the prepare_commit_mutex in MySQL, which killed group commit for >5 years :-(
Apr 19 09:59:44 <knielsen> jdaly: on the other hand, it is nice not to take an expensive lock and impose ordering for _every_ commit just for an issue that only occurs for one millisecond of the daily backup
Apr 19 10:00:01 <knielsen> (eg. one could maybe impose ordering only at that split second during backup)
Apr 19 10:02:34 <jdaly> knielsen: thanks for the details, Ill look at what would be involved in forcing order when doing a backup. The max commit_id is stored in the innospace as well, it may be possible to use that somewhere although off the top Im not sure how
Apr 19 10:02:35 <-- shinguz (~oli@67-28.3-85.cust.bluewin.ch) has left #drizzle
Apr 19 10:03:21 <LinuxJedi> surely if you are taking an xtrabackup of the slave then it would require a little logic but we could do a comparison of the replication log between the master and slave for missing transactions (or am I missing something?)
Apr 19 10:03:30 <LinuxJedi> since the replication log will be part of the backup
Apr 19 10:03:47 --> tjoneslo (~<email address hidden>) has joined #drizzle
Apr 19 10:05:47 <jdaly> LinuxJedi: you also could look for a missing commit_id in the replication log
Apr 19 10:05:57 <knielsen> yes, that would be one interesting way of doing it
Apr 19 10:06:06 <LinuxJedi> jdaly: yep :)
Apr 19 10:06:22 <knielsen> you just need some kind of checkpointing so you know how far back you have to look for missing commits
Apr 19 10:07:12 --- hartmut is now known as hartmut|jetlag
Apr 19 10:07:30 * LinuxJedi thinks a bug should probably be filed to document this
Apr 19 10:07:38 <knielsen> When I did the MariaDB group commit, I though a lot about how to do this issue. I ended up (in MariaDB) enforcing same commit order. But I just wondered what Drizzle was doing. I think it's also interesting to relax the order and then just deal with that on the slave or wherever
Apr 19 10:07:44 <LinuxJedi> (document this conversation so we know to fix it I mean)
Apr 19 10:07:59 <knielsen> so thanks for the info :-)
Apr 19 10:08:06 <jdaly> I can file a bug, it will be a couple hours
Apr 19 10:08:21 <LinuxJedi> knielsen: thanks for the feedback :)
Apr 19 10:08:35 * knielsen hopes Drizzle guys don't mind him asking the nasty questions that cause bugs to be filed, seems I've done that a couple of times now :)
Apr 19 10:08:48 <jdaly> knielsen: yes thanks much (again)
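
One direction raised in the discussion is to look for missing commit_ids in the replication log that ships with the backup, bounded by some checkpoint so the search does not have to cover the whole log. A minimal sketch of that idea follows. It is not a proposed patch: `replay_plan` and `checkpoint_id` are hypothetical names, and the sketch assumes that commit_ids are consecutive integers and that every commit_id at or below the checkpoint is known to be in the backup.

```python
# Sketch of gap detection over the commit_ids found in a backup's
# replication log. Assumptions (not taken from Drizzle): commit_ids are
# dense consecutive integers, and everything <= checkpoint_id is present.

def replay_plan(backup_ids, checkpoint_id):
    """Return (gaps, resume_from).

    gaps        -- commit_ids above the checkpoint that are missing from
                   the backup and must be fetched from the master
    resume_from -- first commit_id for normal replication after that
    """
    present = {i for i in backup_ids if i > checkpoint_id}
    highest = max([checkpoint_id, *present])
    gaps = [i for i in range(checkpoint_id + 1, highest)
            if i not in present]
    return gaps, highest + 1

# Backup from the T1/T2 example: 102 is present, 101 is not.
gaps, resume_from = replay_plan({99, 100, 102}, checkpoint_id=100)
```

With the backup from the T1/T2 example, the slave would first request commit_id 101 and then resume normal replication from 103, so T2 is not duplicated and T1 is not lost.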

Joe Daly (skinny.moey) wrote on 2011-04-19:

Left unassigned deliberately, as this needs some bouncing around to come up with a workable solution. The amount of code needed for the fix will probably end up being small.

Changed in drizzle:
status:	New → Confirmed
importance:	Undecided → Medium
milestone:	none → 2011-05-09

Patrick Crews (patrick-crews) on 2011-05-10

Changed in drizzle:
milestone:	2011-05-09 → 2011-05-23

Patrick Crews (patrick-crews) on 2011-05-24

Changed in drizzle:
milestone:	2011-05-23 → 2011-06-06
