Add optional causality check to autocommit UPDATE/DELETE statements

Bug #1277053 reported by Alex Yurchenko on 2014-02-06
24
This bug affects 3 people
Affects Status Importance Assigned to Milestone
MySQL patches by Codership
Status tracked in 5.6
5.5
Critical
Yan Zhang
5.6
Critical
Yan Zhang
Percona XtraDB Cluster moved to https://jira.percona.com/projects/PXC
Status tracked in 5.6
5.5
Fix Released
Undecided
Unassigned
5.6
Fix Released
Undecided
Unassigned

Bug Description

While not doing causality check on UPDATE / DELETE does not return outdated values, it still can break application logic in that the statement may go ineffective because of the missing row. Note that this won't return a deadlock error (Query OK, 0 rows affected) so the application has no hint to retry the statement.

Related branches

Jay Janssen (jay-janssen) wrote :

I think the current causality logic is more or less m/^select/i ?

I'd argue that this maybe should be extended to any statement starting a transaction (autocommit or not) so the read view is opened after the apply queue is flushed. Not sure how hard that is though...

Jay Janssen (jay-janssen) wrote :

One could even take this a step further and say the above only applies in REPEATABLE-READ (i.e., flush queue before read view is opened). For READ-COMMITTED, should it be before every statement?

Michaël de Groot (z-miceael-6) wrote :

I like the idea, though if in a READ COMMITTED transaction would guarantee causality with every subsequent query it might be worth to mention that performance could be worse with transactions in READ-COMMITTED then with transactions in REPEATABLE-READ

Alex Yurchenko (ayurchen) wrote :

Jay,

1) causal wait is part of trans_begin() method, so you can always establish a proper read view. The fact that trans_begin() is not called on autocommit writes suggests (to me) that there are reasons why autocommit writes are supposed to be treated somewhat differently.

2) we would rather rely on certification to detect a conflict/out-of-date write, rather than on dumb wait. The problem comes out of late INSERTS - if you don't have a row, you can't detect a conflict (we don't even have anything to replicate, since the write was ineffective). As such it touches only UPDATEs and DELETEs. (INSERTs a) can be resolved by certificaiton, b) it is very unlikely use case that you will be inserting the same PK because of stale data (only if you have auto increment control turned off))

- the latter consideration brings us to an interesting point: we need a mask to specify events that we want to do causal wait. Perhaps we should introduce

wsrep_causal_wait = <mask> (1 - SELECT, SHOW, BEGIN, 2 - UPDATE/DELETE, 4 - INSERT)

wsrep_causal_reads=ON would be equivalent to wsrep_causal_wait=1

On Feb 6, 2014, at 8:44 AM, Alex Yurchenko <email address hidden> wrote:

> 1) causal wait is part of trans_begin() method, so you can always
> establish a proper read view. The fact that trans_begin() is not called
> on autocommit writes suggests (to me) that there are reasons why
> autocommit writes are supposed to be treated somewhat differently.

Hum, I thought causal_reads were literally just finding all SELECTs. Are you saying causal_reads are always in effect for explicit transactions? With or without wsrep_causal_reads=1?

>
> 2) we would rather rely on certification to detect a conflict/out-of-
> date write, rather than on dumb wait.

> The problem comes out of late
> INSERTS - if you don't have a row, you can't detect a conflict (we don't
> even have anything to replicate, since the write was ineffective). As
> such it touches only UPDATEs and DELETEs. (INSERTs a) can be resolved by
> certificaiton, b) it is very unlikely use case that you will be
> inserting the same PK because of stale data (only if you have auto
> increment control turned off))
>
> - the latter consideration brings us to an interesting point: we need a
> mask to specify events that we want to do causal wait. Perhaps we should
> introduce
>
> wsrep_causal_wait = <mask> (1 - SELECT, SHOW, BEGIN, 2 - UPDATE/DELETE,
> 4 - INSERT)
>
> wsrep_causal_reads=ON would be equivalent to wsrep_causal_wait=1

So was SHOW and BEGIN (and I assume START TRANSACTION) always part of causal_reads? Why SHOW?

Any command in autocommit=0 will start a transaction I think — would that be covered by wsrep_causal_wait=1? The extra masks are only for autocommit=1?

Jay Janssen
http://about.me/jay.janssen

Alex Yurchenko (ayurchen) wrote :

On 2014-02-06 23:38, Jay Janssen wrote:
> Hum, I thought causal_reads were literally just finding all SELECTs.
> Are you saying causal_reads are always in effect for explicit
> transactions? With or without wsrep_causal_reads=1?

wsrep_causal_reads=1 was always effective for SHOW, autocommit SELECTs,
and beginning of transaction, whether explicit or implicit.

>> wsrep_causal_wait = <mask> (1 - SELECT, SHOW, BEGIN, 2 -
>> UPDATE/DELETE,
>> 4 - INSERT)
>>
>> wsrep_causal_reads=ON would be equivalent to wsrep_causal_wait=1
>
> So was SHOW and BEGIN (and I assume START TRANSACTION) always part of
> causal_reads? Why SHOW?

Yes. SHOW because of status variables. Like wsrep_last_committed.

> Any command in autocommit=0 will start a transaction I think — would
> that be covered by wsrep_causal_wait=1? The extra masks are only for
> autocommit=1?

Yes and yes. Unless I'm missing something about trans_begin() purpose.

> Jay Janssen
> http://about.me/jay.janssen
>
> --
> You received this bug notification because you are subscribed to the
> bug
> report.
> https://bugs.launchpad.net/bugs/1277053
>
> Title:
> Add optional causality check to autocommit UPDATE/DELETE statements
>
> Status in MySQL patches by Codership:
> New
> Status in MySQL patches by Codership 5.5 series:
> New
> Status in MySQL patches by Codership 5.6 series:
> New
>
> Bug description:
> While not doing causality check on UPDATE / DELETE does not return
> outdated values, it still can break application logic in that the
> statement may go ineffective because of the missing row. Note that
> this won't return a deadlock error (Query OK, 0 rows affected) so the
> application has no hint to retry the statement.
>
> To manage notifications about this bug go to:
> https://bugs.launchpad.net/codership-mysql/+bug/1277053/+subscriptions

--
Alexey Yurchenko,
Codership Oy, www.codership.com
Skype: alexey.yurchenko, Phone: +358-400-516-011

> So was SHOW and BEGIN (and I assume START TRANSACTION) always part of causal_reads? Why SHOW?

As https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1271177/comments/5 (second point there).

But as per https://bugs.launchpad.net/percona-xtradb-cluster/+bug/1271177/comments/4

it is broken for 5.6 - ie. in 5.6 you can do SHOW without
blocking even with wsrep_causal_reads=1.

"break application logic in that the statement may go ineffective because of the missing row"

Has this been reproduced? If so, then it is certainly a bug in
certification/galera. This is because any subsequent dependent DML has
to take causality into account by default. wsrep_causal_reads shouldn't
influence this (I mean it should work fine by default).

Yan Zhang (yan.zhang) wrote :

add a session variable 'wsrep_sync_wait' to control causality check. And old session variable 'wsrep_causal_reads' is deprecated but is kept for backward compatiblity.

```
"wsrep_causal_reads", "(DEPRECATED) setting this variable is equivalent to setting wsrep_sync_wait READ flag",
"wsrep_sync_wait", "Ensure \"synchronous\" read view before executing an operation of the type specified by bitmask: 1 - READ(includes SELECT, SHOW and BEGIN/START TRANSACTION); 2 - UPDATE and DELETE; 4 - INSERT and REPLACE"
```

fixed in
http://bazaar.launchpad.net/~codership/codership-mysql/wsrep-5.5/revision/4014
http://bazaar.launchpad.net/~codership/codership-mysql/5.6/revision/4123

Percona now uses JIRA for bug reports so this bug report is migrated to: https://jira.percona.com/browse/PXC-1605

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers