Comment 6 for bug 1492919

Revision history for this message
Russell Boyatt (russell-boyatt) wrote :

Agree that deadlocks are not necessarily a bad thing and retrying is something to work considering. I've now had chance to look at what's going on and I *think* I understand what's going on now...

MySQL defaults to a transaction isolation level of 'REPEATABLE READ' (the transaction snapshot is stable from the first read) compared to the default in Oracle and PostgreSQL of 'READ COMMITTED'. As a REPEATABLE READ level causes any index lock to be held for the entire transaction then two page/collection copy operations working on the various view tables are going to end up in deadlock.

Some possible approaches:

1) Altering the transaction isolation for page copy operations

Force the transaction isolation level to 'read committed' for page copy operations. This should prevent the deadlock that is appearing in 15.04 (and 1.10, 1.9, 1.8 and likely back further) when run on MySQL. A crude patch to a branch of 15.04 is below. If I test using a JMeter test plan that simulates 30 users initiating page copies - without the patch I can trigger deadlocks, with the patch I've not managed to hit a deadlock.

https://github.com/rboyatt/mahara/tree/bug1492919-approach1

2) Setting transaction isolation application wide

As above but setting the transaction isolation level for MySQL to be the same as when running on PostgreSQL (i.e. read committed'). I'm afraid I don't know Mahara well enough to understand the full implications of this change. Others will be much better placed to advise here!

3) More complex approach is addressing the issue that any of the transactions in the application might still fail due to deadlock. I couldn't find anywhere in the code that attempts transaction retry but perhaps I've missed it? So perhaps longer-term looking at transaction retries and how some of the longer operations in transactions lock the various elements of the DB.