- PXC#456: WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BACK with
LOAD DATA INFILE
Issue:
-----
LDI (or, for that matter, any DML statement) can fail for multiple reasons.
Some probable reasons are:
- Creating a table without a primary key and setting wsrep_certify_nonPK = off
- An existing bug that causes LDI on a partitioned table to fail.
... etc.
A statement failure skips append_key, which, besides appending the key,
also sets a valid trx_id.
Such failed statements are therefore rolled back with trx_id = default.
The Galera plugin checks whether a Trx object already exists for the
given trx_id before creating a new one.
If there are 2 independent connections (connected to the same cluster node)
and both of these connections execute a failing statement, then
both of them will try to roll back with trx_id = default.
The logic that caches the trx_id -> Trx object mapping never considered
this situation, so one of these connections gets a reference to an object
that belongs to the other connection. This is logically wrong, as the two
connections are unrelated.
It also causes operational inconsistency, as the latter connection accesses
state already modified by the former connection
(causing the famous ROLLED_BACK -> ROLLED_BACK assert).
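
The collision described above can be modelled in a few lines. This is a minimal sketch, not the Galera code: `TrxObject`, `TrxMap`, `get_or_create`, and the sentinel value `DEFAULT_TRX_ID` are hypothetical names chosen for illustration, and the state machine is reduced to a single transition check.

```python
DEFAULT_TRX_ID = -1  # assumed sentinel for "no valid trx_id was set"


class TrxObject:
    """Toy transaction handle with a minimal state machine."""

    def __init__(self, trx_id):
        self.trx_id = trx_id
        self.state = "EXECUTING"

    def rollback(self):
        # Rolling back an already rolled-back object is not a valid transition.
        if self.state == "ROLLED_BACK":
            raise RuntimeError("no such a transition ROLLED_BACK -> ROLLED_BACK")
        self.state = "ROLLED_BACK"


class TrxMap:
    """Cache keyed by trx_id only, mirroring the problematic lookup."""

    def __init__(self):
        self._by_trx_id = {}

    def get_or_create(self, trx_id):
        # Reuse the object cached for this trx_id, or create a new one.
        if trx_id not in self._by_trx_id:
            self._by_trx_id[trx_id] = TrxObject(trx_id)
        return self._by_trx_id[trx_id]


trx_map = TrxMap()

# Two unrelated connections, each with a failed statement (trx_id = default).
trx_a = trx_map.get_or_create(DEFAULT_TRX_ID)
trx_b = trx_map.get_or_create(DEFAULT_TRX_ID)
shared = trx_a is trx_b  # both connections received the same object

trx_a.rollback()  # first connection rolls back
try:
    trx_b.rollback()  # second connection hits the invalid transition
    error = None
except RuntimeError as exc:
    error = str(exc)
```

In this model the second rollback fails only because both lookups resolve to the same cached object, which is the sharing the issue text describes.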
Solution(s):
-----------
(Listing all possible solutions, including the one we have selected.)
* trx-map should use the pair <trx_id, conn_id> as the map key.
* trx-map should use a multi-map with trx_id -> TrxObject.
TrxObject can then carry a valid conn_id (vs -1 for now).
For a valid trx_id there is only 1 trx_id -> TrxObject pair;
for the default there could be multiple trx_id -> TrxObject pairs,
so the proper pair is selected based on conn_id.
[Both of the above approaches need an interface change, so they are ruled out for now.]
* Re-arrange the logic to discard the trx object while holding the lock on
the trx, so that the latter connection still gets a reference to the object
but cannot operate on it until the former one is done.
(Logically, the 2 connections would still share the object, which is itself
wrong; and although this could be made to work with some tweaks in the code,
it would introduce additional flow control, as it involves exception handling.)
* Introduce a separate map that caches pthread_id -> TrxObject when
trx_id = default.
(Given the limited changes involved, we opted for this solution, though
we would love to sort this out with upstream using one of the
interface-change solutions mentioned above.)
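
The selected approach can be sketched roughly as follows. This is an illustration under stated assumptions rather than the actual patch: `TrxMap`, `TrxObject`, `get_or_create`, and `DEFAULT_TRX_ID` are hypothetical names, and Python's `threading.get_ident()` stands in for the pthread id mentioned above.

```python
import threading

DEFAULT_TRX_ID = -1  # assumed sentinel for "no valid trx_id was set"


class TrxObject:
    """Toy transaction handle with a minimal state machine."""

    def __init__(self, trx_id):
        self.trx_id = trx_id
        self.state = "EXECUTING"

    def rollback(self):
        if self.state == "ROLLED_BACK":
            raise RuntimeError("no such a transition ROLLED_BACK -> ROLLED_BACK")
        self.state = "ROLLED_BACK"


class TrxMap:
    """Valid trx_ids use the main map; default trx_ids use a per-thread map."""

    def __init__(self):
        self._lock = threading.Lock()
        self._by_trx_id = {}  # trx_id -> TrxObject
        self._by_thread = {}  # thread id -> TrxObject, only for trx_id = default

    def get_or_create(self, trx_id):
        if trx_id == DEFAULT_TRX_ID:
            table, key = self._by_thread, threading.get_ident()
        else:
            table, key = self._by_trx_id, trx_id
        with self._lock:
            if key not in table:
                table[key] = TrxObject(trx_id)
            return table[key]


trx_map = TrxMap()
results = []
barrier = threading.Barrier(2)


def failing_connection():
    # Each connection runs on its own thread and rolls back with the default id.
    trx = trx_map.get_or_create(DEFAULT_TRX_ID)
    trx.rollback()
    results.append(trx)
    # Keep both threads alive together so their thread ids cannot be reused.
    barrier.wait(timeout=5)


threads = [threading.Thread(target=failing_connection) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

distinct = len(results) == 2 and results[0] is not results[1]
same_for_valid_id = trx_map.get_or_create(42) is trx_map.get_or_create(42)
```

Each thread now resolves the default trx_id to its own object, so neither rollback observes state changed by the other, while lookups with a valid trx_id behave as before.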
commit 794f3cddb0c194767a760dc51b30b00ab94c55ac
Merge: 304970e be2dd53
Author: Krunal Bauskar <email address hidden>
Date: Mon Nov 16 19:49:55 2015 +0530
Merge pull request #33 from kbauskar/3.x-pxc-456
- PXC#456: WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BAC…
commit be2dd5305479d621c94ba26992610efd84ca9752
Author: Krunal Bauskar <email address hidden>
Date: Mon Nov 16 10:58:42 2015 +0530
- PXC#456: WSREP: FSM: no such a transition ROLLED_BACK -> ROLLED_BACK with
LOAD DATA INFILE