Assertion in MVV.prune()
Bug #1032846 reported by Peter Beaman
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Akiban Persistit | Fix Released | Critical | Peter Beaman | |
Bug Description
Encountered while running TransactionTest
java.lang.
at com.persistit.
at com.persistit.
at com.persistit.
at com.persistit.
at com.persistit.
at com.persistit.
at com.persistit.
at java.lang.
Related branches
lp:~pbeaman/akiban-persistit/fix-1032846-assertion-in-MVV-prune
- Nathan Williams: Approve
- Diff: 52 lines (+21/-4), 1 file modified: src/main/java/com/persistit/TransactionIndex.java (+21/-4)
Changed in akiban-persistit:
assignee: nobody → Peter Beaman (pbeaman)
Changed in akiban-persistit:
importance: High → Critical
milestone: none → 3.1.8
Changed in akiban-persistit:
status: Confirmed → Fix Committed
Changed in akiban-persistit:
status: Fix Committed → Fix Released
Found.
This assert is one manifestation of a pair of phenomena we call "isolation protocol failure". In our MVCC implementation it is critical for thread A to be able to correctly and atomically read the status of a transaction being executed by thread B. TransactionIndex and TransactionIndexBucket are carefully written to ensure correct values are read, while at the same time avoiding locks as often as possible.
Unfortunately, two statements in TransactionIndex read object fields outside of locks, and those fields could change between reads in a way that permitted false readings. The consequences were rare but serious. Specifically:
- TransactionTest2, which runs multiple threads in tight loops performing contentious transactions, sometimes exhibited a failure in which the sum of all "accounts" was non-zero by the end of the test.
- The assertion documented in this bug report fired, indicating that an MVV had two versions from concurrently executed transactions. Such versions should have been prevented by the wwDependency logic.
The race condition that permitted these errors is very narrow: the commit timestamp of a TransactionStatus object had to change from one state to another during the execution of a conditional statement that read the commit timestamp twice outside of a lock. We believe the error is extremely unlikely unless a very large volume of concurrent transactions is being performed.
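The window described above can be illustrated with a minimal sketch. This is not the actual TransactionIndex code: the class name, the `Visibility` enum, the `changing` helper, and the timestamp encoding (a maximum-value sentinel for "uncommitted", a negative value for "commit in progress") are all assumptions made for illustration. A `LongSupplier` stands in for a field that another thread modifies between two unlocked reads, so the bad interleaving is reproduced deterministically.

```java
import java.util.function.LongSupplier;

/** Illustrative only; not taken from Persistit. */
final class CommitTimestampRace {
    // Hypothetical encoding assumed for this sketch:
    //   UNCOMMITTED -> transaction still running
    //   negative    -> commit in progress, final timestamp not yet assigned
    //   positive    -> committed at that timestamp
    static final long UNCOMMITTED = Long.MAX_VALUE;

    enum Visibility { VISIBLE, INVISIBLE, MUST_WAIT }

    // Racy form: the conditional reads the commit timestamp twice, so the
    // two tests may be evaluated against two different states.
    static Visibility racy(LongSupplier tc, long snapshotTs) {
        if (tc.getAsLong() < 0) {
            return Visibility.MUST_WAIT;
        }
        return tc.getAsLong() <= snapshotTs ? Visibility.VISIBLE : Visibility.INVISIBLE;
    }

    // Safer form: read once into a local, then decide from that one value.
    static Visibility safe(LongSupplier tc, long snapshotTs) {
        final long t = tc.getAsLong();
        if (t < 0) {
            return Visibility.MUST_WAIT;
        }
        return t <= snapshotTs ? Visibility.VISIBLE : Visibility.INVISIBLE;
    }

    // Simulates a field whose value changes after the first read,
    // as if another thread had started committing in between.
    static LongSupplier changing(long first, long then) {
        final long[] values = { first, then };
        final int[] reads = { 0 };
        return () -> values[Math.min(reads[0]++, 1)];
    }
}
```

With a timestamp that moves from `UNCOMMITTED` to `-7` between reads and a snapshot timestamp of `10`, `racy` reports `VISIBLE`, an answer that neither state justifies on its own, while `safe` reports `INVISIBLE`, which matches the state it actually observed. In real concurrent code the field would also need to be `volatile` or otherwise safely published. The bug report does not say whether the committed fix took this form.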