Assertion in MVV.prune()

Bug #1032846 reported by Peter Beaman
Affects: Akiban Persistit
Status: Fix Released
Importance: Critical
Assigned to: Peter Beaman
Milestone: 3.1.8

Bug Description

Encountered while running TransactionTest2#transactionsWithInterrupts

java.lang.AssertionError
 at com.persistit.MVV.prune(MVV.java:463)
 at com.persistit.Exchange.storeInternal(Exchange.java:1490)
 at com.persistit.Exchange.store(Exchange.java:1294)
 at com.persistit.Exchange.store(Exchange.java:2539)
 at com.persistit.TransactionTest2.transfer(TransactionTest2.java:290)
 at com.persistit.TransactionTest2.runIt(TransactionTest2.java:240)
 at com.persistit.TransactionTest2$2.run(TransactionTest2.java:184)
 at java.lang.Thread.run(Thread.java:680)

Peter Beaman (pbeaman)
Changed in akiban-persistit:
assignee: nobody → Peter Beaman (pbeaman)
Changed in akiban-persistit:
importance: High → Critical
milestone: none → 3.1.8
Peter Beaman (pbeaman) wrote :

Found.

This assert is one manifestation of a pair of phenomena we call "isolation protocol failure". In our MVCC implementation it is critical for thread A to be able to correctly and atomically read the status of a transaction being executed by thread B. TransactionIndex and TransactionIndexBucket are carefully written to ensure correct values are read, while at the same time avoiding locks as often as possible.

Unfortunately, two statements in TransactionIndex read object fields outside of locks, and those fields could change between reads in a way that permitted false readings. The consequences were rare but serious. Specifically:

- TransactionTest2, which runs multiple threads in tight loops performing contentious transactions, sometimes exhibited a failure in which the sum of all "accounts" was non-zero by the end of the test.

- The assertion documented in this bug report, which indicates that an MVV has two versions from concurrently executing transactions. The wwDependency logic should have prevented this.

The race condition that permitted these errors is very narrow: the commit timestamp of a TransactionStatus object had to change between two states during the execution of a conditional statement that read the commit timestamp twice outside of a lock. We believe the error is extremely unlikely unless a very large volume of concurrent transactions is being performed.
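
For illustration only, here is a minimal sketch of this class of bug. It is not the Persistit code. The class name, the method names, and the encoding (0 meaning "not yet committed", positive values meaning a commit timestamp) are all assumptions made for the example. The concurrent writer is simulated with a supplier that returns a different value on each read, so the bad interleaving is reproduced deterministically. Reading the field once into a local, as in the second method, is a common remedy for double reads of this kind and is not necessarily the change made for this bug.

```java
import java.util.function.LongSupplier;

public class CommitTimestampReadSketch {

    // Hypothetical encoding: 0 means the transaction has not committed yet.
    public static final long UNCOMMITTED = 0L;

    // Racy form: the conditional reads the commit timestamp twice, so the
    // two halves of the test can observe two different states.
    public static boolean isVisibleRacy(LongSupplier commitTs, long snapshotTs) {
        return commitTs.getAsLong() <= snapshotTs      // first read
                && commitTs.getAsLong() != UNCOMMITTED; // second read
    }

    // Single-read form: one read into a local, then both tests use that value.
    public static boolean isVisibleSingleRead(LongSupplier commitTs, long snapshotTs) {
        final long tc = commitTs.getAsLong();
        return tc != UNCOMMITTED && tc <= snapshotTs;
    }

    // Simulates a field that another thread changes between reads: each call
    // returns the next value in the list, repeating the last one at the end.
    public static LongSupplier readsInOrder(long... values) {
        final int[] next = {0};
        return () -> values[Math.min(next[0]++, values.length - 1)];
    }

    public static void main(String[] args) {
        // The transaction commits at timestamp 120 between the two reads;
        // the reader's snapshot timestamp is 100.
        System.out.println("racy:        "
                + isVisibleRacy(readsInOrder(UNCOMMITTED, 120L), 100L));
        System.out.println("single read: "
                + isVisibleSingleRead(readsInOrder(UNCOMMITTED, 120L), 100L));
    }
}
```

In this sketch neither state on its own makes the update visible to a reader at snapshot 100: an uncommitted transaction is not visible, and a commit at 120 is too late. The racy form nevertheless reports it as visible because it combines half of each state, which is the kind of false reading described above.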

Peter Beaman (pbeaman)
Changed in akiban-persistit:
status: Confirmed → Fix Committed
Peter Beaman (pbeaman)
Changed in akiban-persistit:
status: Fix Committed → Fix Released