Assertion failure in TransactionIndexBucket#allocateTransactionStatus

Bug #1126297 reported by Peter Beaman
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Akiban Persistit
Fix Released
High
Nathan Williams

Bug Description

This is actually a dup of bug932072. That bug was closed long ago and pertained to somewhat different code, so I am writing a new bug report to cover this instance. This error was seen on a customer server:

java.lang.AssertionError
        at com.persistit.TransactionIndexBucket.allocateTransactionStatus(TransactionIndexBucket.java:180)
        at com.persistit.TransactionIndex.registerTransaction(TransactionIndex.java:560)
        at com.persistit.TransactionIndex.registerTransaction(TransactionIndex.java:527)
        at com.persistit.Transaction.begin(Transaction.java:596)
        at com.akiban.server.store.PersistitTransactionService.beginTransaction(PersistitTransactionService.java:81)
        at com.akiban.server.service.dxl.DXLTransactionHook.hookFunctionIn(DXLTransactionHook.java:46)
        at com.akiban.server.service.dxl.HookableDDLFunctions.getAIS(HookableDDLFunctions.java:266)
        at com.akiban.sql.pg.PostgresServerConnection.updateAIS(PostgresServerConnection.java:1000)
        at com.akiban.sql.pg.PostgresServerConnection.finishGenerating(PostgresServerConnection.java:1102)
        at com.akiban.sql.pg.PostgresServerConnection.processQuery(PostgresServerConnection.java:689)
        at com.akiban.sql.pg.PostgresServerConnection.topLevel(PostgresServerConnection.java:263)
        at com.akiban.sql.pg.PostgresServerConnection.run(PostgresServerConnection.java:179)
        at java.lang.Thread.run(Unknown Source)

We believe this is caused by an inadequately handled race condition between a thread relinquishing a TransactionStatus instance and another thread attempting to acquire one.

Related branches

Peter Beaman (pbeaman)
information type: Proprietary → Public
Peter Beaman (pbeaman)
Changed in akiban-persistit:
status: Confirmed → Fix Committed
Changed in akiban-persistit:
milestone: none → 3.2.6
assignee: nobody → Peter Beaman (pbeaman)
Revision history for this message
Padraig O'Sullivan (posulliv) wrote :

I tried this branch out on the customers machine but still seem to be hitting the same issue.

Changed in akiban-persistit:
status: Fix Committed → In Progress
assignee: Peter Beaman (pbeaman) → Nathan Williams (nwilliams)
Revision history for this message
Nathan Williams (nwilliams) wrote :

Originally posted on wrong bug:
Another version of the stack, but the other half of the coin (error on unlock vs lock):
ERROR [CLEANUP_MANAGER] 2013-02-19 16:58:45,134 Slf4jAdapter.java (line 99) [CLEANUP_MANAGER] ERROR java.lang.IllegalMonitorStateException
at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:155)
at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1260)
at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:460)
at com.persistit.TransactionStatus.wwUnlock(TransactionStatus.java:377)
at com.persistit.TransactionIndexBucket.notifyCompleted(TransactionIndexBucket.java:263)
at com.persistit.TransactionIndex.notifyCompleted(TransactionIndex.java:614)
at com.persistit.Transaction.rollback(Transaction.java:726)
at com.persistit.Transaction.close(Transaction.java:511)
at com.persistit.Persistit.cleanup(Persistit.java:1512)
at com.persistit.CleanupManager.poll(CleanupManager.java:156)
at com.persistit.CleanupManager.runTask(CleanupManager.java:87)
at com.persistit.IOTaskRunnable.run(IOTaskRunnable.java:143)
at java.lang.Thread.run(Thread.java:722)

And Peter's response:
Likely due to different thread trying to abort an abandoned txn.

Revision history for this message
Nathan Williams (nwilliams) wrote :
Download full text (3.5 KiB)

Eureka. This is trivially explainable and not due to any race.

The code running on the customer machine had a worker thread that could, potentially, exit without closing the transaction (no commit, rollback, or end). That leads to an error in the CleanupManager (see comment #2) but not before the manager marks that status as notified. Since it got notified, the next cleanup of the bucket will put it onto the free list in a locked state. This is how we wind up with a locked status on the free list, making that bucket essentially unusable.

This quick test code demonstrates all three aspects:
public static class EmptyTxnThread extends Thread {
        private final Persistit persistit;
        private final boolean doCommit;
        public Throwable error;

        public EmptyTxnThread(Persistit persistit, boolean doCommit) {
            this.persistit = persistit;
            this.doCommit = doCommit;
        }

        @Override
        public void run() {
            Transaction txn = persistit.getTransaction();
            try {
                txn.begin();
                if(doCommit) {
                    txn.commit();
                    txn.end();
                } else {
                    System.out.println("Abandoning thread: " + Thread.currentThread().getName());
                }
            } catch(Throwable t) {
                error = t;
            }
        }
    }

    @Test
    public void abandonedTest() throws Throwable {
        for(int i = 0; ; ++i) {
            EmptyTxnThread thread = new EmptyTxnThread(_persistit, i > 0);
            thread.start();
            while(thread.isAlive()) {
                Thread.sleep(10);
            }
            if(thread.error != null) {
                throw thread.error;
            }
        }
    }

And output from running it locally:
Abandoning thread: Thread-2
[CLEANUP_MANAGER] ERROR java.lang.IllegalMonitorStateException
     at java.util.concurrent.locks.ReentrantLock$Sync.tryRelease(ReentrantLock.java:155)
     at java.util.concurrent.locks.AbstractQueuedSynchronizer.release(AbstractQueuedSynchronizer.java:1260)
     at java.util.concurrent.locks.ReentrantLock.unlock(ReentrantLock.java:460)
     at com.persistit.TransactionStatus.wwUnlock(TransactionStatus.java:378)
     at com.persistit.TransactionIndexBucket.notifyCompleted(TransactionIndexBucket.java:263)
     at com.persistit.TransactionIndex.notifyCompleted(TransactionIndex.java:614)
     at com.persistit.Transaction.rollback(Transaction.java:726)
     at com.persistit.Transaction.close(Transaction.java:511)
     at com.persistit.Persistit.cleanup(Persistit.java:1512)
     at com.persistit.CleanupManager.poll(CleanupManager.java:156)
     at com.persistit.CleanupManager.runTask(CleanupManager.java:87)
     at com.persistit.IOTaskRunnable.run(IOTaskRunnable.java:143)
     at java.lang.Thread.run(Thread.java:722)

java.lang.AssertionError: status:<<ts=115 tc=ABORTED mvv=0>> wwLock:<java.util.concurrent.locks.ReentrantLock@51037076[Locked by thread Thread-2]> curThread:<Thread-240>
 at com.persistit.TransactionIndexBucket.allocateTransactionStatus(TransactionIndexBucket.java:180)
 at com.persistit.TransactionIndex.r...

Read more...

Changed in akiban-persistit:
status: In Progress → Fix Committed
Changed in akiban-persistit:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.