Interrupt may leave TransactionStatus ABORTED but not Notified

Bug #1012859 reported by Peter Beaman
Affects: Akiban Persistit
Importance: High
Assigned to: Peter Beaman

Bug Description

The server ApiTestBase_turbo2 branch fails to complete because of excessive looping in snapshotAccumulatorHelper during the FTS tests. Further investigation shows that the ActiveTransactionCache floor is not advancing: some very old TransactionStatus instances on the long-running queue are marked ABORTED but have not been notified.
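
To illustrate why a single stale entry can pin the floor, here is a minimal sketch of a floor computation. It is not Persistit's code: the FloorSketch class, its Status fields, and the rule "a status holds the floor down until it is notified" are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model; names and rules are assumptions, not Persistit's classes.
final class FloorSketch {

    static final class Status {
        final long startTimestamp;
        final boolean aborted;
        final boolean notified;

        Status(long startTimestamp, boolean aborted, boolean notified) {
            this.startTimestamp = startTimestamp;
            this.aborted = aborted;
            this.notified = notified;
        }
    }

    // Assumed rule: the floor is the smallest start timestamp among statuses
    // that have not been notified, or the current timestamp if there are none.
    static long floor(List<Status> statuses, long currentTimestamp) {
        long floor = currentTimestamp;
        for (Status s : statuses) {
            if (!s.notified && s.startTimestamp < floor) {
                floor = s.startTimestamp;
            }
        }
        return floor;
    }

    public static void main(String[] args) {
        List<Status> statuses = new ArrayList<>();
        statuses.add(new Status(10, true, false));   // ABORTED but never notified
        statuses.add(new Status(500, false, false)); // still running
        System.out.println(floor(statuses, 1000));   // stays at the stale timestamp

        statuses.set(0, new Status(10, true, true)); // abort fully completed
        System.out.println(floor(statuses, 1000));   // floor can now advance
    }
}
```

Under this assumed rule, every transaction that starts after the stale entry stays above the floor, which is consistent with the growing long-running queue described in this report.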

Peter Beaman (pbeaman)
Changed in akiban-persistit:
assignee: nobody → Peter Beaman (pbeaman)
status: New → Confirmed
Peter Beaman (pbeaman) wrote :

Former name of this bug: Server API Turbo branch slow executing TransactionBucketIndex#accumulatorSnapshotHelper

After investigation we found a code path in which a thread is interrupted while trying to register a transaction, leaving the TransactionStatus instance partially aborted: its status is ABORTED, but it is not Notified. The downstream consequences are severe: a TransactionStatus in this state prevents the ActiveTransactionCache floor from advancing, which causes a large accumulation of long-running transactions. The bug was observed when a server test was found looping almost constantly in the Accumulator code.
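
This report does not show the actual fix. As a hedged illustration of the failure shape described above, the sketch below contrasts a registration path that marks a status ABORTED on interrupt without notifying it against one that completes both steps in a finally block. RegistrationSketch, its Status class, and the use of a Semaphore as the interruptible step are all hypothetical stand-ins, not Persistit's API.

```java
import java.util.concurrent.Semaphore;

// Hypothetical sketch of the bug pattern; not Persistit's implementation.
final class RegistrationSketch {

    enum State { RUNNING, ABORTED }

    static final class Status {
        volatile State state = State.RUNNING;
        volatile boolean notified;
    }

    // Buggy shape: an interrupt during the blocking step marks the status
    // ABORTED, then propagates before anything sets the notified flag.
    static void registerBuggy(Status s, Semaphore lock) throws InterruptedException {
        try {
            lock.acquire(); // throws InterruptedException if the thread is interrupted
            lock.release();
        } catch (InterruptedException e) {
            s.state = State.ABORTED;
            throw e; // notified is never set
        }
    }

    // One possible repair: whenever registration does not complete,
    // finish the abort (mark and notify) before the exception propagates.
    static void registerFixed(Status s, Semaphore lock) throws InterruptedException {
        boolean registered = false;
        try {
            lock.acquire();
            lock.release();
            registered = true;
        } finally {
            if (!registered) {
                s.state = State.ABORTED;
                s.notified = true;
            }
        }
    }
}
```

The finally block is used rather than a catch clause so that any abnormal exit from registration, not only an interrupt, leaves the status fully aborted.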

summary: - Server API Turbo branch slow executing
- TransactionBucketIndex#accumulatorSnapshotHelper
+ Interrupt may leave TransactionStatus ABORTED but not Notified
Changed in akiban-persistit:
importance: Undecided → High
status: Confirmed → In Progress
Peter Beaman (pbeaman) wrote :

The fix for this bug fixes the ApiTestBase_turbo2 branch: the test run now completes. Note the total time.

Results :

Failed tests: forceNewTimestampChangesSchemaGen(com.akiban.server.test.it.store.SchemaManagerIT): timestamp unchanged: 487
  testCancel(com.akiban.sql.pg.JMXCancelationIT): expected:<1> but was:<2>
  testConnectionReuse(com.akiban.sql.pg.YamlTesterIT): Command 1 (Statement): java.lang.NullPointerException

Tests in error:
  renameAllSchemasAndTablesParentDownWithRestartsBetween(com.akiban.server.test.it.dxl.RenameTableIT): Guice provision errors:(..)

Tests run: 1719, Failures: 3, Errors: 1, Skipped: 17

[INFO] [failsafe:verify {execution: verify}]
[INFO] Failsafe report directory: /home/peter/dev/server/ApiTestBase_turbo2/target/failsafe-reports
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] There are test failures.

Please refer to /home/peter/dev/server/ApiTestBase_turbo2/target/failsafe-reports for the individual test results.
[INFO] ------------------------------------------------------------------------
[INFO] For more information, run Maven with the -e switch
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 15 minutes 54 seconds
[INFO] Finished at: Thu Jun 14 13:41:41 EDT 2012
[INFO] Final Memory: 43M/537M
[INFO] ------------------------------------------------------------------------

real 15m55.731s
user 12m31.580s
sys 0m19.510s

Changed in akiban-persistit:
status: In Progress → Fix Committed
Changed in akiban-persistit:
milestone: none → 3.1.2
visibility: private → public
Changed in akiban-persistit:
status: Fix Committed → Fix Released