Akiban Persistit

Accumulator state sometimes missing from checkpoint

Bug #1064565 reported by Peter Beaman on 2012-10-09

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
Akiban Persistit	Fix Released	Critical	Peter Beaman	Akiban Persistit 3.2.0

Bug Description

The state of an Accumulator is sometimes incorrect after shutting down and restarting Persistit. As a result, an application can read a count or value that is inconsistent with the history of committed transactions.

The bug mechanism is a race between the CheckpointManager#createCheckpoint method and the Accumulator#update method: an update that occurs in a transaction starting immediately after the checkpoint begins its transaction can be lost. The probability of failure is low but may be increased by intense I/O activity.

This is a data loss error and is therefore critical.

Related branches

lp:~pbeaman/akiban-persistit/fix-accumulator-checkpoint-failure

Merged into lp:akiban-persistit at revision 378

Nathan Williams: Approve on 2012-10-11

Peter Beaman (pbeaman) on 2012-10-09

Changed in akiban-persistit:
assignee:	nobody → Peter Beaman (pbeaman)

Peter Beaman (pbeaman) wrote on 2012-10-09:

Bug has been reproduced within a new unit test Bug1064565Test.

Method:

ThreadSequencer is used to create a schedule in which

- Thread A, executing the CheckpointManager#createCheckpoint method, blocks after assigning the checkpoint timestamp but before acquiring the accumulator checkpoint list.

- Thread B invokes Accumulator#update while A is blocked, after which both threads are allowed to complete.

Close and restart Persistit and verify state of accumulator.
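
For readers without the test source at hand, the forced interleaving described above can be sketched with plain java.util.concurrent latches. This is only an illustration of the schedule, not the ThreadSequencer API and not the code of Bug1064565Test; the class name, method name, and event strings are invented for the example.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for a ThreadSequencer schedule; not Persistit code.
final class ForcedScheduleSketch {

    // Runs the two-thread schedule and returns the events in the order they occurred.
    static List<String> runSchedule() {
        final List<String> events = Collections.synchronizedList(new ArrayList<String>());
        final CountDownLatch timestampAssigned = new CountDownLatch(1);
        final CountDownLatch updateDone = new CountDownLatch(1);

        // Thread A plays the checkpoint: it pauses between its two steps.
        Thread a = new Thread(() -> {
            events.add("A: checkpoint timestamp assigned");
            timestampAssigned.countDown();
            try {
                updateDone.await(); // blocked until B has performed its update
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            events.add("A: accumulator checkpoint list acquired");
        });

        // Thread B plays the application: it updates while A is blocked.
        Thread b = new Thread(() -> {
            try {
                timestampAssigned.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            events.add("B: Accumulator update");
            updateDone.countDown();
        });

        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(events);
    }

    public static void main(String[] args) {
        for (String event : runSchedule()) {
            System.out.println(event);
        }
    }
}
```

The latches make the order deterministic: B's update always lands between A's two steps, which is the window the unit test needs to hit.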

Nathan Williams (nwilliams) wrote on 2012-10-09:

From your elaboration, it doesn't even need a sequencer:

txn.begin()
accum.update()
checkpoint()
txn.commit()
txn.end()
copyback()
restart()

Because the checkpointed snapshot does not include the delta and it is lost upon restart. Right?

Peter Beaman (pbeaman) wrote on 2012-10-10:

Correct! And verified by test. The issue is that the _checkpointRef
field of AccumulatorRef is not treated transactionally.
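
A toy model may help make the sequence concrete. The sketch below is not Persistit code: the class, its fields, and its methods are invented, a single boolean stands in for the _checkpointRef state, and restart is assumed to recover the Accumulator from the last checkpointed value only. The copyback step is not modelled.

```java
// Toy model only: every name here is invented for illustration.
final class LostUpdateModel {
    private long committedValue;     // sum of committed deltas
    private long pendingDelta;       // delta held by the open transaction
    private boolean needsCheckpoint; // stands in for the _checkpointRef state
    private long checkpointedValue;  // value written to stable storage

    void update(long delta) {
        pendingDelta += delta;
        needsCheckpoint = true;
    }

    void commit() {
        committedValue += pendingDelta;
        pendingDelta = 0;
    }

    void checkpoint() {
        if (!needsCheckpoint) {
            return; // nothing flagged, so nothing is written
        }
        checkpointedValue = committedValue; // snapshot sees committed updates only
        needsCheckpoint = false;            // cleared without regard to open transactions
    }

    long restart() {
        committedValue = checkpointedValue; // assumed recovery rule for this model
        pendingDelta = 0;
        needsCheckpoint = false;
        return committedValue;
    }

    // Replays the sequence from the comment above and returns the recovered value.
    static long runSequence() {
        LostUpdateModel accum = new LostUpdateModel();
        accum.update(5);    // txn.begin(); accum.update()
        accum.checkpoint(); // checkpoint(): writes 0 and clears the flag
        accum.commit();     // txn.commit(); txn.end()
        accum.checkpoint(); // a later checkpoint finds the flag clear and writes nothing
        return accum.restart();
    }

    public static void main(String[] args) {
        System.out.println("recovered value = " + runSequence()); // prints "recovered value = 0"
    }
}
```

In this model the committed delta of 5 never reaches stable storage, so the restart returns 0.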

Timothy Wegner (timmwegner) wrote on 2012-10-10:

Here is my view of the sequence based on my last discussion with Peter.

  1. Transaction 1, which causes accumulator updates, begins.
  2. The accumulator is updated and the flag indicating that it needs to be checkpointed is set. Transaction 1 continues to do updates.
  3. The checkpoint operation starts and a snapshot is taken (includes accumulator values).
  4. Checkpoint code acquires the journal lock and flushes buffers.
  5. Checkpoint code gets the accumulator values and stores them (journal no longer locked).
  6. Transaction 1 commits/ends; no more updates.
  7. Checkpoint code clears the flag indicating that an accumulator checkpoint is required.

Txn 1 should be blocked during step 4; however, all accumulator updates that occur between steps 1 and 7 would be lost.
If Txn 1 continues to perform updates after step 7, all is good and no issue will occur.

Peter Beaman (pbeaman) on 2012-10-10

Changed in akiban-persistit:
status:	Confirmed → Fix Committed

Peter Beaman (pbeaman) wrote on 2012-10-11:

Here's a review of the bug mechanism:

Effectively, every Accumulator has a flag F that indicates whether it has been modified since the last checkpoint. (The flag is actually the state of a field and an associated AccumulatorRef instance.) F is set each time the Accumulator is updated and cleared by the checkpoint transaction. What happens in the bug is this:

A transaction A begins and performs updates on one or more Accumulators, along with other work. For now assume there is just one Accumulator with one flag F.

After A's start timestamp but before A commits, a checkpoint transaction C starts in the CHECKPOINT_MANAGER thread. C writes a copy of the Accumulator's snapshot value as of C's start timestamp to stable storage and clears F. C then commits before A, leaving the F flag set to false.

The snapshot value written to stable storage does not include the updates performed by A, since A had not yet committed. However, the needs-checkpoint flag F is false, so unless there is another Accumulator update, the newly updated and committed value will never be written by another checkpoint.

Subsequently the system shuts down and starts up, which causes the pre-update state of the Accumulator to be recovered. This value is inconsistent with key-value pairs that A may have inserted and that are present in the recovered state.

Nathan's formulation of the bug has an arbitrarily large time window during which the bug can occur. Summary:

A begins
A performs all of its Accumulator updates
C begins
C commits
A commits

(it doesn't really matter whether A or C commits first.)

If A begins, performs its Accumulator updates, and then simply waits for a checkpoint to start and commit, the bug will occur. The longer A waits after performing its last Accumulator update before calling commit, the greater the probability of an occurrence. The situation is amplified because the checkpoint timestamp and the foreground transaction both synchronize against the JournalManager. If the checkpoint transaction gets it first, the commit for transaction A is blocked. Contrary to our first assessment, this bug is fairly easy to reproduce and does not require an improbable timing accident.

The proposed solution is better handling of the flag F so that the checkpoint transaction clears it only if no update has committed after the start of the checkpoint timestamp.
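
One possible reading of that rule, sketched as a toy model: the checkpoint leaves F set whenever an update is still uncommitted or was committed after the checkpoint timestamp, so a later checkpoint writes the value. This is an interpretation for illustration, not the merged patch; the logical clock, the two-phase checkpoint, and all names are assumptions.

```java
// Toy model of a guarded flag; every name here is invented for illustration.
final class GuardedFlagModel {
    private long clock;               // logical timestamp source (assumption)
    private long committedValue;      // sum of committed deltas
    private long pendingDelta;        // delta held by the open transaction
    private boolean hasPending;       // true while an update is uncommitted
    private long lastCommitTimestamp; // timestamp of the latest committed update
    private boolean needsCheckpoint;  // the flag F
    private long checkpointedValue;   // value written to stable storage

    void update(long delta) {
        pendingDelta += delta;
        hasPending = true;
        needsCheckpoint = true;
    }

    void commit() {
        if (!hasPending) {
            return;
        }
        committedValue += pendingDelta;
        pendingDelta = 0;
        hasPending = false;
        lastCommitTimestamp = ++clock;
    }

    // Phase 1: take the checkpoint timestamp and write the committed snapshot.
    long beginCheckpoint() {
        long timestamp = ++clock;
        if (needsCheckpoint) {
            checkpointedValue = committedValue;
        }
        return timestamp;
    }

    // Phase 2: clear F only if the snapshot covered every update made so far.
    void finishCheckpoint(long checkpointTimestamp) {
        boolean covered = !hasPending && lastCommitTimestamp <= checkpointTimestamp;
        if (covered) {
            needsCheckpoint = false;
        }
    }

    long restart() {
        return checkpointedValue; // assumed recovery rule for this model
    }

    // C commits before A: F survives because A's update is still uncommitted.
    static long recoverWhenCheckpointCommitsFirst() {
        GuardedFlagModel accum = new GuardedFlagModel();
        accum.update(5);
        long ts = accum.beginCheckpoint();
        accum.finishCheckpoint(ts);
        accum.commit();
        accum.finishCheckpoint(accum.beginCheckpoint()); // next checkpoint writes 5
        return accum.restart();
    }

    // A commits before C finishes: F survives because the commit is newer than C.
    static long recoverWhenTransactionCommitsFirst() {
        GuardedFlagModel accum = new GuardedFlagModel();
        accum.update(5);
        long ts = accum.beginCheckpoint();
        accum.commit();
        accum.finishCheckpoint(ts);
        accum.finishCheckpoint(accum.beginCheckpoint()); // next checkpoint writes 5
        return accum.restart();
    }

    public static void main(String[] args) {
        System.out.println(recoverWhenCheckpointCommitsFirst());  // prints 5
        System.out.println(recoverWhenTransactionCommitsFirst()); // prints 5
    }
}
```

Both orderings of the A and C commits recover the committed value in this model, matching the observation that it does not matter which of the two commits first.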

Nathan Williams (nwilliams) on 2012-10-15

Changed in akiban-persistit:
status:	Fix Committed → Fix Released
