exchange.remove() returns true if a key has already been removed

Bug #914044 reported by Yuval Shavit
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Akiban Persistit
Fix Released
Medium
Nathan Williams

Bug Description

In the server code, when we maintain GI row counts, we decrement the counter iff our attempt to remove the current key causes it to actually be removed. This happens in OperatorStoreGIHandler.removeExchange.

I originally tried this code:

            if (exchange.remove()) {
                AccumulatorAdapter.updateAndGet(AccumulatorAdapter.AccumInfo.ROW_COUNT, exchange, -1);
            }

That didn't work as expected, but this does:

            if (exchange.fetch().getValue().isDefined()) {
                exchange.remove();
                AccumulatorAdapter.updateAndGet(AccumulatorAdapter.AccumInfo.ROW_COUNT, exchange, -1);
            }

Looking at the persistit UI, the key is defined but has an unpruned MVV that marks it as deleted:

    {"John",null,(long)1,null} [429<442>:AntiValue{}, 924<UNCOMMITTED>:AntiValue{}]

(The second one of those antivalues is the one caused by the if (exchange.remove()).

To reproduce:

- server trunk at revno 1430, persistit at 3.0.1-SNAPSHOT.220
- turn persistit UI on
- put a breakpoint in NewGiUpdateIT.coih_add_o() at the writeRow. Should be line 1159.
- once that's hit, put a breakpoint at OperatorStoreGIHandler.removeExchange
- you should see the antivalued key, and exchange.remove() should return true.

Fwiw, even with the fetch-then-remove version, exchange.remove() still returns true.

Revision history for this message
Yuval Shavit (yshavit) wrote :

Adding server as also-affected, since our code (at least in OperatorStoreGIHandler) uses the workaround of not relying on exchange.remove()'s return value. As far as I can tell, no other usage in the server relies on this return value. So the server is functionally correct here, but the code is a bit more verbose than it should be. My guess is that the performance hit is negligible, since a correct implementation of exchange.remove() would have to look at the MVVs to decide its return value, same as the current fetch.

Changed in akiban-server:
importance: Undecided → Low
milestone: none → 1.1
Revision history for this message
Nathan Williams (nwilliams) wrote :

The bug here is that exchange.remove() is really returning/leaking the visibility state of the key. If you do

If you do remove twice in a row both will return true:
ex.remove(k)
ex.remove(k)

This is a simple fix.

It may be important to note that this means the GI code is during redundant removes. Is that expected?

Note: the pruned state of the value didn't, really, have anything to do with this bug so I changed the title -- i.e. you can get a incorrect 'true' back even when the value is fully pruned. To be clear, pruned does not strictly mean single valued. If you have another case where it does please correct me!

summary: - exchange.remove() returns true if a key has an unpruned, value
+ exchange.remove() returns true if a key has already been removed
Changed in akiban-persistit:
status: New → Confirmed
importance: Undecided → Medium
assignee: nobody → Nathan Williams (nwilliams)
milestone: none → samba
Revision history for this message
Yuval Shavit (yshavit) wrote :

>It may be important to note that this means the GI code is doing
>redundant removes. Is that expected?

Yup.

Revision history for this message
Yuval Shavit (yshavit) wrote :

lp:~yshavit/akiban-server/gi_rowcount_fix is the branch that uses this workaround. When this bug is fixed in persistit, we should go back and simplify the server code before marking this bug fixed for the server.

Changed in akiban-persistit:
status: Confirmed → Fix Committed
Changed in akiban-persistit:
status: Fix Committed → Fix Released
Changed in akiban-persistit:
milestone: samba → 1.1
Changed in akiban-persistit:
milestone: 1.1 → 3.0.3
Yuval Shavit (yshavit)
Changed in akiban-server:
assignee: nobody → Yuval Shavit (yshavit)
Revision history for this message
Yuval Shavit (yshavit) wrote :

Since this is fixed in persistit, I thought I'd just quickly go in and fix it in the server. First, just to prove that we need the return value to be correct, I replaced the if with:

    if (exchange.remove() & true) {

Imagine my surprise when all ITs pass. Uh oh. FWIW, exchange.remove() & false does fail tests.

Revision history for this message
Yuval Shavit (yshavit) wrote :

Jack brought up the good point that (exchange.remove() & true) doesn't do what I wanted it to. "These are not the operations you're looking for." When I replace the & with |, the tests fail. And when I remove it so it's just if(exchange.remove()), the tests pass.

By "the tests" I mean specifically NewGiUpdateIT. Jenkins can take care of the rest of the 192847124 minutes it takes to run all the ITs.

Changed in akiban-server:
status: New → Fix Committed
Changed in akiban-server:
milestone: 1.1 → 1.2.1
status: Fix Committed → Fix Released
information type: Private → Public
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.