oids improperly reused after savepoint rollback

Reported by David Glick on 2010-10-23
20
This bug affects 3 people
Affects Status Importance Assigned to Milestone
ZODB
Undecided
Unassigned

Bug Description

In ZODB 3.10.0, if a savepoint is created after a persistent object is added to a connection, and then the savepoint is rolled back, the new object's oid is incorrectly added to the DB's _saved_oids list. Then if another object is added, it gets the same oid as the first object. This becomes evident with a traceback like the following when the second object gets added to the cache:

    Traceback (most recent call last):
      File "/Users/davidg/src/python/parts/opt/lib/python2.6/doctest.py", line 1248, in __run
        compileflags, 1) in test.globs
      File "<doctest ZODB.tests.testConnectionSavepoint.testNewOidNotReusedAfterRollbackOfPostAddSavepoint[7]>", line 1, in <module>
        sp2 = transaction.savepoint()
      File "/Users/davidg/.buildout/eggs/transaction-1.1.1-py2.6.egg/transaction/_manager.py", line 101, in savepoint
        return self.get().savepoint(optimistic)
      File "/Users/davidg/.buildout/eggs/transaction-1.1.1-py2.6.egg/transaction/_transaction.py", line 260, in savepoint
        self._saveAndRaiseCommitishError() # reraises!
      File "/Users/davidg/.buildout/eggs/transaction-1.1.1-py2.6.egg/transaction/_transaction.py", line 257, in savepoint
        savepoint = Savepoint(self, optimistic, *self._resources)
      File "/Users/davidg/.buildout/eggs/transaction-1.1.1-py2.6.egg/transaction/_transaction.py", line 690, in __init__
        savepoint = savepoint()
      File "/Users/davidg/Plone/ZODB/src/ZODB/Connection.py", line 1122, in savepoint
        self._commit(None)
      File "/Users/davidg/Plone/ZODB/src/ZODB/Connection.py", line 621, in _commit
        self._store_objects(ObjectWriter(obj), transaction)
      File "/Users/davidg/Plone/ZODB/src/ZODB/Connection.py", line 687, in _store_objects
        self._cache[oid] = obj
    ValueError: A different object already has the same oid

I've attached a failing test case. This is currently blocking use of ZODB 3.10.0 with Plone trunk.

David Glick (davisagli) wrote :
Jim Fulton (jim-zope) wrote :

will be in 3.10.1

Changed in zodb:
status: New → Fix Committed
David Glick (davisagli) on 2010-11-02
Changed in zodb:
status: Fix Committed → Fix Released

I am still getting this error on ZODB 3.10.1, so I am not sure if this is really fixed... :-)

My application is a web scrapper that searches a few different sites for information and stores it on a ZODB database. It is a repoze.bfg application.

There are a few parsers searching for realestate information and sometimes they have some properties in common. They run on a separate process and once they have finished with the parsing they report to the main application with a request where they send in the data. When two requests come in sometimes there are conflict errors which is ok, but once in a while I am getting this other error reported in this ticket and when that happens I need to restart my application.

I hope I am being helpful here...

--
JP

Jürgen Herrmann (xlhost) wrote :

I also get this this error from time to time with zodb 3.10.1, last time it happend about 30min. ago while packing my biggest relstorage db. First there was a conflict error:
Traceback (innermost last):
  Module Zope2.App.startup, line 197, in __call__
  Module ZPublisher.Publish, line 134, in publish
  Module Zope2.App.startup, line 301, in commit
  Module transaction._manager, line 95, in commit
  Module transaction._transaction, line 329, in commit
  Module transaction._transaction, line 446, in _commitResources
  Module ZODB.Connection, line 780, in tpc_vote
  Module relstorage.storage, line 788, in tpc_vote
--
ConflictError: database conflict error (oid 0x01, class BTrees.IIBTree.IITreeSet, serial this txn started with 0x00 1900-01-01 00:00:00.000000, serial currently committed 0x038bcd4602a83ccc 2011-01-24 01:42:00.622776)

Obviously this could not be resolved through retrying and any attempt to commit changes involving oid 0x01 throws that conflict error.

A few seconds later the "same oid" exceptions pop up:

Traceback (innermost last):
  Module ZPublisher.Publish, line 134, in publish
  Module Zope2.App.startup, line 301, in commit
  Module transaction._manager, line 95, in commit
  Module transaction._transaction, line 329, in commit
  Module transaction._transaction, line 443, in _commitResources
  Module ZODB.Connection, line 565, in commit
  Module ZODB.Connection, line 621, in _commit
  Module ZODB.Connection, line 687, in _store_objects
ValueError: A different object already has the same oid

Both errors go away on zope restart.

Jürgen Herrmann (xlhost) wrote :

Still getting this error pretty soon after starting to pack my biggest db. Both errors appear almost simultaneously.

for testing i disabled the save_oid() method of ZODB.DB, so not oids are saved in self._saved_oids. This did not make the error go away. Please reopen the bug.

On Mon, Feb 7, 2011 at 6:10 AM, Jürgen Herrmann <email address hidden> wrote:
> Still getting this error pretty soon after starting to pack my biggest
> db. Both errors appear almost simultaneously.
>
> for testing i disabled the save_oid() method of ZODB.DB, so not oids are
> saved in self._saved_oids. This did not make the error go away. Please
> reopen the bug.

Are you *sure* you did this correctly? This obviously can't be related to
save_oids if disabling this function doesn't make the problem go away.
I also don't see how this can be related to packing.

Jim

--
Jim Fulton
http://www.linkedin.com/in/jimfulton

Changed in zodb:
status: Fix Released → In Progress
Jürgen Herrmann (xlhost) wrote :

i modified ZODB/DB.py:

    def save_oid(self, oid):
        if oid in self._saved_oids:
            raise ValueError("Duplicate saved object ids.")
+ return
        if len(self._saved_oids) < self._max_saved_oids:
            self._saved_oids.add(oid)

I'll try packing it again today night, when traffic is less on our site. right now this nasty bug leaves us with a couple of angry customers, as the errors don't go away until we restart the appserver. If you have any suggestions as to what code paths i might try to deactivate for debugging, i'm happy to help out. unfortunately i still cannot make a small reproduceable test case from this.
it strongly looks like some kind of cache corruption though.

Jürgen Herrmann (xlhost) wrote :

yet another hint: the last attempted packing operation left the database in an persistently inconsistent state, zope restarts did not make this one go away. After restarting zope i immediately hat ReadConflictErrors on various oids in the event.log.
Grabing the oid from the event.log entry and doing the following in a debugger:

from ZODB.utils import p64
container._p_jar[p64(0x123456)]._p_changed = True
import transaction; transaction.commit()

This led to the "different object already has the same oid" on all subsequent accesses to this object... I strongly suspect that the ReadConflictErrors and this one are tied together and the former one introduces the cache inconsistency.
Finally i could fix this inconsistency by setting _p_changed to True on each object modified in the last 24h and commiting each object in a separate transaction. This made the errors go away, at least for now.

Jim Fulton (jim-zope) on 2011-02-11
Changed in zodb:
status: In Progress → Fix Committed
Jim Fulton (jim-zope) on 2011-02-12
Changed in zodb:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers