cache GC interacts badly with multiple threads
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
ZODB | Fix Released | Undecided | Christian Theune |
3.6 | Fix Released | Undecided | Christian Theune |
3.7 | Fix Released | Undecided | Christian Theune |
3.8 | Fix Released | Undecided | Christian Theune |
Bug Description
Okay, I somehow managed to lose my first version of what I was going to say here.
Basically, when you run with a cache-size of 0 (although it could theoretically occur with other sizes as well), you can run into problems if the garbage collection for the cache runs while you are using objects in that thread. Unfortunately, the db.open call runs the GC on the caches of all connections whenever you create a new connection.
The sorts of errors that can occur can be quite confusing. For example take the code for the persistent mapping __getstate__ routine (as called eventually by Transaction.
def __getstate__(self):
    state = {}
    state.update(self.__dict__)
    del state['data']
    return state
If the object is ghostified before the self.__dict__ reference, then self.__dict__ no longer contains 'data' and you'll get a KeyError('data') exception at the del state['data'] line.
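The failure can be imitated without ZODB or threads. What follows is only a toy model: ToyMapping, ghostify and get_state are hypothetical names, and ghostify simply empties the instance dictionary, which is an assumption about what ghostification does to __dict__. In the real bug the ghostification happens from another thread part-way through the call; here it is done up front to keep the outcome deterministic.

```python
class ToyMapping:
    """Hypothetical stand-in for a persistent mapping; not ZODB code."""

    def __init__(self):
        self.data = {"key": "value"}

    def ghostify(self):
        # Assumption: turning an object into a ghost drops its state,
        # modelled here by emptying the instance dictionary.
        self.__dict__.clear()

    def get_state(self):
        # Modelled on the __getstate__ routine described in the report.
        state = {}
        state.update(self.__dict__)
        del state["data"]  # fails if 'data' vanished with the state
        return state


def state_after_gc():
    obj = ToyMapping()
    obj.ghostify()  # stands in for the cache GC run by another thread
    try:
        return obj.get_state()
    except KeyError as exc:
        return exc
```

Calling state_after_gc() returns the KeyError for 'data', while get_state() on an object that was never ghostified returns an empty dictionary.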
Another case is when you store a reference to a mutable (non-persistent) subobject and then access it.
l = self.list
l.append([xyz])
self._p_changed = 1
If self is ghostified before _p_changed is set, then self.list != l, so you won't actually change the object the way you meant to: the append went to a list that self no longer refers to.
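The lost update can also be shown with a toy model that does not use ZODB. ToyPersistent, STORED and ghostify_and_reload are hypothetical names, and the sketch assumes that a ghostified object rebuilds its sub-objects from the stored record the next time it is used; both steps are collapsed into one call here.

```python
import copy

STORED = {"list": []}  # plays the role of the committed database record


class ToyPersistent:
    """Hypothetical stand-in for a persistent object; not ZODB code."""

    def __init__(self):
        self.load()

    def load(self):
        # Reloading builds fresh sub-objects from the stored record.
        self.list = copy.deepcopy(STORED["list"])

    def ghostify_and_reload(self):
        # Assumption: a ghost drops its state and reloads it from
        # storage on the next access.
        del self.list
        self.load()


def lost_update():
    obj = ToyPersistent()
    l = obj.list
    l.append("xyz")
    obj.ghostify_and_reload()  # stands in for the cache GC in another thread
    # Setting _p_changed at this point would mark the reloaded state,
    # which never saw the append.
    return l, obj.list
```

After lost_update(), the held reference contains the appended item but the list reachable from the object is the freshly loaded, empty one.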
In fact, if you're really unlucky, you can hit the case where:
self.member == self.member evaluates to false.
Changed in zodb:
status: Unconfirmed → In Progress
I noticed the problem within our quite complex application, but fortunately I was able to reproduce the behaviour in a simple setting.
Attached is a Python script that creates a PersistentMapping, populates it with a couple of entries, and then repeatedly accesses the entries while another thread creates new connections and thus triggers the garbage collection.