get_cls_info is not threadsafe

Reported by najeira on 2012-01-20
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Storm
High
Unassigned

Bug Description

I had got "RuntimeError: dictionary changed size during iteration":

Traceback (most recent call last):

...

  File "/app/lib/storm/store.py", line 215, in find
    find_spec = FindSpec(cls_spec)
  File "/app/lib/storm/store.py", line 1699, in __init__
    info.append((False, get_cls_info(item)))
  File "/app/lib/storm/info.py", line 51, in get_cls_info
    cls.__storm_class_info__ = ClassInfo(cls)
  File "/app/lib/storm/info.py", line 78, in __init__
    column = getattr(cls, attr, None)
  File "/app/lib/storm/references.py", line 139, in __get__
    self._cls = _find_descriptor_class(cls or local.__class__, self)
  File "/app/lib/storm/references.py", line 935, in _find_descriptor_class
    for attr, _descr in cls.__dict__.iteritems():
RuntimeError: dictionary changed size during iteration

storm.info.ClassInfo.__init__ is not threadsafe.
I wrote pache:

Index: info.py
===================================================================
--- info.py (rev 5903)
+++ info.py (working)
@@ -19,6 +19,7 @@
 # along with this program. If not, see <http://www.gnu.org/licenses/>.
 #
 from weakref import ref
+from threading import RLock

 from storm.exceptions import ClassInfoError
 from storm.expr import Column, Desc, TABLE
@@ -30,7 +31,9 @@
 __all__ = ["get_obj_info", "set_obj_info", "get_cls_info",
            "ClassInfo", "ObjectInfo", "ClassAlias"]

+_lock = RLock()

+
 def get_obj_info(obj):
     try:
         return obj.__storm_object_info__
@@ -50,8 +53,13 @@
         # Can't use attribute access here, otherwise subclassing won't work.
         return cls.__dict__["__storm_class_info__"]
     else:
- cls.__storm_class_info__ = ClassInfo(cls)
- return cls.__storm_class_info__
+ _lock.acquire()
+ try:
+ if "__storm_class_info__" not in cls.__dict__:
+ cls.__storm_class_info__ = ClassInfo(cls)
+ return cls.__storm_class_info__
+ finally:
+ _lock.release()

 class ClassInfo(dict):

najeira (najeira) wrote :

lock to initialize ClassInfo

najeira (najeira) wrote :
Gavin Panella (allenap) wrote :

I don't think much of Storm is thread-safe. You should probably not be sharing stores and model objects between threads, but someone with more experience may correct me. We certainly avoid that in Launchpad.

najeira (najeira) wrote :

I did not share stores and model objects. I use other stores each threads.
get_cls_info() stores model information to class. Class is shared every thread.
But, get_cls_info does not have protection for threadsafe.

Gavin Panella (allenap) wrote :

Ah yes, good point :-)

Gavin Panella (allenap) on 2012-01-21
Changed in storm:
status: New → Triaged
importance: Undecided → High
Gustavo Niemeyer (niemeyer) wrote :

It must be thread safe indeed. The fix is going a good direction, but it doesn't look entirely correct. Reference.__get__ isn't called by get_cls_info.

Gustavo Niemeyer (niemeyer) wrote :

FWIW, that'd be easy to fix on 2.X, since we can call items() which is atomic. In 3.X a more traditional approach is necessary. It's a pity that the classes itself offers no way to reflect a list of key/value pairs in a safe way.

najeira (najeira) wrote :

There are several problems, not only `cls.__dict__.iteritems()`.

For example ClassInfo.columns. That value will be used by many places.
But that value is setted after first initialization, in rare cases, because thread-unsafe.

Gustavo Niemeyer (niemeyer) wrote :

I'm probably missing what you actually mean. Apparently ClassInfo.columns is set in ClassInfo's constructor.

najeira (najeira) wrote :

Yes, ClassInfo.columns is setted in ClassInfo's constructor.
And ClassInfo's constructor called from get_cls_info().

ClassInfo.columns is stored __storm_class_info__,
that value shold not changed after initialized.
Because ObjectInfo use ClassInfo.columns.

Gustavo Niemeyer (niemeyer) wrote :

A ClassInfo object is stored in __storm_class_info__, not its columns attribute. ClassInfo.columns is only ever set by ClassInfo.__init__, I believe.

Gustavo Niemeyer (niemeyer) wrote :

Maybe what you're trying to say is that get_cls_info today may return two different values if it's called concurrently?
It shouldn't indeed, so let's fix that too.

There's still something else to be done in addition to the suggested fix, though. Iterating over the class attributes may happen within Storm despite the lock being acquired there, so the lock isn't a guarantee.

najeira (najeira) wrote :

ClassInfo object is stored in __storm_class_info__, and ClassInfo object has columns attribute.
If __storm_class_info__ is changed, columns is changed too.

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers