Pdata objects cannot be pickled once the chain is longer than 500 links

Bug #143899 reported by Darryl Dixon
Affects: Zope 2
Status: Invalid
Importance: Medium
Assigned to: Unassigned

Bug Description

In OFS.Image there are a 'File' class and a 'Pdata' wrapper class. Pdata objects are designed to handle large amounts of data efficiently by allowing themselves to be chained together in a linked list, while presenting an interface as if to a single object at the head of the list.
The 'File' class (which is used by, e.g., ATFile, CMFFile, etc.) stores its data internally in a Pdata object. It provides a 'manage_upload()' interface for supplying or modifying the data stored in the File. This method splits the data into 64KB chunks and assigns each chunk to a new Pdata object in the linked-list Pdata chain.
Once a file of more than 32MB has been stored via manage_upload(), the File object can no longer be dumped with the Pickler() class.
E.g., cPickle.Pickler.dump(OFS.Image.File) fails when the data stored in the OFS.Image.File is greater than 32MB and was stored via the supplied manage_upload() interface.
The error is a RuntimeError for exceeding the maximum recursion depth.
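To illustrate the mechanism, here is a minimal sketch (not the OFS.Image code; the 'Chunk' class and all names are made up) of why a long chain of objects, each pickled from inside its predecessor, exhausts the recursion limit:

    import cPickle

    class Chunk(object):
        def __init__(self, data):
            self.data = data
            self.next = None

    # build a chain of ~600 64KB nodes, comparable to a 32MB+ upload
    head = node = Chunk('X' * 65536)
    for i in range(600):
        node.next = Chunk('X' * 65536)
        node = node.next

    try:
        cPickle.dumps(head)   # each node is pickled from within its parent
    except RuntimeError:
        print 'maximum recursion depth exceeded'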

Attached is a simple ZopeTestCase that demonstrates the problem.

Tags: bug zope
Revision history for this message
Darryl Dixon (esrever-otua) wrote :
Revision history for this message
Andreas Jung (ajung) wrote :

Please submit a *simple* testcase on top of ZopeTestCase, not on top of PloneTestCase with lots of dependencies on other products.

Revision history for this message
Tres Seaver (tseaver) wrote :

I can't reproduce the described behavior at all, using the Python
API of File::

    $ cd ~/projects/Zope-CVS/Zope-2.10-branch
    $ PYTHONPATH=lib/python ../bin/python
    Python 2.4.4 (#1, Apr 30 2007, 11:13:06)
    [GCC 4.1.2 (Ubuntu 4.1.2-0ubuntu4)] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> x = 'X' * 1024 * 1024
    >>> data = open('/tmp/issue_2306', 'wb')
    >>> for i in range(34):
    ...     data.write(x)
    ...
    >>> data.flush()
    >>> data.close()
    >>> data = open('/tmp/issue_2306', 'rb')
    >>> from OFS.Image import File
    >>> f = File('issue_2306', '', '')
    >>> f.manage_upload(data)
    >>> f.size
    35651584L
    >>> from cPickle import dump, load
    >>> target = open('/tmp/pickle', 'wb')
    >>> dump(f, target)
    >>> target.flush()
    >>> target.close()
    >>> source = open('/tmp/pickle', 'rb')
    >>> f2 = load(source)
    >>> f2 == f
    False
    >>> f2.data == f.data
    False
    >>> f2.size
    35651584L
    >>> str(f2.data) == str(f.data)
    True

Revision history for this message
Darryl Dixon (esrever-otua) wrote :

Uploaded: pdatatest.py

Attached is a single file implementing the tests as ZopeTestCase tests. I trust that you will be able to drop it into the correct folder and run it.

Revision history for this message
Andreas Jung (ajung) wrote :

I also have some doubts. We're storing large objects of up to 150MB using Pdata chunking within the ZODB and have never had any issues so far.

Revision history for this message
Darryl Dixon (esrever-otua) wrote :

Folks, I'm gobsmacked.

Fine, the testcase is a PloneTestCase - I will create a ZopeTestCase when I have a spare moment. Honestly, though, this is a trivial problem to duplicate. Upload a big file TTW (through the web), then try to run some code that pickles it.

Have either of you *actually run* the attached testcase? Does it fail? Can you offer some *constructive input* into why it is perfectly acceptable that it should fail?

tseaver:
You are using Zope 2.10 - the bug is logged against Zope 2.8 and Zope 2.9 - maybe it's fixed in 2.10, who knows? Also, you have dumped the data without using the Pickler class - maybe this affects it?

Revision history for this message
Andreas Jung (ajung) wrote :

If you submit a bug report including a test, the test should be as small as possible. I *did* look at the test, just because I am also a Plone developer, _but_ not everybody has several Plone sandboxes at hand - and especially not with all the 3rd-party dependencies used in your tests (such as TextIndexNG) installed... such tests aren't very helpful for us.

Revision history for this message
Andreas Jung (ajung) wrote :

The error also occurs with the Python pickle module:

  File "/opt/python-2.4.4/lib/python2.4/pickle.py", line 695, in _batch_setitems
    save(v)
  File "/opt/python-2.4.4/lib/python2.4/pickle.py", line 338, in save
    self.save_reduce(obj=obj, *rv)
  File "/opt/python-2.4.4/lib/python2.4/pickle.py", line 433, in save_reduce
    save(state)
  File "/opt/python-2.4.4/

Also interesting is the behavior of _read_file() generating
only 1 chunk when called through the constructor.
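For context, here is a simplified paraphrase (not verbatim, and assuming a Zope 2 checkout where OFS.Image is importable) of the branch that produces the single-chunk behaviour:

    # Without a ZODB connection (_p_jar is None) the whole upload goes into
    # a single Pdata node; the 64KB-per-node chain is only built when a
    # connection is available.
    from OFS.Image import Pdata

    def _read_data_sketch(self, file, size):
        if self._p_jar is None:
            # no connection yet (e.g. when called from the constructor):
            # fall back to one big Pdata holding the entire value
            file.seek(0)
            return Pdata(file.read(size)), size
        # with a connection, read 64KB at a time, store each chunk in its
        # own Pdata object, and link them via 'next' (omitted in this sketch)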

Revision history for this message
Tres Seaver (tseaver) wrote :

> tseaver:
> You are using Zope 2.10 - the bug is logged against Zope 2.8 and Zope 2.9
> - maybe it's fixed in 2.10, who knows? Also, you have dumped the data
> without using the Pickler class - maybe this affects it?

My test script behaves identically in Zope 2.8 and 2.9, and also
if I create a Pickler wrapped around 'target' instead of calling
'dump' as a module-scope function.
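For reference, the Pickler-class variant looks like this (standard cPickle API; 'f' and the pickle target are the ones from the session above):

    from cPickle import Pickler

    target = open('/tmp/pickle', 'wb')
    p = Pickler(target)
    p.dump(f)          # same result as the module-level dump(f, target)
    target.close()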

Revision history for this message
Tres Seaver (tseaver) wrote :

I'll note a couple of observations:

 - The chunking into individual 64KB Pdata chunks can't work
   when you pass the file to the constructor: the implementation
   relies on having a connection ('_p_jar') already associated
   with the File object, which will never be true during the
   constructor. Therefore, you are likely to end up with a
   *single* Pdata chunk, which will be as large as the entire
   value passed to the constructor. This is a Bad Thing (tm),
   and is why the 'manage_addFile' factory passes an empty
   string as data to the constructor, and then calls
   'manage_upload' with the real data (see the sketch below).

 - Your test is spending a lot of effort getting around
   acquisition wrappers, which can't be pickled properly.
   I *suspect* that the recursion is due to an improperly-
   wrapped object somewhere.
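
A minimal sketch of that add-then-upload pattern (assuming a Zope 2 environment; 'add_big_file' is a hypothetical helper, and 'container' is any ObjectManager the File is being added to):

    from OFS.Image import File

    def add_big_file(container, file_id, open_file):
        # seat an *empty* File in its container first, as manage_addFile does
        container._setObject(file_id, File(file_id, '', ''))
        f = container._getOb(file_id)
        # only now hand over the real data, so manage_upload can build the
        # chunked Pdata chain rather than one large Pdata value
        f.manage_upload(open_file)
        return f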

Revision history for this message
Darryl Dixon (esrever-otua) wrote :

Hi tseaver -

Can you confirm the output of sys.getrecursionlimit() on your machine? What happens if you force it down to about sys.setrecursionlimit(20) or so?

D
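
(For reference, the standard sys calls in question; lowering the limit makes recursion-depth failures reproduce with much shorter Pdata chains, at the cost of breaking unrelated code:)

    import sys

    print sys.getrecursionlimit()   # typically 1000 by default
    sys.setrecursionlimit(100)      # hypothetical lower value for experimenting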

Revision history for this message
Andreas Jung (ajung) wrote :

Original followup by esrever_otua:

"""
Running copy.deepcopy against such an object fails as well for the same reason (Maximum recursion depth exceeded).
 >>> y = deepcopy(x)
 2007-09-25 12:25:10 ERROR ZODB.Connection Couldn't load state for 0x209fe1
 Traceback (most recent call last):
   File "/u01/app/zope-2.8.7/lib/python/ZODB/Connection.py", line 704, in setstate
     self._setstate(obj)
   File "/u01/app/zope-2.8.7/lib/python/ZODB/Connection.py", line 758, in _setstate
     self._reader.setGhostState(obj, p)
   File "/u01/app/zope-2.8.7/lib/python/ZODB/serialize.py", line 495, in setGhostState
     state = self.getState(pickle)
   File "/u01/app/zope-2.8.7/lib/python/ZODB/serialize.py", line 488, in getState
     return unpickler.load()
   File "/u01/app/zope-2.8.7/lib/python/ZODB/serialize.py", line 436, in _persistent_load
     return self._conn.get(oid)
   File "/u01/app/zope-2.8.7/lib/python/ZODB/Connection.py", line 207, in get
     p, serial = self._storage.load(oid, self._version)
   File "/u01/app/zope-2.8.7/lib/python/ZEO/ClientStorage.py", line 746, in load
     return self.loadEx(oid, version)[:2]
   File "/u01/app/zope-2.8.7/lib/python/ZEO/ClientStorage.py", line 769, in loadEx
     data, tid, ver = self._server.loadEx(oid, version)
   File "/u01/app/zope-2.8.7/lib/python/ZEO/ServerStub.py", line 192, in loadEx
     return self.rpc.call("loadEx", oid, version)
   File "/u01/app/zope-2.8.7/lib/python/ZEO/zrpc/connection.py", line 531, in call
     r_flags, r_args = self.wait(msgid)
   File "/u01/app/zope-2.8.7/lib/python/ZEO/zrpc/connection.py", line 638, in wait
     asyncore.poll(delay, self._singleton)
   File "/u01/app/python/lib/python2.3/asyncore.py", line 119, in poll
     read(obj)
   File "/u01/app/python/lib/python2.3/asyncore.py", line 73, in read
     obj.handle_error()
   File "/u01/app/zope-2.8.7/lib/python/ZEO/zrpc/connection.py", line 453, in handle_error
     level=logging.ERROR, exc_info=True)
   File "/u01/app/zope-2.8.7/lib/python/ZEO/zrpc/connection.py", line 327, in log
     self.logger.log(level, self.log_label + message, exc_info=exc_info)
   File "/u01/app/python/lib/python2.3/logging/__init__.py", line 959, in log
     apply(self._log, (level, msg, args), kwargs)
   File "/u01/app/python/lib/python2.3/logging/__init__.py", line 994, in _log
     self.handle(record)
   File "/u01/app/python/lib/python2.3/logging/__init__.py", line 1004, in handle
     self.callHandlers(record)
   File "/u01/app/python/lib/python2.3/logging/__init__.py", line 1037, in callHandlers
     hdlr.handle(record)
   File "/u01/app/python/lib/python2.3/logging/__init__.py", line 588, in handle
     rv = self.filter(record)
 RuntimeError: maximum recursion depth exceeded
 Traceback (most recent call last):
   File "<stdin>", line 1, in ?
   File "/u01/app/python/lib/python2.3/copy.py", line 222, in deepcopy
     y = _reconstruct(x, rv, 1, memo)
   File "/u01/app/python/lib/python2.3/copy.py", line 369, in _reconstruct
     state = deepcopy(state, memo)
   File "/u01/app/python/lib/python2.3/copy.py", line 192, in deepcopy

 [ ... Snipped many, many duplicate lines of recursive traceback ...]

     y = copier(x, memo)
   File "/u...

Revision history for this message
Tres Seaver (tseaver) wrote :

Can you please provide a failing testcase against OFS.Image.File which does *not* create the File object by passing a file to its constructor? E.g.:

  def test_cant_pickle_PData(self):
      file = self._makeFile()
      file.manage_upload(self._makeReallyBigTempFile())
      s = cPickle.dumps(file)

This is the reason that 'OFS.Image.manage_addFile' adds a new, empty File first and only calls manage_upload after seating the object in its container: the Pdata optimization fails for Files constructed with file objects (because there is no _p_jar yet).

Changed in zope2:
status: New → Incomplete
Revision history for this message
Darryl Dixon (esrever-otua) wrote :

The pdatatest.py that is already attached to this bug does this.

Revision history for this message
Colin Watson (cjwatson) wrote :

The zope2 project on Launchpad has been archived at the request of the Zope developers (see https://answers.launchpad.net/launchpad/+question/683589 and https://answers.launchpad.net/launchpad/+question/685285). If this bug is still relevant, please refile it at https://github.com/zopefoundation/zope2.

Changed in zope2:
status: Incomplete → Invalid