MemoryError occurs with large signature files.

Bug #686839 reported by Tom Eastman
This bug affects 5 people
Affects    Status  Importance  Assigned to  Milestone
Duplicity  New     Undecided   Unassigned

Bug Description

I just tried switching from the SFTP backend to the WebDAV backend, but I am unable to do so because the WebDAV backend crashes when trying to retrieve the signature file from my backup set.

================================================================================
duplicity 0.6.09 (July 25, 2010)
Args: /usr/bin/duplicity -v5 --name XXXX_full --volsize 500 --exclude-globbing-filelist /etc/duplicity/ALL.exclude --include-globbing-filelist /etc/duplicity/ALL.include / webdavs://<email address hidden>/duplicity/XXXX.XXXX
Linux XXXXXXXXXXX 2.6.26-2-xen-amd64 #1 SMP Thu Sep 16 16:32:15 UTC 2010 x86_64
/usr/bin/python 2.5.2 (r252:60911, Jan 24 2010, 17:44:40)
[GCC 4.3.2]
================================================================================

Synchronizing remote metadata to local cache...
Deleting local /home/XXXXXX/.cache/duplicity/XXXXXXXX_full/duplicity-full-signatures.20101110T214029Z.sigtar.gz (not authoritative at backend).
Deleting local /home/XXXX/.cache/duplicity/XXXXXXXXXX/duplicity-full.20101110T214029Z.manifest (not authoritative at backend).
Copying duplicity-full-signatures.20100917T120401Z.sigtar to local cache.
Retrieving /duplicity/XXXXXXXXX/duplicity-full-signatures.20100917T120401Z.sigtar.gpg from WebDAV server
Traceback (most recent call last):
  File "/usr/bin/duplicity", line 1251, in <module>
    with_tempdir(main)
  File "/usr/bin/duplicity", line 1244, in with_tempdir
    fn()
  File "/usr/bin/duplicity", line 1145, in main
    sync_archive()
  File "/usr/bin/duplicity", line 959, in sync_archive
    copy_to_local(fn)
  File "/usr/bin/duplicity", line 911, in copy_to_local
    fileobj = globals.backend.get_fileobj_read(rem_name)
  File "/usr/lib/python2.5/site-packages/duplicity/backend.py", line 463, in get_fileobj_read
    self.get(filename, tdp)
  File "/usr/lib/python2.5/site-packages/duplicity/backends/webdavbackend.py", line 235, in get
    target_file.write(response.read())
  File "/usr/lib/python2.5/httplib.py", line 516, in read
    s = self._safe_read(self.length)
  File "/usr/lib/python2.5/httplib.py", line 607, in _safe_read
    return ''.join(s)
MemoryError

If I read this correctly, 'target_file.write(response.read())' reads the entire response into a string in memory before writing it to a file. This particular signature file is 1.2 gigabytes, so duplicity exhausts all RAM. In general, I'd bet the signature file will ALWAYS be big enough that you don't want to hold it as a string in RAM.
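For illustration, here is a minimal sketch of the alternative: copying the response to the target file in fixed-size chunks, so that only one chunk is held in memory at a time. It is not duplicity's actual fix. The function name `copy_in_chunks` and the `CHUNK_SIZE` value are hypothetical, and it assumes only that the response object supports `read(size)`, as file-like objects and httplib responses do.

```python
import io

# Hypothetical buffer size, chosen for illustration only.
CHUNK_SIZE = 64 * 1024


def copy_in_chunks(response, target_file, chunk_size=CHUNK_SIZE):
    """Copy a file-like response into target_file one chunk at a time.

    Only chunk_size bytes are held in memory at once, instead of the
    whole body as with target_file.write(response.read()).
    Returns the number of bytes copied.
    """
    total = 0
    while True:
        chunk = response.read(chunk_size)
        if not chunk:
            # An empty read means the end of the stream was reached.
            break
        target_file.write(chunk)
        total += len(chunk)
    return total


if __name__ == "__main__":
    # Stand-ins for the HTTP response and the local cache file.
    source = io.BytesIO(b"x" * (200 * 1024))
    sink = io.BytesIO()
    copied = copy_in_chunks(source, sink)
    print("copied %d bytes" % copied)
```

The standard library's `shutil.copyfileobj(fsrc, fdst, length)` does the same chunked copy and could be used instead of the explicit loop.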

My workaround for the moment will just have to be to return to the SFTP backend, which doesn't have this problem.

mfitz (mfitz) wrote :

I'm getting issues with signature files getting too big for memory as well. I've had it happen on restores, and more recently during a full backup.

I've tried a 2 GB swapfile on a low-spec 512 MB machine and it still fails. My backups are 48 GB.

mfitz (mfitz)
summary: - MemoryError occurs when using WebDAV backend on large files.
+ MemoryError occurs with large signature files.
c sights (cwseys) wrote :

I see a similar problem with the pydrive backend.
It would possibly be resolved by fixing
"Large backup signature and manifest files should be split with --volsize too"
https://bugs.launchpad.net/duplicity/+bug/385495

Writing duplicity-full-signatures.20150928T212314Z.sigtar.gz
PyDrive backend: file 'duplicity-full-signatures.20150928T212314Z.sigtar.gz' not found in cache or on server
PyDrive backend: creating new file 'duplicity-full-signatures.20150928T212314Z.sigtar.gz'
Backtrace of previous error: Traceback (innermost last):
  File "/usr/lib/python2.7/dist-packages/duplicity/backend.py", line 365, in inner_retry
    return fn(self, *args)
  File "/usr/lib/python2.7/dist-packages/duplicity/backend.py", line 531, in move
    self.__do_put(source_path, remote_filename)
  File "/usr/lib/python2.7/dist-packages/duplicity/backend.py", line 501, in __do_put
    self.backend._put(source_path, remote_filename)
  File "/usr/lib/python2.7/dist-packages/duplicity/backends/pydrivebackend.py", line 133, in _put
    drive_file.Upload()
  File "/usr/local/lib/python2.7/dist-packages/pydrive/files.py", line 225, in Upload
    self._FilesInsert(param=param)
  File "/usr/local/lib/python2.7/dist-packages/pydrive/auth.py", line 54, in _decorated
    return decoratee(self, *args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/pydrive/files.py", line 241, in _FilesInsert
    metadata = self.auth.service.files().insert(**param).execute()
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/discovery.py", line 795, in method
    payload = media_upload.getbytes(0, media_upload.size())
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 357, in getbytes
    return self._fd.read(length)
 MemoryError

Giving up after 2 attempts. MemoryError:

John Leach (johnleach) wrote :

I think this might get fixed by the ongoing work to split the signature database in #385495.

Bruce Pieterse (octoquad) wrote :

I've been getting this pretty frequently lately as well, but with the SSH File Transfer Protocol backend (ssh://user@host). The stack trace is different, but I thought I would share it here. Please let me know if you want me to create a new bug report for this specific backtrace and memory error:

Traceback (most recent call last):
  File "/usr/bin/duplicity", line 1546, in <module>
    with_tempdir(main)
  File "/usr/bin/duplicity", line 1540, in with_tempdir
    fn()
  File "/usr/bin/duplicity", line 1391, in main
    do_backup(action)
  File "/usr/bin/duplicity", line 1522, in do_backup
    incremental_backup(sig_chain)
  File "/usr/bin/duplicity", line 662, in incremental_backup
    bytes_written = dummy_backup(tarblock_iter)
  File "/usr/bin/duplicity", line 238, in dummy_backup
    while tarblock_iter.next():
  File "/usr/lib/python2.7/dist-packages/duplicity/diffdir.py", line 523, in next
    result = self.process(self.input_iter.next())
  File "/usr/lib/python2.7/dist-packages/duplicity/diffdir.py", line 218, in get_delta_iter
    (new_path, sig_path, sigTarFile))
  File "/usr/lib/python2.7/dist-packages/duplicity/robust.py", line 38, in check_common_error
    return function(*args)
  File "/usr/lib/python2.7/dist-packages/duplicity/diffdir.py", line 139, in get_delta_path
    delta_path.setfileobj(librsync.DeltaFile(old_sigfp, newfp))
  File "/usr/lib/python2.7/dist-packages/duplicity/librsync.py", line 151, in __init__
    sig_string = signature.read()
  File "/usr/lib/python2.7/tarfile.py", line 829, in read
    buf += self.fileobj.read()
  File "/usr/lib/python2.7/tarfile.py", line 743, in read
    return self.readnormal(size)
  File "/usr/lib/python2.7/tarfile.py", line 758, in readnormal
    return self.__read(size)
  File "/usr/lib/python2.7/tarfile.py", line 748, in __read
    buf = self.fileobj.read(size)
  File "/usr/lib/python2.7/gzip.py", line 268, in read
    self._read(readsize)
  File "/usr/lib/python2.7/gzip.py", line 320, in _read
    self._add_read_data( uncompress )
  File "/usr/lib/python2.7/gzip.py", line 338, in _add_read_data
    self.extrabuf = self.extrabuf[offset:] + data
MemoryError

This only started occurring on Ubuntu 17.10. Removing the remote backup files and starting fresh normally works for a while, but incremental backups seem to be causing the issue.

$ apt-cache policy duplicity
duplicity:
  Installed: 0.7.12-1ubuntu1
  Candidate: 0.7.12-1ubuntu1
  Version table:
 *** 0.7.12-1ubuntu1 500
        500 http://za.archive.ubuntu.com/ubuntu artful/main amd64 Packages
        100 /var/lib/dpkg/status

$ apt-cache policy deja-dup
deja-dup:
  Installed: 36.2-0ubuntu1
  Candidate: 36.2-0ubuntu1
  Version table:
 *** 36.2-0ubuntu1 500
        500 http://za.archive.ubuntu.com/ubuntu artful/main amd64 Packages
        100 /var/lib/dpkg/status

Jussi Pollari (jussipol) wrote :

I seem to be getting MemoryError constantly when trying to restore a big backup:

Duplicity version: duplicity 0.8.17
Synchronizing remote metadata to local cache...
Copying duplicity-full-signatures.20201108T233647Z.sigtar.gpg to local cache.
Copying duplicity-full.20201108T233647Z.manifest.gpg to local cache.
Copying duplicity-inc.20201108T233647Z.to.20201110T040042Z.manifest.gpg to local cache.
Copying duplicity-inc.20201110T040042Z.to.20201111T123355Z.manifest.gpg to local cache.
Copying duplicity-inc.20201111T123355Z.to.20201112T035043Z.manifest.gpg to local cache.
Copying duplicity-new-signatures.20201108T233647Z.to.20201110T040042Z.sigtar.gpg to local cache.
Copying duplicity-new-signatures.20201110T040042Z.to.20201111T123355Z.sigtar.gpg to local cache.
Copying duplicity-new-signatures.20201111T123355Z.to.20201112T035043Z.sigtar.gpg to local cache.
Traceback (innermost last):
  File "/usr/local/bin/duplicity", line 117, in <module>
    with_tempdir(main)
  File "/usr/local/bin/duplicity", line 103, in with_tempdir
    fn()
  File "/usr/local/lib/python3.6/dist-packages/duplicity/dup_main.py", line 1535, in main
    do_backup(action)
  File "/usr/local/lib/python3.6/dist-packages/duplicity/dup_main.py", line 1561, in do_backup
    sync_archive(col_stats)
  File "/usr/local/lib/python3.6/dist-packages/duplicity/dup_main.py", line 1344, in sync_archive
    col_stats.set_values()
  File "/usr/local/lib/python3.6/dist-packages/duplicity/dup_collections.py", line 741, in set_values
    self.get_backup_chains(partials + backend_filename_list)
  File "/usr/local/lib/python3.6/dist-packages/duplicity/dup_collections.py", line 868, in get_backup_chains
    add_to_sets(f)
  File "/usr/local/lib/python3.6/dist-packages/duplicity/dup_collections.py", line 862, in add_to_sets
    if new_set.add_filename(filename, pr):
  File "/usr/local/lib/python3.6/dist-packages/duplicity/dup_collections.py", line 114, in add_filename
    self.set_manifest(filename)
  File "/usr/local/lib/python3.6/dist-packages/duplicity/dup_collections.py", line 165, in set_manifest
    self.set_files_changed()
  File "/usr/local/lib/python3.6/dist-packages/duplicity/dup_collections.py", line 141, in set_files_changed
    mf = self.get_manifest()
  File "/usr/local/lib/python3.6/dist-packages/duplicity/dup_collections.py", line 268, in get_manifest
    return self.get_local_manifest()
  File "/usr/local/lib/python3.6/dist-packages/duplicity/dup_collections.py", line 246, in get_local_manifest
    return manifest.Manifest().from_string(manifest_buffer)
  File "/usr/local/lib/python3.6/dist-packages/duplicity/manifest.py", line 213, in from_string
    for match in vi_iterator:
 MemoryError

The sigtar file mentioned last in the log is also the last sigtar file in the backup's S3 bucket folder, so I am not sure whether the MemoryError occurs during the download or after it. The sigtar file sizes are: 10.3 GB, 604.5 MB, 514.4 MB, 385.7 MB.

Duplicity restore parameters used are:
    --encrypt-sign-key="${GPG_KEY}" \
    --file-to-restore "${2}" \
    --s3-use-new-style \
    --s3-use-multiprocessing \
    --archive-dir="${ARCHIVE_DIR}" \
    --verbosity=${VERBOSITY} \
    --restore-time "$...


Jussi Pollari (jussipol) wrote :

Alright, it looks like the memory error is indeed related to... memory. I was trying to restore the above backup on an instance with 16 GB of memory and constantly got that error. I switched to an instance with 64 GB of memory and it looks like it works. This is of course not a good solution. Any ideas on how to fix this issue, or work around it in a nicer way?
