Frequent BackendExceptions lately (SFTP)

Bug #946992 reported by Daniel Andersson
This bug affects 5 people
Affects     Status        Importance   Assigned to   Milestone
Duplicity   New           Undecided    Unassigned
Python      Fix Released  Unknown

Bug Description

I have had a backup script using duplicity running for a long time. Recently (I have noticed it since mid-January, perhaps related to the introduction of python-paramiko) it frequently, but not always, fails with the following. I am using the switches "--asynchronous-upload -v5"; I haven't captured a "-v9" report since the failure doesn't always happen:

"""
AsyncScheduler: task execution done (success: False)
....[some files added to next volume]...
AsyncScheduler: scheduling task for asynchronous execution
AsyncScheduler: a previously scheduled task has failed; propagating the result immediately
Backend error detail: Traceback (most recent call last):
  File "/usr/bin/duplicity", line 1388, in <module>
    with_tempdir(main)
  File "/usr/bin/duplicity", line 1381, in with_tempdir
    fn()
  File "/usr/bin/duplicity", line 1351, in main
    full_backup(col_stats)
  File "/usr/bin/duplicity", line 500, in full_backup
    globals.backend)
  File "/usr/bin/duplicity", line 399, in write_multivol
    (tdp, dest_filename, vol_num)))
  File "/usr/lib/python2.7/dist-packages/duplicity/asyncscheduler.py", line 151, in schedule_task
    return self.__run_asynchronously(fn, params)
  File "/usr/lib/python2.7/dist-packages/duplicity/asyncscheduler.py", line 215, in __run_asynchronously
    with_lock(self.__cv, wait_for_and_register_launch)
  File "/usr/lib/python2.7/dist-packages/duplicity/dup_threading.py", line 100, in with_lock
    return fn()
  File "/usr/lib/python2.7/dist-packages/duplicity/asyncscheduler.py", line 196, in wait_for_and_register_launch
    check_pending_failure() # raise on fail
  File "/usr/lib/python2.7/dist-packages/duplicity/asyncscheduler.py", line 191, in check_pending_failure
    self.__failed_waiter()
  File "/usr/lib/python2.7/dist-packages/duplicity/dup_threading.py", line 201, in caller
    value = fn()
  File "/usr/lib/python2.7/dist-packages/duplicity/asyncscheduler.py", line 183, in <lambda>
    (waiter, caller) = async_split(lambda: fn(*params))
  File "/usr/bin/duplicity", line 398, in <lambda>
    async_waiters.append(io_scheduler.schedule_task(lambda tdp, dest_filename, vol_num: put(tdp, dest_filename, vol_num),
  File "/usr/bin/duplicity", line 296, in put
    backend.put(tdp, dest_filename)
  File "/usr/lib/python2.7/dist-packages/duplicity/backends/sshbackend.py", line 189, in put
    raise BackendException("sftp put of %s (as %s) failed: %s" % (source_path.name,remote_filename,e))
BackendException: sftp put of /media/5-2000/tmp/duplicity-tmp/duplicity-CyAVNR-tempdir/mktemp-d28oCp-409 (as duplicity-full.20120207T111003Z.vol408.difftar.gpg) failed: Server connection dropped:
"""

and the following is written to STDERR:

"""
No handlers could be found for logger "paramiko.transport"
BackendException: sftp put of /media/5-2000/tmp/duplicity-tmp/duplicity-CyAVNR-tempdir/mktemp-d28oCp-409 (as duplicity-full.20120207T111003Z.vol408.difftar.gpg) failed: Server connection dropped:
"""

It sounds like a local connection error, but it has never happened before and the error mentions paramiko explicitly. I have recently started using "--asynchronous-upload", but I believe it happened before that as well (it is hard to track down since it seems to happen randomly).

It happens more often during full backups, when 20 GB+ is transferred (default volume size 25 MB), than during incremental backups, when typically only a few volumes are transferred.

Perhaps it is a local connection error that has always existed, but the previous SFTP backend retried by default instead of failing.

Also reported as Debian bug #659007, but it has gotten no attention for a month.

Duplicity 0.6.17
Python 2.7.2+
Debian Sid
Linux

Revision history for this message
Daniel Andersson (drandersson) wrote :

Re-ran with "-v9" until an error occurred. It confirms my suspicion that it has to do with the switch to python-paramiko.

STDOUT (which I log to /var/log) is very similar to the -v5 one reported above. It is attached as "duplicity-error-full-stdout-log-v9"

STDERR gave more information. Full output (except repeated lines as indicated):

-----

ssh: starting thread (client mode): 0xa11d48cL
ssh: Connected (version 2.0, client OpenSSH_5.3p1)
ssh: kex algos:['diffie-hellman-group-exchange-sha256', 'diffie-hellman-group-exchange-sha1', 'diffie-hellman-group14-sha1', 'diffie-hellman-group1-sha1'] server key:['ssh-rsa', 'ssh-dss'] client encrypt:['aes128-ctr', 'aes192-ctr', 'aes256-ctr', 'arcfour256', 'arcfour128', 'aes128-cbc', '3des-cbc', 'blowfish-cbc', 'cast128-cbc', 'aes192-cbc', 'aes256-cbc', 'arcfour', '<email address hidden>'] server encrypt:['aes128-ctr', 'aes192-ctr', 'aes256-ctr', 'arcfour256', 'arcfour128', 'aes128-cbc', '3des-cbc', 'blowfish-cbc', 'cast128-cbc', 'aes192-cbc', 'aes256-cbc', 'arcfour', '<email address hidden>'] client mac:['hmac-md5', 'hmac-sha1', '<email address hidden>', 'hmac-ripemd160', '<email address hidden>', 'hmac-sha1-96', 'hmac-md5-96'] server mac:['hmac-md5', 'hmac-sha1', '<email address hidden>', 'hmac-ripemd160', '<email address hidden>', 'hmac-sha1-96', 'hmac-md5-96'] client compress:['none', '<email address hidden>'] server compress:['none', '<email address hidden>'] client lang:[''] server lang:[''] kex follows?False
ssh: Ciphers agreed: local=aes128-ctr, remote=aes128-ctr
ssh: using kex diffie-hellman-group1-sha1; server key type ssh-rsa; cipher: local aes128-ctr, remote aes128-ctr; mac: local hmac-sha1, remote hmac-sha1; compression: local none, remote none
ssh: Switch to new keys ...
ssh: Trying discovered key xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx in /root/.ssh/id_rsa
ssh: userauth is OK
ssh: Authentication (publickey) successful!
ssh: [chan 1] Max packet in: 34816 bytes
ssh: [chan 1] Max packet out: 32768 bytes
ssh: Secsh channel 1 opened.
ssh: [chan 1] Sesch channel 1 request ok
ssh: [chan 1] Opened sftp connection (server version 3)
ssh: [chan 1] stat('backup')
ssh: [chan 1] stat('backup')
ssh: [chan 1] normalize('backup')
ssh: [chan 1] stat('/home/daniel/backup/home')
ssh: [chan 1] stat('/home/daniel/backup/home')
ssh: [chan 1] normalize('/home/daniel/backup/home')
ssh: [chan 1] stat('/home/daniel/backup/home/daniel')
ssh: [chan 1] stat('/home/daniel/backup/home/daniel')
ssh: [chan 1] normalize('/home/daniel/backup/home/daniel')
ssh: [chan 1] listdir('/home/daniel/backup/home/daniel/.')
ssh: [chan 1] listdir('/home/daniel/backup/home/daniel/.')

...[SNIP of second thread initializing, giving ~identical output to all previous lines]...

ssh: [chan 1] open('/home/daniel/backup/home/daniel/duplicity-full.20120305T105839Z.vol1.difftar.gpg', 'wb')
Exception AttributeError: AttributeError("'_DummyThread' object has no attribute '_Thread__block'",) in <module 'threading' from '/usr/lib/python2.7/threading.pyc'> ignored
ssh: [chan 1] open('/home/daniel/backup/home/daniel/duplicity-full.20120305T105839Z.vol1.difftar.gpg', 'wb') -> 00000000
ssh: [chan 1] close(00000000)
ssh: [chan 1] stat('/home/daniel/backup/hom...


Revision history for this message
Daniel Andersson (drandersson) wrote :

Did personal pseudo-fix by using the rsync backend instead of scp.

Revision history for this message
Herbert V. Riedel (hvr) wrote :
Changed in python:
status: Unknown → New
Revision history for this message
Kristian Kostecky (kris-gq) wrote :

Guys,

I've experienced this happening. From a quick scan of the code, alongside anecdotal testing, it appears that if you are using the paramiko ssh backend and your SSH connection disappears, duplicity (even with --num-retries working, in the newer versions) will just retry the SFTP operation --num-retries times. It will, however, NOT attempt to re-open the SSH session before reissuing the SFTP command, which I believe it should.

The important thing to note in the ssh error is "failed: Server connection dropped", followed by the repetition of the sftp operation with consistent failures, no matter the value of --num-retries, even while independent testing shows that the remote ssh server is healthy and accepting new connections.
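The failure pattern described above can be sketched in a few lines of Python. These names are illustrative stand-ins, not duplicity's actual classes: once the underlying transport has died, retrying the same session object can never succeed.

```python
# Illustrative sketch (not duplicity's real code): a retry loop that
# reuses a dead SFTP session. Once the transport has dropped, every
# retry raises again, so --num-retries alone cannot recover.

class ServerConnectionDropped(Exception):
    pass

class DeadSession:
    """Stands in for an SFTP client whose underlying SSH transport died."""
    def put(self, local_path, remote_path):
        raise ServerConnectionDropped("Server connection dropped")

def put_with_retries(session, local_path, remote_path, num_retries=5):
    last_err = None
    for _ in range(num_retries):
        try:
            # The same (dead) session object is reused on every attempt.
            return session.put(local_path, remote_path)
        except ServerConnectionDropped as e:
            last_err = e
    raise last_err
```

Every attempt hits the same dead session, so the loop exhausts its retries and re-raises, matching the repeated "Server connection dropped" failures seen in the logs.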

Revision history for this message
Edward Rudd (urkle-outoforder) wrote :

From looking through the source code of Paramiko, it does not have the ability to reconnect the SFTP client, so duplicity needs to be reworked to catch the SSHException and handle the reconnect itself.

For now, as a workaround, I am simply telling duplicity to use rsync instead.
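A minimal sketch of the kind of rework being suggested, with hypothetical helper names rather than duplicity's actual API: on failure, re-establish the session before retrying instead of reissuing the operation on the dead one.

```python
# Hedged sketch of a reconnect-then-retry strategy. `connect` is a
# zero-argument factory that opens a fresh session (in duplicity's case
# it would rebuild the paramiko transport and SFTP client); `upload`
# performs one transfer attempt against the session it is given.

class ServerConnectionDropped(Exception):
    pass

def put_with_reconnect(connect, upload, num_retries=5):
    session = connect()
    last_err = None
    for _ in range(num_retries):
        try:
            return upload(session)
        except ServerConnectionDropped as e:
            last_err = e
            session = connect()  # key change: reopen instead of reusing
    raise last_err
```

With this shape, a single dropped connection costs one retry rather than consuming the whole --num-retries budget against a session that can never recover.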

Changed in python:
status: New → Fix Released