Timeout on sftp command 'ls -1'

Bug #794123 reported by Daniel Werner
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Duplicity
Fix Released
Medium
Unassigned

Bug Description

We use Duplicity to backup our Linux fileserver, with incremental backups every hour from 9 am to 7 pm and a full backup once a week. Recently, these backups have been started to fail while trying to get the existing file list on the target server via SFTP. At first, they mostly succeeded after a random amount of tries, now they fail at almost all times:

$ duplicity incr --asynchronous-upload --volsize 50 --no-encryption --verbosity=5 --archive-dir=/home/user/.cache/duplicity/ --log-file=/home/user/logs/duplicity.log --allow-source-mismatch --exclude='**.tmp' --exclude='**/~$*' --include=/home/some/path --include=/home/other/path --exclude='**' /home/ ssh://backup@192.168.178.8//srv/backups/fileserver

Running 'sftp backup@192.168.178.8' (attempt #1)
sftp command: 'mkdir /srv/backups/fileserver'
sftp command: 'cd /srv/backups/fileserver'
sftp command: 'ls -1'
Timeout waiting for response
Running 'sftp backup@192.168.178.8' failed (attempt #1)
Running 'sftp backup@192.168.178.8' (attempt #2)
[ ... several more attempts ... ]
Running 'sftp backup@192.168.178.8' (attempt #5)
sftp command: 'mkdir /srv/backups/fileserver'
sftp command: 'cd /srv/backups/fileserver'
sftp command: 'ls -1'
Timeout waiting for response
Running 'sftp backup@192.168.178.8' failed (attempt #5)
Giving up trying to execute 'sftp backup@192.168.178.8' after 5 attempts
Using temporary directory /tmp/duplicity-dnUTX7-tempdir
Traceback (most recent call last):
  File "/usr/bin/duplicity", line 583, in ?
    with_tempdir(main)
  File "/usr/bin/duplicity", line 577, in with_tempdir
    fn()
  File "/usr/bin/duplicity", line 506, in main
    globals.archive_dir).set_values()
  File "/usr/lib/python2.4/site-packages/duplicity/collections.py", line 524, in set_values
    backend_filename_list = self.backend.list()
  File "/usr/lib/python2.4/site-packages/duplicity/backends/sshbackend.py", line 230, in list
    l = self.run_sftp_command(commandline, commands).split('\n')[1:]
  File "/usr/lib/python2.4/site-packages/duplicity/backends/sshbackend.py", line 194, in run_sftp_command
    raise BackendException("Error running '%s'" % commandline)
BackendException: Error running 'sftp backup@192.168.178.8'

It seems that the initial 'ls -1' issued via SFTP is already taking too long, having to download a list of over 17.000 files. Running with --verbosity=9 showed that sftp 'ls -1' was indeed returning a list of files, but stopped transferring it right in the middle of some filename, presumably because it ran into the timeout.

What I have already tried to remedy this:

1. --ssh-options='-oConnectTimeout=999' did not make a difference.
2. Deleting some of the older backups chains and raising volsize to 50 MB helped for a short time, but the timeouts soon started to crop up again.
3. Setting the global timeout value in globals.py to "timeout = 60", up from the default of 30.

No. 3 has actually been working now for the last couple of days, but is rather a hack/workaround rather than a real solution. Perhaps the best thing to do would be to add an --sftp-timeout or --timeout option to override the global setting? I would happily provide a patch if the developers agree to this solution.

Duplicity version: 0.5.06 (the same timeout code seems to be present in 0.6.13 as well)
Python version: 2.4.4
OS Distro and version: Debian GNU/Linux 4.0 (etch)
Target filesystem: Linux

I can reproduce the problem with --verbosity=9 if requested.

Revision history for this message
Kenneth Loafman (kenneth-loafman) wrote :

You just found a bug in the documentation, and in the code. There is a --timeout option, but it does not actually set anything appropriate. Plus, it seems to be missing in the man page. Will get it fixed. Thanks!

Changed in duplicity:
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Kenneth Loafman (kenneth-loafman)
milestone: none → 0.6.14
Revision history for this message
Kenneth Loafman (kenneth-loafman) wrote :

Never mind the code issue. It was cleverly concealed. It's just not in the man page.

Changed in duplicity:
assignee: Kenneth Loafman (kenneth-loafman) → nobody
status: In Progress → Fix Committed
Changed in duplicity:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.