num-retries ignored for some http errors

Bug #709973 reported by az
This bug affects 2 people
Affects    Status        Importance  Assigned to  Milestone
Duplicity  Fix Released  Medium      edso         -
Debian     Fix Released  Unknown     -            -

Bug Description

This is a forward of Debian bug #611465, which lives here: http://bugs.debian.org/611465

Here is the original report:

Duplicity just stopped uploading a backup to a WebDAV server after receiving an HTTP 500 error, after having successfully uploaded 104 archives just before.
According to the log there were no retries:
-----
A somefile
AsyncScheduler: scheduling task for asynchronous execution
WebDAV PUT attempt #1 failed: 500 Internal Server Error
Saving /somepath/duplicity/duplicity-full.20110129T094436Z.vol105.difftar.gpg on WebDAV server
AsyncScheduler: a previously scheduled task has failed; propagating the result immediately
AsyncScheduler: task execution done (success: False)
Traceback (most recent call last):
  File "/usr/bin/duplicity", line 1251, in <module>
    with_tempdir(main)
  File "/usr/bin/duplicity", line 1244, in with_tempdir
    fn()
-----
Command line used:
/usr/bin/duplicity --verbosity info --log-file "$HOME/backup-logs/backup-dfrank.$STARTTIME.log" --encrypt-key=12345 --sign-key=12345 --asynchronous-upload --num-retries 20 --include-globbing-filelist "$HOME/backup.excl" --full-if-older-than 1M /something webdavs://<email address hidden>/somepath/duplicity

The WebDAV server is humyo.de; apparently their server sometimes returns an error even when everything is fine.
In my opinion it would be good to simply stick to the configured number of retries, ideally after a short waiting period (e.g. 10 seconds), perhaps behind an option like "--retry-all-errors".
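
For illustration, the desired behaviour boils down to something like the following minimal Python sketch (a hypothetical helper, not duplicity's actual code): retry on any error up to the configured limit, pausing briefly between attempts.

-----
import time


def put_with_retries(put_fn, num_retries=20, wait_seconds=10):
    """Retry an upload on ANY error, up to num_retries attempts.

    Hypothetical helper illustrating the suggestion above; this is
    not duplicity's actual code.
    """
    for attempt in range(1, num_retries + 1):
        try:
            return put_fn()  # the actual upload, e.g. a WebDAV PUT
        except Exception as exc:
            print("PUT attempt #%d failed: %s" % (attempt, exc))
            if attempt == num_retries:
                raise  # retries exhausted, propagate the failure
            time.sleep(wait_seconds)  # short pause before the next attempt
-----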

Revision history for this message
Tokuko (launchpad-net-tokuko) wrote :

A broken TCP connection seems to result in the same issue (at least with webdavs):

INFO 1
. Args: /usr/bin/duplicity --verbosity info --log-file /root/backup-logs/backup-dfrank.2011-02-05 13:18.log --encrypt-key=12345 --sign-key=12345 --asynchronous-upload --num-retries 20 --include-globbing-filelist /root/backup.dfrank.excl --full-if-older-than 3M /something webdavs://<email address hidden>/somepath/duplicity

[... removed about 6000 lines ...]

INFO 12
. AsyncScheduler: scheduling task for asynchronous execution

INFO 14
. AsyncScheduler: task execution done (success: False)

INFO 14
. AsyncScheduler: a previously scheduled task has failed; propagating the result immediately

ERROR 30 SSLError
. Traceback (most recent call last):
.   File "/usr/bin/duplicity", line 1251, in <module>
.     with_tempdir(main)
.   File "/usr/bin/duplicity", line 1244, in with_tempdir
.     fn()
.   File "/usr/bin/duplicity", line 1217, in main
.     full_backup(col_stats)
.   File "/usr/bin/duplicity", line 416, in full_backup
.     globals.backend)
.   File "/usr/bin/duplicity", line 315, in write_multivol
.     (tdp, dest_filename)))
.   File "/usr/lib/python2.6/dist-packages/duplicity/asyncscheduler.py", line 151, in schedule_task
.     return self.__run_asynchronously(fn, params)
.   File "/usr/lib/python2.6/dist-packages/duplicity/asyncscheduler.py", line 215, in __run_asynchronously
.     with_lock(self.__cv, wait_for_and_register_launch)
.   File "/usr/lib/python2.6/dist-packages/duplicity/dup_threading.py", line 100, in with_lock
.     return fn()
.   File "/usr/lib/python2.6/dist-packages/duplicity/asyncscheduler.py", line 207, in wait_for_and_register_launch
.     check_pending_failure() # raise on fail
.   File "/usr/lib/python2.6/dist-packages/duplicity/asyncscheduler.py", line 191, in check_pending_failure
.     self.__failed_waiter()
.   File "/usr/lib/python2.6/dist-packages/duplicity/dup_threading.py", line 201, in caller
.     value = fn()
.   File "/usr/lib/python2.6/dist-packages/duplicity/asyncscheduler.py", line 183, in <lambda>
.     (waiter, caller) = async_split(lambda: fn(*params))
.   File "/usr/bin/duplicity", line 314, in <lambda>
.     async_waiters.append(io_scheduler.schedule_task(lambda tdp, dest_filename: put(tdp, dest_filename),
.   File "/usr/bin/duplicity", line 240, in put
.     backend.put(tdp, dest_filename)
.   File "/usr/lib/python2.6/dist-packages/duplicity/backends/webdavbackend.py", line 251, in put
.     response = self.request("PUT", url, source_file.read())
.   File "/usr/lib/python2.6/dist-packages/duplicity/backends/webdavbackend.py", line 106, in request
.     self.conn.request(method, quoted_path, data, self.headers)
.   File "/usr/lib/python2.6/httplib.py", line 914, in request
.     self._send_request(method, url, body, headers)
.   File "/usr/lib/python2.6/httplib.py", line 954, in _send_request
.     self.send(body)
.   File "/usr/lib/python2.6/httplib.py", line 759, in send
.     self.sock.sendall(str)
.   File "/usr/lib/python2.6/ssl.py", line 203, in sendall
.     v = self.send(data[count:])
.   File "/usr/lib/python2.6/ssl.py", line 174, in send

Currently duplicity has not been able to finish a b...


Changed in debian:
status: Unknown → Confirmed
Revision history for this message
AndyS (a-salnikov) wrote :

I have experienced the same problem when I switched to box.com with the webdav backend. box.com apparently likes to drop connections, which causes exceptions in the ssl module and crashes duplicity. The retry logic does not help because it only checks HTTP response codes, not exceptions.

I made a simple patch for the webdav backend which seems to work OK for me. It wraps the calls to self.request() in try ... except. If an exception is caught, the connection is re-opened (just in case it was dropped; I do not know how to precisely detect a dropped connection, so I re-open it unconditionally) and the next attempt is made. I modified all four methods (list(), get(), put(), delete()); this should make the backend more robust, but I am still not sure the error handling is 100% waterproof.
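
In outline, the approach looks roughly like the sketch below (the names backend.request(), backend._connect() and the retry count are assumptions here; the actual patch may differ):

-----
import socket
import ssl


def request_with_reconnect(backend, method, path, data=None, retries=20):
    """Sketch of the patch's idea; backend.request() and backend._connect()
    are assumed names, not duplicity's real API."""
    for attempt in range(1, retries + 1):
        try:
            # Retry on network-level exceptions, not only on bad HTTP codes.
            return backend.request(method, path, data)
        except (ssl.SSLError, socket.error):
            if attempt == retries:
                raise  # retries exhausted, let the error propagate
            # The server may have silently dropped the connection; that is
            # hard to detect reliably, so re-open it unconditionally.
            backend._connect()
-----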

I am including this patch in the hope that it may be useful to others or could be integrated into duplicity. The patch also includes a fix for bug #943001 (the listbody change).

Revision history for this message
Kenneth Loafman (kenneth-loafman) wrote :

ede, please look at this approach instead of retry_fatal. It may solve a few more problems.

Changed in duplicity:
assignee: nobody → edso (ed.so)
Revision history for this message
edso (ed.so) wrote : Re: [Bug 709973] Re: num-retries ignored for some http errors

thanks for pointing that out.. the need for reconnection wasn't obvious to me.. i'd probably combine the approaches. any complaints about the retry_fatal decorator? i like it for the simplicity it provides: decorate once and save all these redundant for loops and log.Info() messages.

..ede

On 02.01.2013 14:42, Kenneth Loafman wrote:
> ede, please look at this approach instead of retry_fatal. It may solve
> a few more problems.
>
>
> ** Changed in: duplicity
> Assignee: (unassigned) => edso (ed.so)
>
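
For context, a retry decorator in that spirit might look roughly like the following sketch; this is not duplicity's actual retry_fatal implementation, and self.num_retries is an assumed attribute.

-----
import functools


def retry_fatal(method):
    """Retry a backend method self.num_retries times, logging each failure,
    instead of repeating the same for loop and log call in every method.
    Sketch only; not duplicity's actual implementation."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        last_exc = None
        for attempt in range(1, self.num_retries + 1):  # assumed attribute
            try:
                return method(self, *args, **kwargs)
            except Exception as exc:
                last_exc = exc
                print("%s attempt #%d failed: %s"
                      % (method.__name__, attempt, exc))
        raise last_exc  # all retries failed
    return wrapper
-----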

Revision history for this message
edso (ed.so) wrote :

i'll hack it accordingly then and update the branch later.. any timeframe for the release? ..ede

On 02.01.2013 15:07, Kenneth Loafman wrote:
> I have no complaints about it. It simplifies the code.
>
> ...Ken

Changed in duplicity:
status: New → Confirmed
importance: Undecided → Medium
milestone: none → 0.6.21
Revision history for this message
Kenneth Loafman (kenneth-loafman) wrote :

Probably next week. I need to rebuild my test environment after an upgrade.

On Wed, Jan 2, 2013 at 8:20 AM, <email address hidden> wrote:

> i'll hack it accordingly then and update the branch later.. any timeframe
> for the release? ..ede

Changed in duplicity:
milestone: 0.6.21 → 0.6.22
Revision history for this message
edso (ed.so) wrote :

should be fixed with 0.6.21 already.. ede

Changed in duplicity:
milestone: 0.6.22 → none
status: Confirmed → Fix Released
Changed in debian:
status: Confirmed → Fix Released