st retry failure error message could be more informative

Bug #782446 reported by Jon Slenk
This bug affects 1 person
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Medium
Assigned to: gholt
Milestone: 1.4.0

Bug Description

swift-1.2.0

While I was using 'st' to upload a large file, several segments uploaded successfully, but then a ClientException was thrown:
st -A https://example.net:443/v1.0 -U <email address hidden> -K xxx upload -S 4294967296 test_container t2
t2 segment 7
t2 segment 9
t2 segment 6
t2 segment 2
t2 segment 8
t2 segment 5
t2 segment 1
t2 segment 0
t2 segment 4
t2 segment 3
t2 segment 15
t2 segment 16
t2 segment 10
t2 segment 13
t2 segment 11
t2 segment 12

Exception in thread Thread-21:
Traceback (most recent call last):
  File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner
    self.run()
  File "/usr/local/lib/python2.6/dist-packages/swift-1.2.0.cs20110406a-py2.6.egg/EGG-INFO/scripts/st", line 861, in run
    self.func(item, *self.args, **self.kwargs)
  File "/usr/local/lib/python2.6/dist-packages/swift-1.2.0.cs20110406a-py2.6.egg/EGG-INFO/scripts/st", line 1473, in _segment_job
    job['obj'], fp, content_length=job['segment_size'])
  File "/usr/local/lib/python2.6/dist-packages/swift-1.2.0.cs20110406a-py2.6.egg/EGG-INFO/scripts/st", line 820, in put_object
    content_type=content_type, headers=headers)
  File "/usr/local/lib/python2.6/dist-packages/swift-1.2.0.cs20110406a-py2.6.egg/EGG-INFO/scripts/st", line 742, in _retry
    rv = func(self.url, self.token, *args, **kwargs)
  File "/usr/local/lib/python2.6/dist-packages/swift-1.2.0.cs20110406a-py2.6.egg/EGG-INFO/scripts/st", line 640, in put_object
    http_status=resp.status, http_reason=resp.reason)
ClientException: Object PUT failed: https://api.storage.santa-clara.internapcloud.net:443/v1/YWjXcQtcSnHSuNArWFO1zoua/test_container_segments/t2/1305311416.0/73004144640/00000014 408 Request Timeout
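
The numbers in the output above can be cross-checked with a little arithmetic. The sketch below assumes that the last three path components of the failed URL are the file's modification time, its total size in bytes, and a zero-padded segment index; that reading is an assumption, though it is consistent with the -S value and the segment numbers printed before the exception.

```python
segment_size = 4294967296        # value passed to -S in the command above
total_size = 73004144640         # assumed: size component of the failed segment URL
failed_index = int("00000014")   # assumed: zero-padded index at the end of that URL

# Ceiling division: number of segments needed to cover the whole file.
expected_segments = (total_size + segment_size - 1) // segment_size

# Segment numbers that 'st' printed before the exception.
reported = {7, 9, 6, 2, 8, 5, 1, 0, 4, 3, 15, 16, 10, 13, 11, 12}

# Segments that were expected but never reported as uploaded.
missing = sorted(set(range(expected_segments)) - reported)
print(expected_segments, missing, failed_index)
```

Under these assumptions the upload needs 17 segments (0 through 16), and the only one never reported is segment 14, the same one named in the failed URL.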

Jon Slenk (jslenk) wrote :

Forgot to say what I hazard to suggest should be done:

(A) Some # of retries should be attempted before giving up so drastically.

and/or

(B) there should be a way to resume (like wget -c) from that last failed segment.

Jon Slenk (jslenk) wrote :

I've been looking into the code and must correct myself. Apparently it /does/ retry (I haven't looked at it dynamically, just statically). So then perhaps this bug boils down to a specific case of the other 'st' bug about better error messages (https://bugs.launchpad.net/swift/+bug/634171), in particular: say how many retries have been done.

Jon Slenk (jslenk) wrote :

(Apparently the code does 5 retries with backoff: I have to go try to see what our receiving server was doing such that 408 was generated.)
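
For readers unfamiliar with the pattern, a bounded retry loop with backoff might look roughly like the sketch below. It is not the code from st; the function name, the starting delay of one second, the doubling, the 64 second cap, and the choice of OSError as the retryable error are all assumptions made for illustration. Only the count of 5 retries comes from the comment above.

```python
import time


def call_with_retries(func, retries=5, max_backoff=64,
                      sleep=time.sleep, retry_on=(OSError,)):
    """Call func(), retrying up to `retries` times with a doubling delay.

    Hypothetical helper: returns (result, attempts_made) so a caller can
    report how many attempts were needed.
    """
    attempts = 0
    backoff = 1  # assumed starting delay in seconds
    while True:
        attempts += 1
        try:
            return func(), attempts
        except retry_on:
            # One initial attempt plus `retries` retries, then give up.
            if attempts > retries:
                raise
        sleep(backoff)
        backoff = min(backoff * 2, max_backoff)
```

Returning the attempt count alongside the result is one way a caller could say how many retries were made before giving up, which is the information this bug asks for.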

Jon Slenk (jslenk) wrote :

Yes, digging through the logs, it appears to be a timeout on the Object-Proxy side of things, not in 'st'. So I changed the title of the bug.

summary:
- st large object upload should retry segments
+ st retry failure error message could be more informative

gholt (gholt) wrote :

I think st only retries when getting a pure HTTPException/socket.error (connection refused, connection timeout, that sort of thing), a 5xx from the server, or (once) when getting a 401 Unauthorized. I'll test to be sure, but I'm pretty sure that right now it doesn't retry on 408 Request Timeout.
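
The retry rules described in this comment can be written down as a small predicate. This is a sketch of the behaviour as stated here, not the actual client.py logic; the function name and its parameters are hypothetical.

```python
import socket
from http.client import HTTPException


def should_retry(exc=None, status=None, already_reauthed=False):
    """Return True if a failure falls into a category described as retried.

    Hypothetical helper: `exc` is a transport-level exception (if any),
    `status` is the HTTP status code of an error response (if any).
    """
    # Connection refused, connection timeout, and similar transport errors.
    if exc is not None and isinstance(exc, (HTTPException, socket.error)):
        return True
    if status is None:
        return False
    # Any 5xx from the server is retried.
    if 500 <= status <= 599:
        return True
    # A 401 is retried once, after re-authenticating.
    if status == 401:
        return not already_reauthed
    # Everything else, including 408 Request Timeout, is not retried.
    return False
```

With these rules `should_retry(status=408)` is False, which matches the failure in the report: the 408 surfaced on the first attempt instead of being retried.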

Changed in swift:
assignee: nobody → gholt (gholt)
importance: Undecided → Medium
status: New → In Progress
gholt (gholt) wrote :

Okay, I've pushed fixes to lp:~gholt/swift/stfixes
Since st is made to be standalone, you can grab directly from http://bazaar.launchpad.net/~gholt/swift/stfixes/view/head:/bin/st if you'd like.

Updated st's copy of client.py
st no longer aborts everything on one error
st prints when it had to retry
st prints ClientExceptions without the full stack trace
st aborts manifest creation if segments couldn't be uploaded
client.py will retry on 408s
client.py will treat empty contents as resettable
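
As a rough illustration of three of the changes listed above (one failed segment no longer aborts everything, ClientExceptions are printed without a stack trace, and the manifest is skipped when a segment failed), here is a minimal sketch. It is not the code from the stfixes branch: the function and parameter names are hypothetical, ClientException is a stand-in class, and ThreadPoolExecutor is swapped in for whatever threading st uses internally.

```python
from concurrent.futures import ThreadPoolExecutor


class ClientException(Exception):
    """Stand-in for the exception type named in the traceback."""


def upload_segments(segment_names, put_segment, put_manifest,
                    workers=4, report=print):
    """Upload every segment, then create the manifest only if all succeeded.

    Hypothetical helper: put_segment(name) uploads one segment and
    put_manifest() creates the manifest object. Returns the failed names.
    """
    def job(name):
        try:
            put_segment(name)
            return None
        except ClientException as err:
            # Report the message only, without a traceback, and carry on.
            report(str(err))
            return name

    with ThreadPoolExecutor(max_workers=workers) as pool:
        failed = [name for name in pool.map(job, segment_names)
                  if name is not None]

    if failed:
        report("manifest not created: %d segment(s) failed" % len(failed))
        return failed
    put_manifest()
    return failed
```

Collecting failures instead of raising keeps the remaining segments uploading, and checking the list before calling `put_manifest` avoids publishing a manifest that points at a missing segment.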

Jon Slenk (jslenk) wrote :

Many thanks!

gholt (gholt)
Changed in swift:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in swift:
milestone: none → 1.4.0
status: Fix Committed → Fix Released