long lasting upload fails throwing http-error 502 Proxy Error or 504 Gateway Time-out

Reported by Matthias on 2012-08-26
34
This bug affects 6 people
Affects Status Importance Assigned to Milestone
Ubuntu One Servers
High
Sidnei da Silva

Bug Description

I have a bash script uploading a file to Ubuntu One through rest-api (https://one.ubuntu.com/developer/files/store_files/cloud).
The script can successfully display meta data, download files and _upload small files_. But when uploading large files which correspond to a long lasting upload the process fails.

The upload is done using curl and the call reads as:
   curl $CURL_PARAMETERS $UPLOAD_LIMIT -i -o "$RESPONSE_FILE"
    --upload-file "$FILE_SRC" "$SIGNED_URL"

The content of $RESPONSE_FILE after trying uploading a large file (curl uploads until 100%) is either

   HTTP/1.1 100 Continue

   HTTP/1.1 504 Gateway Time-out
   Date: Fri, 24 Aug 2012 10:23:29 GMT
   Cache-Control: no-cache
   Content-Type: text/html
   Via: 1.1 files.one.ubuntu.com
   Vary: Accept-Encoding
   Connection: close
   Transfer-Encoding: chunked

   <html><body><h1>504 Gateway Time-out</h1>
   The server didn't respond in time.
   </body></html>

or

   HTTP/1.1 100 Continue

   HTTP/1.1 502 Proxy Error
   Date: Fri, 24 Aug 2012 09:24:50 GMT
   Vary: Accept-Encoding
   Content-Length: 558
   Connection: close
   Content-Type: text/html; charset=iso-8859-1

   <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
   <html><head>
   <title>502 Proxy Error</title>
   </head><body>
   <h1>Proxy Error</h1>
   <p>The proxy server received an invalid
   response from an upstream server.<br />
   The proxy server could not handle the request <em><a href="/content
    /~/Ubuntu One/rpi/upload/Messtechnik.7z">PUT&nbsp;/content/~
    /Ubuntu One/rpi/upload/Messtechnik.7z</a></em>.<p>
   Reason: <strong>Error reading from remote server</strong></p></p>
   <hr>
   <address>Apache/2.2.14 (Ubuntu) Server at files.one.ubuntu.com Port
    443</address>
   </body></html>

Don't know if this is of interest: The 504 error seems to be triggered more often then 502.
Sometimes, when response was 502, the uploaded file was successfully committed to Ubuntu One (SHA1 check ok), other times the upload totally failed.

As shown in the date fields of the above http-headers the errors accoured after the fix of bug 989157 (2012-08-23).

Please let me know if I can provide further informations.

Matthias (matthiasw) on 2012-08-26
description: updated
visibility: private → public
Roman Yepishev (rye) wrote :

What are the sizes of the file being transferred and how much time does the transfer take?

Changed in ubuntuone-servers:
assignee: nobody → Ubuntu One Ops+ team (ubuntuone-ops+)
Matthias (matthiasw) wrote :

The file from the provided 502 error has a size of 47,2MB but I havn't stopped the real transfer-time. Calculated from my DSL uplink speed of 388kbps this will result in about 16,6 minutes.
I will do some tests tomorrow whether the time-limit is 15 minutes like in the OAuth bug.

Matthias (matthiasw) wrote :

Comment #2 contains wrong informations, i'm sorry. (havn't find a way to edit/delete the comment)

I have retried to upload the file from the provided 502 error which is of size 118,8MB. The upload itself (until the progressbar of curl reaches 100%) took 49 minutes 25 seconds, the time for processing the whole curl command took 50 minutes 16 seconds. This time, the thrown error was HTTP/1.1 504 Gateway Time-out.

By now, I havn't found the time-limit for the error. I have successfully uploaded a file within 16 minutes 45 seconds. I will run further uploads to determine the time-limit, but please be patient because each testrun takes tens of minutes.

Matthias (matthiasw) wrote :

Here what I have done so far:

filesize curl --limit-rate duration http-code
---------------------------------------------------
8.9MB 10k 15m2s 201 CREATED
8.9MB 10k 15m16s 201 CREATED
9MB 10k 15m8s 201 CREATED
9.1MB 10k 15m14s 201 CREATED
10MB 10k 16m45s 201 CREATED
20MB 10k 33m42s 201 CREATED
118MB not set 50m15s 504 Gateway Time-out
30MB 10k 50m36s 201 CREATED
120MB not set 50m43s 502 Proxy Error
61MB 20k 51m20s 504 Gateway Time-out
45MB 15k 50m50s 504 Gateway Time-out
45MB not set 19m14s 201 CREATED
42MB 10k 70m19s 201 CREATED

The above table doesn't make sense to me... can't see any system when/why error is occurring.
Should i test an upload at 10k for several hours? (don't know what to test now)

Matthias (matthiasw) wrote :

Unfortunately launchpad removes more than one spaces in text. Here is an readable version of the table:
http://pastebin.com/Ec8yf9YK

Roman Yepishev (rye) wrote :

So here's my current findings.

Since the OAuth timestamp was fixed the files of larger size were enabled to go through. However the updown server has only 50 seconds to reply before haproxy kicks in and sends 504 response.

During my two latest test of 300Mb upload the files were uploaded properly but the request failed with 502 Proxy error and 504 Gateway Timeout. In both cases the files were handled properly by updown but the client did not know that.

updown needs to spool the file completely first before compressing it, getting the hashes and sending it to storage and this process may take much more than 50 seconds.

Changed in ubuntuone-servers:
assignee: Ubuntu One Ops+ team (ubuntuone-ops+) → Sidnei da Silva (sidnei)
Roman Yepishev (rye) on 2012-08-28
Changed in ubuntuone-servers:
status: New → Confirmed
importance: Undecided → High
Sidnei da Silva (sidnei) wrote :

We've now bumped the backend timeout from 50 to 600 seconds, which should be more than enough. Since there could be other timeouts at play, we kindly ask you to test and confirm that this fixes your issue.

I've successfully uploaded a 5Gb file using curl (with no --limit-rate) in 4 minutes, so hopefully that should work for you as well.

Changed in ubuntuone-servers:
status: Confirmed → Fix Released
Matthias (matthiasw) wrote :

Sadly I can't confirm the fix for my issue (usage)...

Uploading two files:
1)
size: 98MB
time curl: 42m32s
--> upload failed, response: HTTP/1.1 502 Proxy Error, Date: Fri, 07 Sep 2012 05:52:53 GMT

 2)
size: 209MB
time curl: 97m48s
time updown: ~8m20s (<- this time was measured with a stopwatch from progressbar 100% till termination of curl)
--> upload succeed, response: HTTP/1.1 201 CREATED, Date: Fri, 07 Sep 2012 08:04:57 GMT

So surprisingly the second (bigger) file became commited, the first (smaller) one not.

Furthermore my second upload needs really long (scratching on the limit of 600sec) for the process of spooling, compressing, hashing,... (as stated from Roman in Post #6) compared with the whole upload of 5Gb within 4 minutes from Sidnei. Maybe this can be explaned if the 5Gb file of Sidnei was just a dummy-file containing only zeroes, because my file is 7zip compressed (high entropy).

Matthias (matthiasw) wrote :

Second time uploading file 1):

size: 98MB
time curl: 43m7.638s
time updown: ~57s
--> upload succeeded, response: HTTP/1.1 201 CREATED, Date: Fri, 07 Sep 2012 09:41:33 GMT

Sidnei da Silva (sidnei) wrote :

Yes, that was indeed the case, on my test I used a file containing only zeroes, and I just realized later on that it would be trivial to compress.

An explanation for the seemingly random difference is that we have two sets of servers handling these uploads, some are medium instances in ec2 which have much less power than the machines we have in the datacenter. I'll investigate further to find out how much we can bump this timeout in order to make it work, or alternatively taking the ec2 instances out of rotation.

Matthias (matthiasw) wrote :

Further uploads of today:

3)
size: 655MB
time curl: 288m49.694s
time updown: ~5m57s
--> upload succeeded, response: HTTP/1.1 201 CREATED, Date: Sat, 08 Sep 2012 14:25:03 GMT

4)
size: 179MB
time curl: 76m54.606s
time updown: ~50s
--> upload failed, response: HTTP/1.1 502 Proxy Error, Date: Sat, 08 Sep 2012 14:37:03 GMT

Looking at upload 4) the timeout of 50 seconds after upload recur.

I'm not familiar with timestamping of http-headers, but the timestamp of upload 4) could also be of interest. The logfile of the upload has a timestamp from 15:53 GMT. Subtracting the 77 minutes of curl from 15:53 GMT yields approximately the timestamp of the 502 error response. Comparison with the timestamp from upload 3) the timestamp from response 4) could be the start of the upload (I don't know exactly when I have initiated the upload).
Is it expected behaviour that the timestamp of the error response is the time-instance from the beginning of the upload?

Raal (raal) wrote :

Still problem with Thunderbird FileLink feature. Upload 217 MB file:
Error console in TB:
Error: 502 - Proxy Error
Source File: resource:///modules/http.jsm
Line: 47

I can confirm the bug reported by raal.

When uploading a large (~100MB) file from Thunderbird 16.0.1 with FileLink from Ubuntu 12.04 amd64 I get the error "Error: 502 - Proxy Error". The file is correctly uploaded to the server (I can check through Ubuntu One web interface), but Thunderbird is not aware of that. Although this behaviour is repetable, it does not occur 100% of the times (even uploading the same file).

Do I have to open a new bug report?

Thanks!

To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers