Client (curl) gets partial content when the proxy-server hits a ChunkRead timeout

Bug #1717401 reported by Hugo Kou
This bug affects 1 person
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

This looks like a potential bug when the proxy-server encounters a ChunkRead timeout from a replica: there is a chance the transfer ends short of the advertised Content-Length, and the client keeps waiting for more bytes.

https://gist.github.com/HugoKuo/cd0745b78dda485a461aed0bb7894717

Reproduce
* Upload a 325 MB file to a container with 3 replicas.
* Run a simple script to spawn concurrent curl downloads of the file (the loop below launches 151 instances). Change the domain name and file name to match your environment. A quick way to check the results for truncation follows the script.

```
#!/bin/bash

# Spawn many concurrent downloads of the same object; payloads go to
# ./tmp and per-request logs to ./tmp/logs.
mkdir -p ./tmp/logs

for i in {0..150}
do
    curl -o ./tmp/file$i -v "http://rrdns.openstack.com.tw/v1/AUTH_demo/con1/325MB.img" >> ./tmp/logs/$i.log 2>&1 &
done
wait
```
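After the run, truncated downloads are easy to spot by comparing each fetched file against the original. Here is a minimal checker sketch; it assumes the ./tmp layout from the script above and takes the path of the original file as its only argument:

```
#!/usr/bin/env python
# Compare every downloaded ./tmp/file* against the original object:
# flag files that are short (the symptom of this bug) or corrupted.
import hashlib
import os
import sys

def md5sum(path, chunk_size=1024 * 1024):
    h = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()

original = sys.argv[1]  # e.g. ./325MB.img
expected_size = os.path.getsize(original)
expected_md5 = md5sum(original)

for name in sorted(os.listdir('./tmp')):
    path = os.path.join('./tmp', name)
    if not (name.startswith('file') and os.path.isfile(path)):
        continue
    size = os.path.getsize(path)
    if size != expected_size:
        print('%s: SHORT (%d of %d bytes)' % (path, size, expected_size))
    elif md5sum(path) != expected_md5:
        print('%s: size OK but checksum mismatch' % path)
```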

* My environment is 2 PACO nodes (proxy, account, container, and object services on each node) with 8 disks in total; each node has 2 vCPUs and 2 GB of vRAM.
* The latest Swift version, 2.15.1.2, as provided by us in controller 5.7.
* CentOS 7.2.

I run the fetch script from my Mac.

This looks like a problem where either network latency under high concurrency or a single disk becoming I/O bound causes the proxy-server to hit a ChunkRead timeout and then try another replica to keep pulling data from.
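For context, the failover path in question works roughly like this: the proxy reads the backend response chunk by chunk with each read guarded by a timeout, and a read that stalls makes it resume the download from another replica. This is a conceptual sketch only, not Swift's actual code (Swift uses eventlet for such timeouts; the chunk size and 10-second timeout below are illustrative):

```
# Conceptual model of a timeout-guarded backend chunk read. A stalled
# read raises, and the caller responds by re-requesting the remaining
# byte range from another replica.
from eventlet import Timeout

class ChunkReadTimeout(Exception):
    """A single backend chunk read exceeded the node timeout."""

def read_chunk(source, chunk_size=65536, node_timeout=10):
    try:
        with Timeout(node_timeout):
            return source.read(chunk_size)
    except Timeout:
        raise ChunkReadTimeout()
```

The bug reported here is not in this read path itself but in how the resumed request's byte range is computed, as the merged fix below explains.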

Revision history for this message
Vivek Jain (v-jain007) wrote :

I also want to add that this issue is not limited to small clusters. We originally noticed it in a big production cluster (multiple Swift proxy and object nodes) and can easily reproduce it in a small cluster as well. The issue was seen during highly concurrent reads (100+) of the same object of around 300+ MB.

Tim Burke (1-tim-z)
Changed in swift:
status: New → Confirmed
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/504604
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=6b19ca7a7d5833f5648976d8d30c776975e361db
Submitter: Jenkins
Branch: master

commit 6b19ca7a7d5833f5648976d8d30c776975e361db
Author: Tim Burke <email address hidden>
Date: Fri Sep 15 22:52:26 2017 +0000

    proxy: Use the right ranges when going to multiple object servers

    When the proxy times out talking to a backend server (say, because it
    was under heavy load and having trouble servicing the request), we catch
    the ChunkReadTimeout and try to get the rest from another server. The
    client by and large doesn't care; there may be a brief pause in the
    download while the proxy gets the new connection, but all the bytes
    arrive and in the right order:

        GET from node1, serve bytes 0 through N, timeout
        GET from node2, serve bytes N through end

    When we calculate the range for the new request, we check to see if we
    already have one from the previous request -- if one exists, we adjust
    it based on the bytes sent to the client thus far. This works fine for
    single failures, but if we need to go back *again* we double up the
    offset and send the client incomplete, bad data:

        GET from node1, serve bytes 0 through N, timeout
        GET from node2, serve bytes N through M, timeout
        GET from node3, serve bytes N + M through end

    Leaving the client missing bytes M through N + M.

    We should adjust the range based on the number of bytes pulled from the
    *backend* rather than delivered to the *frontend*. This just requires
    that we reset our book-keeping after adjusting the Range header.

    Change-Id: Ie153d01479c4242c01f48bf0ada78c2f9b6c8ff0
    Closes-Bug: 1717401
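
The arithmetic above can be made concrete with a short sketch. This is a simplified model rather than the actual proxy code; the counter name follows the commit's "bytes pulled from the backend" wording:

```
# Simplified model of the resume bookkeeping. The buggy proxy advanced
# the Range by the total bytes delivered to the client so far; the fix
# advances by the bytes consumed from the current backend only, then
# resets the counter.

class RangeResumer(object):
    def __init__(self, start=0):
        self.start = start                 # current Range offset
        self.bytes_used_from_backend = 0   # consumed from this node

    def consumed(self, nbytes):
        self.bytes_used_from_backend += nbytes

    def fast_forward(self):
        # Adjust the Range for the next node, then reset bookkeeping so
        # a further timeout only skips what the next node delivered.
        self.start += self.bytes_used_from_backend
        self.bytes_used_from_backend = 0
        return self.start

# The failure scenario from the commit message, with N = 100, M = 150:
r = RangeResumer()
r.consumed(100)                  # node1 serves bytes 0 through N
assert r.fast_forward() == 100   # resume node2 at N
r.consumed(50)                   # node2 serves bytes N through M
assert r.fast_forward() == 150   # fixed: resume node3 at M, not N + M
```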

Changed in swift:
status: Confirmed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (stable/pike)

Fix proposed to branch: stable/pike
Review: https://review.openstack.org/505743

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (stable/ocata)

Fix proposed to branch: stable/ocata
Review: https://review.openstack.org/505744

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (stable/newton)

Fix proposed to branch: stable/newton
Review: https://review.openstack.org/505745

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/deep)

Fix proposed to branch: feature/deep
Review: https://review.openstack.org/505844

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/deep)

Reviewed: https://review.openstack.org/505844
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=67b3a747869ad1727d88cb77fa65858a3c98f8a5
Submitter: Jenkins
Branch: feature/deep

commit 8556b06bf75e46369088f1cc6e2aa5d6cc00251b
Author: Christian Schwede <email address hidden>
Date: Wed Sep 20 10:56:41 2017 +0200

    Add test to ensure cache cleared after container PUT

    The parent commit fixes a race condition. Let's make sure there won't be
    a regression in the future, thus testing the order to ensure the cache
    is cleared after the request is executed.

    Related-Bug: #1715177
    Change-Id: I4f6750b7c556b498da0a2b56aa6c8cee5e42a90c

commit 6b19ca7a7d5833f5648976d8d30c776975e361db
Author: Tim Burke <email address hidden>
Date: Fri Sep 15 22:52:26 2017 +0000

    proxy: Use the right ranges when going to multiple object servers

    [Commit message identical to the master merge quoted above.]

    Change-Id: Ie153d01479c4242c01f48bf0ada78c2f9b6c8ff0
    Closes-Bug: 1717401

commit d6fcf7459435077b525123e8b78e553070d5c070
Author: Kota Tsuyuzaki <email address hidden>
Date: Sat Sep 16 04:58:31 2017 +0900

    Make gate keeper to save relative location header path

    Why we need this:
      Some middlewares want to keep HTTP Location header as relative path
      (e.g. using Load balancer in front of proxy).

    What is the problem in current Swift:
      Current Swift already has the flag to keep it as relative when returning
      the reponse using swift.common.swob.Response. However, auth_token middleware,
      that is from keystonemiddleware, unfortunately can change the relative path
      to absolute because of using webob instead of swob.

    What this patch is doing:
      Make gate_keeper able to re-transform the location header from absolute path
      to relative path if 'swift.leave_relative_location' is explicitly set...


tags: added: in-feature-deep
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (stable/newton)

Reviewed: https://review.openstack.org/505745
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=cbc14596159b9086c9e799e2232f2960294ebd5a
Submitter: Jenkins
Branch: stable/newton

commit cbc14596159b9086c9e799e2232f2960294ebd5a
Author: Tim Burke <email address hidden>
Date: Fri Sep 15 22:52:26 2017 +0000

    proxy: Use the right ranges when going to multiple object servers

    [Commit message identical to the master merge quoted above.]

    Change-Id: Ie153d01479c4242c01f48bf0ada78c2f9b6c8ff0
    Closes-Bug: 1717401
    (cherry picked from commit 6b19ca7a7d5833f5648976d8d30c776975e361db)

tags: added: in-stable-newton
tags: added: in-stable-ocata
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (stable/ocata)

Reviewed: https://review.openstack.org/505744
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=35ab6a5da363aa10c8fa88df0a388ad6518164e7
Submitter: Jenkins
Branch: stable/ocata

commit 35ab6a5da363aa10c8fa88df0a388ad6518164e7
Author: Tim Burke <email address hidden>
Date: Fri Sep 15 22:52:26 2017 +0000

    proxy: Use the right ranges when going to multiple object servers

    [Commit message identical to the master merge quoted above.]

    Change-Id: Ie153d01479c4242c01f48bf0ada78c2f9b6c8ff0
    Closes-Bug: 1717401
    (cherry picked from commit 6b19ca7a7d5833f5648976d8d30c776975e361db)

tags: added: in-stable-pike
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (stable/pike)

Reviewed: https://review.openstack.org/505743
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=02ee5c1cb6940f03b518887c88a65b74599fe828
Submitter: Jenkins
Branch: stable/pike

commit 02ee5c1cb6940f03b518887c88a65b74599fe828
Author: Tim Burke <email address hidden>
Date: Fri Sep 15 22:52:26 2017 +0000

    proxy: Use the right ranges when going to multiple object servers

    [Commit message identical to the master merge quoted above.]

    Change-Id: Ie153d01479c4242c01f48bf0ada78c2f9b6c8ff0
    Closes-Bug: 1717401
    (cherry picked from commit 6b19ca7a7d5833f5648976d8d30c776975e361db)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/s3api)

Fix proposed to branch: feature/s3api
Review: https://review.openstack.org/512277

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: feature/s3api
Review: https://review.openstack.org/512283

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on swift (feature/s3api)

Change abandoned by Alistair Coles (<email address hidden>) on branch: feature/s3api
Review: https://review.openstack.org/512283
Reason: I was just trying to get a sensible topic

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/s3api)

Reviewed: https://review.openstack.org/512277
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=f94d6567a7e2e8b3ca1168b4a41c42c1ee371af5
Submitter: Zuul
Branch: feature/s3api

commit 24188beb81d39790034fa0902246163a7bf54c91
Author: Samuel Merritt <email address hidden>
Date: Thu Oct 12 16:13:25 2017 -0700

    Remove some leftover threadpool cruft.

    Change-Id: I43a1a428bd96a2e18aac334c03743a9f94f7d3e1

commit 1d67485c0b935719e0c8999eb353dfd84713add6
Author: Samuel Merritt <email address hidden>
Date: Fri Apr 15 12:43:44 2016 -0700

    Move all monkey patching to one function

    Change-Id: I2db2e53c50bcfa17f08a136581cfd7ac4958ada2

commit 407f5394f0f5cb422c06b4e5b2f9fbfdb07782d1
Author: OpenStack Proposal Bot <email address hidden>
Date: Thu Oct 12 08:12:38 2017 +0000

    Imported Translations from Zanata

    For more information about this automatic import see:
    https://docs.openstack.org/i18n/latest/reviewing-translation-import.html

    Change-Id: I628cb09aa78d8e339b4762a3c9ed8aed43941261

commit 45ca39fc68cdb42b382c1638a92cc8d3cec5529a
Author: Clay Gerrard <email address hidden>
Date: Tue Oct 10 11:47:50 2017 -0700

    add mangle_client_paths to example config

    Change-Id: Ic1126fc95e8152025fccf25356c253facce3e3ec

commit 94bac4ab2fe65104d602378e8e49c37b8187a75d
Author: Tim Burke <email address hidden>
Date: Fri May 12 10:55:21 2017 -0400

    domain_remap: stop mangling client-provided paths

    The root_path option for domain_remap seems to serve two purposes:
     - provide the first component (version) for the backend request
     - be an optional leading component for the client request, which
       should be stripped off

    As a result, we have mappings like:

       c.a.example.com/v1/o -> /v1/AUTH_a/c/o

    instead of

       c.a.example.com/v1/o -> /v1/AUTH_a/c/v1/o

    which is rather bizarre. Why on earth did we *ever* start doing this?

    Now, this second behavior is managed by a config option
    (mangle_client_paths) with the default being to disable it.

    Upgrade Consideration
    =====================

    If for some reason you *do* want to drop some parts of the
    client-supplied path, add

       mangle_client_paths = True

    to the [filter:domain_remap] section of your proxy-server.conf. Do this
    before upgrading to avoid any loss of availability.

    UpgradeImpact
    Change-Id: I87944bfbf8b767e1fc36dbc7910305fa1f11eeed
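
For reference, the option described above would be enabled with a stanza like this in proxy-server.conf; the section and option names come from the commit message itself, and the use = line is the standard egg entry point for the middleware, shown only to make the snippet self-contained:

```
[filter:domain_remap]
use = egg:swift#domain_remap
mangle_client_paths = True
```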

commit a4a5494fd2fe8a43a5d50a21a1951266cc7c4212
Author: Alistair Coles <email address hidden>
Date: Mon Oct 9 11:33:28 2017 +0100

    test account autocreate listing format

    Related-Change: Id3ce37aa0402e2d8dd5784ce329d7cb4fbaf700d
    Change-Id: I50c22225bbebff71600bea9158bda1edd18b48b0

commit 8b7f15223cde4c19fd9cbbd97e8ad79a1b4afa8d
Author: Alistair Coles <email address hidden>
Date: Mon Oct 9 10:06:19 2017 +0100

    Add example to container-sync-realms.conf.5 man page

    Related-Change: I0760ce149e6d74f2b3f1badebac3e36da1ab7e77

    Change-Id: I129de42f91d7924c7bcb9952f17fe8a1a10ae219

commit 816331155c624c444ed123bcab412...

tags: added: in-feature-s3api
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/swift 2.16.0

This issue was fixed in the openstack/swift 2.16.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/swift 2.15.2

This issue was fixed in the openstack/swift 2.15.2 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/swift ocata-eol

This issue was fixed in the openstack/swift ocata-eol release.
