The GET response code for an SLO object changes depending on the position of the missing segment

Bug #1386568 reported by Hisashi Osanai
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Low
Assigned to: Hisashi Osanai
Milestone: 2.2.2

Bug Description

[Description]
The GET response code for an SLO object changes depending on the position of the missing segment

I think SLO should behave the same way when users download an object,
regardless of the position of the missing segment.

[Version details]
swift 2.1.0

[Crystal clear details to reproduce the bug]

preparation:
    AUTH_TOKEN=<authentication token>
    ENDPOINT=<objectstorage endpoint>
    CONTAINER="slotest-container"
    MANIOBJECT="slotest-maniobj"
    SEGOBJECT1="segobj1"
    SEGOBJECT2="segobj2"
    SEGOBJECT3="segobj3"

(1) create files for the segments
    python -c 'open("./seg1", "w").write("a"*1024*1024)'
    python -c 'open("./seg2", "w").write("b"*1024*1024)'
    python -c 'open("./seg3", "w").write("c"*1024*1024)'

(2) create a container
    curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" ${ENDPOINT}/${CONTAINER}

(3) upload the segments
    curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./seg1" ${ENDPOINT}/${CONTAINER}/${SEGOBJECT1}
    curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./seg2" ${ENDPOINT}/${CONTAINER}/${SEGOBJECT2}
    curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./seg3" ${ENDPOINT}/${CONTAINER}/${SEGOBJECT3}
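
(Note: the etag values in the manifest in step (4) must match the MD5 checksums of
the corresponding segment files. If you regenerate the segments, recompute the
checksums, for example with one-liners in the same style as step (1):)
    python -c 'import hashlib; print(hashlib.md5(open("./seg1", "rb").read()).hexdigest())'
    python -c 'import hashlib; print(hashlib.md5(open("./seg2", "rb").read()).hexdigest())'
    python -c 'import hashlib; print(hashlib.md5(open("./seg3", "rb").read()).hexdigest())'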

(4) upload the manifest object
    echo "[
        {
            \"path\": \"${CONTAINER}/${SEGOBJECT1}\",
            \"etag\": \"7202826a7791073fe2787f0c94603278\",
            \"size_bytes\": 1048576
        },
        {
            \"path\": \"${CONTAINER}/${SEGOBJECT2}\",
            \"etag\": \"96767d2b46489f3520698a6df536dc4c\",
            \"size_bytes\": 1048576
        },
        {
            \"path\": \"${CONTAINER}/${SEGOBJECT3}\",
            \"etag\": \"95d674ce4178cc3ef807606ecb8ec0f5\",
            \"size_bytes\": 1048576
        }
    ]" > "./manilistfile.json"
    curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./manilistfile.json" ${ENDPOINT}/${CONTAINER}/${MANIOBJECT}?multipart-manifest=put

test case 1 (in case of deleting the first object):
(1) delete the first object
    curl -i -X DELETE -H "X-Auth-Token: ${AUTH_TOKEN}" ${ENDPOINT}/${CONTAINER}/${SEGOBJECT1}

(2) download the manifest object
    curl -v -X GET -H "X-Auth-Token: ${AUTH_TOKEN}" -o "./test02_1st_seg_lost" ${ENDPOINT}/${CONTAINER}/${MANIOBJECT}

=== output ===
* About to connect() to 10.124.121.74 port 80 (#0)
* Trying 10.124.121.74... connected
* Connected to 10.124.121.74 (10.124.121.74) port 80 (#0)
> GET /v1/AUTH_683ca2b5c4de4ce4824c5939df04b44a/slotest-container/slotest-maniobj HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: 10.124.121.74
> Accept: */*
> X-Auth-Token: snip
>
  % Total % Received % Xferd Average Speed Time Time Time Current
                                 Dload Upload Total Spent Left Speed
  0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0< HTTP/1.1 500 Internal Server Error <=== 500
< Server: GlassFish Server Open Source Edition 4.1
< X-Trans-Id: txa34efe453a504d6cbfa66-00544dc783
< Date: Mon, 27 Oct 2014 04:18:11 GMT
< Connection: close
< Content-Type: text/plain
< Content-Length: 17
<
{ [data not shown]
  0 17 0 17 0 0 145 0 --:--:-- --:--:-- --:--:-- 217* Closing connection #0

# ls -l
-rw-r--r-- 1 root root 9568 10月 27 13:10 2014 eplist.json
-rw-r--r-- 1 root root 14536 10月 27 13:10 2014 eplist_format.json
-rw-r--r-- 1 root root 423 10月 27 13:13 2014 manilistfile.json
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg1
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg2
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg3
-rw-r--r-- 1 root root 3145728 10月 27 13:16 2014 test01_no_seg_lost
-rw-r--r-- 1 root root 17 10月 27 13:18 2014 test02_1st_seg_lost <=== failed to get the object; only the 17-byte error body was saved
===

test case 2 (in case of deleting the second object):
(1) delete the second object
    curl -i -X DELETE -H "X-Auth-Token: ${AUTH_TOKEN}" ${ENDPOINT}/${CONTAINER}/${SEGOBJECT2}

(2) download the manifest object
    curl -v -X GET -H "X-Auth-Token: ${AUTH_TOKEN}" -o "./test03_2nd_seg_lost" ${ENDPOINT}/${CONTAINER}/${MANIOBJECT}

=== output ===
* About to connect() to 10.124.121.74 port 80 (#0)
* Trying 10.124.121.74... connected
* Connected to 10.124.121.74 (10.124.121.74) port 80 (#0)
> GET /v1/AUTH_683ca2b5c4de4ce4824c5939df04b44a/slotest-container/slotest-maniobj HTTP/1.1
> User-Agent: curl/7.19.7 (x86_64-redhat-linux-gnu) libcurl/7.19.7 NSS/3.14.0.0 zlib/1.2.3 libidn/1.18 libssh2/1.4.2
> Host: 10.124.121.74
> Accept: */*
> X-Auth-Token: snip
>
  % Total % Received % Xferd Average Speed Time Time Time Current
                                 Dload Upload Total Spent Left Speed
  0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0< HTTP/1.1 200 OK <=== 200
< Server: GlassFish Server Open Source Edition 4.1
< Accept-Ranges: bytes
< Last-Modified: Mon, 27 Oct 2014 04:13:24 GMT
< Etag: "9b2371314d9a718bbfeb6d116811c394"
< X-Timestamp: 1414383203.08729
< X-Static-Large-Object: True
< X-Trans-Id: tx03ceaa2d161c4471abb0e-00544dc901
< Date: Mon, 27 Oct 2014 04:24:33 GMT
< Connection: keep-alive
< Content-Type: application/octet-stream
< Content-Length: 3145728
<
{ [data not shown]
 33 3072k 33 1024k 0 0 35015 0 0:01:29 0:00:29 0:01:00 0* transfer closed with 2097152 bytes remaining to read
 33 3072k 33 1024k 0 0 34394 0 0:01:31 0:00:30 0:01:01 0* Closing connection #0

curl: (18) transfer closed with 2097152 bytes remaining to read

# ls -l
-rw-r--r-- 1 root root 9568 10月 27 13:10 2014 eplist.json
-rw-r--r-- 1 root root 14536 10月 27 13:10 2014 eplist_format.json
-rw-r--r-- 1 root root 423 10月 27 13:13 2014 manilistfile.json
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg1
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg2
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg3
-rw-r--r-- 1 root root 3145728 10月 27 13:16 2014 test01_no_seg_lost
-rw-r--r-- 1 root root 3145728 10月 27 13:20 2014 test01_no_seg_lost_2
-rw-r--r-- 1 root root 17 10月 27 13:18 2014 test02_1st_seg_lost
-rw-r--r-- 1 root root 1048576 10月 27 13:25 2014 test03_2nd_seg_lost <=== got only the first segment
===

[Test environment details]
-

[Actual results]
-

[Expected results]
In both cases the response code should be 200 (same as DLO),
with a zero-byte body returned for the missing segment.

I would like to have this if possible.

Revision history for this message
Samuel Merritt (torgomatic) wrote :

The only way to achieve this is to HEAD each segment to ensure it is present* prior to serving the object. Otherwise, the knowledge that a segment is missing comes too late to do any good.

Remember that an HTTP response consists of a status line, then headers, then body. Thus, after the body has started going out to the client, there is no further possibility to change the status or headers.

Now, a SLO manifest can reference up to 1000** segments, so in the worst case, that's 1000 HEAD requests that have to complete prior to sending the status code. If that takes more than a few seconds, many clients will time out.

It's actually even worse than that: a "segment" can be another SLO manifest, in which case the sub-manifest's segments are included recursively. The only restriction here is a depth restriction on the tree formed by SLO manifests, and that depth limit is 10. The end result is that a SLO manifest may reference up to 10^30 segments; even if a particular cluster were able to perform one HEAD request per nanosecond, it would still take roughly 31 trillion years to validate all the segments.
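
For illustration, the rough arithmetic behind those numbers (the one-HEAD-per-nanosecond
rate is the same hypothetical figure as above, not a measured one):

    # Back-of-the-envelope arithmetic for the worst case described above.
    segments_per_manifest = 1000        # default per-manifest segment limit
    max_depth = 10                      # manifest nesting depth limit
    max_segments = segments_per_manifest ** max_depth    # 10**30 segments

    seconds_per_head = 1e-9             # hypothetical: one HEAD per nanosecond
    seconds_per_year = 60 * 60 * 24 * 365
    years = max_segments * seconds_per_head / seconds_per_year
    print(years)                        # ~3.2e13, i.e. roughly 31 trillion years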

Unfortunately, this is not possible to fix due to resource constraints.

However, notice that the Content-Length header was set to 3145728 but only 1048576 bytes were received. A client can use this to notice the truncation and handle the error appropriately.
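
For example, a minimal client-side sketch of that check (Python 3; URL and TOKEN are
hypothetical placeholders for the manifest object URL and auth token):

    import urllib.request
    from http.client import IncompleteRead

    req = urllib.request.Request(URL, headers={"X-Auth-Token": TOKEN})
    with urllib.request.urlopen(req) as resp:
        expected = int(resp.headers["Content-Length"])
        try:
            body = resp.read()
        except IncompleteRead as e:
            body = e.partial            # the server closed the connection early
    if len(body) != expected:
        print("truncated: got %d of %d bytes" % (len(body), expected))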

* Yes, there's a race condition here. I am deliberately ignoring it.

** Configurable, 1000 is the default.

Changed in swift:
status: New → Won't Fix
Revision history for this message
clayg (clay-gerrard) wrote :

Shouldn't test case 2 (the one that returns the 200 and then blows up) say "in case of deleting an object other than the first object" instead of "in case of deleting the first object"? That seems to be what the curl commands describe, and it matches up more intuitively with how Sam described the limitation.

I'm sort of surprised we managed to get a 500 out the door when the first object 404's - go us! On the other hand, if consistency is more important to the client, we could return the 200 in either case as soon as we retrieve the manifest, leaving the client to deal only with the mismatched etag or incomplete-read error cases. (Even though you sort of always have to deal with a 500 on the client, I'm a little worried that the request would just get retried to no avail.)

Can you check the response codes that get logged in this case? I actually think it should be closer to a 4xx response than a 5xx - but maybe either is reasonable.

Changed in swift:
status: Won't Fix → Incomplete
Revision history for this message
Hisashi Osanai (osanai-hisashi) wrote :

@Samuel, Thanks for the detailed explanation.
@clayg, Thanks for the clarification.

I think I have the same understanding as Samuel, but my bug report did not explain it well enough.
Let me explain again, taking clayg's advice into account.

First, I have summarized the results of each test.
+----------------+-------------------+----------------+-----------------+
| Test Cases     | Response code *** | Received bytes | Expected result |
+----------------+-------------------+----------------+-----------------+
| Test Case 1 *  | 500               | 17             | (1)             |
+----------------+-------------------+----------------+-----------------+
| Test Case 2 ** | 200               | 1048576        | (2)             |
+----------------+-------------------+----------------+-----------------+

* in case of deleting the first object
** in case of deleting an object other than the first object (thanks clayg!)
*** Please search the original bug report with "<==="

Then, I would like to explain my expectations for these behaviors.
(2) looks good. (Samuel's comment explains this perfectly.)
(1) The response code should be changed from 500 to something else.
       My idea for this case is to return 200, as in Test Case 2, and
       send a zero-byte body for the missing segment (but it is not necessary to send
       the 3rd segment).
       The client then checks the object with the etag or content-length.

Revision history for this message
Hisashi Osanai (osanai-hisashi) wrote :

The above table was corrupted for viewing, so I put the same table on paste.openstack.org:
http://paste.openstack.org/show/126246/

Revision history for this message
clayg (clay-gerrard) wrote :

Awesome! I'm honestly not sure what causes the 500 - are we explicitly waiting to call start_response until we GET the first segment and then returning an HTTPServerError on purpose, or is there just some traceback blowing up?

We need logs...

Revision history for this message
Hisashi Osanai (osanai-hisashi) wrote :

Thanks for the comment.

In swift/common/request_helpers.py (L339-), SLO raises SegmentError, which is then
caught by the catch_errors middleware and mapped to 500.

                if not is_success(seg_resp.status_int): <===point 1
                    close_if_possible(seg_resp.app_iter)
                    raise SegmentError( <===point 2
                        'ERROR: While processing manifest %s, '
                        'got %d while retrieving %s' %
                        (self.name, seg_resp.status_int, seg_path))

I think it would be good to review the cases in the code where SegmentError is raised.
If this is recognized as a bug, I will check them.
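
For reference, here is a minimal sketch of the pattern involved (this is not Swift's
actual catch_errors code, just an illustration of why an error raised while producing
the first body chunk can still become a 500, while an error on a later chunk cannot):

    import itertools

    class CatchErrorsLike(object):
        def __init__(self, app):
            self.app = app

        def __call__(self, environ, start_response):
            captured = {}

            def capture(status, headers, exc_info=None):
                captured['status'] = status
                captured['headers'] = headers

            try:
                app_iter = iter(self.app(environ, capture))
                # Pull the first chunk before committing to a status line;
                # an error raised here (e.g. SegmentError) is still catchable.
                first_chunk = next(app_iter, b'')
            except Exception:
                start_response('500 Internal Server Error',
                               [('Content-Type', 'text/plain')])
                return [b'An error occurred\n']

            # Nothing has been sent yet, so the captured status/headers go out now.
            start_response(captured['status'], captured['headers'])
            return itertools.chain([first_chunk], app_iter)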

I have attached the log below; note that the swift version is 1.13.1 instead of 2.1.0 (the logic is the same).

# grep "txa34efe453a504d6cbfa66-00544dc783" /var/log/swift/swift.log
Oct 27 13:18:11 localhost proxy-server: - - 27/Oct/2014/04/18/11 GET /v1/AUTH_683ca2b5c4de4ce4824c5939df04b44a/slotest-container/segobj1%3Fmultipart-manifest%3Dget HTTP/1.0 499 - curl/7.19.7%20%28x86_64-redhat-linux-gnu%29%20libcurl/7.19.7%20NSS/3.14.0.0%20zlib/1.2.3%20libidn/1.18%20libssh2/1.4.2%20%20SLO%20MultipartGET MIInHAYJKoZIhvcN... - 70 - txa34efe453a504d6cbfa66-00544dc783 - 0.0147 SLO - 1414383491.511203051 1414383491.525928020
Oct 27 13:18:11 localhost proxy-server: ERROR: While processing manifest /v1/AUTH_683ca2b5c4de4ce4824c5939df04b44a/slotest-container/slotest-maniobj, got 404 while retrieving /v1/AUTH_683ca2b5c4de4ce4824c5939df04b44a/slotest-container/segobj1: #012Traceback (most recent call last):#012 File "/usr/lib/python2.6/site-packages/swift/common/request_helpers.py", line 307, in __iter__#012 (self.name, seg_resp.status_int, seg_path))#012SegmentError: ERROR: While processing manifest /v1/AUTH_683ca2b5c4de4ce4824c5939df04b44a/slotest-container/slotest-maniobj, got 404 while retrieving /v1/AUTH_683ca2b5c4de4ce4824c5939df04b44a/slotest-container/segobj1 (txn: txa34efe453a504d6cbfa66-00544dc783)
Oct 27 13:18:11 localhost catch_errors: Error: An error occurred: #012Traceback (most recent call last):#012 File "/usr/lib/python2.6/site-packages/swift/common/middleware/catch_errors.py", line 36, in handle_request#012 resp = self._app_call(env)#012 File "/usr/lib/python2.6/site-packages/swift/common/wsgi.py", line 528, in _app_call#012 first_chunk = resp.next()#012 File "/usr/lib/python2.6/site-packages/swift/common/middleware/proxy_logging.py", line 247, in iter_response#012 chunk = iterator.next()#012 File "/usr/lib/python2.6/site-packages/swift/common/request_helpers.py", line 307, in __iter__#012 (self.name, seg_resp.status_int, seg_path))#012SegmentError: ERROR: While processing manifest /v1/AUTH_683ca2b5c4de4ce4824c5939df04b44a/slotest-container/slotest-maniobj, got 404 while retrieving /v1/AUTH_683ca2b5c4de4ce4824c5939df04b44a/slotest-container/segobj1 (txn: txa34efe453a504d6cbfa66-00544dc783)

Revision history for this message
clayg (clay-gerrard) wrote :

Well, I think I'd be fine with a missing segment always returning 200. Letting the segment error bubble all the way out to catch_errors is not optimal, and I would argue that, regardless of client adoption, returning a 500 on the first segment can't be classified as pre-existing, well-defined behavior, so it is reasonable to change. As you've accurately described, any client that was handling the 500 case should also be handling the 200 incomplete-read/etag error, so I think we can fix this without worrying about a well-behaved client getting broken any more than it already was.

Changed in swift:
status: Incomplete → Confirmed
importance: Undecided → Low
Revision history for this message
Hisashi Osanai (osanai-hisashi) wrote :

Thanks for the comment.
I will try to make a patch, so I would like to take this bug. Is that OK?

Changed in swift:
assignee: nobody → Hisashi Osanai (osanai-hisashi)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/136258

Changed in swift:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/136258
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=5ca49ca92485b6ba868544f12fa524d9d7b666c6
Submitter: Jenkins
Branch: master

commit 5ca49ca92485b6ba868544f12fa524d9d7b666c6
Author: Hisashi Osanai <email address hidden>
Date: Wed Nov 26 05:25:01 2014 +0900

    Fix the GET's response code when there is a missing segment in LO

    This patch changes the response code from Internal Server Error to
    Conflict when there is a missing segment and the position is first.

    Co-Authored-By: Samuel Merritt <email address hidden>
    Closes-Bug: #1386568
    Change-Id: Iac175b4dc6ac9081436738697a27fe669acce0eb

Changed in swift:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/ec)

Fix proposed to branch: feature/ec
Review: https://review.openstack.org/148983

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/ec)

Reviewed: https://review.openstack.org/148983
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=ef59cde83f176df9e36064f70142c9d8e81318fe
Submitter: Jenkins
Branch: feature/ec

commit 6f21504ccc9046e3f0b4db88f78297a00030dd3d
Author: Kota Tsuyuzaki <email address hidden>
Date: Tue Jan 13 05:34:37 2015 -0800

    Fix missing content length of Response

    This patch fixes swob.Response to set missing content
    length correctly.

    When a child class of swob.Response is initialized with
    both "body" and "headers" arguments which includes content
    length, swob.Response might loose the acutual content length
    generated from the body because "headers" will overwrite the
    content length property after the body assignment.

    It'll cause the difference between headers's content length
    and acutual body length. This would affect mainly 3rd party
    middleware(s) to make an original response as follows:

    req = swob.Request.blank('/')
    req.method = 'HEAD'
    resp = req.get_response(app)
    return HTTPOk(body='Ok', headers=resp.headers)

    This patch changes the order of headers updating and then
    fixes init() to set correct content length.

    Change-Id: Icd8b7cbfe6bbe2c7965175969af299a5eb7a74ef

commit b434be452ead0625728afedfe01bac1c30629d30
Author: Donagh McCabe <email address hidden>
Date: Thu Jan 8 14:52:32 2015 +0000

    Use TCP_NODELAY on outgoing connections

    On a loopback device (e.g., when proxy-server and object-server are on
    same node), PUTs in the range 64-200K may experience a delay due to the
    effect of Nagel interacting with the loopback MTU of 64K.

    This effect has been directly seen by Mark Seger and Rick Jones on a
    proxy-server to object-server PUT. However, you could expect to see a
    similar effect on replication via ssync if the object being replicated
    is on a different drive on the same node.

    A prior change [1] related to Nagel set TCP_NODELAY on responses. This change
    sets it on all outgoing connections.

    [1] I11f86df1f56fba1c6ab6084dc1f580c395f072dc

    Change-Id: Ife8885a42b289a5eb4ac7e4698f8889858bc8b7e
    Closes-bug: 1408622

commit b5586427e503ee22c0b20b109cad83e166ed3fd8
Author: Pete Zaitcev <email address hidden>
Date: Sat Jan 10 17:14:46 2015 -0700

    Drop redundant index output

    The output of format_device() now includes index as the first "dX"
    element, for example d1r1z2-127.0.0.1:6200R127.0.0.1:6200/db_"".

    Change-Id: Ib5f8e3a692fddbe8b1f4994787b2883130e9536f

commit c65bc49e099928801b80dce399d6098f7e10e137
Author: Pete Zaitcev <email address hidden>
Date: Sat Jan 10 08:20:25 2015 -0700

    Mark the --region as mandatory

    We used to permit to omit region in the old parameter syntax, although
    we now throw a warning if it's missing. In the new parameter syntax,
    --region is mandatory. It's enforced by build_dev_from_opts in
    swift/common/ring/utils.py.

    On the other hand, --replication-ip, --replication-port, and --meta
    are not obligatory.

    Change-Id: Ia70228f2c99595501271765286431f68e82e800b
...

Thierry Carrez (ttx)
Changed in swift:
milestone: none → 2.2.2
status: Fix Committed → Fix Released