The response code of a GET on an SLO changes depending on the position of the missing segment
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Fix Released | Low | Hisashi Osanai | 2.2.2
Bug Description
[Description]
The response code of a GET on an SLO changes depending on the position of the missing segment.
I think SLO should behave the same way when users download an object,
regardless of the position of the missing segment.
[Version details]
swift 2.1.0
[Crystal clear details to reproduce the bug]
preparation:
AUTH_TOKEN=
ENDPOINT=
CONTAINER=
MANIOBJECT=
SEGOBJECT1=
SEGOBJECT2=
SEGOBJECT3=
(1) create files for the segments
python -c 'open("./seg1", "w").write(
python -c 'open("./seg2", "w").write(
python -c 'open("./seg3", "w").write(
(2) create a container
curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" ${ENDPOINT}
(3) upload the segments
curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./seg1" ${ENDPOINT}
curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./seg2" ${ENDPOINT}
curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./seg3" ${ENDPOINT}
(4) upload the manifest object
echo "[
{
},
{
},
{
}
]" > "./manilistfile
curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./manilistfile
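The preparation commands above are truncated in this report. A minimal sketch of what they likely looked like, assuming 1 MiB segments (matching the ls -l sizes below) and the standard SLO manifest format uploaded with the multipart-manifest=put query parameter; the fill characters, the use of md5sum for the etags, and the exact object paths are assumptions, not taken from the original report:

# create three 1 MiB segment files
python -c 'open("./seg1", "w").write("a" * 1048576)'
python -c 'open("./seg2", "w").write("b" * 1048576)'
python -c 'open("./seg3", "w").write("c" * 1048576)'

# create the container and upload the segments
curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" "${ENDPOINT}/${CONTAINER}"
curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./seg1" "${ENDPOINT}/${CONTAINER}/${SEGOBJECT1}"
curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./seg2" "${ENDPOINT}/${CONTAINER}/${SEGOBJECT2}"
curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./seg3" "${ENDPOINT}/${CONTAINER}/${SEGOBJECT3}"

# build the manifest: one entry per segment with path, etag and size_bytes
cat > ./manilistfile.json <<EOF
[
  {"path": "/${CONTAINER}/${SEGOBJECT1}", "etag": "$(md5sum ./seg1 | cut -d' ' -f1)", "size_bytes": 1048576},
  {"path": "/${CONTAINER}/${SEGOBJECT2}", "etag": "$(md5sum ./seg2 | cut -d' ' -f1)", "size_bytes": 1048576},
  {"path": "/${CONTAINER}/${SEGOBJECT3}", "etag": "$(md5sum ./seg3 | cut -d' ' -f1)", "size_bytes": 1048576}
]
EOF

# upload the manifest object itself with multipart-manifest=put
curl -i -X PUT -H "X-Auth-Token: ${AUTH_TOKEN}" -T "./manilistfile.json" \
     "${ENDPOINT}/${CONTAINER}/${MANIOBJECT}?multipart-manifest=put"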
test case 1 (in case of deleting the first object):
(1) delete the first object
curl -i -X DELETE -H "X-Auth-Token: ${AUTH_TOKEN}" ${ENDPOINT}
(2) download the manifest object
curl -v -X GET -H "X-Auth-Token: ${AUTH_TOKEN}" -o "./test02_
=== output ===
* About to connect() to 10.124.121.74 port 80 (#0)
* Trying 10.124.121.74... connected
* Connected to 10.124.121.74 (10.124.121.74) port 80 (#0)
> GET /v1/AUTH_
> User-Agent: curl/7.19.7 (x86_64-
> Host: 10.124.121.74
> Accept: */*
> X-Auth-Token: snip
>
% Total % Received % Xferd Average Speed Time Time Time Current
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0< HTTP/1.1 500 Internal Server Error <=== 500
< Server: GlassFish Server Open Source Edition 4.1
< X-Trans-Id: txa34efe453a504
< Date: Mon, 27 Oct 2014 04:18:11 GMT
< Connection: close
< Content-Type: text/plain
< Content-Length: 17
<
{ [data not shown]
0 17 0 17 0 0 145 0 --:--:-- --:--:-- --:--:-- 217* Closing connection #0
# ls -l
-rw-r--r-- 1 root root 9568 10月 27 13:10 2014 eplist.json
-rw-r--r-- 1 root root 14536 10月 27 13:10 2014 eplist_format.json
-rw-r--r-- 1 root root 423 10月 27 13:13 2014 manilistfile.json
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg1
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg2
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg3
-rw-r--r-- 1 root root 3145728 10月 27 13:16 2014 test01_no_seg_lost
-rw-r--r-- 1 root root 17 10月 27 13:18 2014 test02_1st_seg_lost <=== failed to get the first object
===
test case 2 (in case of deleting the second object):
(1) delete the second object
curl -i -X DELETE -H "X-Auth-Token: ${AUTH_TOKEN}" ${ENDPOINT}
(2) download the manifest object
curl -v -X GET -H "X-Auth-Token: ${AUTH_TOKEN}" -o "./test03_
=== output ===
* About to connect() to 10.124.121.74 port 80 (#0)
* Trying 10.124.121.74... connected
* Connected to 10.124.121.74 (10.124.121.74) port 80 (#0)
> GET /v1/AUTH_
> User-Agent: curl/7.19.7 (x86_64-
> Host: 10.124.121.74
> Accept: */*
> X-Auth-Token: snip
>
% Total % Received % Xferd Average Speed Time Time Time Current
0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0< HTTP/1.1 200 OK <=== 200
< Server: GlassFish Server Open Source Edition 4.1
< Accept-Ranges: bytes
< Last-Modified: Mon, 27 Oct 2014 04:13:24 GMT
< Etag: "9b2371314d9a71
< X-Timestamp: 1414383203.08729
< X-Static-
< X-Trans-Id: tx03ceaa2d161c4
< Date: Mon, 27 Oct 2014 04:24:33 GMT
< Connection: keep-alive
< Content-Type: application/
< Content-Length: 3145728
<
{ [data not shown]
33 3072k 33 1024k 0 0 35015 0 0:01:29 0:00:29 0:01:00 0* transfer closed with 2097152 bytes remaining to read
33 3072k 33 1024k 0 0 34394 0 0:01:31 0:00:30 0:01:01 0* Closing connection #0
curl: (18) transfer closed with 2097152 bytes remaining to read
# ls -l
-rw-r--r-- 1 root root 9568 10月 27 13:10 2014 eplist.json
-rw-r--r-- 1 root root 14536 10月 27 13:10 2014 eplist_format.json
-rw-r--r-- 1 root root 423 10月 27 13:13 2014 manilistfile.json
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg1
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg2
-rw-r--r-- 1 root root 1048576 10月 27 13:11 2014 seg3
-rw-r--r-- 1 root root 3145728 10月 27 13:16 2014 test01_no_seg_lost
-rw-r--r-- 1 root root 3145728 10月 27 13:20 2014 test01_
-rw-r--r-- 1 root root 17 10月 27 13:18 2014 test02_1st_seg_lost
-rw-r--r-- 1 root root 1048576 10月 27 13:25 2014 test03_2nd_seg_lost <=== got the first object
===
[Test environment details]
-
[Actual results]
Test case 1 (first segment missing) returns 500 Internal Server Error;
test case 2 (second segment missing) returns 200 OK and the body is truncated after the first segment.
[Expected results]
In both cases the response code should be 200 (the same as DLO),
with 0 bytes returned for the missing segment.
I would like to have this behavior if possible.
Changed in swift:
  assignee: nobody → Hisashi Osanai (osanai-hisashi)
Changed in swift:
  milestone: none → 2.2.2
  status: Fix Committed → Fix Released
The only way to achieve this is to HEAD each segment to ensure it is present* prior to serving the object. Otherwise, the knowledge that a segment is missing comes too late to do any good.
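For illustration only (this is not what the SLO middleware actually does), such up-front validation for the three-segment manifest above would amount to something like the following, reusing the variables from the reproduction steps:

# hypothetical pre-check: HEAD every referenced segment before sending the status line
for seg in "${SEGOBJECT1}" "${SEGOBJECT2}" "${SEGOBJECT3}"; do
    curl -s -o /dev/null -I -w "%{http_code}\n" \
         -H "X-Auth-Token: ${AUTH_TOKEN}" "${ENDPOINT}/${CONTAINER}/${seg}"
done

Any response other than 200 would have to turn the whole GET into an error before the first body byte is sent.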
Remember that an HTTP response consists of a status line, then headers, then body. Thus, after the body has started going out to the client, there is no further possibility to change the status or headers.
Now, a SLO manifest can reference up to 1000** segments, so in the worst case, that's 1000 HEAD requests that have to complete prior to sending the status code. If that takes more than a few seconds, many clients will time out.
It's actually even worse than that: a "segment" can be another SLO manifest, in which case the sub-manifest's segments are included recursively. The only restriction here is a depth restriction on the tree formed by SLO manifests, and that depth limit is 10. The end result is that a SLO manifest may reference up to 10^30 segments; even if a particular cluster were able to perform one HEAD request per nanosecond, it would still take roughly 31 trillion years to validate all the segments.
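For reference, the arithmetic behind that estimate: 1000 segments per manifest across 10 levels of nesting gives 1000^10 = 10^30 leaf segments; at one HEAD per nanosecond that is 10^30 ns = 10^21 seconds, and dividing by roughly 3.15 x 10^7 seconds per year gives about 3.2 x 10^13, i.e. on the order of 31 trillion years.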
Unfortunately, this is not possible to fix due to resource constraints.
However, notice that the Content-Length header was set to 3145728 but only 1048576 bytes were received. A client can use this to notice the truncation and handle the error appropriately.
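For example, a client-side check might look like the following sketch (the object name and variables come from the reproduction steps above; error handling is minimal):

# curl exits with code 18 (CURLE_PARTIAL_FILE) when the connection is closed
# before Content-Length bytes have been received
curl -s -X GET -H "X-Auth-Token: ${AUTH_TOKEN}" \
     -o "./downloaded" "${ENDPOINT}/${CONTAINER}/${MANIOBJECT}"
if [ $? -eq 18 ]; then
    echo "download truncated: a segment is probably missing" >&2
fi

Equivalently, the client can compare the Content-Length header against the number of bytes actually written to disk.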
* Yes, there's a race condition here. I am deliberately ignoring it.
** Configurable, 1000 is the default.
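The limit is set in the SLO middleware section of proxy-server.conf; a typical stanza (the value shown is, as far as I know, the default) looks like:

[filter:slo]
use = egg:swift#slo
# maximum number of segments a single manifest may reference
max_manifest_segments = 1000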