Socket leak on proxy->obj when HTTPRequestedRangeNotSatisfiable with erasure code
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | | High | Samuel Merritt |
Bug Description
Issue:
The proxy-server does not close its connections to the object-servers when the client requests a range that is not satisfiable for an object in an erasure code policy. Replica policies are not affected.
How to reproduce on a SAIO:
eval $(swift -A http://
curl -H "X-Auth-Token: $OS_AUTH_TOKEN" $OS_STORAGE_
curl -H "X-Auth-Token: $OS_AUTH_TOKEN" $OS_STORAGE_
for i in $(seq 1 10); do curl -H "X-Auth-Token: $OS_AUTH_TOKEN" $OS_STORAGE_
netstat -tpn | grep -E "127.0.0.1:[0-9]+ +127.0.
# => many CLOSE_WAIT that will never be closed
Romain LE DISEZ (rledisez) wrote : | #1 |
Romain LE DISEZ (rledisez) wrote : | #2 |
FWIW, this is probably not the right way to do it, so I'm not submitting a patch, but it does the job: http://
clayg (clay-gerrard) wrote : | #3 |
ubuntu@saio:~$ sudo netstat -tpn | grep -i close_wait
tcp 0 0 127.0.0.1:54264 127.0.0.3:6037 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:54278 127.0.0.3:6037 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:39970 127.0.0.1:6016 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:46518 127.0.0.1:6017 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:39960 127.0.0.1:6016 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:46508 127.0.0.1:6017 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:47166 127.0.0.2:6026 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:44300 127.0.0.4:6047 CLOSE_WAIT 21011/python
ubuntu@saio:~$ curl -H "X-Auth-Token: $OS_AUTH_TOKEN" $OS_STORAGE_
<html><h1>Requested Range Not Satisfiable<
ubuntu@saio:~$ sudo netstat -tpn | grep -i close_wait
tcp 0 0 127.0.0.1:54288 127.0.0.3:6037 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:41524 127.0.0.2:6027 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:44310 127.0.0.4:6047 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:54264 127.0.0.3:6037 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:39980 127.0.0.1:6016 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:54278 127.0.0.3:6037 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:39970 127.0.0.1:6016 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:46518 127.0.0.1:6017 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:39960 127.0.0.1:6016 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:46508 127.0.0.1:6017 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:47166 127.0.0.2:6026 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:44300 127.0.0.4:6047 CLOSE_WAIT 21011/python
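To see which sockets a single request leaves behind, the two netstat listings can be compared mechanically. The following is a minimal sketch, not part of swift: the helper name `close_wait_sockets` is hypothetical, and it assumes the `netstat -tpn` column order shown above (proto, recv-q, send-q, local, remote, state, pid/program). The sample input is abbreviated from the listings above.

```python
def close_wait_sockets(netstat_output):
    """Return the set of (local, remote) address pairs in CLOSE_WAIT."""
    pairs = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Assumed columns: proto recv-q send-q local remote state pid/program
        if (len(fields) >= 6 and fields[0].startswith("tcp")
                and fields[5] == "CLOSE_WAIT"):
            pairs.add((fields[3], fields[4]))
    return pairs


# Abbreviated samples taken from the listings above.
before = """\
tcp 0 0 127.0.0.1:54264 127.0.0.3:6037 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:39970 127.0.0.1:6016 CLOSE_WAIT 21011/python
"""
after = """\
tcp 0 0 127.0.0.1:54288 127.0.0.3:6037 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:54264 127.0.0.3:6037 CLOSE_WAIT 21011/python
tcp 0 0 127.0.0.1:39970 127.0.0.1:6016 CLOSE_WAIT 21011/python
"""

# Sockets present after the request but not before it.
leaked = close_wait_sockets(after) - close_wait_sockets(before)
for local, remote in sorted(leaked):
    print(local, "->", remote)  # 127.0.0.1:54288 -> 127.0.0.3:6037
```

Applied to the full listings above (8 entries before, 12 after), the same comparison reports four new CLOSE_WAIT sockets for the one 416 response.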
Changed in swift: | |
status: | New → Confirmed |
importance: | Undecided → High |
Changed in swift: | |
assignee: | nobody → Samuel Merritt (torgomatic) |
Fix proposed to branch: master
Review: https:/
Changed in swift: | |
status: | Confirmed → In Progress |
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: master
commit f709eed41b9579c
Author: Samuel Merritt <email address hidden>
Date: Thu Dec 28 14:56:08 2017 -0800
Fix socket leak on 416 EC GET responses.
Sometimes, when handling an EC GET request with a Range header, the
object servers reply 206 to the proxy, but the proxy (correctly)
replies 416 to the client[1]. In that case, the connections to the object
servers were not being closed. This was due to improper error handling
in ECAppIter.
Since ECAppIter is intended to be a WSGI iterable, it expects to have
its close() method called when the caller is done with it. In this
particular case, the caller (ECAppIter.
close() when an exception was raised. Now it is.
[1] consider a 4+2 EC policy with segment size 1024, a 20-byte
object, and a request with "Range: bytes=21-50". The proxy needs whole
fragments to decode, so it asks the object server for "Range:
bytes=0-255" [2], the object server says 206, and then the proxy
realizes that the client's request is unsatisfiable and tells the
client 416.
[2] segment size 1024 and 4 data fragments means the fragments have
size 1024 / 4 = 256, hence "bytes=0-255" asks for the first whole
fragment
Change-Id: Ide2edf8c449c97
Closes-Bug: 1738804
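The pattern the commit message describes (close the WSGI iterable when an exception prevents it from being handed back to the caller) can be illustrated in isolation. This is a generic sketch, not swift's actual code: `FakeBackendIter`, `RangeNotSatisfiable`, and `prepare_response` are hypothetical names invented for the example.

```python
class FakeBackendIter:
    """Hypothetical stand-in for an iterable that owns backend connections."""

    def __init__(self):
        self.closed = False

    def __iter__(self):
        return iter([b"chunk"])

    def close(self):
        # A real iterable would close its backend sockets here.
        self.closed = True


class RangeNotSatisfiable(Exception):
    """Hypothetical error for a client range outside the object."""


def prepare_response(app_iter, client_range_ok):
    """Return app_iter for the caller to consume, or close it on failure."""
    try:
        if not client_range_ok:
            raise RangeNotSatisfiable()
        return app_iter
    except Exception:
        # The iterable will never reach the WSGI server, so nobody else
        # will call close(); do it here before re-raising.
        app_iter.close()
        raise


it = FakeBackendIter()
try:
    prepare_response(it, client_range_ok=False)
except RangeNotSatisfiable:
    pass
print(it.closed)  # True
```

Without the `except` clause, the iterable in this sketch would be dropped unclosed on the error path, which is the analogue of the connections left in CLOSE_WAIT.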
Changed in swift: | |
status: | In Progress → Fix Released |
Fix proposed to branch: feature/s3api
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: feature/s3api
commit 88eea33ccd1875a
Author: Clay Gerrard <email address hidden>
Date: Thu Jan 11 13:36:09 2018 -0800
Recenter builder test expectation around random variance
... in order to make the test pass with more seeds and fail less
frequently in the gate.
Change-Id: I059e80af87fd33
commit d924fa759967b7c
Author: Samuel Merritt <email address hidden>
Date: Tue Jan 16 22:19:09 2018 -0800
Remove old post-as-copy leftovers from tests.
Since commit 1e79f828, we don't need to test with post_as_copy=True
any more since we haven't got post_as_copy at all.
Change-Id: I9c96ce0b812d87
commit dfa0c4e604fb931
Author: Alistair Coles <email address hidden>
Date: Wed Jan 17 12:04:45 2018 +0000
Preserve expiring object behaviour with old proxy-server
The related change [1] causes expiring object records to no longer be
created if the X-Delete-
server, but old proxies prior to [2] (i.e. releases prior to 1.9.0)
did not send this header.
The goal of [1] can be alternatively achieved by making expiring
object record creation be conditional on the X-Delete-At-Host header.
[1] Related-Change: I20fc2f42f590fd
[2] Related-Change: Id0873a3f2198ce
Change-Id: Ia0081693f01631
commit d707fc7b6d0ceb4
Author: Clay Gerrard <email address hidden>
Date: Tue Jan 16 16:30:13 2018 -0800
DRY out tests until the stone bleeds
Can we go deeper!?
Change-Id: Ibd3b06542aa1bf
commit ba8f1b1c3786df4
Author: Alistair Coles <email address hidden>
Date: Wed Jan 17 15:25:33 2018 +0000
Fix intermittent unit test failure
test_
failing intermittently due to rounding of float time
values.
Change-Id: Ia126ad6988f387
Closes-Bug: #1743804
commit e747f94313f315f
Author: Kota Tsuyuzaki <email address hidden>
Date: Wed Dec 27 14:37:29 2017 +0900
Fix InternalClient to drain response body if the request fails
If we don't drain the body, the proxy logging in the internal client
pipeline will log 499 client disconnect instead of actual error response
code.
For error responses, we try to do the most helpful thing using swob's
closing and caching response body attribute. For non-error responses
which are returned to the client, we endeavour to keep the app_iter
intact and unconsumed; expecting the caller to do the right
thing is the only reasonable interface. We must cleanly close any WSGI
app_iter which we do not return to the client rega...
tags: | added: in-feature-s3api |
Fix proposed to branch: feature/deep
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: feature/deep
commit ddb13aa5eab03b6
Author: vxlinux <email address hidden>
Date: Sat Jan 20 17:23:35 2018 +0800
Remove redundant blank space in README.rst
Change-Id: If347476e3b9185
commit 12f874534925b52
Author: vxlinux <email address hidden>
Date: Fri Jan 19 16:54:26 2018 +0800
Add Docstrings to validate_
New common functions should have Docstrings
Change-Id: Icbb3cdf38509fd
commit d2034cd7b694682
Author: Clay Gerrard <email address hidden>
Date: Tue Jan 16 17:03:38 2018 -0800
Keep object-updater stats logging consistent
If we're going to encapsulate the stats tracking it seems reasonable if
we ever add any more metrics we can reduce the number of places we need
to update log messages.
Change-Id: I187cf6cfec1e0a
commit cd2c73fd955317a
Author: Tim Burke <email address hidden>
Date: Tue Jan 16 01:07:35 2018 +0000
internal_
This boils down to 404, 412, or 416; or 409 when we provided an
X-Timestamp.
This means, among other things, that the expirer won't issue 3 DELETEs
every cycle for every stale work item.
Related-Change: Icd63c80c73f864
Change-Id: Ie5f2d3824e040b
commit 222df9185782f59
Author: chengebj5238 <email address hidden>
Date: Thu Jan 18 17:03:11 2018 +0800
Modify redirection URL and broken URL
Change-Id: I9a04cb2fbe61e1
commit d1656e334959e09
Author: Tim Burke <email address hidden>
Date: Fri Jan 12 13:17:45 2018 -0800
slo: Send ETag header in 206 responses
Why weren't we doing that before?? The etag should be the same as for
GET/HEAD, and by sending it, we can assure resuming clients that they're
downloading the same object even if they didn't include an If-Match
header.
Change-Id: I4ccbd1ae3a909e
Related-Change: Ic11662eb5c7176
commit 88eea33ccd1875a
Author: Clay Gerrard <email address hidden>
Date: Thu Jan 11 13:36:09 2018 -0800
Recenter builder test expectation around random variance
... in order to make the test pass with more seeds and fail less
frequently in the gate.
Change-Id: I059e80af87fd33
commit f64c00b00aa8df3
Author: Samuel Merritt <email address hidden>
Date: Fri Jan 12 07:17:18 2018 -0800
Improve object-updater's stats logging
The object updater has five different stats, but its logging only told
you two of them (successes and failures), and it only told you after
finishing all the async_pendings for a device. If y...
tags: | added: in-feature-deep |
This issue was fixed in the openstack/swift 2.17.0 release.
Fix proposed to branch: stable/pike
Review: https:/
Reviewed: https:/
Committed: https:/
Submitter: Zuul
Branch: stable/pike
commit 8ff69e69bf4be0a
Author: Samuel Merritt <email address hidden>
Date: Thu Dec 28 14:56:08 2017 -0800
Fix socket leak on 416 EC GET responses.
Sometimes, when handling an EC GET request with a Range header, the
object servers reply 206 to the proxy, but the proxy (correctly)
replies 416 to the client[1]. In that case, the connections to the object
servers were not being closed. This was due to improper error handling
in ECAppIter.
Since ECAppIter is intended to be a WSGI iterable, it expects to have
its close() method called when the caller is done with it. In this
particular case, the caller (ECAppIter.
close() when an exception was raised. Now it is.
[1] consider a 4+2 EC policy with segment size 1024, a 20-byte
object, and a request with "Range: bytes=21-50". The proxy needs whole
fragments to decode, so it asks the object server for "Range:
bytes=0-255" [2], the object server says 206, and then the proxy
realizes that the client's request is unsatisfiable and tells the
client 416.
[2] segment size 1024 and 4 data fragments means the fragments have
size 1024 / 4 = 256, hence "bytes=0-255" asks for the first whole
fragment
Change-Id: Ide2edf8c449c97
Closes-Bug: 1738804
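The arithmetic in footnotes [1] and [2] can be worked through in a few lines. This is a simplified sketch that follows the commit message's own numbers (fragment size = segment size / number of data fragments); the function names are hypothetical and this is not the code swift uses.

```python
def fragment_range_for(client_start, client_end, segment_size, n_data_frags):
    """Map a client byte range to the whole-fragment range requested
    from each object server (simplified, per the footnotes above)."""
    fragment_size = segment_size // n_data_frags
    first_segment = client_start // segment_size
    last_segment = client_end // segment_size
    frag_start = first_segment * fragment_size
    frag_end = (last_segment + 1) * fragment_size - 1
    return frag_start, frag_end


def client_range_satisfiable(client_start, object_length):
    """A byte range is satisfiable only if it starts inside the object."""
    return client_start < object_length


# 4+2 policy, segment size 1024, 20-byte object, "Range: bytes=21-50"
frag_start, frag_end = fragment_range_for(21, 50, 1024, 4)
print("bytes=%d-%d" % (frag_start, frag_end))  # bytes=0-255
print(client_range_satisfiable(21, 20))        # False
```

The backend range starts at byte 0 of the fragment, so the object server can answer 206, while the client's own range starts past the end of the 20-byte object, so the proxy must answer 416.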
tags: | added: in-stable-pike |
This issue was fixed in the openstack/swift 2.15.2 release.
After a few more tests, it seems the range value is important.
If the range begin is lower than 1M (1024*1024), it leaks:
for i in $(seq 1 10); do curl -H "X-Auth-Token: $OS_AUTH_TOKEN" $OS_STORAGE_URL/range_leak_ec/obj -H "Range: bytes=1048575-1048580" ; echo; done
If the range begin is greater than or equal to 1M, it does not leak:
for i in $(seq 1 10); do curl -H "X-Auth-Token: $OS_AUTH_TOKEN" $OS_STORAGE_URL/range_leak_ec/obj -H "Range: bytes=1048576-1048580" ; echo; done