GET response returns 404 even if all primaries ERROR

Bug #1646933 reported by Chris Auston
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned
Milestone:

Bug Description

Below is a snippet of proxy and container server logs for a container GET that returned 404 despite connection timeouts to all three primaries. The container server log lines are from the first three handoffs for this container. Without concurrent_get it seems like this would have returned a 503 instead? I am curious what y'all think should happen in this situation. It seems to me that if all primaries failed and there is no 200 response from a handoff, then a 503 might be more intuitive.

Nov 25 13:10:30 proxy-b swift-proxy: ERROR with Container server 1.2.3.4:6001/sdi re: Trying to GET /v1/AUTH_abcdefab-abcd-abcd-abcd-abcdefabcdef/container: ConnectionTimeout (1.0s) (txn: txbc7d72d60ccf49959df2f-0058388ca5) (client_ip: 127.0.0.1)
Nov 25 13:10:30 proxy-b swift-proxy: ERROR with Container server 1.2.3.4:6001/sdi re: Trying to GET /v1/AUTH_abcdefab-abcd-abcd-abcd-abcdefabcdef/container: ConnectionTimeout (1.0s) (txn: txbc7d72d60ccf49959df2f-0058388ca5) (client_ip: 127.0.0.1)
Nov 25 13:10:31 proxy-b swift-proxy: ERROR with Container server 1.2.3.4:6001/sdb re: Trying to GET /v1/AUTH_abcdefab-abcd-abcd-abcd-abcdefabcdef/container: ConnectionTimeout (1.0s) (txn: txbc7d72d60ccf49959df2f-0058388ca5) (client_ip: 127.0.0.1)
Nov 25 13:10:31 proxy-b swift-proxy: 127.0.0.1 127.0.0.1 25/Nov/2016/19/10/31 GET /v1/AUTH_abcdefab-abcd-abcd-abcd-abcdefabcdef/container%3Fformat%3Dxml%26prefix%3DDATA%252F%26limit%3D10000 HTTP/1.0 404 - java-app/1.9.2%2C%20app-info account:user%2CAU... - 70 - txbc7d72d60ccf49959df2f-0058388ca5 - 2.0351 - - 1480101029.332170963 1480101031.367302895 0

Nov 25 13:10:30 meta-c swift-container: 2.3.4.5 - - [25/Nov/2016:19:10:30 +0000] "GET /sdg/52577/AUTH_abcdefab-abcd-abcd-abcd-abcdefabcdef/container" 404 - "GET http://lb.tld/v1/AUTH_abcdefab-abcd-abcd-abcd-abcdefabcdef/container?format=xml&prefix=DATA%2F&limit=10000" "txbc7d72d60ccf49959df2f-0058388ca5" "proxy-server 7198" 0.0005 "-" 166835 0
Nov 25 13:10:30 meta-c swift-container: 2.3.4.5 - - [25/Nov/2016:19:10:30 +0000] "GET /sdb/52577/AUTH_abcdefab-abcd-abcd-abcd-abcdefabcdef/container" 404 - "GET http://lb.tld/v1/AUTH_abcdefab-abcd-abcd-abcd-abcdefabcdef/container?format=xml&prefix=DATA%2F&limit=10000" "txbc7d72d60ccf49959df2f-0058388ca5" "proxy-server 7198" 0.0009 "-" 111082 0
Nov 25 13:10:30 meta-c swift-container: 2.3.4.5 - - [25/Nov/2016:19:10:30 +0000] "GET /sdh/52577/AUTH_abcdefab-abcd-abcd-abcd-abcdefabcdef/container" 404 - "GET http://lb.tld/v1/AUTH_abcdefab-abcd-abcd-abcd-abcdefabcdef/container?format=xml&prefix=DATA%2F&limit=10000" "txbc7d72d60ccf49959df2f-0058388ca5" "proxy-server 7198" 0.0005 "-" 232150 0

clayg (clay-gerrard) wrote :

How many devices do you have in the cluster? I'm struggling to parse those log lines - were there 3 requests or 6?

After the primary timeouts, there should have been some additional handoff requests.

I'm not super convinced [Timeout(), Timeout(), Timeout(), 404, 404, 404] != 404

possibly related in diagnosing the issue:

https://review.openstack.org/#/c/406392
https://review.openstack.org/#/c/286589

Chris Auston (cfauston) wrote :

I don't have an exact count, but it is probably over 3000 disks.
There were six requests; sorry for the terse initial description, it's been a long day.

The replica count is 3. The ERRORs above are the three primaries. The requests that made it to container servers are the first 3 handoffs.

Matthew Oliver (matt-0) wrote :

I can see why this happened. You fired off (in the end) 6 requests to the container backend. The first 3 timed out but the next 3 got 404s, so as Clay mentioned you got:

  [Timeout(), Timeout(), Timeout(), 404, 404, 404]

All it takes is 1 good response to serve up the data. Because that didn't happen, swift looks at the responses it did get and finds the best one to send back. If we have a quorum of some status, that is considered the best.

The quorum size is 2 (because you have a replica count of 3).
The proxy checks for a quorum in HTTP status code ranges: 100s, 200s, 300s, and then 400s. If it finds a quorum it returns that status; if not, it returns a 503. In your case the 404s win... and really I think they should: you got 3 valid 404s, albeit from handoff nodes.
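
Roughly, the selection just described looks like the sketch below. This is a minimal illustration, not Swift's actual best_response implementation; the function and parameter names are made up, and it assumes timed-out nodes show up as 503s:

    def pick_best_status(statuses, replica_count=3):
        # Simplified sketch of the quorum-based selection described above;
        # not Swift's actual best_response code, names are illustrative only.
        quorum = replica_count // 2 + 1           # 2 when replica count is 3
        for lower in (100, 200, 300, 400):        # preferred ranges, in order
            in_range = [s for s in statuses if lower <= s < lower + 100]
            if len(in_range) >= quorum:
                # most common status within the winning range
                return max(set(in_range), key=in_range.count)
        return 503                                # no quorum anywhere

    # three primary timeouts (recorded as 503s) plus three handoff 404s:
    print(pick_best_status([503, 503, 503, 404, 404, 404]))  # -> 404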

The fact is, what if the container doesn't exist? Then it is a 404; if we can't find something, then we shouldn't return a 503. At least that's my $0.02.

Chris Auston (cfauston) wrote :

@matt-0 I agree with your analysis, but I can see the case for treating this as a 503. Without concurrent_get I think only one primary is tried, and if that was a Timeout then a 503 would be returned. IMO the results from a handoff are a long shot and should only be used if something is found that would otherwise return an error or not-found. In this case we could detect that all primaries failed and the container wasn't found on handoffs, so we should return a 50x so the client can try again later. A 404 confuses and panics users who know their container should exist.

I think a 50x code would also help correlate these events with errors for monitoring. If you see counts for 503s and errors going up, then it might indicate a need to scale up. It would also help highlight that customers are being impacted and something needs to be done. The way it is now, a monitoring system would have to collect Timeouts by transaction ID and then look at a ring to determine if they were all primaries.

clayg (clay-gerrard) wrote :

> Without concurrent get I think only one primary is tried and if that was a Timeout then a 503 would be returned.

I don't believe this is true:

1) I think without concurrent gets we'd still make at least six or so requests

2) I don't think best_response cares if concurrent gets generated the responses or not

The nature and cardinality of this report are somewhat different depending on who's right.

I think I'm right, and if that holds - then we only have the one bug...

The bug as I understand it has to do with best_response putting some sort of priority or preference on the primary nodes and making better decisions based on the responses - which seems reasonable to me.

Also, Matt and I have both confirmed that issue, so this is complete AFAIK.

summary: - concurrent_get returning 404 after 3 primary errors
+ GET response returns 404 even if all primaries ERROR
Changed in swift:
status: New → Confirmed
importance: Undecided → Medium
Chris Auston (cfauston) wrote :

Ok, I think you are right, too. I thought the waitfirst below was only called once, but in an error situation the green thread returns False so the loop would iterate again.

    for node in nodes:
        pile.spawn(self._make_node_request, node, node_timeout,
                   self.app.logger.thread_locals)
        _timeout = self.app.concurrency_timeout \
            if pile.inflight < self.concurrency else None
        if pile.waitfirst(_timeout):
            break
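
A toy illustration of why that loop keeps going (FakePile below is a made-up stand-in, not Swift's GreenAsyncPile): waitfirst() only breaks out of the loop when a spawned request produces a usable (truthy) result, so failed primaries just let the loop move on to the next node:

    class FakePile(object):
        # Hypothetical stand-in for the pile above, not Swift code: each
        # waitfirst() call just returns the next canned result.
        def __init__(self, results):
            self._results = iter(results)

        def spawn(self, func, *args):
            pass  # pretend to start a backend request

        def waitfirst(self, timeout):
            return next(self._results)

    # three primaries "fail" (False), then the first handoff "succeeds" (True)
    pile = FakePile([False, False, False, True])
    for node in ['primary1', 'primary2', 'primary3', 'handoff1', 'handoff2']:
        pile.spawn(None, node)
        if pile.waitfirst(None):
            print('got a usable response while on %s' % node)  # -> handoff1
            break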

Thanks for updating the info.

Andrey Groshrev (greenx) wrote :

The same thing happens when fetching objects under overload.
If all the primary nodes fail to answer because of timeouts, and the handoff nodes honestly reply "404", then the client receives the "best answer" from [404, 404, 404].
As I understand it, the key is somewhere here:

proxy/controllers/base.py
... snip ...
    def GETorHEAD_base(self, req, server_type, node_iter, partition, path,
                       concurrency=1, client_chunk_size=None):
... snip ...
        res = handler.get_working_response(req)
        if not res:
            res = self.best_response(
                req, handler.statuses, handler.reasons, handler.bodies,
                '%s %s' % (server_type, req.method),
                headers=handler.source_headers)
If "res" is empty after call "get_working_response", then means:
1. We send all requests in the quantity = "request_node_count".
2. No response has been received the code in [200 to 399].
3. handler.statuses contains only 4xx or 5xx, but best_response ignoring 5xx codes.

Proceeding from all this, we need to somehow decide what to return:
404 - because the file does not exist anywhere, or
503 - if timeouts were mixed in with the 404s.
The only criterion I see so far:
if the number of entries in "handler.statuses" equals the number of nodes from node_iter:
  then use best_response(...),
  else return 503.
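
As a rough sketch of that criterion (a hypothetical helper, not actual Swift code; the names and the assumption that timed-out nodes leave no entry in handler.statuses are mine):

    def choose_final_status(statuses, nodes_asked, best_status):
        # Hypothetical sketch of the criterion above: only trust the
        # best_response result (e.g. a 404 quorum) when every node that was
        # asked actually produced a status.
        if len(statuses) == nodes_asked:
            return best_status   # everyone answered; the 404s are believable
        return 503               # some nodes never answered; let the client retry

    print(choose_final_status([404, 404, 404], 6, 404))  # -> 503, primaries never answered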

Tim Burke (1-tim-z) wrote :

There have been a bunch of changes related to this over the last year or two; I think it's safe to call this closed -- we should respond 503 in these sorts of situations now.

Return 503 when primary containers can't respond: https://github.com/openstack/swift/commit/563e1671
Ignore 404s from handoffs for objects when calculating quorum: https://github.com/openstack/swift/commit/3189410f
Use less responses from handoffs: https://github.com/openstack/swift/commit/28608222
Client should retry when there's just one 404 and a bunch of errors: https://github.com/openstack/swift/commit/754defc3

Changed in swift:
status: Confirmed → Fix Released