Return fast 404s in multi-region cluster
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Won't Fix | Undecided | Unassigned |
Bug Description
In a multi-region cluster, a GET that will end in a 404 goes over the wire to check all remote disks and secondary locations before returning to the client. Ideally, when read affinity is enabled, there would be an additional setting to check only the local region for the object's existence, so the request can fail very fast. In a multi-region replication scenario, such a setting would eliminate a great deal of over-the-wire latency to the remote regions during the lookup.
As Swift clusters continue to extend across multiple regions with object replicas in those regions, customers expect the same response times for their applications. Allowing requests to fail fast would reduce this additional multi-region overhead for 404s.
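For context, a sketch of what this might look like in proxy-server.conf. `read_affinity` and `request_node_count` are existing Swift proxy settings (lowering `request_node_count` already shortens the 404 search, at the cost of checking fewer nodes); the local-only fail-fast flag at the end is purely hypothetical, illustrating the setting this bug asks for:

```ini
[app:proxy-server]
use = egg:swift#proxy
# Existing setting: prefer reads from region 1
read_affinity = r1=100
# Existing setting: how many nodes the proxy tries before giving up
# (default is "2 * replicas"); a lower value returns 404s sooner
request_node_count = 2 * replicas
# Hypothetical setting proposed in this bug (NOT implemented in Swift):
# stop the existence check at the local region's nodes
# fail_fast_404_local_only = true
```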
description: updated
What if the only local copy is unavailable? Surely there is a use-case where, after a 201, we should not return 404 just because we can't find it on the local primary and a couple of local handoffs?
I guess immediately following a PUT there'd be a good chance to find it on the local handoffs if you have write affinity enabled. But most of the time it's stupid to check a handoff until you've checked all the primaries!
Why does the application want Swift to be so quick to tell them it doesn't have something? Isn't the risk of not returning data that we have and that the client asked for worth a little bit of latency? What are the stated goals - just "faster"? I think we could accomplish making 404s faster without even *talking* to the object servers :P
I'm worried this might be an anti-feature. Here's a thing you *could* turn on but don't do it because in practice under failure it leads to clients observing horrible behaviors.
*Maybe* next to read affinity you could add a setting something like "I have 4 replicas globally, check 2 local primaries, then the local handoff that would have been used in the affinity write case, and then finally *one* of the two other primaries in the remote region, but that's it." Still seems scary, but with enough local primaries it's probably workable, and from a network-traversal standpoint it might still be a little better than the default 2 * replicas checks.
-Clay
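The probe order Clay sketches above can be written out as a small Python model. This is not Swift code; the function name, node labels, and the policy itself are hypothetical, just making the "2 local primaries, 1 local handoff, 1 remote primary" idea concrete:

```python
def fail_fast_node_order(local_primaries, remote_primaries, local_handoffs):
    """Return the nodes to probe for a GET under the proposed policy.

    Assumes 4 replicas globally. Instead of the default search of up to
    2 * replicas = 8 nodes, probe at most 4.
    """
    order = []
    # Check up to two local primaries first
    order.extend(local_primaries[:2])
    # Then the local handoff that write affinity might have used
    order.extend(local_handoffs[:1])
    # Finally, exactly one remote primary -- and no more
    order.extend(remote_primaries[:1])
    return order

nodes = fail_fast_node_order(
    local_primaries=["r1z1-obj1", "r1z2-obj2"],
    remote_primaries=["r2z1-obj3", "r2z2-obj4"],
    local_handoffs=["r1z3-handoff1"],
)
print(nodes)  # at most 4 probes instead of the default 2 * replicas = 8
```

The trade-off Clay flags still applies: any node list this short can miss a copy that exists on an unprobed primary or handoff, so the client sees a 404 for data the cluster actually holds.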