DELETE operation not write affinity aware

Bug #1318375 reported by Caleb Tennis
26
This bug affects 5 people
Affects Status Importance Assigned to Milestone
OpenStack Object Storage (swift)
Fix Released
High
Matthew Oliver

Bug Description

This was discussed previously in #1198926, though the fix applied there just masked the issue by adding a wait for replication to finalize before issuing a DELETE.

That bug dealt with a 2 replica cluster, but in the more common cases, 3 or 4 replicas, it's still an issue.

Examples:

In a 3 region 3 replica cluster, with write affinity enabled, a DELETE operation will return a 404 error because the nodes return (200, 404, 404).

In a 2 region cluster, with 4 replicas, with write affinity enabled, a DELETE operation will return a 503 because the nodes return (200, 200, 404, 404).

I don't know the best course of action here, just noting this as multiple folks have reported it to me.

Revision history for this message
Samuel Merritt (torgomatic) wrote :

One fix might be for a 404 on object DELETE to be treated as successful the same way 2xx is. After all, the object server will write out a tombstone, and replication will eventually ensure that the object is deleted everywhere.

This would increase the chances of DELETE then GET still being able to read the object, but that can happen on master already (DELETE hits primaries, GET finds it on a handoff), so it's probably not a big deal.

Revision history for this message
John Dickinson (notmyname) wrote :

Sam, I think that sounds like a reasonable solution.

Changed in swift:
importance: Undecided → High
Matthew Oliver (matt-0)
Changed in swift:
assignee: nobody → Matthew Oliver (matt-0)
Revision history for this message
Matthew Oliver (matt-0) wrote :

So I have 2 directions I can take this in.

1. Proxy change::
    - Modify the _make_request method in Controller base class to check for DELETE method and change a 404 into a 200, there by changing all DELETES for account, container and objs but making it work correctly in calculating the have_quorum and best_response methods.
    - This will allow logging on each (obj, cont and account) servers to log correctly sending a 404 when the file wasn't found (for whatever reason.
   - But it will change ALL 404's raised by servers.. which might be ok.

2. Obj server change.
  - Modify the object server to return HTTPOk objects instead of HTTPNotFound.
  - This will only effect object servers,
  - We can send 200's back on 404's we think are OK. (ie, only if deleted already).
  - but this will log 200's when really the file wasn't found in the obj server logs
  - this will fix deleted objects but not cont and acct servers.

I guess the real question is what is important, do we want to fix all acct, cont and container servers in one place, or more finely tune what 404's should be 200s?

Revision history for this message
Matthew Oliver (matt-0) wrote :

Based on Sam's suggestion, I guess 1 is closest to what he suggested.

I guess the other approach could be to do it in the obj controller class in the proxy, but that would mean overriding the method in question, giving us 2 methods to maintain.

OR

Changing the _make_requests method to pass in an optional override map, and then pass it in from the obj controller in the proxy.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/114120

Changed in swift:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/114120
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=f4d3facdf4b6ec8ee0dcacc7be8999731c68a8ec
Submitter: Jenkins
Branch: master

commit f4d3facdf4b6ec8ee0dcacc7be8999731c68a8ec
Author: Matthew Oliver <email address hidden>
Date: Thu Aug 14 14:39:18 2014 +1000

    Treat 404s as 204 on object delete in proxy

    This change adds an optional overrides map to _make_request method
    in the base Controller class.

      def make_requests(self, req, ring, part, method, path, headers,
                        query_string='', overrides=None)

    Which will be passed on the the best_response method. If set and
    no quorum it reached, the override map is used to attempt to find
    quorum.

    The overrides map is in the form:

        { <response>: <override response>, .. }

    The ObjectController, in the DELETE method now passes an override map
    to make_requests method in the base Controller class in the form of:

        { 404: 204 }

    Statuses/responses that have been overridden are used in calculation
    of the quorum but never returned to the user. They are replaced by:

        (STATUS, '', '', '')

    And left out of the search for best response.

    Change-Id: Ibf969eac3a09d67668d5275e808ed626152dd7eb
    Closes-Bug: 1318375

Changed in swift:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/ec)

Fix proposed to branch: feature/ec
Review: https://review.openstack.org/122541

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/ec)
Download full text (21.0 KiB)

Reviewed: https://review.openstack.org/122541
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=18901494e9b4b47fb181a5ead8ed82738db8563f
Submitter: Jenkins
Branch: feature/ec

commit fc5cee5f05692f7e6dd5ad5a6d0ae682dd4bf3e0
Author: Christian Schwede <email address hidden>
Date: Mon Sep 15 17:22:54 2014 +0000

    Allow filtering by region in swift-recon

    The option "-r" is already used, thus only "--region" is used to specify
    filter by region.

    Change-Id: If769f2f3191c202933b03b48fe0f22b7c94a4dd6
    Closes-Bug: 1369583

commit 423ac74e888dcd693129100e0b37a51428bb62e1
Author: Christian Schwede <email address hidden>
Date: Sun Sep 14 23:41:19 2014 +0200

    Fix internal link to keystoneauth in documentation

    This patch fixes a broken link at the end of the table in
    http://docs.openstack.org/developer/swift/logs.html#swift-source

    Change-Id: I989173ac93e0f840997333be0d5cec07eb77b304

commit 64548420c87f3163ed543c9e9a02a4f1abec69e0
Author: Andreas Jaeger <email address hidden>
Date: Sat Sep 13 09:48:14 2014 +0200

    Stop using intersphinx

    Remove intersphinx from the docs build as it triggers network calls that
    occasionally fail, and we don't really use intersphinx (links other
    sphinx documents out on the internet)

    This also removes the requirement for internet access during docs build.

    This can cause docs jobs to fail if the project errors out on
    warnings.

    Change-Id: I71e941e2a639641a662a163c682eb86d51de42fb
    Related-Bug: #1368910

commit 5c9835125802c51e2eb2823f5208d53c358a5e84
Author: Christian Schwede <email address hidden>
Date: Fri Sep 12 14:37:04 2014 +0000

    Fix RingBuilder._build_max_replicas_by_tier docstring

    The current docstring doesn't include zones, and the order of the
    entries is not up to date with the current code. Let's fix this.

    Change-Id: Ibabd79427b83d9e8c86b2caeb93dee219c8274c0

commit a03732e142540e5a7d6cb11de5232f0642beb20d
Author: Alistair Coles <email address hidden>
Date: Fri Sep 12 10:20:19 2014 +0100

    Add comments to clarify change to www-authenticate test

    Trivial patch to tidy-up change to the functional test for
    www-authenticate header and add a comment to explain
    that multiple header values might be returned.

    Change-Id: If62cb3fd9e11450a2be0cec71e80ecb74a959d04
    Related-bug: 1368048

commit ab96796dc8d1da9037330da0822c8b8d2264d192
Author: Alistair Coles <email address hidden>
Date: Thu Sep 11 10:23:32 2014 +0100

    Fix broken www-authenticate functional test

    testQuotedWWWAuthenticateHeader functional test started failing
    due to a change to keystonemiddleware.auth_token, which now adds
    its own www-authenticate header in addition to the one that swift
    keystoneauth adds.

    This patch changes the functional test to check expected
    swift generated header value is in the concatenation of
    www-authenticate values.

    Verified that functional tests still pass using tempauth.

    Closes-Bug: 1368048
    Change-Id: I913af077df800a559d259c1622f286ad10eae9df

commit f4d3facdf...

Thierry Carrez (ttx)
Changed in swift:
milestone: none → 2.2.0-rc1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in swift:
milestone: 2.2.0-rc1 → 2.2.0
Revision history for this message
clayg (clay-gerrard) wrote :

This can still cause problems.

When the return codes are [204, 404, 404] we currently return 404:

https://github.com/openstack/swift/blob/bcd0eb70afacae1483e9e53d5a4082536770aed8/test/unit/proxy/controllers/test_obj.py#L308

However, with a 2 region 3 replica geo cluster using write affinity - there's always some partitions where two out of three replicas are on handoffs.

That cluster topology isn't ideal for a number of reasons, but we *could* do better at the proxy later.

idea #1) use timestamps (or the lack thereof) to distinguish if the 204 response "missed" an earlier delete - i.g.

[204 ts2, 404 ts1, 404 ts1] => 404
[204 ts2, 404 ?, 404 ?] => 204

idea #2) if *any* response is 204, make additional DELETE requests to handoffs - i.e.

[404, 404, 404] = > 404
[404, 404, 204] => *second round to replica count handoffs*

The second idea could also help when have full disks with expired error limiting

[204, 507, 507] => *second round to throw tombstones on handoffs which has a fair chance to reap .data files and free up space.

/me shrugs

A combination of the both might be good - if we can find an elegant way to model it that isn't overly complicated and causes annoying additional requests in the common case.

To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.