mid-upgrade clusters can cause versioned write errors

Bug #1562083 reported by John Dickinson
This bug affects 1 person
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Medium
Assigned to: Tim Burke

Bug Description

If a cluster has pre-2.6.0 container servers and post-2.6.0 proxy servers, then a DELETE against a versioned writes container will cause the oldest version, rather than the newest, to replace the current object.

The reason is that the upgraded proxy server makes a reverse=True request to the container server and assumes the results it gets back are actually reversed. But the older container server ignores the reverse query parameter and returns the results in normal order.
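
To make the failure concrete, here is a minimal Python sketch. Everything in it is a hypothetical stand-in for illustration (the two listing functions, `pick_version_to_restore`, and the made-up version names); it is not Swift's actual code, only a model of a proxy that trusts the first entry of a supposedly reversed listing.

```python
# Hypothetical stand-ins for illustration only; not Swift's real API.

# Made-up version object names; they sort oldest-first by trailing timestamp.
VERSIONS = ["003obj/1458000001", "003obj/1458000002", "003obj/1458000003"]


def new_container_listing(names, reverse=False):
    """Models an upgraded container server that honors reverse."""
    return sorted(names, reverse=reverse)


def old_container_listing(names, reverse=False):
    """Models an older container server that silently ignores reverse."""
    return sorted(names)


def pick_version_to_restore(listing_func, names):
    """Naive proxy logic: assume the first entry is the newest version."""
    listing = listing_func(names, reverse=True)
    return listing[0]


restored_new = pick_version_to_restore(new_container_listing, VERSIONS)
restored_old = pick_version_to_restore(old_container_listing, VERSIONS)
print(restored_new)  # 003obj/1458000003 (newest, as intended)
print(restored_old)  # 003obj/1458000001 (oldest, the bug)
```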

This can be mitigated by ensuring that container servers are upgraded before proxy servers.

A more permanent fix would be for the proxy to determine the order of the response from the container server and do the right thing.
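
One way such a check might look, as a rough Python sketch under stated assumptions: the helper names (`is_reversed`, `newest_first`) and the two toy listing functions are hypothetical, the version names are made up, and pagination is ignored entirely. It is not the code from the eventual fix.

```python
# Hypothetical sketch; names and data are illustrative, not Swift's real API.

VERSIONS = ["003obj/1458000001", "003obj/1458000002", "003obj/1458000003"]


def new_container_listing(names, reverse=False):
    """Models an upgraded container server that honors reverse."""
    return sorted(names, reverse=reverse)


def old_container_listing(names, reverse=False):
    """Models an older container server that ignores reverse."""
    return sorted(names)


def is_reversed(listing):
    """True if names are in descending order (trivially true for < 2 items)."""
    return all(a >= b for a, b in zip(listing, listing[1:]))


def newest_first(listing_func, names):
    """Return versions newest-first regardless of server support for reverse."""
    listing = listing_func(names, reverse=True)
    if is_reversed(listing):
        return listing
    # The server ignored reverse: fetch the in-order listing and
    # reverse it on the proxy side instead.
    return list(reversed(listing_func(names, reverse=False)))


from_old = newest_first(old_container_listing, VERSIONS)
from_new = newest_first(new_container_listing, VERSIONS)
print(from_old[0], from_new[0])
```

Note that a listing with fewer than two entries cannot reveal its order, and a real implementation would also have to cope with listings that span multiple pages; this sketch sidesteps both.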

Tim Burke (1-tim-z)
Changed in swift:
assignee: nobody → Tim Burke (1-tim-z)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/299686

Changed in swift:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/299686
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=ebf0b220127b14bec7c05f1bc0286728f27f39d1
Submitter: Jenkins
Branch: master

commit ebf0b220127b14bec7c05f1bc0286728f27f39d1
Author: Tim Burke <email address hidden>
Date: Wed Mar 30 14:19:00 2016 -0700

    Fix upgrade bug in versioned_writes

    Previously, versioned_writes assumed that all container servers would
    always have the latest Swift code, allowing them to return reversed
    listings. This could cause the wrong version of a file to be restored
    during rolling upgrades.

    Now, versioned_writes will check that the listing returned is actually
    reversed. If it isn't, we will revert to getting the full (in-order)
    listing of versions and reversing it on the proxy.

    Change-Id: Ib53574ff71961592426cb386ef00a75eb5824def
    Closes-Bug: 1562083

Changed in swift:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/302462

OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/crypto)

Fix proposed to branch: feature/crypto
Review: https://review.openstack.org/304242

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: feature/crypto
Review: https://review.openstack.org/304251

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (stable/mitaka)

Reviewed: https://review.openstack.org/302462
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=3083f5700c0c12c8caef638b013730805ac06ff6
Submitter: Jenkins
Branch: stable/mitaka

commit 3083f5700c0c12c8caef638b013730805ac06ff6
Author: Tim Burke <email address hidden>
Date: Wed Mar 30 14:19:00 2016 -0700

    Fix upgrade bug in versioned_writes

    Previously, versioned_writes assumed that all container servers would
    always have the latest Swift code, allowing them to return reversed
    listings. This could cause the wrong version of a file to be restored
    during rolling upgrades.

    Now, versioned_writes will check that the listing returned is actually
    reversed. If it isn't, we will revert to getting the full (in-order)
    listing of versions and reversing it on the proxy.

    Change-Id: Ib53574ff71961592426cb386ef00a75eb5824def
    Closes-Bug: 1562083
    (cherry picked from commit ebf0b220127b14bec7c05f1bc0286728f27f39d1)

tags: added: in-stable-mitaka
OpenStack Infra (hudson-openstack) wrote : Change abandoned on swift (feature/crypto)

Change abandoned by Alistair Coles (<email address hidden>) on branch: feature/crypto
Review: https://review.openstack.org/304242

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/crypto)

Reviewed: https://review.openstack.org/304251
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=248b9c6679911261efec5a05c92e9c97216080ee
Submitter: Jenkins
Branch: feature/crypto

commit 33f06dc48f7bec2e128b44427fb429ad640cd486
Author: Ondřej Nový <email address hidden>
Date: Sat Apr 9 18:47:58 2016 +0200

    Fixed Sphinx errors

    doc/source/deployment_guide.rst:1372: ERROR: Malformed table.
    swift/obj/diskfile.py:docstring of swift.obj.diskfile.BaseDiskFileManager.yield_hashes:13: ERROR: Unexpected indentation.
    doc/source/ops_runbook/diagnose.rst:188: WARNING: Inline emphasis start-string without end-string.

    Change-Id: Id20eb62eb5baebb3814e7af5676badb94f17dee5

commit a057c409ec8a23290bc72c4fa45d55a1178f4828
Author: OpenStack Proposal Bot <email address hidden>
Date: Fri Apr 8 07:02:33 2016 +0000

    Imported Translations from Zanata

    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure

    Change-Id: I9f4330ec20463e4d303e8ba3b67f86813a914ac5

commit d09ef0da62b64067b04a980c643f77526a9078ac
Author: Alistair Coles <email address hidden>
Date: Wed Apr 6 15:40:42 2016 +0100

    Assert that ChunkWriteTimouts are not raised

    Follow up for change Ibbc89449e7878fc4215e47e3f7dfe4ae58a2d638
    to add a test assertion that the ChunkWriteTimeout contexts are
    exited without raising the timeout exception in
    iter_bytes_from_response_part().

    Change-Id: I6d323cb26779e457fb5940093a81b349b333a0af

commit 7c0f58ec2ed020186ca3f269153b184fc02bf37a
Author: OpenStack Proposal Bot <email address hidden>
Date: Thu Apr 7 07:00:08 2016 +0000

    Imported Translations from Zanata

    For more information about this automatic import see:
    https://wiki.openstack.org/wiki/Translations/Infrastructure

    Change-Id: Ib80e3a759fa1e4a99576710607ad07fc5f259527

commit edc413b85ec2b703d7506be9c4801eb347611c58
Author: Nguyen Hung Phuong <email address hidden>
Date: Thu Apr 7 13:31:26 2016 +0700

    Fix typos in Swift files

    Change-Id: I39dbf55c094c42347b57ef67520abff9e6fc24bc

commit 95efd3f9035ec4141e1b182516f040a59a3e5aa6
Author: Samuel Merritt <email address hidden>
Date: Wed Mar 23 13:51:47 2016 -0700

    Fix infinite recursion during logging when syslog is down

    Change-Id: Ia9ecffc88ce43616977e141498e5ee404f2c29c4

commit 0bf518e3b0eeaf66653db6972525701cacfe6333
Author: Thiago da Silva <email address hidden>
Date: Wed Apr 6 16:58:36 2016 -0400

    remove unused current_status method

    Change-Id: I574919eaa14cadc800f3a1f6014221ee382ee7e0
    Signed-off-by: Thiago da Silva <email address hidden>

commit e15bceaa7e541c77f26a1f11ee2cbddbc871cbf1
Author: Kota Tsuyuzaki <email address hidden>
Date: Mon Dec 21 03:13:50 2015 -0800

    Refactor CORS unit tests

    This is a follow-up patch for https://review.openstack.org/#/c/258392/
    That one added good unit test cases for various kinds of
    allowe_origin like '*' or ''(empty). However, the result of handling
    in Swift proxy will depend on strict_cors_mode option configur...


tags: added: in-feature-crypto
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/hummingbird)

Fix proposed to branch: feature/hummingbird
Review: https://review.openstack.org/323599

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/hummingbird)

Reviewed: https://review.openstack.org/323599
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=0330478b70d0a699a0f9c21ef87c7e639d92564b
Submitter: Jenkins
Branch: feature/hummingbird

commit 5fe392b562de3baed080704df433fb392cb4fb31
Author: Ondřej Nový <email address hidden>
Date: Tue May 31 16:25:50 2016 +0200

    Fixed typo

    Change-Id: I7a35c0076360c7a23cf405189828d3c252ec6708

commit b52eccb3b1ea0591f0040587228d3705b5d3f68d
Author: Clay Gerrard <email address hidden>
Date: Wed May 25 11:21:25 2016 -0700

    Clarify overload best practices in admin guide

    Change-Id: Ib7c08bdeab6374771bb8e2b05053e7e16973524d

commit f1fd50723bb84c4941e949895576733f6eb67793
Author: Christian Schwede <email address hidden>
Date: Wed May 25 09:53:31 2016 +0200

    Add dispersion --verbose example to admin guide

    Change-Id: I5f9cacedde2a329332ccf744800b6f2453e8b28e

commit b3ab715c055283ccfea9a504d6da20741d82e7ad
Author: Matthew Oliver <email address hidden>
Date: Wed May 25 14:35:54 2016 +1000

    Add ring-builder dispersion command to admin guide

    This change updates the admin guide to point out the dispersion command
    in swift-ring-builder and mentions the dispersion verbose table to make
    it more obvious to operators.

    Change-Id: I72b4c8b2d718e6063de0fdabbaf4f2b73694e0a4

commit fb7a8e9ab7596a36a6992a3a8f8c6d005a2c2829
Author: Tim Burke <email address hidden>
Date: Tue May 24 13:37:58 2016 -0700

    Add links to mitaka install guides

    Change-Id: I62331923751c521daded4468b5cc5f03655226bc

commit e09c4ee7800e82aa09ca2f6ae375420b766182a4
Author: Tim Burke <email address hidden>
Date: Fri Apr 29 12:12:00 2016 -0500

    Allow concurrent bulk deletes

    Before, server-side deletes of static large objects could take a long
    time to complete since the proxy would wait for a response to each
    segment DELETE before starting the next DELETE request.

    Now, operators can configure a concurrency factor for the slo and bulk
    middlewares to allow up to N concurrent DELETE requests. By default, two
    DELETE requests will be allowed at a time.

    Note that objects and containers are now deleted in separate passes, to
    reduce the likelihood of 409 Conflict responses when deleting
    containers.

    Upgrade Consideration
    =====================
    If operators have enabled the bulk or slo middlewares and would like to
    preserve the prior (single-threaded) DELETE behavior, they must add the
    following line to their [filter:slo] and [filter:bulk] proxy config
    sections:

       delete_concurrency = 1

    This may be done prior to upgrading Swift.

    UpgradeImpact
    Closes-Bug: 1524454
    Change-Id: I128374d74a4cef7a479b221fd15eec785cc4694a

commit 226557afc42c245e050d84162497f46341407ef7
Author: Tim Burke <email address hidden>
Date: Thu May 19 18:55:40 2016 -0700

    Turn on H703, so our translators don't punch us

    Change-Id: I4ce3068f79563e4d4296c6e1078bc12f0cf84c96
    Related-Bug: 1559431

commit 7b706926a8ed5bbcec3a678e868e301c9a6ed8f1
Author: Alistair Coles <email address hidden>
Date: Mon May ...

tags: added: in-feature-hummingbird
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/swift 2.8.0

This issue was fixed in the openstack/swift 2.8.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/swift 2.7.1

This issue was fixed in the openstack/swift 2.7.1 release.
