Bug #1697543 “Ring refuses to save even when 100% parts move” (OpenStack Object Storage, swift)

Revision history for this message

clayg (clay-gerrard) wrote on 2017-06-12:

#1

it's very delicate - you have to hold it just wrong (2.0 KiB, text/x-sh)

clayg (clay-gerrard) wrote on 2017-06-27:

#2

With enough replicas and a failed device it's easier to see that we should look at delta_dispersion in addition to delta_balance:

https://gist.github.com/clayg/b0d0d41a382e70356bb58a1ee94d1b73

With the failed device on the server that's desperately trying to shed parts, and enough replicas -
balance will not change significantly from one invocation to the next while rebalance is busy fixing dispersion...

We should expect that as desirable behavior and use delta_dispersion to get over the hump:

ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance
Cowardly refusing to save rebalance as it did not change at least 1%.
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder |head
stuck.builder, build version 63, id a5b9fbd213bb4c20ab60eff2a2bb3a75
256 partitions, 13.000000 replicas, 1 regions, 1 zones, 52 devices, 100.00 balance, 100.00 dispersion
...
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance
Cowardly refusing to save rebalance as it did not change at least 1%.
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance -f
Reassigned 256 (100.00%) partitions. Balance is now 100.00. Dispersion is now 0.00
-------------------------------------------------------------------------------
NOTE: Balance of 100.00 indicates you should push this
ring, wait at least 0 hours, and rebalance/repush.
-------------------------------------------------------------------------------
ubuntu@saio:/vagrant/.scratch/rings/tata$ swift-ring-builder stuck.builder rebalance
Reassigned 255 (99.61%) partitions. Balance is now 1.56. Dispersion is now 0.00

Notice the delta_dispersion when "cowardly refusing to save rebalance" is *HUGE*
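
A minimal sketch of the save decision being argued for here. The helper name `should_save`, its parameters, and the 1% default (borrowed from the "did not change at least 1%" message) are assumptions for illustration; this is not the actual swift-ring-builder code.

```python
def should_save(last_balance, balance, last_dispersion, dispersion,
                force=False, min_delta=1.0):
    """Return True if a rebalance result is worth saving.

    Hypothetical helper: the names and the min_delta default are
    assumptions made for this example only.
    """
    if force:
        return True
    delta_balance = abs(last_balance - balance)
    delta_dispersion = abs(last_dispersion - dispersion)
    # Save when either metric moved enough, not just balance.
    return delta_balance >= min_delta or delta_dispersion >= min_delta


# Numbers from the session above: balance stays at 100.00 while
# dispersion drops from 100.00 to 0.00.
print(should_save(100.0, 100.0, 100.0, 0.0))
# Neither metric moved, so refusing to save is still reasonable.
print(should_save(100.0, 100.0, 0.0, 0.0))
```

With only delta_balance in the condition, the first call would be refused as well, which is the stuck state shown in the console output.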

Changed in swift:
importance:	Low → Medium

clayg (clay-gerrard) wrote on 2017-06-30:

#3

I was pretty sure this change would solve this

https://review.openstack.org/#/c/479012/

But maybe there's something else going on? I'd like to know what...

clayg (clay-gerrard) wrote on 2017-12-08:

#4

So if you have a sufficient number of replicas to move you can have 100 dispersion even after moving the maximum whole replicanth

If you go from 6 replicanths in z1 and 1 in z2 to 5 replicanths in z1 and 2 in z2, the dispersion metric should probably represent some kind of improvement - but I'm not sure how exactly... and we'd still need the related change.

OpenStack Infra (hudson-openstack) wrote on 2017-12-21: Fix merged to swift (master)

#5

Reviewed: https://review.openstack.org/479012
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=aa82d2cba82209f1bf3944c6d2a67965af5a1540
Submitter: Zuul
Branch: master

commit aa82d2cba82209f1bf3944c6d2a67965af5a1540
Author: Samuel Merritt <email address hidden>
Date: Thu Jun 29 10:23:38 2017 -0700

Save ring builder if dispersion changes

    There are cases where a rebalance improves dispersion, but doesn't
    improve balance. This is because the balance of a ring builder is
    taken to be the balance of its least-balanced device, so if there's a
    device that has no partitions, wants some, but can't get them, then
    we'll never save the ring builder even if every other device in the
    ring got better.

    We can detect this situation by looking at the dispersion number; if it
    changes, then the rebalance needs to be saved in order to continue to
    make progress.

    Partial-Bug: #1697543

    Change-Id: Ie239b958fc7e0547ffda2bebf61546bd4ef3d829
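
The effect described in this commit message can be sketched as follows, assuming each device's balance is its percent deviation from its desired partition count and the builder's balance is the worst of those. The function names and the `(parts, desired)` tuple layout are hypothetical, not Swift's API.

```python
def device_balance(parts, desired):
    """Percent deviation of a device's partition count from its desired count."""
    return abs(100.0 * parts / desired - 100.0)


def builder_balance(devices):
    """Builder balance taken as the worst (largest) per-device balance.

    `devices` is a list of (parts, desired) tuples; this layout is an
    assumption made for the example.
    """
    return max(device_balance(parts, desired) for parts, desired in devices)


# Four devices each want 64 of 256 parts, but the first cannot get any.
before = [(0, 64), (128, 64), (64, 64), (64, 64)]
# A rebalance evens out the other three devices.
after = [(0, 64), (86, 64), (85, 64), (85, 64)]

print(builder_balance(before))
print(builder_balance(after))
```

Both calls print 100.0: the empty device pins the builder's balance, so a check based on balance alone sees no progress even though every other device improved.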

OpenStack Infra (hudson-openstack) wrote on 2018-01-02: Fix proposed to swift (feature/deep)

#6

Fix proposed to branch: feature/deep
Review: https://review.openstack.org/530733

OpenStack Infra (hudson-openstack) wrote on 2018-01-02: Fix merged to swift (feature/deep)

#7

Reviewed:  https://review.openstack.org/530733
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=e2f780492487a31d4f8111d53cbde6c02d9d3237
Submitter: Zuul
Branch:    feature/deep

commit 61fe6aae81d00597c777a64ac337a8dfb990f0c2
Author: Tim Burke <tim.burke@gmail.com>
Date:   Tue Aug 22 22:40:58 2017 +0000

Better mock out OSErrors in test_replicator before raising them
    
    Also, provide a return value for resp.read() so we hit a
    pickle error instead of a type error.
    
    Change-Id: I56141eee63ad1ceb2edf807432fa2516fabb15a6

commit 0bdec4661b5609ca1bf813a7ccd514e5d444b07f
Author: Kazuhiro MIYAHARA <miyahara.kazuhiro@lab.ntt.co.jp>
Date:   Mon Dec 25 09:13:17 2017 +0000

Skip symlink + vw functional tests if symlink is not enabled
    
    Functional tests for symlink and versioned writes run and result in
    failure even if symlink is not enabled.
    
    This patch fixes the functional tests to run only if both of
    symlink and versioned writes are enabled.
    
    Change-Id: I5ffd0b6436e56a805784baf5ceb722effdf74884

commit 17e6950aa08101b5f3bec0f2f9c32cfd5f51fa36
Author: Kazuhiro MIYAHARA <miyahara.kazuhiro@lab.ntt.co.jp>
Date:   Fri Dec 22 02:18:09 2017 +0000

Fix manpage docs' daemon names
    
    In the current manpage docs, some of the daemon names in the
    concurrency explanation are wrong.
    
    This patch fixes the daemon names.
    
    Change-Id: I2a505c9590ee3a3a7e37e8d949a10db36206faec

commit af2c2a6eb54d848eefc2d0a1b619e0b86eed2eb5
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Thu Dec 21 10:43:39 2017 -0800

Fix sometimes-flaky container name functional test.
    
    You've got two test classes: TestContainer and TestContainerUTF8. They
    each try to create the same set of containers with names of varying
    lengths to make sure the container-name length limit is being honored.
    
    Also, each test class tries to clean up pre-existing data in its
    setUpClass method. If TestContainerUTF8 fails to delete a container
    that TestContainer made, then its testContainerNameLimit method will
    fail because the container PUT response has status 202 instead of 201,
    which is because the container still existed from the prior test.
    
    I've made the test consider both 201 and 202 as success. For purposes
    of testing the maximum container name length, any 2xx is fine.
    
    Change-Id: I7b343a8ed0d12537659c051ddf29226cefa78a8f

commit 609c757e698ff7893e1b1a0e32d088ad9d05ad95
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Tue Dec 12 21:39:54 2017 -0800

functest for symlink + versioned writes
    
    Co-Author: Alistair Coles <alistairncoles@gmail.com>
    
    Related-Change-Id: I838ed71bacb3e33916db8dd42c7880d5bb9f8e18
    Change-Id: I0ccff1eafcfb3fdbdda9faf55a44c45b834e723a

commit bdd4eb6936b0e25aff5357bde876309ee5b032ec
Author: Andreas Jaeger <aj@suse.com>
Date:   Wed Dec 20 07:14:03 2017 +0100

Install liberasurecode-devel for CentOS 7
    
    Since I747c2b8754effbc6ec82af3bf7543fd9599a6c14 we do not install
    the RDO package repository anymore and thus liberasurecode-devel
    cannot be installed.
    
    For CentOS 7, remove liberasurecode-devel from bindep.txt and install it
    from test-setup.sh instead after enabling the RDO package repositories.
    
    Update python dependencies: CentOS 7 does not have python3. Fix the
    SUSE tags.
    
    Change-Id: I72aa6b5455dfb025f54e83334983ac280f04afb2

commit dc1c55c9a07c03fe85f4bcc52419a42d75ae30fa
Author: Andreas Jaeger <aj@suse.com>
Date:   Sun Dec 17 19:46:52 2017 +0100

Native Zuul v3 tox jobs
    
    Convert the legacy tox jobs to Zuul v3 native and use the
    tools/test_setup.sh script to setup a XFS file like it's done in the
    legacy job.
    
    Needed-By: Id2b5cff998ac3a825a8f515c7bae3b433f30d272
    Change-Id: I34ed9e1c4b822f700e42fb07937df7be72cbaf4e

commit a7da2232629bfd4d5c04f5169e51b1f57b6c9362
Author: Matthew Oliver <matt@oliver.net.au>
Date:   Tue Dec 19 05:47:20 2017 +0000

Fix intermittent problem in part_swapping test
    
    There is an intermittent failure in the test_part_swapping_problem
    test found in test/unit/common/ring/test_builder.py.
    
    The test does a rebalance, then changes the ring to test a specific
    problem, does some housekeeping and then rebalances again. The problem
    is the ringbuilder keeps track of where in the ring it started the
    last ring rebalance, saved in `_last_part_gather_start`.
    
    On a rebalance, or more specifically in `_gather_parts_for_balance`,
    we then start somewhere on the other side of the ring with:
    
      quarter_turn = (self.parts // 4)
      random_half = random.randint(0, self.parts // 2)
      start = (self._last_part_gather_start + quarter_turn +
               random_half) % self.parts
    
    Because we don't reset `_last_part_gather_start` when we change the ring
    in the test, there is an edge case: if we are unlucky and both calls to
    randint return relatively large numbers, the start of the second
    rebalance is pushed to the wrong side of the ring. Actually it's more
    problematic than that: one large random number and one in the middle
    are enough to cause it. Maybe pictures help:
    
      rebalance 1 (r1): quarter_turn = 4, random_half = 5
      rebalance 2 (r2): quarter_turn = 4, random_half = 3
    
                                         r1                   r2
                                          |                    |
                                          v                    v
      array('H', [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1]),
      array('H', [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 2, 2, 2, 3, 3, 3]),
      array('H', [2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 3, 4, 4, 4, 4, 4])]
    
    Now when gathering for rebalance 2 it'll pick:
    
      array('H', [0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, X]),
      array('H', [X, X, 1, 1, 2, 2, 2, 3, 3, 3, 2, 2, 2, 3, 3, 3]),
      array('H', [2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 3, 4, 4, 4, 4, 4])]
    
    Which can cause the 3 attempts to gather and rebalance to be used up.
    This causes the intermittent failure seen in the bug.
    
    This patch solves this by resetting `_last_part_gather_start` to 0
    while we tidy up the ring change, meaning we'll always start on the
    correct side of the ring.
    
    Change-Id: I0d3a69620d4734091dfa516efd0d6b2ed87e196b
    Closes-Bug: #1724356
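
The start-index arithmetic quoted in this commit message can be tried in isolation. This is a sketch only: `next_gather_start` is a hypothetical stand-alone function, `random_half` is passed in rather than drawn from `random.randint` so the result is deterministic, and the indices it prints are plain arithmetic from the formula rather than a reproduction of the arrows in the diagram.

```python
def next_gather_start(last_start, parts, random_half):
    """Start index for the next gather, per the formula quoted above.

    `random_half` is a parameter here (instead of a random.randint call)
    so the example is repeatable.
    """
    quarter_turn = parts // 4
    return (last_start + quarter_turn + random_half) % parts


parts = 16
r1 = next_gather_start(0, parts, random_half=5)
# Without a reset, the second rebalance builds on r1 and wraps around.
r2_stale = next_gather_start(r1, parts, random_half=3)
# With the last start reset to 0, the second rebalance ignores r1.
r2_reset = next_gather_start(0, parts, random_half=3)
print(r1, r2_stale, r2_reset)  # prints: 9 0 7
```

The stale value carries the first rebalance's offset into the second, which is how the second start can wrap past the end of the ring; resetting removes that dependency.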

commit 2cf5e7ceffba007d1ff2a429385c1f3994a59d65
Author: John Dickinson <me@not.mn>
Date:   Mon Dec 18 09:33:40 2017 -0800

fix SkipTest imports in functests so they can be run directly by nose
    
    Change-Id: I7ecc48f69ca677d5ecb0986ac4042688442355bb

commit aa82d2cba82209f1bf3944c6d2a67965af5a1540
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Thu Jun 29 10:23:38 2017 -0700

Save ring builder if dispersion changes
    
    There are cases where a rebalance improves dispersion, but doesn't
    improve balance. This is because the balance of a ring builder is
    taken to be the balance of its least-balanced device, so if there's a
    device that has no partitions, wants some, but can't get them, then
    we'll never save the ring builder even if every other device in the
    ring got better.
    
    We can detect this situation by looking at the dispersion number; if it
    changes, then the rebalance needs to be saved in order to continue to
    make progress.
    
    Partial-Bug: #1697543
    
    Change-Id: Ie239b958fc7e0547ffda2bebf61546bd4ef3d829

commit 1984353f0d6db7512e4ea147ecad9e14dfb318d4
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Fri Dec 15 12:36:47 2017 +0000

Move symlink versioning functional test
    
    The functional test for versioning symlinks is better located in
    test_versioned_writes where it can be added to
    TestObjectVersioning. This saves duplicated versioned_writes specific
    setup code in test_symlink, and has the benefit of the test being
    repeated for each of the versioned writes test subclasses.  With a
    small refactor this includes the test now running with
    x-history-location mode as well as x-versions-location mode.
    
    Related-Change: I838ed71bacb3e33916db8dd42c7880d5bb9f8e18
    Change-Id: If215446c558b61c1a8aea37ce6be8fcb5a9ea2f4

commit c579e99126b61466fb3b1628170cbca37ccacce3
Author: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Date:   Wed Dec 13 06:04:40 2017 +0000

Add more assertions for Symlink + Copy unit tests
    
    Related-Change-Id: I838ed71bacb3e33916db8dd42c7880d5bb9f8e18
    Change-Id: Ib4c8f0c83537b74bbdec8c2dd6acc99c039faa67

commit 097e975938befb939fa6821f50c061e2c7f42cef
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Thu Dec 14 16:17:29 2017 -0800

Remove symlink from xml listing response
    
    We've had some problems with brittle XML clients in the past - it might
    be safer to ask that clients that need symlink keys in listings from
    containers request in JSON.
    
    Change-Id: I4ac7457f3ccb10f9e471ec6dc6f0869d71712878

commit 7647defb0f201cbf85baef6404067bb0eb27321f
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Thu Dec 14 12:56:49 2017 -0800

rename utils function less like stdlib
    
    Related-Change-Id: I3436bf3724884fe252c6cb603243c1195f67b701
    Change-Id: I74199c62b46e4db93a76760ebf91d84e3e1e3cfc

commit b342a8147c38ebcf02c3ba21fd09ded4ca49f69b
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Tue Dec 12 17:32:55 2017 +0000

Assert X-Newest and X-Backend headers are propagated to symlink target
    
    Adds some assertions to verify that X-Newest and X-Backend-* headers are
    copied from an original object GET request to the symlink target
    request.
    
    Change-Id: Idce92edd002dec34f5dbc5d3c28a4cbbd2fbdc60
    Related-Change: I838ed71bacb3e33916db8dd42c7880d5bb9f8e18

commit 1d5cf3e73067751c8c8fd4f7f58c205db9b877a1
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Thu Dec 14 12:15:19 2017 -0800

add symlink to probetest for reconciler
    
    Change-Id: Ib2c5616f2965ab92b1c76d573e869206c91464c6

commit 90134ee968f6f6442eedb5548ee292fc03c77c2a
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Thu Dec 14 12:09:04 2017 -0800

add symlink to container sync default and sample config
    
    Change-Id: I83e51daa994b9527eccbb8a88952c630aacd39c3

commit 8df263184be136f9ab203a2971b4f47a52f8b431
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Tue Dec 12 17:26:03 2017 +0000

Symlink doc clean up
    
    Cleanup for docs and docstrings.
    
    Related-Change: I838ed71bacb3e33916db8dd42c7880d5bb9f8e18
    Change-Id: Ie8de0565dfaca5bd8a5693a75e6ee14ded5b7161

commit 1b6842deafe485bc87c48e1797997d599e7411c9
Author: Mahati Chamarthy <mahati.chamarthy@gmail.com>
Date:   Thu Dec 14 14:01:13 2017 +0530

add name to core emeritus
    
    Change-Id: Icab1c646ec8c9062580197482b1fd924bbc6c4bd

commit 99b89aea1051208e3d71afa35fcd62035e702628
Author: Robert Francis <robefran@ca.ibm.com>
Date:   Wed Oct 7 15:14:58 2015 -0400

Symlink implementation.
    
    Add a symbolic link ("symlink") object support to Swift. This
    object will reference another object. GET and HEAD
    requests for a symlink object will operate on the referenced object.
    DELETE and PUT requests for a symlink object will operate on the
    symlink object, not the referenced object, and will delete or
    overwrite it, respectively.
    POST requests are *not* forwarded to the referenced object and should
    be sent directly. POST requests sent to a symlink object will
    result in a 307 Error.
    
    Historical information on symlink design can be found here:
    https://github.com/openstack/swift-specs/blob/master/specs/in_progress/symlinks.rst.
    https://etherpad.openstack.org/p/swift_symlinks
    
    Co-Authored-By: Thiago da Silva <thiago@redhat.com>
    Co-Authored-By: Janie Richling <jrichli@us.ibm.com>
    Co-Authored-By: Kazuhiro MIYAHARA <miyahara.kazuhiro@lab.ntt.co.jp>
    Co-Authored-By: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
    
    Change-Id: I838ed71bacb3e33916db8dd42c7880d5bb9f8e18
    Signed-off-by: Thiago da Silva <thiago@redhat.com>

commit b86bf15a644db4438770801a312fe074a09c91ef
Author: Thiago da Silva <thiago@redhat.com>
Date:   Mon Dec 11 06:59:59 2017 -0500

remove left-over old doc in config file
    
    Removing the rest of fast post documentation
    from proxy configuration file.
    
    Change-Id: I6108be20606f256a020a8583878e2302e3c98ff1

commit 8182fa75a167d034e29c04680502ea9877592c9a
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Fri Dec 8 14:08:25 2017 -0800

Fix small error in a doc string
    
    Change-Id: I1c743fdea637ce047d09a49db0a43b2eb37305fa

commit dd113ab25a3089fa19c8e824c2d89db0ca3db0fa
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Tue Dec 5 10:36:38 2017 -0800

Refactor proxy-server conf loading to a helper function
    
    There were two middlewares using a common pattern to load
    the proxy-server app config section. The existing pattern
    fails to recognise option overrides that are declared using
    paste-deploy's 'set' notation, as illustrated by the change
    to test_dlo.py in this patch.
    
    This patch replaces the existing code with a helper function
    that loads the proxy-server config using the paste-deploy loader.
    The resulting config dict is therefore exactly the same as that
    used to initialise the proxy-server app.
    
    Change-Id: Ib58ce03e2010f41e7eb11f1a6dc78b0b7f55d466

commit 84ea58b8c81814a3c4d450145bfb9e70166dd60b
Author: Christopher Bartz <bartz@dkrz.de>
Date:   Mon Dec 4 14:53:44 2017 +0100

Ringbuilder: Forbid writing empty rings
    
    Swift definitely can't make any use of empty rings, so it should
    not be allowed to write them.
    
    Replace warning with an error message & error exit.
    
    Change-Id: I3a1b86368d363e67d1f91d7d8af4b391a0a53fff
    Closes-Bug: #1396841

commit fc12d63c76ffd51a5d524caa92b78d29bc4e6a7d
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Wed Dec 6 10:29:52 2017 -0800

Remove repeated text from deployment guide
    
    Fix what appears to be a cut and paste error.
    
    Change-Id: Iccf97ebbf75c8f97095a4493ea6a8beb074df099

commit bc2e03d1a6cb1411a2eee8218cce53f951100eb1
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Tue Dec 5 13:45:18 2017 -0800

Add --swift-versions option to swift-recon man page
    
    Related-Change: I3c2e569f0c44168333251bb58bab4b5582e15a45
    
    Change-Id: I9776c0919164e48ac445eacf7d897a23ef8e4572

commit 206f674014fb94a63ebdae91e3eca38956055b07
Author: Peter Lisák <peter.lisak@gmail.com>
Date:   Tue Dec 20 16:27:36 2016 +0100

Added swift version to recon cli
    
    Show the swift version and check whether it is the same on all hosts
    by using `swift-recon --swift-versions`.
    
    Ex:
    $ swift-recon --swift-versions
    
    Versions matched (2.7.1.dev144), 0 error[s] while checking hosts.
    
    or if differs
    
    Versions not matched (2.7.1.dev144, 2.7.1.dev145),
    0 error[s] while checking hosts.
    
    Change-Id: I3c2e569f0c44168333251bb58bab4b5582e15a45

commit 924f0d28e9670163dc47206cc1472ef228fd0082
Author: Tim Burke <tim.burke@gmail.com>
Date:   Wed Nov 22 16:51:06 2017 -0800

dlo: Move conn2 business to the one test that uses it
    
    ...and skip it if we don't have the required user
    
    Change-Id: I8700a587d5b8acff1f0255529b6ddaeadaaa6548

commit 396380f3406bdcd842e85a66cb346bf219cbb4f3
Author: Tim Burke <tim.burke@gmail.com>
Date:   Thu Nov 2 11:43:16 2017 -0700

Better handle missing files in _construct_from_data_file
    
    There's a window between when we list the files on disk and when we actually
    try to open the .data file where another process could delete it. That should
    raise a DiskFileNotExist error, not an IOError.
    
    Change-Id: I1b43fef35949cb6f71997874e4e6b7646eeec8fe
    Closes-Bug: 1712662
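
A minimal sketch of the pattern this commit message describes, assuming Python 3: translate "file vanished between listing and opening" into a domain-specific error and let other I/O errors propagate. `DiskFileNotExist` is redefined locally as a stand-in and `open_data_file` is a hypothetical helper, not Swift's `_construct_from_data_file`.

```python
import errno
import os
import tempfile


class DiskFileNotExist(Exception):
    """Stand-in for the domain error; defined locally for this sketch."""


def open_data_file(path):
    """Open a .data file, mapping 'file vanished' to DiskFileNotExist."""
    try:
        return open(path, 'rb')
    except IOError as err:
        if err.errno == errno.ENOENT:
            # Another process removed the file after we listed it.
            raise DiskFileNotExist(path)
        raise


# Simulate the race: the file is listed, then deleted before being opened.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, 'example.data')
open(path, 'wb').close()
listed = os.listdir(tmpdir)
os.unlink(path)
try:
    open_data_file(os.path.join(tmpdir, listed[0]))
except DiskFileNotExist:
    print('raised DiskFileNotExist')
finally:
    os.rmdir(tmpdir)
```

Only `ENOENT` is translated; permission errors and other failures still surface as the original exception so they are not mistaken for a missing file.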

commit 31dd95ee91589be937565692a1610ba49db29706
Author: Tim Burke <tim.burke@gmail.com>
Date:   Thu Oct 12 22:27:35 2017 +0000

Add base64decode function to common/utils
    
    Keymaster middleware does some nice input validation on
    base64-encoded strings; pull that out somewhere common so
    other things (like SLOs with inlined data) can use it, too.
    
    Change-Id: I3436bf3724884fe252c6cb603243c1195f67b701
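
As an illustration of the kind of input validation meant here, the following sketch decodes base64 strictly, assuming Python 3's `base64.b64decode` with `validate=True`. The name `decode_base64_strict` is hypothetical; it is not the function this commit adds to common/utils.

```python
import base64
import binascii


def decode_base64_strict(value):
    """Decode base64 text, rejecting anything that is not valid base64.

    Hypothetical helper for illustration; not the function added by the
    commit above.
    """
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        return base64.b64decode(value, validate=True)
    except (binascii.Error, TypeError):
        raise ValueError('invalid base64 input: %r' % (value,))


print(decode_base64_strict('aGVsbG8='))
try:
    decode_base64_strict('not base64!')
except ValueError as err:
    print(err)
```

Raising a single `ValueError` gives callers one exception type to handle, whichever check failed.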

tags:
added: in-feature-deep

OpenStack Infra (hudson-openstack) wrote on 2018-01-03: Fix merged to swift (master)

#8

Reviewed: https://review.openstack.org/528155
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=7013e70ca67891e94664e9eca70925b61ee8f689
Submitter: Zuul
Branch: master

commit 7013e70ca67891e94664e9eca70925b61ee8f689
Author: Clay Gerrard <email address hidden>
Date: Thu Dec 14 20:03:24 2017 -0800

Represent dispersion worse than one replicanth

    With a sufficiently undispersed ring it's possible to move an entire
    replica's worth of parts and yet the value of dispersion may not get any
    better (even though in reality dispersion has dramatically improved).
    The problem is that dispersion currently only represents up to one whole
    replica's worth of parts being undispersed.

    However, with EC rings it's possible for more than one whole replica's
    worth of partitions to be undispersed. In these cases the builder will
    require multiple rebalance operations to fully disperse replicas, but
    the dispersion value should improve with every rebalance.

    N.B. with this change it's possible for rings with a bad dispersion
    value to measure as having a significantly smaller dispersion value
    after a rebalance (even though they may not have had their dispersion
    change) because the total amount of bad dispersion we can measure has
    been increased but we're normalizing within a similar range.

    Closes-Bug: #1697543

    Change-Id: Ifefff0260deac0c3e8b369a1e158686c89936686
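
The difference between the two ways of measuring can be sketched with the 7-replica, two-zone example from comment #4. This is an illustration under stated assumptions, not Swift's dispersion computation: the function names are hypothetical, the per-zone cap is assumed to be an even share rounded up, and Python 3 division is assumed.

```python
import math


def excess_replicas(replicas_per_zone, total_replicas):
    """Count replicas of one partition beyond an even per-zone share."""
    max_per_zone = math.ceil(total_replicas / len(replicas_per_zone))
    return sum(max(0, n - max_per_zone) for n in replicas_per_zone)


def dispersion_by_part(placements, total_replicas):
    """Percent of partitions with any misplaced replica (caps at 100)."""
    bad = sum(1 for p in placements if excess_replicas(p, total_replicas))
    return 100.0 * bad / len(placements)


def dispersion_by_replica(placements, total_replicas):
    """Percent of all replicas that are misplaced."""
    bad = sum(excess_replicas(p, total_replicas) for p in placements)
    return 100.0 * bad / (len(placements) * total_replicas)


# Four partitions, 7 replicas, two zones (z1, z2); an even share is 4.
before = [(6, 1)] * 4
after = [(5, 2)] * 4
print(dispersion_by_part(before, 7), dispersion_by_part(after, 7))
print(dispersion_by_replica(before, 7) > dispersion_by_replica(after, 7))
```

The per-partition measure reports 100.0 both before and after the move, while the per-replica measure drops, so successive rebalances show progress.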

Changed in swift:
status:	Confirmed → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2018-01-19: Fix proposed to swift (feature/s3api)

#9

Fix proposed to branch: feature/s3api
Review: https://review.openstack.org/535623

OpenStack Infra (hudson-openstack) wrote on 2018-01-19: Fix merged to swift (feature/s3api)

#10

Reviewed:  https://review.openstack.org/535623
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=271b80d0f51078719de35bf6f75b7e06ac3e5b91
Submitter: Zuul
Branch:    feature/s3api

commit 88eea33ccd1875af811b59d15df55e2bffa27f77
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Thu Jan 11 13:36:09 2018 -0800

Recenter builder test expectation around random variance
    
    ... in order to make the test pass with more seeds and fail less
    frequently in the gate.
    
    Change-Id: I059e80af87fd33a3b6c0731fbad62e035215eca5

commit d924fa759967b7cdca0d91f21112725f6099a254
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Tue Jan 16 22:19:09 2018 -0800

Remove old post-as-copy leftovers from tests.
    
    Since commit 1e79f828, we don't need to test with post_as_copy=True
    any more since we haven't got post_as_copy at all.
    
    Change-Id: I9c96ce0b812d877bbe11bdb50eb160d6ffa5933d

commit dfa0c4e604fb931d232395599bd0e7b0f11441ee
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Wed Jan 17 12:04:45 2018 +0000

Preserve expiring object behaviour with old proxy-server
    
    The related change [1] causes expiring object records to no longer be
    created if the X-Delete-At-Container header is not sent to the object
    server, but old proxies prior to [2] (i.e. releases prior to 1.9.0)
    did not send this header.
    
    The goal of [1] can be alternatively achieved by making expiring
    object record creation be conditional on the X-Delete-At-Host header.
    
    [1] Related-Change: I20fc2f42f590fda995814a2fa7ba86019f9fddc1
    [2] Related-Change: Id0873a3f2198ce285fe0b0c777738eff38bc2438
    
    Change-Id: Ia0081693f01631d3f2a59612308683e939ced76a

commit d707fc7b6d0ceb4556dddfc258c5de8c4baff05c
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Tue Jan 16 16:30:13 2018 -0800

DRY out tests until the stone bleeds
    
    Can we go deeper!?
    
    Change-Id: Ibd3b06542aa1bfcbcb71cc98e6bb21a6a67c12f4

commit ba8f1b1c3786df4e79fc3f9e4747d7cfb9072b6f
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Wed Jan 17 15:25:33 2018 +0000

Fix intermittent unit test failure
    
    test_check_delete_headers_removes_delete_after was
    failing intermittently due to rounding of float time
    values.
    
    Change-Id: Ia126ad6988f387bbd2d1f5ddff0a56d457a1fc9b
    Closes-Bug: #1743804

commit e747f94313f315fdf8d8fc01fb0c5aac60c33897
Author: Kota Tsuyuzaki <tsuyuzaki.kota@lab.ntt.co.jp>
Date:   Wed Dec 27 14:37:29 2017 +0900

Fix InternalClient to drain response body if the request fails
    
    If we don't drain the body, the proxy logging in the internal client
    pipeline will log 499 client disconnect instead of actual error response
    code.
    
    For error responses, we try to do the most helpful thing using swob's
    closing and caching response body attribute.  For non-error responses
    which are returned to the client, we endeavour to keep the app_iter
    intact and unconsumed; expecting the caller to do the right
    thing is the only reasonable interface.  We must cleanly close any WSGI
    app_iter which we do not return to the client regardless of status code
    and allow the logging of the 499 if needed.
    
    Closes-Bug: #1675650
    Change-Id: I455b5c38074ad0e72aa5e0b05771e193208905eb

commit d8f9045518035cbd1a40d0a94227952a384143ec
Author: Christopher Bartz <bartz@dkrz.de>
Date:   Fri Dec 1 11:13:10 2017 +0100

Send correct number of X-Delete-At-* headers
    
    Send just as many requests with X-Delete-At-* as we do X-Container-* to
    the object server.  Furthermore, stop the object server from making an
    update to the expirer queue when it wasn't told to do so and remove the
    log warning which would have been produced.
    
    Reason:
    
    It can be the case that the number of object replicas (OR) is larger
    than the number of container replicas (CR) for a given storage policy
    (most likely in case of EC).  Before this commit, only CR object servers
    received the x-delete-at-* headers, which means that OR - CR object
    servers did not receive the headers.  The servers missing the header
    would produce a log warning and create the x-delete-at-container header
    and async update on their own, which could lead to a bug, if the
    expiring_objects_container_divisor option was misconfigured.
    
    Change-Id: I20fc2f42f590fda995814a2fa7ba86019f9fddc1
    Closes-Bug: #1733588

commit cf1a1e89bbca50e285e99d31209c6eac6c697083
Author: Tim Burke <tim.burke@gmail.com>
Date:   Wed Aug 23 07:25:09 2017 +0000

expirer: unexpected responses don't warrant tracebacks
    
    If you want more information, you need to go check out the *other* node.
    
    Maybe this should be further refined to only log at debug for specific
    statuses like 404 and 412?
    
    Partial-Bug: 1688558
    Related-Bug: 1455221
    Change-Id: Ieefd8841154faba40dcf2a03abc5f056bdccd54f

commit 56b84c9295f3860a1fa0033774a7498105a1597f
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Tue Jan 16 12:02:13 2018 -0800

Minor cleanup in monitoring doc.
    
    Change-Id: Ia21f8743bfd745f2579db8658624f888461c2cc2

commit bc6fb8995123d08e672bdcf55bd35165509d60d9
Author: Monty Taylor <mordred@inaugust.com>
Date:   Tue Jan 16 11:32:12 2018 -0600

Add a note about the cost of COPY for setting metadata
    
    The pointer to using COPY to the same object as a mechanism to set only
    a subset of the metadata does not mention that doing so results in
    a full copy of the object in question on the backend.
    
    Add a note so it's clear that there is a tradeoff involved.
    
    Change-Id: I0c20a4909a6c3ff672f753d26cb9fb2f5f33d1f4

commit 0d324c16deacbd025314bd2211c824aa971a65b8
Author: guotao <guotao.bj@inspur.com>
Date:   Tue Jan 16 14:28:41 2018 +0800

Update http with https
    
    Use https instead of http for some links in readme.rst
    
    Change-Id: Idd382f58108e96129c69c6dc149c694fd7833fb3

commit 6e394bba0a783cf6bf06c6f60d4ccda150a87e67
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Mon Jan 15 15:16:08 2018 +0000

Add request_tries option to object-expirer.conf-sample
    
    ...and update the object-expirer man page.
    
    Change-Id: Idca1b8e3b7d5b40481af0d60477510e2557b88c0

commit a1ae142d5bcaf83ddde568ba4b957e23cc2b5e1c
Author: vxlinux <yan.wei7@zte.com.cn>
Date:   Thu Jan 4 16:18:37 2018 +0800

Merge repeat code for rebalance
    
    There are three similar code segments in the rebalance process, as follows:
    
        tiers = ['cluster', 'regions', 'zones', 'servers', 'devices']
        for i, tier_name in enumerate(tiers):
            replicas_at_tier = sum(weighted_replicas_by_tier[t] for t in
                                   weighted_replicas_by_tier if len(t) == i)
            if abs(self.replicas - replicas_at_tier) > 1e-10:
                raise exceptions.RingValidationError(
                    '%s != %s at tier %s' % (
                        replicas_at_tier, self.replicas, tier_name))
    
    I think we can encapsulate this code segment in a private function and
    replace those code segments with a function call.
    
    Change-Id: I89439286b211f2c5ef19deffa77c202f48f07cf8
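
The refactoring proposed in the commit above can be sketched as a single helper. This is a standalone illustration, not the code that was merged: the function name `validate_replicas_at_each_tier` and the local `RingValidationError` class are hypothetical stand-ins, and the tier keys are assumed to be tuples whose length identifies the tier level, as the `len(t) == i` test in the quoted snippet implies.

```python
class RingValidationError(Exception):
    """Local stand-in for swift's exceptions.RingValidationError."""


TIERS = ['cluster', 'regions', 'zones', 'servers', 'devices']


def validate_replicas_at_each_tier(replicas, weighted_replicas_by_tier):
    """Check that the replicas assigned at every tier level sum to `replicas`.

    `weighted_replicas_by_tier` maps tier tuples to replica counts; a tuple
    of length i belongs to TIERS[i] (the empty tuple is the whole cluster).
    """
    for i, tier_name in enumerate(TIERS):
        replicas_at_tier = sum(
            count for tier, count in weighted_replicas_by_tier.items()
            if len(tier) == i)
        if abs(replicas - replicas_at_tier) > 1e-10:
            raise RingValidationError(
                '%s != %s at tier %s' % (
                    replicas_at_tier, replicas, tier_name))


# Example data (made up): one region, one zone, one server, two devices.
example = {
    (): 3.0,
    (1,): 3.0,
    (1, 1): 3.0,
    (1, 1, '127.0.0.1'): 3.0,
    (1, 1, '127.0.0.1', 0): 1.5,
    (1, 1, '127.0.0.1', 1): 1.5,
}
validate_replicas_at_each_tier(3.0, example)  # passes silently
```

Each of the three call sites would then reduce to one call with the relevant mapping.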

commit 35ad4e874522dd749582233ead8a023e042493bb
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Fri Jan 12 17:26:26 2018 +0000

Add tests for X-Backend-Clean-Expiring-Object-Queue true
    
    Check that when X-Backend-Clean-Expiring-Object-Queue is true
    the object server does indeed call async_update.
    
    Change-Id: I0a87979147591f15349b868a12ac6dd15ac4e37f
    Related-Change: I4d64f4d1d107c437fd3c23e19160157fdafbcd42

commit 7afc6a06eed1e9e3fdbea756074111b8a209d266
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Thu Jan 11 14:21:39 2018 -0800

Remove un-needed hack in probetest
    
    If you ran this probe test with ssync before the related change it would
    demonstrate the related bug.  The hack isn't harmful, but it isn't
    needed anymore.
    
    Related-Change-Id: I7f90b732c3268cb852b64f17555c631d668044a8
    Related-Bug: 1652323
    
    Change-Id: I09e3984a0500a0f4eceec392e7970b84070a5b39

commit 55a1b63db501f18ba62e86a29db47465dce8eb26
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Wed Jan 10 15:53:06 2018 -0800

Let recon-cron work with conf.d
    
    Change-Id: I862b74e0d9b20ba149581c1add6473dc1e5b2859

commit 48da3c1ed783a2b69cc74b02e8fd45e9d36cf80a
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Tue Jan 9 13:27:48 2018 -0800

Limit object-expirer queue updates on object DELETE, PUT, POST
    
    Currently, on deletion of an expiring object, each object server
    writes an async_pending to update the expirer queue and remove the row
    for that object. Each async_pending is processed by the object updater
    and results in all container replicas being updated. This is also true
    for PUT and POST requests for existing expiring objects.
    
    If you have Rc container replicas and Ro object replicas (or EC
    pieces), then the number of expirer-queue requests made is Rc * Ro [1].
    
    For a 3-replica cluster, that number is 9, which is not terrible. For
    a cluster with 3 container replicas and a 15+4 EC scheme, that number
    is 57, which is terrible.
    
    This commit makes it so at most two object servers will write out the
    async_pending files needed to update the queue, dropping the request
    count to 2 * Rc [2]. The object server now looks for a header
    "X-Backend-Clean-Expiring-Object-Queue: <true|false>" and writes or
    does not write expirer-queue async_pendings as appropriate. The proxy
    sends that header to 2 object servers.
    
    The queue update is not necessary for the proper functioning of the
    object expirer; if the queue update fails, then the object expirer
    will try to delete the object, receive 404s or 412s, and remove the
    queue entry. Removal on object PUT/POST/DELETE is helpful but not
    required.
    
    [1] assuming no retries needed by the object updater
    
    [2] or Rc, if a cluster has only one object replica
    
    Change-Id: I4d64f4d1d107c437fd3c23e19160157fdafbcd42
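
The request counts quoted in the commit above follow from simple arithmetic. The sketch below only reproduces that arithmetic; the function names are made up for illustration, and retries by the object updater are ignored, as in footnote [1].

```python
def queue_requests_before(container_replicas, object_nodes):
    """Every object server writes an async_pending: Rc * Ro requests."""
    return container_replicas * object_nodes


def queue_requests_after(container_replicas, object_nodes):
    """At most two object servers write an async_pending: 2 * Rc requests
    (or Rc when there is only one object replica)."""
    return container_replicas * min(2, object_nodes)


# 3 container replicas, 3 object replicas
print(queue_requests_before(3, 3), queue_requests_after(3, 3))
# 3 container replicas, 15+4 EC (19 fragments)
print(queue_requests_before(3, 19), queue_requests_after(3, 19))
```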

commit 9754a2ebe3f0ef66efd6fe0ff6c32fbb2e992617
Author: Christian Schwede <cschwede@redhat.com>
Date:   Wed Jan 10 11:35:00 2018 +0100

Change exit code when displaying empty rings
    
    Displaying an empty ring should not be an error, thus
    changing the exit code back to the former value of 0.
    
    Closes-Bug: 1742417
    Change-Id: I779c30cff1b4d24483f993221a8c6d944b7ae98d

commit a41c458c90b12c52688dd8b2b8a818b79b4e9693
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Fri Jan 5 16:54:44 2018 -0800

proxy: make the right number of container updates
    
    When the proxy is putting X-Container headers into object PUT
    requests, it should put out just enough to make the container update
    durable in the worst case. It shouldn't do more, since that results in
    extra work for the container servers; and it shouldn't do less, since
    that results in objects not showing up in listings.
    
    The current code gets the number right as long as you have 3 container
    replicas and an odd number of object replicas, but it comes up with
    some bogus numbers in other cases. The number it computes is
    (object-quorum + 1).
    
    This patch changes the number to (container-quorum +
    max_put_failures).
    
    Example: given an EC 12+5 policy and 3 container replicas, you can
    lose up to 4 connections and still succeed. Since you need to have 2
    container updates happen for durability, you need 6 connections to
    have X-Container headers. That way, you can lose 4 and still have 2
    left. The current code would put X-Container headers on 14 of the
    connections, resulting in more than double the workload on the
    container servers; this patch changes the number to 6.
    
    Example 2: given a (crazy) EC 3+6 policy and 3 container replicas, you
    can lose up to 5 connections, so you need X-Container headers on
    7. The current code only sends 5, giving a worst-case result where a PUT
    succeeds but never reaches the containers. This patch changes the
    number to 7.
    
    Other examples:
                              |  current  |  this change  |
                            --+-----------+---------------+
    EC 10+4, 3x container     |    12     |      5        |
    EC 10+4, 5x container     |    12     |      6        |
    EC 15+4, 3x container     |    17     |      5        |
    EC 15+4, 5x container     |    17     |      6        |
    EC 4+8, 3x container      |    6      |      9        |
    7x object, 3x container   |    5      |      5        |
    6x object, 3x container   |    4      |      5        |
    
    Change-Id: I34efd48655b890340912810ab111bb63445e5c8b
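
The table in the commit above can be reproduced with a few lines of arithmetic. This is a sketch under two assumptions inferred from the examples rather than taken from the Swift source: the replicated quorum is `(n + 1) // 2`, and the EC quorum is the number of data fragments plus one. All function names are hypothetical.

```python
def replica_quorum(replicas):
    # Assumption: matches the "6x object" row, where quorum + 1 == 4.
    return (replicas + 1) // 2


def ec_quorum(ndata):
    # Assumption: one parity fragment beyond the data fragments is required.
    return ndata + 1


def updates_old(object_quorum):
    """Number computed before the patch: object-quorum + 1."""
    return object_quorum + 1


def updates_new(container_replicas, object_nodes, object_quorum):
    """Number after the patch: container-quorum + max_put_failures."""
    max_put_failures = object_nodes - object_quorum
    return replica_quorum(container_replicas) + max_put_failures


# EC 12+5 with 3 container replicas: 17 object nodes, quorum 13
print(updates_old(ec_quorum(12)), updates_new(3, 17, ec_quorum(12)))
# EC 3+6 with 3 container replicas: 9 object nodes, quorum 4
print(updates_old(ec_quorum(3)), updates_new(3, 9, ec_quorum(3)))
```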

commit e7ffda5d0b75bd85c3b886c3aad0c938c7d476d6
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Fri Jan 5 17:41:21 2018 +0000

Use _update_x_timestamp method in object controller DELETE method
    
    The DELETE method repeats inline the same behaviour as provided by
    _update_x_timestamp, so just call the method.
    
    Also add unit tests for the behaviour of _update_x_timestamp.
    
    Change-Id: I8b6cfdbfb54b6d43ac507f23d84309ab543374aa

commit bf13d64cd07c0957a88687a24a8ca861189fb5ac
Author: Matthew Oliver <matt@oliver.net.au>
Date:   Wed Jan 3 04:48:44 2018 +0000

Show devices marked as deleted on empty rings
    
    This is a follow up patch to 530258 which will show
    extra information on empty rings.
    
    This patch goes one step further. On a completely empty ring:
    
      $ swift-ring-builder my.builder create  8 3 1
      $ swift-ring-builder my.builder
      my.builder, build version 0, id 33b4e117056340feae7d40430180c6bb
      256 partitions, 3.000000 replicas, 0 regions, 0 zones, 0 devices, 0.00 balance, 0.00 dispersion
      The minimum number of hours before a partition can be reassigned is 1 (0:00:00 remaining)
      The overload factor is 0.00% (0.000000)
      Ring file my.ring.gz not found, probably it hasn't been written yet
      Devices:   id region zone ip address:port replication ip:port  name weight partitions balance flags meta
      There are no devices in this ring, or all devices have been deleted
    
    It will still start the device list and then say no devices. Why? Let's
    see what happens now on an empty ring with devices still marked as
    deleted:
    
      $ swift-ring-builder my.builder add r1z1-127.0.0.1:6010/sdb1 1
      Device d0r1z1-127.0.0.1:6010R127.0.0.1:6010/sdb1_"" with 1.0 weight got id 0
      $ swift-ring-builder my.builder add r1z1-127.0.0.1:6010/sdb2 1
      Device d1r1z1-127.0.0.1:6010R127.0.0.1:6010/sdb2_"" with 1.0 weight got id 1
      $ swift-ring-builder my.builder remove r1z1-127.0.0.1
      Matched more than one device:
          d0r1z1-127.0.0.1:6010R127.0.0.1:6010/sdb1_""
          d1r1z1-127.0.0.1:6010R127.0.0.1:6010/sdb2_""
      Are you sure you want to remove these 2 devices? (y/N) y
      d0r1z1-127.0.0.1:6010R127.0.0.1:6010/sdb1_"" marked for removal and will be removed next rebalance.
      d1r1z1-127.0.0.1:6010R127.0.0.1:6010/sdb2_"" marked for removal and will be removed next rebalance.
    
      $ swift-ring-builder my.builder
      my.builder, build version 4, id 33b4e117056340feae7d40430180c6bb
      256 partitions, 3.000000 replicas, 1 regions, 1 zones, 2 devices, 0.00 balance, 0.00 dispersion
      The minimum number of hours before a partition can be reassigned is 1 (0:00:00 remaining)
      The overload factor is 0.00% (0.000000)
      Ring file my.ring.gz not found, probably it hasn't been written yet
      Devices:   id region zone ip address:port replication ip:port  name weight partitions balance flags meta
                  0      1    1  127.0.0.1:6010      127.0.0.1:6010  sdb1 0.00          0    0.00   DEL
                  1      1    1  127.0.0.1:6010      127.0.0.1:6010  sdb2 0.00          0    0.00   DEL
      There are no devices in this ring, or all devices have been deleted
    
    Now even when all devices are removed we can still see them as they are still there, only marked as deleted.
    
    Change-Id: Ib39f734deb67ad50bcdad5333cba716161a47e95

commit e343452394780b1f555777bc7083912ac68633d3
Author: Tim Burke <tim.burke@gmail.com>
Date:   Mon Jan 8 20:02:50 2018 +0000

Support existing builders with None _last_part_moves
    
    These were likely written before the first related change, or created
    from an existing ring file.
    
    Also, tolerate missing dispersion when rebalancing -- that may not exist
    in the builder file.
    
    Change-Id: I26e3b4429c747c23206e4671f7c86543bb182a15
    Related-Change: Ib165cf974c865d47c2d9e8f7b3641971d2e9f404
    Related-Change: Ie239b958fc7e0547ffda2bebf61546bd4ef3d829
    Related-Change: I551fcaf274876861feb12848749590f220842d68

commit 94565d9137275f4c6c775835cf2c0b81693137be
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Mon Jan 8 10:14:06 2018 +0000

Disallow x-delete-at equal to x-timestamp
    
    Previously an x-delete-at value equal to the x-timestamp value was
    allowed.  This could only occur when x-timestamp happened to take an
    integer value and would result in an object that was immediately
    unreadable.
    
    Similarly an x-delete-after value of zero may previously have been
    accepted if x-timestamp happened to be an integer value.
    
    With this change an x-delete-at value equal to x-timestamp or an
    x-delete-after value of zero always results in a 400 BadRequest.
    
    Also cleans up check_delete_headers docstring.
    
    Related-Change: Ia8d00fcef8893e3b3dd5720da2c8a5ae1e6e4cb8
    Related-Change: Ib2483444d3999e13ba83ca2edd3a8ef8e5c48548
    Change-Id: I27fdd800d8e149302ff4d6531101e9726a14d471
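
The boundary case described in the commit above can be illustrated with a strict versus non-strict comparison. This is a simplified sketch, not the actual `check_delete_headers` code: the function names are hypothetical and the exact form of the old comparison is an assumption based on the commit message.

```python
def delete_at_accepted_old(x_delete_at, x_timestamp):
    # Assumed old behaviour: only a delete-at strictly in the past is rejected.
    return int(x_delete_at) >= float(x_timestamp)


def delete_at_accepted_new(x_delete_at, x_timestamp):
    # New behaviour: delete-at must be strictly later than the timestamp.
    return int(x_delete_at) > float(x_timestamp)


# x-delete-after: 0 yields delete-at == int(x-timestamp), which equals
# x-timestamp only when the timestamp happens to be a whole number.
print(delete_at_accepted_old(1000, '1000.00000'))  # old: slipped through
print(delete_at_accepted_new(1000, '1000.00000'))  # new: rejected
```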

commit 79ac3a3c311b393d7b64ce4c41d68b52801b52cb
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Mon Jan 8 14:36:53 2018 +0000

Fix intermittent check_delete_headers failure
    
    Use a utils.Timestamp object to set a more realistic x-timestamp
    header to avoid intermittent failure when str(time.time()) results
    in a rounded up value.
    
    Closes-Bug: 1741912
    Change-Id: I0c54d07e30ecb391f9429e7bcfb782f965ece1ea

commit 6151554a89216934c0be242b93b28d87adc421e0
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Fri Jan 5 14:44:52 2018 +0000

Correct 400 response message when x-delete-after is zero
    
    Before an x-delete-after header with value '0' would almost
    certainly result in a 400 response, but the response body would
    report a problem with x-delete-at. Now the response correctly
    blames the x-delete-after header.
    
    Related-Change: I9a1b6826c4c553f0442cfe2bb78cdf49508fa4a5
    Change-Id: Ia8d00fcef8893e3b3dd5720da2c8a5ae1e6e4cb8

commit b22d3c1115609a62b3fce5177be213ed3fa587c5
Author: xhancar <pavel.hancar@gmail.com>
Date:   Sat Jan 6 20:48:10 2018 +0000

fix of type error
    
    There was an incorrect path starting with /home/swift, but /home/<your-user-name> is correct for common users.
    
    Change-Id: Ia81b2119c87dd88417428e55c82dac1ab7c028b3
    Closes-Bug: 1741378

commit 582460ecf9d503c10fd73c055301634b0b009dbc
Author: Alistair Coles <alistairncoles@gmail.com>
Date:   Fri Jan 5 14:43:12 2018 +0000

Document that x-delete-after takes precedence over x-delete-at
    
    Change-Id: Ib2483444d3999e13ba83ca2edd3a8ef8e5c48548

commit 31c294de797be30f499750ccbed3ec18a717f9b1
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Thu Jan 4 20:28:28 2018 -0800

Fix time skew when using X-Delete-After
    
    When a client sent "X-Delete-After: <n>", the proxy and all object
    servers would each compute X-Delete-At as "int(time.time() +
    n)". However, since they don't all compute it at exactly the same
    time, the objects stored on disk can end up with differing values for
    X-Delete-At, and in that case, the object-expirer queue has multiple
    entries for the same object (one for each distinct X-Delete-At value).
    
    This commit makes two changes, either one of which is sufficient to
    fix the bug.
    
    First, after computing X-Delete-At from X-Delete-After, X-Delete-After
    is removed from the request's headers. Thus, the proxy computes
    X-Delete-At, and the object servers don't, so there's only a single
    value.
    
    Second, computation of X-Delete-At now uses the request's X-Timestamp
    instead of time.time(). In the proxy, these values are essentially the
    same; the proxy is responsible for setting X-Timestamp. In the object
    server, this ensures that all computed X-Delete-At values are
    identical, even if the object servers' clocks are not, or if one
    object server takes an extra second to respond to a PUT request.
    
    Co-Authored-By: Alistair Coles <alistairncoles@gmail.com>
    Change-Id: I9a1b6826c4c553f0442cfe2bb78cdf49508fa4a5
    Closes-Bug: 1741371
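
Both changes described in the commit above can be shown in one small function. This is a minimal sketch over a plain dict of headers, not Swift's request-handling code; the function name `resolve_delete_at` is hypothetical.

```python
def resolve_delete_at(headers):
    """Return a copy of `headers` with X-Delete-After turned into X-Delete-At.

    The value is derived from the request's X-Timestamp rather than the
    local clock, and X-Delete-After is dropped so that later hops do not
    recompute it.
    """
    headers = dict(headers)
    if 'X-Delete-After' in headers:
        delete_after = int(headers.pop('X-Delete-After'))
        timestamp = float(headers['X-Timestamp'])
        headers['X-Delete-At'] = str(int(timestamp + delete_after))
    return headers


request = {'X-Timestamp': '1515126508.90000', 'X-Delete-After': '60'}
at_proxy = resolve_delete_at(request)
# An object server that runs the same logic later sees no X-Delete-After,
# so the value chosen by the proxy is kept unchanged.
at_object_server = resolve_delete_at(at_proxy)
```

Even if an object server did receive X-Delete-After, computing from X-Timestamp would give the same result on every node regardless of when it handled the request.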

commit a13e0ee76b1b5e56096878930fd24c638148155f
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Thu Jan 4 20:40:11 2018 -0800

Ignore directory .stestr
    
    After running the functional tests, this directory shows up. I don't
    know what's in it, but I'm fairly certain I don't want to commit it.
    
    Change-Id: If9179330c337daf2ae0a01e6c8aa8d349969e737

commit 49de7db532ffaba9fbd3c7e912e007b9d8a36d7c
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Fri Dec 29 08:53:21 2017 -0800

add swift-ring-builder option to recalculate dispersion
    
    Since dispersion info is cached, this can easily happen if we make
    changes to how dispersion info is calculated or stored (e.g. we extend
    the dispersion calculation to consider dispersion of all part-replicas
    in the related change)
    
    Related-Change-Id: Ifefff0260deac0c3e8b369a1e158686c89936686
    
    Change-Id: I714deb9e349cd114a21ec591216a9496aaf9e0d1

commit 9189f51d7601254181b2458f8c3e64c74d6dfad0
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Wed Dec 27 12:10:57 2017 -0800

Display more info on empty rings
    
    Related-Bug: #1737068
    Related-Change-Id: Ibadaf64748728a47a8f3f861ec1af601dbfeb9e0
    Change-Id: I683677f33764fa56dadfb7f6208f7f6ee25c8557

commit 56126b28392986636e99f47c3003ed6058b62fd9
Author: vxlinux <yan.wei7@zte.com.cn>
Date:   Fri Dec 22 11:40:40 2017 +0800

Handle EmptyRingError in swift-ring-builder's default command
    
    When the default display command for swift-ring-builder encounters an
    EmptyRingError trying to calculate balance, it should not raise an exception
    and display the traceback in a command line environment.
    
    Instead handle the exceptional condition and provide the user with
    useful feedback.
    
    Closes-Bug: #1737068
    Change-Id: Ibadaf64748728a47a8f3f861ec1af601dbfeb9e0

commit f709eed41b9579cf5a8ca9180143301a2b452d47
Author: Samuel Merritt <sam@swiftstack.com>
Date:   Thu Dec 28 14:56:08 2017 -0800

Fix socket leak on 416 EC GET responses.
    
    Sometimes, when handling an EC GET request with a Range header, the
    object servers reply 206 to the proxy, but the proxy (correctly)
    replies 416 to the client[1]. In that case, the connections to the object
    servers were not being closed. This was due to improper error handling
    in ECAppIter.
    
    Since ECAppIter is intended to be a WSGI iterable, it expects to have
    its close() method called when the caller is done with it. In this
    particular case, the caller (ECAppIter.kickoff()) was not calling
    close() when an exception was raised. Now it is.
    
    [1] consider a 4+2 EC policy with segment size 1024, a 20 byte
    object, and a request with "Range: bytes=21-50". The proxy needs whole
    fragments to decode, so it asks the object server for "Range:
    bytes=0-255" [2], the object server says 206, and then the proxy
    realizes that the client's request is unsatisfiable and tells the
    client 416.
    
    [2] segment size 1024 and 4 data fragments means the fragments have
    size 1024 / 4 = 256, hence "bytes=0-255" asks for the first whole
    fragment
    
    Change-Id: Ide2edf8c449c97d45f48c2dbbbff7aebefa4b158
    Closes-Bug: 1738804
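
The range arithmetic in footnotes [1] and [2] of the commit above can be sketched as follows. This is a simplification with hypothetical function names: it assumes both ends of the client range are given and that a fragment is exactly segment size divided by the number of data fragments, ignoring any per-fragment overhead.

```python
def client_range_to_fragment_range(first, last, segment_size, ndata):
    """Map a client byte range to the fragment byte range the proxy needs.

    The proxy can only decode whole segments, so the range is widened to
    segment boundaries and then scaled down to fragment offsets.
    """
    fragment_size = segment_size // ndata
    first_segment = first // segment_size
    last_segment = last // segment_size
    return (first_segment * fragment_size,
            (last_segment + 1) * fragment_size - 1)


def is_satisfiable(first, object_length):
    """A range is unsatisfiable if it starts at or beyond the object's end."""
    return first < object_length


# 4+2 policy, segment size 1024, client asks for "bytes=21-50"
print(client_range_to_fragment_range(21, 50, 1024, 4))
# ...but the object is only 20 bytes long, so the client gets a 416
print(is_satisfiable(21, 20))
```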

commit 7013e70ca67891e94664e9eca70925b61ee8f689
Author: Clay Gerrard <clay.gerrard@gmail.com>
Date:   Thu Dec 14 20:03:24 2017 -0800

Represent dispersion worse than one replicanth
    
    With a sufficiently undispersed ring it's possible to move an entire
    replica's worth of parts and yet the value of dispersion may not get any
    better (even though in reality dispersion has dramatically improved).
    The problem is dispersion will currently only represent up to one whole
    replica worth of parts being undispersed.
    
    However with EC rings it's possible for more than one whole replica's
    worth of partitions to be undispersed; in these cases the builder will
    require multiple rebalance operations to fully disperse replicas - but
    the dispersion value should improve with every rebalance.
    
    N.B. with this change it's possible for rings with a bad dispersion
    value to measure as having a significantly smaller dispersion value
    after a rebalance (even though they may not have had their dispersion
    change) because the total amount of bad dispersion we can measure has
    been increased but we're normalizing within a similar range.
    
    Closes-Bug: #1697543
    
    Change-Id: Ifefff0260deac0c3e8b369a1e158686c89936686

commit 61fe6aae81d00597c777a64ac337a8dfb990f0c2
Author: Tim Burke <tim.burke@gmail.com>
Date:   Tue Aug 22 22:40:58 2017 +0000

Better mock out OSErrors in test_replicator before raising them
    
    Also, provide a return value for resp.read() so we hit a
    pickle error instead of a type error.
    
    Change-Id: I56141eee63ad1ceb2edf807432fa2516fabb15a6

commit 0bdec4661b5609ca1bf813a7ccd514e5d444b07f
Author: Kazuhiro MIYAHARA <miyahara.kazuhiro@lab.ntt.co.jp>
Date:   Mon Dec 25 09:13:17 2017 +0000

Skip symlink + vw functional tests if symlink is not enabled
    
    Functional tests for symlink and versioned writes run and result in
    failure even if symlink is not enabled.
    
    This patch fixes the functional tests to run only if both of
    symlink and versioned writes are enabled.
    
    Change-Id: I5ffd0b6436e56a805784baf5ceb722effdf74884

commit 1449532fb82b4fe3a5484b547d425dcda82df259
Author: Kazuhiro MIYAHARA <miyahara.kazuhiro@lab.ntt.co.jp>
Date:   Mon Dec 25 07:09:49 2017 +0000

Allow InternalClient to container/object listing with prefix
    
    This patch adds a 'prefix' argument to the iter_containers/iter_objects
    methods of InternalClient.
    This change will be used in general task queue feature [1].
    
    [1]: https://review.openstack.org/#/c/517389/
    
    Change-Id: I8c2067c07fe35681fdc9403da771f451c21136d3