some parts replicas assigned to duplicate devices in the ring

Bug #1452431 reported by clayg
This bug affects 2 people
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: High
Assigned to: Samuel Merritt

Bug Description

With rings whose replica count is constrained by the number of devices - when the weights of the available devices are not well distributed - the RingBuilder will sometimes place more than one replica of a partition on the same device.

This is more likely with EC rings, where the replica count tends to approach the device count.

You can work around the issue if you set overload very high before the initial balance. After a rebalance it can take some time for overload to pull the duplicated replicas off the hungry devices.
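For illustration, here is a minimal sketch (not from the bug report) of the reported setup and that workaround using the Python RingBuilder API; it assumes a Swift development install so swift.common.ring.RingBuilder is importable, and the overload value of 10.0 is just an arbitrary stand-in for "very high":

    from swift.common.ring import RingBuilder

    # 2**10 = 1024 partitions, 14 replicas, min_part_hours = 0, matching the
    # example builder output below
    builder = RingBuilder(10, 14, 0)
    for i in range(15):
        builder.add_dev({'id': i, 'region': 1, 'zone': 1,
                         'ip': '127.0.0.1', 'port': 6000 + i,
                         'device': 'd%d' % i,
                         'weight': 2.0 if i < 10 else 1.0})

    # Workaround: raise overload *before* the initial rebalance so the lighter
    # devices are allowed to take more than their weight-fair share of parts
    # instead of a heavy device taking two replicas of the same part.
    builder.overload = 10.0   # assumed equivalent of swift-ring-builder's set_overload
    builder.rebalance()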

Here's an example ring builder output that demonstrates the issue:

ec-test.builder, build version 15
1024 partitions, 14.000000 replicas, 1 regions, 1 zones, 15 devices, 0.10 balance, 29.98 dispersion
The minimum number of hours before a partition can be reassigned is 0
The overload factor is 0.00% (0.000000)
Devices: id region zone ip address port replication ip replication port name weight partitions balance meta
             0 1 1 127.0.0.1 6000 127.0.0.1 6000 d0 2.00 1146 -0.08
             1 1 1 127.0.0.1 6001 127.0.0.1 6001 d1 2.00 1147 0.01
             2 1 1 127.0.0.1 6002 127.0.0.1 6002 d2 2.00 1147 0.01
             3 1 1 127.0.0.1 6003 127.0.0.1 6003 d3 2.00 1146 -0.08
             4 1 1 127.0.0.1 6004 127.0.0.1 6004 d4 2.00 1146 -0.08
             5 1 1 127.0.0.1 6005 127.0.0.1 6005 d5 2.00 1146 -0.08
             6 1 1 127.0.0.1 6006 127.0.0.1 6006 d6 2.00 1147 0.01
             7 1 1 127.0.0.1 6007 127.0.0.1 6007 d7 2.00 1147 0.01
             8 1 1 127.0.0.1 6008 127.0.0.1 6008 d8 2.00 1147 0.01
             9 1 1 127.0.0.1 6009 127.0.0.1 6009 d9 2.00 1147 0.01
            10 1 1 127.0.0.1 6010 127.0.0.1 6010 d10 1.00 574 0.10
            11 1 1 127.0.0.1 6011 127.0.0.1 6011 d11 1.00 574 0.10
            12 1 1 127.0.0.1 6012 127.0.0.1 6012 d12 1.00 574 0.10
            13 1 1 127.0.0.1 6013 127.0.0.1 6013 d13 1.00 574 0.10
            14 1 1 127.0.0.1 6014 127.0.0.1 6014 d14 1.00 574 0.10

15 devices and 14 replicas, but some of the devices have half the weight. Towards the end of partition placement the hungry devices get assigned multiple replicas of the same part. Note that the partition counts still add up to replicas * parts (4 x 1146 + 6 x 1147 + 5 x 574 = 14336 = 14 x 1024) - because a device holding multiple replicas of the same part happily increments its part count for each of them.

I've attached a script that I think would be a good addition to the RingBuilder's post-rebalance validate method after we decide how we want to fix it.
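The attached script isn't reproduced here, but a rough sketch of the kind of check it performs might look like the following: given a replica-to-partition-to-device table shaped like the builder's internal _replica2part2dev (one row per replica, each row indexed by partition and holding device ids), report every partition with more than one replica on the same device. The function name and the attribute access in the trailing comment are illustrative assumptions, not the attached script:

    from collections import Counter

    def find_duplicate_part_replicas(replica2part2dev):
        """Map part -> list of device ids holding more than one replica of it."""
        duplicates = {}
        num_parts = len(replica2part2dev[0])
        for part in range(num_parts):
            # the last row can be shorter when the replica count is fractional
            devs = [row[part] for row in replica2part2dev if part < len(row)]
            repeated = [dev for dev, count in Counter(devs).items() if count > 1]
            if repeated:
                duplicates[part] = repeated
        return duplicates

    # e.g. find_duplicate_part_replicas(builder._replica2part2dev) after a
    # rebalance; an empty dict means no device holds two replicas of one part.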

Revision history for this message
clayg (clay-gerrard) wrote :
clayg (clay-gerrard)
summary: - some parts replicas assigned to duplicate devices
+ some parts replicas assigned to duplicate devices in the ring
tags: added: ec
Revision history for this message
clayg (clay-gerrard) wrote :

@torgomatic should be able to confirm this bug

Revision history for this message
Samuel Merritt (torgomatic) wrote :

Yeah, this'll happen.

Note that fixing it means you can't have an N-replica ring with fewer than N devices. That's probably not important any more now that you can change a builder's replica count.
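For illustration, a hedged sketch of that alternative with the RingBuilder API: shrink the builder's replica count to fit the devices you actually have via set_replicas, rather than keeping an N-replica ring on fewer than N devices. The part power, device details, and target replica count below are made-up example values:

    from swift.common.ring import RingBuilder

    builder = RingBuilder(10, 14, 1)      # builder created asking for 14 replicas
    for i in range(5):                    # ...but only 5 devices are ever added
        builder.add_dev({'id': i, 'region': 1, 'zone': 1,
                         'ip': '127.0.0.1', 'port': 6000 + i,
                         'device': 'sdb%d' % i, 'weight': 100})
    builder.set_replicas(5)               # explicitly acknowledge the real replica count
    builder.rebalance()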

Changed in swift:
status: New → Confirmed
Changed in swift:
assignee: nobody → Samuel Merritt (torgomatic)
Changed in swift:
importance: Undecided → Medium
Revision history for this message
paul luse (paul-e-luse) wrote :

Sam, are you going to knock this one out?

Revision history for this message
Samuel Merritt (torgomatic) wrote :

I don't know about "knock out"; I've gone six rounds with it and it's winning.

tags: removed: ec
Revision history for this message
clayg (clay-gerrard) wrote :

I don't know if we'll "fix" this anytime soon - but we have got to stop letting people deploy rings like this. At a minimum, add the validator so we can fail early and fail loudly.

Changed in swift:
importance: Medium → Critical
Revision history for this message
clayg (clay-gerrard) wrote :

I think if we can get the validator in we can reduce the importance of this bug to high.

Revision history for this message
John Dickinson (notmyname) wrote :

With patch https://review.openstack.org/#/c/222799/ we can lower this priority.

Changed in swift:
importance: Critical → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to swift (master)

Reviewed: https://review.openstack.org/222799
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=5070869ac0e6a2d577dd4054ffbcbffd06db3c5b
Submitter: Jenkins
Branch: master

commit 5070869ac0e6a2d577dd4054ffbcbffd06db3c5b
Author: Clay Gerrard <email address hidden>
Date: Fri Sep 11 16:24:52 2015 -0700

    Validate against duplicate device part replica assignment

    We should never assign multiple replicas of the same partition to the
    same device - our on-disk layout can only support a single replica of a
    given part on a single device. We should not do this, so we validate
    against it and raise a loud warning if this terrible state is ever
    observed after a rebalance.

    Unfortunately there are currently a couple of not-uncommon scenarios
    which will trigger this observed state today:

     1. If we have fewer devices than replicas
     2. If a server's or zone's aggregate device weight makes it the most
        appropriate candidate for multiple replicas and you're a bit unlucky

    Fixing #1 would be easy: we should just not allow that state anymore.
    Really we never did - if you have a 3-replica ring with one device, you
    have one replica. Everything that iter_nodes'd would de-dupe. We
    should just be insisting that you explicitly acknowledge your replica
    count with set_replicas.

    I have been lost in the abyss for days searching for a general solution
    to #2. I'm sure it exists, but I will not have wrestled it to
    submission by RC1. In the meantime we can eliminate a great deal of the
    luck required simply by refusing to place more than one replica of a
    part on a device in assign_parts.

    The meat of the change is a small update to the .validate method in
    RingBuilder. It basically unrolls a pre-existing (part, replica) loop
    so that all the replicas of the part come out in order so that we can
    build up the set of dev_id's for which all the replicas of a given part
    are assigned part-by-part.

    If we observe any duplicates - we raise a warning.

    To clean the cobwebs out of the rest of the corner cases we're going to
    delay get_required_overload from kicking in until we achieve dispersion,
    and a small check was added when selecting a device subtier to validate
    if it's already being used - picking any other device in the tier works
    out much better. If no other devices are available in the tier - we
    raise a warning. A more elegant or optimized solution may exist.

    Many unit tests did not meet criterion #1, but the fix was
    straightforward once identified by the pigeonhole check.

    However, many more tests were affected by #2 - but again the fix turned
    out to be simply adding more devices. The fantasy that all failure domains
    contain at least replica-count devices is prevalent in both our ring
    placement algorithm and its tests. These tests were trying to
    demonstrate some complex characteristics of our ring placement algorithm
    and I believe we just got a bit too carried away trying to find the
    simplest possible example to demonstrate the...


Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to swift (feature/crypto)

Related fix proposed to branch: feature/crypto
Review: https://review.openstack.org/234065

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to swift (feature/crypto)

Reviewed: https://review.openstack.org/234065
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=c80229fd86853f5f8541aeef4b5044170572640d
Submitter: Jenkins
Branch: feature/crypto

commit 9cafa472a336f66d149a20c12f4251703d96f04d
Author: Ondřej Nový <email address hidden>
Date: Sat Oct 10 20:57:07 2015 +0200

    Autodetect systemctl in SAIO and use it on systemd distros

    Change-Id: I84a9b27baac89327749d8774032860f8ad5166f2

commit 92767f28d668643bc2affee7b2fd46fd9349656a
Author: Emile Snyder <email address hidden>
Date: Sun Oct 11 21:24:54 2015 -0700

    Fix 'swift-ring-builder write_builder' after you remove a device

    clayg already posted the code fix in the bug, but noted it needs a test.

    Closes-Bug: #1487280
    Change-Id: I07317754afac7165baac4e696f07daeba2e72adc

commit a48649002970b2150d24d0622a100f54045443c5
Author: Lisak, Peter <email address hidden>
Date: Mon Oct 12 14:42:01 2015 +0200

    swift-recon fails with socket.timeout exception

    If some server is overloaded or the timeout is set too low, swift-recon
    fails with a raised socket.timeout exception.

    This error should be processed the same way as HTTPError/URLError.

    Change-Id: Ide8843977ab224fa866097d0f0b765d6899c66b8

commit 767fac8186ea4541f4466ac9a55c03abea6a878b
Author: Christian Schwede <email address hidden>
Date: Mon Oct 12 07:09:00 2015 +0000

    Enable H234 check (assertEquals is deprecated, use assertEqual)

    All usages of assertEquals and assertNotEquals are fixed now, so let's enable
    the H234 check to avoid regressions in the future.

    Change-Id: I2c2ccb3b268cf9eb11f2db045378ab125a02bc31

commit 1882801be1d8983cd718786bd409cf09f65a00b0
Author: janonymous <email address hidden>
Date: Mon Aug 31 21:49:49 2015 +0530

    pep8 fix: assertNotEquals -> assertNotEqual

    assertNotEquals is deprecated in py3

    Change-Id: Ib611351987bed1199fb8f73a750955a61d022d0a

commit f5f9d791b0b8b32350bd9a47fbc00ff86a65f09d
Author: janonymous <email address hidden>
Date: Wed Aug 5 23:58:14 2015 +0530

    pep8 fix: assertEquals -> assertEqual

    assertEquals is deprecated in py3, replacing it.

    Change-Id: Ida206abbb13c320095bb9e3b25a2b66cc31bfba8
    Co-Authored-By: Ondřej Nový <email address hidden>

commit 1ba7641c794104de57e5010f76cecbf146a2a63b
Author: Zack M. Davis <email address hidden>
Date: Thu Oct 8 16:16:18 2015 -0700

    minutæ: port ClientException tweaks from swiftclient; dict .pop

    openstack/python-swiftclient@5ae4b423 changed python-swiftclient's
    ClientException to have its http_status attribute default to
    None (rather than 0) and to use super in its __init__ method. For
    consistency's sake, it's nice for Swift's inlined copy of
    ClientException to receive the same patch. Also, the retry function in
    direct_client (a major user of ClientException) was using a somewhat
    awkward conditional-assignment-and-delete construction where the .pop
    method of dictionaries would be more idiomatic.

    Change-Id: I70a12f934f84f57549617af28b86f7f5637bd8fa

commit 01f9d15045129d09...

tags: added: in-feature-crypto
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix proposed to swift (feature/hummingbird)

Related fix proposed to branch: feature/hummingbird
Review: https://review.openstack.org/236162

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Related fix merged to swift (feature/hummingbird)

Reviewed: https://review.openstack.org/236162
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=18ddcaf0d6b67fcbb6b0a4cf4a9a99c72f6f3a08
Submitter: Jenkins
Branch: feature/hummingbird

commit a9ddc2d9ea402eaac7ccd8992387f77855968ab5
Author: Mahati Chamarthy <email address hidden>
Date: Fri Oct 16 18:18:33 2015 +0530

    Hyperlink fix to first contribution doc

    Change-Id: I19fc1abc89f888233b80a57c68a152c1c1758640

commit 83a1151d13e096b480aefe6ec18259f2d7d021db
Author: Pete Zaitcev <email address hidden>
Date: Fri Oct 9 16:45:20 2015 -0600

    Interpolate the explanation string not whole HTML body

    The only reason this exists is that I promised to do it.
    But in our case, there's no big advantage, and here's why.

    The general thinking goes that strings must be interpolated
    first because the body may contain a syntax that confuses the
    interpolation. So this patch makes the code "more correct".
    However, our HTML template is tightly controlled. It's not
    like it contains additional percents.

    So I'll just leave this here for now while I'm asking if
    the content type is set correctly.

    Change-Id: Ia18aeb0f94ef389f8b95450986a566e5fa06aa10

commit 384b91eb824376659989b904f9396cbf2e02d2bd
Author: asettle <email address hidden>
Date: Thu Sep 3 15:11:46 2015 +1000

    Moving DLO functionality doc to the middleware code

    This change moves the RST DLO documentation from
    statically inside overview_large_objects.rst and moves it
    to middleware/dlo.py.
    This is where all middleware RST documentation is defined.

    The overview_large_objects.rst is still the main page
    for information on large objects, so now dynamically
    points to both the DLO and SLO middleware RST
    documentation and the relevant middleware.rst page
    simply points to it.

    Change-Id: I40d918c8b7bc608ab945805d69fe359521df038a
    Closes-bug: #1276852

commit 2996974e5d48b4efaa1b271b8fbd0387bced7242
Author: Ondřej Nový <email address hidden>
Date: Sat Oct 10 14:56:30 2015 +0200

    Script for running unit, func and probe tests at once

    When developing Swift it's often needed to run all tests.
    This script makes it much simpler.

    Change-Id: I67e6f7cc05ebd0475001c1b56e8f6fd09c8c644f

commit c2182fd4163050a5f76eb3dedb7703dc821fa83d
Author: janonymous <email address hidden>
Date: Fri Jul 17 20:20:15 2015 +0530

    Python3: do not use im_self/im_func/func_closure

    Use __self__, __func__ and __closure__ instead, as they work
    with both Python 2 and 3.

    Modifying usage of __func__ in codebase.

    Change-Id: I57e907c28c1d4646605e70194ea3650806730b83

commit c0866ceaac2f69ae01345a795520141f59ec64f5
Author: Samuel Merritt <email address hidden>
Date: Fri Sep 25 17:26:37 2015 -0700

    Improve SLO PUT error checking

    This commit tries to give the user a reason that their SLO manifest
    was invalid instead of just saying "Invalid SLO Manifest File". It
    doesn't get every error condition, but it's better than before.

    Examples of things that now have real error...

tags: added: in-feature-hummingbird
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/241571
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=7035639dfd239b52d4ed46aae50f78d16ec8cbfe
Submitter: Jenkins
Branch: master

commit 7035639dfd239b52d4ed46aae50f78d16ec8cbfe
Author: Clay Gerrard <email address hidden>
Date: Thu Oct 15 16:20:58 2015 -0700

    Put part-replicas where they go

    It's harder than it sounds. There were really three challenges.

    Challenge #1 Initial Assignment
    ===============================

    Before starting to assign parts on this new shiny ring you've
    constructed, maybe we'll pause for a moment up front and consider the
    lay of the land. This process is called the replica_plan.

    The replica_plan approach separates part assignment failures into
    two modes:

     1) we considered the cluster topology and its weights and came up with
        the wrong plan

     2) we failed to execute on the plan

    I failed at both parts plenty of times before I got it this close. I'm
    sure a counter example still exists, but when we find it the new helper
    methods will let us reason about where things went wrong.

    Challenge #2 Fixing Placement
    =============================

    With a sound plan in hand, it's much easier to fail to execute on it the
    less material you have to execute with - so we gather up as many parts
    as we can - as long as we think we can find them a better home.

    Picking the right parts for gather is a black art - when you notice a
    balance is slow it's because it's spending so much time iterating over
    replica2part2dev trying to decide just the right parts to gather.

    The replica plan can help at least in the gross dispersion collection to
    gather up the worst offenders first before considering balance. I think
    trying to avoid picking up parts that are stuck to the tier before
    falling into a forced grab on anything over parts_wanted helps with
    stability generally - but depending on where the parts_wanted are in
    relation to the full devices it's pretty easy pick up something that'll
    end up really close to where it started.

    I tried to break the gather methods into smaller pieces so it looked
    like I knew what I was doing.

    Going with a MAXIMUM gather iteration instead of balance (which doesn't
    reflect the replica_plan) doesn't seem to be costing me anything - most
    of the time the exit condition is either solved or all the parts are overly
    aggressively locked up on min_part_hours. So far, it mostly seems that if
    the thing is going to balance this round it'll get it in the first
    couple of shakes.

    Challenge #3 Crazy replica2part2dev tables
    ==========================================

    I think there are lots of ways "scars" can build up in a ring which can
    result in very particular replica2part2dev tables that are physically
    difficult to dig out of. It's repairing these scars that will take
    multiple rebalances to resolve.

    ... but at this point ...

    ... lacking a counter example ...

    I've been able to close up all the edg...


Changed in swift:
status: Confirmed → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/crypto)

Fix proposed to branch: feature/crypto
Review: https://review.openstack.org/257416

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/crypto)

Reviewed: https://review.openstack.org/257416
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=b38681f64613ff976b8f1137fad0a7dd28e27410
Submitter: Jenkins
Branch: feature/crypto

commit 52b534c9d13efde15841402dc86c99685334250f
Author: hgangwx <email address hidden>
Date: Fri Dec 11 17:31:36 2015 +0800

    Fix some inconsistency in docstrings

    Added colon after ":returns" according to:
    http://docs.openstack.org/developer/hacking

     blank lines are added before the param lines

    Change-Id: I008bea270c6a4b7c80d1a84a2cb9fcc9b5a7264a

commit 19c7dbc0ba49ff7e1f59190a7ead9ae9d3d80238
Author: Mahati Chamarthy <email address hidden>
Date: Fri Dec 11 12:27:18 2015 +0530

    Update versioned_writes doc

    Change-Id: Ibe53c79cf49330332112001c02a2a6b078764130

commit 70bc3d1a3a240c35eba934010fdef99a775b8715
Author: John Dickinson <email address hidden>
Date: Tue Dec 8 16:08:04 2015 -0800

    added a few ruby clients to associated projects

    Change-Id: I4764ba505646949ff694f8939947f83c6940128a

commit 7035639dfd239b52d4ed46aae50f78d16ec8cbfe
Author: Clay Gerrard <email address hidden>
Date: Thu Oct 15 16:20:58 2015 -0700

    Put part-replicas where they go

    It's harder than it sounds. There were really three challenges.

    Challenge #1 Initial Assignment
    ===============================

    Before starting to assign parts on this new shiny ring you've
    constructed, maybe we'll pause for a moment up front and consider the
    lay of the land. This process is called the replica_plan.

    The replica_plan approach separates part assignment failures into
    two modes:

     1) we considered the cluster topology and its weights and came up with
        the wrong plan

     2) we failed to execute on the plan

    I failed at both parts plenty of times before I got it this close. I'm
    sure a counter example still exists, but when we find it the new helper
    methods will let us reason about where things went wrong.

    Challenge #2 Fixing Placement
    =============================

    With a sound plan in hand, it's much easier to fail to execute on it the
    less material you have to execute with - so we gather up as many parts
    as we can - as long as we think we can find them a better home.

    Picking the right parts for gather is a black art - when you notice a
    balance is slow it's because it's spending so much time iterating over
    replica2part2dev trying to decide just the right parts to gather.

    The replica plan can help at least in the gross dispersion collection to
    gather up the worst offenders first before considering balance. I think
    trying to avoid picking up parts that are stuck to the tier before
    falling into a forced grab on anything over parts_wanted helps with
    stability generally - but depending on where the parts_wanted are in
    relation to the full devices it's pretty easy pick up something that'll
    end up really close to where it started.

    I tried to break the gather methods into smaller pieces so it looked
    like I knew what I was doing.

    Going wi...


Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/hummingbird)

Fix proposed to branch: feature/hummingbird
Review: https://review.openstack.org/264517

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/hummingbird)

Reviewed: https://review.openstack.org/264517
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=93ddaffaebb620bc712d52d0267194562dce02be
Submitter: Jenkins
Branch: feature/hummingbird

commit f53cf1043d078451c4b9957027bf3af378aa0166
Author: Ondřej Nový <email address hidden>
Date: Tue Jan 5 20:20:15 2016 +0100

    Fixed few misspellings in comments

    Change-Id: I8479c85cb8821c48b5da197cac37c80e5c1c7f05

commit 79222e327f9df6335b58e17a6c8dd0dc44b86c17
Author: ChangBo Guo(gcb) <email address hidden>
Date: Sat Dec 26 13:13:37 2015 +0800

    Fix AttributeError for LogAdapter

    LogAdapter object has no attribute 'warn' but has attribute
    'warning'.

    Closes-Bug: #1529321
    Change-Id: I0e0bd0a3dbc4bb5c1f0b343a8809e53491a1da5f

commit 684c4c04592278a280032002b5313b171ee7a4c0
Author: janonymous <email address hidden>
Date: Sun Aug 2 22:47:42 2015 +0530

    Python 3 deprecated the logger.warn method in favor of warning

    DeprecationWarning: The 'warn' method is deprecated, use 'warning'
    instead

    Change-Id: I35df44374c4521b1f06be7a96c0b873e8c3674d8

commit d0a026fcb8e8a9f5475699cc56e1998bdc4cd5ca
Author: Hisashi Osanai <email address hidden>
Date: Wed Dec 16 18:50:37 2015 +0900

    Fix duplication for headers in Access-Control-Expose-Headers

    There are following problems with Access-Control-Expose-Headers.

    * If headers in X-Container-Meta-Access-Control-Expose-Headers are
      configured, the headers are kept as case-sensitive strings.
      When a CORS request comes, the headers are merged into
      Access-Control-Expose-Headers as case-sensitive strings even if
      the same header is already present with different case.

    * Access-Control-Expose-Headers is handled as a list.
      If X-Container/Object-Meta-XXX is configured on the container/object
      and in X-Container-Meta-Access-Control-Expose-Headers, the same header
      is listed more than once in Access-Control-Expose-Headers.

    This patch provides a fix for the problems.

    Change-Id: Ifc1c14eb3833ec6a851631cfc23008648463bd81

commit 0bcd7fd50ec0763dcb366dbf43a9696ca3806f15
Author: Bill Huber <email address hidden>
Date: Fri Nov 20 12:09:26 2015 -0600

    Update Erasure Coding Overview doc to remove Beta version

    The major functionality of EC has been released for Liberty and
    the beta version of the code has been removed since it is now
    in production.

    Change-Id: If60712045fb1af803093d6753fcd60434e637772

commit 84ba24a75640be4212e0f984c284faf4c894e7c6
Author: Alistair Coles <email address hidden>
Date: Fri Dec 18 11:24:34 2015 +0000

    Fix rst errors so that html docs are complete

    rst table format errors don't break the gate job
    but do cause sections of the documents to go missing
    from the html output.

    Change-Id: Ic8c9953c93d03dcdafd8f47b271d276c7b356dc3

commit 87f7e907ee412f5847f1f9ffca7a566fb148c6b1
Author: Matthew Oliver <email address hidden>
Date: Wed Dec 16 17:19:24 2015 +1100

    Pass HTTP_REFERER down to subrequests

    Currently a HTTP_REFERER (Referer) header isn't passed down to
    subreq...

Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/swift 2.6.0

This issue was fixed in the openstack/swift 2.6.0 release.

