container-sharder should keep cleaving when there are no rows

Bug #1839355 reported by Tim Burke
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Undecided
Assigned to: Matthew Oliver
Milestone: (none listed)

Bug Description

Suppose you've got a sharding or sharded container. It's pretty overloaded (that's why we enabled sharding!), so if a client issues a container PUT before uploading objects (as python-swiftclient does), it's entirely reasonable that we may create some new, empty DBs on handoffs. (Side note: this was also a decent mitigation for https://bugs.launchpad.net/swift/+bug/1833612 and https://bugs.launchpad.net/swift/+bug/1833616.)

That's mostly OK -- the replicator will get all of the sharding state loaded into the new guy; from there:

 - handoff stops confusing the proxy into thinking the container's unsharded
 - handoff can even respond to proxy requests for shard ranges!
 - sharder will work through cleaving the shard ranges on the handoff,
   eventually moving the DB from sharding to sharded
 - once sharded, the replicator finally cleans up the handoff.

But... for very large root containers (think hundreds or even a thousand shard ranges!), this process can take a *long* time since we only handle cleave_batch_size shards per cycle. We don't really want to crank that up -- we still want to make steady progress across all sharding containers instead of chewing on one for a long time before ever looking at the next one.

At the same time, we should recognize that not all DBs represent the same amount of work to shard: we probably *could* work through the entirety of that empty DB in the same amount of time that it takes us to cleave a million rows from a full one.

To put some numbers on the problem: I've seen an empty DB with 682 ranges done, 822 to go. Surely in part because of needing to send stats for 1500 shards (see also: https://bugs.launchpad.net/swift/+bug/1834097), the sharder cycle time was somewhere around 1300s, or 22 mins. At 2 shard ranges being cleaved per cycle, the full set of roughly 1500 ranges needs about 750 cycles: that's *11 days* to get rid of the (empty!) DB.
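For anyone wanting to check the estimate, here is a minimal sketch of the arithmetic. It assumes a constant 1300s cycle and exactly `cleave_batch_size` ranges cleaved per cycle; the helper name `days_to_cleave` is made up for illustration and is not part of swift.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def days_to_cleave(shard_ranges, cleave_batch_size, cycle_seconds):
    """Estimate the days needed when each sharder cycle cleaves at most
    cleave_batch_size shard ranges (hypothetical helper, not swift code)."""
    # Ceiling division: a partial batch still costs a whole cycle.
    cycles = -(-shard_ranges // cleave_batch_size)
    return cycles * cycle_seconds / SECONDS_PER_DAY


# Figures taken from the report: 682 ranges done, 822 to go, ~1300s per cycle.
total_ranges = 682 + 822
whole_db = days_to_cleave(total_ranges, 2, 1300)
remaining = days_to_cleave(822, 2, 1300)

print(f"{whole_db:.1f} days to cleave all {total_ranges} ranges")
print(f"{remaining:.1f} days for the 822 ranges still to go")
```

Under those assumptions the 11-day figure is the time for the whole DB from start to finish; the ranges still outstanding account for a bit over half of it.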

Matthew Oliver (matt-0) wrote :

I have almost finished the first version of a patch to address this.

Changed in swift:
assignee: nobody → Matthew Oliver (matt-0)
status: New → Confirmed

OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.opendev.org/675820

Changed in swift:
status: Confirmed → In Progress

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.opendev.org/675820
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=e9cd9f74a5264f396783ca2a4548a3da7cee7bff
Submitter: Zuul
Branch: master

commit e9cd9f74a5264f396783ca2a4548a3da7cee7bff
Author: Matthew Oliver <email address hidden>
Date: Mon Aug 12 16:16:17 2019 +1000

    sharder: Keep cleaving on empty shard ranges

    When a container is being cleaved there is a possibility that we're
    dealing with an empty or near empty container created on a handoff node.
    These containers may have a valid list of shard ranges, so would need
    to cleave to the new shards.
    Currently, when using a `cleave_batch_size` that is smaller than the
    number of shard ranges on the cleaving container, these containers will
    have to take a few shard passes to shard, even though there may be
    nothing in them.

    This is worse if a really large container is sharding and, due to being
    slow, error-limited a node, causing a new container to be created on a
    handoff location. This empty container would have a large number of
    shard ranges and could take a _very_ long time to shard away, slowing
    the process down.

    This patch eliminates the issue by detecting when no objects are
    returned for a shard range. The `_cleave_shard_range` method now
    returns 3 possible results:

      - CLEAVE_SUCCESS
      - CLEAVE_FAILED
      - CLEAVE_EMPTY

    They are all pretty self-explanatory. When `CLEAVE_EMPTY` is returned
    the code will:

      - Log
      - Not replicate the empty temp shard container sitting in a
        handoff location
      - Not count the shard range in the `cleave_batch_size` count
      - Update the cleaving context so sharding can move forward

    If a shard range DB already exists on a handoff node, the sharder won't
    skip it, even if there are no objects: it'll replicate it and treat it
    as normal, including using a `cleave_batch_size` slot.

    Change-Id: Id338f6c3187f93454bcdf025a32a073284a4a159
    Closes-Bug: #1839355
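
The batching behaviour described in the commit message can be sketched roughly as follows. This is a simplified model and not the actual swift sharder code: `CleaveResult`, `cleave_one` and `cleave_cycle` are hypothetical names, shard ranges are reduced to plain row counts, and a row count of `None` stands in for a cleave failure.

```python
from enum import Enum


class CleaveResult(Enum):
    # Mirrors the three results named in the commit message.
    SUCCESS = "CLEAVE_SUCCESS"
    FAILED = "CLEAVE_FAILED"
    EMPTY = "CLEAVE_EMPTY"


def cleave_one(row_count, shard_db_exists=False):
    """Hypothetical stand-in for cleaving a single shard range."""
    if row_count is None:
        return CleaveResult.FAILED
    if row_count == 0 and not shard_db_exists:
        # Nothing to move and no existing shard DB to replicate.
        return CleaveResult.EMPTY
    return CleaveResult.SUCCESS


def cleave_cycle(row_counts, cursor, cleave_batch_size):
    """Run one sharder cycle; return (new_cursor, batch_slots_used)."""
    used = 0
    while cursor < len(row_counts) and used < cleave_batch_size:
        result = cleave_one(row_counts[cursor])
        if result is CleaveResult.FAILED:
            break  # leave the cursor in place and retry next cycle
        if result is CleaveResult.SUCCESS:
            used += 1  # only ranges that moved rows use a batch slot
        cursor += 1  # EMPTY ranges still advance the cleaving context
    return cursor, used


print(cleave_cycle([0] * 1504, 0, 2))         # empty handoff DB
print(cleave_cycle([1000] * 10, 0, 2))        # full DB
print(cleave_cycle([0, 5, 0, 0, 7, 3], 0, 2))  # mixed
```

In this model an entirely empty DB is worked through in a single cycle, while a DB with rows in every range still advances only `cleave_batch_size` ranges per cycle.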

Changed in swift:
status: In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/swift 2.23.0

This issue was fixed in the openstack/swift 2.23.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/losf)

Fix proposed to branch: feature/losf
Review: https://review.opendev.org/686864

OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/losf)

Reviewed: https://review.opendev.org/686864
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=bfa8e9feb51f2b10adfec3a741661a76fcf73216
Submitter: Zuul
Branch: feature/losf

commit cb76e00e90aea834c8f3dd8a6ca5131acd43663b
Author: OpenStack Proposal Bot <email address hidden>
Date: Fri Oct 4 07:05:07 2019 +0000

    Imported Translations from Zanata

    For more information about this automatic import see:
    https://docs.openstack.org/i18n/latest/reviewing-translation-import.html

    Change-Id: I40ce1d36f1c207a0d3e99a3a84a162b21b3c57cf

commit 527a57ffcdefc03a5080b07d63f0ded319e08dfe
Author: OpenStack Release Bot <email address hidden>
Date: Thu Oct 3 16:35:36 2019 +0000

    Update master for stable/train

    Add file to the reno documentation build to show release notes for
    stable/train.

    Use pbr instruction to increment the minor version number
    automatically so that master versions are higher than the versions on
    stable/train.

    Change-Id: Ia93e0b690f47c6231423a25dfd6a108a60378a21
    Sem-Ver: feature

commit 8a4becb12fbe3d4988ddee73536673d6f55682dd
Author: Tim Burke <email address hidden>
Date: Fri Sep 27 15:18:59 2019 -0700

    Authors/changelog for 2.23.0

    Also, make some CHANGELOG formatting more consistent.

    Change-Id: I380ee50e075a8676590e755f24a3fd7a7a331029

commit bf9346d88de2aeb06da3b2cde62ffa6200936367
Author: Tim Burke <email address hidden>
Date: Thu Aug 15 14:33:06 2019 -0700

    Fix some request-smuggling vectors on py3

    A Python 3 bug causes us to abort header parsing in some cases. We
    mostly worked around that in the related change, but that was *after*
    eventlet used the parsed headers to determine things like message
    framing. As a result, a client sending a malformed request (for example,
    sending both Content-Length *and* Transfer-Encoding: chunked headers)
    might have that request parsed properly and authorized by a proxy-server
    running Python 2, but the proxy-to-backend request could get misparsed
    if the backend is running Python 3. As a result, the single client
    request could be interpreted as multiple requests by an object server,
    only the first of which was properly authorized at the proxy.

    Now, after we find and parse additional headers that weren't parsed by
    Python, fix up eventlet's wsgi.input to reflect the message framing we
    expect given the complete set of headers. As an added precaution, if the
    client included Transfer-Encoding: chunked *and* a Content-Length,
    ensure that the Content-Length is not forwarded to the backend.

    Change-Id: I70c125df70b2a703de44662adc66f740cc79c7a9
    Related-Change: I0f03c211f35a9a49e047a5718a9907b515ca88d7
    Closes-Bug: 1840507

commit 0217b12b6d7d6f3727a54db65614ff1ef52d6286
Author: Matthew Oliver <email address hidden>
Date: Wed Sep 4 14:30:33 2019 +1000

    PDF Documentation Build tox target

    This patch adds a `pdf-docs` tox target that will build
    PDF versions of our docs. As per the Train community goal:

      https://governance.openstack.org/tc/goals/selected/train/pdf-doc-...

tags: added: in-feature-losf