OpenStack Object Storage (swift)

get_container_info should have some probability of bypassing memcache and going to disk

Bug #1883324 reported by Tim Burke on 2020-06-12

Affects                           Status        Importance  Assigned to  Milestone
OpenStack Object Storage (swift)  Fix Released  Undecided   Unassigned

Bug Description

If you've got thousands of requests per second for objects in a single container, you basically NEVER want that container's info to fall out of memcache. If it *does*, all those clients are going to overload the container server -- most will fail, and with bug #1883211 the ones that succeed may not get their info into memcache long enough to help much. You can try increasing recheck_container_existence, but eventually the entry is still going to fall out.

The solution (in my mind, anyway) is to have some small probability -- 0.01%, say -- of get_container_info skipping memcache altogether and going out to the container-server anyway; then we'll refresh the TTL in memcache and we're good for another minute.
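
A minimal sketch of the idea above, not Swift's actual implementation. `FakeCache`, `get_info`, `fetch_from_backend`, and the `rng` and `clock` parameters are all hypothetical names made up for illustration; the 60-second TTL and 0.01% skip chance come from the description.

```python
import random
import time


class FakeCache:
    """In-memory stand-in for memcache with per-key expiry (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            # Entry has aged out, exactly the case the bug wants to avoid.
            del self._data[key]
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


def get_info(cache, key, fetch_from_backend, ttl=60.0,
             skip_chance=0.0001, rng=random.random):
    """Return cached info, occasionally bypassing the cache to refresh it."""
    skip = skip_chance > 0 and rng() < skip_chance
    if not skip:
        info = cache.get(key)
        if info is not None:
            return info
    # Cache miss or deliberate skip: ask the backend and reset the TTL.
    info = fetch_from_backend(key)
    cache.set(key, info, ttl)
    return info
```

Because the rare skipped request rewrites the entry with a fresh TTL, a busy container should keep its entry alive indefinitely, while only a tiny fraction of requests ever reach the backend.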

clayg (clay-gerrard) wrote on 2020-06-18:

Does any HEAD to the container push out the TTL - or does it specifically need to get a get_container_info call that misses?

Tim Burke (1-tim-z) wrote on 2020-06-18:

Any GET or HEAD should do it -- the probe test in https://review.opendev.org/#/c/735359/ depends on that behavior.

Christian Schwede (cschwede) wrote on 2021-10-20:

Related: https://review.opendev.org/c/openstack/swift/+/736802

OpenStack Infra (hudson-openstack) wrote on 2021-12-15: Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/swift/+/821921

OpenStack Infra (hudson-openstack) wrote on 2022-01-27: Related fix merged to swift (master)

Reviewed: https://review.opendev.org/c/openstack/swift/+/736802
Committed: https://opendev.org/openstack/swift/commit/8c6ccb5fd41864155a043856ff9240e84999e4bf
Submitter: "Zuul (22348)"
Branch: master

commit 8c6ccb5fd41864155a043856ff9240e84999e4bf
Author: Tim Burke <email address hidden>
Date: Thu Jun 18 11:48:14 2020 -0700

proxy: Add a chance to skip memcache when looking for shard ranges

    By having some small portion of calls skip cache and go straight to
    disk, we can ensure the cache is always kept fresh and never expires (at
    least, for active containers). Previously, when shard ranges fell out of
    cache there would frequently be a thundering herd that could overwhelm
    the container server, leading to 503s served to clients or an increase
    in async pendings.

Include metrics for hit/miss/skip rates.

Change-Id: I6d74719fb41665f787375a08184c1969c86ce2cf
Related-Bug: #1883324
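
For illustration only, the hit/miss/skip accounting mentioned in the commit message could look roughly like the sketch below. This is an assumption about the general shape, not the code that was merged: `cached_lookup`, `fetch`, `stats`, and `rng` are hypothetical names, and a plain dict stands in for memcache.

```python
import random
from collections import Counter


def cached_lookup(cache, key, fetch, stats, skip_chance=0.001,
                  rng=random.random):
    """Look up key, recording whether the cache was hit, missed or skipped."""
    if skip_chance > 0 and rng() < skip_chance:
        # Deliberate bypass: counted separately from an ordinary miss.
        stats["skip"] += 1
        value = fetch(key)
    elif key in cache:
        stats["hit"] += 1
        return cache[key]
    else:
        stats["miss"] += 1
        value = fetch(key)
    # Both skips and misses repopulate the cache.
    cache[key] = value
    return value
```

Counting skips apart from misses makes it possible to tell deliberate refreshes from genuine expiries when reading the metrics.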

OpenStack Infra (hudson-openstack) wrote on 2022-07-25: Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/swift/+/850954

OpenStack Infra (hudson-openstack) wrote on 2022-07-25: Change abandoned on swift (master)

Change abandoned by "Tim Burke <email address hidden>" on branch: master
Review: https://review.opendev.org/c/openstack/swift/+/821921
Reason: Re-proposed as https://review.opendev.org/c/openstack/swift/+/850954 since CI went crazy here, and now Gerrit won't let me even leave a comment.

OpenStack Infra (hudson-openstack) wrote on 2022-09-26: Fix merged to swift (master)

Reviewed: https://review.opendev.org/c/openstack/swift/+/850954
Committed: https://opendev.org/openstack/swift/commit/5c6407bf591121fa10f8a8b10d22b3a64b9c4fe9
Submitter: "Zuul (22348)"
Branch: master

commit 5c6407bf591121fa10f8a8b10d22b3a64b9c4fe9
Author: Tim Burke <email address hidden>
Date: Thu Jan 6 12:09:58 2022 -0800

proxy: Add a chance to skip memcache for get_*_info calls

    Avoid this by allowing some small fraction of requests to bypass and
    refresh the cache, pushing out the TTL as long as there continue to be
    requests to the container. The likelihood of skipping the cache is
    configurable, similar to what we did for shard range sets.

Change-Id: If9249a42b30e2a2e7c4b0b91f947f24bf891b86f
Closes-Bug: #1883324
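
As a rough aid for choosing the skip likelihood, the following back-of-the-envelope sketch estimates how often a refresh happens and how likely an entry is to expire anyway. It assumes independent requests at a steady rate and the one-minute TTL mentioned in the bug description; the function names are hypothetical and not part of Swift.

```python
import math


def chance_of_expiry(requests_per_second, skip_pct, ttl_seconds=60.0):
    """Probability that no request skips the cache during one TTL window."""
    p = skip_pct / 100.0
    if p >= 1.0:
        return 0.0
    n = requests_per_second * ttl_seconds
    # (1 - p) ** n, computed via log1p for accuracy with tiny p.
    return math.exp(n * math.log1p(-p))


def mean_seconds_between_refreshes(requests_per_second, skip_pct):
    """Expected gap between cache-skipping requests."""
    rate = requests_per_second * (skip_pct / 100.0)
    return math.inf if rate == 0 else 1.0 / rate
```

Under these assumptions, 5000 requests per second with a 0.01% skip chance gives a refresh about every two seconds and a negligible chance of expiry, whereas a container seeing only 10 requests per second would still expire most of the time, which is consistent with the "at least, for active containers" caveat in the related commit.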

Changed in swift:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2023-01-31: Fix included in openstack/swift 2.31.0

This issue was fixed in the openstack/swift 2.31.0 release.
