setting reclaim_age longer than default doesn't work

Bug #1626296 reported by clayg
This bug affects 2 people
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Wishlist
Assigned to: Unassigned
Milestone: (none)

Bug Description

You can set reclaim_age shorter than the default and it *mostly* works, but if you try to set it *longer* than one week you'll find that every REPLICATE call to an object-server that triggers a suffix rehash will call cleanup_ondisk_files with that object-server manager's reclaim_age, which will be the default ONE_WEEK.
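
A toy model of the mismatch described above, not Swift's actual code: `ToyManager` and `is_reclaimable` are hypothetical stand-ins, and the only assumption carried over from the report is that a manager built without a configured reclaim_age falls back to one week.

```python
ONE_WEEK = 7 * 24 * 60 * 60  # default reclaim_age, in seconds


class ToyManager:
    """Hypothetical stand-in for a manager that reads reclaim_age from its conf."""

    def __init__(self, conf):
        # Fall back to the default when this service's conf has no reclaim_age.
        self.reclaim_age = int(conf.get("reclaim_age", ONE_WEEK))

    def is_reclaimable(self, tombstone_age):
        # A tombstone older than reclaim_age is eligible for cleanup.
        return tombstone_age > self.reclaim_age


# The replicator's section sets two weeks; the object-server's section sets nothing.
replicator_mgr = ToyManager({"reclaim_age": 2 * ONE_WEEK})
server_mgr = ToyManager({})

tombstone_age = 10 * 24 * 60 * 60  # a ten-day-old tombstone

replicator_keeps = not replicator_mgr.is_reclaimable(tombstone_age)
server_reclaims = server_mgr.is_reclaimable(tombstone_age)
```

In this sketch the replicator would keep the ten-day-old tombstone, but the rehash on the object-server side would still reclaim it, which is the behavior the report complains about.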

The workaround is of course to set reclaim_age in the [DEFAULT] section, since the object-server already passes its conf through to the DiskFileManager.

This should be the documented deployment configuration.
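
A minimal sketch of the workaround layout, assuming the usual object-server.conf section names. It uses Python's standard `configparser` only to show that a value in [DEFAULT] is visible from every section; Swift has its own config loading, and the two-week value is just an illustration.

```python
import configparser

# Illustrative config: reclaim_age appears only in [DEFAULT].
CONF_TEXT = """
[DEFAULT]
reclaim_age = 1209600

[app:object-server]
use = egg:swift#object

[object-replicator]

[object-reconstructor]
"""

parser = configparser.ConfigParser()
parser.read_string(CONF_TEXT)

# Every section inherits the [DEFAULT] value, so all services agree.
ages = {section: parser.getint(section, "reclaim_age")
        for section in parser.sections()}
```

With this layout there is a single place to change the value, and no per-service section can drift away from the others.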

Mahati Chamarthy (mahati-chamarthy) wrote :
Changed in swift:
status: New → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/374419
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=69f7be99a6e090a251412a6925e0f29946818c6a
Submitter: Jenkins
Branch: master

commit 69f7be99a6e090a251412a6925e0f29946818c6a
Author: Mahati Chamarthy <email address hidden>
Date: Mon Jul 25 20:10:44 2016 +0530

    Move documented reclaim_age option to correct location

    The reclaim_age is a DiskFile option, it doesn't make sense for two
    different object services or nodes to use different values.

    I also driveby cleanup the reclaim_age plumbing from get_hashes to
    cleanup_ondisk_files since it's a method on the Manager and has access
    to the configured reclaim_age. This fixes a bug where finalize_put
    wouldn't use the [DEFAULT]/object-server configured reclaim_age - which
    is normally benign but leads to weird behavior on DELETE requests with
    really small reclaim_age.

    There's a couple of places in the replicator and reconstructor that
    reach into their manager to borrow the reclaim_age when emptying out
    the aborted PUTs that failed to cleanup their files in tmp - but that
    timeout doesn't really need to be coupled with reclaim_age and that
    method could have just as reasonably been implemented on the Manager.

    UpgradeImpact: Previously the reclaim_age was documented to be
    configurable in various object-* services config sections, but that did
    not work correctly unless you also configured the option for the
    object-server because of REPLICATE request rehash cleanup. All object
    services must use the same reclaim_age. If you require a non-default
    reclaim age it should be set in the [DEFAULT] section. If there are
    different non-default values, the greater should be used for all object
    services and configured only in the [DEFAULT] section.

    If you specify a reclaim_age value in any object related config you
    should move it to *only* the [DEFAULT] section before you upgrade. If
    you configure a reclaim_age less than your consistency window you are
    likely to be eaten by a Grue.

    Closes-Bug: #1626296

    Change-Id: I2b9189941ac29f6e3be69f76ff1c416315270916
    Co-Authored-By: Clay Gerrard <email address hidden>
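
For the upgrade note in the commit message above, here is a rough sketch of how one might find the value to move into [DEFAULT] when several sections carry different reclaim_age settings. The helper name `greatest_reclaim_age` and the sample config are hypothetical, and the parsing uses the standard library's `configparser` rather than anything from Swift.

```python
import configparser


def greatest_reclaim_age(conf_text):
    """Return the largest reclaim_age set in any section, or None if unset."""
    # Use a dummy default section so [DEFAULT] is parsed like any other
    # section and values are not inherited between sections.
    parser = configparser.ConfigParser(default_section="__unused__",
                                       interpolation=None)
    parser.read_string(conf_text)
    values = [parser.getint(section, "reclaim_age")
              for section in parser.sections()
              if parser.has_option(section, "reclaim_age")]
    return max(values) if values else None


# Hypothetical pre-upgrade config with two different non-default values.
OLD_CONF = """
[DEFAULT]

[app:object-server]
use = egg:swift#object

[object-replicator]
reclaim_age = 1209600

[object-reconstructor]
reclaim_age = 2592000
"""

chosen = greatest_reclaim_age(OLD_CONF)
```

The chosen value would then be written once under [DEFAULT] and the per-section settings removed, as the UpgradeImpact note recommends.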

Changed in swift:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/swift 2.13.0

This issue was fixed in the openstack/swift 2.13.0 release.
