OpenStack Object Storage (swift)

setting reclaim_age longer than default doesn't work

Bug #1626296 reported by clayg on 2016-09-21

This bug affects 2 people

Affects		Status	Importance	Assigned to	Milestone
	OpenStack Object Storage (swift)	Fix Released	Wishlist	Unassigned

Bug Description

You can set it shorter and it *mostly* works - but if you try to set it *longer* than one week it has no lasting effect: every REPLICATE call to an object-server that triggers a suffix rehash will call cleanup_ondisk_files with that server's manager reclaim_age, which will be the default ONE_WEEK.
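To make the failure mode concrete, here is a minimal sketch of the age check involved. It is not Swift's code: `is_reclaimable` and the variable names are hypothetical, and the only values taken from the report are the one-week default and the fact that the object-server's manager keeps using it.

```python
ONE_WEEK = 604800  # default reclaim_age in seconds, per the report


def is_reclaimable(tombstone_ts, now, reclaim_age):
    """Hypothetical helper: a tombstone older than reclaim_age may be removed."""
    return now - tombstone_ts > reclaim_age


now = 3_000_000.0
tombstone_ts = now - 10 * 86400  # tombstone written ten days ago

# Assumed setup: reclaim_age raised to four weeks only in a non-server section.
replicator_reclaim_age = 4 * ONE_WEEK
# The object-server's manager was never told, so it still uses the default.
server_reclaim_age = ONE_WEEK

kept_by_replicator = not is_reclaimable(tombstone_ts, now, replicator_reclaim_age)
removed_on_rehash = is_reclaimable(tombstone_ts, now, server_reclaim_age)
```

Both flags come out true in this scenario: the service with the longer setting would keep the tombstone, while the rehash on the object-server would already treat it as reclaimable.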

The workaround is of course to set reclaim_age in the [DEFAULT] section, since the object-server already passes its conf through to the DiskFileManager.

This should be the documented deployment configuration.
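As an illustration of that layout, the sketch below parses a sample config with Python's standard `configparser` and shows every object section picking up the single [DEFAULT] value. The section names and the two-week value are assumptions for the example, and Swift loads its configs through its own helpers, so treat this only as a model of [DEFAULT] inheritance.

```python
import configparser

# Illustrative config: reclaim_age appears once, in [DEFAULT] only.
SAMPLE_CONF = """
[DEFAULT]
# 1209600 seconds = two weeks (example value, not a recommendation)
reclaim_age = 1209600

[app:object-server]
use = egg:swift#object

[object-replicator]

[object-reconstructor]
"""

ONE_WEEK = 604800  # fallback used when no value is configured


def effective_reclaim_ages(conf_text):
    """Return the reclaim_age each section would see, including inherited values."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(conf_text)
    return {
        section: parser.getint(section, "reclaim_age", fallback=ONE_WEEK)
        for section in parser.sections()
    }


ages = effective_reclaim_ages(SAMPLE_CONF)
```

Here `ages` maps all three sections to the same value, which is the property the object services need.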

Mahati Chamarthy (mahati-chamarthy) wrote on 2016-11-17:

Associated fix -> https://review.openstack.org/#/c/374419/

Changed in swift:
status:	New → In Progress

OpenStack Infra (hudson-openstack) wrote on 2017-01-16: Fix merged to swift (master)

Reviewed: https://review.openstack.org/374419
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=69f7be99a6e090a251412a6925e0f29946818c6a
Submitter: Jenkins
Branch: master

commit 69f7be99a6e090a251412a6925e0f29946818c6a
Author: Mahati Chamarthy <email address hidden>
Date: Mon Jul 25 20:10:44 2016 +0530

    Move documented reclaim_age option to correct location

    The reclaim_age is a DiskFile option, it doesn't make sense for two
    different object services or nodes to use different values.

    I also driveby cleanup the reclaim_age plumbing from get_hashes to
    cleanup_ondisk_files since it's a method on the Manager and has access
    to the configured reclaim_age. This fixes a bug where finalize_put
    wouldn't use the [DEFAULT]/object-server configured reclaim_age - which
    is normally benign but leads to weird behavior on DELETE requests with
    really small reclaim_age.

    There's a couple of places in the replicator and reconstructor that
    reach into their manager to borrow the reclaim_age when emptying out
    the aborted PUTs that failed to cleanup their files in tmp - but that
    timeout doesn't really need to be coupled with reclaim_age and that
    method could have just as reasonably been implemented on the Manager.

    UpgradeImpact: Previously the reclaim_age was documented to be
    configurable in various object-* services config sections, but that did
    not work correctly unless you also configured the option for the
    object-server because of REPLICATE request rehash cleanup. All object
    services must use the same reclaim_age. If you require a non-default
    reclaim age it should be set in the [DEFAULT] section. If there are
    different non-default values, the greater should be used for all object
    services and configured only in the [DEFAULT] section.

    If you specify a reclaim_age value in any object related config you
    should move it to *only* the [DEFAULT] section before you upgrade. If
    you configure a reclaim_age less than your consistency window you are
    likely to be eaten by a Grue.

    Closes-Bug: #1626296

    Change-Id: I2b9189941ac29f6e3be69f76ff1c416315270916
    Co-Authored-By: Clay Gerrard <email address hidden>
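The UpgradeImpact note suggests a simple pre-upgrade check: find every section that sets reclaim_age itself and take the greatest value as the one to keep in [DEFAULT]. The following is a rough sketch of such a check, not a tool shipped with Swift; the function names and the sample config are hypothetical.

```python
import configparser

# Hypothetical legacy config with per-service reclaim_age settings.
LEGACY_CONF = """
[DEFAULT]

[app:object-server]
use = egg:swift#object

[object-replicator]
reclaim_age = 1209600

[object-reconstructor]
reclaim_age = 2419200
"""


def reclaim_age_overrides(conf_text):
    """Map each section that sets reclaim_age itself to its value.

    default_section is pointed at an unused name so that [DEFAULT] is
    parsed as an ordinary section and nothing is inherited.
    """
    parser = configparser.ConfigParser(
        default_section="__unused__", interpolation=None)
    parser.read_string(conf_text)
    return {
        section: parser.getint(section, "reclaim_age")
        for section in parser.sections()
        if parser.has_option(section, "reclaim_age")
    }


def recommended_default(conf_text):
    """Greatest configured reclaim_age, or None if it is never set."""
    found = reclaim_age_overrides(conf_text)
    return max(found.values()) if found else None


overrides = reclaim_age_overrides(LEGACY_CONF)
target = recommended_default(LEGACY_CONF)
```

Any section other than DEFAULT that shows up in `overrides` is one to clean up before upgrading, and `target` is the value to place in [DEFAULT].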

Changed in swift:
status:	In Progress → Fix Released

OpenStack Infra (hudson-openstack) wrote on 2017-02-16: Fix included in openstack/swift 2.13.0

This issue was fixed in the openstack/swift 2.13.0 release.
