Do not sync suffixes when remote rejects reconstructor revert
SSYNC is designed to limit concurrent incoming connections in order to
prevent IO contention. The reconstructor should expect remote
replication servers to fail ssync_sender when the remote is too busy.
When the remote rejects SSYNC, the reconstructor should avoid forcing
additional IO against the remote with a REPLICATE request, which causes
suffix rehashing.
Suffix rehashing via REPLICATE verbs takes two forms:
1) an initial pre-flight call to REPLICATE /dev/part will cause a remote
primary to rehash any invalid suffixes and return a map for the local
sender to compare so that a sync can be performed on any mis-matched
suffixes.
2) a final call to REPLICATE /dev/part/suf1-suf2-suf3[-sufX[...]] will
cause the remote primary to rehash the *given* suffixes even if they are
*not* invalid. This is a requirement for rsync replication because
after a suffix is synced via rsync the contents of a suffix dir will
likely have changed and the remote server needs to update its hashes.pkl
to reflect the new data.
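
The two request forms above can be sketched as follows. This is only an
illustration, not Swift's code: the helper names replicate_path and
mismatched_suffixes are hypothetical, and the suffix map is assumed to be
a plain dict of suffix -> hash.

```python
def replicate_path(device, partition, suffixes=None):
    """Build a REPLICATE request path (hypothetical helper).

    With no suffixes this is the pre-flight form, /dev/part.
    With suffixes it is the post-sync form, /dev/part/suf1-suf2-...
    """
    path = '/%s/%s' % (device, partition)
    if suffixes:
        path += '/' + '-'.join(suffixes)
    return path


def mismatched_suffixes(local_hashes, remote_hashes):
    """Return the suffixes whose local hash differs from the remote's.

    Both arguments are assumed to be dicts mapping suffix -> hash; a
    suffix missing on the remote counts as a mismatch.
    """
    return sorted(suffix for suffix, local_hash in local_hashes.items()
                  if remote_hashes.get(suffix) != local_hash)


local = {'abc': 'h1', 'def': 'h2', '123': 'h3'}
remote = {'abc': 'h1', 'def': 'stale'}
to_sync = mismatched_suffixes(local, remote)
preflight = replicate_path('sda1', 42)
post_sync = replicate_path('sda1', 42, to_sync)
```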
SSYNC does not *need* to send a post-sync REPLICATE request. Any
suffixes that are modified by the SSYNC protocol will call _finalize_put
under the hood as it is syncing. It is however not harmful and
potentially useful to go ahead and refresh hashes after an SSYNC while the
inodes of those suffixes are warm in the cache.
However, that only makes sense if the SSYNC conversation actually synced
any suffixes - if SSYNC is rejected for concurrency before it ever got
started there is no value in the remote performing a rehash. It may be
that *another* reconstructor is pushing data into that same partition
and the suffixes will become immediately invalidated.
If an ssync_sender does not successfully finish a sync the reconstructor
should skip the REPLICATE call entirely and move on to the next
partition without causing any useless remote IO.
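
A minimal sketch of the intended control flow, under stated assumptions:
the function revert_then_maybe_rehash, the job dict layout and the two
callables are hypothetical stand-ins rather than the reconstructor's
actual interfaces. The ssync callable is assumed to return a
(success, in_sync_objects) pair.

```python
def revert_then_maybe_rehash(job, node, ssync, replicate):
    """Sync a partition to a node; rehash remotely only on success.

    ssync and replicate are injected callables (assumptions for this
    sketch). Returns True if the sync finished, False otherwise.
    """
    success, _in_sync_objs = ssync(job, node)
    if not success:
        # Remote rejected or aborted the sync: skip REPLICATE so the
        # busy remote is not asked to do extra rehashing IO.
        return False
    replicate(node, job['partition'], job['suffixes'])
    return True


# Usage with fake callables that record what would be sent.
replicate_calls = []


def fake_replicate(node, partition, suffixes):
    replicate_calls.append((node, partition, tuple(suffixes)))


def busy_remote(job, node):
    return False, {}


def healthy_remote(job, node):
    return True, {}


job = {'partition': 42, 'suffixes': ['abc', 'def']}
rejected = revert_then_maybe_rehash(job, 'node1', busy_remote,
                                    fake_replicate)
calls_after_reject = list(replicate_calls)
accepted = revert_then_maybe_rehash(job, 'node2', healthy_remote,
                                    fake_replicate)
```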
Reviewed: https://review.openstack.org/464980
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=e127f2277c4436a97f5b2d74307a31af2c98297f
Submitter: Jenkins
Branch: stable/ocata
commit e127f2277c4436a97f5b2d74307a31af2c98297f
Author: Clay Gerrard <email address hidden>
Date: Thu Feb 16 14:14:09 2017 -0800
Do not sync suffixes when remote rejects reconstructor revert
Closes-Bug: #1665141
Change-Id: Ia72c407247e4525ef071a1728750850807ae8231