Database replicator will rsync too often

Bug #1019712 reported by gholt
10
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Object Storage (swift)
Fix Released
Low
Matthew Oliver

Bug Description

The database replicator will resort to "rsync+merge" mode too often. The code I noticed is in swift/common/db_replicator.py:

            # if the difference in rowids between the two differs by
            # more than 50%, rsync then do a remote merge.
            if rinfo['max_row'] / float(info['max_row']) < 0.5:

So, in the case where one database has 1 row, and the other 3 rows, an rsync+merge is done when it could've just sent the rows over.

Might not be a big deal as both methods should be pretty quick, but I thought dropping to rsync was less desirable. I'm not sure, so I figured I'd post a bug and let someone else take a look.

My thought is that it should only resort to rsync+merge when the difference is great (say, greater than per_diff * max_diff) and more than 50% different.

Chuck Thier (cthier)
Changed in swift:
status: New → Triaged
Simon Chang (changsimon)
Changed in swift:
assignee: nobody → Simon Chang (changsimon)
Revision history for this message
Simon Chang (changsimon) wrote :

I think part of the fix would be changing 0.5 to a smaller value, but not sure if we should just arbitrary choose some smaller value (say ... 0.2). The value should be based on production statistics.

Simon Chang (changsimon)
Changed in swift:
assignee: Simon Chang (changsimon) → nobody
Revision history for this message
drax (devesh-gupta) wrote :

Instead of checking on the percentage we can perform check on the difference.

Like if the difference between the rinfo['max_row'] and info['max_row'] is some threshold value we can perfrom rsync+merge.

Changed in swift:
assignee: nobody → drax (devesh-gupta)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/108231

Changed in swift:
status: Triaged → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on swift (master)

Change abandoned by John Dickinson (<email address hidden>) on branch: master
Review: https://review.openstack.org/108231
Reason: Abandoning due to lack of activity since the last negative review. You can restore the change if you want to keep working on it.

Revision history for this message
John Dickinson (notmyname) wrote :

needs reverified

Changed in swift:
assignee: drax (devesh-gupta) → nobody
status: In Progress → Incomplete
Changed in swift:
assignee: nobody → Matthew Oliver (matt-0)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (master)

Fix proposed to branch: master
Review: https://review.openstack.org/232931

Changed in swift:
status: Incomplete → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (master)

Reviewed: https://review.openstack.org/232931
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=4a13dcc4a8d7c890ff322190ca80fbb460b55de4
Submitter: Jenkins
Branch: master

commit 4a13dcc4a8d7c890ff322190ca80fbb460b55de4
Author: Matthew Oliver <email address hidden>
Date: Fri Oct 9 17:25:08 2015 +1100

    Make db_replicator usync smaller containers

    The current rule inside the db_replicator is to rsync+merge
    containers during replication if the difference between rowids
    differ by more than 50%:

      # if the difference in rowids between the two differs by
      # more than 50%, rsync then do a remote merge.
      if rinfo['max_row'] / float(info['max_row']) < 0.5:

    This mean on smaller containers, that only have few rows, and differ
    by a small number still rsync+merge rather then copying rows.

    This change adds a new condition, the difference in the rowids must
    be greater than the defined per_diff otherwise usync will be used:

      # if the difference in rowids between the two differs by
      # more than 50% and the difference is greater than per_diff,
      # rsync then do a remote merge.
      # NOTE: difference > per_diff stops us from dropping to rsync
      # on smaller containers, who have only a few rows to sync.
      if rinfo['max_row'] / float(info['max_row']) < 0.5 and \
              info['max_row'] - rinfo['max_row'] > self.per_diff:

    Change-Id: I9e779f71bf37714919a525404565dd075762b0d4
    Closes-bug: #1019712

Changed in swift:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/crypto)

Fix proposed to branch: feature/crypto
Review: https://review.openstack.org/244249

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/crypto)
Download full text (24.4 KiB)

Reviewed: https://review.openstack.org/244249
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=19226e4449bca399981cde3a1e9882735fed74d5
Submitter: Jenkins
Branch: feature/crypto

commit 2d85a3f6997fbadac210207f841c4ec25ff50cc4
Author: Tushar Gohad <email address hidden>
Date: Fri Oct 23 13:16:33 2015 +0000

    EC: Use best available ec_type in unittests

    To minimize external library dependencies for Swift unit
    tests and SAIO, PyECLib 1.1.1 introduces a native backend
    'liberasurecode_rs_vand.' This patch is to migrate over
    the unit tests to the new ec_type when available.

    This change will work with current pyeclib requirements
    (==1.0.7) and also future requirements (>=1.0.7).

    When we're able to raise *our* requirements to >=1.1.1 we
    should remove jerasure from the list of preferred backends.
    Related SAIO doc and example config changes should be
    included with that patch.

    Co-Authored-By: Clay Gerrard <email address hidden>
    Change-Id: Idf657f0acf0479bc8158972e568a29dbc08eaf3b

commit 66dc1eebb1db51c81891b1f7bcc3e85aef9b8c1d
Author: Bill Huber <email address hidden>
Date: Fri Oct 16 11:27:34 2015 -0500

    ObjectControllers return application errors as 499 on bad read

    In the _transfer_data method, we translate all (Exception, Timeout)
    into a 499 whereas we should consider translating them to 500 on
    particular returning error scenarios.

    This affects both ReplicatedObjectController and ECObjectControllear.

    Change-Id: I571bbc5b1451243907b094a5718c8735fd824268
    Closes-Bug: 1504299

commit 705642db4c27c34ac8eb840d40b16470b85418c8
Author: venkatamahesh <email address hidden>
Date: Tue Nov 10 19:46:31 2015 +0530

    Change stackforge repo to openstack repo

    The projects which are moved to openstack
    are corrected by replacing stackforge with openstack

    Change-Id: I65b794a7f10df617bc2a4caf2c4010477a82dbc2

commit d755f5b52024e255787d180bbca60c09192a8850
Author: John Dickinson <email address hidden>
Date: Mon Nov 9 22:03:34 2015 -0800

    suppress warning output in a unit test

    test_write_builder_after_device_removal() wasn't setting a
    default min_part_hours so a warnign was printed. Explicitly
    adding a min_part_hours suppresses the warning

    Change-Id: I6f234b72c34e066abb91f28e6eacf50e29be8842

commit 37337d5fcb0a29344381f8de7d810a9e76e10928
Author: Pradeep Kumar Singh <email address hidden>
Date: Sun May 17 13:35:58 2015 +0530

    Read the response body, if response status is greater than 300.

    internal_client was not reading response if response status is not 200.
    So proxy server treats this as client disconnect and logs 499 in log file.
    This patch fixes that by reading response if response status is greater
    than or equal to 300 and in acceptable statuses.

    Closes-Bug: #1364752

    Change-Id: I0512a25895da583956f76031e3c5de5c970bce01

commit c81f202f714c36390a25b24fa96d1b277788ade9
Author: Kota Tsuyuzaki <email address hidden>
Date: Thu Nov 5 17:37:30 2015 -0800

    Add missing docs for ring.builder.rebalance

 ...

tags: added: in-feature-crypto
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to swift (feature/hummingbird)

Fix proposed to branch: feature/hummingbird
Review: https://review.openstack.org/264517

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to swift (feature/hummingbird)
Download full text (54.1 KiB)

Reviewed: https://review.openstack.org/264517
Committed: https://git.openstack.org/cgit/openstack/swift/commit/?id=93ddaffaebb620bc712d52d0267194562dce02be
Submitter: Jenkins
Branch: feature/hummingbird

commit f53cf1043d078451c4b9957027bf3af378aa0166
Author: Ondřej Nový <email address hidden>
Date: Tue Jan 5 20:20:15 2016 +0100

    Fixed few misspellings in comments

    Change-Id: I8479c85cb8821c48b5da197cac37c80e5c1c7f05

commit 79222e327f9df6335b58e17a6c8dd0dc44b86c17
Author: ChangBo Guo(gcb) <email address hidden>
Date: Sat Dec 26 13:13:37 2015 +0800

    Fix AttributeError for LogAdapter

    LogAdapter object has no attribute 'warn' but has attribute
    'warning'.

    Closes-Bug: #1529321
    Change-Id: I0e0bd0a3dbc4bb5c1f0b343a8809e53491a1da5f

commit 684c4c04592278a280032002b5313b171ee7a4c0
Author: janonymous <email address hidden>
Date: Sun Aug 2 22:47:42 2015 +0530

    Python 3 deprecated the logger.warn method in favor of warning

    DeprecationWarning: The 'warn' method is deprecated, use 'warning'
    instead

    Change-Id: I35df44374c4521b1f06be7a96c0b873e8c3674d8

commit d0a026fcb8e8a9f5475699cc56e1998bdc4cd5ca
Author: Hisashi Osanai <email address hidden>
Date: Wed Dec 16 18:50:37 2015 +0900

    Fix duplication for headers in Access-Control-Expose-Headers

    There are following problems with Access-Control-Expose-Headers.

    * If headers in X-Container-Meta-Access-Control-Expose-Headers are
      configured, the headers are kept with case-sensitive string.
      Then a CORS request comes, the headers are merged into
      Access-Control-Expose-Headers as case-sensitive string even if
      there is a same header which is not case-sensitive string.

    * Access-Control-Expose-Headers is handled by a list.
      If X-Container/Object-Meta-XXX is configured in container/object
      and X-Container-Meta-Access-Control-Expose-Headers, same header
      is listed in Access-Control-Expose-Headers.

    This patch provides a fix for the problems.

    Change-Id: Ifc1c14eb3833ec6a851631cfc23008648463bd81

commit 0bcd7fd50ec0763dcb366dbf43a9696ca3806f15
Author: Bill Huber <email address hidden>
Date: Fri Nov 20 12:09:26 2015 -0600

    Update Erasure Coding Overview doc to remove Beta version

    The major functionality of EC has been released for Liberty and
    the beta version of the code has been removed since it is now
    in production.

    Change-Id: If60712045fb1af803093d6753fcd60434e637772

commit 84ba24a75640be4212e0f984c284faf4c894e7c6
Author: Alistair Coles <email address hidden>
Date: Fri Dec 18 11:24:34 2015 +0000

    Fix rst errors so that html docs are complete

    rst table format errors don't break the gate job
    but do cause sections of the documents to go missing
    from the html output.

    Change-Id: Ic8c9953c93d03dcdafd8f47b271d276c7b356dc3

commit 87f7e907ee412f5847f1f9ffca7a566fb148c6b1
Author: Matthew Oliver <email address hidden>
Date: Wed Dec 16 17:19:24 2015 +1100

    Pass HTTP_REFERER down to subrequests

    Currently a HTTP_REFERER (Referer) header isn't passed down to
    subreq...

tags: added: in-feature-hummingbird
Changed in swift:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.