tripleo-transfer rsync command does only partial syncs

Bug #1923898 reported by Michele Baldessari
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
tripleo   Fix Released   Critical     Unassigned

Bug Description

The new transfer mechanism introduced in
https://review.opendev.org/c/openstack/tripleo-ansible/+/778665
added the following command:
   shell: >-
        /usr/bin/rsync
        -v
        --delay-updates
        -F
        --compress
        --archive
        --delete
        --rsync-path='sudo rsync'
        --rsh='ssh -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -i {{ tripleo_transfer_key_location }}'
        {{ tripleo_transfer_src_dir_safe }}
        {{ tripleo_transfer_dest_user }}@{{ hostvars_dest_host_ip }}:{{ tripleo_transfer_dest_dir_safe }}

This new mechanism calls rsync without passing --checksum, so any file that has the same size and the same modification *timestamp* on both sides is not transferred. Note that *timestamp* here means 1-second resolution. The following rsync option explains it:
"""
--modify-window=NUM, -@ When comparing two timestamps, rsync treats the timestamps as being equal if they differ by no more than the modify-window value. The default is 0, which matches just integer seconds.
"""

This has proven problematic at the very least with the mariadb system transfer, where only a partial list of files was transferred, causing all kinds of data corruption and segfaults in the database being upgraded via Leapp.
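
To illustrate the failure mode, here is a minimal Python sketch of the skip decision described above. It is not rsync's code: the function name quick_check_would_skip, the file names, and the timestamp value are hypothetical, and the comparison is modelled only on the --modify-window text quoted in this report (size plus whole-second mtime).

```python
import os
import tempfile


def quick_check_would_skip(src, dst, modify_window=0):
    """Approximate the default skip decision: same size and same mtime."""
    s, d = os.stat(src), os.stat(dst)
    same_size = s.st_size == d.st_size
    # Compare whole seconds only, as described for --modify-window=0.
    s_sec = s.st_mtime_ns // 10**9
    d_sec = d.st_mtime_ns // 10**9
    same_time = abs(s_sec - d_sec) <= modify_window
    return same_size and same_time


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.ibd")  # hypothetical file names
        dst = os.path.join(tmp, "dst.ibd")
        with open(src, "wb") as f:
            f.write(b"new page data!!!")
        with open(dst, "wb") as f:
            f.write(b"old page data!!!")
        # Same size, mtimes within the same second but 0.8s apart.
        base_ns = 1_618_430_000 * 10**9  # arbitrary illustrative timestamp
        os.utime(src, ns=(base_ns + 900_000_000,) * 2)
        os.utime(dst, ns=(base_ns + 100_000_000,) * 2)
        with open(src, "rb") as a, open(dst, "rb") as b:
            content_differs = a.read() != b.read()
        return content_differs, quick_check_would_skip(src, dst)


if __name__ == "__main__":
    differs, skipped = demo()
    print("content differs:", differs, "| would be skipped:", skipped)
```

A file rewritten in place within the same second, without changing its length, is exactly the case a database data file can hit, which is consistent with the partial transfers reported here.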

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (master)
Changed in tripleo:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/786320

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/786417

OpenStack Infra (hudson-openstack) wrote : Fix proposed to tripleo-ansible (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/tripleo-ansible/+/786418

OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (master)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/786318
Committed: https://opendev.org/openstack/tripleo-ansible/commit/c801abb14794c900fd863bbe1549787cdb18b564
Submitter: "Zuul (22348)"
Branch: master

commit c801abb14794c900fd863bbe1549787cdb18b564
Author: Michele Baldessari <email address hidden>
Date: Wed Apr 14 21:49:20 2021 +0200

    Use --ignore-times when transferring files via rsync

    Currently rsync is being called without passing --ignore-times nor
    --checksum. So any file that has the same size/permission and
    *timestamp* will not be transferred over. Now *timestamp* in this case
    means 1 second resolution. See also the following rsync option which
    explains it:

      --modify-window=NUM, -@ When comparing two timestamps, rsync treats the
        timestamps as being equal if they differ by no more than the
        modify-window value. The default is 0, which matches just integer
        seconds.

    This has shown to be problematic at the very least with the mariadb
    system transfer where only a partial list of files would be transferred,
    causing all kinds of data corruption and segfaults in the database being
    leapped.

    We debated the use of --checksum vs --ignore-times and are settling on
    --ignore-times to avoid any risks of hash collision (and hence missed
    transfer of a different file, since the default hash is 128bits)
    and because that is also what the galera SST helper uses and has proven
    solid over time, so it seems the more cautious decision.

    Closes-Bug: #1923898

    Change-Id: Ibd53fad900cfa002bf2ad9b2ae6f62babd4140e5
    Co-Authored-By: Damien Ciabrini <email address hidden>
    Co-Authored-By: John Eckersberg <email address hidden>
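
The trade-off debated in the commit message can be sketched as a small decision model. This is a hedged illustration, not rsync's implementation: FileState, digest128 and needs_transfer are hypothetical names, and BLAKE2b truncated to 16 bytes only stands in for "a 128-bit hash".

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class FileState:
    data: bytes
    mtime_seconds: int


def digest128(data):
    # Stand-in for a 128-bit whole-file checksum (assumption, not rsync's hash).
    return hashlib.blake2b(data, digest_size=16).digest()


def needs_transfer(src, dst, mode):
    """Return True when the modelled mode would send the file again."""
    if mode == "ignore-times":
        # Never skip on matching size and time: every file is processed.
        return True
    if len(src.data) != len(dst.data):
        return True
    if mode == "checksum":
        # Skips only when the digests match; a collision would hide a change.
        return digest128(src.data) != digest128(dst.data)
    if mode == "quick-check":
        # Default behaviour described in the bug: whole-second mtime only.
        return src.mtime_seconds != dst.mtime_seconds
    raise ValueError(f"unknown mode: {mode}")


def compare_modes():
    src = FileState(b"new page data!!!", 1_618_430_000)
    dst = FileState(b"old page data!!!", 1_618_430_000)
    modes = ("quick-check", "checksum", "ignore-times")
    return {mode: needs_transfer(src, dst, mode) for mode in modes}


if __name__ == "__main__":
    for mode, result in compare_modes().items():
        print(f"{mode}: transfer={result}")
```

In this model only the quick check misses the changed file. The checksum mode catches it unless two different contents share a digest, while ignore-times does not depend on any comparison at all, which matches the "more cautious decision" reasoning above.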

Changed in tripleo:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/786417
Committed: https://opendev.org/openstack/tripleo-ansible/commit/95e32e1ede8abda6de1fdef9d8e325a6fd74333d
Submitter: "Zuul (22348)"
Branch: stable/victoria

tags: added: in-stable-victoria
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/ussuri)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/786418
Committed: https://opendev.org/openstack/tripleo-ansible/commit/7ada6e1f19d6ad3b68c3451cbf42dc3243a0feb2
Submitter: "Zuul (22348)"
Branch: stable/ussuri

tags: added: in-stable-ussuri
OpenStack Infra (hudson-openstack) wrote : Fix merged to tripleo-ansible (stable/train)

Reviewed: https://review.opendev.org/c/openstack/tripleo-ansible/+/786320
Committed: https://opendev.org/openstack/tripleo-ansible/commit/5931d7c7b113363655157698247bbc3b7cdc3f19
Submitter: "Zuul (22348)"
Branch: stable/train

    (cherry picked from commit 7ada6e1f19d6ad3b68c3451cbf42dc3243a0feb2)

tags: added: in-stable-train
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 3.1.0

This issue was fixed in the openstack/tripleo-ansible 3.1.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 2.3.0

This issue was fixed in the openstack/tripleo-ansible 2.3.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 0.8.0

This issue was fixed in the openstack/tripleo-ansible 0.8.0 release.

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/tripleo-ansible 1.5.4

This issue was fixed in the openstack/tripleo-ansible 1.5.4 release.
