Multiple part power increases lead to misplaced data

Bug #1910589 reported by Tim Burke
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

I ran through a part power increase once, and everything was great. I put some data in my vsaio, and at every step along the way I could use swift-account-audit to verify everything was still accessible: http://paste.openstack.org/show/801495/

My part power was still ridiculously low, so I went to increase it again. Starts out well enough:

========================================
vagrant@saio:~/swift$ swift-ring-builder /etc/swift/object.builder prepare_increase_partition_power
The next partition power is now 6.
The change will take effect after the next write_ring.
Ensure your proxy-servers, object-replicators and
reconstructors are using the changed rings and relink
(using swift-object-relinker) your existing data
before the partition power increase
vagrant@saio:~/swift$ swift-ring-builder /etc/swift/object.builder write_ring
vagrant@saio:~/swift$ swift-account-audit AUTH_test
Auditing account "AUTH_test"
Auditing container "c"

  Accounts checked: 1

Containers checked: 1

   Objects checked: 83
========================================

The relink was a little funky (but didn't actually error), and the audit kept passing, so I kept going:

========================================
vagrant@saio:~/swift$ for i in {1..4}; do swift-object-relinker relink --devices /srv/node$i ; done
Relinking files for policy default under /srv/node1
Relinked 0 diskfiles (0 errors)
Relinking files for policy default under /srv/node2
Relinked 0 diskfiles (0 errors)
Relinking files for policy default under /srv/node3
Relinked 0 diskfiles (0 errors)
Relinking files for policy default under /srv/node4
Relinked 0 diskfiles (0 errors)
vagrant@saio:~/swift$ swift-account-audit AUTH_test
Auditing account "AUTH_test"
Auditing container "c"

  Accounts checked: 1

Containers checked: 1

   Objects checked: 83
vagrant@saio:~/swift$ swift-ring-builder /etc/swift/object.builder increase_partition_power
The partition power is now 6.
The change will take effect after the next write_ring.
vagrant@saio:~/swift$ swift-ring-builder /etc/swift/object.builder write_ring
========================================

But at this point it all goes to pot:

========================================
vagrant@saio:~/swift$ swift-account-audit AUTH_test
Auditing account "AUTH_test"
Auditing container "c"
  Bad status HEADing object "/AUTH_test/c/..." on 127.0.0.3/sdb7
  Bad status HEADing object "/AUTH_test/c/..." on 127.0.0.1/sdb5
  ...
  Failed fo fetch object /AUTH_test/c/... at all!

  Accounts checked: 1

Containers checked: 1

   Objects checked: 83
  Missing Replicas: 492
========================================
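For context on why every lookup fails once the ring switches to the new part power, here is a minimal sketch of how an object's partition follows from its hash. It assumes the usual scheme of taking the top part_power bits of the first four bytes of an MD5 digest. The function names are illustrative, and the cluster-specific hash prefix/suffix that real Swift mixes in is left out.

```python
import hashlib
import struct


def demo_hash(path):
    # Simplification: real clusters salt the path with a hash prefix/suffix.
    return hashlib.md5(path.encode('utf-8')).hexdigest()


def get_partition(hash_hex, part_power):
    # Interpret the first 4 bytes as a big-endian unsigned int and keep
    # only the top part_power bits.
    top = struct.unpack_from('>I', bytes.fromhex(hash_hex))[0]
    return top >> (32 - part_power)


h = demo_hash('/AUTH_test/c/o')  # hypothetical object path
old_part = get_partition(h, 5)
new_part = get_partition(h, 6)

# Raising the part power by one splits each partition in two, so the new
# partition is always 2 * old or 2 * old + 1.
assert new_part in (2 * old_part, 2 * old_part + 1)
```

If nothing was hard-linked into the new partition directories before the ring changed, servers look under the new partition number and find nothing there, which would match the "Bad status HEADing object" lines.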

All those "Relinked 0 diskfiles (0 errors)" lines? We didn't relink *anything*! The trouble is the progress state we introduced in https://review.opendev.org/c/openstack/swift/+/695344 -- it keeps us from needing to reprocess partitions if the process needs to be restarted (which is good and useful!), but there's nothing to remove the status files once a particular increase completes.
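The failure mode can be sketched with a toy model. This is not the relinker's actual code: the state file name, its JSON layout, and the check_powers guard are all assumptions made for illustration, and the guard is only one possible way to invalidate stale state, not necessarily what the released fix does.

```python
import json
import os
import tempfile

STATE_FILE = 'relink.state.json'  # hypothetical name, for illustration only


def load_done(device_dir, part_power, next_part_power, check_powers):
    """Return the set of partitions a previous run marked as finished."""
    try:
        with open(os.path.join(device_dir, STATE_FILE)) as f:
            state = json.load(f)
    except (OSError, ValueError):
        return set()
    if check_powers and (state.get('part_power') != part_power or
                         state.get('next_part_power') != next_part_power):
        # State was written for a different increase: ignore it.
        return set()
    return set(state.get('done', []))


def relink(device_dir, partitions, part_power, next_part_power, check_powers):
    """Pretend to relink; return how many partitions were processed."""
    done = load_done(device_dir, part_power, next_part_power, check_powers)
    relinked = 0
    for part in partitions:
        if part in done:
            continue  # skipped because the state file says it is finished
        relinked += 1  # stand-in for hard-linking the partition's files
        done.add(part)
    with open(os.path.join(device_dir, STATE_FILE), 'w') as f:
        json.dump({'part_power': part_power,
                   'next_part_power': next_part_power,
                   'done': sorted(done)}, f)
    return relinked


def simulate(check_powers):
    """Run two consecutive increases (4 -> 5, then 5 -> 6) on one device."""
    with tempfile.TemporaryDirectory() as dev:
        parts = [1, 5, 9]  # same partition numbers on disk in both rounds
        first = relink(dev, parts, 4, 5, check_powers)
        second = relink(dev, parts, 5, 6, check_powers)
        return first, second


print(simulate(False))  # state left over from the first increase is trusted
print(simulate(True))   # state is discarded when the part powers differ
```

Without the guard, the second round processes nothing because the leftover state still marks those partitions as done, much like the "Relinked 0 diskfiles" output in this report. Deleting the state file once an increase completes would be another way to get the same effect.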

OpenStack Infra (hudson-openstack) wrote: Fix included in openstack/swift 2.27.0

This issue was fixed in the openstack/swift 2.27.0 release.

Tim Burke (1-tim-z) wrote:
Changed in swift:
status: New → Fix Released