Relinker marks partition as relinked even when there were errors
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Fix Released | Critical | Unassigned |
Bug Description
The relinker is meant to be an idempotent tool: you can run it repeatedly until you're satisfied that everything has been linked where it needs to go. But even if there are errors processing a partition:
vagrant@saio:~$ swift-object-
Processing files for policy replicated under /srv/node1/ (cleanup=False)
Error relinking: failed to relink /srv/node1/
Step: relink Device: sdb1 Policy: replicated Partitions: 1/1
1 hash dirs processed (cleanup=False) (1 files, 0 linked, 0 removed, 1 errors)
...we write down that the partition was relinked...
vagrant@saio:~$ < /srv/node1/
{
"part_power": 8,
"next_
"state": {
"150": true
}
}
...which causes a subsequent run to skip it and gives a false sense that everything's OK:
vagrant@saio:~$ swift-object-
Processing files for policy replicated under /srv/node1/ (cleanup=False)
0 hash dirs processed (cleanup=False) (0 files, 0 linked, 0 removed, 0 errors)
Operators can work around it by removing the state file between runs, but then every partition will be processed again. It'd be way better if the error prevented the partition from being marked completed -- then operators could do one run, check for errors, perform whatever manual intervention might be necessary, then kick off another run that will only attempt to process the partitions that had errors.
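The behaviour asked for above can be sketched roughly as follows. This is only an illustration, not the relinker's actual code: `process_partitions`, `relink_partition`, and the in-memory `state` dict are hypothetical names, and the only thing taken from the report is that the state maps partition numbers (as strings) to a boolean "done" flag.

```python
def process_partitions(partitions, relink_partition, state):
    """Process each pending partition; mark it done only if it had no errors.

    partitions: iterable of partition numbers (hypothetical input)
    relink_partition: callable returning the error count for one partition
    state: dict mapping str(partition) -> bool, like the "state" key above
    """
    total_errors = 0
    for part in partitions:
        key = str(part)
        if state.get(key):
            # Completed cleanly in an earlier run, so skip it.
            continue
        errors = relink_partition(part)
        total_errors += errors
        # A partition with errors stays "not done" and is retried next run.
        state[key] = (errors == 0)
    return total_errors


# Usage: the first run hits an error on partition 150, the second retries it.
attempted = []

def failing_relink(part):
    attempted.append(part)
    return 1 if part == 150 else 0

def working_relink(part):
    attempted.append(part)
    return 0

state = {}
first_errors = process_partitions([149, 150], failing_relink, state)
state_after_first = dict(state)
second_errors = process_partitions([149, 150], working_relink, state)
```

With this shape, the second run only touches partition 150, since 149 was already recorded as done and 150 was not.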
Changed in swift:
status: New → In Progress
Changed in swift:
importance: Undecided → Critical
Fixed in https://review.opendev.org/c/openstack/swift/+/788089:
relinker: Only mark partitions "done" if there were no (new) errors
This way operators can re-run the relinker in the face of errors without
needing to manually clear the state file.
Change-Id: Ida1c1c0c8a695b1b226121b426b8226a43f3056b
Co-Authored-By: Clay Gerrard <email address hidden>
Will be included in swift 2.28.0.