relinker reports ENOENT when data file is overwritten by tombstone

Bug #1917541 reported by Tim Burke
Affects                           Status         Importance  Assigned to     Milestone
OpenStack Object Storage (swift)  Fix Committed  Undecided   Alistair Coles

Bug Description

There's a race in the relinker where a client DELETE can get serviced between when we do the listdir (to determine what needs to be relinked) and when we actually do the relinking. When that happens, the relinker reports:

    Relinking /srv/node/d10538/objects/34819/c51/.../1607014988.37615.data to /srv/node/d10538/objects/69639/c51/.../1607014988.37615.data failed: [Errno 2] No such file or directory

But checking the objects tree shows there's no problem:

    ls -ld /srv/node/d10538/objects/34819/c51/.../*
    -rwxr-xr-x. 2 swift swift 0 Mar 3 00:07 /srv/node/d10538/objects/34819/c51/.../1614729916.02920.ts

Naturally, the object-server also took care of linking it into the new partition as well as the old:

    ls -ld /srv/node/d10538/objects/6963[89]/c51/.../*
    -rwxr-xr-x. 2 swift swift 0 Mar 3 00:07 /srv/node/d10538/objects/69639/c51/.../1614729916.02920.ts
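
For anyone wondering about the `6963[89]` glob: a minimal sketch, assuming the relink is for a part power increase of one and that the partition is the top bits of the object hash. In that case an old partition can only map to one of two new partitions. The function name here is made up for illustration and is not something from the swift tree.

```python
def candidate_new_partitions(old_partition):
    """Return both partitions an old partition can map to after the
    part power is increased by one (hypothetical helper).

    One extra low-order bit of the hash is kept, so the new partition
    is the old one shifted left by one, with that bit either 0 or 1.
    """
    return (2 * old_partition, 2 * old_partition + 1)


print(candidate_new_partitions(34819))
```

Which of the two an object lands in depends on its hash; here the tombstone ended up in 69639.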

Much better would be for us to do a second listdir before reporting the error, and skip reporting it if there was a newer write.
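
A rough sketch of what that second listdir could look like. This is not the committed fix and not the relinker's real code; `relink_with_recheck` and `timestamp_of` are hypothetical names, and the filename parsing assumes plain `<timestamp>.<ext>` names like the ones shown above.

```python
import os


def timestamp_of(filename):
    """Parse '1607014988.37615.data' into a float; None if it doesn't parse.

    Hypothetical helper: only handles plain '<timestamp>.<ext>' names.
    """
    try:
        return float(filename.rsplit(".", 1)[0])
    except ValueError:
        return None


def relink_with_recheck(old_dir, new_dir, filename):
    """Hard-link filename from old_dir into new_dir.

    Returns 'linked', 'exists', or 'superseded'. Re-raises the ENOENT
    only when a second listdir shows no newer file in old_dir.
    """
    os.makedirs(new_dir, exist_ok=True)
    try:
        os.link(os.path.join(old_dir, filename),
                os.path.join(new_dir, filename))
        return "linked"
    except FileExistsError:
        # Someone else already put the file in the new partition.
        return "exists"
    except FileNotFoundError:
        # Source vanished after the first listdir: look again.
        try:
            current = os.listdir(old_dir)
        except FileNotFoundError:
            current = []
        wanted = timestamp_of(filename)
        stamps = [timestamp_of(name) for name in current]
        if wanted is not None and any(
                ts is not None and ts > wanted for ts in stamps):
            # A newer write (e.g. a tombstone) replaced it; not an error.
            return "superseded"
        raise
```

With the files from this report, the `.data` at 1607014988.37615 is gone but the `.ts` at 1614729916.02920 is newer, so the sketch would return `'superseded'` rather than surface the ENOENT. If the old directory is empty or missing, the error is still raised.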

Alistair Coles (alistair-coles) wrote:
Changed in swift:
assignee: nobody → Alistair Coles (alistair-coles)
status: New → Fix Committed