swift-recon-object-cron gets stuck

Bug #1687509 reported by clayg
This bug affects 2 people
Affects: OpenStack Object Storage (swift)
Status: Fix Released
Importance: Medium
Assigned to: Unassigned

Bug Description

[Errno 17] File exists: '/var/lock/swift/swift-recon-object-cron'

If the lock dir for swift-recon-cron doesn't get cleaned up because of an unclean shutdown, it will never start again until an operator logs in and runs rmdir on it.

Probably this happens when someone decides they have to send SIGTERM to every process using /etc/swift/object-server.conf.

In this case I think swift-recon-cron would have an opportunity to handle SystemExit, but it might be useful to add a configurable timeout on the lockdir as well. This job is supposed to run pretty frequently to give up-to-date numbers. You don't want to overwhelm the disks, but I don't think it's going to cause systemic resource consumption issues if you were to consistently blow away >1hr old lockdirs. Maybe a default timeout in the range of 6-24hrs would be helpful and safe in nearly all imaginable configurations?
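
To make the two ideas above concrete, here is a minimal sketch of a directory lock that is released when SystemExit propagates and that treats an old lockdir as stale. It is not the change that landed in swift; the names acquire_lock_dir, run_locked and stale_after are hypothetical, and the 6 hour default is just the low end of the range suggested above. Note that the finally block only helps if SIGTERM is converted into SystemExit by a signal handler. With Python's default SIGTERM handling the process dies without cleanup, which is the case the timeout is meant to cover.

```python
import os
import time


def acquire_lock_dir(lock_dir, stale_after=6 * 3600, now=None):
    """Try to take the lock by creating lock_dir; return True on success.

    Hypothetical helper: if lock_dir already exists and its mtime is more
    than stale_after seconds old, assume the previous run died without
    cleaning up, remove the directory, and try once more.
    """
    now = time.time() if now is None else now
    try:
        os.mkdir(lock_dir)
        return True
    except FileExistsError:
        pass

    try:
        age = now - os.stat(lock_dir).st_mtime
    except FileNotFoundError:
        age = None  # the holder released the lock between mkdir and stat

    if age is not None and age <= stale_after:
        return False  # lock is fresh, another run is probably active

    if age is not None:
        try:
            os.rmdir(lock_dir)  # blow away the stale lockdir
        except OSError:
            return False  # lost a race or the directory is not empty

    try:
        os.mkdir(lock_dir)
        return True
    except FileExistsError:
        return False  # someone else grabbed the lock first


def run_locked(lock_dir, job, stale_after=6 * 3600):
    """Run job() while holding the lock; return False if the lock is busy."""
    if not acquire_lock_dir(lock_dir, stale_after):
        return False
    try:
        job()
    finally:
        # Runs on normal return, on exceptions, and when SystemExit
        # propagates, so the lockdir is not left behind.
        os.rmdir(lock_dir)
    return True
```

A cron entry point could call something like run_locked('/var/lock/swift/swift-recon-object-cron', main), with stale_after read from the config. The remove-and-recreate step is not atomic, so two runs racing on a stale lock could in principle both start; at the frequency this job runs, that may be an acceptable trade-off.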

N.B. all of the code for this bin script is in the bin dir, so moving it to the cli module will be a requirement for unit testing.

clayg (clay-gerrard) wrote:
Changed in swift:
status: New → Confirmed
importance: Undecided → Medium
kim woo seok (rladntjr4) wrote:

Fixed it in https://opendev.org/openstack/swift/commit/a7c5ca08066d457695c906bd6348b96a54636b86
Change-Id: Icb328b2766057a2a4d126f63e2d6dfa5163dd223

Changed in swift:
status: Confirmed → Fix Released