swift-recon-object-cron gets stuck
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Object Storage (swift) | Fix Released | Medium | Unassigned |
Bug Description
[Errno 17] File exists: '/var/lock/
If the lock dir for swift-recon-cron doesn't get cleaned up because of an unclean shutdown, the job will never start again until an operator logs in and removes the directory with rmdir.
This probably happens when someone decides they have to send SIGTERM to everyone using /etc/swift/
In this case I think swift-recon-cron would have an opportunity to handle SystemExit - but it might be useful to add a configurable timeout on the lockdir as well. This job is supposed to run pretty frequently to give up-to-date numbers. You don't want to overwhelm the disks - but I don't think it's going to cause systemic resource consumption issues if you were to consistently blow away >1hr old lockdirs. Maybe a default timeout in the range of 6-24hrs would be helpful and safe in nearly all imaginable configurations?
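A minimal sketch of what such a lockdir timeout could look like, assuming a directory-based lock created with `os.mkdir`. The names `acquire_lock_dir` and `stale_after` are hypothetical and are not taken from the swift code base or from the eventual fix. The directory's mtime is used as a rough measure of the lock's age.

```python
import errno
import os
import time


def acquire_lock_dir(lock_dir, stale_after=None):
    """Try to take a directory lock; return True on success.

    Hypothetical helper, not swift's actual implementation.
    stale_after is a timeout in seconds; None disables stale-lock cleanup.
    """
    for _attempt in range(2):
        try:
            os.mkdir(lock_dir)
            return True
        except OSError as err:
            if err.errno != errno.EEXIST:
                raise  # something other than "File exists"
        if stale_after is None:
            return False
        try:
            age = time.time() - os.stat(lock_dir).st_mtime
        except OSError as err:
            if err.errno == errno.ENOENT:
                continue  # lock vanished in the meantime; retry mkdir
            raise
        if age <= stale_after:
            return False  # lock is recent, assume another run holds it
        try:
            os.rmdir(lock_dir)  # blow away the stale lock, then retry
        except OSError as err:
            if err.errno != errno.ENOENT:
                raise
    return False
```

A caller might use `acquire_lock_dir(path, stale_after=6 * 3600)` and remove the directory in a `finally` block, so that clean exits release the lock and the timeout only matters after an unclean shutdown. Note that the check-then-remove step is not atomic: two processes that both see a stale lock could race, which is likely tolerable for a periodic cron job but is an assumption of this sketch.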
N.B. all of the code for this bin script is in the bin dir so moving it to the cli module will be a requirement for unittesting.
Changed in swift:
status: New → Confirmed
importance: Undecided → Medium
Fixed it in https://opendev.org/openstack/swift/commit/a7c5ca08066d457695c906bd6348b96a54636b86 2a4d126f63e2d6d fa5163dd223
Change-Id: Icb328b2766057a