Without an easily reproducible scenario, I dived into the code and observed:
1. We optimistically Unlock held locks, i.e we do not check if unlock succeeds;
2. Unlocking may fail if we are trying to unlock a lock that (a) is not held by us or (b) we could not move the lock to a temporary file/directory.
I think that in this bug we are observing the consequences of the 2b and due to 1, we will not see anything useful in the logs - we are not logging failures to unlock.
So, in conclusion, current workaround - removing lock's directory as above - is necessary in the rare cases where lock is not unlocked after running/resolving hooks. However, this should be used with care and under advisement to avoid removing valid locks.
We need to re-design lock mechanism to be more robust to address this and other scenarios.
An easily re-producible scenario could help us ensure that the new design/implementation will address this case. Please submit one if you have it readily available \o/
Without an easily reproducible scenario, I dived into the code and observed:
1. We optimistically Unlock held locks, i.e we do not check if unlock succeeds;
2. Unlocking may fail if we are trying to unlock a lock that (a) is not held by us or (b) we could not move the lock to a temporary file/directory.
I think that in this bug we are observing the consequences of the 2b and due to 1, we will not see anything useful in the logs - we are not logging failures to unlock.
So, in conclusion, current workaround - removing lock's directory as above - is necessary in the rare cases where lock is not unlocked after running/resolving hooks. However, this should be used with care and under advisement to avoid removing valid locks.
We need to re-design lock mechanism to be more robust to address this and other scenarios.
An easily re-producible scenario could help us ensure that the new design/ implementation will address this case. Please submit one if you have it readily available \o/