cleanup_file_locks does not remove stale sentinel files

Bug #1018586 reported by Branan Purvine-Riley
This bug affects 3 people
Affects                   Status        Importance  Assigned to       Milestone
OpenStack Compute (nova)  Fix Released  Undecided   Eugene Kirpichov
Essex                     Fix Released  High        Unassigned
nova (Ubuntu)             Fix Released  High        Unassigned
Precise                   Fix Released  Undecided   Unassigned

Bug Description

Related to https://bugs.launchpad.net/nova/+bug/785955

The patch for that issue has an incorrect regex for sentinel files.

The correct regex is "hostname + r'-.*\.(\d+$)'"

Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in ubuntu:
status: New → Confirmed
Revision history for this message
Pádraig Brady (p-draigbrady) wrote :

shouldn't you just reopen 785955 and add the comment there?

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/10095

Changed in nova:
assignee: nobody → Eugene Kirpichov (ekirpichov)
status: New → In Progress
Michael Still (mikal)
tags: added: canonistack ops
Revision history for this message
Eugene Kirpichov (ekirpichov) wrote :
Revision history for this message
Eugene Kirpichov (ekirpichov) wrote :

(oops, didn't notice it was already linked to automatically)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/10095
Committed: http://github.com/openstack/nova/commit/974417b75f5f839ce4daaf080147ad154d727f10
Submitter: Jenkins
Branch: master

commit 974417b75f5f839ce4daaf080147ad154d727f10
Author: Eugene Kirpichov <email address hidden>
Date: Sat Jul 21 23:17:55 2012 +0000

    Fix wrong regex in cleanup_file_locks.

    The sentinel filename actually has form hostname-threadid.pid,
    not hostname.threadid-pid.
    Launchpad bug 1018586.
    Change-Id: I09c01e0e63ee704b1485c196dc0b396ee03b2e5c
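
To make the two filename forms in the commit message concrete, here is a minimal sketch that tries both patterns against both shapes. The host name, thread id, and pid are made-up values for illustration, and the use of re.escape is an addition that is not part of the patch (it only keeps the dots in the host name from acting as wildcards).

```python
import re

hostname = "node1.example.org"  # hypothetical host name

# Pattern before the fix: expects hostname.threadid-pid
pattern_old = re.escape(hostname) + r"\..*-(\d+$)"
# Pattern after the fix: expects hostname-threadid.pid
pattern_new = re.escape(hostname) + r"-.*\.(\d+$)"

# Hypothetical filenames in each of the two forms
name_dash_dot = hostname + "-7f3a2c.4242"  # hostname-threadid.pid
name_dot_dash = hostname + ".7f3a2c-4242"  # hostname.threadid-pid


def extract_pid(pattern, filename):
    """Return the trailing pid as an int, or None if the name does not match."""
    m = re.match(pattern, filename)
    return int(m.group(1)) if m else None


for label, pattern in (("old", pattern_old), ("new", pattern_new)):
    for filename in (name_dash_dot, name_dot_dash):
        print(label, filename, extract_pid(pattern, filename))
```

Each pattern recognizes only its own filename form, so a cleanup pass written against the wrong form never sees the sentinel files at all.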

Changed in nova:
status: In Progress → Fix Committed
Changed in ubuntu:
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/essex)

Fix proposed to branch: stable/essex
Review: https://review.openstack.org/10321

tags: added: essex-backport
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/essex)

Reviewed: https://review.openstack.org/10321
Committed: http://github.com/openstack/nova/commit/f2bc403879234aaaeeb61e1dca1affe18192cfa1
Submitter: Jenkins
Branch: stable/essex

commit f2bc403879234aaaeeb61e1dca1affe18192cfa1
Author: Eugene Kirpichov <email address hidden>
Date: Sat Jul 21 23:17:55 2012 +0000

    Fix wrong regex in cleanup_file_locks.

    The sentinel filename actually has form hostname-threadid.pid,
    not hostname.threadid-pid.

    Launchpad bug 1018586.

    Update: Add Eugene to Authors for stable/essex.

    Change-Id: I09c01e0e63ee704b1485c196dc0b396ee03b2e5c
    (cherry picked from commit 974417b75f5f839ce4daaf080147ad154d727f10)

tags: added: in-stable-essex
Revision history for this message
Eugene Kirpichov (ekirpichov) wrote :

Hm, I'm confused. I just noticed that in ubuntu precise, the package python-lockfile uses a version of lockfile (0.8) for which this regex IS CORRECT. Where did I and the other guy whom this bug affects get the more up-to-date version of lockfile??

Revision history for this message
Adam Gandelman (gandelman-a) wrote :

Hey Eugene-

I'm not sure where you and the other guy got a more up-to-date version of lockfile. python-lockfile has remained at 0.8 in Ubuntu since the package was introduced in Lucid. That said, AFAICS I'm not sure any of this is lockfile related, as nova.utils.GreenLockFile overrides lockfile's naming scheme for sentinel files anyway, and the sentinel regexp depends on that, not on lockfile.

Did a quick test locally and found that a system named 'warhead.home.base' leaves a sentinel file named 'warhead.home.base-2ae619-2a025a0.24791', for which your newer regexp works and the original does not:

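#!/usr/bin/python
import re

hostname = 'warhead.home.base'
# Sentinel file left behind on that host
filename = "warhead.home.base-2ae619-2a025a0.24791"
# Pattern from the original patch: expects hostname.<id>-<pid>
orig_sentinel_re = hostname + r'\..*-(\d+$)'
# Pattern from the fix: expects hostname-<id>.<pid>
new_sentinel_re = hostname + r'-.*\.(\d+$)'
print(re.match(orig_sentinel_re, filename))
print(re.match(new_sentinel_re, filename))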

output:
None
<_sre.SRE_Match object at 0x7f69e74ad558>
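
The group captured by the regexp is the pid of the process that created the sentinel, which is what lets a cleanup routine decide whether a sentinel is stale. A minimal sketch of that decision, assuming a POSIX system; pid_is_alive and find_stale_sentinels are hypothetical helpers written for illustration, not nova's actual code:

```python
import errno
import os
import re


def pid_is_alive(pid):
    """Probe a pid with signal 0 (POSIX only); no signal is actually delivered."""
    try:
        os.kill(pid, 0)
    except OSError as exc:
        # ESRCH: no such process. EPERM: it exists but belongs to another user.
        return exc.errno == errno.EPERM
    return True


def find_stale_sentinels(lock_dir, hostname):
    """Return sentinel filenames in lock_dir whose owning pid no longer exists."""
    sentinel_re = re.compile(re.escape(hostname) + r"-.*\.(\d+$)")
    stale = []
    for name in sorted(os.listdir(lock_dir)):
        m = sentinel_re.match(name)
        if m and not pid_is_alive(int(m.group(1))):
            stale.append(name)
    return stale
```

With the original pattern the match on each filename fails, so the list of stale sentinels would always come back empty, which matches the symptom in the bug title.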

Revision history for this message
Pádraig Brady (p-draigbrady) wrote :

tl;dr The current code should be correct.

old naming = blah-pid
new naming = blah.pid
That was changed upstream in:
http://code.google.com/p/pylockfile/source/detail?r=102
That was released upstream in 0.9.1

But nova overrides lockfile naming since essex-1-2022-geb42e7f
The new regexp is correct for that.
I.e. diablo lock files are named depending on the lockfile version,
but diablo doesn't have the cleaning code, so that is moot.

p.s. This cleanup code doesn't work on Windows I think,
as it's assuming file rather than directory locks.
Maybe os.link is available on Windows but I don't think
it's available in Python yet.
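
For context on the p.s., file-based locking of this kind relies on hard links: a process writes its own uniquely named sentinel file and then tries to hard-link it to the shared lock name. A minimal sketch of that idea, assuming a POSIX filesystem; the function names and paths are hypothetical, and this is not the code from lockfile or nova:

```python
import os


def try_acquire(lock_path, sentinel_path):
    """Try to take lock_path by hard-linking our own sentinel file to it."""
    # Create the per-process sentinel file, recording our pid in it.
    with open(sentinel_path, "w") as f:
        f.write(str(os.getpid()))
    try:
        # os.link fails if lock_path already exists, so only one caller wins.
        os.link(sentinel_path, lock_path)
    except OSError:
        # Someone else holds the lock (or hard links are unsupported here).
        os.unlink(sentinel_path)
        return False
    return True


def release(lock_path, sentinel_path):
    """Drop the lock by removing both names of the linked file."""
    os.unlink(lock_path)
    os.unlink(sentinel_path)
```

After a successful try_acquire, both names refer to the same inode, so os.stat(lock_path).st_nlink is 2 for as long as the sentinel exists.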

Thierry Carrez (ttx)
Changed in nova:
milestone: none → folsom-3
status: Fix Committed → Fix Released
Dave Walker (davewalker)
affects: ubuntu → nova (Ubuntu)
Changed in nova (Ubuntu):
status: Confirmed → Fix Released
Dave Walker (davewalker)
Changed in nova (Ubuntu Precise):
status: New → Confirmed
Revision history for this message
Adam Gandelman (gandelman-a) wrote : Verification report.

Please find the attached test log from the Ubuntu Server Team's CI infrastructure. As part of the verification process for this bug, Nova has been deployed and configured across multiple nodes using precise-proposed as an installation source. After successful bring-up and configuration of the cluster, a number of exercises and smoke tests have been invoked to ensure the updated package did not introduce any regressions. A number of test iterations were carried out to catch any possible transient errors.

Please note the list of installed packages at the top and bottom of the report.

For records of upstream test coverage of this update, please see the Jenkins links in the comments of the relevant upstream code-review(s):

Trunk review: https://review.openstack.org/10095
Stable review: https://review.openstack.org/10321

As per the provisional Micro Release Exception granted to this package by the Technical Board, we hope this contributes toward verification of this update.

Revision history for this message
Adam Gandelman (gandelman-a) wrote :

Test coverage log.

tags: added: verification-done
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 2012.1.3+stable-20120827-4d2a4afe-0ubuntu1

---------------
nova (2012.1.3+stable-20120827-4d2a4afe-0ubuntu1) precise-proposed; urgency=low

  * New upstream snapshot, fixes FTBFS in -proposed. (LP: #1041120)
  * Resynchronize with stable/essex (4d2a4afe):
    - [5d63601] Inappropriate exception handling on kvm live/block migration
      (LP: #917615)
    - [ae280ca] Deleted floating ips can cause instance delete to fail
      (LP: #1038266)

nova (2012.1.3+stable-20120824-86fb7362-0ubuntu1) precise-proposed; urgency=low

  * New upstream snapshot. (LP: #1041120)
  * Dropped, superseded by new snapshot:
    - debian/patches/CVE-2012-3447.patch: [d9577ce]
    - debian/patches/CVE-2012-3371.patch: [25f5bd3]
    - debian/patches/CVE-2012-3360+3361.patch: [b0feaff]
  * Resynchronize with stable/essex (86fb7362):
    - [86fb736] Libvirt driver reports incorrect error when volume-detach fails
      (LP: #1029463)
    - [272b98d] nova delete lxc-instance umounts the wrong rootfs (LP: #971621)
    - [09217ab] Block storage connections are NOT restored on system reboot
      (LP: #1036902)
    - [d9577ce] CVE-2012-3361 not fully addressed (LP: #1031311)
    - [e8ef050] pycrypto is unused and the existing code is potentially insecure
      to use (LP: #1033178)
    - [3b4ac31] cannot umount guestfs (LP: #1013689)
    - [f8255f3] qpid_heartbeat setting in ineffective (LP: #1030430)
    - [413c641] Deallocation of fixed IP occurs before security group refresh
      leading to potential security issue in error / race conditions
      (LP: #1021352)
    - [219c5ca] Race condition in network/deallocate_for_instance() leads to
      security issue (LP: #1021340)
    - [f2bc403] cleanup_file_locks does not remove stale sentinel files
      (LP: #1018586)
    - [4c7d671] Deleting Flavor currently in use by instance creates error
      (LP: #994935)
    - [7e88e39] nova testsuite errors on newer versions of python-boto (e.g.
      2.5.2) (LP: #1027984)
    - [80d3026] NoMoreFloatingIps: Zero floating ips available after repeatedly
      creating and destroying instances over time (LP: #1017418)
    - [4d74631] Launching with source groups under load produces lazy load error
      (LP: #1018721)
    - [08e5128] API 'v1.1/{tenant_id}/os-hosts' does not return a list of hosts
      (LP: #1014925)
    - [801b94a] Restarting nova-compute removes ip packet filters (LP: #1027105)
    - [f6d1f55] instance live migration should create virtual_size disk image
      (LP: #977007)
    - [4b89b4f] [nova][volumes] Exceeding volumes, gigabytes and floating_ips
      quotas returns general uninformative HTTP 500 error (LP: #1021373)
    - [6e873bc] [nova][volumes] Exceeding volumes, gigabytes and floating_ips
      quotas returns general uninformative HTTP 500 error (LP: #1021373)
    - [7b215ed] Use default qemu-img cluster size in libvirt connection driver
    - [d3a87a2] Listing flavors with marker set returns 400 (LP: #956096)
    - [cf6a85a] nova-rootwrap hardcodes paths instead of using
      /sbin:/usr/sbin:/usr/bin:/bin (LP: #1013147)
    - [2efc87c] affinity filters don't work if scheduler_hints is None
      (LP: #1007573)
  ...

Changed in nova (Ubuntu Precise):
status: Confirmed → Fix Released
Revision history for this message
Clint Byrum (clint-fewbar) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Thierry Carrez (ttx)
Changed in nova:
milestone: folsom-3 → 2012.2