PosixLock not resistant to program termination

Bug #1327946 reported by Joshua Harlow on 2014-06-09
This bug affects 1 person
Affects           Status        Importance  Assigned to  Milestone
oslo-incubator    Fix Released  Critical    Ben Nemec
oslo.concurrency  Fix Released  Critical    Ben Nemec

Bug Description

If a program is terminated via ctrl-c or another signal while it holds locks acquired through the posix_ipc semaphore, those semaphores are never released (unlike the file lock, which is released automatically on program termination). This is easily seen by trying the following in a shell (see below). You can come back minutes or hours later, try to acquire the abandoned semaphore, and still be unable to (it's still apparently busy); perhaps an automatic timeout needs to be provided? I can imagine that if people upgrade a service that uses this lock (via, for example, `service nova-api stop`) while the program holds it, they would need some kind of semaphore-cleaning script just to restart the application (which seems bad).

Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from openstack.common import lockutils
>>> import os
>>> os.getpid()
16123
>>> lockutils.InterProcessLock("testing")
<openstack.common.lockutils._PosixLock object at 0x7f8bb4a8e210>
>>> a = lockutils.InterProcessLock("testing")
>>> a.acquire()
<openstack.common.lockutils._PosixLock object at 0x7f8bb30e7ad0>
>>> exit()
(.dev)josh@lappy:~/Dev/oslo-incubator$ python
Python 2.7.6 (default, Mar 22 2014, 22:59:56)
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from openstack.common import lockutils
>>> import os
>>> os.getpid()
16128
>>> a = lockutils.InterProcessLock("testing")
>>> a.acquire(timeout=5)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "openstack/common/lockutils.py", line 168, in acquire
    self.semaphore.acquire(timeout)
posix_ipc.BusyError: Semaphore is busy
....

Joshua Harlow (harlowja) on 2014-06-09
description: updated
Joshua Harlow (harlowja) on 2014-06-09
description: updated
Ben Nemec (bnemec) on 2014-06-09
Changed in oslo:
status: New → Triaged
importance: Undecided → High

Do we need to add atexit handlers to clean up?
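A minimal sketch of the idea (assuming the posix_ipc semaphore underlying _PosixLock; the names here are illustrative):

    import atexit

    import posix_ipc

    sem = posix_ipc.Semaphore("testing", flags=posix_ipc.O_CREAT,
                              initial_value=1)
    held = False  # a real implementation would track this per lock


    def _release_on_exit():
        if held:
            sem.release()

    atexit.register(_release_on_exit)
    # atexit handlers run on normal interpreter exit (including Ctrl-C),
    # but never on SIGKILL -- which is the gap pointed out below.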

Joshua Harlow (harlowja) wrote :

Adding atexit() handlers will help, but even with that it's not resistant to `kill -9` (or other forced termination).

Ben Nemec (bnemec) wrote :

Is there some way we can see whether the process that is holding a lock actually still exists? I don't know if that's possible with posix locks, but if it is then I think that would maintain the same behavior we had with file locks.

Joshua Harlow (harlowja) wrote :

Unsure; if that's possible, it could be a way for the next "lock user" to force release of the lock.

Joshua Harlow (harlowja) wrote :

Cool, so then how does one make this check & set atomic?

If two processes read the owner pid from the semaphore, then one recreates the semaphore, and then the other deletes the one the first process just created and replaces it with a new one, the result is two processes that both think they own it. Just something to think about (this is similar to the ABA problem).

OK, based on a tip in [1], it looks like we can set the value of the semaphore to the pid and retrieve it when needed to check whether the other process is still active. Here's a code snippet:

vagrant@bosh-lite:/tmp$ python
Python 2.7.3 (default, Feb 27 2014, 19:58:35)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import posix_ipc
>>> import os
>>> a = posix_ipc.Semaphore("xyz002",flags=posix_ipc.O_CREAT,initial_value=os.getpid())
>>> print a.value
1233
>>> a.acquire
<built-in method acquire of posix_ipc.Semaphore object at 0x14206e8>
>>>
vagrant@bosh-lite:/tmp$ python
Python 2.7.3 (default, Feb 27 2014, 19:58:35)
[GCC 4.6.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import posix_ipc
>>> import os
>>> a = posix_ipc.Semaphore("xyz002",flags=posix_ipc.O_CREAT,initial_value=os.getpid())
>>> print os.getpid()
25901
>>> print a.value
1233
>>>

[1] http://osdir.com/ml/darwin-kernel/2009-03/msg00029.html

One more piece of the puzzle: posix_ipc.unlink_semaphore() can delete the stuck semaphore from the new process.
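For example (a sketch; the name matches the snippet above):

    import posix_ipc

    # Removes the name system-wide. A process that already holds the old
    # semaphore keeps its handle until it closes it.
    posix_ipc.unlink_semaphore("xyz002")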

Joshua Harlow (harlowja) wrote :

Getting closer, I think. The algorithm could be:

Try to run the following:

posix_ipc.Semaphore("xyz002",flags=posix_ipc.O_CREAT,initial_value=os.getpid())

If this fails, read the pid from the existing semaphore, check whether that pid still exists, and if not, atomically unlink the semaphore and try the above statement again? The atomic part is what I'm wondering about: how do we atomically unlink it and run the above statement at the same time?

For example:

Process1,

* Try above statement, fail
* Read semaphore pid and find process dead/gone that owned it.
* Sleep for 3 seconds (or > 3...)
* Unlink *wrong* semaphore, try the above statement again, succeed.

Process2,

* Try above statement, fail
* Read semaphore pid and find process dead/gone that owned it.
* Unlink *right* semaphore, try the above statement again, succeed.

Joshua Harlow (harlowja) wrote :

So, an idea for how to make the above check-and-set atomic: what about using the file_lock primitives to do just this?

Acquire file_lock:
  - check semaphore owner
  - clear semaphore (if owner not found)
  - create semaphore (with self as owner)
Release file_lock

Now, the above uses file locks for the clearing only. Perhaps this could be a strategy provided by default in the new oslo.concurrency library, with the library itself taking care of the logic described above in a correct manner.
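A rough sketch of that strategy (this is not actual lockutils code; the guard path and helper names are made up, and it assumes the pid-as-initial-value scheme from the earlier snippet):

    import errno
    import fcntl
    import os

    import posix_ipc


    def _pid_running(pid):
        # Best-effort liveness check: signal 0 probes without killing.
        try:
            os.kill(pid, 0)
        except OSError as e:
            return e.errno != errno.ESRCH
        return True


    def recover_or_create(name, guard_dir="/tmp"):
        guard = open(os.path.join(guard_dir, name + ".guard"), "w")
        fcntl.lockf(guard, fcntl.LOCK_EX)  # auto-released if we die here
        try:
            try:
                sem = posix_ipc.Semaphore(name)
                owner = sem.value  # pid recorded at creation time
                if not _pid_running(owner):
                    posix_ipc.unlink_semaphore(name)  # clear stale semaphore
            except posix_ipc.ExistentialError:
                pass  # no semaphore yet; nothing to clean
            return posix_ipc.Semaphore(name, flags=posix_ipc.O_CREAT,
                                       initial_value=os.getpid())
        finally:
            fcntl.lockf(guard, fcntl.LOCK_UN)
            guard.close()

One caveat: sem.value only equals the owner's pid while the lock is unheld (acquire() decrements it), so a robust version would need somewhere else to record the pid.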

Josh, the above strategy looks good to me!

Ben Nemec (bnemec) wrote :

I'm not sure we want to use file locks for this. The whole point of switching to PosixLock was to remove the need for the lock_path config option, and this would reintroduce that dependency (note that there are other places we still use file locks in lockutils, but in those cases we explicitly require a lock_path to be passed in so the config issue doesn't happen).

It would be ideal if PosixLock provided some way to atomically destroy and recreate a lock, but I haven't had a chance to look that closely at it yet.

Joshua Harlow (harlowja) wrote :

Let us know if you find one, since I thought DIMS was looking for the same thing. I'd be interested to know whether such an atomic API exists, since I'm unsure how the current PosixLock is working out for users of lockutils (it appears that services using lockutils would currently be locking up with the current PosixLock usage, unless they're doing something different to work around their service being terminated, restarted, and then locking up).

Yuriy Taraday (yorik-sar) wrote :

I've found out that SysV semaphores support a SEM_UNDO flag that essentially undoes all actions done to the semaphore by a process when it exits. Can we use SysV semaphores instead of POSIX ones? From the looks of it they are almost identical for our purposes. There is a sysv_ipc module from the same maintainer, and SysV IPC itself is just as widely supported as POSIX IPC.
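A quick sketch of what SEM_UNDO buys us, assuming sysv_ipc's API (where, per its docs, setting undo=True requests SEM_UNDO for subsequent acquire/release; the key is arbitrary):

    import os
    import signal

    import sysv_ipc

    sem = sysv_ipc.Semaphore(4242, flags=sysv_ipc.IPC_CREAT, initial_value=1)
    sem.undo = True

    if os.fork() == 0:
        # Child: acquire the lock, then die abruptly without releasing it.
        sem.acquire()
        os.kill(os.getpid(), signal.SIGKILL)
    else:
        os.wait()
        # The kernel applied the dead child's semadj, so this succeeds
        # instead of hanging the way the abandoned POSIX semaphore does.
        sem.acquire(timeout=1)
        print("recovered lock abandoned by dead child")
        sem.release()
        sem.remove()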

Fix proposed to branch: master
Review: https://review.openstack.org/108954

Changed in oslo:
assignee: nobody → Yuriy Taraday (yorik-sar)
status: Triaged → In Progress

Hi all,
I'm the author and maintainer of both posix_ipc and sysv_ipc. Yuriy Taraday brought this thread to my attention.

I wanted to comment that the clever idea of writing the pid to a semaphore's value won't work under OS X because OS X doesn’t support sem_getvalue(). Under OS X, attempting to access sem.value on a posix_ipc semaphore raises an AttributeError.

Sys V semaphores and SEM_UNDO might be a better option, but note that it might introduce some unpredictable behavior in certain circumstances. See the BUGS section here:
http://man7.org/linux/man-pages/man2/semop.2.html

Yuriy Taraday (yorik-sar) wrote :

Hello, Philip! Thanks for joining us here.

As I understand it, in our case when:
a) the semaphore is initially open;
b) we set SEM_UNDO for all semops;
c) all semops are symmetrical;
we're safe from this bug, because semadj will always be either 1 or 0 and will always return to zero when the lock is released.

Ben Nemec (bnemec) wrote :

Okay, this has lingered too long for as serious a bug as it is. A couple of thoughts:

As far as OS X compatibility, it's not a huge concern of mine. If something doesn't work there we can just fall back to file locks like we do on Windows. If someone wants to provide a file-free lock for either of those platforms I'm sure we'd be happy to take it.

That said, if sysv provides a proper way to do this I'm okay with using that. If I'm reading the man page correctly, the bug is only relevant on kernels <= 2.6.10? None of our supported platforms are using a kernel that old, so it isn't a concern for us.

I'll take a closer look at Yuriy's change to switch to sysv_ipc.

+2 from me,

This is a bug that is just going to cause problems for multiple releases
(especially since lockutils.py is copied and pasted across many projects...);
it's likely already caused serious problems for operators and we don't even
know it...

When a fundamental lock mechanism is broken (thankfully it just stalls), it
raises serious questions IMHO. How are operators (or other users) coping with
this bug (restarting nodes to clear old locks? something else?)...

Looking forward to seeing a reliable fix...

Yuriy Taraday (yorik-sar) wrote :

I guess operators don't usually kill OpenStack daemons with brutal signals. This issue shouldn't appear with a simple SIGINT or SIGTERM, as those are handled by openstack.common.service and cause a graceful stop, so all locks should be released as the appropriate context managers exit. The problem would appear, however, if someone kills the process a second time with one of these signals, since the handler is reset to the default after the first signal.

Anyway, SysV semaphores are better in this regard. Thanks to Philip, sysv_ipc is already up on PyPI, so my fix has one less barrier on the way to master. The fix might land in oslo.concurrency though, given that lockutils is in the process of graduating there.

Joshua Harlow (harlowja) wrote :

Understood. I was also thinking of the issue that more than just services use the oslo incubator and copy and paste code from it.

I'm pretty sure stackforge projects and others do (taskflow does too, although I haven't synced this over due to this bug; tooz just copied lockutils over recently... so it now provides an interface to a buggy feature...).

Looking forward to the fixed version in oslo.concurrency (although IMHO someone should tell people to stop using the buggy lockutils, or to only use it with the services module...).

Joshua Harlow (harlowja) wrote :

For example:

https://github.com/stackforge/tooz/commit/f3e11e40f9871f8328e107bf1390c532fbe3a54b

Joshua Harlow (harlowja) wrote :

How is this going?

What's the current plan for dealing with the difference in constructor parameters?

sysv_ipc.Semaphore takes a key, where:

key must be None, IPC_PRIVATE or an integer > KEY_MIN and ≤ KEY_MAX. If the key is None, the module chooses a random unused key.

vs. posix_ipc.Semaphore, which takes a name. Was the plan to hash the previous name into the range of KEY_MIN and KEY_MAX and hope for the best (i.e., no collisions)?

Yuriy Taraday (yorik-sar) wrote :

I have yet to port the oslo-incubator change request to oslo.concurrency and fix some test issues before proceeding with this.

For the key I basically use (md5(lock_name) % KEY_MAX). That should be good enough given the rather narrow space of feasible lock names out there (basically [a-z0-9/-]+).

You can see the change request at https://review.openstack.org/108954 for details.
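In other words, something along these lines (a sketch; the exact hashing in the review may differ, and a real version would also need to keep keys above KEY_MIN):

    import hashlib

    import sysv_ipc


    def name_to_key(name):
        # Map an arbitrary lock name onto the SysV IPC key space.
        digest = hashlib.md5(name.encode("utf-8")).hexdigest()
        return int(digest, 16) % sysv_ipc.KEY_MAX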

Ben Nemec (bnemec) wrote :
Changed in oslo:
importance: High → Critical
Ben Nemec (bnemec) wrote :

However, it's maybe worth noting that if we had access to semctl(GETPID) we could potentially implement the PID-based unlocking: http://linux.die.net/man/2/semctl
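For what it's worth, sysv_ipc appears to expose exactly this as the last_pid attribute on Semaphore objects, so a staleness check could look roughly like this (a sketch, untested):

    import errno
    import os

    import sysv_ipc


    def holder_looks_dead(sem):
        pid = sem.last_pid  # semctl(GETPID): last pid to operate on sem
        if pid == 0:
            return False  # no operation recorded yet
        try:
            os.kill(pid, 0)
        except OSError as e:
            return e.errno == errno.ESRCH  # no such process
        return False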

Yuriy Taraday (yorik-sar) wrote :

As far as I can see, that StackOverflow thread suggests that POSIX locks, not SysV ones, are unsuitable for this. So I still think SysV semaphores are the way to go.

Joshua Harlow (harlowja) wrote :

If semctl(GETPID) is used, how does that help the situation?

If process A tries to get a semaphore, sees that it can't, reads pid Y (noticing Y doesn't exist anymore), and process B does the same, what do they then do to atomically ensure that only one of A or B actually gets the semaphore (assuming both will try to take the semaphore as their own)? Do the semaphore operations provide a CAS-like function for something like this?

Yuriy Taraday (yorik-sar) wrote :

I think we got a bit mixed up in the APIs here.

semctl is part of the SysV IPC API, which provides SEM_UNDO as a way to deal with spontaneous process death. If we're going with SysV IPC we don't need to look up PIDs, since we won't get locked out by a sudden process death.

On the other hand, POSIX IPC doesn't provide such a simple mechanism, so we might have to invent something there (and I don't see many options beyond basically reinventing parts of file locking or a SysV-like API).

We should settle on which way we're taking here. I see 3 options:

a) revert to file locks.

It'll require yet another behavior change in lockutils and more burden on operators.

b) stick with POSIX semaphores, add some mechanism to avoid waiting on dead processes.

It'll require some elaborate plan to atomically store the PID of the process holding the lock and to verify whether that process is still running.

c) use SysV semaphores.

With this option we don't need to care about cleanup after a process dies, as the kernel does that for us. The only issue here (a negligible one from my point of view) is that there is a harder limit on the range of keys: there can be only 2^64 of them.

I'm all for option c.

Ben Nemec (bnemec) wrote :

My concerns with sysv semaphores:
1) Hash collisions. I may be comfortable with this if I or someone else does the math to figure out how often they will happen. In theory collisions are possible even with full sha1, but the probability is so low no one worries about it. If the same is true of 64 bits of sha1 then great, but I honestly don't know.
2) System semaphore limits. On my laptop the limit is 32000. We apparently have projects creating tens of thousands of locks, according to https://bugs.launchpad.net/oslo/+bug/1256306, so we could very well exhaust it. This can be mitigated by extending the lock-file deletion function to remove semaphores too, but cleanup used to be a largely cosmetic problem and would now become a functional one (and one you would only hit at scale, so hard to catch in the gate).
3) A new DoS avenue. Because of 2, we can be almost trivially DoS'd by a malicious user on the system who creates a large number of semaphores, exhausting the available pool. Previously this particular attack was impossible as long as the lock_path was properly secured.
4) New edge cases. What happens if we do exhaust the semaphore limit? Does it block? Does it raise an exception? Does it just fail silently (!)?
5) sysv semaphores are, by all accounts, horrible to work with. Even the initialization race workaround in the sysv_ipc docs apparently doesn't work. Is this something we want to base our whole IPC locking mechanism on?

There are obviously issues with file-based locks too, but I believe they're largely a solved problem at this point because we've been through years of pain teasing out all the little (and not-so-little) issues. Plus it seems to be the consensus way to handle interprocess locking.
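On concern 1, a back-of-the-envelope birthday bound, taking the thread's figure of 2^64 possible keys at face value (the real exposure depends on the platform's actual key width):

    def collision_probability(n, bits=64):
        # Birthday approximation: n names hashed into 2**bits keys.
        return n * (n - 1) / 2.0 / 2 ** bits

    # Even with 100,000 distinct lock names:
    # collision_probability(100000) ~= 2.7e-10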

Pádraig Brady (p-draigbrady) wrote :

In my experience I would just go with the fcntl locks: auto-unlocked and well supported, and, importantly, supported for distributed processes. I'm not sure how problematic the lock_path config is, TBH; it's adjusted automatically in certain cases where needed anyway.
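For reference, the behavior Pádraig describes, in miniature (the path is illustrative):

    import fcntl

    lock_file = open("/tmp/testing.lock", "w")
    fcntl.lockf(lock_file, fcntl.LOCK_EX)  # blocks until the lock is free
    try:
        pass  # critical section
    finally:
        fcntl.lockf(lock_file, fcntl.LOCK_UN)
        lock_file.close()
    # If the process dies while holding the lock -- even via SIGKILL --
    # the kernel drops the lock when the file descriptor goes away.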

Fix proposed to branch: master
Review: https://review.openstack.org/116126

Changed in oslo:
assignee: Yuriy Taraday (yorik-sar) → Ben Nemec (bnemec)

Reviewed: https://review.openstack.org/116126
Committed: https://git.openstack.org/cgit/openstack/oslo.concurrency/commit/?id=f1688a70f28a15cb864651f534344faab92c0d1f
Submitter: Jenkins
Branch: master

commit f1688a70f28a15cb864651f534344faab92c0d1f
Author: Ben Nemec <email address hidden>
Date: Thu Aug 21 22:16:15 2014 +0000

    Use file locks by default again

    After a lot of discussion[1], it was decided that this is the
    safest way forward. In the future we can investigate alternative
    locking methods, probably as opt-ins so we don't break any
    existing consumers of these locks.

    [1] http://lists.openstack.org/pipermail/openstack-dev/2014-August/043090.html

    Change-Id: I49ff98abc395d3263d55aeff26081b462a8a294e
    Partial-Bug: 1327946

Changed in oslo.concurrency:
importance: Undecided → Critical
status: New → In Progress
assignee: nobody → Ben Nemec (bnemec)
Ben Nemec (bnemec) wrote :

Fixed in oslo.concurrency by https://review.openstack.org/#/c/116126/

Still need to backport that to incubator though.

Changed in oslo.concurrency:
status: In Progress → Fix Committed
Changed in oslo.concurrency:
milestone: none → juno-rc1
Ben Nemec (bnemec) on 2014-09-04
Changed in oslo-incubator:
milestone: none → juno-rc1
Doug Hellmann (doug-hellmann) wrote :

The incubator change is in https://review.openstack.org/#/c/118457/

Changed in oslo-incubator:
status: In Progress → Fix Released
Thierry Carrez (ttx) on 2014-09-05
Changed in oslo.concurrency:
milestone: none → next-juno
Thierry Carrez (ttx) on 2014-09-09
Changed in oslo-incubator:
milestone: juno-rc1 → juno-3
Changed in oslo.concurrency:
milestone: next-juno → juno-rc1
Changed in oslo.concurrency:
milestone: next-juno → next-kilo
Changed in oslo.concurrency:
status: Fix Committed → Fix Released