race condition updating statefile
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
ifupdown (Ubuntu) | Fix Released | Medium | Chris J Arges | | |
Precise | Fix Released | Medium | Chris J Arges | | |
Quantal | Fix Released | Medium | Chris J Arges | | |
Raring | Fix Released | Medium | Chris J Arges | | |
Saucy | Fix Released | Medium | Chris J Arges | | |
Bug Description
SRU Justification:
[Impact]
* Users will occasionally see their network interfaces not come up due to race conditions.
[Test Case]
* See this comment: https:/
[Regression Potential]
* This fix backports a change from upstream ifupdown. Instead of locking a statefile it locks a lockfile.
--
Ubuntu 12.04.2
ifupdown 0.7~beta2ubuntu8
Symptom: Every so often, /etc/init/
> Mar 25 16:39:37 XXXXXXXX kernel: [ 28.793922] init: network-interface (lo) pre-start process (1079) terminated with status 1
/var/log/
> ifup: failed to overwrite statefile /run/network/
Relevant section of the ifup sources, in update_state():
if (rename (tmpstatefile, statefile)) {
}
update_state() opens the statefile, takes a blocking fcntl() lock on it (F_SETLKW), opens a temporary file (tmpstatefile), filters the contents of the statefile into the temporary file, closes the temporary file, then renames it over the statefile.
Once the rename() happens in one instance of ifup, any other blocked instances are waiting to lock a file that is no longer linked under the statefile name: their open descriptors still refer to the old, replaced inode. Overlap enough instances of ifup just right and you have them all locking different copies of statefile, which doesn't prevent any of them from rename()ing tmpstatefile out from underneath the others, thus causing the others' rename()s to fail with ENOENT.
Example:
Process A starts, opens statefile.
Process A locks statefile.
Process B starts, opens statefile.
Process B waits for lock on statefile.
Process A renames tmpstatefile to statefile and exits.
Process B acquires lock on *outdated* statefile FILE pointer.
Process C starts, opens current statefile (written by Process A).
Process C locks current statefile.
** Two ifups now have locks **
Process B renames tmpstatefile to statefile and exits.
Process C tries to rename tmpstatefile, fails because tmpstatefile has already been renamed out from under it by Process B.
NOTE: Since Process B was operating on an outdated statefile, it has also stomped over any changes made by Process A, so simply making the tmpstatefile process-specific to avoid rename()ing out from under each other won't help.
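The stale-lock step in the sequence above (Process B ending up with a handle on an outdated statefile) can be reproduced in a single process. The following is only a minimal sketch, not code from ifupdown: the function names fd_matches_path() and stale_after_rename() and the paths passed to them are hypothetical. It opens a file, replaces it with rename(), and then compares the inode behind the old descriptor with the inode the name now points to.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if `fd` still refers to the file currently
 * named `path`, 0 if the name now points at a different inode, -1 on error. */
int fd_matches_path(int fd, const char *path)
{
    struct stat by_fd, by_name;

    if (fstat(fd, &by_fd) != 0 || stat(path, &by_name) != 0)
        return -1;
    return by_fd.st_dev == by_name.st_dev && by_fd.st_ino == by_name.st_ino;
}

/* Hypothetical demo: open `statefile`, replace it by renaming `tmp_path`
 * over it, then report whether the old descriptor still matches the name.
 * Both paths must be on the same filesystem for rename() to succeed. */
int stale_after_rename(const char *statefile, const char *tmp_path)
{
    int fd = open(statefile, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    /* Before the rename, descriptor and name agree. */
    assert(fd_matches_path(fd, statefile) == 1);

    int tmp = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (tmp < 0) {
        close(fd);
        return -1;
    }
    close(tmp);

    if (rename(tmp_path, statefile) != 0) {
        close(fd);
        return -1;
    }

    /* After the rename, `fd` refers to the old, now unlinked inode. */
    int result = fd_matches_path(fd, statefile);
    close(fd);
    return result;
}
```

stale_after_rename() is expected to return 0: the descriptor opened before the rename no longer refers to the file that now carries the statefile name, so a lock taken through that descriptor excludes nobody who opens the statefile afterwards.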
Related bugs:
* bug 1226067: ifquery fails with bad file descriptor
Changed in ifupdown (Ubuntu): | |
assignee: | nobody → Chris J Arges (arges) |
Changed in ifupdown (Ubuntu Precise): | |
assignee: | nobody → Chris J Arges (arges) |
Changed in ifupdown (Ubuntu Quantal): | |
assignee: | nobody → Chris J Arges (arges) |
Changed in ifupdown (Ubuntu Raring): | |
assignee: | nobody → Chris J Arges (arges) |
status: | New → In Progress |
Changed in ifupdown (Ubuntu Saucy): | |
status: | Confirmed → In Progress |
Changed in ifupdown (Ubuntu Quantal): | |
status: | New → In Progress |
Changed in ifupdown (Ubuntu Precise): | |
status: | New → In Progress |
importance: | Undecided → Medium |
Changed in ifupdown (Ubuntu Quantal): | |
importance: | Undecided → Medium |
Changed in ifupdown (Ubuntu Raring): | |
importance: | Undecided → Medium |
Changed in ifupdown (Ubuntu Saucy): | |
importance: | Undecided → Medium |
description: | updated |
Changed in ifupdown (Ubuntu Raring): | |
status: | In Progress → Fix Committed |
tags: | added: verification-failed-precise verification-failed-quantal verification-failed-raring removed: verification-done-precise verification-done-quantal |
tags: | added: verification-done-precise removed: verification-failed-precise |
tags: | added: verification-done-raring removed: verification-failed-raring |
Note: The exact same locking semantics are present in the latest ifupdown (0.7.40) from upstream Debian.
Suggested fix: Lock a lockfile, not the statefile.
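A rough sketch of what locking a lockfile could look like follows. This is not the upstream patch: lock_state(), update_state_locked(), and all paths passed to them are hypothetical names chosen for illustration, and the filtering step is reduced to writing a fixed string. The point it tries to show is that the lock is taken on a file that is never renamed or replaced, so every instance contends on the same inode.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open (creating if needed) a dedicated lock file and
 * block until an exclusive fcntl() lock on the whole file is held.
 * Returns the locked descriptor, or -1 on error. */
int lock_state(const char *lockfile)
{
    int fd = open(lockfile, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;

    struct flock fl = {0};
    fl.l_type = F_WRLCK;
    fl.l_whence = SEEK_SET;   /* l_start = 0 and l_len = 0: entire file */

    if (fcntl(fd, F_SETLKW, &fl) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Hypothetical update routine: replace `statefile` with `content` while
 * holding the lock on `lockfile`. The lock file itself is never renamed. */
int update_state_locked(const char *lockfile, const char *statefile,
                        const char *tmp_path, const char *content)
{
    assert(lockfile && statefile && tmp_path && content);

    int lock_fd = lock_state(lockfile);
    if (lock_fd < 0)
        return -1;

    int rc = -1;
    FILE *tmp = fopen(tmp_path, "w");
    if (tmp) {
        int wrote = fputs(content, tmp) >= 0;
        int closed = fclose(tmp) == 0;
        if (wrote && closed && rename(tmp_path, statefile) == 0)
            rc = 0;
    }

    close(lock_fd);   /* closing the descriptor releases the fcntl() lock */
    return rc;
}
```

Because the statefile is read and replaced only while the lock is held, a waiting instance that acquires the lock later would open the current statefile rather than an outdated copy, which also addresses the lost-update problem described in the NOTE above.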