weird pthread/fork race/deadlock
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
eglibc (Ubuntu) | Fix Released | High | Unassigned |
Natty | Invalid | High | Unassigned |
Oneiric | Fix Released | High | Unassigned |
glibc (Fedora) | Fix Released | Undecided | |
Bug Description
There appears to be a strange bug in glibc that causes deadlocks when calling fork() from threads. We had a testcase in GLib failing from time to time because of this.
I've attached a minimal testcase that uses only pure pthreads + libc. Compile it with -pthread and run it. It should fill your screen with dots for a while, then hang when it hits the bug (which happens randomly anywhere between 1 dot and hundreds). I've already received independent verification that this testcase hangs on several people's computers.
I believe this to be an upstream issue since this bug is visible on Fedora 15 and 16, but the glibc website says I should file bugs against distributions first. I also believe the issue to be a regression since Lucid is fine but Oneiric is not. The problem appears to affect both 32-bit and 64-bit systems.
Some notes:
- compiling the testcase with -static has the side-effect of causing the bug to go away
- compiling the testcase with -DFORK_DIRECTLY also appears to solve the problem
- replacing the execv() with a direct exit(0) doesn't solve the problem but causes the frequency to change
The fact that both static linking and making the fork() syscall directly cause the problem to disappear leads me to believe that this is a libc bug rather than a kernel bug (which is the only other possibility). I'm not 100% sure of that, though, since libc actually uses the clone() syscall to implement fork(), so there could be a difference inside the kernel because of that.
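The attached testcase is not reproduced on this page. For readers who want a feel for the kind of program described above, here is a rough sketch, not the original attachment: several threads each fork(), exec a trivial program in the child, and wait for it, and one dot is printed per completed round. The thread count, the `/bin/true` target, the `run_rounds` helper, the bounded round count, and the `REPRO_MAIN` guard are all assumptions made for illustration. Only `-DFORK_DIRECTLY` comes from the report, and how the original testcase implements it may differ.

```c
#include <pthread.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define N_THREADS 4 /* assumption: the report does not give a thread count */

/* Fork through libc by default. With -DFORK_DIRECTLY, issue the raw fork
 * syscall instead (only on architectures whose kernel ABI has SYS_fork). */
static pid_t do_fork(void)
{
#if defined(FORK_DIRECTLY) && defined(SYS_fork)
    return (pid_t)syscall(SYS_fork);
#else
    return fork();
#endif
}

/* Thread body: fork, exec a trivial program in the child, reap the child
 * in the parent. Returns NULL on success, a non-NULL marker on failure. */
static void *fork_and_wait(void *arg)
{
    (void)arg;
    pid_t pid = do_fork();
    if (pid < 0)
        return (void *)1;
    if (pid == 0) {
        char *const argv[] = { "true", NULL };
        execv("/bin/true", argv); /* hypothetical target program */
        _exit(127);               /* only reached if execv failed */
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return (void *)1;
    return NULL;
}

/* Run up to `rounds` rounds; each round starts N_THREADS forking threads
 * and joins them. Prints one dot per completed round and returns the
 * number of rounds that completed without error. */
int run_rounds(int rounds)
{
    int completed = 0;
    for (int r = 0; r < rounds; r++) {
        pthread_t threads[N_THREADS];
        int started = 0, ok = 1;
        for (int i = 0; i < N_THREADS; i++) {
            if (pthread_create(&threads[i], NULL, fork_and_wait, NULL) != 0) {
                ok = 0;
                break;
            }
            started++;
        }
        for (int i = 0; i < started; i++) {
            void *result = NULL;
            if (pthread_join(threads[i], &result) != 0 || result != NULL)
                ok = 0;
        }
        if (!ok)
            break;
        completed++;
        fputc('.', stdout);
        fflush(stdout);
    }
    return completed;
}

#ifdef REPRO_MAIN
int main(void)
{
    int done = run_rounds(1000);
    printf("\n%d rounds completed\n", done);
    return done == 1000 ? 0 : 1;
}
#endif
```

Something like `cc -pthread -DREPRO_MAIN repro.c` (file name hypothetical) should build it; adding `-DFORK_DIRECTLY` or `-static` gives the variants mentioned in the notes. On an affected libc a hang, if it occurs at all with this sketch, would show up as the dots stopping before the final line is printed.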
Changed in eglibc (Ubuntu Oneiric):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Canonical Foundations Team (canonical-foundations)
milestone: none → ubuntu-11.10
Changed in eglibc (Ubuntu Natty):
status: New → Triaged
importance: Undecided → High
assignee: nobody → Canonical Foundations Team (canonical-foundations)
milestone: none → natty-updates
Changed in eglibc (Ubuntu):
assignee: Canonical Foundations Team (canonical-foundations) → nobody
Changed in eglibc (Ubuntu Natty):
assignee: Canonical Foundations Team (canonical-foundations) → nobody
Changed in eglibc (Ubuntu Oneiric):
assignee: Canonical Foundations Team (canonical-foundations) → nobody
Changed in glibc (Fedora):
importance: Unknown → Undecided
status: Unknown → Fix Released
Micah Gersten just tested on Natty and discovered that the bug is there too.