Unclear failure in TacTestSetup
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Launchpad itself | Triaged | High | Unassigned |
Bug Description
A buildmanager test is giving me this strange error:
TacException: Could not kill stale process /var/tmp/
When I check, /var/tmp/buildd no longer exists and the “stale process” is no longer running.
At the source of this error, in canonical.
If the daemon already appears to be running, then setUp takes roughly the following steps:
1. Kill it.
2. Wait for a child process to die.
3. If pidfile still exists, fail.
That last failure is happening to my branch, but it's a Heisenbug: a bit of debug code inserted between steps 2 and 3 makes it go away.
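The three setUp steps listed above can be sketched roughly as follows. This is not the actual TacTestSetup code: `kill_stale_daemon` and the local `TacException` class are hypothetical stand-ins, the pidfile path is simply passed in, and the sketch assumes a POSIX system where the daemon is a child of the test process.

```python
import os
import signal


class TacException(Exception):
    """Local stand-in for the exception seen in the error message."""


def kill_stale_daemon(pidfile):
    """Hypothetical sketch: kill, wait, then insist the pidfile is gone."""
    if not os.path.exists(pidfile):
        return  # the daemon does not appear to be running
    with open(pidfile) as f:
        pid = int(f.read().strip())

    # Step 1: kill it.
    try:
        os.kill(pid, signal.SIGTERM)
    except ProcessLookupError:
        pass  # already gone

    # Step 2: wait for the child process to die.
    try:
        os.waitpid(pid, 0)
    except ChildProcessError:
        pass  # not our child, or already reaped

    # Step 3: if the pidfile still exists, fail.
    if os.path.exists(pidfile):
        raise TacException("Could not kill stale process %s" % pidfile)
```

Note that step 3 only passes if the daemon removes its own pidfile before it exits, or if something else cleans the file up. In this sketch a daemon that dies without removing the file triggers the exception even though the process is really gone.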
What I think is happening is this:
1. Some *different* child process dies first.
2. The kill function (lp.osutils.
3. two_stage_kill gets notification, but about the wrong child process.
4. Returns early.
5. TacTestSetup.setUp checks for the pidfile to be gone.
6. It's not. Kaboom.
7. Killed daemon cleans up its pidfile.
8. Killed daemon dies.
9. Developer receives error.
10. Puzzled frown.
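The suspected sequence can be reproduced in miniature. The sketch below only illustrates the hypothesis; it says nothing about what two_stage_kill actually does. Everything in it is an assumption made for the demonstration: the stand-in daemon script, the one-second cleanup delay, the unrelated short-lived child, and the name `run_demo`. It assumes a POSIX system.

```python
import os
import signal
import subprocess
import sys
import tempfile
import time

# Hypothetical stand-in daemon: on SIGTERM it cleans up slowly, removes its
# pidfile, and only then exits (steps 7 and 8 above).
DAEMON_SOURCE = """
import os, signal, sys, time
pidfile = sys.argv[1]
def on_term(signum, frame):
    time.sleep(1.0)
    os.remove(pidfile)
    sys.exit(0)
signal.signal(signal.SIGTERM, on_term)
with open(pidfile, "w") as f:
    f.write(str(os.getpid()))
while True:
    time.sleep(60)
"""


def run_demo():
    pidfile = os.path.join(tempfile.mkdtemp(), "daemon.pid")
    daemon = subprocess.Popen([sys.executable, "-c", DAEMON_SOURCE, pidfile])
    # Wait until the daemon has installed its handler and created the pidfile.
    deadline = time.time() + 10
    while not os.path.exists(pidfile) and time.time() < deadline:
        time.sleep(0.01)

    # An unrelated child that exits right away. Reading its stdout to EOF
    # tells us it has finished, without reaping it.
    other = subprocess.Popen([sys.executable, "-c", "pass"],
                             stdout=subprocess.PIPE)
    other.stdout.read()
    other.stdout.close()

    os.kill(daemon.pid, signal.SIGTERM)

    # Waiting for *any* child returns as soon as the unrelated child is reaped.
    reaped_pid, _status = os.wait()
    wrong_child = reaped_pid == other.pid
    pidfile_after_any_wait = os.path.exists(pidfile)

    # Waiting for the daemon's own pid returns only after its cleanup.
    os.waitpid(daemon.pid, 0)
    pidfile_after_pid_wait = os.path.exists(pidfile)
    return wrong_child, pidfile_after_any_wait, pidfile_after_pid_wait
```

Under these assumptions `run_demo()` should return `(True, True, False)`: the any-child wait is satisfied by the wrong process while the pidfile is still there, and the pidfile is gone once the daemon itself has been waited for.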
Changed in launchpad:
status: Fix Committed → In Progress
Changed in launchpad:
assignee: Martin Pool (mbp) → nobody
status: In Progress → Triaged
Nope, it doesn't seem to be that. We do wait for the right pid.