ec2test sometimes hangs when running the windmill test suite
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Launchpad itself | Fix Released | High | Māris Fogels | |
Bug Description
Occasionally our test suite hangs while loading the first windmill test via ec2test. The problem appears to affect only developer systems running ec2test.
If you encounter this problem, rerunning the test suite should work.
The following stack trace was pulled from a hung system:
> Thread 2
> #0 0x00002b85227fd7fb in accept () from None
> #1 0x00002b852388f947 in sock_accept (s=0x94409c0) from
> /build/
> /usr/lib/
> /usr/lib/
> /usr/lib/
> /var/launchpad/
> py2.5.egg/
> /usr/lib/
> /usr/lib/
> /usr/lib/
MaxB said: "This must be the culprit of the hang; it appears similar to one I've been looking at for the Python 2.6 migration. Whatever was supposed to knock this thread out of its accept loop hasn't."
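For illustration, one common way to make sure a thread can always be knocked out of an accept loop is to give the listening socket a timeout and re-check a stop flag between attempts. This is a minimal sketch, not Windmill's actual server code; the function name `serve_until_stopped` and the `poll_interval` parameter are hypothetical.

```python
import socket
import threading


def serve_until_stopped(server_sock, stop_event, poll_interval=0.5):
    """Accept connections until stop_event is set; return the number accepted.

    Hypothetical helper: because accept() times out every poll_interval
    seconds, the loop notices stop_event even if no client ever connects.
    """
    server_sock.settimeout(poll_interval)
    accepted = 0
    while not stop_event.is_set():
        try:
            conn, _addr = server_sock.accept()
        except socket.timeout:
            continue  # nothing arrived; go back and re-check the stop flag
        accepted += 1
        conn.close()  # a real server would hand the connection to a handler
    return accepted
```

With a plain blocking `accept()`, by contrast, the thread stays parked in the call (as in frame #0 of the trace above) until a connection arrives.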
Attached are two log files from two simultaneously hung test runs. They both hang in the same place, just after the RegistryWindmil
The following thread is also relevant: https:/
Related branches
- Gary Poster (community): Approve
  Diff: 80 lines (+20/-6), 2 files modified
  Makefile (+19/-5)
  lib/devscripts/ec2test/remote.py (+1/-1)
- Leonard Richardson (community): Approve
  Diff: 33 lines (+1/-13), 2 files modified
  Makefile (+0/-12)
  lib/devscripts/ec2test/remote.py (+1/-1)
description: updated
Changed in launchpad-foundations:
status: Fix Committed → In Progress
Changed in launchpad-foundations:
milestone: 10.05 → 10.06
tags: removed: qa-needstesting
Changed in launchpad-foundations:
status: In Progress → Fix Committed
tags: added: qa-done, removed: qa-needstesting
tags: added: qa-ok, removed: qa-done
Changed in launchpad-foundations:
status: Fix Committed → Fix Released
Quoted from the mailing list thread:
"Windmill implements a custom HTTPS web server, which is waiting for data. I would guess that something in the web browser itself hung: either loading the test harness, loading the site under test, or passing back test results. We need a log file to know for sure."
I think the next thing we should do is look for a robust mechanism for running and restarting hung tests. Even if we find what is hanging in the test harness (browser, server, network, moon phase) we will probably end up fixing it with a full restart anyway.
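A rough sketch of what such a run-and-restart mechanism could look like, assuming the test run can be started as a single command and that a fixed wall-clock limit is an acceptable definition of "hung". The name `run_with_restart` and its parameters are hypothetical, and this is not what ec2test actually does.

```python
import subprocess
import sys


def run_with_restart(cmd, timeout_seconds, max_attempts=3):
    """Run cmd; if it exceeds timeout_seconds, kill it and start it again.

    Returns (returncode, attempts_used) for the first attempt that finishes.
    Raises RuntimeError if every attempt hits the timeout.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            completed = subprocess.run(cmd, timeout=timeout_seconds)
        except subprocess.TimeoutExpired:
            # subprocess.run kills the direct child before raising. Processes
            # the child spawned itself (for example a browser) are not
            # covered and would need separate cleanup.
            continue
        return completed.returncode, attempt
    raise RuntimeError("command hung on all %d attempts" % max_attempts)


if __name__ == "__main__":
    # Stand-in command for demonstration; a real caller would pass the
    # test runner's command line instead.
    print(run_with_restart([sys.executable, "-c", "pass"], timeout_seconds=30))
```

A failing but finished run is returned as-is rather than retried, so only hangs trigger a restart.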