ceilometer/tests/test_bin.py can fail on slow machine
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ceilometer | Won't Fix | Low | Unassigned |
Bug Description
test_bin.py spawns a variety of subprocesses for the various "bins" (scripts) that ceilometer has. One subprocess contains an API server, and the test makes requests to /v2/meters to confirm a 200 response and an empty JSON list.
This test can sometimes fail:
self.
AttributeError: 'NoneType' object has no attribute 'status'
This happens because the HTTP request to the server gets connection refused more than 10 times: the server does not finish initializing before a for loop with a 0.5-second sleep between HTTP requests completes.
The lame fix for this is to increase either the timeout or the number of loops, but that doesn't strike me as robust. In fact I'd go so far as to say any test (in a high-latency environment) that is contingent on a timeout is a pretty bad smell. A slightly less-bad option may be to sleep a bit (in the parent) before entering the loop. Still pretty icky.
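One hedged alternative to a fixed attempt count: poll the TCP socket directly against a deadline, so a slow machine gets as many retries as it needs while a fast machine proceeds immediately. This is an illustrative sketch, not code from the ceilometer test suite; the name `wait_for_server` is made up here.

```python
import socket
import time


def wait_for_server(host, port, timeout=30.0, interval=0.1):
    """Return True once a TCP connection to (host, port) succeeds,
    or False if the deadline passes first."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # Connection success means the server socket is listening,
            # so the test can start issuing real HTTP requests.
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Connection refused or timed out: server not up yet.
            time.sleep(interval)
    return False
```

This still has a timeout, but the timeout now only bounds the worst case rather than dictating the pass/fail behavior on every run.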
So what's the other option? Not really clear. The test's goal is to check the viability of the ceilometer-api console script. Is that really necessary? Are we interested in that specifically, or are we more concerned that the WSGI interface works as designed? I don't know. Anyone?
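If the real concern is the WSGI interface rather than the console script, the app could be exercised in-process with no subprocess and no timeout at all. A minimal sketch, assuming a generic WSGI callable (`app` below is a stand-in, not ceilometer's actual API application):

```python
import json
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    # Stand-in application: returns an empty JSON list,
    # like /v2/meters on a fresh deployment.
    start_response("200 OK", [("Content-Type", "application/json")])
    return [json.dumps([]).encode("utf-8")]


def get(application, path):
    """Invoke a WSGI app directly and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    body = b"".join(application(environ, start_response))
    return captured["status"], body
```

This tests the request/response contract deterministically; whether the console script itself launches would then be a separate, much smaller concern.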
Changed in ceilometer:
status: Triaged → Won't Fix
In my specific case the slowness is caused by 0.0.0.0 resolving very slowly in a call to `socket.getfqdn()`. It can be resolved by making some changes to the local `/etc/hosts`.
However, the same point remains: the test is very fragile because of the way it handles timeouts and uses subprocesses. Is this wise?