End-to-end check intermittently failing
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
pkgme service | Fix Released | Critical | Jonathan Lange | | |
Bug Description
For Canonicalers: https:/
Every 20 minutes we run an end-to-end check on the Canonical production & staging pkgme-services. The end-to-end check consists of requesting that a PDF (a copy of "The Jabberwocky") be packaged.
Sometimes the check fails with "Connection refused", and we don't know why.
Here's the full error::
Submitting success info to 'http://
[2012-08-14 05:00:58,672: ERROR/MainProcess] Task djpkgme.
Traceback (most recent call last):
File "/usr/lib/
R = retval = task(*args, **kwargs)
File "/srv/pkgme-
logger=
File "/srv/pkgme-
return submit_
File "/srv/pkgme-
url, method='PUT', headers=headers, body=json_body)
File "/usr/lib/
(response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
File "/usr/lib/
(response, content) = self._conn_
File "/usr/lib/
conn.connect()
File "/usr/lib/
raise socket.error, msg
error: [Errno 111] Connection refused
We use Nagios to do the check. Because Nagios has built-in limits that prevent a check from taking more than 10s, and because we expect packaging to sometimes take more than 10s, we do the actual packaging request in a cron job. The Nagios check then looks at the stored output from the last cron run and evaluates it.
Both the submission and the evaluation are done using tools from lp:txpkgme: submit-
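To make the cron/Nagios split described above concrete, here is a minimal sketch in Python. It is not the lp:txpkgme implementation: the function names, the JSON file layout, and the 40-minute staleness threshold (twice the 20-minute cron interval) are all assumptions made for illustration. The only external fact relied on is the standard Nagios plugin exit-code convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import json
import time

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def record_result(path, success, detail, now=None):
    """Cron side (hypothetical): store the outcome of the slow packaging request."""
    now = time.time() if now is None else now
    with open(path, "w") as f:
        json.dump({"timestamp": now, "success": success, "detail": detail}, f)


def evaluate_result(path, max_age_seconds=40 * 60, now=None):
    """Nagios side (hypothetical): fast check that only reads the stored output.

    Returns an (exit_code, message) pair. The 40-minute default is an
    assumed threshold, not a value taken from the real deployment.
    """
    now = time.time() if now is None else now
    try:
        with open(path) as f:
            result = json.load(f)
        timestamp = result["timestamp"]
        success = result["success"]
        detail = result["detail"]
    except (OSError, ValueError, KeyError) as e:
        # Missing, unreadable, or malformed file: we cannot say anything.
        return UNKNOWN, "UNKNOWN: cannot read stored result: %s" % e

    age = now - timestamp
    if age > max_age_seconds:
        # The cron job has stopped producing fresh results.
        return CRITICAL, "CRITICAL: last run is %d seconds old" % age
    if not success:
        return CRITICAL, "CRITICAL: %s" % detail
    return OK, "OK: %s" % detail
```

In this sketch the Nagios plugin would print the returned message and exit with the returned code, so the check itself stays well under the 10s limit regardless of how long packaging took. Treating a stale result as a failure guards against the cron job silently dying.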
summary:
- End-to-end check intermittently failing with "Connection refused"
+ End-to-end check intermittently failing
Changed in pkgme-service:
status: Triaged → In Progress
assignee: nobody → Jonathan Lange (jml)
Changed in pkgme-service:
status: In Progress → Fix Released
Bug 1038967 makes this more difficult to debug. I recommend that we deploy a workaround or fix for that bug as soon as possible so that we can better debug this problem.