Oops report does not always log timed-out query
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
Launchpad itself |
Fix Released
|
High
|
Stuart Bishop |
Bug Description
Timeouts on database queries produce incomplete oops reports in about two-thirds of the cases I've looked at.
In these cases, the query that actually timed out is not logged. This causes several inconveniences:
* The oops report seems to count the query's execution time as "non-sql time."
* Oops summaries sometimes list the last-logged query instead of the failing one as the point of timeout.
* It's not easy to see how long the timed-out query really took.
Examples: OOPS-1085D2293, OOPS-1085C2639, OOPS-1085D2607, OOPS-1085E2170, OOPS-1085F1669, OOPS-1085A2372. The last query in the log is not the same one in the timeout message, and the last query's start time is nowhere near the timeout limit.
It's not consistent though: in the remaining one-third of cases the query was logged. Could this happen because the timed-out query took even longer than the infrastructure accounts for? Or could the timeout happen completely between the time when the ORM issues the query and the time it's actually logged?
Related branches
- Jeroen T. Vermeulen (community): Approve
- Diff: None lines
Changed in launchpad-foundations: | |
assignee: | nobody → Stuart Bishop (stub) |
Changed in launchpad-foundations: | |
status: | Triaged → In Progress |
Changed in launchpad-foundations: | |
status: | Fix Committed → Fix Released |
Danilo brought up today OOPS-1174S166. The query shown in the traceback doesn't appear in the SQL statement log.
Initially I thought it could be a problem rendering the SQL statements from the raw oops report, but after checking it, it's indeed not recorded in the raw report.