Supervisors must detect and document failed OQ jobs
Bug #809231 reported by Muharem Hrnjadovic
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
OpenQuake (deprecated) | Fix Released | High | Gabriele Favalessa | |
Bug Description
This is the responsibility of the OQ job's supervisor:
When the machines/workers calculating a job emit log records with errors/failures, the job is to be terminated and marked as failed in the postgres database.
Brief and detailed error messages are also to be stored in the db so that the various front-ends can display them to the end users.
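The behaviour described above could be sketched roughly as follows. This is only an illustration, not the fix that was released: the `FailureRecorder` class, the `job` and `error_msg` tables, and the `terminate` callback are all hypothetical names, and an in-memory SQLite database stands in for postgres so that the sketch runs on its own.

```python
import logging
import sqlite3


def make_demo_db():
    """Create a stand-in database (SQLite here, postgres in the real system)."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per job, one row per recorded error.
    conn.execute("CREATE TABLE job (id INTEGER PRIMARY KEY, status TEXT)")
    conn.execute(
        "CREATE TABLE error_msg (job_id INTEGER, brief TEXT, detailed TEXT)")
    return conn


class FailureRecorder(logging.Handler):
    """Marks a job as failed when an ERROR (or worse) log record arrives."""

    def __init__(self, conn, job_id, terminate):
        super().__init__(level=logging.ERROR)
        self.conn = conn
        self.job_id = job_id
        self.terminate = terminate  # hypothetical callback that stops the job
        self.failed = False

    def emit(self, record):
        if self.failed:
            return  # only the first failure changes the job status
        self.failed = True
        brief = record.getMessage()
        detailed = self.format(record)  # includes the traceback, if any
        with self.conn:  # commit both statements together
            self.conn.execute(
                "UPDATE job SET status = 'failed' WHERE id = ?",
                (self.job_id,))
            self.conn.execute(
                "INSERT INTO error_msg (job_id, brief, detailed) "
                "VALUES (?, ?, ?)",
                (self.job_id, brief, detailed))
        self.terminate(self.job_id)
```

A supervisor would attach one such handler to the logger that receives the workers' records for a given job; front-ends could then read `error_msg` to show the brief or detailed text.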
tags: | added: database error-feedback |
summary: |
- A job must be marked as failed upon seeing log records with errors/failures
+ Failed/crashed OQ jobs must have their status and error info in the postgres db updated |
Changed in openquake: | |
status: | New → Confirmed |
importance: | Undecided → High |
milestone: | none → 0.4.2 |
description: | updated |
tags: |
added: job-supervision user-interface
removed: error-feedback |
summary: |
- Failed/crashed OQ jobs must have their status and error info in the postgres db updated
+ Supervisors must detect and document failed/crashed OQ jobs |
description: | updated |
summary: |
- Supervisors must detect and document failed/crashed OQ jobs
+ Supervisors must detect and document failedOQ jobs |
summary: |
- Supervisors must detect and document failedOQ jobs
+ Supervisors must detect and document failed OQ jobs |
description: | updated |
Changed in openquake: | |
milestone: | 0.4.2 → 0.4.3 |
Changed in openquake: | |
assignee: | nobody → Gabriele Favalessa (favalex) |
Changed in openquake: | |
status: | Confirmed → In Progress |
Changed in openquake: | |
status: | Fix Committed → Fix Released |
The supervisor must also terminate the failed job and clean up after it (revoke pending celery tasks, trigger redis garbage collection).
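A minimal sketch of that cleanup step, under stated assumptions: `clean_up_job` is a hypothetical helper, the `job:<id>:*` key pattern is invented for illustration, and both collaborators are passed in rather than imported. `revoke` could be a thin wrapper around Celery's `app.control.revoke`, and `kvs` any object offering `scan_iter(match=...)` and `delete(*keys)`, as redis-py clients do.

```python
def clean_up_job(job_id, task_ids, revoke, kvs):
    """Revoke the job's pending tasks and delete its cached keys.

    revoke: callable taking a task id (assumed wrapper around Celery's
            app.control.revoke).
    kvs:    redis-like client with scan_iter(match=...) and delete(*keys).
    Returns the number of keys that were deleted.
    """
    for task_id in task_ids:
        revoke(task_id)

    pattern = "job:%s:*" % job_id  # hypothetical key naming scheme
    keys = list(kvs.scan_iter(match=pattern))
    if keys:
        kvs.delete(*keys)
    return len(keys)
```

Injecting the two collaborators keeps the function usable without a running broker or redis server, which also makes it straightforward to exercise with simple fakes.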