After failed job redis at 100%, celery queue not draining
Bug #943292 reported by Muharem Hrnjadovic
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenQuake (deprecated) | Won't Fix | High | Muharem Hrnjadovic |
Bug Description
Observed this morning on the model facility cluster:
1 - job with hazard calculation failures aborts and is terminated
2 - redis load goes up and it uses 100% of cpu time
3 - the worker machines seem idle
4 - the remaining task messages of the dead job are *not* drained from the celery queue
A job started subsequently appears hung. Redis is unresponsive, which is likely the root cause of the problem.
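To confirm the "redis is not responsive" part of the observation from a worker or controller machine, a probe along the following lines may help. This is only a sketch: the function name `redis_responds`, the default host and port, and the 2 second timeout are assumptions for illustration, not part of OpenQuake or of the cluster configuration. It relies only on the fact that a redis server answers an inline `PING` with `+PONG`.

```python
import socket


def redis_responds(host="localhost", port=6379, timeout=2.0):
    """Return True if a redis server answers PING with +PONG within `timeout` seconds.

    Hypothetical helper for diagnosis only; host, port and timeout are assumed values.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            # Inline command form of the redis protocol.
            conn.sendall(b"PING\r\n")
            reply = conn.recv(64)
    except OSError:
        # Connection refused, host unreachable, or no answer before the timeout.
        return False
    # A server that requires authentication replies with an error, which is
    # treated as "not responding normally" here.
    return reply.startswith(b"+PONG")


if __name__ == "__main__":
    print("redis responsive:", redis_responds())
```

A `False` result while the redis process is pinned at 100% CPU would be consistent with the behaviour described above, although it does not by itself explain why the celery queue is not drained.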
Changed in openquake:
milestone: 0.6.0 → 0.6.1
tags: added: mfcluster
Changed in openquake:
status: Confirmed → Won't Fix
This can probably only be reproduced on the model facility cluster.