Our software provides no reliable indication of progress

Bug #890703 reported by Muharem Hrnjadovic
This bug affects 1 person
Affects: OpenQuake (deprecated)
Status: Fix Released
Importance: Critical
Assigned to: Muharem Hrnjadovic

Bug Description

The lack of a reliable progress indication makes it impossible to analyze the problems we are experiencing when running large-scale computations on the gemsun cluster. See e.g. bug #881894 and bug #890405.

Muharem Hrnjadovic (al-maisan) wrote :

We should explore the possibility of maintaining progress counters for various subtasks in Redis.
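
A minimal sketch of what such counters might look like, not the implementation that was eventually released. The key layout (`progress:<job_id>:<subtask>:<field>`), the `ProgressCounters` class and the `InMemoryStore` stand-in are all hypothetical and exist only for illustration. The stand-in implements just `set`, `incr` and `get`, which the redis-py client (`redis.Redis`) also provides, so a real client could be passed in its place.

```python
class InMemoryStore:
    """Hypothetical stand-in for a Redis client (only the calls used below)."""

    def __init__(self):
        self._data = {}

    def set(self, key, value):
        self._data[key] = int(value)

    def incr(self, key, amount=1):
        # Like Redis INCRBY: a missing key is treated as 0.
        self._data[key] = self._data.get(key, 0) + amount
        return self._data[key]

    def get(self, key):
        return self._data.get(key)


class ProgressCounters:
    """Per-job, per-subtask progress counters (assumed key scheme)."""

    def __init__(self, store, job_id):
        self.store = store
        self.job_id = job_id

    def _key(self, subtask, field):
        return "progress:%s:%s:%s" % (self.job_id, subtask, field)

    def start(self, subtask, total):
        # Record how many work items are expected and reset the counter.
        self.store.set(self._key(subtask, "total"), total)
        self.store.set(self._key(subtask, "done"), 0)

    def mark_done(self, subtask, count=1):
        # Called by a worker after it finishes `count` work items.
        return self.store.incr(self._key(subtask, "done"), count)

    def report(self, subtask):
        # Missing keys are reported as 0 rather than raising.
        done = int(self.store.get(self._key(subtask, "done")) or 0)
        total = int(self.store.get(self._key(subtask, "total")) or 0)
        return done, total


progress = ProgressCounters(InMemoryStore(), job_id=42)
progress.start("hazard_curves", total=1000)
progress.mark_done("hazard_curves", 250)
progress.mark_done("hazard_curves", 250)
print(progress.report("hazard_curves"))  # (500, 1000)
```

With counters like these, a stalled job could be inspected by reading the "done" and "total" values directly instead of mining the logs.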

Changed in openquake:
status: New → Confirmed
importance: Undecided → Critical
tags: added: celery devop monitoring rabbitmq sys-quality
Muharem Hrnjadovic (al-maisan) wrote :

Right now I am observing stalled hazard curve computations on the gemsun cluster. When they occur, I have no idea how many hazard curves were actually computed (data mining the logs is error-prone and leads to contradictions).

This prevents me from understanding what is going on and finding a solution.

tags: removed: celery rabbitmq
Muharem Hrnjadovic (al-maisan) wrote :
Changed in openquake:
assignee: nobody → Muharem Hrnjadovic (al-maisan)
milestone: none → 0.4.6
Changed in openquake:
status: Confirmed → In Progress
Changed in openquake:
status: In Progress → Fix Committed
Changed in openquake:
status: Fix Committed → Fix Released