Comment 1 for bug 2072790

Galen Charlton (gmc) wrote:

One way to deal with this is to arrange for fast reports to be handled by dedicated Clark worker processes. In particular, two new configuration parameters could be added to Clark:

- maximum amount of time that a report's SQL query may run and still be considered "fast". This would default to a low value such as 5 seconds.
- number of workers to reserve for handling fast reports

Each scheduled report run would initially be assumed to be fast, as tracked by a new Boolean column on reporter.schedule, is_slow, that defaults to false.

Each time that Clark polls to see if it should spawn a worker to run a report, it would compare the number of free worker slots (its maximum concurrency minus the number of workers already running a report) to the number of workers reserved for fast reports, i.e., it would test whether $concurrency - $current_running <= $num_reserved. If that condition is met, only reserved slots remain, so the first scheduled fast report would be selected to run (i.e., one where is_slow IS FALSE). Otherwise, the first pending scheduled report of any speed would be selected.
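The selection rule above can be sketched as follows. This is only an illustration in Python, not Clark's actual code; the names PendingReport, only_fast_allowed, and pick_next are hypothetical. The sketch assumes the pending list is already ordered by scheduled time, and it takes the condition literally: when only reserved slots remain and no fast report is pending, nothing is selected. Whether Clark should instead fall back to a slow report in that case is not settled here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PendingReport:
    schedule_id: int
    is_slow: bool  # mirrors the proposed reporter.schedule.is_slow column


def only_fast_allowed(concurrency: int, current_running: int, num_reserved: int) -> bool:
    """True when the remaining free slots are all reserved for fast reports."""
    return concurrency - current_running <= num_reserved


def pick_next(
    pending: List[PendingReport],
    concurrency: int,
    current_running: int,
    num_reserved: int,
) -> Optional[PendingReport]:
    """Return the report to run on this poll, or None if nothing should start."""
    if current_running >= concurrency:
        return None  # no free worker slot at all
    if only_fast_allowed(concurrency, current_running, num_reserved):
        # Only reserved slots remain: take the first pending fast report, if any.
        return next((r for r in pending if not r.is_slow), None)
    # Unreserved slots remain: take the first pending report of any speed.
    return pending[0] if pending else None
```

For example, with a concurrency of 4, 1 reserved worker, and 3 reports already running, pick_next skips a slow report at the head of the queue and returns the first fast one.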

When a worker runs a "fast" report, a statement_timeout would be set to the maximum amount of time allowed for fast reports. If the report's query times out, the reporter.schedule row would be reset by setting start_time to NULL and is_slow to TRUE.
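A minimal sketch of that timeout-and-demote step, again in Python for illustration only. ScheduleRow, QueryTimeout, and run_report are hypothetical names; the execute callable stands in for running the report's SQL with the given statement_timeout (None meaning no limit), and QueryTimeout stands in for whatever error the database driver raises when the timeout cancels the query.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Optional


class QueryTimeout(Exception):
    """Stand-in for the error raised when statement_timeout cancels a query."""


@dataclass
class ScheduleRow:
    start_time: Optional[datetime] = None
    is_slow: bool = False


def run_report(
    row: ScheduleRow,
    execute: Callable[[Optional[float]], None],
    fast_limit_seconds: float,
) -> bool:
    """Run one scheduled report; return True if it completed, False if it was demoted."""
    row.start_time = datetime.now()
    # Only reports still presumed fast get a timeout; slow ones run unbounded.
    timeout = None if row.is_slow else fast_limit_seconds
    try:
        execute(timeout)
    except QueryTimeout:
        if row.is_slow:
            raise  # a slow report has no timeout, so this would be unexpected
        # Reset the row so it is picked up again later, this time as a slow report.
        row.start_time = None
        row.is_slow = True
        return False
    return True
```

A report demoted this way goes back into the pending queue and is then eligible only for the unreserved worker slots.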

With this approach, setting the concurrency to 4 and the number reserved to 1 would allow up to 3 slow reports to be run in parallel while reserving a worker slot for the fast reports to continue to flow.

If the concurrency is left at just 1 and the number reserved is set to 1, the effect would be that any available fast reports are run first, but once Clark gets to a slow report, that report would block further reports until it completes. If the concurrency is set to 1 and the number reserved to 0, there would be no change in behavior compared to the status quo: reports would simply be processed in order of their scheduled time.

Mike Rylander contributed to this idea.