Disaggregation calculator: improve the performance

Bug #1279247 reported by Michele Simionato
Affects: OpenQuake Engine
Status: Fix Released
Importance: Critical
Assigned to: Michele Simionato

Bug Description

After discussion with Damiano, it turned out that we need to rewrite the calculator so that it uses the functions in hazardlib correctly, without wasting cycles by recomputing things several times.

Michele Simionato (michele-simionato) wrote :

Currently the disaggregation computation is split into two phases:

Phase 1: compute_hazard_curves

This is the same as in the classical calculator. The computation
is split among sources and it works well. No changes are required.

Phase 2: compute_disagg

Perform the disaggregation. Notice that the number of generated
tasks is #sites * #realizations and the number of sites is very small:
in practice, unless there are lots of realizations, most of the available
cores sit idle and the disaggregation is extremely slow.

The proposal is to change the workflow to three phases:

Phase 1: compute_hazard_curves

Same as now.

Phase 2: collect_bins

Parallelized by sources exactly as phase 1. It calls `disaggregate_poe`
and fills a dictionary with key (rlz_id, site, poe, iml, im_type, sa_period,
sa_damping) and values (mag_bins, dist_bins, lon_bins, lat_bins, trt_bins,
eps_bins). The number of tasks generated is the same as in phase 1.
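
A minimal sketch of the dictionary described above, assuming that each task
returns a partial dictionary and that the bins for the same key are plain
lists that can be concatenated. The names `BinKey`, `BinData` and
`merge_bins` are hypothetical and are not part of hazardlib; the call to
`disaggregate_poe` is not shown.

```python
from collections import namedtuple

# Key and value layouts follow the tuples listed in the comment above.
BinKey = namedtuple(
    'BinKey', 'rlz_id site poe iml im_type sa_period sa_damping')
BinData = namedtuple(
    'BinData', 'mag_bins dist_bins lon_bins lat_bins trt_bins eps_bins')


def merge_bins(acc, partial):
    """Merge the partial dictionary returned by one task into `acc`.

    Assumption: each field of BinData is a list, so merging two
    entries with the same key means concatenating field by field.
    """
    for key, data in partial.items():
        if key in acc:
            old = acc[key]
            acc[key] = BinData(*(a + b for a, b in zip(old, data)))
        else:
            acc[key] = data
    return acc
```

Using a tuple as the key keeps the accumulator hashable and easy to merge in
the controller, whatever order the tasks finish in.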

Phase 3: arrange_and_save_disagg_matrix

Build the disaggregation matrix from the bins, once for each result, in
parallel. There are #sites * #realizations * #disagg_poes * #IMLs
distinct results, so the parallelization is better than before.
For instance, if there are 10 IMLs and 2 disagg_poes you can use up to
20 times more cores. Of course, if there is only one IML, only 1 poe,
only 1 realization and only 1 site you will use only 1 core. This is
the only phase where the parallelization is not optimal.
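
The task counts quoted above can be checked with a few lines of Python. This
is only the arithmetic from the comment; `count_tasks` is a hypothetical
helper, not an engine function.

```python
def count_tasks(num_sites, num_rlzs, num_poes=1, num_imls=1):
    """Number of independent results, i.e. of tasks that can run in parallel."""
    return num_sites * num_rlzs * num_poes * num_imls


# Current phase 2: one task per (site, realization) pair.
old = count_tasks(num_sites=2, num_rlzs=1)

# Proposed phase 3: one task per (site, realization, poe, IML).
new = count_tasks(num_sites=2, num_rlzs=1, num_poes=2, num_imls=10)

ratio = new // old  # 20 with 10 IMLs and 2 disagg_poes
```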

Changed in oq-engine:
status: New → In Progress
importance: Undecided → Critical
assignee: nobody → Michele Simionato (michele-simionato)
milestone: none → 1.0.1
Michele Simionato (michele-simionato) wrote :

The plan has been changed. Now we extract only the relevant sections of the full disaggregation matrix, which are extremely small. That means that the results can be saved sequentially on the controller node. Also, the collect_bins and arrange_bins operations are now done together in the same task.
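
A rough sketch of the revised workflow, under stated assumptions: each task
both collects and arranges its bins and returns only a small payload, and the
controller saves the payloads one at a time. Everything here is hypothetical
(`collect_and_arrange`, `run`, the `'mags'` field and the toy summary stand in
for the real extraction, which is not shown in this report), and a thread pool
is used only to keep the example self-contained.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_and_arrange(block):
    """One task: collect the bins for a block of sources and reduce them
    immediately to a small summary (toy stand-in for extracting the
    relevant sections of the disaggregation matrix)."""
    mags = [mag for src in block for mag in src['mags']]
    return {'n_ruptures': len(mags), 'mag_range': (min(mags), max(mags))}


def run(blocks, save):
    """Controller: tasks run in parallel, results are saved sequentially."""
    with ThreadPoolExecutor() as pool:
        for summary in pool.map(collect_and_arrange, blocks):
            save(summary)  # small payload, so a sequential save is cheap


blocks = [[{'mags': [5.0, 6.5]}], [{'mags': [7.0]}]]
saved = []
run(blocks, saved.append)
```

Because the returned payloads are small, the sequential save in the controller
should not become the bottleneck, which is the point made in the comment.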

Changed in oq-engine:
status: In Progress → Fix Committed
Changed in oq-engine:
status: Fix Committed → Fix Released