OpenQuake Engine

Disaggregation calculator: improve the performance

Bug #1279247 reported by Michele Simionato on 2014-02-12

This bug affects 1 person

Affects	Status	Importance	Assigned to	Milestone
OpenQuake Engine	Fix Released	Critical	Michele Simionato	OpenQuake Engine 1.2.0

Bug Description

After discussion with Damiano it turned out that we actually need to rewrite the calculator so that it uses the functions in hazardlib correctly, without wasting cycles by recomputing the same things several times.


Michele Simionato (michele-simionato) wrote on 2014-03-12:

Currently the disaggregation computation is split into two phases:

Phase 1: compute_hazard_curves

This is the same as in the classical calculator. The computation
is split among sources and it works well. No changes are required.

Phase 2: compute_disagg

Perform the disaggregation. Note that the number of generated
tasks is #sites * #realizations, and the number of sites is very small:
in practice, unless there are many realizations, most of the available
cores sit idle and the disaggregation is extremely slow.
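
To make the problem concrete, here is a minimal sketch of the task count in the current phase 2. The function names and the 64-core figure are hypothetical, chosen only for illustration; they are not part of the engine.

```python
def old_phase2_task_count(num_sites, num_realizations):
    """Tasks generated by the current compute_disagg phase:
    one task per (site, realization) pair."""
    return num_sites * num_realizations


def idle_cores(num_tasks, num_cores):
    """Cores left without work when there are fewer tasks than cores."""
    return max(num_cores - num_tasks, 0)


# Hypothetical job: 2 sites, 3 realizations, 64 available cores.
tasks = old_phase2_task_count(num_sites=2, num_realizations=3)
print(tasks, idle_cores(tasks, num_cores=64))
```

With so few tasks, most of the cores have nothing to do, which is the inefficiency described above.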

The proposal is to restructure the workflow into three phases:

Phase 1: compute_hazard_curves

Same as now.

Phase 2: collect_bins

Parallelized by source, exactly as in phase 1. It calls `disaggregate_poe`
and fills a dictionary with key (rlz_id, site, poe, iml, im_type, sa_period,
sa_damping) and values (mag_bins, dist_bins, lon_bins, lat_bins, trt_bins,
eps_bins). The number of tasks generated is the same as in phase 1.
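
The dictionary described above could look roughly like the following sketch. `BinKey` and `merge_bins` are hypothetical names, and merging per-task results by concatenating the bin lists is an assumption made for illustration, not necessarily what the engine does.

```python
from collections import namedtuple

# Hypothetical key type mirroring the tuple described in phase 2.
BinKey = namedtuple(
    "BinKey", "rlz_id site poe iml im_type sa_period sa_damping")


def merge_bins(acc, partial):
    """Merge the bins returned by one task into the accumulator.

    Each value is a 6-tuple of lists:
    (mag_bins, dist_bins, lon_bins, lat_bins, trt_bins, eps_bins).
    Assumption: results from different tasks are combined by
    concatenating the corresponding lists.
    """
    for key, bins in partial.items():
        if key not in acc:
            # Copy the lists so the task result is not mutated later.
            acc[key] = tuple(list(b) for b in bins)
        else:
            for dest, src in zip(acc[key], bins):
                dest.extend(src)
    return acc


key = BinKey(rlz_id=1, site="site-0", poe=0.1, iml=0.25,
             im_type="SA", sa_period=0.1, sa_damping=5.0)
task_a = {key: ([6.5], [12.0], [10.1], [45.2], ["Active Shallow Crust"], [0.5])}
task_b = {key: ([7.0], [30.0], [10.3], [45.0], ["Active Shallow Crust"], [1.2])}
acc = merge_bins(merge_bins({}, task_a), task_b)
print(acc[key][0])  # magnitudes collected from both tasks
```

Because the key is a plain tuple, results coming from different source tasks can be merged on the controller without any coordination between the workers.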

Phase 3: arrange_and_save_disagg_matrix

Build the disaggregation matrix from the bins, once for each result, in
parallel. There are #sites * #realizations * #disagg_poes * #IMLs
distinct results, so the parallelization is better than before.
For instance, if there are 10 IMLs and 2 disagg_poes you will use
20 times more cores. Of course, if there is only one IML, only 1 poe,
only 1 realization and only 1 site you will use only 1 core. This is
the only phase where the parallelization is not optimal.
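
The arithmetic above can be checked with a short sketch. The helper name `phase3_results` and the sample values are hypothetical; the only thing taken from the proposal is that there is one result per (site, realization, disagg_poe, IML) combination.

```python
from itertools import product


def phase3_results(sites, realizations, disagg_poes, imls):
    """Enumerate the distinct results of phase 3, one per
    (site, realization, disagg_poe, IML) combination."""
    return list(product(sites, realizations, disagg_poes, imls))


sites = ["site-0"]
realizations = [0]
disagg_poes = [0.1, 0.02]
imls = list(range(10))  # placeholders standing in for 10 IMLs

new_tasks = len(phase3_results(sites, realizations, disagg_poes, imls))
old_tasks = len(sites) * len(realizations)
print(new_tasks, new_tasks // old_tasks)
```

With 1 site and 1 realization the old scheme has a single task, while the new one has 20, matching the factor quoted above.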

Changed in oq-engine:
status:	New → In Progress
importance:	Undecided → Critical
assignee:	nobody → Michele Simionato (michele-simionato)
milestone:	none → 1.0.1


Michele Simionato (michele-simionato) wrote on 2014-03-25:

https://github.com/gem/oq-hazardlib/pull/187
https://github.com/gem/oq-engine/pull/1390


Michele Simionato (michele-simionato) wrote on 2014-03-27:

The plan has changed. We now extract only the relevant sections of the full disaggregation matrix, which are extremely small. This means that the results can be saved sequentially on the controller node. In addition, the collect_bins and arrange_bins operations are now performed together in the same task.
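
The shape of the revised workflow can be sketched as follows. This is only an illustration under stated assumptions: `collect_and_arrange` and `run` are hypothetical names, a thread pool stands in for the engine's task distribution, and counting ruptures per integer magnitude bin is a toy stand-in for extracting the relevant sections of the disaggregation matrix.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_and_arrange(source_block):
    """Hypothetical combined task: collect the bin data for a block of
    sources and return only a small summary of it."""
    summary = {}
    for _source_id, mag in source_block:
        mag_bin = int(mag)  # toy binning, width 1 magnitude unit
        summary[mag_bin] = summary.get(mag_bin, 0) + 1
    return summary


def run(blocks):
    # Tasks run in parallel, one per block of sources.
    with ThreadPoolExecutor(max_workers=4) as pool:
        partials = list(pool.map(collect_and_arrange, blocks))
    # The partial results are small, so the controller can
    # combine and save them sequentially.
    saved = {}
    for partial in partials:
        for mag_bin, count in partial.items():
            saved[mag_bin] = saved.get(mag_bin, 0) + count
    return saved


blocks = [[("src-a", 6.2), ("src-b", 6.8)], [("src-c", 7.1)]]
print(run(blocks))
```

The point of the sketch is the division of labour: the expensive work stays in the parallel tasks, and only small results travel back to the controller.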

Michele Simionato (michele-simionato) on 2014-04-01

Changed in oq-engine:
status:	In Progress → Fix Committed

Daniele Viganò (daniele-vigano) on 2014-12-15

Changed in oq-engine:
status:	Fix Committed → Fix Released
