Remove the duplicate entries in hzrdr.ses_rupture

Bug #1266641 reported by Michele Simionato
This bug affects 1 person
     Affects      |    Status    | Importance |    Assigned to    | Milestone
------------------+--------------+------------+-------------------+-----------
 OpenQuake Engine | Fix Released | High       | Michele Simionato |

Bug Description

In the event-based calculator, the generated ruptures ARE THE SAME FOR ALL SESs, yet we store many copies of them for no reason. The sensible solution is to store each rupture only once and to add a repetition number that counts how many copies are generated: that number is the one returned by the method rupture.sample_number_of_occurrences(). Note that this inefficiency HAS AN IMPACT ON OUR SPONSORS: I discovered it while analyzing the poor performance of the Hannover RE South East Asia computation. With the change proposed here, using 10,000 SESs with an investigation time of 1 year (what the sponsor wants) is nearly as efficient as using 1 SES of 10,000 years.
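The proposal above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: `FakeRupture`, `store_with_copies` and `store_once` are hypothetical names, the rows are plain Python lists rather than records in hzrdr.ses_rupture, and the stand-in `sample_number_of_occurrences` takes a random generator argument that the engine's real method may not take.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class FakeRupture:
    """Hypothetical stand-in for a rupture; not the engine's class."""
    rupture_id: str
    mean_occurrences: float  # expected occurrences in the investigation time

    def sample_number_of_occurrences(self, rng):
        # Knuth's Poisson sampler; the real method's signature may differ.
        threshold = math.exp(-self.mean_occurrences)
        count, product = 0, rng.random()
        while product > threshold:
            count += 1
            product *= rng.random()
        return count


def store_with_copies(ruptures, rng):
    """Old behaviour: one stored row per generated copy."""
    rows = []
    for rup in ruptures:
        rows.extend([rup.rupture_id] * rup.sample_number_of_occurrences(rng))
    return rows


def store_once(ruptures, rng):
    """Proposed behaviour: one row per rupture plus a repetition number."""
    rows = []
    for rup in ruptures:
        n = rup.sample_number_of_occurrences(rng)
        if n > 0:  # ruptures that never occur are not stored at all
            rows.append((rup.rupture_id, n))
    return rows


ruptures = [FakeRupture(f"rup-{i}", 2.0) for i in range(5)]
copies = store_with_copies(ruptures, random.Random(42))
unique = store_once(ruptures, random.Random(42))
print(len(copies), "rows with copies;", len(unique), "rows stored once")
```

With the same seed both functions draw the same occurrence counts, so the repetition numbers in `unique` add up to the number of rows in `copies`: no information is lost, only duplicate rows.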

description: updated
Changed in oq-engine:
assignee: nobody → Michele Simionato (michele-simionato)
importance: Undecided → High
milestone: none → 1.0.1
Michele Simionato (michele-simionato) wrote :

Here is an analysis performed on gemsun02, in the branch removing-ruptures-2 (date 2014-02-22), for Miriam's island:

         operation          | tot_duration | pymemory | counts
----------------------------+--------------+----------+--------
 total compute_ses_and_gmfs | 11,716       | 1,855    | 57
 saving ses                 | 5,128        | None     | 57
 computing gmfs             | 3,823        | None     | 57
 saving gmfs                | 2,578        | 24       | 57
 generating ruptures        | 105          | None     | 57
 filtering ruptures         | 44           | None     | 57
 initialize_sources         | 12           | 4.094    | 1

As you can see, "saving ses" is the dominant operation, taking nearly half of the total time (and it is worse on the cluster).
That time is wasted saving copies: of the 243,202 ruptures stored, only 11,091 are distinct, so 95.4% of them are copies.
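As a quick check of the figures quoted in this comment, the shares can be recomputed from the reported numbers. The variable names are illustrative only; the values are copied from the comment and its table.

```python
total_ruptures = 243202      # ruptures stored, copies included
distinct_ruptures = 11091    # ruptures that are actually different

copies = total_ruptures - distinct_ruptures
copy_share = copies / total_ruptures

saving_ses = 5128            # "saving ses" tot_duration from the table
total_time = 11716           # "total compute_ses_and_gmfs" tot_duration
saving_share = saving_ses / total_time

print(f"copies: {copies} ({copy_share:.1%} of stored ruptures)")
print(f"saving ses: {saving_share:.1%} of the total time")
```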

Michele Simionato (michele-simionato) wrote :

Here is an analysis with the branch removing-ruptures-3, where copies are not saved:

         operation          |   tot_duration   |       pymemory        | counts
----------------------------+------------------+-----------------------+--------
 total compute_ses_and_gmfs | 8048.98307180225 | 1854.4101562500000000 | 57
 computing gmfs             | 3686.07636260986 |                       | 57
 saving gmfs                | 3143.40093994196 | 23.0078125000000000   | 57
 saving ses                 | 509.088505744934 |                       | 57
 generating ruptures        | 106.924173355103 |                       | 57
 filtering ruptures         | 45.477959394455  |                       | 57
 initialize_sources         | 12.2058050632    | 4.2187500000000000    | 1

I was able to speed up the rupture saving time by a factor of 10.
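The factor of 10 can be checked against the two profiling runs reported in this thread. This is only arithmetic on the posted figures; the first run's values are rounded as reported, and the variable names are illustrative.

```python
# "saving ses" tot_duration, before (removing-ruptures-2) and after (removing-ruptures-3)
saving_ses_before = 5128
saving_ses_after = 509.088505744934

# "total compute_ses_and_gmfs" tot_duration for the same two runs
total_before = 11716
total_after = 8048.98307180225

saving_speedup = saving_ses_before / saving_ses_after
total_reduction = 1 - total_after / total_before

print(f"saving ses speedup: {saving_speedup:.1f}x")
print(f"total time reduction: {total_reduction:.1%}")
```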

Changed in oq-engine:
status: New → In Progress
Michele Simionato (michele-simionato) wrote :
description: updated
Changed in oq-engine:
status: In Progress → Fix Committed
Changed in oq-engine:
status: Fix Committed → Fix Released