Optimize SiteCollection.expand()

Bug #1094297 reported by Lars Butler
Affects: OpenQuake (deprecated)
Status: Fix Released
Importance: Critical
Assigned to: Lars Butler

Bug Description

We've found that large calculations, spanning roughly 95k sites of interest, take a really long time to complete.

The attached files detail a test scenario and the resulting profiling report. The case was derived from one of the SHARE calculations, using the first area source defined in the source model and all of the same input parameters.

Lars Butler (lars-butler) wrote :

This is the Python script used to run the test scenario. For profiling, it was run as follows:

$ python -m cProfile -s time profile.py
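
For readers who want the same kind of report from inside a script, here is a minimal sketch using the standard-library cProfile and pstats modules. The workload() function is a hypothetical stand-in, not the contents of profile.py; sorting by "time" matches the -s time option in the command above.

```python
import cProfile
import io
import pstats


def workload():
    # Hypothetical stand-in for the real calculation in profile.py.
    return sum(i * i for i in range(10000))


# Collect profiling data around the call of interest.
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Sort by internal time (same ordering as "-s time") and keep the top entries.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("time").print_stats(5)
report = stream.getvalue()
print(report)
```

The top rows of such a report show which functions dominate the run time.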

Changed in openquake:
status: New → In Progress
importance: Undecided → Critical
assignee: nobody → Lars Butler (lars-butler)
milestone: none → 0.9.0
Lars Butler (lars-butler) wrote :

Here is the initial profiling report. Note that the major bottleneck here is the 2.3 million calls to numpy.put(). 82% of the total time was consumed doing this operation, which occurs in SiteCollection.expand() (see https://github.com/gem/nhlib/blob/1ce300b8002a540c8cf7b76ef5714715aa4c90a5/nhlib/site.py#L282).

Lars Butler (lars-butler) wrote :

This patch appears to give a significant performance boost. Running the same test (profile.py), performance improved by a factor of 6.

<patch>
--- a/nhlib/site.py
+++ b/nhlib/site.py
@@ -278,8 +278,8 @@ class SiteCollection(object):
         num_values = data.shape[1]
         result = numpy.empty((total_sites, num_values))
         result.fill(placeholder)
-        for i in xrange(num_values):
-            result[:, i].put(self.indices, data[:, i])
+        for i, idx in enumerate(self.indices):
+            result[idx] = data[i]
         return result

     def filter(self, mask):
</patch>
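
To illustrate what the patched loop computes, here is a small standalone sketch, assuming data has one row per filtered site and indices holds the unique positions of those sites in the full collection. The function names, the sample arrays, and the single-assignment variant are illustrative assumptions and not part of nhlib or of the committed fix.

```python
import numpy


def expand_loop(indices, data, total_sites, placeholder):
    """Row-by-row expansion, mirroring the loop in the patch."""
    result = numpy.empty((total_sites, data.shape[1]))
    result.fill(placeholder)
    for i, idx in enumerate(indices):
        result[idx] = data[i]
    return result


def expand_vectorized(indices, data, total_sites, placeholder):
    """Same expansion as one indexed assignment (assumes unique indices)."""
    result = numpy.full((total_sites, data.shape[1]), placeholder, dtype=float)
    result[indices] = data
    return result


# Three filtered sites out of six, two values per site (made-up numbers).
indices = numpy.array([1, 3, 4])
data = numpy.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

loop_result = expand_loop(indices, data, total_sites=6, placeholder=0.0)
vec_result = expand_vectorized(indices, data, total_sites=6, placeholder=0.0)
print(numpy.array_equal(loop_result, vec_result))
```

Rows not listed in indices keep the placeholder value. Whether the single-assignment form would be faster than the patched loop for the 95k-site case has not been measured here.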

Lars Butler (lars-butler) wrote :

Here is the result from the post-optimization profiling report.

The patch above passed all nhlib tests; I still have to run oq-engine QA tests to ensure that they are all happy.

Lars Butler (lars-butler) wrote :
Changed in openquake:
status: In Progress → Fix Committed
tags: added: hazard optimization
Changed in openquake:
status: Fix Committed → Fix Released