OpenQuake Engine

Improve the "parallelize" distribution mechanism

Bug #1245747 reported by Michele Simionato on 2013-10-29

This bug affects 1 person

Affects		Status	Importance	Assigned to	Milestone
	OpenQuake Engine	Fix Released	High	Michele Simionato	OpenQuake Engine 1.2.0

Bug Description

At the moment the parallelize method of the base calculator works by splitting the tasks to spawn into bunches. The second bunch starts only after the first bunch has completely finished, the third only after the second has finished, and so on. This is bad: suppose some sources are particularly slow to compute. Then we have to wait for the slow ones in every bunch. The solution is to generate a single bunch with a lot of tasks: we will still have to wait for the slow sources, but only once and not N times, where N is the number of bunches. It is important to notice that celery is perfectly capable of managing thousands of tasks in its queue, so sending everything to celery is a perfectly legitimate approach, unless we start sending millions of tasks, in which case all the time would be spent in networking. Building the arglist in memory is also perfectly possible: even with 100,000 sources the memory consumption is negligible (I checked).
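
To make the cost of the bunch-by-bunch approach concrete, here is a minimal sketch that simulates both strategies. It does not use celery or the engine's actual parallelize code: the functions makespan and makespan_in_bunches, the task durations, and the number of workers are all hypothetical, and the simulation assumes each free worker simply takes the next task from the queue.

```python
import heapq


def makespan(durations, workers):
    """Simulated time to finish all tasks when every free worker
    immediately takes the next task from a single shared queue."""
    free_at = [0.0] * workers  # min-heap: time at which each worker is free
    finish = 0.0
    for d in durations:
        start = heapq.heappop(free_at)  # earliest available worker
        end = start + d
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


def makespan_in_bunches(durations, workers, bunch_size):
    """Simulated time when a bunch starts only after the previous
    bunch has completely finished."""
    total = 0.0
    for i in range(0, len(durations), bunch_size):
        total += makespan(durations[i:i + bunch_size], workers)
    return total


# Hypothetical workload: 16 tasks, one slow task (10 units) in every 4.
durations = [10, 1, 1, 1] * 4
workers = 4

bunched = makespan_in_bunches(durations, workers, bunch_size=4)
single = makespan(durations, workers)
print("bunches of 4:", bunched)
print("single bunch:", single)
```

With these made-up numbers the bunched run takes 40 time units, because every bunch waits for its own slow task, while the single bunch takes 16, because the fast tasks keep the other workers busy while the slow ones run.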

Michele Simionato (michele-simionato) on 2013-10-29

description:	updated
Changed in oq-engine:
importance:	Undecided → High
status:	New → In Progress
assignee:	nobody → Michele Simionato (michele-simionato)

Michele Simionato (michele-simionato) on 2013-11-03

Changed in oq-engine:
status:	In Progress → Fix Committed
milestone:	none → 1.0.1

Daniele Viganò (daniele-vigano) on 2014-12-15

Changed in oq-engine:
status:	Fix Committed → Fix Released
