expand_indices does not scale
Bug #959052 reported by
Johan Hake
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
UFL |
Fix Released
|
Critical
|
Martin Sandve Alnæs |
Bug Description
Then memory is filled pretty fast for large forms. This happens for both SFC and FFC within the UFL function:
expand_indices
The memory consumption for the attached script peaks at more than 4.6 GB.
Changed in ufl: | |
importance: | Undecided → Critical |
status: | New → In Progress |
assignee: | nobody → Martin Sandve Alnæs (martinal) |
Changed in ufl: | |
milestone: | none → 1.1.0-alpha |
Changed in ufl: | |
status: | Fix Committed → Fix Released |
To post a comment you must log in.
I have tried to improve this situation as follows.
1) Avoid caching repr strings for all non-terminal expressions. This reduces memory usage drastically; memory usage was something like O(n^2) in number of nodes in an expression. This makes repr(expr) slower, because it must be reconstructed each time.
2) Avoid using repr in __eq__ and some other places. This means a == b will now potentially have to compare all subexpressions of a and b, and == can thus be costly for large and (almost) equal a and b. But comparing their large repr strings would probably be almost as costly anyway.
3) Improve hashing algorithm. The new algorithm causes significantly fewer collisions, and thus reduces the need to call == a lot in algorithms depending on dicts or sets of expressions.
Together these improvements are quite significant. But it is difficult to foresee side effects, as the performance characteristics of core operations have changed quite a bit. Therefore I'm reluctant to just push, and want some more testing first. I have not run the ffc and dolfin tests yet, will do that tomorrow.
Anyone who wants to test should get this branch:
bzr pull lp:~martinal/ufl/work
You will also need a tiny patch to ffc:
bzr pull lp:~martinal/ffc/trunk
Any ideas for improvement to the hashing algorithm (in operatorbase.py) and the __eq__ implementation (in exproperators.py) are welcome!
In addition, I found an old attempt at improving expand_indices, which I may take a look at later to see if we can get further reductions in run time.
Another improvement could be to implement a more compact signature string to write to generated code instead of the repr string, which becomes quite large in this case.