Comment 1 for bug 2020760

Revision history for this message
Anton Troyanov (troyanov) wrote (last edit ):

I don't think that added subquery introduced such a dramatic penalty (x3 as seen in performance tests) on its own.

Here is the [query](https://pastebin.ubuntu.com/p/Ny4VNtrfW4/) which is a simplified version of what Django [generates](https://pastebin.ubuntu.com/p/xdRYHwMyTR/) with removed WHERE filtering.

Original (with WHERE and retrieving all the columns):
`Execution Time: 1873.566 ms`

Simplified (without WHERE and most of the columns):
`Execution Time: 1567.970 ms`

Interesting part, if I completely remove the qubquery:
`Execution Time: 1458.419 ms`

When I started to analyze execution plan, there was one line that I think explains why this is slow `rows=1062792`:
```
 GroupAggregate (cost=904.26..31523.40 rows=505 width=16) (actual time=4.274..289.491 rows=505 loops=1)
   Group Key: maasserver_node.id
   -> Merge Left Join (cost=904.26..26425.12 rows=1018645 width=16) (actual time=4.024..213.045 rows=1062792 loops=1)
```

This is the cost for being able to filter on everything using LEFT JOIN, which gives an enormous data merge.
And then we do a GROUP BY over those entries, to group it back to number of machines (thats what makes this query expensive).

Also the fix was backported to 3.3 which is already released, but performance tests do not show any changes there.