Intermittent slowness on MAAS 2.8.1
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
MAAS | Fix Released | Undecided | Unassigned |
maas-ui | Fix Released | Unknown | |
Bug Description
Hey there,
We've been seeing intermittent slow page loads on our MAAS server.
Sometimes on the machines list, sometimes on the KVM list, sometimes on the Images list.
When we hit a very slow patch it can take 20-40 seconds to load one of the given pages.
Here's our setup:
3x SuperMicro X9DRT (blades), running Ubuntu 18.04 Server, 1x Samsung 850 EVO 500GB (boot drive), 2x Samsung 860 EVOs (mirrored zpool for LXDs), 64GB RAM, 10G LAN connection
3x LXDs each with a region+rack server (snap version)
3x LXDs each running PostgreSQL (deployed with juju using the postgresql charm version 208 [postgresql version 10.12]) as the database for MAAS.
176x Hardware Machines are checked into MAAS
~35x KVMs deployed across 77 KVM hosts (in MAAS)
Watching the usual system statistics has been no help (CPU/load, disk bandwidth and I/O, network bandwidth and I/O); as far as these go, they seem to be pretty quiet.
I've been struggling to find anything else that could be the cause, so I've been watching Postgres for slow queries (>5 seconds), but I have no idea if what I'm seeing is out of the ordinary or not:
```
2020-08-12 20:46:53 UTC [22532]: [3-1] db=maasdb,user=maas LOG: duration: 31784.074 ms fastpath function call: "lo_unlink" (OID 964)
2020-08-12 20:47:32 UTC [40006]: [3-1] db=maasdb,user=maas LOG: duration: 6950.174 ms statement: COMMIT
2020-08-11 22:30:34 UTC [27895]: [3-1] db=maasdb,user=maas LOG: duration: 19999.379 ms fastpath function call: "loread" (OID 954)
2020-08-11 22:30:34 UTC [9323]: [5-1] db=maasdb,user=maas LOG: duration: 15769.444 ms statement: SELECT "maasserver_
2020-08-14 04:13:11 UTC [40654]: [3-1] db=maasdb,
2020-08-11 22:30:34 UTC [27881]: [3-1] db=maasdb,user=maas LOG: duration: 19160.838 ms statement: UPDATE "maasserver_node" SET "power_
2020-08-11 23:36:23 UTC [13011]: [3-1] db=maasdb,user=maas LOG: duration: 26002.864 ms statement: SELECT pg_advisory_
```
Current counts for the above queries:
```
grep lo_unlink postgresql-
24
grep COMMIT postgresql-
390
grep loread postgresql-
201
grep 'SELECT "maasserver_node"' postgresql-
201
grep COPY postgresql-
4
grep UPDATE postgresql-
943
grep pg_advisory_
94
```
For the queries above, the Durations listed are pretty representative of those queries (at least when they're >5 seconds).
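To put numbers on "pretty representative", the slow entries could be grouped by query type and summarized instead of counted with separate greps. The following is only a minimal sketch, assuming the log lines follow the `duration: <ms> ms ...` layout shown above; the function names (`classify`, `summarize`) and the 5000 ms default threshold are illustrative choices, not part of MAAS or PostgreSQL.

```python
import re
import sys
from collections import defaultdict
from statistics import median

# Matches the "duration: <ms> ms <rest>" part of a PostgreSQL log line.
DURATION_RE = re.compile(r"duration: (?P<ms>\d+(?:\.\d+)?) ms\s+(?P<rest>.*)")


def classify(rest):
    """Reduce the text after the duration to a short label (hypothetical helper)."""
    if rest.startswith("fastpath function call:"):
        name = re.search(r'"([^"]+)"', rest)
        return "fastpath " + (name.group(1) if name else "?")
    if rest.startswith("statement:"):
        words = rest[len("statement:"):].split()
        return words[0].upper() if words else "statement ?"
    return "other"


def summarize(lines, threshold_ms=5000.0):
    """Group entries at or above threshold_ms by label; report count, median, max."""
    buckets = defaultdict(list)
    for line in lines:
        match = DURATION_RE.search(line)
        if not match:
            continue  # truncated or unrelated log line
        ms = float(match.group("ms"))
        if ms >= threshold_ms:
            buckets[classify(match.group("rest"))].append(ms)
    return {
        label: {"count": len(v), "median_ms": median(v), "max_ms": max(v)}
        for label, v in buckets.items()
    }


if __name__ == "__main__":
    # Usage: python slow_queries.py <logfile> [<logfile> ...]
    all_lines = []
    for path in sys.argv[1:]:
        with open(path, errors="replace") as handle:
            all_lines.extend(handle)
    for label, stats in sorted(summarize(all_lines).items()):
        print(label, stats)
```

Run against the PostgreSQL log files, this prints one line per query type, which makes it easier to see whether, say, the large-object calls or the `UPDATE` statements dominate the slow periods.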
Is the above strange or out of order?
Is there something else we should be looking into?
Or are the slow UI times we're seeing expected performance?
This is cross-posted on the MAAS Discourse: https:/
Thank you!
-Derek
tags: added: pb same
tags: added: ui
Changed in maas-ui: importance: Undecided → Unknown
Changed in maas-ui: status: New → Fix Released
Changed in maas: status: New → Fix Committed
Changed in maas: milestone: none → 3.2.0-beta1
Changed in maas: status: Fix Committed → Fix Released
> 3x LXDs each with a region+rack server (snap version)
This could be a source of slowness: snaps in LXD use squashfuse, which is not as performant as native squashfs. This issue is tracked in https://bugs.launchpad.net/snapd/+bug/1817276
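One way to check whether a given container is affected is to look at the filesystem type of the snap mounts. The sketch below is an assumption-laden illustration: it assumes snaps are mounted under `/snap/` and that squashfuse mounts appear in `/proc/mounts` with the type `fuse.squashfuse` (native mounts appear as `squashfs`); the function name `snap_mounts_by_fstype` is made up for this example.

```python
def snap_mounts_by_fstype(mounts_text):
    """Map each mount point under /snap/ to its filesystem type.

    mounts_text is the content of /proc/mounts, where each line is
    "<device> <mount point> <fstype> <options> <dump> <pass>".
    """
    result = {}
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fstype = fields[1], fields[2]
        if mount_point.startswith("/snap/"):
            result[mount_point] = fstype
    return result


if __name__ == "__main__":
    try:
        with open("/proc/mounts") as handle:
            mounts = snap_mounts_by_fstype(handle.read())
    except OSError:
        mounts = {}  # not on Linux, or /proc is unavailable
    for mount_point, fstype in sorted(mounts.items()):
        print(mount_point, fstype)
```

If the MAAS snap's mount point shows `fuse.squashfuse` inside the LXD container, the snapd bug linked above may be relevant to the slowness.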