machines read (API) call is unreasonably slow

Bug #1880966 reported by David Andruczyk
This bug affects 1 person
Affects  Status   Importance  Assigned to  Milestone
MAAS     Expired  Medium      Unassigned

Bug Description

When using the CLI or the python-libmaas API, the "machines read" call (client.machines.list in python-libmaas) is extraordinarily slow.

Our MAAS installation runs MAAS 2.6.2 from the PPA and currently knows about roughly 140 deployed machines. The above call takes anywhere from 2 to 4 MINUTES to return, and it has been getting progressively slower as the number of deployed machines has increased.

From our custom code leveraging the python-libmaas API
2020-05-27 10:21:02,075 - INFO - Querying MAAS for machines to see if it knows about the 2 instances specified in the CSV file (demo.csv)
2020-05-27 10:24:32,160 - INFO - Maas knows about 142 machines

Total time spent 3 minutes 30 seconds!
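
The elapsed time can be checked directly from the two log timestamps above. A minimal sketch, assuming the log uses Python's default "asctime" layout (the variable names are illustrative only):

```python
from datetime import datetime

# Timestamps copied from the two log lines above.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
started = datetime.strptime("2020-05-27 10:21:02,075", LOG_FORMAT)
finished = datetime.strptime("2020-05-27 10:24:32,160", LOG_FORMAT)

# Difference between the "querying" line and the "knows about" line.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print(f"{int(minutes)} min {seconds:.3f} s")
```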

The MAAS CLI client that ships with MAAS is similarly slow: times vary from 2 minutes up to 5, at which point the python-libmaas API times out.

# time maas <PROFILE> machines read >/dev/null
real 2m51.982s
user 0m2.517s
sys 0m0.729s

This call returns the full YAML blob for each machine when it really only needs to return the hostname and ID, which should be a very fast query. "maas machine read" with that ID would then return the full data, which currently takes 5+ seconds (not ideal, but not terrible).

When using automation to drive MAAS, we need this call to be fast so that we can confirm the machine exists, then query it to check that it is in the correct state and, if not, take action to add it as needed.

Ideas:
Make "maas <profile> machines read" only return the hostname/ID in yaml so it's fast
Add an API call to return machine info based on criteria other than ID (e.g. hostname, tags, other)
maas <profile> machines find <search_criteria> i.e. hostname=blah, tag=blah, etc.
Implement caching in memcached to reduce excessive DB querying.
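
Until something like this exists server-side, the caching idea can be approximated on the client. The sketch below is an in-process cache with a time-to-live, not memcached and not part of MAAS or python-libmaas. `HostnameIndex` and the `fetch` callable are hypothetical names; `fetch` stands for whatever performs the slow full listing and returns a hostname-to-ID mapping.

```python
import time
from typing import Callable, Dict, Optional


class HostnameIndex:
    """Caches a hostname -> system ID mapping for `ttl` seconds."""

    def __init__(
        self,
        fetch: Callable[[], Dict[str, str]],
        ttl: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch  # hypothetical callable that does the slow listing
        self._ttl = ttl
        self._clock = clock
        self._index: Dict[str, str] = {}
        self._expires_at: Optional[float] = None

    def lookup(self, hostname: str) -> Optional[str]:
        """Return the cached ID for `hostname`, refreshing the index if stale."""
        now = self._clock()
        if self._expires_at is None or now >= self._expires_at:
            self._index = self._fetch()
            self._expires_at = now + self._ttl
        return self._index.get(hostname)
```

The slow listing is then paid at most once per TTL window rather than once per lookup, at the cost of possibly stale answers within that window.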

MAAS region controller information:
8 CPU
100G disk (SSD)
Ubuntu 18.04
MAAS installed from the official PPA

# dpkg -l |grep maas
ii maas-cli 2.6.2-7841-ga10625be3-0ubuntu1~18.04.1 all MAAS client and command-line interface
ii maas-common 2.6.2-7841-ga10625be3-0ubuntu1~18.04.1 all MAAS server common files
ii maas-proxy 2.6.2-7841-ga10625be3-0ubuntu1~18.04.1 all MAAS Caching Proxy
ii maas-region-api 2.6.2-7841-ga10625be3-0ubuntu1~18.04.1 all Region controller API service for MAAS
ii maas-region-controller 2.6.2-7841-ga10625be3-0ubuntu1~18.04.1 all Region Controller for MAAS
ii python3-django-maas 2.6.2-7841-ga10625be3-0ubuntu1~18.04.1 all MAAS server Django web framework (Python 3)
ii python3-maas-client 2.6.2-7841-ga10625be3-0ubuntu1~18.04.1 all MAAS python API client (Python 3)
ii python3-maas-provisioningserver 2.6.2-7841-ga10625be3-0ubuntu1~18.04.1 all

description: updated
Dougal Matthews (d0ugal)
tags: added: api
Dougal Matthews (d0ugal) wrote :

This has been reported previously, and some additional context can be seen in:

https://bugs.launchpad.net/maas/+bug/1826879/comments/14

and

https://bugs.launchpad.net/maas/+bug/1682317

Changed in maas:
status: New → Triaged
Björn Tillenius (bjornt) wrote :

Actually, there are a few things that may already improve the situation.

First of all, upgrade to 2.7. There have been quite a few performance fixes there, so it should be faster than 2.6.

Also, you can already specify 'machines read hostname=blah', so that should improve the speed as well.

Feel free to report the new timings here.

David Andruczyk (dandruczyk) wrote :

2.7 was very broken in our testing, specifically https://bugs.launchpad.net/maas/+bug/1864241

OK, with the CLI it works using "maas <profile> machines read hostname=<shortname>".
Still not very fast...
I ran 10 runs for the SAME hostname, expecting the first to potentially be slow and the repeat calls to be quick. It turns out that's not the case: every call took about 10-12 seconds.

for i in `seq 1 10` ; do time maas <profile> machines read hostname=<supersecrethostname> >/dev/null ; done

real 0m10.051s
user 0m0.448s
sys 0m0.032s

real 0m10.382s
user 0m0.430s
sys 0m0.044s

real 0m10.096s
user 0m0.421s
sys 0m0.052s

real 0m10.578s
user 0m0.421s
sys 0m0.059s

real 0m10.069s
user 0m0.441s
sys 0m0.040s

real 0m10.781s
user 0m0.438s
sys 0m0.037s

real 0m10.746s
user 0m0.463s
sys 0m0.044s

real 0m9.782s
user 0m0.439s
sys 0m0.037s

real 0m12.308s
user 0m0.452s
sys 0m0.020s

real 0m10.016s
user 0m0.439s
sys 0m0.032s
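
To summarise the ten runs above, the `real` values can be converted to seconds and averaged. This is a small sketch over the numbers already shown; `to_seconds` is an illustrative helper, not part of any MAAS tool.

```python
import re
from statistics import mean

# The ten `real` values reported by `time` in the loop above.
REAL_TIMES = """
0m10.051s 0m10.382s 0m10.096s 0m10.578s 0m10.069s
0m10.781s 0m10.746s 0m9.782s 0m12.308s 0m10.016s
"""


def to_seconds(stamp: str) -> float:
    """Convert a `time`-style value such as '0m10.051s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


samples = [to_seconds(stamp) for stamp in REAL_TIMES.split()]
print(
    f"n={len(samples)} min={min(samples):.3f}s "
    f"mean={mean(samples):.3f}s max={max(samples):.3f}s"
)
```

The spread is narrow, which is consistent with the observation that repeat calls are no faster than the first one.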

The python-libmaas API doesn't seem to have a way to do this and that's what I'm using...

Tried both:
    machine = client.machines.get(hostname=canidate['shortname'])
    machine = client.machines.read(hostname=canidate['shortname'])

2020-06-01 13:05:53,786 - INFO - Querying MAAS for machines to see if it knows about the 1 instances specified in the CSV file (demo.csv)
Traceback (most recent call last):
  File "./maasterblaster.py", line 1207, in <module>
    main()
  File "./maasterblaster.py", line 1173, in main
    found, all_machines = query_for_existence(client=client, search_for=to_do)
  File "./maasterblaster.py", line 286, in query_for_existence
    machine = client.machines.get(hostname=canidate['shortname'])
  File "/home/dandruczyk/libmaas/lib/python3.6/site-packages/maas/client/utils/maas_async.py", line 40, in wrapper
    result = func(*args, **kwargs)
TypeError: read() got an unexpected keyword argument 'hostname'
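
Since the traceback shows that the lookup call does not accept a `hostname` keyword, one fallback is to filter the full listing on the client. The sketch below assumes only that each machine object exposes a `hostname` attribute; `find_by_hostname` is a hypothetical helper, and the demo objects are stand-ins so the example runs without a MAAS server.

```python
from types import SimpleNamespace
from typing import Iterable, Optional


def find_by_hostname(machines: Iterable, hostname: str) -> Optional[object]:
    """Return the first machine whose `hostname` attribute matches, else None."""
    for machine in machines:
        if machine.hostname == hostname:
            return machine
    return None


# Stand-in objects (not real MAAS machines) so the sketch is runnable as-is.
demo = [
    SimpleNamespace(hostname="node-a", system_id="abc123"),
    SimpleNamespace(hostname="node-b", system_id="def456"),
]
match = find_by_hostname(demo, "node-b")
print(match.system_id if match else "not found")
```

Against a real client this would be called with the result of client.machines.list(), so it still pays for the slow full listing; it only removes the need for a per-hostname keyword.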

João Pedro Seara (jpseara) wrote :
Alberto Donato (ack)
Changed in maas:
importance: Undecided → Medium
Thorsten Merten (thorsten-merten) wrote :

We are working on improving MAAS. We have already made some improvements to the API and are currently focusing on lists in the UI. Do you still see this problem in 3.4?

Changed in maas:
status: Triaged → Incomplete
Changed in maas:
status: Incomplete → New
status: New → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for MAAS because there has been no activity for 60 days.]

Changed in maas:
status: Incomplete → Expired