OpenStack Compute (nova)

The instance_faults table is too large, leading to slow query speed of command: nova list --all-tenants

Bug #1800755 reported by Sun Mengyun on 2018-10-31

This bug affects 2 people

Affects                     Status    Importance  Assigned to  Milestone
OpenStack Compute (nova)    Triaged   Medium      Unassigned

Bug Description

Description
===========
The command "nova list --all-tenants" takes 50+ seconds to run, even though there are only 50 virtual machines.
This is because the command calls the function fill_faults() in nova/objects/instance.py, which
queries the instance_faults database table. When this table holds a large number of records, the query becomes very slow.

For example, in my OpenStack deployment, many erroneous operations have left more than 250 thousand records in the table, and the query takes 50+ seconds.

As time goes on, the table will keep growing and query performance will keep degrading. We therefore need a way to ensure that query performance is not affected by the data volume.
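To check whether a deployment is in the situation described above, one can count fault rows per instance. The following is only a sketch against an in-memory SQLite database with a simplified, hypothetical instance_faults schema (id, instance_uuid, created_at, message); the real nova table has more columns, and the helper names fault_counts() and build_demo_db() are made up for illustration.

```python
import sqlite3


def fault_counts(conn, limit=10):
    """Return (instance_uuid, fault_count) pairs, largest count first."""
    return conn.execute(
        "SELECT instance_uuid, COUNT(*) AS n FROM instance_faults "
        "GROUP BY instance_uuid ORDER BY n DESC, instance_uuid LIMIT ?",
        (limit,),
    ).fetchall()


def build_demo_db():
    """Create a tiny in-memory table (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instance_faults ("
        "id INTEGER PRIMARY KEY, instance_uuid TEXT, "
        "created_at TEXT, message TEXT)"
    )
    rows = [("uuid-a", "2018-10-01", "boom")] * 3
    rows.append(("uuid-b", "2018-10-02", "bang"))
    conn.executemany(
        "INSERT INTO instance_faults (instance_uuid, created_at, message) "
        "VALUES (?, ?, ?)",
        rows,
    )
    return conn


if __name__ == "__main__":
    for uuid, count in fault_counts(build_demo_db()):
        print(uuid, count)
```

A handful of instances with tens of thousands of rows each would be consistent with the slowdown reported here.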

Steps to reproduce
==================
This bug is hard to reproduce unless your instance_faults table is similarly large.

Environment
===========
[root@nail1 ~]# rpm -qa | grep nova
openstack-nova-api-18.0.2-1.el7.noarch
openstack-nova-common-18.0.2-1.el7.noarch
python2-novaclient-11.0.0-1.el7.noarch
openstack-nova-placement-api-18.0.2-1.el7.noarch
openstack-nova-scheduler-18.0.2-1.el7.noarch
openstack-nova-conductor-18.0.2-1.el7.noarch
openstack-nova-novncproxy-18.0.2-1.el7.noarch
python-nova-18.0.2-1.el7.noarch
openstack-nova-compute-18.0.2-1.el7.noarch
openstack-nova-console-18.0.2-1.el7.noarch

hypervisor:
Libvirt + KVM

Tags:

Sun Mengyun (kmehxhcr) on 2018-10-31

tags:

added: list

Matt Riedemann (mriedem) wrote on 2018-11-01:

Sounds like bug 1632247 but that was "fixed" a couple of years ago with this change:

https://review.openstack.org/#/c/409943/

I wonder if there has been a regression?

summary:

The instance_faults table is too large, leading to slow query speed of
- command: nova list --all-t
+ command: nova list --all-tenants

Matt Riedemann (mriedem) wrote on 2018-11-01:

I'm not sure why we don't provide some way to purge old faults. The API only shows the latest fault for a given instance. And we don't have any API or nova-manage CLI to list *all* faults for a given instance, so I guess they are just there in the database until the instance is deleted and archived/purged. Seems we could add a nova-manage CLI to allow purging old fault information as long as the latest fault is left intact.
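The purge idea above (drop old faults, keep the latest one per instance) could look roughly like the sketch below. It is not an existing nova-manage command. It runs against an in-memory SQLite database with a simplified, hypothetical schema, and it assumes that the row with the highest id is the latest fault for an instance. The names purge_old_faults() and make_demo_db() are illustrative only.

```python
import sqlite3


def purge_old_faults(conn):
    """Delete all fault rows except the newest per instance.

    Assumption: the highest id per instance_uuid is the latest fault.
    Returns the number of rows deleted.
    """
    cur = conn.execute(
        "DELETE FROM instance_faults WHERE id NOT IN ("
        "SELECT MAX(id) FROM instance_faults GROUP BY instance_uuid)"
    )
    conn.commit()
    return cur.rowcount


def make_demo_db():
    """Build a small in-memory table (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE instance_faults ("
        "id INTEGER PRIMARY KEY, instance_uuid TEXT, message TEXT)"
    )
    conn.executemany(
        "INSERT INTO instance_faults (instance_uuid, message) VALUES (?, ?)",
        [("uuid-a", "f1"), ("uuid-a", "f2"), ("uuid-a", "f3"),
         ("uuid-b", "f1")],
    )
    return conn


if __name__ == "__main__":
    conn = make_demo_db()
    print("deleted:", purge_old_faults(conn))
    print(conn.execute(
        "SELECT id, instance_uuid FROM instance_faults ORDER BY id"
    ).fetchall())
```

Note that MySQL rejects a DELETE whose subquery selects directly from the table being modified, so the same statement would need the subquery wrapped in a derived table there. A real tool would probably also want to delete in batches.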

Matt Riedemann (mriedem) wrote on 2018-11-01:

This is also likely poor for performance:

https://github.com/openstack/nova/blob/f13debf2f0e5377b9d0b0bbd9422c6a79d2cc611/nova/objects/instance.py#L1259

But I'm not sure that is in the same "nova list" call path so it shouldn't be related to that issue, but could be a problem for performance if it's ever called with "fault" in expected_attrs. It would need to be audited.

Matt Riedemann (mriedem) on 2018-11-01

Changed in nova:
status:	New → Triaged
importance:	Undecided → Medium

Chris Friesen (cbf123) wrote on 2018-11-01:

Pretty sure that StarlingX purges instance_faults when purging instances. It's in INSTANCES_CHILD_TABLES here:

https://github.com/starlingx-staging/stx-nova/commit/71acfeae0d1c59fdc77704527d763bd85a276f9a#diff-8fec546e4c39f78d233f8e21dadaa3ffR88

Before we purge soft-deleted instances, we purge entries in the tables in INSTANCES_CHILD_TABLES which refer to the soft-deleted instances that are about to be purged.
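The ordering described above (purge child rows first, then the soft-deleted instances they refer to) can be sketched as follows. This is not the StarlingX code. It uses an in-memory SQLite database with a simplified, hypothetical schema, it assumes a non-zero "deleted" column marks a soft-deleted instance, and CHILD_TABLES is only a stand-in for INSTANCES_CHILD_TABLES.

```python
import sqlite3

# Hypothetical stand-in for INSTANCES_CHILD_TABLES; a fixed constant,
# so interpolating the names into SQL below is safe.
CHILD_TABLES = ("instance_faults",)


def purge_soft_deleted(conn):
    """Purge child rows of soft-deleted instances, then the instances.

    Assumption: instances.deleted != 0 marks a soft-deleted instance.
    Returns (child_rows_deleted, instances_deleted).
    """
    child_rows = 0
    for table in CHILD_TABLES:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE instance_uuid IN "
            "(SELECT uuid FROM instances WHERE deleted != 0)"
        )
        child_rows += cur.rowcount
    cur = conn.execute("DELETE FROM instances WHERE deleted != 0")
    conn.commit()
    return child_rows, cur.rowcount


def make_demo_db():
    """Build small in-memory tables (hypothetical, simplified schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE instances (uuid TEXT PRIMARY KEY, deleted INTEGER)")
    conn.execute(
        "CREATE TABLE instance_faults ("
        "id INTEGER PRIMARY KEY, instance_uuid TEXT)"
    )
    conn.executemany(
        "INSERT INTO instances VALUES (?, ?)",
        [("uuid-a", 0), ("uuid-b", 7)],  # uuid-b is soft-deleted
    )
    conn.executemany(
        "INSERT INTO instance_faults (instance_uuid) VALUES (?)",
        [("uuid-a",)] * 2 + [("uuid-b",)] * 3,
    )
    return conn


if __name__ == "__main__":
    print(purge_soft_deleted(make_demo_db()))
```

Deleting the child rows before the parent rows means no fault record is left pointing at an instance that no longer exists.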
