Increasing number of InstancePCIRequests.get_by_instance_uuid RPC calls during compute host auditing

Bug #1387244 reported by James Page
This bug affects 1 person
Affects                   Status        Importance  Assigned to  Milestone
OpenStack Compute (nova)  Fix Released  High        Unassigned
nova (Ubuntu)             Fix Released  High        Unassigned

Bug Description

Environment: Ubuntu 14.04/OpenStack Juno Release

The periodic audit on each compute node becomes very RPC-intensive when a large number of instances are running on a cloud. InstancePCIRequests.get_by_instance_uuid is called for every instance running on the hypervisor; multiplied across a large number of hypervisors, this puts growing pressure on the conductor processes, which have to service an increasing number of RPC calls over time.
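The call pattern described above can be sketched with a toy model. This is not nova code: FakeConductor, audit_per_instance, audit_batched and the batched get_by_instance_uuids method are hypothetical names invented for illustration, and batching is shown only as one generic way to cut round trips, not as what any proposed patch actually does.

```python
from collections import Counter


class FakeConductor:
    """Hypothetical stand-in for the conductor; it only counts RPC calls."""

    def __init__(self, pci_requests_by_uuid):
        self._store = pci_requests_by_uuid
        self.calls = Counter()

    def get_by_instance_uuid(self, uuid):
        # One round trip per instance, as described in the report.
        self.calls["get_by_instance_uuid"] += 1
        return self._store.get(uuid, [])

    def get_by_instance_uuids(self, uuids):
        # ASSUMPTION: a batched lookup like this is invented for the sketch.
        self.calls["get_by_instance_uuids"] += 1
        return {u: self._store.get(u, []) for u in uuids}


def audit_per_instance(conductor, instance_uuids):
    """Audit pass that issues one call per running instance."""
    return {u: conductor.get_by_instance_uuid(u) for u in instance_uuids}


def audit_batched(conductor, instance_uuids):
    """Audit pass that fetches all PCI requests in a single call."""
    return conductor.get_by_instance_uuids(list(instance_uuids))


if __name__ == "__main__":
    uuids = [f"uuid-{i}" for i in range(58)]
    store = {uuids[0]: ["pci-request"]}

    per_instance = FakeConductor(store)
    audit_per_instance(per_instance, uuids)
    print(per_instance.calls["get_by_instance_uuid"])  # 58

    batched = FakeConductor(store)
    audit_batched(batched, uuids)
    print(batched.calls["get_by_instance_uuids"])  # 1
```

In the per-instance variant the call count grows linearly with the number of instances on the hypervisor, which is the scaling behaviour the report describes.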

James Page (james-page) wrote :
description: updated
tags: added: juno scale-testing
James Page (james-page)
summary: - Large number of InstancePCIRequests.get_by_instance_uuid RPC calls
+ Increasing number of InstancePCIRequests.get_by_instance_uuid RPC calls
during compute host auditing
Dan Smith (danms) wrote :

This is a known consequence of some last-minute code that was pulled into Juno. We have patches up to improve this flow during instance build, as part of this work:

https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/flavor-from-sysmeta-to-blob,n,z

This should be specifically related to extra calls during the process of building or resizing instances and not constant additional load. If that's not the case, then please provide extra details.

Changed in nova:
status: New → Confirmed
importance: Undecided → Medium
James Page (james-page) wrote :

Looking at the log data, the number of RPC calls matches exactly the number of running instances on the hypervisor, and is definitely in the context of the periodic audit.

I'm running a continuous load of 100 parallel instance creations against the cloud.

James Page (james-page) wrote :

For reference:

Number of hypervisors: 500
Number of conductors: 8 servers, each with 4 cores/8 threads and 16 GB of memory; each conductor server is configured with 32 worker threads.
Number of running instances: 29,000
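
As a rough back-of-the-envelope check on these figures, the sketch below works out the per-audit call volume, assuming exactly one get_by_instance_uuid call per running instance per audit pass (as observed in the logs). The 60-second audit interval is an assumption, not a value given in this report; adjust AUDIT_INTERVAL_S to match the deployment.

```python
# Figures taken from the comment above.
HYPERVISORS = 500
RUNNING_INSTANCES = 29_000
CONDUCTOR_SERVERS = 8
WORKERS_PER_SERVER = 32

# ASSUMPTION: audit interval is not stated in the report; 60 s is a guess.
AUDIT_INTERVAL_S = 60

instances_per_hypervisor = RUNNING_INSTANCES / HYPERVISORS
total_workers = CONDUCTOR_SERVERS * WORKERS_PER_SERVER

# One call per running instance per audit pass across the whole cloud.
calls_per_audit_round = RUNNING_INSTANCES
calls_per_worker_per_round = calls_per_audit_round / total_workers
calls_per_second = calls_per_audit_round / AUDIT_INTERVAL_S

print(f"instances per hypervisor: {instances_per_hypervisor:.0f}")      # 58
print(f"conductor workers:        {total_workers}")                     # 256
print(f"calls per worker/round:   {calls_per_worker_per_round:.1f}")    # 113.3
print(f"calls per second:         {calls_per_second:.1f}")              # 483.3
```

These numbers cover the audit traffic only; RPC calls generated by the instance-creation load come on top of them.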

James Page (james-page) wrote :

The proposed change:

https://review.openstack.org/#/c/131321/1

Appears to resolve this issue.

Dan Smith (danms)
Changed in nova:
importance: Medium → High
tags: added: juno-backport-potential
James Page (james-page)
Changed in nova (Ubuntu):
importance: Undecided → High
status: New → Triaged
Dan Smith (danms)
Changed in nova:
status: Confirmed → Fix Released
Chuck Short (zulcss)
Changed in nova (Ubuntu):
status: Triaged → Fix Released