VMware: InstanceList.get_by_host raises RPC timeout error

Bug #1420662 reported by Rui Chen
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Opinion
Importance: Low
Assigned to: Unassigned
Milestone: (none)

Bug Description

I deployed OpenStack with the VMware driver. A single nova-compute service connects to the VMware deployment, which contains about 3000 VMs. The database is MySQL.

InstanceList.get_by_host raises an RPC timeout error when ComputeManager.init_host() runs and when the _sync_power_states periodic task executes.

This looks like a performance issue. Currently, in the nova VMware driver, one nova-compute host maps to the whole VMware deployment, which may contain several clusters. When ComputeManager calls InstanceList.get_by_host, nova-compute makes an RPC call to nova-conductor, and nova-conductor fetches every instance in the whole VMware deployment at once; in my case, that is 3000 instances. The long-running SQL query may cause the nova-conductor RPC call to time out.

PS:
vSphere 5.1 now allows 100 hosts and 3000 powered-on VMs.
vSphere 6 now allows 1000 hosts and 10,000 powered-on VMs.
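
One general way to avoid a single very large fetch like the one described above is to request the rows in bounded pages, so that no individual call has to return all 3000 instances. The following is only a minimal sketch of that idea under stated assumptions, not nova's code and not the fix proposed in the review: `iter_instances_paged`, `fetch_page`, `make_fake_fetch`, the `id` marker field, and the page size of 500 are all hypothetical names and values chosen for illustration, and the data source is an in-memory list rather than a real database or RPC call.

```python
from typing import Callable, Dict, Iterator, List, Optional

# Hypothetical stand-in for one instance record; not nova's Instance object.
Instance = Dict[str, object]


def iter_instances_paged(
    fetch_page: Callable[[Optional[int], int], List[Instance]],
    page_size: int = 500,
) -> Iterator[Instance]:
    """Yield every instance by asking for bounded pages, one call per page."""
    marker: Optional[int] = None  # id of the last row already returned
    while True:
        page = fetch_page(marker, page_size)
        if not page:
            return  # nothing left to fetch
        yield from page
        if len(page) < page_size:
            return  # a short page means this was the last one
        marker = page[-1]["id"]


def make_fake_fetch(total: int):
    """Build an in-memory fetch function that records each call it receives."""
    rows = [{"id": i, "host": "vmware-compute"} for i in range(1, total + 1)]
    calls = []

    def fetch_page(marker, limit):
        calls.append((marker, limit))
        # ids run 1..total in order, so the row after id N sits at index N.
        start = 0 if marker is None else marker
        return rows[start:start + limit]

    return fetch_page, calls


fetch_page, calls = make_fake_fetch(3000)
instances = list(iter_instances_paged(fetch_page, page_size=500))
print(len(instances), "instances fetched in", len(calls), "calls")
```

Each call in this sketch returns at most `page_size` rows, which trades one long query for several short ones. Whether that trade is acceptable for init_host() or _sync_power_states depends on factors this report does not cover.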

Tags: vmware
Rui Chen (kiwik-chenrui)
Changed in nova:
assignee: nobody → Rui Chen (kiwik-chenrui)
Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/155676

Changed in nova:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Joe Gordon (<email address hidden>) on branch: master
Review: https://review.openstack.org/155676
Reason: This review is > 4 weeks without comment and currently blocked by a core reviewer with a -2. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and contacting the reviewer with the -2 on this review to ensure you address their concerns.

Changed in nova:
assignee: Rui Chen (kiwik-chenrui) → nobody
status: In Progress → Confirmed
Matt Riedemann (mriedem)
Changed in nova:
importance: Medium → Low
status: Confirmed → Opinion