vmware virt driver's report of VCPU can be inaccurate in some cases

Bug #1847999 reported by Chris Dent
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
OpenStack Compute (nova) | Triaged | Low | Chris Dent |

Bug Description

caveat lector: This is a placeholder bug to record an issue with the vmware virt driver, so that if a reasonable solution is determined it can be contributed upstream. The challenge is that no solution is going to be perfect, so it may be easier to just leave things as they are, but I wanted to get this in place to remember it. If a patch does happen, I'll be doing it.

In the downstream version of the vmware driver more features are exposed, based on various settings made on the individual esxi hosts and the vcenter cluster manager. Some of these features consume available resources (cpu, disk, memory) that need to be accounted for as overhead, per esxi host. However, because the vmware driver has chosen to expose the vcenter cluster as the unit of hypervisor, per-host differences are difficult to manage in nova and placement. In some cases compensation can be done by tweaking max_unit of a resource class (see update_provider_tree in nova/virt/vmwareapi/driver.py for existing examples) so that its value is the maximum available slice on any host (or datastore), and regularly updating this (in the periodic job or after a workload lands).
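The max_unit compensation described above could look roughly like the following. This is only a sketch: the function name and the per-host dictionaries are hypothetical and are not taken from the driver, and the real update_provider_tree works against a provider tree rather than plain dicts.

```python
def cluster_vcpu_inventory(total_vcpus_by_host, free_vcpus_by_host):
    """Build a cluster-wide VCPU inventory from per-host numbers.

    Both arguments are hypothetical dicts mapping an esxi host name to a
    VCPU count. The cluster is the resource provider, so 'total' is the sum
    over hosts, while 'max_unit' is capped at the largest free slice on any
    single host, because one instance cannot span hosts.
    """
    total = sum(total_vcpus_by_host.values())
    # default=0 covers an empty cluster; callers may want to skip reporting
    # inventory entirely in that case.
    max_unit = max(free_vcpus_by_host.values(), default=0)
    return {
        "total": total,
        "min_unit": 1,
        "max_unit": max_unit,
        "step_size": 1,
    }


inventory = cluster_vcpu_inventory(
    total_vcpus_by_host={"esx1": 16, "esx2": 16},
    free_vcpus_by_host={"esx1": 4, "esx2": 10},
)
```

Recomputing this in the periodic job, or after a workload lands, keeps max_unit from advertising a slice that no single host can actually provide.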

For VCPU resources there is a mismatch between how the esxi host reports overhead and how nova and placement think of it: vmware talks in Hz, nova and placement in whole CPUs. For some NFV-related features, reserving a "core" for network management (things which help a workload but are not the workload itself) will lower the value of available Hz, but will not impact 'summary.hardware.numCpuThreads', the attribute currently used to calculate total and max_unit for the VCPU resource class.

A more accurate picture of available resources can be created by doing some math across several hardware summary attributes: numCpuThreads, cpuMhz, and numCpuCores. Probably with some "what features are turned on" magic for extra accuracy.
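As a rough illustration of the kind of math meant here, the sketch below scales numCpuThreads by the share of the host's Hz that is not reserved. It is not the researched formula: the reserved_mhz input and the proportional scaling are assumptions made only for this example, and it ignores any per-feature adjustments.

```python
def usable_vcpus(num_cpu_threads, num_cpu_cores, cpu_mhz, reserved_mhz):
    """Estimate VCPUs left for workloads on one esxi host.

    num_cpu_threads, num_cpu_cores and cpu_mhz mirror the hardware summary
    attributes numCpuThreads, numCpuCores and cpuMhz. reserved_mhz is a
    hypothetical input: the Hz taken by overhead such as a core reserved
    for network management.
    """
    # Assumption: total host capacity in MHz is cores * per-core MHz.
    total_mhz = num_cpu_cores * cpu_mhz
    if total_mhz <= 0:
        return 0
    available_mhz = max(total_mhz - reserved_mhz, 0)
    # Scale the thread count by the available share, rounding down so the
    # driver never reports more VCPUs than the Hz budget supports.
    return (num_cpu_threads * available_mhz) // total_mhz


# 16 cores at 2000 MHz with hyperthreading, one core's worth of Hz reserved.
estimate = usable_vcpus(
    num_cpu_threads=32, num_cpu_cores=16, cpu_mhz=2000, reserved_mhz=2000
)
```

Rounding down is a deliberate choice in this sketch: under-reporting by a fraction of a CPU seems safer than letting placement schedule onto capacity that the host has already given away.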

The correct math is still being researched. I'll attach it to this bug once it is figured out.

Chris Dent (cdent)
Changed in nova:
status: New → Triaged
importance: Undecided → Low
assignee: nobody → Chris Dent (cdent)