VMware driver cannot report non-contiguous resources to the scheduler

Bug #1462957 reported by Matthew Booth
This bug affects 4 people
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: Medium
Assigned to: Radoslav Gerganov

Bug Description

A VMware hypervisor can have various types of non-contiguous resource. This includes:

* CPUs and memory, assuming a cluster has more than 1 member.
* Storage space, if a (VMware) host has more than 1 datastore.

Focussing on the latter: if a host has 5 datastores, each with 50GB of free space, we currently report the largest contiguous free space to the scheduler: 50GB. The scheduler therefore knows it can allocate an instance with a 50GB block device, but until the host stats are updated it will not allow subsequent instances to be scheduled there, even though the other four datastores still have 50GB free each. We could alternatively report 250GB of free space, but would then risk the scheduler repeatedly sending us requests for an instance with a 100GB block device, which we cannot fulfil because no single datastore has that much free space. Without the ability to represent non-contiguous resources we are left choosing between two suboptimal options.
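
The trade-off can be illustrated with a minimal Python sketch. The datastore names and the `can_place` helper are hypothetical and exist only for illustration; this is not nova code.

```python
# Hypothetical free space per datastore, in GB (the five-datastore example).
free_gb = {"ds1": 50, "ds2": 50, "ds3": 50, "ds4": 50, "ds5": 50}

# Option 1: report only the largest contiguous free space.
largest_contiguous = max(free_gb.values())

# Option 2: report the sum of free space across all datastores.
total_free = sum(free_gb.values())


def can_place(disk_gb, free):
    """Return True if at least one datastore can hold a disk of disk_gb."""
    return any(space >= disk_gb for space in free.values())


# Option 1 under-reports: a single 50GB instance consumes the whole reported
# capacity, although four more 50GB disks would still fit.
# Option 2 over-reports: a 100GB request is within the reported 250GB,
# yet no single datastore can hold it.
print(largest_contiguous, total_free, can_place(50, free_gb), can_place(100, free_gb))
```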

Tags: vmware
Matthew Booth (mbooth-9) wrote :

Incidentally, this was previously reported more narrowly here: https://bugs.launchpad.net/nova/+bug/1220459 . That bug has been closed, but the issue is not resolved.

tags: added: vmware
Gary Kotton (garyk) wrote :

My two cents: the scheduler should be able to receive the amount of free disk and the largest contiguous available disk and then make its decision accordingly.

Changed in nova:
importance: Undecided → Medium
status: New → Confirmed
Gary Kotton (garyk) wrote :

There are actually two stages of scheduling: first the scheduler selects the cluster, then the VMware driver selects the datastore.
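
A rough sketch of how the two stages could fit together, assuming the scheduler is given both the total free disk and the largest free space on a single datastore. The function names and the "most free space" selection policy are assumptions made for illustration, not the behaviour of the scheduler or the driver.

```python
from typing import Dict, Optional


def scheduler_accepts(requested_gb: int, total_free_gb: int, largest_free_gb: int) -> bool:
    """Stage 1 (hypothetical): the scheduler checks the request against both
    the total free disk and the largest free space on a single datastore."""
    return requested_gb <= largest_free_gb and requested_gb <= total_free_gb


def pick_datastore(requested_gb: int, free_gb: Dict[str, int]) -> Optional[str]:
    """Stage 2 (hypothetical): the driver picks a datastore inside the cluster.
    The policy here, the candidate with the most free space, is an assumption."""
    candidates = {name: free for name, free in free_gb.items() if free >= requested_gb}
    if not candidates:
        return None
    return max(candidates, key=candidates.get)


free = {"ds1": 50, "ds2": 30}
print(scheduler_accepts(50, sum(free.values()), max(free.values())))
print(pick_datastore(40, free))
```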

Changed in nova:
assignee: nobody → Radoslav Gerganov (rgerganov)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/516634
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=93bd310b91f2ca6034fea04a48e8399270db1816
Submitter: Zuul
Branch: master

commit 93bd310b91f2ca6034fea04a48e8399270db1816
Author: Radoslav Gerganov <email address hidden>
Date: Tue Oct 31 11:22:20 2017 +0200

    VMware: fix memory stats

    The total memory for the vCenter cluster managed by Nova
    should be the aggregated sum of total memory of each ESX host in the
    cluster. This is more accurate than using the available memory of the
    resource pool associated to the cluster.

    Partial-Bug: #1462957
    Change-Id: I030cee9cebb0f030361aa6bbb612da5cd4202a7f

OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/516635
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0124d57275a17544f796edb244a9406487efa29d
Submitter: Zuul
Branch: master

commit 0124d57275a17544f796edb244a9406487efa29d
Author: Radoslav Gerganov <email address hidden>
Date: Tue Oct 31 11:22:46 2017 +0200

    VMware: expose max vCPUs and max memory per ESX host

    Expose maximum vCPUs and maximum memory from a single ESX host in the
    vCenter cluster. This will be used for implementing get_inventory() in
    the follow-up patch.

    Partial-Bug: #1462957

    Change-Id: I28e19d46a737ac253718c7c66837bd71b064b0b9

Changed in nova:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/506175
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=d5e90c8971352721f75bad112ebc1d374c0cc2f9
Submitter: Zuul
Branch: master

commit d5e90c8971352721f75bad112ebc1d374c0cc2f9
Author: Radoslav Gerganov <email address hidden>
Date: Thu Sep 21 16:39:49 2017 +0300

    VMware: implement get_inventory() driver method

    Implementing the get_inventory() method allows us to report the
    aggregated capacity for cpu, disk and memory. The max_unit property for
    cpu and memory is set to the maximum vCPU and memory available from
    a single ESX host. The max_unit property for disk is set to the maximum
    free space available on a single datastore.

    Change-Id: I7324bae93e5fbb1301c76466575a906434ca5376
    Closes-Bug: #1462957
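
The shape of the inventory described in this commit message can be sketched as follows. This is not the merged nova code: `build_inventory` and the `hosts`/`datastores` input dictionaries are hypothetical, and only the resource class names (`VCPU`, `MEMORY_MB`, `DISK_GB`) and the `total`, `min_unit`, `max_unit` and `step_size` keys follow the placement inventory format.

```python
def build_inventory(hosts, datastores):
    """Build an inventory dict with aggregated totals and per-unit maxima.

    hosts: list of dicts with 'vcpus' and 'memory_mb' (hypothetical shape).
    datastores: list of dicts with 'capacity_gb' and 'free_gb' (hypothetical shape).
    """
    return {
        "VCPU": {
            "total": sum(h["vcpus"] for h in hosts),
            "min_unit": 1,
            # An instance cannot span ESX hosts, so cap at the largest host.
            "max_unit": max(h["vcpus"] for h in hosts),
            "step_size": 1,
        },
        "MEMORY_MB": {
            "total": sum(h["memory_mb"] for h in hosts),
            "min_unit": 1,
            "max_unit": max(h["memory_mb"] for h in hosts),
            "step_size": 1,
        },
        "DISK_GB": {
            "total": sum(d["capacity_gb"] for d in datastores),
            "min_unit": 1,
            # A disk cannot span datastores, so cap at the largest free space.
            "max_unit": max(d["free_gb"] for d in datastores),
            "step_size": 1,
        },
    }


hosts = [{"vcpus": 16, "memory_mb": 65536}, {"vcpus": 8, "memory_mb": 32768}]
datastores = [{"capacity_gb": 100, "free_gb": 50} for _ in range(5)]
inv = build_inventory(hosts, datastores)
print(inv["DISK_GB"])
```

With these example inputs the aggregated disk capacity is reported in full, while `max_unit` keeps a 100GB request from being matched against datastores that each have only 50GB free.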

OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 17.0.0.0b3

This issue was fixed in the openstack/nova 17.0.0.0b3 development milestone.
