vsphere driver hardcoded to only use first datastore in cluster

Bug #1171930 reported by dan wendlandt
Affects: OpenStack Compute (nova)
Status: Opinion
Importance: Wishlist
Assigned to: Kiran Kumar Vaddi

Bug Description

this applies to havana master

One of the biggest stumbling blocks for people using the vSphere driver is that it has very poor flexibility in choosing which datastore a VM will be placed on. It simply picks the first datastore the API returns.

I see people asking for two improvements:
- being able to choose the datastore(s) used.
- being able to "spread" disk images across datastores.

One simple mechanism that seems like it could help a lot would be if the user could specify a "datastore_regex", and the behavior of the vSphere driver would be to "round-robin" disk images across any datastore in the cluster that matched this regex. Note, if true round-robin is hard, random + a check for capacity would probably be a good approximation.
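The suggested mechanism (random choice plus a capacity check as an approximation of round-robin) could look roughly like the sketch below. The `Datastore` class and its `name`/`accessible`/`free_gb` fields are hypothetical stand-ins for whatever the vSphere API returns, not actual nova code:

```python
import random
import re
from dataclasses import dataclass


@dataclass
class Datastore:
    # Hypothetical stand-in for the datastore info returned by the vSphere API.
    name: str
    accessible: bool
    free_gb: int


def pick_datastore(datastores, datastore_regex, required_gb):
    """Pick a random accessible datastore whose name matches the regex
    and which has enough free space (approximates round-robin)."""
    pattern = re.compile(datastore_regex)
    candidates = [
        ds for ds in datastores
        if ds.accessible and pattern.search(ds.name) and ds.free_gb >= required_gb
    ]
    if not candidates:
        raise RuntimeError("no matching datastore with enough free space")
    return random.choice(candidates)
```

With, say, `datastore_regex = "^shared-"`, any local datastores would be excluded and images would be spread across the shared ones over time.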

Tags: vmware
Changed in nova:
assignee: nobody → Shawn Hartsock (hartsock)
dan wendlandt (danwent)
summary: - vsphere driver hardcoded to only use first datastore in cluster
+ vsphere driver hardcoded to only use first datastore in vCenter
summary: - vsphere driver hardcoded to only use first datastore in vCenter
+ vsphere driver hardcoded to only use first datastore in cluster
Changed in nova:
status: New → Confirmed
importance: Undecided → Wishlist
Revision history for this message
Ananth (cbananth) wrote :

As per the discussion on IRC on 5/15/13, HP is looking into this issue and will address it soon.
Below is the approach:
Currently the first datastore of the cluster is selected for provisioning. This results in the following.
1. If the first datastore is not accessible, provisioning fails.
2. If the first datastore is full, provisioning fails.
3. If the first datastore happens to be the local volume of an ESX host, cluster capabilities such as DRS cannot be exploited.

The proposed change is to ensure that the method get_datastore_ref_and_name:
1. Selects a datastore only if it is accessible and has enough capacity for the VM flavor / image size used for provisioning.
2. Prefers a datastore shared between the hosts of a cluster over one local to a single host.
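The selection logic proposed above could be sketched as a small pure function. Each datastore here is a plain dict with hypothetical keys (`name`, `accessible`, `shared`, `free_gb`); the real implementation would work against vSphere API objects:

```python
def get_datastore_for_provisioning(datastores, required_gb):
    """Sketch of the proposed selection: keep only accessible datastores
    with enough free space, and prefer shared ones over host-local ones."""
    usable = [
        ds for ds in datastores
        if ds["accessible"] and ds["free_gb"] >= required_gb
    ]
    if not usable:
        raise RuntimeError("no accessible datastore with enough free space")
    # Prefer shared datastores so cluster features such as DRS keep working.
    shared = [ds for ds in usable if ds["shared"]]
    return max(shared or usable, key=lambda ds: ds["free_gb"])
```

Picking the datastore with the most free space among the preferred set is one possible tie-breaker; the bug discussion leaves that choice open.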

Revision history for this message
Shanshi Shi (ssshi) wrote :

I think it would be nice to let the scheduler handle all the "intelligent" work and have the virt driver simply report its capabilities honestly to the scheduler.
Here are the benefits:
 1. It is more convenient to add a custom filter/weigher than to tweak the vsphere driver.
 2. The scheduler has better access to the database, which could supply the datastore_regex values.
 3. Personally, I think the scheduler module is easier and more flexible to work with than a virt driver.

For the scheduler part, a custom filter/weigher could be used to pick a datastore, but we need a good place to pass it to the virt driver. The host manager should also be modified to deal with the capability message change.

For the virt driver part, instead of reporting only one datastore, it should report all of them; an additional capability field might be preferable for now. The `spawn` method might also need to be modified to make use of the datastore information.
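The split described above could be illustrated as follows. `report_capabilities` stands in for the virt-driver side and `weigh_host` for a scheduler-side weigher; all function names and dict keys are hypothetical, not actual nova APIs:

```python
def report_capabilities(datastores):
    """Virt-driver side: report every datastore instead of only the first."""
    return {
        "datastores": [
            {"name": ds["name"],
             "free_gb": ds["free_gb"],
             "accessible": ds["accessible"]}
            for ds in datastores
        ]
    }


def weigh_host(capabilities, required_gb):
    """Scheduler side: weight a host by its largest usable datastore's
    free space; return None if no datastore can hold the image."""
    usable = [
        ds["free_gb"] for ds in capabilities["datastores"]
        if ds["accessible"] and ds["free_gb"] >= required_gb
    ]
    return max(usable) if usable else None
```

The scheduler would then rank hosts by this weight and pass the chosen datastore down to the driver's `spawn` call through whatever channel the capability-message change provides.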

As for usage reporting, I'm waiting for more details from this bp https://blueprints.launchpad.net/nova/+spec/accurate-capacity-of-clusters-for-scheduler .

Revision history for this message
Shawn Hartsock (hartsock) wrote :

I like the idea of moving this logic up to the scheduler, and I'd like to invest some time in researching it. It might make better architectural sense for OpenStack, especially since we were discussing adding user-customizable behaviors to how OpenStack and VMware interoperate.

Revision history for this message
Shawn Hartsock (hartsock) wrote :

After thinking about this a little, I'd like this to be broken up into two phases. Phase 1: fix this bug. Phase 2: back out the code for this bug and support a filter/weigher feature in the scheduler.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/29552

Changed in nova:
assignee: Shawn Hartsock (hartsock) → Kiran Kumar Vaddi (kiran-kumar-vaddi)
status: Confirmed → In Progress
Changed in nova:
milestone: none → havana-2
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/30628

Changed in nova:
milestone: havana-2 → none
milestone: none → havana-3
Thierry Carrez (ttx)
Changed in nova:
milestone: havana-3 → havana-rc1
Changed in nova:
milestone: havana-rc1 → none
Changed in nova:
importance: Wishlist → High
importance: High → Wishlist
importance: Wishlist → High
description: updated
Changed in nova:
importance: High → Wishlist
Revision history for this message
Shawn Hartsock (hartsock) wrote :

Is this still a valid concern now that this change ( https://review.openstack.org/#/c/52815/1/nova/virt/vmwareapi/vm_util.py ) has merged?

Changed in nova:
status: In Progress → Incomplete
status: Incomplete → Opinion
Revision history for this message
Shawn Hartsock (hartsock) wrote :

I'm marking this "Opinion" because there are still some things in the patches associated with this bug that might come in handy in discussion, but I think the problem is mostly taken care of by other bug fixes at this point. Re-open and discuss if I'm mistaken. Thanks.
