[SRU] multi scheduler does not handle capabilities updates correctly

Bug #1000403 reported by Armando Migliaccio on 2012-05-16
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
OpenStack Compute (nova) |  | Low | Armando Migliaccio |
Essex |  | Undecided | Unassigned |
nova (Ubuntu) |  | Undecided | Unassigned |
Precise |  | Undecided | Unassigned |

Bug Description

The multi scheduler allows the routing of VM-related ops to the compute_driver and the routing of Volume-related ops to the volume_driver.

Unfortunately, the Multi Scheduler does not handle capabilities updates correctly: it does not pass the information on to the child schedulers. So when a request reaches the FilterScheduler (which is the default compute scheduler), the latter has no host capabilities, which causes the ComputeFilter to fail to choose a specific host. This occurs particularly when host selection is based on the flavor's extra specs.

This happens on Folsom trunk.
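
The behaviour described above can be illustrated with a minimal, self-contained sketch. It is not the nova code: ChildScheduler, BrokenMultiScheduler, FixedMultiScheduler and hosts_matching are hypothetical stand-ins, and only the method name update_service_capabilities and the idea of a parent holding one child scheduler per service type come from this report. The broken variant simply models the reported symptom (the update never reaches the children).

```python
class ChildScheduler:
    """Hypothetical stand-in for a child scheduler such as the compute scheduler."""

    def __init__(self):
        # Maps (service_name, host) -> capabilities reported by that host.
        self.capabilities = {}

    def update_service_capabilities(self, service_name, host, capabilities):
        self.capabilities[(service_name, host)] = dict(capabilities)

    def hosts_matching(self, service_name, extra_specs):
        """Return hosts whose known capabilities satisfy every extra spec."""
        return sorted(
            host
            for (svc, host), caps in self.capabilities.items()
            if svc == service_name
            and all(caps.get(key) == value for key, value in extra_specs.items())
        )


class BrokenMultiScheduler:
    """Models the reported symptom: children never see capability updates."""

    def __init__(self):
        self.drivers = {"compute": ChildScheduler(), "volume": ChildScheduler()}
        self.capabilities = {}

    def update_service_capabilities(self, service_name, host, capabilities):
        # The update is recorded on the parent only (assumed for illustration).
        self.capabilities[(service_name, host)] = dict(capabilities)


class FixedMultiScheduler(BrokenMultiScheduler):
    """Forwards every capability update to each child scheduler it holds."""

    def update_service_capabilities(self, service_name, host, capabilities):
        for driver in self.drivers.values():
            driver.update_service_capabilities(service_name, host, capabilities)


if __name__ == "__main__":
    for scheduler in (BrokenMultiScheduler(), FixedMultiScheduler()):
        scheduler.update_service_capabilities("compute", "host1", {"gpu": "yes"})
        matches = scheduler.drivers["compute"].hosts_matching("compute", {"gpu": "yes"})
        print(type(scheduler).__name__, matches)
```

With the broken variant the compute child has no capabilities to filter on, so a request that depends on extra specs finds no host; with the forwarding variant the same request can find host1.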

Vish Ishaya (vishvananda) wrote :

We probably won't need the multi-scheduler once the move to cinder is finished, but if you see an easy way to fix it, please do so.

Changed in nova:
importance: Undecided → Low
status: New → Triaged

Vish,

I'll propose a change shortly. The fix is fairly straightforward.

Changed in nova:
assignee: nobody → Armando Migliaccio (armando-migliaccio)
Changed in nova:
status: Triaged → In Progress

Reviewed: https://review.openstack.org/7526
Committed: http://github.com/openstack/nova/commit/3ae69ebcc5febd79c6cfdb8e37ce34a2fe660655
Submitter: Jenkins
Branch: master

commit 3ae69ebcc5febd79c6cfdb8e37ce34a2fe660655
Author: Armando Migliaccio <email address hidden>
Date: Thu May 17 01:54:53 2012 +0100

    Fix Multi_Scheduler to process host capabilities

    To fix bug #1000403, make sure that each driver held by the
    Multi Scheduler gets called during update_service_capabilities.

    Change-Id: Iee8141f1a6dcfa24101640626d209d2d65776339

Changed in nova:
status: In Progress → Fix Committed
Devin Carlen (devcamcar) on 2012-05-22
Changed in nova:
milestone: none → folsom-1
Thierry Carrez (ttx) on 2012-05-23
Changed in nova:
status: Fix Committed → Fix Released

Reviewed: https://review.openstack.org/7592
Committed: http://github.com/openstack/nova/commit/4e423cd558e2f36ebe6553a9df1a32fca93b0428
Submitter: Jenkins
Branch: stable/essex

commit 4e423cd558e2f36ebe6553a9df1a32fca93b0428
Author: Armando Migliaccio <email address hidden>
Date: Thu May 17 01:54:53 2012 +0100

    Fix Multi_Scheduler to process host capabilities

    To fix bug #1000403, make sure that each driver held by the
    Multi Scheduler gets called during update_service_capabilities.

    Change-Id: If8a942317b9b26dd101c5bcf502aab7296608abd

tags: added: in-stable-essex
Chuck Short (zulcss) on 2012-06-07
summary: - multi scheduler does not handle capabilities updates correctly
+ [SRU] multi scheduler does not handle capabilities updates correctly
Chuck Short (zulcss) wrote :

** Impact **

The multi scheduler allows the routing of VM-related ops to the compute_driver and the routing of Volume-related ops to the volume_driver.

Unfortunately, the Multi Scheduler does not handle capabilities updates correctly: it does not pass the information on to the child schedulers. So when a request reaches the FilterScheduler (which is the default compute scheduler), the latter has no host capabilities, which causes the ComputeFilter to fail to choose a specific host. This occurs particularly when host selection is based on the flavor's extra specs.

** Development Fix **

This has been addressed in: https://review.openstack.org/7526 and fixed in quantal

** Stable fix **

This has been addressed in: https://review.openstack.org/7592

** Test Case **

Run the nova.tests.scheduler tests

** Regression **

Minimal. This has been tested in openstack-ci and has passed the tests.

Chuck Short (zulcss) on 2012-06-08
Changed in nova (Ubuntu Precise):
milestone: none → ubuntu-12.04.1
Robie Basak (racb) on 2012-06-12
Changed in nova (Ubuntu):
status: New → Triaged
Changed in nova (Ubuntu Precise):
status: New → Triaged

Hello Armando, or anyone else affected,

Accepted nova into precise-proposed. The package will build now and be available in a few hours. Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users. If this package fixes the bug for you, please change the bug tag from verification-needed to verification-done. If it does not, change the tag to verification-failed. In either case, details of your testing will help us make a better decision. Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nova (Ubuntu Precise):
status: Triaged → Fix Committed
tags: added: verification-needed

I followed instructions on:

https://wiki.ubuntu.com/QATeam/PerformingSRUVerification

and I get the following package:

dpkg -l nova-scheduler | cat
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name Version Description
+++-===============================-========================================-==========================================================================
ii nova-scheduler 2012.1-0ubuntu2.3 OpenStack Compute - virtual machine scheduler

however:

/usr/lib/python2.7/dist-packages/nova/scheduler/multi.py

does not seem to have the fix.
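
One way to make this kind of check repeatable is a small script that looks for the method definition in the installed file. This is only a sketch: it assumes the fix shows up as a method named update_service_capabilities defined in multi.py (the name is taken from the commit message above), and defines_method and file_defines_method are hypothetical helper names.

```python
import re


def defines_method(source_text, method_name):
    """Return True if source_text contains a 'def <method_name>(' line."""
    pattern = r"^\s*def\s+" + re.escape(method_name) + r"\s*\("
    return re.search(pattern, source_text, flags=re.MULTILINE) is not None


def file_defines_method(path, method_name):
    """Read the file at path and report whether it defines method_name."""
    with open(path, encoding="utf-8") as handle:
        return defines_method(handle.read(), method_name)


if __name__ == "__main__":
    # Path copied from the comment above; adjust for other installs.
    path = "/usr/lib/python2.7/dist-packages/nova/scheduler/multi.py"
    try:
        print(file_defines_method(path, "update_service_capabilities"))
    except OSError as exc:
        print("could not read", path, "-", exc)
```

A True result only shows that the method is defined in the file, not that the installed package behaves correctly, so it complements rather than replaces running the scheduler tests.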

tags: added: verification-failed
removed: verification-needed
Chuck Short (zulcss) wrote :

Please make sure you have proposed enabled.

https://wiki.ubuntu.com/Testing/EnableProposed

Hi Chuck, that'll do ;)

Thanks, verified successfully.

tags: added: verification-done
removed: verification-failed
Adam Gandelman (gandelman-a) wrote :

Please find the attached Jenkins job results from the Ubuntu Server Team's CI
infrastructure. As part of the verification process for this bug, Nova has
been deployed and configured across multiple nodes using precise-proposed as
an installation source. After successful bring-up and configuration of the
cluster, a number of exercises and smoke tests have been invoked to ensure the
updated package did not introduce any regressions. A number of test iterations
were carried out to catch any possible transient errors.

Note the list of installed packages at the top and bottom of the report.

For records of upstream test coverage of this update, please see the
Jenkins links in the comments of the relevant upstream code-review:

https://review.openstack.org/7592

As per the provisional Micro Release Exception granted to this package by
the Technical Board, we hope this contributes toward verification of this
update.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nova - 2012.1+stable~20120612-3ee026e-0ubuntu1

---------------
nova (2012.1+stable~20120612-3ee026e-0ubuntu1) precise-proposed; urgency=low

  * New upstream snapshot. (LP: #1010473)
  * Dropped, superseded by new snapshot:
    - debian/patches/upstream/0001-fix-bug-where-nova-ignores-glance-host-in-imageref.patch
    - debian/patches/upstream/0002-Stop-libvirt-test-from-deleting-instances-dir.patch
    - debian/patches/upstream/0003-Allow-unprivileged-RADOS-users-to-access-rbd-volumes.patch
    - debian/patches/upstream/0004-Fixed-bug-962840-added-a-test-case.patch
    - debian/patches/upstream/0005-Populate-image-properties-with-project_id-again.patch
    - debian/patches/upstream/0006-Use-project_id-in-ec2.cloud._format_image.patc
    - debian/patches/CVE-2012-2101.patch
    - debian/patches/CVE-2012-2654.patch
  * Resynchronize with stable/essex:
    - 3ee026e Only invoke .lower() on non-None protocols. (LP: #1010514)
    - f0a9f47 Create a utf8 version of the dns_domains table. (LP: #993663)
    - 84a43e1 Report memory correctly on Xen. (LP: #997014)
    - 8c72924 Add libvirt get_console_output tests: pty and file. (LP: #990237)
    - 4e423cd Fix Multi_Scheduler to process host capabilities. (LP: #1000403)
    - 4aea7f1 Nail pep8 dependencies to 1.0.1
    - 2b3bbc4 handle updated qemu-img info output. (LP: #1000261)
    - 2d7d51c Fix type of snapshot_id column to match db. (LP: #962615)
    - ec70c69 Generate a Changelog for Nova
    - e5e890f Fix nova.tests.test_nova_rootwrap on Fedora 17. (LP: #992916)
    - 9e9a554 Ec2 handle strings with "0x" (LP: #983206)
    - 26dc6b7 QuantumManager will start dnsmasq during startup. Fixes (LP: #977759)
    - 7028d66 Introduced flag base_dir_name. (LP: #973194)
    - 76b525a Get unit tests functional in OS X.
    - facb936 Update KillFilter to handle 'deleted' exe's. (LP: #967931)
    - 1209af4 Checks if value is string or not before decode. (LP: #952176)
    - 1209af4 Fix timeout in EC2 CloudController.create_image(). (LP: #989764)
    - 108e74b Re-add console_log from console_console_output(). (LP: #987335)
    - 48a0768 Don't leak RPC connections on timeouts or other exceptions. (LP: #968843)
    - 7c64de9 Cloudpipe tap vpn not always working. (LP: #975043)
    - 5ab5051 add libvirt_inject_key flag fix (LP: #971640)
    - 6c68ef5 Xen: Pass session to destroy_vdi. (LP: #988615)
    - 015744e Delete fixed_ips when network is deleted. (LP: #754900)
  * Add debian/scripts/changelog.sh to help generate the changelog.
  * Add debian/nova-common.docs:
    - Include changelog and README.rst
  * debian/rules: Generate a tarball from git snapshot.
  * debian/patches/fix-pep8-errors.patch: Fix pep8 errors due to pep8 upstream
    migration.
 -- Chuck Short <email address hidden> Tue, 05 Jun 2012 09:50:59 -0400

Changed in nova (Ubuntu Precise):
status: Fix Committed → Fix Released
Chuck Short (zulcss) on 2012-07-31
Changed in nova (Ubuntu):
status: Triaged → Fix Released
Thierry Carrez (ttx) on 2012-09-27
Changed in nova:
milestone: folsom-1 → 2012.2