Inconsistent value for vcpu_used

Bug #1729621 reported by Maciej Jozefczyk on 2017-11-02
This bug affects 6 people
Affects                    Importance  Assigned to
OpenStack Compute (nova)   High        Maciej Jozefczyk
  Pike                     High        Radoslav Gerganov
  Queens                   High        Radoslav Gerganov
  Rocky                    High        Radoslav Gerganov

Bug Description

Description
===========

Nova updates hypervisor resources using the function update_available_resource() in nova/compute/resource_tracker.py.

In the case of shut-down instances this can produce inconsistent values for resources such as vcpu_used.

The resources are taken from self.driver.get_available_resource():
https://github.com/openstack/nova/blob/f974e3c3566f379211d7fdc790d07b5680925584/nova/compute/resource_tracker.py#L617
https://github.com/openstack/nova/blob/f974e3c3566f379211d7fdc790d07b5680925584/nova/virt/libvirt/driver.py#L5766

This function calculates allocated vCPUs via _get_vcpu_total():
https://github.com/openstack/nova/blob/f974e3c3566f379211d7fdc790d07b5680925584/nova/virt/libvirt/driver.py#L5352

As we can see, _get_vcpu_total() calls *self._host.list_guests()* without the "only_running=False" parameter, so it does not take shut-down instances into account.
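The effect of that default can be illustrated with a minimal sketch (this is not nova code; the Guest/Host classes below are stand-ins that only mimic the only_running behaviour of list_guests()):

```python
# Minimal sketch of how a default only_running=True listing
# undercounts the vCPUs of shut-off guests.
RUNNING, SHUTOFF = "running", "shutoff"

class Guest:
    def __init__(self, vcpus, state):
        self.vcpus = vcpus
        self.state = state

class Host:
    def __init__(self, guests):
        self._guests = guests

    def list_guests(self, only_running=True):
        # Mirrors the default that hides shut-off domains.
        if only_running:
            return [g for g in self._guests if g.state == RUNNING]
        return list(self._guests)

host = Host([Guest(4, RUNNING), Guest(2, SHUTOFF)])

# Counting over the default listing misses the shut-off guest:
print(sum(g.vcpus for g in host.list_guests()))                    # 4
# Passing only_running=False yields the stable, expected total:
print(sum(g.vcpus for g in host.list_guests(only_running=False)))  # 6
```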

At the end of the resource update process the function _update_available_resource() is called:
> /opt/stack/nova/nova/compute/resource_tracker.py(733)

 677 @utils.synchronized(COMPUTE_RESOURCE_SEMAPHORE)
 678 def _update_available_resource(self, context, resources):
 679
 681 # initialize the compute node object, creating it
 682 # if it does not already exist.
 683 self._init_compute_node(context, resources)

It initializes the compute node object with resources calculated without shut-down instances. If the compute node object already exists, it *UPDATES* its fields - *so for a short time nova-api reports different resource values than are actually in use.*

 731 # update the compute_node
 732 self._update(context, cn)
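The two writes can be sketched as follows (a hypothetical model, not nova code, although the method names mirror the resource tracker; the 117/120 values are from the reproduction below):

```python
# Sketch of the two DB writes per update_available_resource() run.
# The first write uses driver-reported resources that exclude
# shut-down guests; only the second write adds them back.
class FakeDB:
    def __init__(self):
        self.vcpus_used = None

class Tracker:
    def __init__(self, db):
        self.db = db

    def _update(self, vcpus_used):
        self.db.vcpus_used = vcpus_used      # persisted immediately

    def _init_compute_node(self, driver_vcpus_used):
        # First write: running guests only (117 in this bug report).
        self._update(driver_vcpus_used)

    def update_available_resource(self, running, shut_down):
        self._init_compute_node(running)
        # ... instance claims are re-applied here, which can take
        # seconds on a loaded hypervisor; the DB is stale meanwhile ...
        self._update(running + shut_down)    # final, correct write

db = FakeDB()
t = Tracker(db)
t.update_available_resource(running=117, shut_down=3)
print(db.vcpus_used)  # 120 after the run; 117 during the window
```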

The inconsistency is automatically fixed later in the same code path:
https://github.com/openstack/nova/blob/f974e3c3566f379211d7fdc790d07b5680925584/nova/compute/resource_tracker.py#L709

But on heavily loaded hypervisors (e.g. 100 active instances and 30 shut-down instances) this leaves wrong information in the nova database for about 4-5 seconds (in my use case). That can trigger other issues, such as spawning on an already full hypervisor, because the scheduler has wrong information about hypervisor usage.

Steps to reproduce
==================

1) Start devstack
2) Create 120 instances
3) Stop some instances
4) Watch the values blink in nova hypervisor-show:
nova hypervisor-show e6dfc16b-7914-48fb-a235-6fe3a41bb6db

Expected result
===============
The returned values should remain stable for the duration of the test.

Actual result
=============
while true; do echo -n "$(date) "; echo "select hypervisor_hostname, vcpus_used from compute_nodes where hypervisor_hostname='example.compute.node.com';" | mysql nova_cell1; sleep 0.3; done

Thu Nov 2 14:50:09 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:10 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:10 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:10 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:11 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:11 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:11 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:11 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:12 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:12 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:12 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:13 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:13 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:13 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:14 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:14 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:14 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:15 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:15 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:15 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:16 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:16 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:16 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:17 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:17 UTC 2017 example.compute.node.com 117
Thu Nov 2 14:50:17 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:17 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:18 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:18 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:18 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:19 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:19 UTC 2017 example.compute.node.com 120
Thu Nov 2 14:50:19 UTC 2017 example.compute.node.com 120

Wrong values were stored in the nova DB for about 5 seconds. During this time nova-scheduler could pick this host.

Environment
===========
Devstack master (f974e3c3566f379211d7fdc790d07b5680925584).
Releases at least as far back as Newton are certainly affected.

description: updated

I see the following possible solutions:

1. Change self._init_compute_node() in _update_available_resource() so that it does not call self._update(), perhaps by introducing a new boolean parameter in the _init_compute_node() args to skip the self._update() call.

2. Add some kind of DB transaction (I don't think this is a good idea).

3. Modify the calls to self._host.list_guests() to list all instances (including shut-down ones) - but this would almost certainly break other things.

4. Reorganize the code (?)
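Option 1 could look roughly like this. The update_db flag is hypothetical (no such parameter exists in nova); the idea is that _init_compute_node() still builds the ComputeNode object but defers the DB write until the resources also account for shut-down instances:

```python
# Hedged sketch of option 1: skip the intermediate DB write.
class Tracker:
    def __init__(self):
        self.db_writes = []          # stand-in for the compute_nodes table

    def _update(self, resources):
        self.db_writes.append(dict(resources))

    def _init_compute_node(self, resources, update_db=True):
        # Hypothetical flag: build the ComputeNode object, but only
        # write it to the DB when explicitly asked to.
        self.cn = dict(resources)
        if update_db:
            self._update(self.cn)

    def _update_available_resource(self, resources, shut_down_vcpus):
        # Skip the intermediate write that causes the blinking values.
        self._init_compute_node(resources, update_db=False)
        # ... instance claims re-applied here, adding shut-down guests ...
        self.cn["vcpus_used"] += shut_down_vcpus
        self._update(self.cn)        # single, consistent write

t = Tracker()
t._update_available_resource({"vcpus_used": 117}, shut_down_vcpus=3)
print(t.db_writes)  # [{'vcpus_used': 120}] -- only the final value hits the DB
```

With this shape the compute_nodes row is only ever written once per periodic run, so the scheduler never observes the intermediate running-only total.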

Matt Riedemann (mriedem) on 2017-11-13
tags: added: resource-tracker
Changed in nova:
assignee: nobody → Maciej Jozefczyk (maciej.jozefczyk)
status: New → In Progress
Jay Pipes (jaypipes) wrote :

Please note that Pike and Ocata schedulers are not affected by this issue, since starting in Ocata, we stopped using the ComputeNode.vcpus_used value from the CoreFilter (which was deprecated/removed in Ocata) and instead use the (accurate) information from the placement API service about allocated VCPU resources for instances. Placement doesn't know or care whether an instance is shut down -- only whether the instance is "on" the host.

Matt Riedemann (mriedem) wrote :

To be clear, the CoreFilter isn't enabled by default, but Ram and Disk filters are.

The CoreFilter was not deprecated in Ocata:

https://github.com/openstack/nova/blob/stable/ocata/nova/scheduler/filters/core_filter.py

In fact it's still not deprecated:

https://github.com/openstack/nova/blob/master/nova/scheduler/filters/core_filter.py

Because the CachingScheduler will need to rely on it since the CachingScheduler isn't using Placement, and while the CachingScheduler itself was deprecated in Pike:

https://github.com/openstack/nova/commit/d48bba18a7cebc57e63f5b2c5a1e939654de0883

We can't really remove it until we have a migration path for people using the CachingScheduler to move over to the FilterScheduler and populate Placement with the allocations that the FilterScheduler would have been creating in Pike (remember that once all computes are upgraded to Pike+, the ResourceTracker in nova-compute stops reporting allocations to Placement).

Changed in nova:
assignee: Maciej Jozefczyk (maciej.jozefczyk) → Minho Ban (mhban)

Working on patch

Changed in nova:
assignee: Minho Ban (mhban) → Maciej Jozefczyk (maciej.jozefczyk)
Matt Riedemann (mriedem) wrote :

There are more details in duplicate bug 1739349.

Changed in nova:
importance: Undecided → High

Change abandoned by Maciej Jozefczyk (<email address hidden>) on branch: master
Review: https://review.openstack.org/532924

Changed in nova:
assignee: Maciej Jozefczyk (maciej.jozefczyk) → Eric Fried (efried)
Eric Fried (efried) on 2018-08-06
Changed in nova:
assignee: Eric Fried (efried) → Maciej Jozefczyk (maciej.jozefczyk)

Reviewed: https://review.openstack.org/520024
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=c9b74bcfa09d11c2046ce1bfb6dd8463b3a2f3b0
Submitter: Zuul
Branch: master

commit c9b74bcfa09d11c2046ce1bfb6dd8463b3a2f3b0
Author: Maciej Józefczyk <email address hidden>
Date: Thu Nov 16 14:49:42 2017 +0100

    Update resources once in update_available_resource

    This change ensures that resources are updated only once per
    update_available_resource() call.

    Compute resources were previously updated during host
    object initialization and at the end of
    update_available_resource(). It could cause inconsistencies
    in resource tracking between compute host and DB for couple
    of second when final _update() at the end of
    update_available_resource() is being called.

    For example: nova-api shows that host uses 10GB of RAM, but
    in fact its 12GB because DB doesn't have resources that belongs
    to shutdown instance.

    Because of that fact nova-scheduler (CachingScheduler) could
    choose (based on imcomplete information) host which is already full.

    For more informations please see realted bug: #1729621

    Change-Id: I120a98cc4c11772f24099081ef3ac44a50daf71d
    Closes-Bug: #1729621

Changed in nova:
status: In Progress → Fix Released
Radoslav Gerganov (rgerganov) wrote :

This bug affects *all* stable releases because there is a race not only on vcpus_used but also on the compute stats which are used by the scheduler. See bug #1798806 for more details.

I will backport the fix to the stable releases.

Reviewed: https://review.openstack.org/612293
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=732b0571cc27a8a1aba30f44c18317f13325aad3
Submitter: Zuul
Branch: stable/rocky

commit 732b0571cc27a8a1aba30f44c18317f13325aad3
Author: Maciej Józefczyk <email address hidden>
Date: Thu Nov 16 14:49:42 2017 +0100

    Update resources once in update_available_resource

    This change ensures that resources are updated only once per
    update_available_resource() call.

    Compute resources were previously updated during host
    object initialization and at the end of
    update_available_resource(). It could cause inconsistencies
    in resource tracking between compute host and DB for couple
    of second when final _update() at the end of
    update_available_resource() is being called.

    For example: nova-api shows that host uses 10GB of RAM, but
    in fact its 12GB because DB doesn't have resources that belongs
    to shutdown instance.

    Because of that fact nova-scheduler (CachingScheduler) could
    choose (based on imcomplete information) host which is already full.

    For more informations please see realted bug: #1729621

    Change-Id: I120a98cc4c11772f24099081ef3ac44a50daf71d
    Closes-Bug: #1729621
    (cherry picked from commit c9b74bcfa09d11c2046ce1bfb6dd8463b3a2f3b0)

tags: added: in-stable-rocky

This issue was fixed in the openstack/nova 18.1.0 release.

This issue was fixed in the openstack/nova 19.0.0.0rc1 release candidate.

Reviewed: https://review.openstack.org/612294
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=36d93675d9a6bf903ed64c216243c74a639a2087
Submitter: Zuul
Branch: stable/queens

commit 36d93675d9a6bf903ed64c216243c74a639a2087
Author: Maciej Józefczyk <email address hidden>
Date: Thu Nov 16 14:49:42 2017 +0100

    Update resources once in update_available_resource

    This change ensures that resources are updated only once per
    update_available_resource() call.

    Compute resources were previously updated during host
    object initialization and at the end of
    update_available_resource(). It could cause inconsistencies
    in resource tracking between compute host and DB for couple
    of second when final _update() at the end of
    update_available_resource() is being called.

    For example: nova-api shows that host uses 10GB of RAM, but
    in fact its 12GB because DB doesn't have resources that belongs
    to shutdown instance.

    Because of that fact nova-scheduler (CachingScheduler) could
    choose (based on imcomplete information) host which is already full.

    For more informations please see realted bug: #1729621

    Change-Id: I120a98cc4c11772f24099081ef3ac44a50daf71d
    Closes-Bug: #1729621
    (cherry picked from commit c9b74bcfa09d11c2046ce1bfb6dd8463b3a2f3b0)

tags: added: in-stable-queens

Change abandoned by Radoslav Gerganov (<email address hidden>) on branch: stable/pike
Review: https://review.openstack.org/612295
Reason: fira

This issue was fixed in the openstack/nova 17.0.11 release.

Matt Riedemann (mriedem) on 2019-08-21
no longer affects: nova/ocata