[SRU] confirm resize fails with CPUUnpinningInvalid

Bug #1944759 reported by Balazs Gibizer
12
This bug affects 1 person
Affects Status Importance Assigned to Milestone
OpenStack Compute (nova)
Fix Released
Medium
Balazs Gibizer
Ussuri
New
Undecided
Unassigned
Ubuntu Cloud Archive
New
Undecided
Unassigned
Ussuri
New
Undecided
Unassigned
nova (Ubuntu)
New
Undecided
Unassigned
Focal
New
Undecided
Unassigned

Bug Description

* SRU DESCRIPTION BELOW *

Nova has a race condition between resize_instance() compute manager call and the update_available_resources periodic job. If they overlap at the right place, when resize_instance calls finish_resize, then periodic job will not track the migration nor the instance on the source host. It causes that the PCPU allocation on the source host is dropped in the resource tracker (not in placement). Then when the resize is confirmed nova tries to free the pinned cpus again on the source host and fails with CPUUnpinningInvalid as they are already freed.

I've pushed a reproduction test: https://review.opendev.org/c/openstack/nova/+/810763

It is reproducible at least on master, xena, wallaby, and victoria

===============
SRU DESCRIPTION
===============

[Impact]

Due to a race condition the tracking of pinned CPU resources can go off-sync causing "No valid host" errors while being unable to create new instances with CPU pinning, as the previous pinned CPUs were not marked as freed.

Part of the reason is addressed in the fix for LP#1953359 where a migration context is not pointing to the proper node during the race condition window, resulting in a CPUPinningInvalid error. This fix complements LP#1953359 by addressing the improper tracking of resources that happens only when the resource tracker periodic job runs in the source node while the flavor registered corresponds to the one of the destination. That is solved by setting the instance.old_flavor so the CPU pinning resources are tracked properly.

[Test case]

The test case for this was already implemented on non-live functional tests upstream:

in nova/tests/functional/libvirt/test_numa_servers.py:
- test_resize_dedicated_policy_race_on_dest_bug_1953359
- test_resize_confirm_bug_1944759
- test_resize_revert_bug_1944759

As this is a race condition it is very difficult to validate, even upstream, so the functional tests mock certain parts of the code to simulate the entire scope of the workflow. It is a non-live functional test, so it is more akin to a broader unit test.

[Regression Potential]

The code is considered stable today in newer releases and the scope of the code affected is fairly limited. Given that it is a race condition that it is difficult to validate, despite the non-live functional tests, the regression potential is moderate.

[Other Info]

None.

Changed in nova:
assignee: nobody → Balazs Gibizer (balazs-gibizer)
importance: Undecided → Medium
description: updated
tags: added: compute numa race-condition resize
description: updated
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/810763

Revision history for this message
Sylvain Bauza (sylvain-bauza) wrote : Re: confirm resize fails with CPUUnpinningInvalid

Given the reproduction test works, validating this bug.

Changed in nova:
status: New → Confirmed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/810909

Changed in nova:
status: Confirmed → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/xena)

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/810910

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/xena
Review: https://review.opendev.org/c/openstack/nova/+/810911

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/wallaby)

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/810912

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/wallaby
Review: https://review.opendev.org/c/openstack/nova/+/810913

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/victoria)

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/810914

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/victoria
Review: https://review.opendev.org/c/openstack/nova/+/810915

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.opendev.org/c/openstack/nova/+/810763
Committed: https://opendev.org/openstack/nova/commit/3e4e4489b7a6e9cdefcc6ff02ed99a0a70420fca
Submitter: "Zuul (22348)"
Branch: master

commit 3e4e4489b7a6e9cdefcc6ff02ed99a0a70420fca
Author: Balazs Gibizer <email address hidden>
Date: Thu Sep 23 20:05:45 2021 +0200

    Reproduce bug 1944759

    Add functional tests to reproduce the race between resize_instance()
    and update_available_resources().

    Related-Bug: #1944759

    Change-Id: Icb7e3379248fe00f9a94f9860181b5de44902379

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/nova/+/810909
Committed: https://opendev.org/openstack/nova/commit/b841e553214be9a732703e2dfed6c97698ef9b71
Submitter: "Zuul (22348)"
Branch: master

commit b841e553214be9a732703e2dfed6c97698ef9b71
Author: Balazs Gibizer <email address hidden>
Date: Fri Sep 24 15:17:28 2021 +0200

    Store old_flavor already on source host during resize

    During resize, on the source host, in resize_instance(), the instance.host
    and .node is updated to point to the destination host. This indicates to
    the source host's resource tracker that the allocation of this instance
    does not need to be tracked as an instance but as an outbound migration
    instead. However for the source host's resource tracker to do that it,
    needs to use the instance.old_flavor. Unfortunately the
    instance.old_flavor is only set during finish_resize() on the
    destination host. (resize_instance cast to the finish_resize). So it is
    possible that a running resize_instance() set the instance.host to point
    to the destination and then before the finish_resize could set the
    old_flavor an update_available_resources periodic runs on the source
    host. This causes that the allocation of this instance is not tracked as
    an instance as the instance.host point to the destination but it is not
    tracked as a migration either as the instance.old_flavor is not yet set.
    So the allocation on the source host is simply dropped by the periodic
    job.

    When such migration is confirmed the confirm_resize() tries to drop
    the same resource allocation again but fails as the pinned CPUs of the
    instance already freed.

    When such migration is reverted instead, then revert succeeds but the
    source host resource allocation will not contain the resource allocation
    of the instance until the next update_available_resources periodic runs
    and corrects it.

    This does not affect resources tracked exclusively in placement (e.g.
    VCPU, MEMORY_MB, DISK_GB) but it does affect NUMA related resource that
    are still tracked in the resource tracker (e.g. huge pages, pinned
    CPUs).

    This patch moves the instance.old_flavor setting to the source node to
    the same transaction that sets the instance.host to point to the
    destination host. Hence solving the race condition.

    Change-Id: Ic0d6c59147abe5e094e8f13e0d3523b178daeba9
    Closes-Bug: #1944759

Changed in nova:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/xena)

Reviewed: https://review.opendev.org/c/openstack/nova/+/810910
Committed: https://opendev.org/openstack/nova/commit/e6c6880465824f1e327a54143f32bb5a5816ff6c
Submitter: "Zuul (22348)"
Branch: stable/xena

commit e6c6880465824f1e327a54143f32bb5a5816ff6c
Author: Balazs Gibizer <email address hidden>
Date: Thu Sep 23 20:05:45 2021 +0200

    Reproduce bug 1944759

    Add functional tests to reproduce the race between resize_instance()
    and update_available_resources().

    Related-Bug: #1944759

    Change-Id: Icb7e3379248fe00f9a94f9860181b5de44902379
    (cherry picked from commit 3e4e4489b7a6e9cdefcc6ff02ed99a0a70420fca)

tags: added: in-stable-xena
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/nova/+/810911
Committed: https://opendev.org/openstack/nova/commit/d4edcd62bae44c01885268a6cf7b7fae92617060
Submitter: "Zuul (22348)"
Branch: stable/xena

commit d4edcd62bae44c01885268a6cf7b7fae92617060
Author: Balazs Gibizer <email address hidden>
Date: Fri Sep 24 15:17:28 2021 +0200

    Store old_flavor already on source host during resize

    During resize, on the source host, in resize_instance(), the instance.host
    and .node is updated to point to the destination host. This indicates to
    the source host's resource tracker that the allocation of this instance
    does not need to be tracked as an instance but as an outbound migration
    instead. However for the source host's resource tracker to do that it,
    needs to use the instance.old_flavor. Unfortunately the
    instance.old_flavor is only set during finish_resize() on the
    destination host. (resize_instance cast to the finish_resize). So it is
    possible that a running resize_instance() set the instance.host to point
    to the destination and then before the finish_resize could set the
    old_flavor an update_available_resources periodic runs on the source
    host. This causes that the allocation of this instance is not tracked as
    an instance as the instance.host point to the destination but it is not
    tracked as a migration either as the instance.old_flavor is not yet set.
    So the allocation on the source host is simply dropped by the periodic
    job.

    When such migration is confirmed the confirm_resize() tries to drop
    the same resource allocation again but fails as the pinned CPUs of the
    instance already freed.

    When such migration is reverted instead, then revert succeeds but the
    source host resource allocation will not contain the resource allocation
    of the instance until the next update_available_resources periodic runs
    and corrects it.

    This does not affect resources tracked exclusively in placement (e.g.
    VCPU, MEMORY_MB, DISK_GB) but it does affect NUMA related resource that
    are still tracked in the resource tracker (e.g. huge pages, pinned
    CPUs).

    This patch moves the instance.old_flavor setting to the source node to
    the same transaction that sets the instance.host to point to the
    destination host. Hence solving the race condition.

    Change-Id: Ic0d6c59147abe5e094e8f13e0d3523b178daeba9
    Closes-Bug: #1944759
    (cherry picked from commit b841e553214be9a732703e2dfed6c97698ef9b71)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/wallaby)

Reviewed: https://review.opendev.org/c/openstack/nova/+/810912
Committed: https://opendev.org/openstack/nova/commit/140ae45d98dabd30aef5c0ac075346de4eabcea1
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit 140ae45d98dabd30aef5c0ac075346de4eabcea1
Author: Balazs Gibizer <email address hidden>
Date: Thu Sep 23 20:05:45 2021 +0200

    Reproduce bug 1944759

    Add functional tests to reproduce the race between resize_instance()
    and update_available_resources().

    Related-Bug: #1944759

    Change-Id: Icb7e3379248fe00f9a94f9860181b5de44902379
    (cherry picked from commit 3e4e4489b7a6e9cdefcc6ff02ed99a0a70420fca)
    (cherry picked from commit e6c6880465824f1e327a54143f32bb5a5816ff6c)

tags: added: in-stable-wallaby
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/nova/+/810913
Committed: https://opendev.org/openstack/nova/commit/c8b04d183f560a616a79577c4d4ae9465d03e541
Submitter: "Zuul (22348)"
Branch: stable/wallaby

commit c8b04d183f560a616a79577c4d4ae9465d03e541
Author: Balazs Gibizer <email address hidden>
Date: Fri Sep 24 15:17:28 2021 +0200

    Store old_flavor already on source host during resize

    During resize, on the source host, in resize_instance(), the instance.host
    and .node is updated to point to the destination host. This indicates to
    the source host's resource tracker that the allocation of this instance
    does not need to be tracked as an instance but as an outbound migration
    instead. However for the source host's resource tracker to do that it,
    needs to use the instance.old_flavor. Unfortunately the
    instance.old_flavor is only set during finish_resize() on the
    destination host. (resize_instance cast to the finish_resize). So it is
    possible that a running resize_instance() set the instance.host to point
    to the destination and then before the finish_resize could set the
    old_flavor an update_available_resources periodic runs on the source
    host. This causes that the allocation of this instance is not tracked as
    an instance as the instance.host point to the destination but it is not
    tracked as a migration either as the instance.old_flavor is not yet set.
    So the allocation on the source host is simply dropped by the periodic
    job.

    When such migration is confirmed the confirm_resize() tries to drop
    the same resource allocation again but fails as the pinned CPUs of the
    instance already freed.

    When such migration is reverted instead, then revert succeeds but the
    source host resource allocation will not contain the resource allocation
    of the instance until the next update_available_resources periodic runs
    and corrects it.

    This does not affect resources tracked exclusively in placement (e.g.
    VCPU, MEMORY_MB, DISK_GB) but it does affect NUMA related resource that
    are still tracked in the resource tracker (e.g. huge pages, pinned
    CPUs).

    This patch moves the instance.old_flavor setting to the source node to
    the same transaction that sets the instance.host to point to the
    destination host. Hence solving the race condition.

    Change-Id: Ic0d6c59147abe5e094e8f13e0d3523b178daeba9
    Closes-Bug: #1944759
    (cherry picked from commit b841e553214be9a732703e2dfed6c97698ef9b71)
    (cherry picked from commit d4edcd62bae44c01885268a6cf7b7fae92617060)

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/victoria)

Reviewed: https://review.opendev.org/c/openstack/nova/+/810914
Committed: https://opendev.org/openstack/nova/commit/0b1fa9b4ae01114299c5225d66e1f6eba25be43e
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 0b1fa9b4ae01114299c5225d66e1f6eba25be43e
Author: Balazs Gibizer <email address hidden>
Date: Thu Sep 23 20:05:45 2021 +0200

    Reproduce bug 1944759

    Add functional tests to reproduce the race between resize_instance()
    and update_available_resources().

    Related-Bug: #1944759

    Change-Id: Icb7e3379248fe00f9a94f9860181b5de44902379
    (cherry picked from commit 3e4e4489b7a6e9cdefcc6ff02ed99a0a70420fca)
    (cherry picked from commit e6c6880465824f1e327a54143f32bb5a5816ff6c)
    (cherry picked from commit 140ae45d98dabd30aef5c0ac075346de4eabcea1)

tags: added: in-stable-victoria
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.opendev.org/c/openstack/nova/+/810915
Committed: https://opendev.org/openstack/nova/commit/34e0c0205b1053d3bbe064177740aba654997fe0
Submitter: "Zuul (22348)"
Branch: stable/victoria

commit 34e0c0205b1053d3bbe064177740aba654997fe0
Author: Balazs Gibizer <email address hidden>
Date: Fri Sep 24 15:17:28 2021 +0200

    Store old_flavor already on source host during resize

    During resize, on the source host, in resize_instance(), the instance.host
    and .node is updated to point to the destination host. This indicates to
    the source host's resource tracker that the allocation of this instance
    does not need to be tracked as an instance but as an outbound migration
    instead. However for the source host's resource tracker to do that it,
    needs to use the instance.old_flavor. Unfortunately the
    instance.old_flavor is only set during finish_resize() on the
    destination host. (resize_instance cast to the finish_resize). So it is
    possible that a running resize_instance() set the instance.host to point
    to the destination and then before the finish_resize could set the
    old_flavor an update_available_resources periodic runs on the source
    host. This causes that the allocation of this instance is not tracked as
    an instance as the instance.host point to the destination but it is not
    tracked as a migration either as the instance.old_flavor is not yet set.
    So the allocation on the source host is simply dropped by the periodic
    job.

    When such migration is confirmed the confirm_resize() tries to drop
    the same resource allocation again but fails as the pinned CPUs of the
    instance already freed.

    When such migration is reverted instead, then revert succeeds but the
    source host resource allocation will not contain the resource allocation
    of the instance until the next update_available_resources periodic runs
    and corrects it.

    This does not affect resources tracked exclusively in placement (e.g.
    VCPU, MEMORY_MB, DISK_GB) but it does affect NUMA related resource that
    are still tracked in the resource tracker (e.g. huge pages, pinned
    CPUs).

    This patch moves the instance.old_flavor setting to the source node to
    the same transaction that sets the instance.host to point to the
    destination host. Hence solving the race condition.

    Change-Id: Ic0d6c59147abe5e094e8f13e0d3523b178daeba9
    Closes-Bug: #1944759
    (cherry picked from commit b841e553214be9a732703e2dfed6c97698ef9b71)
    (cherry picked from commit d4edcd62bae44c01885268a6cf7b7fae92617060)
    (cherry picked from commit c8b04d183f560a616a79577c4d4ae9465d03e541)

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote : Re: confirm resize fails with CPUUnpinningInvalid

Note that there is another bug and fix that is relevant here too:
https://review.opendev.org/c/openstack/nova/+/820549

Revision history for this message
Pavlo Shchelokovskyy (pshchelo) wrote :

this is still happening when resize to the same host is allowed and is actually happening...

single host devstack/master with both patches present, host with 4 CPUs, nova-cpu.conf

[DEFAULT]
allow_resize_to_same_host = True
cpu_allocation_ratio=1.0
# this one to get the error more often
update_resources_interval=15
[compute]
cpu_dedicated_set=1-3

flavors pin1 and pin2 with hw:cpu_policy='dedicated' and 1 and 2 vcpus respectively,
booted server with flavor pin1 and trying to resize with

openstack server resize <instance name> --flavor pin2 --wait && openstack server resize confirm <instance name>

still occasionally fails with

nova.exception.CPUUnpinningInvalid: CPU set to unpin [1] must be a subset of pinned CPU set [2, 3]

https://paste.opendev.org/show/811839/

Revision history for this message
Balazs Gibizer (balazs-gibizer) wrote :

@Pavlo: could you please open a separate bug for the same_host resize case? Also if you have an idea how to fix it based on the existing two fixes then go ahead and propose the fix and add me as a reviewer.

Revision history for this message
Pavlo Shchelokovskyy (pshchelo) wrote :

filed https://bugs.launchpad.net/nova/+bug/1961188 for very similar error happening when resizing to the same host

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 22.4.0

This issue was fixed in the openstack/nova 22.4.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 23.2.0

This issue was fixed in the openstack/nova 23.2.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 24.1.0

This issue was fixed in the openstack/nova 24.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/nova 25.0.0.0rc1

This issue was fixed in the openstack/nova 25.0.0.0rc1 release candidate.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/ussuri)

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/836733

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/836734

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/train)

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/836993

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/836994

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/train)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/836993
Reason: stable/train branch of nova projects' have been tagged as End of Life. All open patches have to be abandoned in order to be able to delete the branch.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/train
Review: https://review.opendev.org/c/openstack/nova/+/836994
Reason: stable/train branch of nova projects' have been tagged as End of Life. All open patches have to be abandoned in order to be able to delete the branch.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (stable/ussuri)

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/836733
Reason: stable/ussuri branch of openstack/nova transitioned to End of Life and is about to be deleted. To be able to do that, all open patches need to be abandoned.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by "Elod Illes <email address hidden>" on branch: stable/ussuri
Review: https://review.opendev.org/c/openstack/nova/+/836734
Reason: stable/ussuri branch of openstack/nova transitioned to End of Life and is about to be deleted. To be able to do that, all open patches need to be abandoned.

summary: - confirm resize fails with CPUUnpinningInvalid
+ [SRU] confirm resize fails with CPUUnpinningInvalid
description: updated
Revision history for this message
Rodrigo Barbieri (rodrigo-barbieri2010) wrote :
tags: added: sts-sru-needed
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.