[SRU] update_available_resource periodic fails with exception.CPUPinningInvalid if there is incoming post-migrating migration with cpu pinning
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Compute (nova) | Fix Released | Medium | Balazs Gibizer |
Victoria | Fix Released | Medium | Balazs Gibizer |
Wallaby | Fix Released | Medium | Balazs Gibizer |
Xena | Fix Released | Medium | Balazs Gibizer |
Bug Description
*** SRU TEMPLATE IS THE SAME AS https:/
The update_
Reproduce:
* build a multinode env with dedicated cpus and cpu pinning configured
* configure the update_
* create inst1 on the first node and inst2 on the second node, each requesting one pinned cpu
* check that inst1 is pinned to the same pcpu id on node1 as inst2 is on node2
* slow down the processing of finish_resize messages in the system to ease the reproduction of the race (e.g. inject a sleep, put load on rabbit, etc.)
* migrate inst1 to node2
If you manage to hit the case where the periodic runs on node2 just after the resize_claim of inst1 has finished but before the finish_resize RPC call of inst1 is processed (the migration context is not yet applied to the instance and the migration is not in finished state but in post-migrating), then you will see a CPU pinning conflict. This is because the resource tracker already tracks the incoming instance [1] (the host and node are already set in resize_instance [2]) but the instance does not yet have the migration context applied (as that is only done in finish_resize [3]) so the instance.
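To make the race easier to picture, here is a minimal toy model in Python. It is not nova code: `ToyHostCell`, `PinningConflict` and `periodic_recount` are hypothetical names invented for this illustration, and the sketch assumes that, before finish_resize runs, the incoming instance still carries the pinning it had on the source node.

```python
class PinningConflict(Exception):
    """Toy stand-in for a pinning error (hypothetical, not a nova class)."""


class ToyHostCell:
    """Hypothetical, simplified accounting of dedicated CPUs on one host."""

    def __init__(self, pcpus):
        self.free = set(pcpus)
        self.pinned = set()

    def pin(self, cpus):
        cpus = set(cpus)
        # Refuse to pin a CPU that is already taken or does not exist.
        if not cpus <= self.free:
            raise PinningConflict(
                f"cannot pin {sorted(cpus)}; free set is {sorted(self.free)}"
            )
        self.free -= cpus
        self.pinned |= cpus


def periodic_recount(pcpus, instance_pinnings):
    """Rebuild usage from scratch, as a periodic resource audit would."""
    cell = ToyHostCell(pcpus)
    for cpus in instance_pinnings:
        cell.pin(cpus)
    return cell


# node2 has pcpus 0 and 1; inst2 already lives there, pinned to pcpu 0.
NODE2_PCPUS = {0, 1}
INST2_PINNING = {0}

# Assumption: inst1 was pinned to pcpu 0 on node1, and the claim on node2
# picked pcpu 1 for it, which only becomes the instance's own pinning once
# the migration context is applied.
INST1_BEFORE_FINISH_RESIZE = {0}
INST1_AFTER_FINISH_RESIZE = {1}
```

In this model, `periodic_recount(NODE2_PCPUS, [INST2_PINNING, INST1_BEFORE_FINISH_RESIZE])` raises `PinningConflict`, while the same call with `INST1_AFTER_FINISH_RESIZE` succeeds. That mirrors why the failure only shows up in the window between resize_claim and finish_resize.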
Reproduced both in stable/victoria downstream and in latest master in an upstream devstack.
2021-12-06 15:07:18,013 ERROR [nova.compute.
Traceback (most recent call last):
File "/root/
self.
File "/root/
self.
File "/root/
return f(*args, **kwargs)
File "/root/
instance_
File "/root/
self.
File "/root/
self.
File "/root/
cn.
File "/root/
new_
File "/root/
raise exception.
nova.exception.
[1] https:/
[2] https:/
[3] https:/
tags: | added: numa |
tags: | added: compute resource-tracker |
tags: | added: resize |
Changed in nova: | |
assignee: | nobody → Balazs Gibizer (balazs-gibizer) |
Changed in nova: | |
importance: | Undecided → Medium |
summary: |
- update_available_resource periodic fails with + [SRU] update_available_resource periodic fails with exception.CPUPinningInvalid if there is incoming post-migrating migration with cpu pinning |
description: | updated |
tags: | added: sts-sru-needed |
Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/nova/+/820540