CPU binding conflict when multiple VMs execute unshelve at the same time

Bug #1748799 reported by tangxing
Affects: OpenStack Compute (nova)
Status: New
Importance: Undecided
Assigned to: tangxing
Milestone: (none)

Bug Description

Description
===========
The flavor's hw:cpu_policy is set to dedicated and six VMs are created from it. Each VM is shelved with 'nova shelve <server-id>'; once the VMs reach the SHELVED_OFFLOADED state, 'nova unshelve <server-id>' is executed for all of them at the same time. This results in several VMs being pinned to the same host CPUs.
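
The race can be pictured with a minimal standalone sketch (illustrative only, not the nova source): both unshelve requests read the same view of the host's free pCPUs before either one commits its claim, so they compute identical pinnings.

# Illustrative sketch only, not nova code: two concurrent unshelve claims
# computed from the same stale snapshot of the free pCPU set.

free_pcpus = {9, 33}                     # free pCPUs as seen by both requests

def pick_pinning(free, needed):
    # Stand-in for the real NUMA fitting logic: just take the first CPUs.
    return set(sorted(free)[:needed])

claim_a = pick_pinning(free_pcpus, 2)    # first unshelved instance -> {9, 33}
claim_b = pick_pinning(free_pcpus, 2)    # second unshelved instance -> {9, 33}

# Each request commits its claim independently, so both guests end up
# pinned to host CPUs 9 and 33, matching the virsh output below.
assert claim_a == claim_b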

Steps to reproduce
==================
1:nova flavor-key 1 set hw:cpu_policy=dedicated
2:[root@nail-5300-1 ~(keystone_admin)]# nova boot --flavor 1 --image 22164f51-c353-4f8a-a073-cbb2930bf25f --nic net-id=90ba0584-b2b7-4f6a-b862-f77864d8626f --max-count 6 test
3:nova shelve test-1
nova shelve test-2
nova shelve test-3
nova shelve test-4
nova shelve test-5
nova shelve test-6
4: execute the following unshelve commands at the same time (a small helper for firing them concurrently is sketched after this list):
nova unshelve test-1
nova unshelve test-2
nova unshelve test-3
nova unshelve test-4
nova unshelve test-5
nova unshelve test-6
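
A small helper sketch for issuing the unshelve calls concurrently (just a convenience wrapper around the same 'nova unshelve' CLI commands, assuming the usual OpenStack credentials are exported in the shell environment):

# Convenience sketch: fire the unshelve calls in parallel to hit the race
# window; it only shells out to the 'nova unshelve' commands listed above.
import subprocess
from concurrent.futures import ThreadPoolExecutor

servers = ['test-%d' % i for i in range(1, 7)]

def unshelve(name):
    subprocess.check_call(['nova', 'unshelve', name])

with ThreadPoolExecutor(max_workers=len(servers)) as pool:
    list(pool.map(unshelve, servers))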

Actual result
=============
[root@E9000slot6 /]# virsh vcpupin 24
VCPU: CPU Affinity
----------------------------------
0: 9
1: 33

[root@E9000slot6 /]# virsh vcpupin 25
VCPU: CPU Affinity
----------------------------------
0: 9
1: 33
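
A quick ad-hoc way to spot such overlaps across all running guests (not part of nova, just a check built on the same 'virsh vcpupin' output shown above):

# Ad-hoc overlap check: report physical CPUs that more than one running
# domain is pinned to, based on 'virsh vcpupin' output.
import subprocess
from collections import defaultdict

def vcpu_pins(domain):
    # Parses the simple '<vcpu>: <pcpu>' lines shown in this report;
    # CPU ranges are not handled here.
    out = subprocess.check_output(['virsh', 'vcpupin', domain]).decode()
    pins = set()
    for line in out.splitlines():
        left, sep, right = line.partition(':')
        if sep and left.strip().isdigit():
            pins.update(int(c) for c in right.split(',') if c.strip().isdigit())
    return pins

domains = subprocess.check_output(['virsh', 'list', '--name']).decode().split()

by_pcpu = defaultdict(list)
for dom in domains:
    for pcpu in vcpu_pins(dom):
        by_pcpu[pcpu].append(dom)

for pcpu, doms in sorted(by_pcpu.items()):
    if len(doms) > 1:
        print('pCPU %d shared by: %s' % (pcpu, ', '.join(doms)))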

Logs & Configs
==============
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 1195, in _update_usage_from_instances
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager require_allocation_refresh=require_allocation_refresh)
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 1125, in _update_usage_from_instance
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager sign=sign)
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/compute/resource_tracker.py", line 913, in _update_usage
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager cn, usage, free)
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 2071, in get_host_numa_usage_from_instance
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager host_numa_topology, instance_numa_topology, free=free))
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/virt/hardware.py", line 1923, in numa_usage_from_instances
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager newcell.pin_cpus(pinned_cpus)
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager File "/usr/lib/python2.7/site-packages/nova/objects/numa.py", line 89, in pin_cpus
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager self.pinned_cpus))
2017-12-21 19:12:46.417 23081 ERROR nova.compute.manager CPUPinningInvalid: CPU set to pin [9, 33] must be a subset of free CPU set [32, 2, 3, 5, 6, 7, 8, 10, 11, 34, 35, 26, 27, 29, 30, 31]
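
For reference, the CPUPinningInvalid at the end of the traceback comes from a subset check applied when the resource tracker replays each instance's pinning against the host cell; a simplified stand-in for that check (not the actual nova/objects/numa.py code) behaves like this:

# Simplified stand-in for the pinning subset check behind CPUPinningInvalid
# in the traceback above; not the actual nova implementation.

class CPUPinningInvalid(Exception):
    pass

def pin_cpus(requested, pinned, all_cpus):
    free = all_cpus - pinned
    if not requested <= free:        # requested CPUs must all still be free
        raise CPUPinningInvalid(
            'CPU set to pin %s must be a subset of free CPU set %s'
            % (sorted(requested), sorted(free)))
    return pinned | requested

cell_cpus = {2, 3, 9, 33}
pinned = pin_cpus({9, 33}, set(), cell_cpus)   # first instance: accepted
try:
    pin_cpus({9, 33}, pinned, cell_cpus)       # second instance: conflicts
except CPUPinningInvalid as exc:
    print(exc)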

tangxing (tang-xing)
Changed in nova:
assignee: nobody → tangxing (tang-xing)
Sylvain Bauza (sylvain-bauza) wrote :

That's a known issue that needs some design discussions in order to fix it correctly. The main upstream bug is https://bugs.launchpad.net/nova/+bug/1417667 for all the move ops.
