Resizing an instance will not change the NUMA topology of the running instance to the one from the new flavor
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| OpenStack Compute (nova) | Invalid | Medium | Stephen Finucane | |
Bug Description
When we resize (change the flavor of) an instance that has a NUMA topology defined, the NUMA info from the new flavor will not be considered during scheduling. The instance will get re-scheduled based on the old NUMA information, but the claiming on the host will use the new flavor data. Once the instance successfully lands on a host, we will still use the old data when provisioning it on the new host.
We should be considering only the new flavor information in resizes.
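As a sketch of the intended behaviour: everywhere a resize schedules, claims, or provisions, the topology should be derived from the new flavor. Nova's `nova.virt.hardware.numa_get_constraints()` computes a topology from a flavor and image metadata; the wrapper below is illustrative only, not actual Nova code:

```python
# Illustrative sketch only (not Nova code): derive the NUMA topology for a
# resize from the *new* flavor rather than the instance's stored topology.
# Assumes Nova's real helper nova.virt.hardware.numa_get_constraints().
from nova.virt import hardware


def numa_topology_for_resize(new_flavor, image_meta):
    # instance.numa_topology still reflects the old flavor at this point,
    # so it must not be used for scheduling, claiming, or provisioning.
    return hardware.numa_get_constraints(new_flavor, image_meta)
```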
Changed in nova:
status: New → Confirmed
importance: High → Medium

Changed in nova:
assignee: nobody → Tiago Rodrigues de Mello (timello)
Bart Wensley (bartwensley) wrote: #1
Nikola Đipanov (ndipanov) wrote: #2
This is basically the same as https:/
So after investigating this, it seems that there is really not that much work that needs to be done: all the information is passed in to the filter. It's just that we mangle the request_spec and filter_properties dicts so much, and the keys are so generic, that it is really difficult to make sense of it without following the code all the way from the API.
Because of this it would probably be good to add a method that basically says: when inside a filter, give me the flavor I should be looking at right now.
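Such a helper might look like the sketch below. This is purely illustrative, not actual Nova API; the key names are assumptions about how the resize request is recorded:

```python
# Hypothetical helper of the kind suggested above (illustrative only):
# return the flavor a scheduler filter should reason about right now.
def flavor_for_filtering(request_spec, filter_properties):
    # Assumption: a resize records the target flavor under 'new_flavor';
    # otherwise the instance's current flavor lives under 'instance_type'.
    new_flavor = filter_properties.get('new_flavor')
    if new_flavor is not None:
        return new_flavor
    return request_spec['instance_type']
```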
Chris Friesen (cbf123) wrote: #3
While it's true that this bug would cover the resize case that I mentioned in note #1 of bug #1417667, I think that we still need to keep that bug open for the more general case of live-migration, evacuate, rebuild, etc.
The key difference for that bug is that when using dedicated CPUs we need to recalculate which CPUs to use on the destination compute node (and claim those resources) before actually doing the migration/
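To illustrate that point: the destination host has to compute a fresh vCPU-to-pCPU pinning from its own free cores rather than reuse the source host's map. A minimal sketch (hypothetical, not Nova's actual pinning algorithm):

```python
# Minimal illustration (not Nova's pinning logic): pick a new
# vCPU -> pCPU mapping from the destination host's free cores.
def repin(num_vcpus, free_pcpus):
    if len(free_pcpus) < num_vcpus:
        raise ValueError("destination lacks enough free dedicated pCPUs")
    return dict(zip(range(num_vcpus), sorted(free_pcpus)))

# e.g. a 2-vCPU instance landing on a host with cores 4 and 6 free:
print(repin(2, {6, 4}))  # {0: 4, 1: 6}
```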
OpenStack Infra (hudson-openstack) wrote: #4
Fix proposed to branch: master
Review: https:/
Changed in nova:
assignee: Tiago Rodrigues de Mello (timello) → Nikola Đipanov (ndipanov)
status: Confirmed → In Progress
Nikola Đipanov (ndipanov) wrote: #5
@Chris - well, from the POV of the code, fixing this for the general case of CPU pinning is really a sub-problem of fixing it for NUMA as such, since CPU pinning uses the same code paths as NUMA does and relies on the same filter.
Fixing it for live migration with a specified host likely requires a different bug anyway - so we might want to open that and leave this one closed?
zhangtralon (zhangchunlong1) wrote: #6
This is a big problem; I think we need to consider every feature related to NUMA. Right now, when using the huge pages feature, I hit the same problem.
OpenStack Infra (hudson-openstack) wrote: #7
Fix proposed to branch: master
Review: https:/
Changed in nova:
assignee: Nikola Đipanov (ndipanov) → Ed Leafe (ed-leafe)
OpenStack Infra (hudson-openstack) wrote: #8
Change abandoned by Joe Gordon (<email address hidden>) on branch: master
Review: https:/
Reason: This review is > 4 weeks without comment, and failed Jenkins the last time it was checked. We are abandoning this for now. Feel free to reactivate the review by pressing the restore button and leaving a 'recheck' comment to get fresh test results.
OpenStack Infra (hudson-openstack) wrote: #9
Change abandoned by Nikola Dipanov (<email address hidden>) on branch: master
Review: https:/
Stephen Finucane (sfinucan) wrote: #10
I undertook some research into this. My findings are below, but tl;dr: it appears that this now works as expected and the bug can be closed.
---
# Problem
There were reports that resizing an instance from a pinned flavor to an
unpinned one did not result in the pinning being removed. The opposite was
also reportedly true.
# Steps
## Create the required flavors
$ openstack flavor create test.unpinned --id 100 --ram 2048 --disk 0 --vcpus 2
$ openstack flavor create test.pinned --id 101 --ram 2048 --disk 0 --vcpus 2
$ openstack flavor set test.pinned --property "hw:cpu_
# Ensure this is available
$ openstack flavor list
+-----+---------------+-------+------+-----------+-------+-----------+
| ID  | Name          |   RAM | Disk | Ephemeral | VCPUs | Is Public |
+-----+---------------+-------+------+-----------+-------+-----------+
| 1   | m1.tiny       |   512 |    1 |         0 |     1 | True      |
| 100 | test.unpinned |  2048 |    0 |         0 |     2 | True      |
| 101 | test.pinned   |  2048 |    0 |         0 |     2 | True      |
| 2   | m1.small      |  2048 |   20 |         0 |     1 | True      |
| 3   | m1.medium     |  4096 |   40 |         0 |     2 | True      |
| 4   | m1.large      |  8192 |   80 |         0 |     4 | True      |
| 42  | m1.nano       |    64 |    0 |         0 |     1 | True      |
| 5   | m1.xlarge     | 16384 |  160 |         0 |     8 | True      |
| 84  | m1.micro      |   128 |    0 |         0 |     1 | True      |
+-----+---------------+-------+------+-----------+-------+-----------+
$ openstack image list
+--
| ID | Name | Status |
+--
| c44bba29-
| 8b0284ee-
| 855c2971-
+--
# Boot an instance
$ openstack server create --flavor=
# Validate that the instance is pinned
$ openstack server list
+--
| ID | Name | Status | Networks |
+--
| 857597cb-
+--
$ sudo virsh list
Id Name State
---
...
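Rather than eyeballing the `virsh` output, the pinning check can be scripted. A minimal sketch using the libvirt Python bindings (the domain name is illustrative):

```python
# Sketch: report <vcpupin> entries for a libvirt domain (name illustrative).
import xml.etree.ElementTree as ET
import libvirt

conn = libvirt.open('qemu:///system')
dom = conn.lookupByName('instance-00000001')  # name as shown by `virsh list`
root = ET.fromstring(dom.XMLDesc(0))
pins = [(p.get('vcpu'), p.get('cpuset'))
        for p in root.findall('./cputune/vcpupin')]
print(pins if pins else 'no <vcpupin> elements: instance is not pinned')
```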
Changed in nova:
assignee: Ed Leafe (ed-leafe) → Stephen Finucane (sfinucan)

Changed in nova:
status: In Progress → Invalid
Tony Walker (tony-walker-h) wrote: #11
I'm seeing this on Kilo @ 2015.1.0. I have 2 NUMA flavors - one double the size of the other in terms of CPU and memory.
If I boot a new instance of the large type, all is well. If I boot the small and resize to the large, the cputune section gets the correct shares for the large flavor, but the <vcpupin> entries from the old one. To compound the issue, the <numa> section contains the memory size of the smaller flavor, resulting in:
qemu-system-x86_64: total memory for NUMA nodes (0x1c00000000) should equal RAM size (0x3800000000)
@sfinucan - what version did you find this fixed on?
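For what it's worth, the two sizes in that qemu error differ by exactly a factor of two, which matches the small-to-large flavor resize:

```python
# The <numa> section kept the small flavor's memory while the RAM size
# came from the large flavor; the hex values are exactly 2x apart.
print(0x1c00000000 // 2**30)  # 112 GiB (old, small flavor)
print(0x3800000000 // 2**30)  # 224 GiB (new, large flavor)
```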
liuxiuli (liu-lixiu) wrote: #12
@Stephen Finucane - This problem still exists on master. Do you have time to deal with this bug? I hope to see your fix as soon as possible. Thank you.
Change abandoned by Jay Pipes (<email address hidden>) on branch: master
Review: https:/
Reason: The bug appears to now be fixed and Nikola is no longer working on Nova. Abandoning...
This bug essentially means that resize is not usable for any instances that have a NUMA topology. Is anyone working on this?