Error state for VM with enabled CPU pinning after migration

Bug #1564393 reported by Kristina Berezovskaia
This bug affects 1 person
Affects                 Status         Importance  Assigned to     Milestone
Mirantis OpenStack (status tracked in 10.0.x)
  10.0.x                Fix Committed  High        Sergey Nikitin
  9.x                   Fix Released   High        Sergey Nikitin

Bug Description

Upstream bug: https://bugs.launchpad.net/nova/+bug/1585214

Detailed bug description:
With CPU pinning enabled, VM migration doesn't work properly.

Steps to reproduce:
1) Deploy an environment with 2 compute nodes with CPU pinning enabled
2) Create a host aggregate for these compute nodes
3) Create 3 flavors:
- flavor with 2 CPUs and 2 NUMA nodes
nova flavor-create m1.small.performance-2 auto 2048 20 2
nova flavor-key m1.small.performance-2 set hw:cpu_policy=dedicated
nova flavor-key m1.small.performance-2 set aggregate_instance_extra_specs:pinned=true
nova flavor-key m1.small.performance-2 set hw:numa_nodes=2
nova boot --image TestVM --nic net-id=93e25766-2a22-486c-af82-c62054260c26 --flavor m1.small.performance-2 test2
- flavor with 2 CPUs and 1 NUMA node
nova flavor-create m1.small.performance-1 auto 2048 20 2
nova flavor-key m1.small.performance-1 set hw:cpu_policy=dedicated
nova flavor-key m1.small.performance-1 set aggregate_instance_extra_specs:pinned=true
nova flavor-key m1.small.performance-1 set hw:numa_nodes=1
nova boot --image TestVM --nic net-id=93e25766-2a22-486c-af82-c62054260c26 --flavor m1.small.performance-1 test3
- flavor with 1 CPU and 1 NUMA node
nova flavor-create m1.small.performance auto 512 1 1
nova flavor-key m1.small.performance set hw:cpu_policy=dedicated
nova flavor-key m1.small.performance set aggregate_instance_extra_specs:pinned=true
nova flavor-key m1.small.performance set hw:numa_nodes=1
4) Boot vm1, vm2 and vm3 with these flavors
5) Migrate vm1: nova migrate vm1
Confirm the resize: nova resize-confirm vm1
Expected results:
vm1 migrates to another node
Actual results:
vm1 is in ERROR
{"message": "Cannot pin/unpin cpus [17] from the following pinned set [3]", "code": 400, "created": "2016-03-31T09:26:00Z"}
6) Migrate vm2: nova migrate vm2
Confirm the resize: nova resize-confirm vm2
Repeat the migration and confirmation one more time
Expected results:
vm2 migrates to another node
Actual results:
vm2 is in ERROR
7) Migrate vm3 three times: nova migrate vm3
Actual results:
the same

Description of the environment:
ISO #129 (9.0), Neutron + Ubuntu, 1 controller and 2 compute nodes with CPU pinning

Kristina Berezovskaia (kkuznetsova) wrote :
Changed in mos:
status: New → Confirmed
tags: added: area-nova
Changed in mos:
assignee: MOS Nova (mos-nova) → Sergey Nikitin (snikitin)
Changed in mos:
status: Confirmed → In Progress
Timur Nurlygayanov (tnurlygayanov) wrote :

Do we have a commit under review? Why is this issue in progress?

Sergey Nikitin (snikitin) wrote :

I just wanted to mark that I am working on it, as stated in the description of the "In Progress" status:

In Progress
The assigned person is working on it.

Or should we use "Triaged" for that?

description: updated
Sergey Nikitin (snikitin) wrote :
Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix proposed to openstack/nova (9.0/mitaka)

Fix proposed to branch: 9.0/mitaka
Change author: Sergey Nikitin <email address hidden>
Review: https://review.fuel-infra.org/21619

Fuel Devops McRobotson (fuel-devops-robot) wrote : Fix merged to openstack/nova (9.0/mitaka)

Reviewed: https://review.fuel-infra.org/21619
Submitter: Pkgs Jenkins <email address hidden>
Branch: 9.0/mitaka

Commit: 780464b9e83a3034cd9dac940d423696aa063500
Author: Sergey Nikitin <email address hidden>
Date: Fri Jun 3 09:49:13 2016

Fixed clean up process in confirm_resize() after resize/cold migration

    In an environment with NUMA topology and CPU pinning enabled there is a problem.
    If an instance changes NUMA node (or even its pinned CPUs within a NUMA node)
    during cold migration from one host to another, confirming the resize
    fails with "Cannot pin/unpin cpus from the following pinned set".

    This happens because confirm_resize() tries to clean up the source
    host using the NUMA topology from the destination host.

Closes-Bug: #1564393

Change-Id: I3b87be3f25fc0bce4efd9804fa562a6f66355464
(cherry picked from commit d7b8d997f0a7d40055c544470533e8a11855ff8f)
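
The failure described in the commit message can be illustrated with a small Python model. This is only a sketch under stated assumptions, not Nova's code: the HostCell class, PinningError, and both confirm_resize_* functions are hypothetical names, and the 24-CPU host size is an assumption. The CPU numbers 3 and 17 are taken from the error message in the bug description.

```python
class PinningError(Exception):
    """Raised when a pin/unpin request does not match the tracked state."""


class HostCell:
    """Toy stand-in for a host's CPU pinning bookkeeping (hypothetical)."""

    def __init__(self, cpus):
        self.cpus = set(cpus)    # CPUs available on this host
        self.pinned = set()      # CPUs currently pinned to instances

    def _fail(self, cpus):
        raise PinningError(
            "Cannot pin/unpin cpus %s from the following pinned set %s"
            % (sorted(cpus), sorted(self.pinned)))

    def pin(self, cpus):
        cpus = set(cpus)
        # Reject CPUs the host does not have or that are already pinned.
        if not cpus <= self.cpus or cpus & self.pinned:
            self._fail(cpus)
        self.pinned |= cpus

    def unpin(self, cpus):
        cpus = set(cpus)
        # Only CPUs that are actually pinned here can be released.
        if not cpus <= self.pinned:
            self._fail(cpus)
        self.pinned -= cpus


OLD_PINNING = [3]    # pinning the instance had on the source host
NEW_PINNING = [17]   # pinning the instance received on the destination host


def confirm_resize_buggy(source_cell):
    # Cleans up the source host with the destination's pinning.
    source_cell.unpin(NEW_PINNING)


def confirm_resize_fixed(source_cell):
    # Cleans up the source host with the pinning it actually holds.
    source_cell.unpin(OLD_PINNING)


source = HostCell(range(24))  # assumed host size
source.pin(OLD_PINNING)

try:
    confirm_resize_buggy(source)
except PinningError as exc:
    print(exc)  # Cannot pin/unpin cpus [17] from the following pinned set [3]

confirm_resize_fixed(source)
print(sorted(source.pinned))  # []
```

In this model the buggy cleanup leaves CPU 3 pinned on the source host and raises the same kind of error as in the report, while the cleanup that uses the old pinning releases it.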

Changed in mos:
status: In Progress → Fix Committed
tags: added: on-verification
Kristina Berezovskaia (kkuznetsova) wrote :

Verified on:
cat /etc/fuel_build_id:
 466
cat /etc/fuel_build_number:
 466
cat /etc/fuel_release:
 9.0
cat /etc/fuel_openstack_version:
 mitaka-9.0
rpm -qa | egrep 'fuel|astute|network-checker|nailgun|packetary|shotgun':
 fuel-release-9.0.0-1.mos6349.noarch
 fuel-misc-9.0.0-1.mos8454.noarch
 python-packetary-9.0.0-1.mos140.noarch
 fuel-bootstrap-cli-9.0.0-1.mos285.noarch
 fuel-migrate-9.0.0-1.mos8454.noarch
 rubygem-astute-9.0.0-1.mos750.noarch
 fuel-mirror-9.0.0-1.mos140.noarch
 shotgun-9.0.0-1.mos90.noarch
 fuel-openstack-metadata-9.0.0-1.mos8742.noarch
 fuel-notify-9.0.0-1.mos8454.noarch
 nailgun-mcagents-9.0.0-1.mos750.noarch
 python-fuelclient-9.0.0-1.mos325.noarch
 fuel-9.0.0-1.mos6349.noarch
 fuel-utils-9.0.0-1.mos8454.noarch
 fuel-setup-9.0.0-1.mos6349.noarch
 fuel-provisioning-scripts-9.0.0-1.mos8742.noarch
 fuel-library9.0-9.0.0-1.mos8454.noarch
 network-checker-9.0.0-1.mos74.x86_64
 fuel-agent-9.0.0-1.mos285.noarch
 fuel-ui-9.0.0-1.mos2717.noarch
 fuel-ostf-9.0.0-1.mos935.noarch
 fuelmenu-9.0.0-1.mos274.noarch
 fuel-nailgun-9.0.0-1.mos8742.noarch

neutron+vlan+kvm, cpu pinning

Booted 3 VMs with different CPU pinning flavors and migrated each VM 10 times. Migration works correctly.
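
A repeated-migration check like the one above could be scripted roughly as follows. This is a hedged sketch, not the script used for verification: the function names, timeout values, and the parsing of the `nova show` table output are assumptions, and the command runner is injectable so the loop can be exercised without a cloud. It relies only on the `nova migrate` and `nova resize-confirm` commands from the report, plus `nova show` to read the server status.

```python
import subprocess
import time


def run(cmd):
    """Run a CLI command and return its text output."""
    return subprocess.check_output(cmd, universal_newlines=True)


def server_status(name, run=run):
    """Extract the 'status' row from the table printed by `nova show`."""
    for line in run(["nova", "show", name]).splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) == 2 and cells[0] == "status":
            return cells[1]
    raise RuntimeError("no status row found for %s" % name)


def wait_for(name, wanted, run=run, timeout=300, interval=5, sleep=time.sleep):
    """Poll until the server reaches `wanted`; fail on ERROR or timeout."""
    waited = 0
    while True:
        status = server_status(name, run)
        if status == wanted:
            return
        if status == "ERROR":
            raise RuntimeError("%s went to ERROR" % name)
        if waited >= timeout:
            raise RuntimeError("%s stuck in %s" % (name, status))
        sleep(interval)
        waited += interval


def migrate_repeatedly(name, times, run=run, sleep=time.sleep):
    """Cold-migrate a server and confirm the resize, `times` times in a row."""
    for _ in range(times):
        run(["nova", "migrate", name])
        wait_for(name, "VERIFY_RESIZE", run, sleep=sleep)
        run(["nova", "resize-confirm", name])
        wait_for(name, "ACTIVE", run, sleep=sleep)
```

With credentials loaded in the shell, calling `migrate_repeatedly("vm1", 10)` for each VM would approximate the manual check; a RuntimeError signals that a VM ended up in ERROR or did not reach the expected status in time.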

tags: removed: on-verification