Incorrect migration of instances with NUMA/CPU pinning

Bug #1590707 reported by Alexander Rubtsov
This bug affects 2 people
Affects                 Status      Importance  Assigned to     Milestone
Mirantis OpenStack      Status tracked in 10.0.x
  10.0.x                Confirmed   Medium      Sergey Nikitin
  7.0.x                 Won't Fix   Medium      Sergey Nikitin
  9.x                   Won't Fix   Medium      Sergey Nikitin

Bug Description

Upstream bug: https://bugs.launchpad.net/nova/+bug/1417667

Description:
I configured a flavor with two vcpus and extra specs "hw:cpu_policy=dedicated" in order to enable vcpu pinning.

I booted up a number of instances such that there was one instance affined to host cpus 12 and 13 on compute-0, and another instance affined to cpus 12 and 13 on compute-2. (As reported by "virsh vcpupin" and "virsh dumpxml".)

I then triggered a live migration of one instance from compute-0 to compute-2. This resulted in both instances being affined to host cpus 12 and 13 on compute-2.
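
The overlap described above can be checked mechanically once the per-instance pinning is collected (for example from "virsh vcpupin" output). The following is a minimal sketch, not Nova code: the function name, the instance names, and the input format (a mapping of instance name to the set of host cpus it is pinned to) are all assumptions made for illustration.

```python
from collections import defaultdict


def find_pinning_conflicts(pinnings):
    """Return {host_cpu: [instance, ...]} for host cpus pinned by 2+ instances.

    `pinnings` is a hypothetical mapping of instance name -> iterable of
    host cpu ids, e.g. as gathered by hand from "virsh vcpupin".
    """
    users = defaultdict(list)
    for instance, cpus in sorted(pinnings.items()):
        for cpu in sorted(set(cpus)):
            users[cpu].append(instance)
    # Keep only the host cpus that more than one instance is pinned to.
    return {cpu: names for cpu, names in sorted(users.items()) if len(names) > 1}


# State on compute-2 after the live migration (instance names are made up).
after_migration = {
    "instance-a": {12, 13},  # migrated from compute-0
    "instance-b": {12, 13},  # was already running on compute-2
}
conflicts = find_pinning_conflicts(after_migration)
print(conflicts)
```

With "hw:cpu_policy=dedicated" working as intended, the returned mapping would be empty for every compute host.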

The "hw:cpu_policy=dedicated" extra spec is intended to provide dedicated host cpus for the instance. In order to provide this, on a live migration (or cold migration, or rebuild, or evacuation, or resize, etc.) nova needs to ensure that the instance is affined to host cpus that are not currently being used by other instances.
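
The expected behaviour can be sketched as follows. This is only an illustration of the requirement, not how Nova implements it: the function name and parameters are hypothetical, and real placement would also have to account for NUMA topology, which is ignored here.

```python
def pick_dedicated_cpus(host_cpus, used_cpus, count):
    """Pick `count` host cpus on the destination that no instance is pinned to.

    All names here are hypothetical; this is not a Nova API.
    Raises ValueError when the host cannot offer enough free cpus.
    """
    free = sorted(set(host_cpus) - set(used_cpus))
    if len(free) < count:
        raise ValueError(
            "need %d dedicated cpus, only %d free" % (count, len(free))
        )
    return free[:count]


# Destination host: assume cpus 12-15 are available for pinning and
# another instance already holds 12 and 13.
new_pinning = pick_dedicated_cpus(range(12, 16), {12, 13}, count=2)
print(new_pinning)
```

The point of the sketch is that the pinning must be recomputed against the destination host's current usage, rather than carried over unchanged from the source host.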

Revision history for this message
Alexander Rubtsov (arubtsov) wrote :

sla2 for 7.0-updates

tags: added: customer-found sla2
Bug Checker Bot (bug-checker) wrote : Autochecker

(This check performed automatically)
Please make sure that the bug description contains the following sections, filled in with the appropriate data related to the bug you are describing:

actual result

version

expected result

steps to reproduce

For more detailed information on the contents of each of the listed sections see https://wiki.openstack.org/wiki/Fuel/How_to_contribute#Here_is_how_you_file_a_bug

tags: added: need-info
Sergey Nikitin (snikitin) wrote :

Do you have problems only with migration (if so, which kind: cold or live?) or with all of these operations (live migration, cold migration, rebuild, evacuation, resize)?

Also, the upstream bug has about 12 fixes (3 of them are not merged yet). I think backporting all of them would be hard and risky because of the amount of dependent code.

Changed in mos:
status: New → Incomplete
importance: Undecided → Medium
assignee: nobody → Alexander Rubtsov (arubtsov)
Alexander Rubtsov (arubtsov) wrote :

Sergey,

The customer who reported this problem experiences the issue during block migration (nova migrate <uuid>).

Changed in mos:
status: Incomplete → New
Vitaly Sedelnik (vsedelnik) wrote :

NUMA/CPU pinning is not supported in MOS 7.0; it will be available only in 9.0, as an experimental feature.

So Won't Fix for 7.0-updates, Confirmed for 9.0-updates and 10.0.

Roman Podoliaka (rpodolyaka) wrote :

As Sergey pointed out in #3, this is a known limitation in the Kilo, Liberty and Mitaka releases of Nova. It is currently being fixed upstream, but the fix requires a major refactoring of this part of the Nova code (12 patches, 3 of which are yet to be merged to master), which means there is no way it will be backported to any of the 7.0, 8.0 or 9.0 MOS releases.

Also, this is a very specific use case (thus the Medium importance of this issue) and by definition is not what is usually backported to stable branches.

Moreover, as Vitaly pointed out in #5, MOS never officially supported NUMA related features prior to 9.0 (in which support of those is claimed to be "experimental").

The bottom line is, this is going to be fixed only in Newton (10.0). For all prior releases the fix implies a major refactoring, is rather risky and does not have enough importance to be backported to stable branches (it just can't due to the fact this feature was not officially supported...), thus Won't Fix.

Roman Vyalov (r0mikiam)
Changed in mos:
milestone: 9.1 → 9.2
Alexander Rubtsov (arubtsov) wrote :

Moving back to the Nova team, as the clarification regarding the migration type has been provided by the customer.

Changed in mos:
assignee: Alexander Rubtsov (arubtsov) → Sergey Nikitin (snikitin)
Roman Podoliaka (rpodolyaka) wrote :
Changed in mos:
status: Confirmed → Fix Committed
tags: added: on-verification
Sergey Novikov (snovikov) wrote :

Verified on snapshot #822 (RC2)

tags: removed: on-verification
Changed in mos:
status: Fix Committed → Fix Released