'hw:cpu_thread_policy=prefer' misbehaviour

Bug #1578155 reported by Ricardo Noriega on 2016-05-04
Affects                   Status        Importance  Assigned to       Milestone
OpenStack Compute (nova)  Fix Released  Medium      Stephen Finucane
Newton                    Fix Released  Medium      Stephen Finucane

Bug Description

Description
===========

'hw:cpu_thread_policy=prefer' correctly allocates vCPUs in pairs of sibling threads. With an odd number of vCPUs, it allocates pairs plus one single vCPU, and that single vCPU should not be placed on an isolated core. A host with 20 available threads should therefore be able to hold 4 VMs of 5 vCPUs each, yet booting the third VM fails with an error.
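
As a sanity check on that arithmetic, here is a toy model of the packing the report expects (illustrative Python only, not Nova's implementation): each core offers two sibling threads, a 5-vCPU guest consumes two full pairs plus one odd thread, and the odd thread's sibling stays usable for the next guest.

    # Toy model of the expected 'prefer' packing; NOT Nova's code.
    # 10 cores x 2 sibling threads = the 20 available threads above.
    cores = {c: 2 for c in range(10)}   # core id -> free sibling threads

    def place(vcpus):
        """Greedily consume sibling pairs first, then odd threads."""
        need = vcpus
        for c in sorted(cores, key=cores.get, reverse=True):
            take = min(cores[c], need)
            cores[c] -= take
            need -= take
            if need == 0:
                return True
        return False

    print([place(5) for _ in range(4)])   # expected: [True, True, True, True]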

Steps to reproduce
==================

1. Create a flavor:

nova flavor-create pinning auto 1024 10 5
nova flavor-key pinning set hw:cpu_policy=dedicated
nova flavor-key pinning set hw:cpu_thread_policy=prefer
nova flavor-key pinning set hw:numa_nodes=1

2. Boot simple VMs:

nova boot testPin1 --flavor pinning --image cirros --nic net-id=$NET_ID
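
For completeness, a rough python-novaclient equivalent of these steps (a sketch only; the credentials, auth URL, and network UUID below are placeholders, not values from this report):

    # Sketch: python-novaclient equivalent of the CLI steps above.
    from novaclient import client

    nova = client.Client('2', 'admin', 'secret', 'admin',
                         'http://controller:5000/v2.0')   # placeholders

    # nova flavor-create pinning auto 1024 10 5
    flavor = nova.flavors.create('pinning', ram=1024, vcpus=5, disk=10)
    flavor.set_keys({'hw:cpu_policy': 'dedicated',
                     'hw:cpu_thread_policy': 'prefer',
                     'hw:numa_nodes': '1'})

    # nova boot testPin1 --flavor pinning --image cirros --nic net-id=$NET_ID
    image = nova.images.find(name='cirros')   # deprecated proxy on newer clients
    nova.servers.create('testPin1', image=image, flavor=flavor,
                        nics=[{'net-id': 'NET_ID'}])   # your network UUID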

In my setup, I have 20 available threads:

  NUMANode L#0 (P#0 32GB)
    Socket L#0 + L3 L#0 (15MB)
      L2 L#0 (256KB) + L1d L#0 (32KB) + L1i L#0 (32KB) + Core L#0
        PU L#0 (P#0)
        PU L#1 (P#12)
      L2 L#1 (256KB) + L1d L#1 (32KB) + L1i L#1 (32KB) + Core L#1
        PU L#2 (P#2)
        PU L#3 (P#14)
      L2 L#2 (256KB) + L1d L#2 (32KB) + L1i L#2 (32KB) + Core L#2
        PU L#4 (P#4)
        PU L#5 (P#16)
      L2 L#3 (256KB) + L1d L#3 (32KB) + L1i L#3 (32KB) + Core L#3
        PU L#6 (P#6)
        PU L#7 (P#18)
      L2 L#4 (256KB) + L1d L#4 (32KB) + L1i L#4 (32KB) + Core L#4
        PU L#8 (P#8)
        PU L#9 (P#20)
      L2 L#5 (256KB) + L1d L#5 (32KB) + L1i L#5 (32KB) + Core L#5
        PU L#10 (P#10)
        PU L#11 (P#22)

  NUMANode L#1 (P#1 32GB) + Socket L#1 + L3 L#1 (15MB)
    L2 L#6 (256KB) + L1d L#6 (32KB) + L1i L#6 (32KB) + Core L#6
      PU L#12 (P#1)
      PU L#13 (P#13)
    L2 L#7 (256KB) + L1d L#7 (32KB) + L1i L#7 (32KB) + Core L#7
      PU L#14 (P#3)
      PU L#15 (P#15)
    L2 L#8 (256KB) + L1d L#8 (32KB) + L1i L#8 (32KB) + Core L#8
      PU L#16 (P#5)
      PU L#17 (P#17)
    L2 L#9 (256KB) + L1d L#9 (32KB) + L1i L#9 (32KB) + Core L#9
      PU L#18 (P#7)
      PU L#19 (P#19)
    L2 L#10 (256KB) + L1d L#10 (32KB) + L1i L#10 (32KB) + Core L#10
      PU L#20 (P#9)
      PU L#21 (P#21)
    L2 L#11 (256KB) + L1d L#11 (32KB) + L1i L#11 (32KB) + Core L#11
      PU L#22 (P#11)
      PU L#23 (P#23)

With 'hw:cpu_thread_policy=prefer', the behaviour is correct for the first two VMs: their 5 vCPUs are allocated to sibling threads in pairs.

[root@nfvsdn-04 ~(keystone_admin)]# virsh vcpupin 2
VCPU: CPU Affinity
----------------------------------
   0: 10
   1: 22
   2: 16
   3: 4
   4: 8

[root@nfvsdn-04 ~(keystone_admin)]# virsh vcpupin 3
VCPU: CPU Affinity
----------------------------------
   0: 17
   1: 5
   2: 3
   3: 15
   4: 11
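
Cross-checking those pinnings against the sibling map confirms the pairing (a quick script; the sibling pairs are transcribed from the hwloc output above):

    # Verify each VM got sibling pairs plus one lone thread, per 'prefer'.
    siblings = [(0, 12), (2, 14), (4, 16), (6, 18), (8, 20), (10, 22),
                (1, 13), (3, 15), (5, 17), (7, 19), (9, 21), (11, 23)]

    def pairing(pinned):
        pinned = set(pinned)
        pairs = [p for p in siblings if set(p) <= pinned]
        lone = sorted(pinned - {cpu for p in pairs for cpu in p})
        return pairs, lone

    print(pairing([10, 22, 16, 4, 8]))    # ([(4, 16), (10, 22)], [8])
    print(pairing([17, 5, 3, 15, 11]))    # ([(3, 15), (5, 17)], [11])

Both lone vCPUs (8 and 11) sit on cores whose sibling threads (20 and 23) remain free, so on the topology shown each NUMA node still has three full sibling pairs plus a lone thread free: enough for one more 5-vCPU guest per node.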

However, even though there are enough free threads to allocate another two VMs with the same flavor, booting the third VM fails with the following error:

INFO nova.filters Filtering removed all hosts for the request with instance ID 'cbb53e29-a7da-4c14-a3ad-4fb3aa04f101'. Filter results: ['RetryFilter: (start: 1, end: 1)', 'AvailabilityZoneFilter: (start: 1, end: 1)', 'RamFilter: (start: 1, end: 1)', 'ComputeFilter: (start: 1, end: 1)', 'ComputeCapabilitiesFilter: (start: 1, end: 1)', 'ImagePropertiesFilter: (start: 1, end: 1)', 'CoreFilter: (start: 1, end: 1)', 'NUMATopologyFilter: (start: 1, end: 0)']

There should be enough capacity for 4 VMs with this cpu_thread_policy=prefer flavor.

Expected result
===============

Four VMs up and running with the 'pinning' flavor.

Actual result
=============

The third VM fails at scheduling.

Environment
===========

All-in-one environment.

tags: added: numa
removed: cpu prefer thread
Changed in nova:
assignee: nobody → Vladik Romanovsky (vladik-romanovsky)
Changed in nova:
status: New → In Progress

Change abandoned by Artom Lifshitz (<email address hidden>) on branch: master
Review: https://review.openstack.org/344992
Reason: You got there first and have more reviews :)

tags: added: newton-rc-potential
melanie witt (melwitt) on 2016-09-09
Changed in nova:
importance: Undecided → Medium
Matt Riedemann (mriedem) wrote :

Was this a regression in Newton, or does it prevent upgrades to Newton? If not, it shouldn't be tagged with newton-rc-potential.

tags: removed: newton-rc-potential

Reviewed: https://review.openstack.org/342709
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=8361d8d6c315bb3ae71c3ff0147f7d5156bc46f3
Submitter: Jenkins
Branch: master

commit 8361d8d6c315bb3ae71c3ff0147f7d5156bc46f3
Author: Stephen Finucane <email address hidden>
Date: Tue Jul 19 14:01:53 2016 -0700

    Allow linear packing of cores

    Given the following single-socket, four-core, HT-enabled CPU topology:

       +---+---+ +---+---+ +---+---+ +---+---+
       | x | x | | x |   | | x |   | |   |   |
       +---+---+ +---+---+ +---+---+ +---+---+
         1   4     2   5     3   6     4   7

    Attempting to boot an instance with four cores and no explicit
    'cpu_thread_policy' should be successful, with cores 5,6,4,7 used.
    However, the current implementation of this implicit policy attempts to
    fit the same number of instance cores onto each host CPU. For example,
    a four core instance would result in either a 2*2 layout (two instance
    cores on each of two host CPUs), or a 1*4 layout (one instance core on
    each of four host CPUs). This may be correct behavior *where possible*,
    but if this is not possible then any and all cores should be used.

    Resolve this issue by adding a fallthrough case, whereby if the
    standard fitting policy fails, a linear assignment is used to properly
    fit the instance cores.

    Change-Id: I73f7f771b7514060f1f74066e3dea1da8fe74c21
    Closes-Bug: #1578155
    mitaka-backport-potential
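
The gist of the fix, condensed into a sketch (this is not the actual nova.virt.hardware code; the helper below just mirrors the logic the commit message describes): first try layouts that put the same number of instance cores on each host core, then fall through to a linear assignment over whatever threads remain free.

    # Sketch of the fixed policy described above; NOT nova.virt.hardware.
    def pack(instance_cores, free):
        """free: host core id -> number of free sibling threads (max 2)."""
        # 1. Uniform layouts first: e.g. 2*2 or 1*4 for a 4-core instance.
        for per_core in (2, 1):
            if instance_cores % per_core == 0:
                usable = [c for c in free if free[c] >= per_core]
                needed = instance_cores // per_core
                if len(usable) >= needed:
                    return {c: per_core for c in usable[:needed]}
        # 2. Fallthrough: linear assignment over any free threads.
        layout, left = {}, instance_cores
        for c in free:
            take = min(free[c], left)
            if take:
                layout[c], left = take, left - take
            if not left:
                return layout
        return None   # genuinely not enough capacity

    # The commit-message topology: core 0 full, cores 1-2 half free, 3 free.
    print(pack(4, {0: 0, 1: 1, 2: 1, 3: 2}))   # -> {1: 1, 2: 1, 3: 2}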

Changed in nova:
status: In Progress → Fix Released
Matt Riedemann (mriedem) on 2016-10-06
Changed in nova:
assignee: Vladik Romanovsky (vladik-romanovsky) → Stephen Finucane (stephenfinucane)

Reviewed: https://review.openstack.org/373889
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=2dcf8c22f845de6b4ae12ae3d75f89041b839e57
Submitter: Jenkins
Branch: stable/newton

commit 2dcf8c22f845de6b4ae12ae3d75f89041b839e57
Author: Stephen Finucane <email address hidden>
Date: Tue Jul 19 14:01:53 2016 -0700

    Allow linear packing of cores

    Given the following single-socket, four-core, HT-enabled CPU topology:

       +---+---+ +---+---+ +---+---+ +---+---+
       | x | x | | x |   | | x |   | |   |   |
       +---+---+ +---+---+ +---+---+ +---+---+
         1   4     2   5     3   6     4   7

    Attempting to boot an instance with four cores and no explicit
    'cpu_thread_policy' should be successful, with cores 5,6,4,7 used.
    However, the current implementation of this implicit policy attempts to
    fit the same number of instance cores onto each host CPU. For example,
    a four core instance would result in either a 2*2 layout (two instance
    cores on each of two host CPUs), or a 1*4 layout (one instance core on
    each of four host CPUs). This may be correct behavior *where possible*,
    but if this is not possible then any and all cores should be used.

    Resolve this issue by adding a fallthrough case, whereby if the
    standard fitting policy fails, a linear assignment is used to properly
    fit the instance cores.

    Change-Id: I73f7f771b7514060f1f74066e3dea1da8fe74c21
    Closes-Bug: #1578155
    (cherry picked from commit 8361d8d6c315bb3ae71c3ff0147f7d5156bc46f3)

This issue was fixed in the openstack/nova 14.0.1 release.

This issue was fixed in the openstack/nova 15.0.0.0b1 development milestone.

Change abandoned by Stephen Finucane (<email address hidden>) on branch: stable/mitaka
Review: https://review.openstack.org/427119
