Numa topology not calculated for instance with numa_topology after upgrading to Mitaka

Bug #1636338 reported by Erik Olof Gunnar Andersson on 2016-10-25
This bug affects 3 people
Affects: OpenStack Compute (nova)
Status: In Progress
Importance: Medium
Assigned to: Stephen Finucane

Bug Description

This is related to this bug https://bugs.launchpad.net/nova/+bug/1596119

After upgrading to Mitaka with the above patch applied, a new bug surfaced: InstanceNUMACell objects belonging to pre-Mitaka instances have cpu_policy set to None, which causes cpu_pinning_requested to always return False.
https://github.com/openstack/nova/blob/master/nova/objects/instance_numa_topology.py#L112

This then tricks compute nodes hosting old NUMA instances into thinking that nothing is pinned, so new instances with cpu_policy set to CPUAllocationPolicy.DEDICATED can end up scheduled on the same NUMA zone.
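
As a rough standalone illustration (this is not the real nova object model; the class and attribute handling here are made up for the example), a cell loaded from a pre-Mitaka instance carries its cpu_pinning data but no cpu_policy, so the Mitaka-style check reports no pinning:

    # Standalone sketch only: not the real nova object model, names are
    # illustrative. Shows why a cell loaded from a pre-Mitaka instance
    # reports no pinning even though it carries cpu_pinning data.

    class CPUAllocationPolicy(object):
        DEDICATED = 'dedicated'
        SHARED = 'shared'

    class FakeInstanceNUMACell(object):
        def __init__(self, cpu_pinning=None, cpu_policy=None):
            # Cells persisted before InstanceNUMACell 1.3 stored cpu_pinning
            # but never cpu_policy, so cpu_policy comes back as None.
            self.cpu_pinning = cpu_pinning
            self.cpu_policy = cpu_policy

        @property
        def cpu_pinning_requested(self):
            # Mitaka-era check: keys off cpu_policy alone.
            return self.cpu_policy == CPUAllocationPolicy.DEDICATED

    old_cell = FakeInstanceNUMACell(cpu_pinning={0: 2, 1: 3}, cpu_policy=None)
    print(old_cell.cpu_pinning_requested)  # False, despite the pinned CPUs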

summary: Numa topology not calculated for instance with numa_topology after
- upgrading from Kilo
+ upgrading to Mitaka
Hans Lindgren (hanlind) wrote:

It looks like [1] changed the way cpu_pinning_requested works without considering backwards compatibility for older instances (instances with InstanceNUMACell < 1.3 that don't store cpu_policy).

     def cpu_pinning_requested(self):
 -       return self.cpu_pinning is not None
 +       return self.cpu_policy == obj_fields.CPUAllocationPolicy.DEDICATED

Maybe introducing a fallback to the old check for InstanceNUMACell < 1.3 will do the trick (a rough sketch of that idea follows below).

[1] https://github.com/openstack/nova/commit/dfe6545329e6d7e417615af44f6b5588948699db
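
For illustration only, one possible shape of such a fallback, written against the same made-up stand-in as the sketch in the description rather than the real InstanceNUMACell object (the actual patch that went through review may take a different approach):

    # Rough sketch of the suggested fallback, using illustrative names only;
    # the real fix was handled in Gerrit review and may differ.

    class CPUAllocationPolicy(object):
        DEDICATED = 'dedicated'

    class FakeInstanceNUMACell(object):
        def __init__(self, cpu_pinning=None, cpu_policy=None):
            self.cpu_pinning = cpu_pinning
            self.cpu_policy = cpu_policy

        @property
        def cpu_pinning_requested(self):
            # Cells written by Mitaka and later carry an explicit policy.
            if self.cpu_policy is not None:
                return self.cpu_policy == CPUAllocationPolicy.DEDICATED
            # Cells persisted before InstanceNUMACell 1.3 never stored
            # cpu_policy; fall back to the pre-Mitaka check on cpu_pinning.
            return self.cpu_pinning is not None

    old_cell = FakeInstanceNUMACell(cpu_pinning={0: 2, 1: 3})
    print(old_cell.cpu_pinning_requested)  # True again for old pinned cells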

tags: added: liberty-backport-potential mitaka-backport-potential needs-attention numa upgrades
tags: added: newton-backport-potential
removed: needs-attention
Changed in nova:
status: New → Confirmed
Prateek Arora (parora) on 2016-11-03
Changed in nova:
assignee: nobody → Prateek Arora (parora)

Fix proposed to branch: master
Review: https://review.openstack.org/396184

Changed in nova:
assignee: Prateek Arora (parora) → Stephen Finucane (stephenfinucane)
status: Confirmed → In Progress

Although the patch above fixes the issue, I still don't understand why cpu_pinning_requested returning False would make it schedule a VM with cpu_pinning_requested returning True on the same NUMA zone. Shouldn't a dedicated VM always have its own NUMA zone?

On Thu, 2016-11-10 at 19:25 +0000, Erik Olof Gunnar Andersson wrote:
> Although the patch above fixes the issue, I still don't understand why
> cpu_pinning_requested returning False would make it schedule a VM with
> cpu_pinning_requested returning True on the same NUMA zone. Shouldn't a
> dedicated VM always have its own NUMA zone?

Not really. If you want to isolate non-pinned instances from pinned instances, you should use host aggregates. Non-pinned instances don't respect the requirements of their pinned equivalents.

Stephen
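
For context, the usual pattern is to tag separate host aggregates and match them from the flavors, with hw:cpu_policy requesting the pinning itself and the aggregate_instance_extra_specs key matched by the scheduler's AggregateInstanceExtraSpecsFilter. The 'pinned' metadata key and the spec values below are illustrative examples, not settings taken from this bug:

    # Illustration only: flavor extra specs commonly used to keep pinned and
    # non-pinned workloads on separate host aggregates. The 'pinned' metadata
    # key is a conventional example; the matching aggregates would be created
    # and tagged with the same property separately.
    pinned_flavor_extra_specs = {
        'hw:cpu_policy': 'dedicated',
        'aggregate_instance_extra_specs:pinned': 'true',
    }

    shared_flavor_extra_specs = {
        'hw:cpu_policy': 'shared',
        'aggregate_instance_extra_specs:pinned': 'false',
    }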

Changed in nova:
importance: Undecided → Medium
Sean Dague (sdague) wrote :

Automatically discovered version mitaka in description. If this is incorrect, please update the description to include 'nova version: ...'

tags: added: openstack-version.mitaka

Change abandoned by Stephen Finucane (<email address hidden>) on branch: master
Review: https://review.openstack.org/396184
Reason: Abandoned in favour of https://review.openstack.org/#/c/485554/, which resolves the same issue in a more comprehensive manner

Change abandoned by Stephen Finucane (<email address hidden>) on branch: master
Review: https://review.openstack.org/485554
Reason: Someone asked for this, but it's been hanging around for too long. Let's let someone else pick it up if they care enough.

melanie witt (melwitt) on 2019-04-10
tags: added: upgrade
removed: upgrades