Numa topology not calculated for instance with numa_topology after upgrading to Mitaka

Bug #1636338 reported by Erik Olof Gunnar Andersson
Affects: OpenStack Compute (nova)
Status: Won't Fix
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

This is related to bug 1596119: https://bugs.launchpad.net/nova/+bug/1596119

After upgrading to Mitaka with the above patch applied, a new bug surfaced: InstanceNUMACell objects created before the upgrade have cpu_policy set to None, which causes cpu_pinning_requested to always return False.
https://github.com/openstack/nova/blob/master/nova/objects/instance_numa_topology.py#L112

This tricks computes hosting old NUMA instances into thinking that nothing is pinned, so new instances with cpu_policy set to CPUAllocationPolicy.DEDICATED can potentially get scheduled on the same NUMA zone.
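
To make the failure mode concrete, here is a minimal, self-contained sketch (toy code, not nova's actual object implementation; DEDICATED stands in for obj_fields.CPUAllocationPolicy.DEDICATED):

    # Toy sketch of the failure mode described above, simplified from the
    # behaviour of nova/objects/instance_numa_topology.py.
    DEDICATED = 'dedicated'

    class InstanceNUMACell:
        def __init__(self, cpu_pinning=None, cpu_policy=None):
            self.cpu_pinning = cpu_pinning  # e.g. {0: 5, 1: 6} for a pinned cell
            self.cpu_policy = cpu_policy    # None for cells created before Mitaka

        @property
        def cpu_pinning_requested(self):
            # The Mitaka-era check relies on cpu_policy, which older cells
            # never stored, so it can never be True for them.
            return self.cpu_policy == DEDICATED

    # A pinned instance created before the upgrade: pinning data exists,
    # but cpu_policy was never persisted.
    old_cell = InstanceNUMACell(cpu_pinning={0: 5, 1: 6}, cpu_policy=None)
    print(old_cell.cpu_pinning_requested)  # False, despite the cell being pinned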

summary: Numa topology not calculated for instance with numa_topology after
- upgrading from Kilo
+ upgrading to Mitaka
Revision history for this message
Hans Lindgren (hanlind) wrote :

It looks like [1] changed the way cpu_pinning_requested works without considering backwards compatibility for older instances (instances with InstanceNUMACell < 1.3 that don't store cpu_policy).

    def cpu_pinning_requested(self):
-       return self.cpu_pinning is not None
+       return self.cpu_policy == obj_fields.CPUAllocationPolicy.DEDICATED

Maybe introducing a fallback to the old check for InstanceNUMACell < 1.3 will do the trick, as sketched below.

[1] https://github.com/openstack/nova/commit/dfe6545329e6d7e417615af44f6b5588948699db
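
For illustration, a hedged sketch of the fallback suggested here, reusing the toy InstanceNUMACell from the bug description (the real fix would have to live in nova's versioned object code, e.g. around obj_make_compatible, not in a toy class):

    DEDICATED = 'dedicated'  # stands in for obj_fields.CPUAllocationPolicy.DEDICATED

    class InstanceNUMACell:
        def __init__(self, cpu_pinning=None, cpu_policy=None):
            self.cpu_pinning = cpu_pinning
            self.cpu_policy = cpu_policy

        @property
        def cpu_pinning_requested(self):
            if self.cpu_policy is None:
                # Pre-1.3 cell: cpu_policy was never persisted, so fall back
                # to the pre-Mitaka heuristic of checking for stored pinning data.
                return self.cpu_pinning is not None
            return self.cpu_policy == DEDICATED

    old_cell = InstanceNUMACell(cpu_pinning={0: 5, 1: 6})  # pre-Mitaka data
    new_cell = InstanceNUMACell(cpu_policy=DEDICATED)      # Mitaka-era data
    assert old_cell.cpu_pinning_requested
    assert new_cell.cpu_pinning_requested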

tags: added: liberty-backport-potential mitaka-backport-potential needs-attention numa upgrades
tags: added: newton-backport-potential
removed: needs-attention
Changed in nova:
status: New → Confirmed
Prateek Arora (parora)
Changed in nova:
assignee: nobody → Prateek Arora (parora)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/396184

Changed in nova:
assignee: Prateek Arora (parora) → Stephen Finucane (stephenfinucane)
status: Confirmed → In Progress
Revision history for this message
Erik Olof Gunnar Andersson (eandersson) wrote :

Although the patch above fixes the issue, I still don't understand why cpu_pinning_requested returning False would make the scheduler place a VM with cpu_pinning_requested returning True on the same NUMA zone. Shouldn't a dedicated VM always have its own NUMA zone?

Revision history for this message
Stephen Finucane (stephenfinucane) wrote : Re: [Bug 1636338] Re: Numa topology not calculated for instance with numa_topology after upgrading to Mitaka

On Thu, 2016-11-10 at 19:25 +0000, Erik Olof Gunnar Andersson wrote:
> Although the patch above fixes the issue, I still don't understand why
> cpu_pinning_requested returning False would make the scheduler place a
> VM with cpu_pinning_requested returning True on the same NUMA zone.
> Shouldn't a dedicated VM always have its own NUMA zone?

Not really. If you want to isolate non-pinned instances from pinned instances, you should use host aggregates. Non-pinned instances don't respect the requirements of their pinned equivalents.

Stephen
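
For context, a toy sketch of the aggregate-based isolation Stephen describes, conceptually what nova's AggregateInstanceExtraSpecsFilter does (host_passes and the sample dicts are illustrative, not nova code):

    def host_passes(aggregate_metadata, flavor_extra_specs):
        # Accept a host only if every aggregate_instance_extra_specs:*
        # key requested by the flavor matches the host's aggregate metadata.
        prefix = 'aggregate_instance_extra_specs:'
        for key, wanted in flavor_extra_specs.items():
            if key.startswith(prefix):
                if aggregate_metadata.get(key[len(prefix):]) != wanted:
                    return False
        return True

    pinned_host = {'pinned': 'true'}  # host in the "pinned" aggregate
    pinned_flavor = {'hw:cpu_policy': 'dedicated',
                     'aggregate_instance_extra_specs:pinned': 'true'}
    floating_flavor = {'aggregate_instance_extra_specs:pinned': 'false'}

    print(host_passes(pinned_host, pinned_flavor))    # True
    print(host_passes(pinned_host, floating_flavor))  # False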

Changed in nova:
importance: Undecided → Medium
Revision history for this message
Sean Dague (sdague) wrote :

Automatically discovered version mitaka in description. If this is incorrect, please update the description to include 'nova version: ...'

tags: added: openstack-version.mitaka
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Stephen Finucane (<email address hidden>) on branch: master
Review: https://review.openstack.org/396184
Reason: Abandoned in favour of https://review.openstack.org/#/c/485554/, which resolves the same issue in a more comprehensive manner

Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Stephen Finucane (<email address hidden>) on branch: master
Review: https://review.openstack.org/485554
Reason: Someone asked for this, but it's been hanging around for too long. Let's let someone else pick it up if they care enough

melanie witt (melwitt)
tags: added: upgrade
removed: upgrades
Revision history for this message
Matt Riedemann (mriedem) wrote :

Is this still a problem we need to track? Mitaka has long been end-of-life upstream at this point, so I'm not even sure this is a problem on any upstream stable branch to which we could backport a fix.

Changed in nova:
assignee: Stephen Finucane (stephenfinucane) → nobody
status: In Progress → Won't Fix