vcpu_pin_set setting raises exception

Bug #1372829 reported by Irena Berezovsky
This bug affects 1 person
Affects: OpenStack Compute (nova)
Status: Fix Released
Importance: High
Assigned to: Boden R

Bug Description

After setting vcpu_pin_set=0-9 in nova.conf, I got the following exception when the service started:

2014-09-23 11:00:41.603 14427 DEBUG nova.openstack.common.processutils [-] Result was 0 execute /opt/stack/nova/nova/openstack/common/processutils.py:195
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 455, in fire_timers
    timer()
  File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/timer.py", line 58, in __call__
    cb(*args, **kw)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 168, in _do_send
    waiter.switch(result)
  File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 212, in main
    result = function(*args, **kwargs)
  File "/opt/stack/nova/nova/openstack/common/service.py", line 490, in run_service
    service.start()
  File "/opt/stack/nova/nova/service.py", line 181, in start
    self.manager.pre_start_hook()
  File "/opt/stack/nova/nova/compute/manager.py", line 1152, in pre_start_hook
    self.update_available_resource(nova.context.get_admin_context())
  File "/opt/stack/nova/nova/compute/manager.py", line 5922, in update_available_resource
    nodenames = set(self.driver.get_available_nodes())
  File "/opt/stack/nova/nova/virt/driver.py", line 1237, in get_available_nodes
    stats = self.get_host_stats(refresh=refresh)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5760, in get_host_stats
    return self.host_state.get_host_stats(refresh=refresh)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 470, in host_state
    self._host_state = HostState(self)
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6320, in __init__
    self.update_status()
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6376, in update_status
    numa_topology = self.driver._get_host_numa_topology()
  File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4869, in _get_host_numa_topology
    cell.cpuset &= allowed_cpus
TypeError: unsupported operand type(s) for &=: 'set' and 'list'
2014-09-23 11:00:42.032 14427 ERROR nova.openstack.common.threadgroup [-] unsupported operand type(s) for &=: 'set' and 'list'
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup Traceback (most recent call last):
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/openstack/common/threadgroup.py", line 125, in wait
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup x.wait()
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/openstack/common/threadgroup.py", line 47, in wait
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup return self.thread.wait()
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 173, in wait
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup return self._exit_event.wait()
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/event.py", line 121, in wait
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup return hubs.get_hub().switch()
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/hubs/hub.py", line 293, in switch
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup return self.greenlet.switch()
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/usr/local/lib/python2.7/dist-packages/eventlet/greenthread.py", line 212, in main
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup result = function(*args, **kwargs)
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/openstack/common/service.py", line 490, in run_service
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup service.start()
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/service.py", line 181, in start
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup self.manager.pre_start_hook()
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/manager.py", line 1152, in pre_start_hook
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup self.update_available_resource(nova.context.get_admin_context())
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/compute/manager.py", line 5922, in update_available_resource
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup nodenames = set(self.driver.get_available_nodes())
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/virt/driver.py", line 1237, in get_available_nodes
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup stats = self.get_host_stats(refresh=refresh)
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 5760, in get_host_stats
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup return self.host_state.get_host_stats(refresh=refresh)
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 470, in host_state
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup self._host_state = HostState(self)
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6320, in __init__
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup self.update_status()
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 6376, in update_status
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup numa_topology = self.driver._get_host_numa_topology()
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup File "/opt/stack/nova/nova/virt/libvirt/driver.py", line 4869, in _get_host_numa_topology
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup cell.cpuset &= allowed_cpus
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup TypeError: unsupported operand type(s) for &=: 'set' and 'list'
2014-09-23 11:00:42.032 14427 TRACE nova.openstack.common.threadgroup
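
The TypeError at the bottom of the trace can be reproduced outside of nova. The snippet below is a minimal sketch with made-up CPU ids, not code from the driver: `cell_cpuset` stands in for `cell.cpuset` and `allowed_cpus` for the list the traceback shows on the right-hand side. It illustrates that the `&=` operator requires a set on both sides, while `intersection_update()` accepts any iterable.

```python
# Minimal reproduction of the failing operation; all values are illustrative.
cell_cpuset = {0, 1, 2, 3}   # stands in for cell.cpuset (a set)
allowed_cpus = [0, 1]        # a list, as in the traceback

try:
    cell_cpuset &= allowed_cpus   # set &= list is not supported
    error = None
except TypeError as exc:
    error = str(exc)

# The method form of the same operation accepts any iterable.
via_method = {0, 1, 2, 3}
via_method.intersection_update(allowed_cpus)

# The operator form works once the right-hand side is a set.
via_operator = {0, 1, 2, 3}
via_operator &= set(allowed_cpus)
```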

Tags: libvirt
Nikola Đipanov (ndipanov) wrote :

So this happens only on NUMA hosts where we hit the following branch of NUMA fitting code.

if topology:
    # Host is NUMA capable so try to keep the instance in a cell
    viable_cells = [cell for cell in topology.cells
                    if vcpus <= len(cell.cpus) and
                    memory * 1024 <= cell.memory]
    if not viable_cells:
        # We can't contain the instance in a cell - do nothing for
        # now.
        # TODO(ndipanov): Attempt to spread the instance across
        # NUMA nodes and expose the topology to the instance as an
        # optimisation
        return allowed_cpus, None, None
    else:
        cell = random.choice(viable_cells)
        pin_cpuset = set(cpu.id for cpu in cell.cpus)
        if allowed_cpus:
            pin_cpuset &= allowed_cpus
        return pin_cpuset, None, None

The issue is that we pass the return value of hardware.get_vcpu_pin_set(), which is a list, directly into this method, which assumes it is an instance of set().

We need to make sure that we convert it to a set first.

This bug will break anyone who uses the vcpu_pin_set config option, so it should be fixed ASAP.
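
As a rough illustration of the conversion described above, the sketch below wraps the intersection step in a standalone function. The name `choose_pin_cpuset` and its arguments are hypothetical and do not appear in nova; the point is only that coercing `allowed_cpus` with `set()` makes the `&=` safe whether the caller passes a list or a set.

```python
def choose_pin_cpuset(cell_cpu_ids, allowed_cpus=None):
    """Hypothetical helper: restrict a cell's CPU ids to the allowed CPUs.

    allowed_cpus may be a list, a set, or None; it is coerced to a set
    before the intersection so that the &= operator is always valid.
    """
    pin_cpuset = set(cell_cpu_ids)
    if allowed_cpus:
        pin_cpuset &= set(allowed_cpus)
    return pin_cpuset


# A list of allowed CPUs no longer raises TypeError.
from_list = choose_pin_cpuset([0, 1, 2, 3], [2, 3, 8])
# A set behaves the same way.
from_set = choose_pin_cpuset([0, 1, 2, 3], {2, 3, 8})
# With no restriction, every CPU of the cell is kept.
unrestricted = choose_pin_cpuset([0, 1, 2, 3])
```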

Changed in nova:
milestone: none → juno-rc1
importance: Undecided → Critical
importance: Critical → High
status: New → Confirmed
Boden R (boden)
Changed in nova:
assignee: nobody → Boden R (boden)
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/123515

OpenStack Infra (hudson-openstack) wrote :

Fix proposed to branch: master
Review: https://review.openstack.org/123696

Changed in nova:
assignee: Boden R (boden) → Nikola Đipanov (ndipanov)
Changed in nova:
assignee: Nikola Đipanov (ndipanov) → Boden R (boden)
OpenStack Infra (hudson-openstack) wrote : Change abandoned on nova (master)

Change abandoned by Nikola Dipanov (<email address hidden>) on branch: master
Review: https://review.openstack.org/123696
Reason: Abandoning in favour of https://review.openstack.org/#/c/123515/

OpenStack Infra (hudson-openstack) wrote :

Change abandoned by Nikola Dipanov (<email address hidden>) on branch: master
Review: https://review.openstack.org/123696
Reason: Actually no - the other fix is better... move along :)

tags: added: libvirt
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/123515
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=021202e80da7ce587a4d36c464c7b6835e5bface
Submitter: Jenkins
Branch: master

commit 021202e80da7ce587a4d36c464c7b6835e5bface
Author: Boden R <email address hidden>
Date: Tue Sep 23 12:55:54 2014 -0400

    Return vcpu pin set as set rather than list

    The current implementation of the libvirt driver makes mixed assumptions
    about the vcpu pin set returned from hardware.get_vcpu_pin_set(). Most
    places in the code assume it's a set and perform set operations on the
    structure. However a few places assume it's a list. Given the mixed
    assumptions about the structure type, the existing code was trying
    to perform set operations on a list.

    This patch changes the get_vcpu_pin_set() method to return a set
    rather than a list and handles any edge cases where consumers need
    a list. It also updates any relevant unit tests accordingly.

    Change-Id: I66d5cbc0e2d370d9d2d2ab2bad2c5b348bedba6c
    Closes-Bug: 1372829
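
The approach in the commit message, returning a set from the parsing function and converting only where a consumer needs a list, can be sketched as follows. This is a simplified, hypothetical parser (`parse_pin_set` is an invented name) that handles only comma-separated ids and inclusive ranges such as the `0-9` value from the report. It is not the nova implementation and does not try to cover any other syntax the real option may accept.

```python
def parse_pin_set(spec):
    """Hypothetical parser: turn a spec like "0-9" or "0-3,8" into a set of ints."""
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty items such as a trailing comma
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError("invalid range: %r" % part)
            cpus.update(range(start, end + 1))  # ranges are inclusive
        else:
            cpus.add(int(part))
    return cpus


pin_set = parse_pin_set("0-9")

# Set operations now work directly on the result.
cell_cpuset = {8, 9, 10, 11}
cell_cpuset &= pin_set

# A consumer that needs a list converts explicitly.
pin_list = sorted(pin_set)
```

Returning a set keeps the type consistent for the callers that intersect it with other CPU sets, and `sorted()` gives the remaining callers a deterministic list.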

Changed in nova:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in nova:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: juno-rc1 → 2014.2