HostState.consume_from_instance fails when instance has numa topology

Bug #1444021 reported by Przemyslaw Czesnowicz on 2015-04-14
Affects                   Importance  Assigned to
OpenStack Compute (nova)  High        Przemyslaw Czesnowicz
Kilo                      High        Nikola Đipanov

Bug Description

HostState.consume_from_instance raises an AttributeError if the instance has a NUMA topology defined.
This happens because the list of requests is not retrieved from pci_requests before it is passed to numa_fit_instance_to_host, which expects the request list.

2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher Traceback (most recent call last):
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 142, in _dispatch_and_reply
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher executor_callback))
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 186, in _dispatch
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher executor_callback)
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py", line 130, in _do_dispatch
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher result = func(ctxt, **new_args)
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher File "/usr/lib/python2.7/site-packages/oslo_messaging/rpc/server.py", line 142, in inner
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher return func(*args, **kwargs)
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/scheduler/manager.py", line 86, in select_destinations
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher filter_properties)
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 67, in select_destinations
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher filter_properties)
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/scheduler/filter_scheduler.py", line 163, in _schedule
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher chosen_host.obj.consume_from_instance(instance_properties)
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/scheduler/host_manager.py", line 280, in consume_from_instance
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher pci_requests=pci_requests, pci_stats=self.pci_stats)
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/virt/hardware.py", line 1034, in numa_fit_instance_to_host
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher cells)):
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/pci/stats.py", line 222, in support_requests
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher for r in requests])
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher File "/opt/stack/nova/nova/pci/stats.py", line 196, in _apply_request
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher matching_pools = self._filter_pools_for_spec(pools, request.spec)
2015-04-14 14:41:39.243 TRACE oslo_messaging.rpc.dispatcher AttributeError: 'unicode' object has no attribute 'spec'
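
The failure mode in the trace can be reproduced in isolation. The following is only a sketch, not the nova code: Request, support_requests, check_buggy and check_fixed are hypothetical stand-ins, and it assumes the pci_requests container is dict-like with the real list stored under a "requests" key. Iterating such a container yields its string keys, so any access to .spec fails. The trace above is from Python 2.7, hence 'unicode'; on Python 3 the same mistake reports 'str'.

```python
from collections import namedtuple

# Hypothetical stand-in for a PCI request; only the `spec` attribute matters.
Request = namedtuple("Request", ["count", "spec"])


def support_requests(requests):
    """Toy check that reads request.spec for every request it is given."""
    return all(bool(r.spec) for r in requests)


# Assumed shape: a dict-like container holding the real list under "requests".
pci_requests = {
    "instance_uuid": "hypothetical-uuid",
    "requests": [Request(count=1, spec=[{"vendor_id": "8086"}])],
}


def check_buggy(container):
    # Iterating a dict yields its string keys, so r.spec raises AttributeError.
    return support_requests(container)


def check_fixed(container):
    # Extract the list of requests before handing it on.
    return support_requests(container["requests"])


if __name__ == "__main__":
    try:
        check_buggy(pci_requests)
    except AttributeError as exc:
        print("buggy path failed:", exc)
    print("fixed path result:", check_fixed(pci_requests))
```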

If the instance has pci_requests there is a second problem: apply_requests is called first and removes PCI devices from the pool, so
numa_fit_instance_to_host may then fail because there are no devices left.
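
The ordering problem can be illustrated with a toy device pool. This is a sketch under assumptions, not the nova implementation: ToyPciStats, numa_fit, consume_apply_first and consume_fit_first are hypothetical names, and requests are reduced to plain device counts. It only shows why consuming devices before running the fit check can make the check fail on a host that actually had enough devices.

```python
class ToyPciStats:
    """Hypothetical pool that only tracks a count of free devices."""

    def __init__(self, free_devices):
        self.free = free_devices

    def support_requests(self, counts):
        # True if the pool can still satisfy all requested devices.
        return sum(counts) <= self.free

    def apply_requests(self, counts):
        # Consume devices from the pool; fail if there are not enough.
        if not self.support_requests(counts):
            raise ValueError("not enough free devices")
        self.free -= sum(counts)


def numa_fit(counts, stats):
    # Stand-in for the fit check: it consults the pool, it does not consume.
    return stats.support_requests(counts)


def consume_apply_first(counts, stats):
    # Problematic order: devices are consumed before the fit check runs.
    stats.apply_requests(counts)
    return numa_fit(counts, stats)


def consume_fit_first(counts, stats):
    # Order proposed in this report: check the fit, then consume.
    fits = numa_fit(counts, stats)
    stats.apply_requests(counts)
    return fits


if __name__ == "__main__":
    print("apply first:", consume_apply_first([1], ToyPciStats(1)))
    print("fit first:", consume_fit_first([1], ToyPciStats(1)))
```

With a single free device and a single requested device, the apply-first order reports no fit because the pool is already empty when the check runs, while the fit-first order succeeds and still leaves the pool correctly drained.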

Changed in nova:
assignee: nobody → Przemyslaw Czesnowicz (pczesno)

Fix proposed to branch: master
Review: https://review.openstack.org/173394

Changed in nova:
status: New → In Progress
Nikola Đipanov (ndipanov) wrote:

It would be really great to have this and https://bugs.launchpad.net/nova/+bug/1438238 in for the Kilo release, as without these fixes one of the features done in Kilo (http://specs.openstack.org/openstack/nova-specs/specs/kilo/implemented/input-output-based-numa-scheduling.html) is completely broken.

tags: added: kilo-backport-potential kilo-rc-potential
Changed in nova:
importance: Undecided → Medium
importance: Medium → High

Reviewed: https://review.openstack.org/173394
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=0913e799e9ce3138235f5ea6f80159f468ad2aaa
Submitter: Jenkins
Branch: master

commit 0913e799e9ce3138235f5ea6f80159f468ad2aaa
Author: Przemyslaw Czesnowicz <email address hidden>
Date: Tue Apr 14 16:28:57 2015 +0100

    Fix handling of pci_requests in consume_from_instance.

    Properly retrieve requests from pci_requests in consume_from_instance.
    Without this the call to numa_fit_instance_to_host will fail because
    it expects the request list.
    And change the order in which apply_requests and numa_fit_instance_to_host
    are called. Calling apply_requests first will remove devices from pools
    and may make numa_fit_instance_to_host fail.

    Change-Id: I41cf4e8e5c1dea5f91e5261a8f5e88f46c7994ef
    Closes-bug: #1444021

Changed in nova:
status: In Progress → Fix Committed

Reviewed: https://review.openstack.org/175789
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=bf79742d26ae66886bcdc55eeaf27e1d7ce24be5
Submitter: Jenkins
Branch: stable/kilo

commit bf79742d26ae66886bcdc55eeaf27e1d7ce24be5
Author: Przemyslaw Czesnowicz <email address hidden>
Date: Tue Apr 14 16:28:57 2015 +0100

    Fix handling of pci_requests in consume_from_instance.

    Properly retrieve requests from pci_requests in consume_from_instance.
    Without this the call to numa_fit_instance_to_host will fail because
    it expects the request list.
    And change the order in which apply_requests and numa_fit_instance_to_host
    are called. Calling apply_requests first will remove devices from pools
    and may make numa_fit_instance_to_host fail.

    Change-Id: I41cf4e8e5c1dea5f91e5261a8f5e88f46c7994ef
    Closes-bug: #1444021
    (cherry picked from commit 0913e799e9ce3138235f5ea6f80159f468ad2aaa)

Thierry Carrez (ttx) on 2015-04-23
tags: removed: kilo-backport-potential kilo-rc-potential

Reviewed: https://review.openstack.org/179284
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5228d4e418734164ffa5ccd91d2865d9cc659c00
Submitter: Jenkins
Branch: master

commit 906ab9d6522b3559b4ad36d40dec3af20397f223
Author: He Jie Xu <email address hidden>
Date: Thu Apr 16 07:09:34 2015 +0800

    Update rpc version aliases for kilo

    Update all of the rpc client API classes to include a version alias
    for the latest version implemented in Kilo. This alias is needed when
    doing rolling upgrades from Kilo to Liberty. With this in place, you can
    ensure all services only send messages that both Kilo and Liberty will
    understand.

    Closes-Bug: #1444745

    Conflicts:
     nova/conductor/rpcapi.py

    NOTE(alex_xu): The conflict is due to there are some logs already added
    into the master.

    Change-Id: I2952aec9aae747639aa519af55fb5fa25b8f3ab4
    (cherry picked from commit 78a8b5802ca148dcf37c5651f75f2126d261266e)

commit f191a2147a21c7e50926b288768a96900cf4c629
Author: Hans Lindgren <email address hidden>
Date: Fri Apr 24 13:10:39 2015 +0200

    Add security group calls missing from latest compute rpc api version bump

    The recent compute rpc api version bump missed out on the security group
    related calls that are part of the api.

    One possible reason is that both compute and security group client side
    rpc api:s share a single target, which is of little value and only cause
    mistakes like this.

    This change eliminates future problems like this by combining them into
    one to get a 1:1 relationship between client and server api:s.

    Change-Id: I9207592a87fab862c04d210450cbac47af6a3fd7
    Closes-Bug: #1448075
    (cherry picked from commit bebd00b117c68097203adc2e56e972d74254fc59)

commit a2872a9262985bd0ee2c6df4f7593947e0516406
Author: Dan Smith <email address hidden>
Date: Wed Apr 22 09:02:03 2015 -0700

    Fix migrate_flavor_data() to catch instances with no instance_extra rows

    The way the query was being performed previously, we would not see any
    instances that didn't have a row in instance_extra. This could happen if
    an instance hasn't been touched for several releases, or if the data
    set is old.

    The fix is a simple change to use outerjoin instead of join. This patch
    includes a test that ensures that instances with no instance_extra rows
    are included in the migration. If we query an instance without such a
    row, we create it before doing a save on the instance.

    Closes-Bug: #1447132
    Change-Id: I2620a8a4338f5c493350f26cdba3e41f3cb28de7
    (cherry picked from commit 92714accc49e85579f406de10ef8b3b510277037)
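
The join versus outer join difference described in the commit above can be shown with a small standalone sketch. It uses the standard-library sqlite3 module and raw SQL rather than the query code in nova, and the table contents, the flavor column and the uuids helper are made up for illustration.

```python
import sqlite3

# In-memory database with one instance that has no instance_extra row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE instances (uuid TEXT PRIMARY KEY);
    CREATE TABLE instance_extra (instance_uuid TEXT, flavor TEXT);
    INSERT INTO instances VALUES ('old-instance'), ('new-instance');
    INSERT INTO instance_extra VALUES ('new-instance', 'm1.small');
""")


def uuids(join_kind):
    """Return instance uuids selected with the given kind of join."""
    query = (
        "SELECT i.uuid FROM instances AS i "
        f"{join_kind} instance_extra AS e ON e.instance_uuid = i.uuid "
        "ORDER BY i.uuid"
    )
    return [row[0] for row in conn.execute(query)]


# An inner join drops instances without a matching instance_extra row.
inner = uuids("JOIN")
# A left outer join keeps them, with NULLs for the missing columns.
outer = uuids("LEFT OUTER JOIN")

if __name__ == "__main__":
    print("inner join:", inner)
    print("outer join:", outer)
```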

commit e3a7b83834d1ae2064094e9613df75e3b07d77cd
Author: OpenStack Proposal Bot <email address hidden>
Date: Thu Apr 23 02:18:41 2015 +0000

    Updated from global requirements

    Change-Id: I5d4acd36329fe2dccb5772fed3ec55b442597150

commit 8c9b5e620eef3233677b64cd234ed2551e6aa182
Author: Divya <email address hidden>
Date: Tue Apr 21 08:26:29 2015 +0200

    Control create/delete flavor api permissions using policy.json

    The permissions of ...

Thierry Carrez (ttx) on 2015-06-24
Changed in nova:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx) on 2015-10-15
Changed in nova:
milestone: liberty-1 → 12.0.0