Can't schedule VM with NUMA topology and PCI device

Bug #1441169 reported by Przemyslaw Czesnowicz
Affects                   Status        Importance  Assigned to            Milestone
OpenStack Compute (nova)  Fix Released  Medium      Przemyslaw Czesnowicz  -
Kilo                      Fix Released  Medium      Nikola Đipanov         -

Bug Description

NUMATopologyFilter will always return 0 hosts for a VM that has a NUMA topology defined and has also requested a PCI device.

This happens because the PCI numa_node information is converted to a string in PciDevicePool, while PciDeviceStats expects numa_node to be an int.

2015-04-07 14:08:51.399 DEBUG nova.filters [req-d417e042-2d61-4fc5-a38b-8898f4f512d0 admin demo] Starting with 1 host(s) from (pid=47627) get_filtered_objects /shared/stack/nova/nova/filters.py:70
2015-04-07 14:08:51.399 DEBUG nova.filters [req-d417e042-2d61-4fc5-a38b-8898f4f512d0 admin demo] Filter RamFilter returned 1 host(s) from (pid=47627) get_filtered_objects /shared/stack/nova/nova/filters.py:84
2015-04-07 14:08:51.399 DEBUG nova.filters [req-d417e042-2d61-4fc5-a38b-8898f4f512d0 admin demo] Filter ComputeFilter returned 1 host(s) from (pid=47627) get_filtered_objects /shared/stack/nova/nova/filters.py:84
2015-04-07 14:08:51.400 DEBUG nova.filters [req-d417e042-2d61-4fc5-a38b-8898f4f512d0 admin demo] Filter AvailabilityZoneFilter returned 1 host(s) from (pid=47627) get_filtered_objects /shared/stack/nova/nova/filters.py:84
2015-04-07 14:08:51.400 DEBUG nova.filters [req-d417e042-2d61-4fc5-a38b-8898f4f512d0 admin demo] Filter ComputeCapabilitiesFilter returned 1 host(s) from (pid=47627) get_filtered_objects /shared/stack/nova/nova/filters.py:84
2015-04-07 14:08:51.400 DEBUG nova.filters [req-d417e042-2d61-4fc5-a38b-8898f4f512d0 admin demo] Filter ImagePropertiesFilter returned 1 host(s) from (pid=47627) get_filtered_objects /shared/stack/nova/nova/filters.py:84
2015-04-07 14:08:51.400 DEBUG nova.filters [req-d417e042-2d61-4fc5-a38b-8898f4f512d0 admin demo] Filter PciPassthroughFilter returned 1 host(s) from (pid=47627) get_filtered_objects /shared/stack/nova/nova/filters.py:84
2015-04-07 14:08:53.348 INFO nova.filters [req-d417e042-2d61-4fc5-a38b-8898f4f512d0 admin demo] Filter NUMATopologyFilter returned 0 hosts
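The failure mode can be sketched in a few lines of plain Python. This is an illustration only, not nova's actual filtering code; the pool dict and the `pools_for_numa_node` helper are hypothetical:

```python
# Illustration only: hypothetical helper, not nova's PciDeviceStats code.
# Pools round-tripped through PciDevicePool end up with numa_node as a
# string, while the scheduler compares against the instance's node as an int.

def pools_for_numa_node(pools, requested_node):
    """Return pools whose numa_node equals the requested NUMA node."""
    return [p for p in pools if p.get("numa_node") == requested_node]

# numa_node arrives as the string "0" after the conversion...
pools = [{"vendor_id": "8086", "product_id": "10fb",
          "numa_node": "0", "count": 2}]

# ...but the request carries the int 0, so equality never holds and the
# NUMATopologyFilter is left with 0 matching hosts.
assert pools_for_numa_node(pools, 0) == []
assert pools_for_numa_node(pools, "0") == pools  # only a string would match
```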

Changed in nova:
assignee: nobody → Przemyslaw Czesnowicz (pczesno)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/171304

Changed in nova:
status: New → In Progress
tags: added: kilo-rc-potential
John Garbutt (johngarbutt) wrote :

Due to the scope of impact, in terms of the number of users, I am thinking it's medium; a good one to get in if we could.

Changed in nova:
importance: Undecided → Medium
tags: added: kilo-backport-potential
Nikola Đipanov (ndipanov) wrote :

I'd say we should do our best to get this in for Kilo, as one of the blueprints completed in Kilo is completely broken without the patch, and the patch itself is quite small (it basically adds a field to an object).

Changed in nova:
assignee: Przemyslaw Czesnowicz (pczesno) → Nikola Đipanov (ndipanov)
Changed in nova:
assignee: Nikola Đipanov (ndipanov) → Przemyslaw Czesnowicz (pczesno)
OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/171304
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7db1ebc66c59205f78829d1e9cd10dcc1201d798
Submitter: Jenkins
Branch: master

commit 7db1ebc66c59205f78829d1e9cd10dcc1201d798
Author: Przemyslaw Czesnowicz <email address hidden>
Date: Tue Apr 7 16:31:05 2015 +0100

    Add numa_node field to PciDevicePool

    Without this field, PciDevicePool.from_dict will treat numa_node key in
    the dict as a tag, which in turn means that the scheduler client will
    drop it when converting stats to objects before reporting.

    Converting it back to dicts on the scheduler side thus will not have
    access to the numa_node information which would cause any requests that
    will look for the exact match between the device and instance NUMA nodes
    in the NUMATopologyFilter to fail.

    Change-Id: I7381f909620e8e787178c0be9a362f8d3eb9ff7d
    Closes-Bug: #1441169
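The from_dict behaviour the commit message describes can be sketched as follows. This is a simplified model, not nova's actual PciDevicePool implementation; `KNOWN_FIELDS` and `pool_from_dict` are illustrative names:

```python
# Simplified sketch (not nova's actual object code): keys without a declared
# field are stashed as string-valued "tags", so numa_node loses its int type
# unless it becomes a real field on the pool object.

KNOWN_FIELDS = {"vendor_id", "product_id", "count"}  # before the fix

def pool_from_dict(d, known_fields=KNOWN_FIELDS):
    pool = {k: d[k] for k in d if k in known_fields}
    # everything not declared as a field becomes a string tag
    pool["tags"] = {k: str(v) for k, v in d.items() if k not in known_fields}
    return pool

before = pool_from_dict({"vendor_id": "8086", "product_id": "10fb",
                         "count": 2, "numa_node": 0})
assert "numa_node" not in before             # dropped as a first-class value
assert before["tags"]["numa_node"] == "0"    # survives only as a string tag

# After the fix, numa_node is a declared field and keeps its int type.
after = pool_from_dict({"vendor_id": "8086", "product_id": "10fb",
                        "count": 2, "numa_node": 0},
                       known_fields=KNOWN_FIELDS | {"numa_node"})
assert after["numa_node"] == 0
```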

Changed in nova:
status: In Progress → Fix Committed
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/175788

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (stable/kilo)

Reviewed: https://review.openstack.org/175788
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=7a609f153808f7cee1edbbb36accc292fa8df0d0
Submitter: Jenkins
Branch: stable/kilo

commit 7a609f153808f7cee1edbbb36accc292fa8df0d0
Author: Przemyslaw Czesnowicz <email address hidden>
Date: Tue Apr 7 16:31:05 2015 +0100

    Add numa_node field to PciDevicePool

    Without this field, PciDevicePool.from_dict will treat numa_node key in
    the dict as a tag, which in turn means that the scheduler client will
    drop it when converting stats to objects before reporting.

    Converting it back to dicts on the scheduler side thus will not have
    access to the numa_node information which would cause any requests that
    will look for the exact match between the device and instance NUMA nodes
    in the NUMATopologyFilter to fail.

    Closes-Bug: #1441169
    (cherry picked from commit 7db1ebc66c59205f78829d1e9cd10dcc1201d798)

    Conflicts:
     nova/tests/unit/objects/test_objects.py

    Change-Id: I7381f909620e8e787178c0be9a362f8d3eb9ff7d

Thierry Carrez (ttx)
tags: removed: kilo-backport-potential kilo-rc-potential
OpenStack Infra (hudson-openstack) wrote : Fix proposed to nova (master)

Fix proposed to branch: master
Review: https://review.openstack.org/179284

OpenStack Infra (hudson-openstack) wrote : Fix merged to nova (master)

Reviewed: https://review.openstack.org/179284
Committed: https://git.openstack.org/cgit/openstack/nova/commit/?id=5228d4e418734164ffa5ccd91d2865d9cc659c00
Submitter: Jenkins
Branch: master

commit 906ab9d6522b3559b4ad36d40dec3af20397f223
Author: He Jie Xu <email address hidden>
Date: Thu Apr 16 07:09:34 2015 +0800

    Update rpc version aliases for kilo

    Update all of the rpc client API classes to include a version alias
    for the latest version implemented in Kilo. This alias is needed when
    doing rolling upgrades from Kilo to Liberty. With this in place, you can
    ensure all services only send messages that both Kilo and Liberty will
    understand.

    Closes-Bug: #1444745

    Conflicts:
     nova/conductor/rpcapi.py

    NOTE(alex_xu): The conflict is due to there are some logs already added
    into the master.

    Change-Id: I2952aec9aae747639aa519af55fb5fa25b8f3ab4
    (cherry picked from commit 78a8b5802ca148dcf37c5651f75f2126d261266e)

commit f191a2147a21c7e50926b288768a96900cf4c629
Author: Hans Lindgren <email address hidden>
Date: Fri Apr 24 13:10:39 2015 +0200

    Add security group calls missing from latest compute rpc api version bump

    The recent compute rpc api version bump missed out on the security group
    related calls that are part of the api.

    One possible reason is that both compute and security group client side
    rpc api:s share a single target, which is of little value and only cause
    mistakes like this.

    This change eliminates future problems like this by combining them into
    one to get a 1:1 relationship between client and server api:s.

    Change-Id: I9207592a87fab862c04d210450cbac47af6a3fd7
    Closes-Bug: #1448075
    (cherry picked from commit bebd00b117c68097203adc2e56e972d74254fc59)

commit a2872a9262985bd0ee2c6df4f7593947e0516406
Author: Dan Smith <email address hidden>
Date: Wed Apr 22 09:02:03 2015 -0700

    Fix migrate_flavor_data() to catch instances with no instance_extra rows

    The way the query was being performed previously, we would not see any
    instances that didn't have a row in instance_extra. This could happen if
    an instance hasn't been touched for several releases, or if the data
    set is old.

    The fix is a simple change to use outerjoin instead of join. This patch
    includes a test that ensures that instances with no instance_extra rows
    are included in the migration. If we query an instance without such a
    row, we create it before doing a save on the instance.

    Closes-Bug: #1447132
    Change-Id: I2620a8a4338f5c493350f26cdba3e41f3cb28de7
    (cherry picked from commit 92714accc49e85579f406de10ef8b3b510277037)
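The join vs. outerjoin difference this commit describes can be illustrated in plain Python, with dicts standing in for the database rows. The table contents are hypothetical and this is not nova's SQLAlchemy query:

```python
# Illustration of the migrate_flavor_data() bug: an inner join silently
# skips instances with no instance_extra row, while an outer join keeps
# them (paired with None) so the migration can backfill the missing row.

instances = [{"id": 1}, {"id": 2}]      # instance 2 predates instance_extra
extras = {1: {"flavor": "m1.small"}}    # instance_extra rows keyed by id

def inner_join(insts, ex):
    return [(i, ex[i["id"]]) for i in insts if i["id"] in ex]

def outer_join(insts, ex):
    return [(i, ex.get(i["id"])) for i in insts]

assert [i["id"] for i, _ in inner_join(instances, extras)] == [1]     # 2 missed
assert [i["id"] for i, _ in outer_join(instances, extras)] == [1, 2]  # 2 kept
assert outer_join(instances, extras)[1][1] is None  # row to create before save
```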

commit e3a7b83834d1ae2064094e9613df75e3b07d77cd
Author: OpenStack Proposal Bot <email address hidden>
Date: Thu Apr 23 02:18:41 2015 +0000

    Updated from global requirements

    Change-Id: I5d4acd36329fe2dccb5772fed3ec55b442597150

commit 8c9b5e620eef3233677b64cd234ed2551e6aa182
Author: Divya <email address hidden>
Date: Tue Apr 21 08:26:29 2015 +0200

    Control create/delete flavor api permissions using policy.json

    The permissions of ...

Thierry Carrez (ttx)
Changed in nova:
milestone: none → liberty-1
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in nova:
milestone: liberty-1 → 12.0.0