[dpdk]OpenvSwitch pmd threads fail to start due to incorrect cpu pinning if host has more than 1 NUMA node

Bug #1584006 reported by Mikhail Chernik
This bug affects 1 person
Affects              Status         Importance  Assigned to        Milestone
Fuel for OpenStack   Fix Committed  High        Arthur Svechnikov
  Mitaka             Fix Released   High        Arthur Svechnikov
  Newton             Fix Committed  High        Arthur Svechnikov

Bug Description

Environment: MOS 9.0 ISO 370, hardware lab

Steps to reproduce:
1. Create an environment and add a host with more than one NUMA node as a compute node
2. Configure hugepages and DPDK CPU pinning (at least 4096 2 MB hugepages, at least one CPU pinned for DPDK)
3. Enable DPDK on one interface, move the Private network to this interface, and deploy the cluster
4. Check OVS process CPU utilization and threads on the compute node, e.g. with "top -n 1 -bH -p `pgrep ovs-vswitchd`" (see the sketch below)
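
A minimal sketch of inspecting the result in step 4 (the taskset loop for checking per-thread pinning is an addition, not part of the original report):

  top -n 1 -bH -p "$(pgrep ovs-vswitchd)" | grep pmd

  # optionally confirm each pmd thread is pinned to a single core
  for tid in $(ps -T -p "$(pgrep ovs-vswitchd)" -o tid=,comm= | awk '/pmd/ {print $1}'); do
      taskset -cp "$tid"
  done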

Expected result:
There are OVS threads named pmdXX, each fully utilizing one CPU core

Actual result:
No pmd threads; an error message appears in /var/log/openvswitch/ovs-vswitchd.log:
2016-05-20T02:51:21.404Z|00021|dpif_netdev|ERR|Cannot create pmd threads due to out of unpinned cores on numa node

Additional information:

PMD threads start successfully if the NIC and all cores in pmd-cpu-mask are on the same NUMA node

Additionally, the format of pmd-cpu-mask causes a warning in /var/log/openvswitch/ovs-vswitchd.log:
2016-05-20T02:51:20.779Z|00018|ovs_numa|WARN|Invalid cpu mask: x

Diagnostic snapshot: http://mos-scale-share.mirantis.com/fuel-snapshot-2016-05-20_08-51-41.tar.xz
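
The NUMA locality noted above (PMD threads only start when the NIC and the pmd-cpu-mask cores share a NUMA node) can be checked directly; a minimal sketch, where 0000:05:00.0 stands in for the DPDK NIC's PCI address:

  # NUMA node of the DPDK NIC (looked up by PCI address, since a NIC bound to DPDK has no netdev)
  cat /sys/bus/pci/devices/0000:05:00.0/numa_node

  # which cores belong to which NUMA node
  lscpu | grep "NUMA node"

  # cores currently requested for PMD threads
  ovs-vsctl get Open_vSwitch . other_config:pmd-cpu-mask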

Dmitry Klenov (dklenov)
tags: added: area-python
description: updated
summary: [dpdk]OpenvSwitch pmd threads fail to start due to incorrect cpu pinning
- if hoast has more than 1 NUMA node
+ if host has more than 1 NUMA node
Revision history for this message
Mikhail Chernik (mchernik) wrote :

To sum up:
There are 2 CPU masks for OVS+DPDK in astute.yaml:
1) ovs_core_mask, which goes into /etc/default/openvswitch-switch ( -c 0xXXX )
2) ovs_pmd_core_mask, which goes into the OVS database ( ovs-vsctl get Open_vSwitch . other_config:pmd-cpu-mask ). It must be specified without a leading 0x

For successful operation, both parameters must select CPU cores from the NUMA node to which the NIC is attached.
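
A minimal sketch of where each value ends up (the core numbers are hypothetical; the exact DPDK_OPTS line is an assumption based on the Ubuntu openvswitch-switch-dpdk packaging of that release and may differ):

  # 1) ovs_core_mask -> /etc/default/openvswitch-switch, hex with the 0x prefix,
  #    e.g. core 1 only:
  #    DPDK_OPTS='--dpdk -c 0x2 -n 4 --socket-mem 1024,0'

  # 2) ovs_pmd_core_mask -> OVS database, hex digits without the leading 0x,
  #    e.g. cores 2 and 3 (binary 1100 = c):
  ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=c
  ovs-vsctl get Open_vSwitch . other_config:pmd-cpu-mask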

Revision history for this message
Atsuko Ito (yottatsa) wrote :

It's basically correct. Here is a full summary for further understanding and for the docs team.

Cores for PMD processes SHOULD be from the NUMA/cluster* where the NICs are located, with a minimum of 1 PMD per NUMA/cluster. Additional cores MAY be scheduled from other NUMA/clusters* where NICs or instances are located (see performance notes**).

The core for the OVS core process should be from one of the NUMA/clusters* where a PMD is scheduled.
Memory SHOULD be allocated on the NUMA node where a PMD is scheduled.

* "Cluster" is used when the box is in cluster-on-die mode; cluster == socket.

** For performance, more than 1 PMD per NUMA node can be scheduled. For VM-to-wire traffic it should be the NUMA/cluster with the NICs; for VM-to-VM traffic inside the box it should be the NUMA node where the instances are running. Rule of thumb: 1 PMD can process about 3 Mpps of traffic. n-dpdk-rxqs should be adjusted to the number of PMDs per NIC.

E.g., to utilize a 10GigE interface bidirectionally we need 12 Mpps in each direction, so we need 8 PMDs (8 PMDs * 3 Mpps = 24 Mpps = 12 Mpps * 2 (in/out)): ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=8.
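
A sketch of building the matching pmd-cpu-mask for that 8-PMD example, assuming (hypothetically) that the NIC sits on NUMA node 1 and that node owns cores 8-15:

  # bits 8..15 set -> ff00 (remember: pmd-cpu-mask takes hex digits without a 0x prefix)
  MASK=0
  for core in $(seq 8 15); do MASK=$((MASK | (1 << core))); done
  printf 'pmd-cpu-mask=%x\n' "$MASK"    # prints ff00
  ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask="$(printf '%x' "$MASK")"
  ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=8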

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (master)

Fix proposed to branch: master
Review: https://review.openstack.org/322212

Changed in fuel:
status: Confirmed → In Progress
Changed in fuel:
assignee: Arthur Svechnikov (asvechnikov) → Fedor Zhadaev (fzhadaev)
Changed in fuel:
assignee: Fedor Zhadaev (fzhadaev) → Arthur Svechnikov (asvechnikov)
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (master)

Reviewed: https://review.openstack.org/322212
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=76e270ef966dd7735eac3e87f94bb0a39e49388c
Submitter: Jenkins
Branch: master

commit 76e270ef966dd7735eac3e87f94bb0a39e49388c
Author: Artur Svechnikov <email address hidden>
Date: Fri May 27 17:43:24 2016 +0300

    Change CPU distribution

    CPU distribution mechanism should be changed due
    to incorrect requirements for Nova and DPDK CPU allocation

    Changes:
     * Change CPU distribution
     * Add function for recognizing DPDK NICs for node
     * Remove requirement of enabled hugepages for
       DPDK NICs (it's checked before deployment)
     * Change HugePages distribution. Now it takes into
       account Nova CPU placement

    Requirements Before:
     DPDK's CPUs should be located on the same NUMAs as
     Nova CPUs

    Requirements Now:
     1. DPDK component CPU pinning has two parts:
         * OVS pmd core CPUs - These CPUs must be placed on the
           NUMAs where the DPDK NIC is located. Since a DPDK NIC can
           handle about 12 Mpps and 1 CPU can handle about
           3 Mpps, there is no need to place more than
           4 CPUs per NIC. Let's call all remaining CPUs
           additional CPUs.
         * OVS Core CPUs - 1 CPU is enough and that CPU should
           be taken from any NUMA where at least 1 OVS pmd core
           CPU is located

     2. To improve Nova and DPDK performance, all additional CPUs
        should be distributed along with Nova's CPUs as
        OVS pmd core CPUs.

    Change-Id: Ib2adf39c36b2e1536bb02b07fd8b5af50e3744b2
    Closes-Bug: #1584006
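
To illustrate the new requirements, a sketch for a hypothetical two-NUMA host (NUMA0: cores 0-7, NUMA1: cores 8-15, DPDK NIC on NUMA1; the specific cores are placeholders, not what Fuel necessarily picks):

  # at most 4 OVS pmd core CPUs, all on the NIC's NUMA node, e.g. cores 12-15 -> f000
  ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=f000

  # 1 OVS core CPU from a NUMA node that already hosts a pmd CPU, e.g. core 11,
  # which would appear as "-c 0x800" in /etc/default/openvswitch-switch

  # any remaining ("additional") DPDK CPUs are spread as extra pmd cores
  # alongside Nova's CPUs, per point 2 above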

Changed in fuel:
status: In Progress → Fix Committed
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-web (stable/mitaka)

Fix proposed to branch: stable/mitaka
Review: https://review.openstack.org/326392

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-web (stable/mitaka)

Reviewed: https://review.openstack.org/326392
Committed: https://git.openstack.org/cgit/openstack/fuel-web/commit/?id=260b0b8f99bdfeb784be1a0b7374cd284d3b68e9
Submitter: Jenkins
Branch: stable/mitaka

commit 260b0b8f99bdfeb784be1a0b7374cd284d3b68e9
Author: Artur Svechnikov <email address hidden>
Date: Fri May 27 17:43:24 2016 +0300

    Change CPU distribution

    CPU distribution mechanism should be changed due
    to incorrect requirements for Nova and DPDK CPU allocation

    Changes:
     * Change CPU distribution
     * Add function for recognizing DPDK NICs for node
     * Remove requirement of enabled hugepages for
       DPDK NICs (it's checked before deployment)
     * Change HugePages distribution. Now it takes into
       account Nova CPU placement

    Requirements Before:
     DPDK's CPUs should be located on the same NUMAs as
     Nova CPUs

    Requirements Now:
     1. DPDK component CPU pinning has two parts:
         * OVS pmd core CPUs - These CPUs must be placed on the
           NUMAs where the DPDK NIC is located. Since a DPDK NIC can
           handle about 12 Mpps and 1 CPU can handle about
           3 Mpps, there is no need to place more than
           4 CPUs per NIC. Let's call all remaining CPUs
           additional CPUs.
         * OVS Core CPUs - 1 CPU is enough and that CPU should
           be taken from any NUMA where at least 1 OVS pmd core
           CPU is located

     2. To improve Nova and DPDK performance, all additional CPUs
        should be distributed along with Nova's CPUs as
        OVS pmd core CPUs.

    Change-Id: Ib2adf39c36b2e1536bb02b07fd8b5af50e3744b2
    Closes-Bug: #1584006
    (cherry picked from commit 76e270ef966dd7735eac3e87f94bb0a39e49388c)

Revision history for this message
Sergii (sgudz) wrote :

Verified on MOS 9.0 ISO 459. Fixed.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on fuel-qa (master)

Change abandoned by Vladimir Khlyunev (<email address hidden>) on branch: master
Review: https://review.openstack.org/320932
Reason: 8 months ago; also, this test scenario was included in the multiqueue tests
