AIO: Some platform processes affined to the wrong cores

Bug #1900174 reported by Jim Gauld
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Jim Gauld

Bug Description

Brief Description
-----------------
On an AIO low-latency system, a few tasks have floating affinity masks.
This does not align with the process engineering that isolates platform tasks from applications.

Inspecting the output of ps-sched.sh shows the following:
- On AIO systems with >= 64 cpus, the drbd_w_* tasks are affined to specific cores, including isolcpus and application cores
- The kswapd<x> processes float across entire numa nodes

Root cause is known.

Severity
--------
Major: For low-latency systems, these tasks can wake up and cause degraded performance.

Steps to Reproduce
------------------
Install an AIO system.
Inspect the output of ps-sched.sh (also included in 'collect').

Expected Behavior
------------------
All platform tasks should have the platform affinity mask.

Actual Behavior
----------------
When the DRBD tasks do not get the correct affinity, the kernel log shows a warning like this:
2020-10-13T20:55:34.079 controller-0 kernel: warning [ 269.423462] drbd drbd-dockerdistribution: Overflow in bitmap_parse(300000003), truncating to 64 bits

In the ps-sched.sh output, the drbd_w_* tasks do not have the platform mask 0x300000003:
controller-0:~$ ps-sched.sh | grep drbd
 84375 84375 2 S TS -20 - 0 0x300000003 32 drbd-reissue [drbd-reissue]
 90678 90678 2 S TS -20 - 0 0x300000003 33 drbd8_submit [drbd8_submit]
 90682 90682 2 S TS -20 - 0 0x300000003 33 drbd7_submit [drbd7_submit]
 90687 90687 2 S TS -20 - 0 0x300000003 1 drbd5_submit [drbd5_submit]
 90692 90692 2 S TS -20 - 0 0x300000003 1 drbd0_submit [drbd0_submit]
 90697 90697 2 S TS -20 - 0 0x300000003 1 drbd2_submit [drbd2_submit]
 90702 90702 2 S TS -20 - 0 0x300000003 0 drbd1_submit [drbd1_submit]
 90709 90709 2 S TS 0 - 20 0x1 0 drbd_w_drbd-doc [drbd_w_drbd-doc]
 90715 90715 2 S TS 0 - 20 0x2 1 drbd_w_drbd-etc [drbd_w_drbd-etc]
 90725 90725 2 R TS 0 - 20 0x4 2 drbd_w_drbd-ext [drbd_w_drbd-ext]
 90731 90731 2 R TS 0 - 20 0x8 3 drbd_w_drbd-pgs [drbd_w_drbd-pgs]
 90737 90737 2 R TS 0 - 20 0x10 4 drbd_w_drbd-pla [drbd_w_drbd-pla]
 90746 90746 2 S TS 0 - 20 0x20 5 drbd_w_drbd-rab [drbd_w_drbd-rab]
 90749 90749 2 S TS 0 - 20 0x300000003 33 drbd_r_drbd-doc [drbd_r_drbd-doc]
 90751 90751 2 S TS 0 - 20 0x300000003 33 drbd_r_drbd-etc [drbd_r_drbd-etc]
 90754 90754 2 S TS 0 - 20 0x300000003 1 drbd_r_drbd-ext [drbd_r_drbd-ext]
 90756 90756 2 S TS 0 - 20 0x300000003 33 drbd_r_drbd-pgs [drbd_r_drbd-pgs]
 90758 90758 2 S TS 0 - 20 0x300000003 33 drbd_r_drbd-pla [drbd_r_drbd-pla]
 90760 90760 2 S TS 0 - 20 0x300000003 1 drbd_r_drbd-rab [drbd_r_drbd-rab]
 93637 93637 2 S TS 0 - 20 0x300000003 32 jbd2/drbd8-8 [jbd2/drbd8-8]
 93954 93954 2 S TS 0 - 20 0x300000003 1 jbd2/drbd5-8 [jbd2/drbd5-8]
 93961 93961 2 S TS 0 - 20 0x300000003 32 jbd2/drbd7-8 [jbd2/drbd7-8]
 93984 93984 2 S TS 0 - 20 0x300000003 33 jbd2/drbd2-8 [jbd2/drbd2-8]
 94214 94214 2 S TS 0 - 20 0x300000003 1 jbd2/drbd1-8 [jbd2/drbd1-8]
 94245 94245 2 S TS 0 - 20 0x300000003 1 jbd2/drbd0-8 [jbd2/drbd0-8]
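
The check done by eye on the output above can be sketched in Python. This is a minimal sketch, not part of ps-sched.sh: the function name find_misaffined is hypothetical, and it assumes the affinity mask is the 9th whitespace-separated column and the task name the 11th, as in the listing above.

```python
# Platform affinity mask seen on this lab (CPUs 0, 1, 32, 33).
PLATFORM_MASK = 0x300000003


def find_misaffined(ps_output, platform_mask=PLATFORM_MASK):
    """Return (task name, mask) pairs whose mask differs from the platform mask.

    Assumes the column layout shown in the ps-sched.sh listing above:
    field 9 is the affinity mask, field 11 is the task name.
    """
    offenders = []
    for line in ps_output.splitlines():
        fields = line.split()
        if len(fields) < 11:
            continue  # not a task line
        try:
            mask = int(fields[8], 16)
        except ValueError:
            continue  # header or otherwise unparsable line
        if mask != platform_mask:
            offenders.append((fields[10], hex(mask)))
    return offenders


# A few lines copied from the listing above.
SAMPLE = """\
 90702 90702 2 S TS -20 - 0 0x300000003 0 drbd1_submit [drbd1_submit]
 90709 90709 2 S TS 0 - 20 0x1 0 drbd_w_drbd-doc [drbd_w_drbd-doc]
 90715 90715 2 S TS 0 - 20 0x2 1 drbd_w_drbd-etc [drbd_w_drbd-etc]
 90749 90749 2 S TS 0 - 20 0x300000003 33 drbd_r_drbd-doc [drbd_r_drbd-doc]
"""

for name, mask in find_misaffined(SAMPLE):
    print(name, mask)
```

On the sample lines, only the two drbd_w_* tasks are reported.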

On a different lab, each kswapd* task has a per-numa-node affinity mask:
controller-0:~$ ps-sched.sh |grep kswapd
   468 468 2 S TS 0 - 20 0x3fffff 1 kswapd0 [kswapd0]
   469 469 2 S TS 0 - 20 0xfffffc00000 22 kswapd1 [kswapd1]
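
To read these masks as CPU ranges, a small decoder helps. The sketch below is illustrative only; the helper names mask_to_cpus and format_ranges are hypothetical and do not come from ps-sched.sh.

```python
def mask_to_cpus(mask_hex):
    """Return the sorted list of CPU ids whose bit is set in a hex mask."""
    mask = int(mask_hex, 16)  # accepts an optional "0x" prefix
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


def format_ranges(cpus):
    """Compress a sorted CPU list into a string such as '0-1,32-33'."""
    ranges = []
    start = prev = None
    for cpu in cpus:
        if start is None:
            start = prev = cpu
        elif cpu == prev + 1:
            prev = cpu
        else:
            ranges.append((start, prev))
            start = prev = cpu
    if start is not None:
        ranges.append((start, prev))
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


for mask in ("0x3fffff", "0xfffffc00000", "0x300000003"):
    print(mask, "->", format_ranges(mask_to_cpus(mask)))
```

For the masks in this report, 0x3fffff decodes to CPUs 0-21, 0xfffffc00000 to CPUs 22-43, and 0x300000003 to CPUs 0-1,32-33.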

Reproducibility
---------------
100 percent reproducible

System Configuration
--------------------
DRBD: AIO low-latency with >= 64 cpus
kswapd: AIO low-latency

Branch/Pull Time/Commit
-----------------------
- current load

Last Pass
---------
- day one issue

Timestamp/Logs
--------------
- na

Test Activity
-------------
Evaluation.

Workaround
----------
Manually use taskset to change the affinity of the affected tasks to match the platform cores. This does not survive a reboot.
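
A minimal sketch of this workaround in Python, which only builds the taskset command lines and does not run them. The helper name taskset_commands is hypothetical, and the PIDs and mask are placeholders taken from the outputs in this report; substitute the values from your own system and run the resulting commands as root.

```python
import shlex


def taskset_commands(pids, platform_mask):
    """Build one 'taskset -p <mask> <pid>' argv list per task id."""
    mask_arg = format(platform_mask, "#x")  # e.g. 0x300000003
    return [["taskset", "-p", mask_arg, str(pid)] for pid in pids]


# Placeholder values: kswapd PIDs and a platform mask seen in this report.
for argv in taskset_commands([468, 469], 0x300000003):
    print(shlex.join(argv))
```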

Jim Gauld (jgauld)
Changed in starlingx:
assignee: nobody → Jim Gauld (jgauld)
Ghada Khalil (gkhalil) wrote :

stx.5.0 / medium priority - the issue results in unpredictable performance, but doesn't cause functional issues.
If someone faces a serious issue in a previous release, a backport can be considered then.

tags: added: stx.5.0 stx.config
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/758621

Changed in starlingx:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to utilities (master)

Fix proposed to branch: master
Review: https://review.opendev.org/758625

OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/758621
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=b2dd5fb464590e6253070cb604d62313c90d1791
Submitter: Zuul
Branch: master

commit b2dd5fb464590e6253070cb604d62313c90d1791
Author: Jim Gauld <email address hidden>
Date: Fri Oct 16 15:26:35 2020 -0400

    Fix DRBD task affinity to platform cores

    This changes the input format of DRBD resource config option
    cpu-mask so it is correctly parsed in the kernel. The underlying
    bitmap_parse routine expects large hex values delimited every 8
    characters with a comma.

    e.g., On large-cpu systems, we would see the following kern.log :
    2020-10-13T20:55:34.079 controller-0 kernel: warning [ 269.423462] drbd
    drbd-dockerdistribution: Overflow in bitmap_parse(300000003), truncating
    to 64 bits

    This resulted in drbd_w_* tasks affined to individual cores instead of
    platform cores.

    Partial-Bug: 1900174
    Change-Id: Ib31d3c8b6d59b94f06d172143497678b0c9a7bc1
    Signed-off-by: Jim Gauld <email address hidden>
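
The comma-delimited format described in the commit message above can be illustrated with a short Python sketch. This is not the code from the actual fix; the function name format_cpu_mask is hypothetical, and the sketch only shows the string layout (hex digits grouped every 8 characters, most significant group first) that the commit message says bitmap_parse expects.

```python
def format_cpu_mask(mask):
    """Format an integer CPU mask as comma-delimited groups of 8 hex digits.

    The leading group may be shorter than 8 digits; all following groups
    are exactly 8 digits, so leading zeros inside the mask are preserved.
    """
    if mask < 0:
        raise ValueError("cpu mask must be non-negative")
    digits = format(mask, "x")
    first = len(digits) % 8 or 8  # length of the most significant group
    groups = [digits[:first]]
    groups += [digits[i:i + 8] for i in range(first, len(digits), 8)]
    return ",".join(groups)


print(format_cpu_mask(0x300000003))
print(format_cpu_mask(0x3))
```

With this layout, the mask from the kernel warning, 300000003, is written as 3,00000003, while masks that fit in 8 hex digits are unchanged.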

OpenStack Infra (hudson-openstack) wrote : Fix merged to utilities (master)

Reviewed: https://review.opendev.org/758625
Committed: https://git.openstack.org/cgit/starlingx/utilities/commit/?id=db2156eaddeafec0523aeccda45447b3069eb059
Submitter: Zuul
Branch: master

commit db2156eaddeafec0523aeccda45447b3069eb059
Author: Jim Gauld <email address hidden>
Date: Fri Oct 16 15:41:34 2020 -0400

    Affine kswapd* kernel threads to platform cores

    The kswapd* kernel tasks are per NUMA node and have floating
    cpu affinity masks spanning those nodes.

    On AIO low-latency systems, this affines the kswapd* kernel tasks to
    platform cores. This is a performance improvement for low-latency
    sensitive applications.

    Partial-Bug: 1900174
    Change-Id: I20db19978362997b23a69bf591b8e7c23096f492
    Signed-off-by: Jim Gauld <email address hidden>

OpenStack Infra (hudson-openstack) wrote : Fix proposed to integ (master)

Fix proposed to branch: master
Review: https://review.opendev.org/761464

OpenStack Infra (hudson-openstack) wrote : Fix merged to integ (master)

Reviewed: https://review.opendev.org/761464
Committed: https://git.openstack.org/cgit/starlingx/integ/commit/?id=9dd333fbb77fdc5b9d958f678548acfd2c369d59
Submitter: Zuul
Branch: master

commit 9dd333fbb77fdc5b9d958f678548acfd2c369d59
Author: Jim Gauld <email address hidden>
Date: Wed Oct 28 17:31:03 2020 -0400

    Format DRBD resource cpu-mask to support 64 or larger cpus

    This changes the input format of DRBD resource config option
    cpu-mask so it is correctly parsed in the kernel. The underlying
    bitmap_parse routine expects large hex values delimited every 8
    characters with a comma.

    e.g., On large cpu systems, we would see the following kern.log :
    2020-10-13T20:55:34.079 controller-0 kernel: warning [ 269.423462] drbd
    drbd-dockerdistribution: Overflow in bitmap_parse(300000003), truncating
    to 64 bits

    This resulted in drbd_w_* tasks affined to individual cores instead of
    platform cores.

    Change-Id: I59caaa293af0c905eddef00b7b03da921e4510b7
    Closes-Bug: 1900174
    Signed-off-by: Jim Gauld <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix proposed to utilities (f/centos8)

Fix proposed to branch: f/centos8
Review: https://review.opendev.org/c/starlingx/utilities/+/792213

OpenStack Infra (hudson-openstack) wrote : Fix merged to utilities (f/centos8)

Reviewed: https://review.opendev.org/c/starlingx/utilities/+/792213
Committed: https://opendev.org/starlingx/utilities/commit/c4d042615e6fe8944a4628fa1a29e86e012a9bf5
Submitter: "Zuul (22348)"
Branch: f/centos8

commit 557cada006fd5a3bd81ad5af387c37657801f8c5
Author: Fernando Theirs <email address hidden>
Date: Thu May 13 16:21:47 2021 -0300

    Collect is missing etcdctl output

    When the collect tool is run, it does not include the contents
    of the etcd database. Fixes have been made for this to dump the
    contents in "etcd_database.dump" file.

    Verify if etcd access is secured. In that case, certificates
    will be used.

    Closes-Bug: 1911935

    Signed-off-by: Fernando Theirs <email address hidden>
    Change-Id: Idbc60edffa978a7a6bead939a4eb54f4abae29a6

commit 6045b1b8a0d8ed6a94d06cdfc994bf1a5fa9dbb5
Author: Jim Gauld <email address hidden>
Date: Thu May 6 11:58:34 2021 -0400

    Provide utility script is-rootdisk-device.sh

    This provides a utility script to determine which disk contains the root
    filesystem. This can also be used as a helper function for io-scheduler
    udev rules that require specific configuration for root disk.

    Example usage:
    /usr/local/bin/is-rootdisk-device.sh
    ROOTDISK_DEVICE=sda

    /usr/local/bin/is-rootdisk-device.sh /dev/sda
    ROOTDISK_DEVICE=sda

    /usr/local/bin/is-rootdisk-device.sh /dev/sdb
    (i.e., no output)

    Partial-Bug: 1927515
    Signed-off-by: Jim Gauld <email address hidden>
    Change-Id: Ib0d4a161a407b08d294c5ff9aa0b7590961e18c9

commit 88a678f142cfe86c58b6405aae6babbc08de0e8f
Author: Chen, Haochuan Z <email address hidden>
Date: Fri Mar 26 09:09:41 2021 +0800

    Add packages to stx-ceph-manager image

    This update installs ceph-mgr, ceph-mon, ceph-osd packages as part
    of stx-ceph-manager image.

    Partial-Bug: 1920882

    Change-Id: I4afde8b1476e14453fac8561f1edde7360b8ee96
    Signed-off-by: Chen, Haochuan Z <email address hidden>

commit 09b3542fcc6cc0300a9cae0d302225e6977780f3
Author: Scott Little <email address hidden>
Date: Thu Mar 25 11:49:49 2021 -0400

    Set SW_VERSION 21.05

    Prep for the StarlingX 5.0 release.
    SW_VERSION, also known as PLATFORM_RELEASE, uses YY.MM format.

    Story: 2008055
    Task: 42115
    Signed-off-by: Scott Little <email address hidden>
    Change-Id: If7c91a2b523358269ae4850961cf4189ffcd7a75

commit ae4cefd0e2a0001476782c31e1003810da2b4838
Author: Chris Friesen <email address hidden>
Date: Thu Mar 4 18:04:12 2021 -0500

    add dcmanager-audit-worker to patch restart script

    Need to add the new process to the patch restart script.

    Story: 2007267
    Task: 41999
    Signed-off-by: Chris Friesen <email address hidden>
    Change-Id: If5faa806bd0d52ddbf1343b064959f4207cf975a

commit 27fce5a52321f3014fa8ae9181d344bc774289da
Author: Enzo Candotti <email address hidden>
Date: Mon Feb 1 12:47:38 2021 -0300

    Add resource CPU and memory info in collect

    This adds commands to collect more data to debug
    resource allocations and...

tags: added: in-f-centos8