isolcpus not working properly in Ubuntu 22.04

Bug #2013265 reported by ping
This bug affects 1 person
Affects: ubuntu-realtime | Status: Incomplete | Importance: Medium | Assigned to: Joseph Salisbury | Milestone: (none)

Bug Description

In recent tests I noticed that on Ubuntu 22.04, isolcpus does not work as expected:

    $ cat /proc/cmdline
    BOOT_IMAGE=/boot/vmlinuz-5.15.0-67-generic root=/dev/mapper/vg5d7s5-root ro crashkernel=128M resume=UUID=b7cb66df-53d3-4a47-a930-4fd51a5b7e2e rd.lvm.lv=5a5s11-vg00/lv_root rhgb quiet default_hugepagesz=1GB hugepagesz=1G hugepages=100 hugepagesz=2M hugepages=4096 intel_iommu=on iommu=pt nohz_full=2-21,26-45,50-69,74-93 isolcpus=2-21,50-69,26-45,74-93 rcu_nocb=2-21,26-45,50-69,74-93 quiet splash vt.handoff=7

    $ ps -eT -o psr,tid,comm,pid,ppid,cmd,pcpu,stat | grep " 27 "
      2 27 idle_inject/2 27 2 [idle_inject/2] 0.0 S
     27 177 cpuhp/27 177 2 [cpuhp/27] 0.0 S
     27 178 idle_inject/27 178 2 [idle_inject/27] 0.0 S
     27 179 migration/27 179 2 [migration/27] 0.0 S
     27 180 ksoftirqd/27 180 2 [ksoftirqd/27] 0.0 S
     27 182 kworker/27:0H-k 182 2 [kworker/27:0H-kblockd] 0.0 I<
     27 762 irq/41-pciehp 762 2 [irq/41-pciehp] 0.0 S
     27 1554 systemd-network 1554 1 /lib/systemd/systemd-networ 0.0 Ss
     27 1646 containerd 1626 1 /usr/local/bin/containerd 0.0 Ssl
     27 13923 containerd 1626 1 /usr/local/bin/containerd 0.0 Ssl
     27 22258 containerd 1626 1 /usr/local/bin/containerd 0.0 Ssl
     27 1555307 containerd 1626 1 /usr/local/bin/containerd 0.0 Ssl
     27 2203 node-cache 2125 2048 /node-cache -localip 169.25 0.0 Ssl
     27 2209 node-cache 2125 2048 /node-cache -localip 169.25 0.0 Ssl
     27 2246 node-cache 2125 2048 /node-cache -localip 169.25 0.0 Ssl
     27 2274 kube-proxy 2251 2150 /usr/local/bin/kube-proxy - 0.0 Ssl
     27 2453 kube-proxy 2251 2150 /usr/local/bin/kube-proxy - 0.0 Ssl
     27 930836 kube-proxy 2251 2150 /usr/local/bin/kube-proxy - 0.0 Ssl
     27 3141 kworker/27:1H-k 3141 2 [kworker/27:1H-kblockd] 0.0 I<
     27 200707 cainjector 3468 2299 /app/cmd/cainjector/cainjec 0.0 Ssl
     27 3615 webhook 3602 2282 /app/cmd/webhook/webhook -- 0.0 Ssl
     27 3616 webhook 3602 2282 /app/cmd/webhook/webhook -- 0.0 Ssl
     27 3620 webhook 3602 2282 /app/cmd/webhook/webhook -- 0.0 Ssl
     27 1843240 lcore-worker-10 1843230 1842888 /contrail-vrouter-dpdk --no 99.6 RLl
     27 1917469 kworker/27:0-ev 1917469 2 [kworker/27:0-events] 0.0 I
     27 1938541 kworker/27:1-ev 1938541 2 [kworker/27:1-events] 0.0 I
     22 1945514 grep 1945514 1499070 grep --color=auto 27 0.0 S+

So according to the cmdline output, core #27 should be isolated (excluded from
OS-level scheduling); the intent is to assign this core manually to my DPDK
applications. But I still see many unrelated processes running on this core. I
understand some applications (like "containerd" here) may set affinity to it
explicitly, but my questions here are:

* is there a simpler way to isolate a core completely from OS scheduling?
* did anything change related to isolcpus between Ubuntu 20.04 (kernel
  5.4) and Ubuntu 22.04 (kernel 5.15)?

Changed in ubuntu-realtime:
importance: Undecided → Medium
status: New → Triaged
Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

This appears to be a duplicate of bug 1992164. Can you review that bug and confirm or deny?

There are some kernel threads that need to run on every CPU regardless of isolation level.

You may also want to add the following boot parameters (specify housekeeping cores for kernel threads and IRQs, and add managed_irq and domain to isolcpus):

kthread_cpus=HOUSEKEEPING_CORE_NUM(s)
irqaffinity=HOUSEKEEPING_CORE_NUM(s)
isolcpus=managed_irq,domain,2-21,50-69,26-45,74-93
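
As a sketch of how these parameters might be applied on Ubuntu (the housekeeping core list below is a hypothetical example matching the reporter's topology; substitute your own), they would be appended to GRUB_CMDLINE_LINUX in /etc/default/grub, followed by update-grub and a reboot:

```shell
# Hypothetical housekeeping cores: 0,1,22-25,46-49,70-73.
# In /etc/default/grub, extend the existing GRUB_CMDLINE_LINUX line:
#   GRUB_CMDLINE_LINUX="... kthread_cpus=0,1,22-25,46-49,70-73 \
#       irqaffinity=0,1,22-25,46-49,70-73 \
#       isolcpus=managed_irq,domain,2-21,26-45,50-69,74-93"

sudo update-grub     # regenerate /boot/grub/grub.cfg from /etc/default/grub
sudo reboot          # parameters only take effect on the next boot
cat /proc/cmdline    # verify after reboot that the new parameters are present
```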

Revision history for this message
ping (itestitest) wrote :

Thanks Joseph, that looks like a possible fix.
I'm going to test it and report back.

Revision history for this message
ping (itestitest) wrote :

I'm not sure if it is getting "better", but essentially I still see the issue:

This is my current /etc/default/grub:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash cpuhp.off"

GRUB_CMDLINE_LINUX="crashkernel=128M resume=UUID=b7cb66df-53d3-4a47-a930-4fd51a5b7e2e rd.lvm.lv=5a5s11-vg00/lv_root rhgb quiet default_hugepagesz=1GB hugepagesz=1G hugepages=100 hugepagesz=2M hugepages=4096 intel_iommu=on iommu=pt nohz_full=2-21,26-45,50-69,74-93 rcu_nocb=2-21,26-45,50-69,74-93 kthread_cpus=0,1,22,23,24,25,46,47,48,49,70,71,72,73 irqaffinity=0,1,22,23,24,25,46,47,48,49,70,71,72,73 isolcpus=managed_irq,domain,2-21,50-69,26-45,74-93"

After reboot I got this:

== ps -eT -o psr,tid,comm,pid,ppid,cmd,pcpu,stat | awk '$1==4'

----
  4 38 cpuhp/4 38 2 [cpuhp/4] 0.0 S
  4 39 idle_inject/4 39 2 [idle_inject/4] 0.0 S
  4 40 migration/4 40 2 [migration/4] 0.0 S
  4 41 ksoftirqd/4 41 2 [ksoftirqd/4] 0.0 S
  4 43 kworker/4:0H-kb 43 2 [kworker/4:0H-kblockd] 0.0 I<
  4 2611 cainjector 2172 2111 /app/cmd/cainjector/cainjec 0.0 Ssl
  4 14395 kube-proxy 2315 2192 /usr/local/bin/kube-proxy - 0.0 Ssl
  4 25549 kube-proxy 2315 2192 /usr/local/bin/kube-proxy - 0.0 Ssl
  4 2344 node-cache 2322 2212 /node-cache -localip 169.25 0.0 Ssl
  4 3082 kworker/4:1H-ev 3082 2 [kworker/4:1H-events_highpr 0.0 I<
  4 136438 kworker/4:2-eve 136438 2 [kworker/4:2-events] 0.0 I
  4 152474 lcore-worker-10 152430 152249 /contrail-vrouter-dpdk --no 99.7 RLl
  4 186524 kworker/4:1-eve 186524 2 [kworker/4:1-events] 0.0 I
  4 194343 kworker/4:0-mm_ 194343 2 [kworker/4:0-mm_percpu_wq] 0.0 I
11
core: 5
----

== ps -eT -o psr,tid,comm,pid,ppid,cmd,pcpu,stat | awk '$1==5'

----
  5 44 cpuhp/5 44 2 [cpuhp/5] 0.0 S
  5 45 idle_inject/5 45 2 [idle_inject/5] 0.0 S
  5 46 migration/5 46 2 [migration/5] 0.0 S
  5 47 ksoftirqd/5 47 2 [ksoftirqd/5] 0.0 S
  5 49 kworker/5:0H-ev 49 2 [kworker/5:0H-events_highpr 0.0 I<
  5 602 kworker/5:1-eve 602 2 [kworker/5:1-events] 0.0 I
  5 784 kworker/5:2 784 2 [kworker/5:2] 0.0 I
  5 30615 webhook 2086 2036 /app/cmd/webhook/webhook -- 0.0 Ssl
  5 32648 kworker/5:1H-kb 32648 2 [kworker/5:1H-kblockd] 0.0 I<
  5 152475 lcore-worker-11 152430 152249 /contrail-vrouter-dpdk --no 99.9 RLl
12
core: 6
----

== ps -eT -o psr,tid,comm,pid,ppid,cmd,pcpu,stat | awk '$1==6'

----
  6 50 cpuhp/6 50 2 [cpuhp/6] 0.0 S
  6 51 idle_inject/6 51 2 [idle_inject/6] 0.0 S
  6 52 migration/6 52 2 [migration/6] 0.0 S
  6 53 ksoftirqd/6 53 2 [ksoftirqd/6] 0.0 S
  6 55 kworker/6:0H-ev 55 2 [kworker/6:0H-events_highpr ...


Revision history for this message
ping (itestitest) wrote :

and I got this:

root@5d7s5:~# dmesg | grep "passed to"
[ 1.917649] Unknown kernel command line parameters "splash BOOT_IMAGE=/boot/vmlinuz-5.15.0-67-generic nohz_full=2-21,26-45,50-69,74-93 rcu_nocb=2-21,26-45,50-69,74-93 kthread_cpus=0,1,22,23,24,25,46,47,48,49,70,71,72,73", will be passed to user space.

So it sounds like these parameters are not supported by my kernel.
Any advice?

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

It looks like you are actually using the -generic kernel and not the real-time kernel.

Revision history for this message
ping (itestitest) wrote :

Joseph:
I'm not sure. Will that make any difference?
The problem here is that isolcpus does not work as expected.
We're trying to reach high performance with DPDK applications, which requires CPU isolation, but that does not seem to be working well: I'm still seeing kernel and user threads scheduled on the isolated CPUs.
Will the real-time kernel, or the patch (as described here: https://discourse.ubuntu.com/t/enable-real-time-ubuntu-22-04-lts-beta-kernel/28189), fix the issue by any chance?
What is the guideline for troubleshooting this isolcpus issue?

thank you.

Revision history for this message
Joseph Salisbury (jsalisbury) wrote :

The isolcpus feature isolates one or more cores from the scheduler. Those cores are then only accessible to a process via an explicitly set affinity. The isolcpus feature reduces interruptions on the isolated cores, but it does not fully eliminate them, since isolcpus only applies to user-space threads.
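
Explicit affinity can be inspected and set like this (a minimal sketch; `my-dpdk-app` and the core range are placeholders for your own binary and isolcpus= ranges):

```shell
# A task's allowed CPUs are visible in /proc; for a correctly pinned
# DPDK worker this list would contain only the isolated cores.
grep Cpus_allowed_list /proc/self/status

# Pinning is done explicitly, e.g. with taskset from util-linux:
#   taskset -c 2-21 /path/to/my-dpdk-app    # launch pinned to cores 2-21
#   taskset -cp <pid>                       # inspect a running PID's affinity
```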

Some housekeeping work is still necessary on all cores. For example, the timer IRQ still needs to fire under certain conditions, and some kernel activities need to dispatch a kworker on every online core. Other kernel threads that may need to run on isolated cores include ksoftirqd, kworker, rcu callbacks, and migration, to name a few. There are boot parameters that allow you to move rcu offload callbacks away from isolated cores. In addition, if your isolated real-time task makes a system call, the kernel will handle that system call on the core where the rt thread is running.

Also, isolcpus is a deprecated kernel parameter; cpusets in cgroups are the currently preferred mechanism.
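
With cgroup v2, a "root" cpuset partition carves cores out of the scheduler's general pool. A sketch, assuming cgroup v2 is mounted at /sys/fs/cgroup, root privileges, and placeholder core numbers and PID:

```shell
# Create a cpuset partition for the DPDK workload (cores 2-21 are a
# placeholder for your isolated range; requires root and cgroup v2).
mkdir /sys/fs/cgroup/dpdk
echo "2-21" > /sys/fs/cgroup/dpdk/cpuset.cpus
echo root   > /sys/fs/cgroup/dpdk/cpuset.cpus.partition

# Move the DPDK process into the partition (<dpdk-pid> is a placeholder):
echo <dpdk-pid> > /sys/fs/cgroup/dpdk/cgroup.procs

# Verify the partition took effect:
cat /sys/fs/cgroup/dpdk/cpuset.cpus.partition   # should report "root"
```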

For the unbound kworker/u* threads, there is a special interface to restrict their CPUs. The interface is the following file:

/sys/devices/virtual/workqueue/cpumask

It was implemented on the following commit:

042f7df workqueue: Allow modifying low level unbound workqueue cpumask

The default CPU mask is the "cpu_possible_mask".
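
A sketch of using that interface (requires root; the mask value is a hex CPU bitmask, and "3" here is a placeholder meaning CPUs 0-1 on a small machine -- on the reporter's 94-core box the mask covering all housekeeping cores would be much wider):

```shell
# Show the current cpumask for unbound workqueues:
cat /sys/devices/virtual/workqueue/cpumask

# Restrict unbound workqueues to housekeeping CPUs 0-1 (mask 0x3):
echo 3 > /sys/devices/virtual/workqueue/cpumask
```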

So in summary, even with isolcpus, you will still see some kernel threads run on isolated CPUs in the current Linux kernel real-time implementation.

Changed in ubuntu-realtime:
status: Triaged → Incomplete
assignee: nobody → Joseph Salisbury (jsalisbury)