High CPU usage of kworker/ksoftirqd

Bug #1488426 reported by AceLan Kao on 2015-08-25
This bug affects 23 people
Affects          Status   Importance   Assigned to   Milestone
HWE Next                  Undecided    Unassigned
linux (Ubuntu)            Undecided    Unassigned

Bug Description

kworker consuming 71.5% cpu resource
ksoftirqd consuming 28.9% cpu resource

This leads to high power consumption and sometimes causes Bluetooth (BT) to stop working.

AceLan Kao (acelankao) wrote :

Tasks: 128 total, 2 running, 126 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 24.6 sy, 0.0 ni, 75.0 id, 0.0 wa, 0.0 hi, 0.4 si, 0.0 st
KiB Mem: 16324316 total, 405308 used, 15919008 free, 46152 buffers
KiB Swap: 32855036 total, 0 used, 32855036 free. 220780 cached Mem

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
   23 root 20 0 0 0 0 R 71.6 0.0 17:30.13 kworker/2:0
   22 root 20 0 0 0 0 S 28.6 0.0 7:00.61 ksoftirqd/2
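
A minimal sketch of how the two busy kernel threads could be picked out of a `top -b -n1` listing like the one above. The function name `busy_kthreads` and the default threshold of 20% are illustrative assumptions, not part of the original report; the field positions assume the 12-column process rows shown here.

```shell
# busy_kthreads [THRESHOLD]: read `top -b -n1` output on stdin and print
# "COMMAND %CPU" for kworker/ksoftirqd threads at or above THRESHOLD.
# (Hypothetical helper; the name and default threshold are assumptions.)
busy_kthreads() {
    threshold="${1:-20}"
    awk -v t="$threshold" '
        # Process rows have 12 fields: field 9 is %CPU, field 12 is COMMAND.
        NF >= 12 && $12 ~ /^(kworker|ksoftirqd)\// {
            cpu = $9
            sub(/,/, ".", cpu)   # accept comma decimal separators such as 56,7
            if (cpu + 0 >= t + 0) print $12, cpu
        }'
}
```

It could be used as `top -b -n1 | busy_kthreads 25`; an empty result would suggest no kworker or ksoftirqd thread is above the threshold at that moment.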

Changed in linux (Ubuntu):
assignee: nobody → AceLan Kao (acelankao)
status: New → In Progress
AceLan Kao (acelankao) wrote :

perf report:

+ 73.46% 0.00% kworker/2:0 [kernel.kallsyms] [k] ret_from_fork
+ 73.46% 0.00% kworker/2:0 [kernel.kallsyms] [k] kthread
+ 73.46% 0.00% kworker/2:0 [kernel.kallsyms] [k] worker_thread
+ 73.35% 0.10% kworker/2:0 [kernel.kallsyms] [k] process_one_work
+ 72.13% 0.07% kworker/2:0 [kernel.kallsyms] [k] rpm_idle
+ 71.68% 0.09% kworker/2:0 [kernel.kallsyms] [k] rpm_suspend
+ 71.52% 0.02% kworker/2:0 [kernel.kallsyms] [k] pm_runtime_work
+ 71.30% 0.06% kworker/2:0 [kernel.kallsyms] [k] __rpm_callback
+ 71.26% 0.00% kworker/2:0 [kernel.kallsyms] [k] usb_runtime_idle
+ 71.23% 0.02% kworker/2:0 [kernel.kallsyms] [k] __pm_runtime_suspend
+ 70.96% 0.00% kworker/2:0 [kernel.kallsyms] [k] rpm_callback
+ 70.91% 0.01% kworker/2:0 [kernel.kallsyms] [k] usb_runtime_suspend
+ 70.86% 0.04% kworker/2:0 [kernel.kallsyms] [k] usb_suspend_both
+ 68.64% 0.01% kworker/2:0 [kernel.kallsyms] [k] usb_resume_interface.isra.6
+ 68.63% 0.02% kworker/2:0 [kernel.kallsyms] [k] hub_resume
+ 68.59% 0.44% kworker/2:0 [kernel.kallsyms] [k] hub_activate
+ 67.22% 0.28% kworker/2:0 [kernel.kallsyms] [k] hub_port_status
+ 66.42% 0.32% kworker/2:0 [kernel.kallsyms] [k] usb_control_msg
+ 63.88% 0.24% kworker/2:0 [kernel.kallsyms] [k] usb_start_wait_urb
+ 58.25% 0.08% kworker/2:0 [kernel.kallsyms] [k] usb_submit_urb
+ 58.14% 0.44% kworker/2:0 [kernel.kallsyms] [k] usb_submit_urb.part.6
+ 56.05% 1.75% kworker/2:0 [kernel.kallsyms] [k] usb_hcd_submit_urb
+ 43.00% 42.17% kworker/2:0 [kernel.kallsyms] [k] xhci_hub_control
+ 15.27% 0.00% ksoftirqd/2 [kernel.kallsyms] [k] ret_from_fork
+ 15.27% 0.00% ksoftirqd/2 [kernel.kallsyms] [k] kthread
+ 14.88% 0.95% ksoftirqd/2 [kernel.kallsyms] [k] smpboot_thread_fn
+ 8.78% 0.54% kworker/2:0 [kernel.kallsyms] [k] usb_hcd_giveback_urb
+ 7.35% 0.78% kworker/2:0 [kernel.kallsyms] [k] __tasklet_schedule
+ 7.19% 0.05% kworker/2:0 [kernel.kallsyms] [k] wakeup_softirqd
+ 7.16% 0.14% kworker/2:0 [kernel.kallsyms] [k] wake_up_process
+ 7.00% 0.12% ksoftirqd/2 [kernel.kallsyms] [k] run_ksoftirqd
+ 6.91% 0.11% ksoftirqd/2 [kernel.kallsyms] [k] schedule
+ 6.69% 0.76% ksoftirqd/2 [kernel.kallsyms] [k] __do_softirq
+ 6.56% 1.33% ksoftirqd/2 [kernel.kallsyms] [k] __schedule
+ 6.47% 0.49% kworker/2:0 [kernel.kallsyms] [k] try_to_wake_up
+ 5.37% 5.37% kworker/2:0 [kernel.kallsyms] [k] __switch_to
+ 5.30% 0.91% ksoftirqd/2 [kernel.kallsyms] [k] tasklet_action
+ 5.15% 5.15% ksoftirqd/2 [kernel.kallsyms] [k] __switch_to
+ 5.09% 0.11% kworker/2:0 [kernel.kallsyms] [k] ttwu_do_activate.constprop.94
+ 5.04% 0.28% kworker/2:0 [kernel.kallsyms] [k] _cond_resched
+ 4.83% 0.34% kworker/2:0 [kernel.kallsyms] [k] wait_for_completion_timeout
+ 4.70% 1.68% kworker/2:0 [kernel.kallsyms] [k] __schedule
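
The report above does not say how the profile was collected; one common way is `perf record -a -g -- sleep 10` followed by `perf report`, but that is an assumption. The sketch below only post-processes pasted lines in the format shown here (leading `+`, children %, self %, command, DSO, `[k]`, symbol) and prints the entry with the largest self overhead. The helper name `hottest_self` is hypothetical.

```shell
# hottest_self: read perf-report lines on stdin in the pasted format
#   + <children%> <self%> <comm> <dso> [k] <symbol>
# and print "<symbol> <comm> <self%>" for the largest self overhead.
# (Hypothetical helper; assumes the column layout shown in this thread.)
hottest_self() {
    awk '
        $1 == "+" && NF >= 7 {
            self = $3
            sub(/%/, "", self)   # drop the percent sign
            sub(/,/, ".", self)  # accept comma decimal separators
            if (self + 0 > max) { max = self + 0; best = self; sym = $7; comm = $4 }
        }
        END { if (sym != "") print sym, comm, best "%" }'
}
```

Applied to the listing above, this points at the function where the time is actually spent rather than at its callers, which is the distinction between the two percentage columns.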

AceLan Kao (acelankao) wrote :

The following commit fixes the issue:

commit 9250aea76bfcbf4c2a7868e5566281bf2bb7af27
Author: Ulf Hansson <email address hidden>
Date: Fri Mar 27 12:15:15 2015 +0100

    mmc: core: Enable runtime PM management of host devices

    Currently those host drivers which have deployed runtime PM, deals with
    the runtime PM reference counting entirely by themselves.

    Since host drivers don't know when the core will send the next request
    through some of the host_ops callbacks, they need to handle runtime PM
    get/put between each an every request.

    In quite many cases this has some negative effects, since it leads to a
    high frequency of scheduled runtime PM suspend operations. That due to
    the runtime PM reference count will normally reach zero in-between
    every request.

    We can decrease that frequency, by enabling the core to deal with
    runtime PM reference counting of the host device. Since the core often
    knows that it will send a seqeunce of requests, it makes sense for it
    to keep a runtime PM reference count during these periods.

    More exactly, let's increase the runtime PM reference count by invoking
    pm_runtime_get_sync() from __mmc_claim_host(). Restore that action by
    invoking pm_runtime_mark_last_busy() and pm_runtime_put_autosuspend()
    in mmc_release_host(). In this way a runtime PM reference count will be
    kept during the complete cycle of a claim -> release host.

    Signed-off-by: Ulf Hansson <email address hidden>
    Acked-by: Adrian Hunter <email address hidden>
    Acked-by: Konstantin Dorfman <email address hidden>

description: updated
AceLan Kao (acelankao) wrote :

Here is the kernel with the patches
http://people.canonical.com/~acelan/lp1475248/

Keng-Yu Lin (lexical) on 2015-09-01
Changed in hwe-next:
assignee: nobody → AceLan Kao (acelankao)
lpuser (lpuser) wrote :

I'm also affected by this issue and it's very annoying as it sucks all my battery!
I have a Dell Inspiron 5555 powered by an AMD A8-7410 and 4G of RAM (which is BTW certified for Ubuntu 14.04).

Here's the top output:

$ top -b -n1
top - 11:20:15 up 42 min, 2 users, load average: 1,77, 1,72, 1,62
Tasks: 217 total, 2 running, 215 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3,9 us, 22,2 sy, 0,0 ni, 70,5 id, 1,5 wa, 0,0 hi, 1,9 si, 0,0 st
KiB Mem: 3400808 total, 1822148 used, 1578660 free, 64956 buffers
KiB Swap: 7812092 total, 0 used, 7812092 free. 764228 cached Mem

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
   75 root 20 0 0 0 0 R 56,7 0,0 22:56.39 kworker/3:2
   29 root 20 0 0 0 0 S 37,8 0,0 16:05.42 ksoftirqd/3

And this is the perf report:

+ 65,62% 0,00% kworker/3:2 [kernel.kallsyms] [k] ret_from_fork
+ 65,62% 0,00% kworker/3:2 [kernel.kallsyms] [k] kthread
+ 65,51% 0,11% kworker/3:2 [kernel.kallsyms] [k] worker_thread
+ 65,26% 0,67% kworker/3:2 [kernel.kallsyms] [k] process_one_work
+ 61,64% 0,78% kworker/3:2 [kernel.kallsyms] [k] rpm_idle
+ 60,01% 0,32% kworker/3:2 [kernel.kallsyms] [k] pm_runtime_work
+ 59,27% 0,99% kworker/3:2 [kernel.kallsyms] [k] rpm_suspend
+ 58,95% 0,14% kworker/3:2 [kernel.kallsyms] [k] __rpm_callback
+ 58,76% 0,22% kworker/3:2 [kernel.kallsyms] [k] usb_runtime_idle
+ 58,37% 0,24% kworker/3:2 [kernel.kallsyms] [k] __pm_runtime_suspend
+ 56,79% 0,24% kworker/3:2 [kernel.kallsyms] [k] rpm_callback
+ 56,41% 0,25% kworker/3:2 [kernel.kallsyms] [k] usb_runtime_suspend
+ 56,11% 0,75% kworker/3:2 [kernel.kallsyms] [k] usb_suspend_both
+ 42,00% 0,36% kworker/3:2 [kernel.kallsyms] [k] usb_resume_interface.isra.6
+ 41,69% 0,24% kworker/3:2 [kernel.kallsyms] [k] hub_resume
+ 41,30% 0,81% kworker/3:2 [kernel....

lpuser (lpuser) wrote :

One more thing that's worth mentioning.
My Dell laptop came with Ubuntu 14.04 preinstalled.
I tried to install all the updates (around 600MB of packages), had some issues, and decided to download and install the latest LTS image (14.04.3).
What I'm trying to say is that I only noticed this bug after I reinstalled Ubuntu, so I'm not sure if it also affected the Dell preinstalled Ubuntu image, which might have had some additional proprietary drivers included.
At this point I'm unable to reinstall the Dell stock Ubuntu image because I didn't create a recovery disk.

lpuser (lpuser) wrote :

As said before I'm willing to help with debugging and testing.
If someone knows how to debug this please let me know.

On my laptop I can easily reproduce it by following these steps:
- install chromium browser:
$ sudo apt-get install chromium-browser
- install the Adobe flash plugin (you might need to enable the Canonical Partners repo)
$ sudo apt-get install adobe-flashplugin
- open Chromium browser and go to youtube or any other website with flash videos

AceLan Kao (acelankao) wrote :

lpuser,
I don't think you have the same symptom as I do. In my case, I just boot up the machine and observe the CPU load in 'top'; I don't need to do anything or open any apps to encounter this issue.

However, your perf report is very similar to mine. You can try plugging USB devices into all the USB ports on your machine and see if it helps.

lpuser (lpuser) wrote :

Your hunch was correct.
I plugged in USB devices in all USB ports and the issue went away.
Then I started to unplug them one by one and I noticed that the problem is related only to one of the 3 external USB ports.
As soon as I unplug the device from that USB port kworker and ksoftirqd will instantly start to abuse the CPU.

USB devices:

$ lsusb
Bus 004 Device 006: ID 0cf3:e005 Atheros Communications, Inc.
Bus 004 Device 002: ID 0438:7900 Advanced Micro Devices, Inc.
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 004: ID 06cb:75bf Synaptics, Inc.
Bus 003 Device 003: ID 0bda:0129 Realtek Semiconductor Corp. RTS5129 Card Reader Controller
Bus 003 Device 002: ID 0438:7900 Advanced Micro Devices, Inc.
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 002: ID 0bda:5684 Realtek Semiconductor Corp.
Bus 001 Device 008: ID 0458:003a KYE Systems Corp. (Mouse Systems) NetScroll+ Mini Traveler / Genius NetScroll 120 <== I connected a mouse to the USB port that I mentioned earlier
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
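
Since the perf traces in this thread loop through `usb_runtime_suspend` and `hub_resume`, it may help to look at the runtime power-management state of each USB device next to the `lsusb` listing. The sketch below reads the standard sysfs attributes `power/control` and `power/runtime_status`; the helper name `usb_runtime_pm` and its optional directory argument are illustrative assumptions, and nothing here is claimed to be the fix for this bug.

```shell
# usb_runtime_pm [DIR]: for every entry under DIR (default: the USB bus in
# sysfs) that exposes power/runtime_status, print its name, its power/control
# setting ("auto" or "on") and its current runtime status.
# (Hypothetical helper; read-only, it changes no settings.)
usb_runtime_pm() {
    root="${1:-/sys/bus/usb/devices}"
    for dev in "$root"/*; do
        [ -r "$dev/power/runtime_status" ] || continue
        control=$(cat "$dev/power/control" 2>/dev/null)
        status=$(cat "$dev/power/runtime_status")
        printf '%s control=%s runtime_status=%s\n' \
            "${dev##*/}" "${control:-unknown}" "$status"
    done
}
```

Running `usb_runtime_pm` before and after plugging a device into the affected port would show whether a root hub (`usb1`, `usb2`, ...) changes between `suspended` and `active`.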

Zupemen (catalin-virbanescu) wrote :

Hi, I am having the same problem on my Dell Inspiron 5555.
-------------- my top listing shows this
  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
 3400 root 20 0 0 0 0 R 61,8 0,0 64:46.63 kworker/0:0
    3 root 20 0 0 0 0 S 37,8 0,0 39:13.48 ksoftirqd/0
  827 root 20 0 407788 120076 98268 S 2,0 3,5 2:02.02 Xorg
 5014 flori 20 0 621176 32344 25432 S 1,7 1,0 0:02.00 gnome-terminal-
    9 root 20 0 0 0 0 S 0,3 0,0 0:23.10 rcuos/0
 1099 flori 20 0 1201488 141244 91212 S 0,3 4,2 4:04.16 compiz
 1515 flori 20 0 583176 32960 27260 S 0,3 1,0 0:07.11 nm-applet

---------------my cat /proc/interrupts
            CPU0 CPU1 CPU2 CPU3
   0: 49 0 0 0 IO-APIC-edge timer
   1: 190 52 45 636 IO-APIC-edge i8042
   7: 1 0 0 0 IO-APIC-edge
   8: 0 0 0 1 IO-APIC-edge rtc0
   9: 0 0 0 0 IO-APIC-fasteoi acpi
  16: 104 117 191 146 IO-APIC 16-fasteoi mmc0, snd_hda_intel
  18: 12810 12796 15208 52505 IO-APIC 18-fasteoi ehci_hcd:usb3, ehci_hcd:usb4
  25: 0 0 0 0 PCI-MSI-edge PCIe PME, pciehp
  27: 0 0 0 0 PCI-MSI-edge PCIe PME
  29: 0 0 0 0 PCI-MSI-edge PCIe PME
  30: 2 10 6 30 PCI-MSI-edge xhci_hcd
  31: 0 0 0 0 PCI-MSI-edge xhci_hcd
  32: 0 0 0 0 PCI-MSI-edge xhci_hcd
  33: 0 0 0 0 PCI-MSI-edge xhci_hcd
  34: 0 0 0 0 PCI-MSI-edge xhci_hcd
  36: 0 1 0 795 PCI-MSI-edge eth0
  37: 8895 10818 13275 52909 PCI-MSI-edge 0000:00:11.0
  38: 0 0 0 0 PCI-MSI-edge ccp-0
  39: 0 0 0 0 PCI-MSI-edge ccp-1
  43: 43 41 33 67 PCI-MSI-edge snd_hda_intel
  44: 2859971 11 26 86 IO-APIC 8-fasteoi ath9k
  45: 195622 140337 161305 908424 PCI-MSI-edge fglrx[0]@PCI:0:1:0
  46: 0 0 0 0 PCI-MSI-edge fglrx[1]@PCI:1:0:0
 N...

AceLan Kao (acelankao) wrote :

Hi,

Could you give me an ack or nack on whether the kernel in comment #4 works?
Thanks.

Aditya (code-aditya) wrote :

Hi AceLan,

I am also affected by this issue and using the patched kernel available at http://people.canonical.com/~acelan/lp1475248/ solved the issue for me.

Now the update manager offers me an upgrade for the three linux packages. Can I install them? Please advise how I can stay up to date with the latest security fixes while still having the patch applied to the kernel.

I have installed 14.04.3 (with vivid HWE).

Regards,
Aditya

AceLan Kao (acelankao) wrote :

Aditya, could you post the result of the command 'lspci' here, so that I can identify which platform you are using.
I submitted the patch for SRU a couple of months ago, but was asked to test the patch on more machines.
So, I need your feedback to push the patch into the Ubuntu kernel, so that you can use the updated Ubuntu kernel and get rid of this issue.

Aditya (code-aditya) wrote :

Hi AceLan,

Thanks for your efforts. Here is the result of running the command `lspci`:

$ lspci
00:00.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1422
00:01.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Kaveri [Radeon R7 200 Series]
00:01.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Device 1308
00:02.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1424
00:03.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1424
00:03.1 PCI bridge: Advanced Micro Devices, Inc. [AMD] Device 1426
00:04.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 1424
00:10.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 09)
00:10.1 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB XHCI Controller (rev 09)
00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD] FCH SATA Controller [AHCI mode] (rev 40)
00:12.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
00:12.2 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 11)
00:13.0 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
00:13.2 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB EHCI Controller (rev 11)
00:14.0 SMBus: Advanced Micro Devices, Inc. [AMD] FCH SMBus Controller (rev 16)
00:14.2 Audio device: Advanced Micro Devices, Inc. [AMD] FCH Azalia Controller (rev 01)
00:14.3 ISA bridge: Advanced Micro Devices, Inc. [AMD] FCH LPC Bridge (rev 11)
00:14.4 PCI bridge: Advanced Micro Devices, Inc. [AMD] FCH PCI Bridge (rev 40)
00:14.5 USB controller: Advanced Micro Devices, Inc. [AMD] FCH USB OHCI Controller (rev 11)
00:18.0 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 141a
00:18.1 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 141b
00:18.2 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 141c
00:18.3 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 141d
00:18.4 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 141e
00:18.5 Host bridge: Advanced Micro Devices, Inc. [AMD] Device 141f
01:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller (rev 0c)

I would be happy to help you troubleshoot the problem. Let me know whatever info you require.

Regards,

Peter Curtis (pdcurtis) wrote :

I also have a similar problem, which does not seem to be solved by the patched kernel, although the perf report looks very similar and plugging in a USB device clears the problem. The conditions that start and stop the kworker spinning are very specific and rather bizarre on my new Skylake machine running Linux Mint 17.3 beta (Cinnamon 2.8 desktop), tested with kernels between 3.19 and 4.2.

- It only starts when external power is off and the machine is suspended and resumed. I have not so far been able to provoke it any other way.
- If power is restored, the excess usage stops, but it restarts if power is unplugged.
- If power is on and a suspend/resume cycle is carried out, external power can be unplugged without the excess usage.
- Any USB 2 device, or a USB 3 device, plugged into the USB 2 port stops the excess usage.
- USB 3 devices plugged into a USB 3 port do not stop the excess usage.
- Turning on the webcam stops the usage for as long as it is on. The webcam is on the USB 2 hub.
- Turning Bluetooth off and on with the function key stops the high usage for as long as Bluetooth remains on. Disabling and enabling it in software does not work. When Bluetooth is switched by the function key, it disappears from and reappears in lsusb under the USB 2 hub.

In other words, everything seems to point to the USB 2 system.
perf report:
+ 68.46% 0.00% kworker/0:3 [kernel.kallsyms] [k] ret_from_fork
+ 68.46% 0.00% kworker/0:3 [kernel.kallsyms] [k] kthread
+ 68.46% 0.00% kworker/0:3 [kernel.kallsyms] [k] worker_thread
+ 68.43% 0.02% kworker/0:3 [kernel.kallsyms] [k] process_one_work
+ 67.44% 0.07% kworker/0:3 [kernel.kallsyms] [k] rpm_idle
+ 66.89% 0.07% kworker/0:3 [kernel.kallsyms] [k] rpm_suspend
+ 66.84% 0.02% kworker/0:3 [kernel.kallsyms] [k] pm_runtime_work
+ 66.47% 0.00% kworker/0:3 [kernel.kallsyms] [k] usb_runtime_idle
+ 66.47% 0.00% kworker/0:3 [kernel.kallsyms] [k] __rpm_callback
+ 66.44% 0.00% kworker/0:3 [kernel.kallsyms] [k] __pm_runtime_suspend
+ 66.23% 0.02% kworker/0:3 [kernel.kallsyms] [k] rpm_callback
+ 66.21% 0.02% kworker/0:3 [kernel.kallsyms] [k] usb_runtime_suspend
+ 66.11% 0.02% kworker/0:3 [kernel.kallsyms] [k] usb_suspend_both
+ 64.11% 0.02% kworker/0:3 [kernel.kallsyms] [k] hub_resume
+ 64.11% 0.02% kworker/0:3 [kernel.kallsyms] [k] usb_resume_interface.isra.6
+ 64.04% 0.52% kworker/0:3 [kernel.kallsyms] [k] hub_activate
+ 62.49% 0.24% kworker/0:3 [kernel.kallsyms] [k] hub_port_status
+ 62.02% 0.21% kworker/0:3 [kernel.kallsyms] [k] usb_control_msg
+ 59.15% 0.26% ...

Peter Curtis (pdcurtis) wrote :

I carried out some more comprehensive tests using the kernel in #4 over the weekend and found that it improves my situation far more than my initial quick test showed. Most of the rather bizarre behaviour has gone, including the changes when switching from battery to external power and the importance of suspending.
Once kworker spinning has started, it can now be _predictably_ stopped when using your kernel by:
    Any USB 2 device being plugged into any USB port, or a USB 3 device plugged into the USB 2 port
    The internal Bluetooth being turned on so that it shows up as a USB device
    The internal webcam being used (by cheese during my test).
Note that running the webcam does not change what is shown by lsusb -t; everything else that stops the spinning shows up as a new device being connected in dmesg and appears in lsusb.
Although the kernel in #4 does not completely solve the problem, it obviously addresses the correct area, and in my case it makes the machine usable without having anything plugged into a USB port; keeping Bluetooth on is sufficient.

@AceLan What is the status of the patch and other work on the issue? Is there anything else I can do to progress the work?

Aditya (code-aditya) wrote :

The above working patch has been merged into the linux-4.1.y branch of the linux-stable repository, as can be seen here: https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/log/?qt=grep&q=mmc%3A+core%3A+Enable+runtime+PM+management+of+host+devices&h=linux-4.1.y

Since Ubuntu 15.10 comes with Kernel version 4.2, I guess using Ubuntu 15.10 would fix the issue. I would wait for 14.04.4 to come out in Feb 2016 with updated kernel as depicted here: https://wiki.ubuntu.com/Kernel/LTSEnablementStack#Kernel.2BAC8-Support.A14.04.x_Ubuntu_Kernel_Support

And then upgrade to Ubuntu 16.04 LTS.

Peter Curtis (pdcurtis) wrote :

@Aditya Thanks for the clarification. I had been trying to find out if the patch had been merged.

However, the issue is not fixed for me with 4.2, nor was it for lpuser in #8, who tried a number of kernels. It is an improvement, though, as I said in #17.

The following shows the problem first without BT on, then with BT on, at which point the problem disappears:

:~$ uname -r
4.2.0-19-generic
pete@Helios-Ubuntu:~$ top

in, 2 users, load average: 0.71, 0.28, 0.11
Tasks: 238 total, 2 running, 236 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.8 us, 25.0 sy, 0.0 ni, 73.8 id, 0.0 wa, 0.0 hi, 0.4 si, 0.0 st
KiB Mem: 8082944 total, 1196884 used, 6886060 free, 61504 buffers
KiB Swap: 10239996 total, 0 used, 10239996 free. 542960 cached Mem

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
   50 root 20 0 0 0 0 R 71.2 0.0 0:15.14 kworker/0:1
    3 root 20 0 0 0 0 S 28.9 0.0 0:11.58 ksoftirqd/0
  729 root 20 0 294360 11812 7432 S 1.7 0.1 0:08.33 polkitd
 1182 pete 20 0 1451368 183784 60088 S 1.7 2.3 0:25.51 cinnamon
  145 root 20 0 0 0 0 S 0.3 0.0 0:00.10 kworker/u8+
  638 root 20 0 462888 19056 13656 S 0.3 0.2 0:03.32 NetworkMan+
  644 message+ 20 0 44192 5004 3480 S 0.3 0.1 0:03.83 dbus-daemon
  796 root 20 0 405348 86652 76272 S 0.3 1.1 0:07.15 Xorg
    1 root 20 0 37748 5784 3892 S 0.0 0.1 0:01.13 systemd
    2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
    4 root 20 0 0 0 0 S 0.0 0.0 0:13.54 kworker/0:0
    5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:+
    6 root 20 0 0 0 0 S 0.0 0.0 0:00.03 kworker/u8+
    7 root 20 0 0 0 0 S 0.0 0.0 0:00.39 rcu_sched
    8 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_bh
    9 root 20 0 0 0 0 S 0.0 0.0 0:00.35 rcuos/0
   10 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuob/0
pete@Helios-Ubuntu:~$ lsusb -t
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/12p, 480M
    |__ Port 4: Dev 3, If 0, Class=Video, Driver=uvcvideo, 480M
    |__ Port 4: Dev 3, If 1, Class=Video, Driver=uvcvideo, 480M

=========== Now Switch on Bluetooth ==============
~$ lsusb -t
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/6p, 5000M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/12p, 480M
    |__ Port 3: Dev 4, If 0, Class=Wireless, Driver=btusb, 12M
    |__ Port 3: Dev 4, If 1, Class=Wireless, Driver=btusb, 12M
    |__ Port 4: Dev 3, If 0, Class=Video, Driver=uvcvideo, 480M
    |__ Port 4: Dev 3, If 1, Class=Video, Driver=uvcvideo, 480M
$ top

top - 09:21:47 up 9 min, 2 users, load average: 0.79, 0.69, 0.32
Tasks: 246 total, 1 running, 245 sleeping, 0 stopped, 0 zombie
%Cpu(s): 2.1 us, 0.3 sy, 0.0 ni, 97.6 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 8082944 tot...

Peter Curtis (pdcurtis) wrote :

I have tried the test patch provided by Mathias Nyman in http://linux-kernel.2935.n7.nabble.com/TESTPATCH-v2-xhci-fix-usb2-resume-timing-and-races-tc1250796.html#a1256745 and rebuilt a Wily 4.2 kernel with it, and that has prevented the high CPU usage of kworker/ksoftirqd under all the circumstances I have tried. Note this is marked as a TESTPATCH, and Mathias Nyman mentions his intention is to "do some minor cleanups and add it to the queue". At best it will be in the 4.4 kernel and then available for backports, but it is good to know that there is a solution in hand.

@acelan Is there any way for you to also push the Mathias Nyman patch into the Ubuntu kernel, assuming it also solves your original problem? Then normal users could just use an updated Ubuntu kernel to get rid of this issue. As it is a timing and race problem, it may well turn out to have a much wider impact than current reports show, in particular as it seems to affect Skylake machines.

Michel-Ekimia (michel.ekimia) wrote :

Any news on which kernel this might be integrated into?

We really need this for a Skylake OEM project.

Zupemen (catalin-virbanescu) wrote :

My laptop's status is still the same.
The CPU is still at 90% on one core (as seen in System Monitor) when it should be idle, even when I plug in a USB device. The only thing that helps is the webcam being active :(
The core that goes up to 90% usage is random; it is not the same one every time I log in.

flori@flori-Inspiron-5555:~$ uname -a
Linux flori-Inspiron-5555 4.2.0-23-generic #28-Ubuntu SMP Sun Dec 27 17:47:31 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

flori@flori-Inspiron-5555:~$ lsusb -t
/: Bus 04.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/2p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 1: Dev 4, If 0, Class=Wireless, Driver=btusb, 12M
        |__ Port 1: Dev 4, If 1, Class=Wireless, Driver=btusb, 12M
/: Bus 03.Port 1: Dev 1, Class=root_hub, Driver=ehci-pci/2p, 480M
    |__ Port 1: Dev 2, If 0, Class=Hub, Driver=hub/4p, 480M
        |__ Port 1: Dev 3, If 0, Class=Vendor Specific Class, Driver=rtsx_usb, 480M
        |__ Port 2: Dev 4, If 0, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 2: Dev 4, If 1, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 2: Dev 4, If 2, Class=Human Interface Device, Driver=usbhid, 12M
        |__ Port 4: Dev 5, If 0, Class=Human Interface Device, Driver=usbhid, 12M
/: Bus 02.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 5000M
/: Bus 01.Port 1: Dev 1, Class=root_hub, Driver=xhci_hcd/2p, 480M
    |__ Port 2: Dev 2, If 0, Class=Video, Driver=uvcvideo, 480M
    |__ Port 2: Dev 2, If 1, Class=Video, Driver=uvcvideo, 480M

Zupemen (catalin-virbanescu) wrote :

flori@flori-Inspiron-5555:~$ cat /proc/6506/status
Name: kworker/1:0
State: R (running)
Tgid: 6506
Ngid: 0
Pid: 6506
PPid: 2
TracerPid: 0
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 64
Groups:
NStgid: 6506
NSpid: 6506
NSpgid: 0
NSsid: 0
Threads: 1
SigQ: 0/13422
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: ffffffffffffffff
SigCgt: 0000000000000000
CapInh: 0000000000000000
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
Seccomp: 0
Cpus_allowed: 2
Cpus_allowed_list: 1
Mems_allowed: 00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 5197
nonvoluntary_ctxt_switches: 51522882

i don't know if this helps, i'm just throwing all this here :D
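
One thing that can be read from the status dump above is the pair of context-switch counters: the nonvoluntary count (51522882) is far larger than the voluntary one (5197), which may suggest the thread is mostly being preempted rather than going to sleep on its own. A minimal sketch for pulling out just those two fields follows; the helper name `ctxt_switches` is hypothetical.

```shell
# ctxt_switches FILE: print the voluntary and nonvoluntary context-switch
# counters from a /proc/<pid>/status file (or a saved copy of one).
# (Hypothetical helper; the field names are those shown in the dump above.)
ctxt_switches() {
    status_file="${1:?usage: ctxt_switches /proc/PID/status}"
    awk -F':[ \t]*' '
        $1 == "voluntary_ctxt_switches"    { vol = $2 }
        $1 == "nonvoluntary_ctxt_switches" { nonvol = $2 }
        END { printf "voluntary=%d nonvoluntary=%d\n", vol, nonvol }' "$status_file"
}
```

For the thread above, it would be called as `ctxt_switches /proc/6506/status`.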

Jeremy LaCroix (j-jlacroix) wrote :

Hi guys, I'm being bitten by this too in Ubuntu 15.10. I have a System76 Lemur (2015, Skylake) and one of my cores is always at 90% usage or above, unless I plug in a USB device, in which case the problem goes away.

I tried installing kernel 4.3 from the mainline repo, but with that kernel, my system won't even boot at all.

Jeremy LaCroix (j-jlacroix) wrote :

Kernel 4.4 solves the problem for me. I used the kernel from the following site:
http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-intel-next/2016-01-11-wily/

I hope that this gets fixed in Ubuntu-proper so that I don't need to rely on a PPA for my system to work properly, though.

Michel-Ekimia (michel.ekimia) wrote :

Hi Jeremy, 16.04-proposed is still on 4.3 and does not fix the problem.

As you saw, the official fix is in 4.4 and 16.04 will include 4.4, so the official fix may only be available in April, or Canonical would need to push the fix to 4.2, which might be difficult.

I downloaded the sources for the 3.19.0-43-generic kernel (Ubuntu 14.04.3).
Then I applied this patch:
http://linux-kernel.2935.n7.nabble.com/TESTPATCH-v2-xhci-fix-usb2-resume-timing-and-races-td1250796.html

I haven't experienced any issues so far with the new patched kernel.

Any idea when this patch will hit the official Ubuntu 14.04.3 kernel?

DDS (dds-sp) wrote :

Have this problem on Dell Inspiron i7559-763BLK. Workaround for me is to plug in USB mouse.
Ubuntu MATE 15.10, Linux 4.2.0-27-generic #32-Ubuntu SMP Fri Jan 22 04:49:08 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

lpuser (lpuser) wrote :

The issue seems to be fixed with the latest Ubuntu 14.04.3 kernel:

$ uname -a
Linux quasar 3.19.0-49-generic #55~14.04.1-Ubuntu SMP Fri Jan 22 11:24:31 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Can someone double check and confirm?

Thanks!

Aditya (code-aditya) wrote :

Yes, the issue seems to be fixed for me as well with kernel release 3.19.0-49-generic.

I would still upgrade to 14.04.4 once it is released this week.

Peter Curtis (pdcurtis) wrote :

I can also confirm that the 3.19.0-49-generic kernel solves the problem for me.

Michel-Ekimia (michel.ekimia) wrote :

Congrats to the Ubuntu team for backporting this to 3.19, which helps bring some Skylake machines to the Ubuntu computer market.

lpuser (lpuser) wrote :

The issue seems to be fixed with Ubuntu 14.04.4 as well (kernel 4.2.0):

$ uname -a
Linux quasar 4.2.0-34-generic #39~14.04.1-Ubuntu SMP Fri Mar 11 11:38:02 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

landei16 (vogt-ruediger) wrote :

Hi all,

Yesterday I started my PC with kernel 4.2.0-34-generic (Ubuntu 14.04.4) running on an i3 Skylake processor.
kworker/0:1 (PID 50) is still consuming ~98% of CPU.

Regards
Landei16

MShepanski (mjs7231) wrote :

I'm hitting this problem on Ubuntu 16.04. Process kworker/0:1 is using 70% of one of my cores.

$ uname -a
Linux pkkid-work 4.4.0-22-generic #39-Ubuntu SMP Thu May 5 16:53:32 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$ lsusb
Bus 002 Device 003: ID 046d:c051 Logitech, Inc. G3 (MX518) Optical Mouse
Bus 002 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 001 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 004 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 003 Device 002: ID 046d:c31c Logitech, Inc. Keyboard K120
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

$ lspci
00:00.0 Host bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200 v2/3rd Gen Core processor PCI Express Root Port (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
00:19.0 Ethernet controller: Intel Corporation 82579LM Gigabit Network Connection (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev a4)
00:1f.0 ISA bridge: Intel Corporation Q77 Express Chipset LPC Controller (rev 04)
00:1f.2 RAID bus controller: Intel Corporation SATA Controller [RAID mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107 [GeForce GT 640 OEM] (rev a1)
01:00.1 Audio device: NVIDIA Corporation GK107 HDMI Audio Controller (rev a1)

MShepanski (mjs7231) wrote :

This solution worked for me. I'm not exactly sure what I disabled, but I'll take my chances at a broken thing over a core spinning at 100% constantly.
https://bbs.archlinux.org/viewtopic.php?id=184913

1. Find the "gpe" that is causing the problem with something like:
   $ grep . -r /sys/firmware/acpi/interrupts/

2. Look for a high value (mine was gpe08 with a value like 200K). Substitute
   your own GPE wherever gpe08 or gpe13 appears below. Back up the gpe file:
   $ cp /sys/firmware/acpi/interrupts/gpe13 /pathtobackup

3. Create a crontab entry to disable the gpe on reboot:
   $ sudo crontab -e
   @reboot echo "disable" > /sys/firmware/acpi/interrupts/gpe08

4. To make it work after wakeup from suspend:
   $ sudo touch /etc/pm/sleep.d/30_disable_gpe13
   $ sudo chmod +x /etc/pm/sleep.d/30_disable_gpe13
   $ sudo vim /etc/pm/sleep.d/30_disable_gpe13

4a: Add this as the script in step 4:

#!/bin/bash
case "$1" in
  thaw|resume)
    echo disable > /sys/firmware/acpi/interrupts/gpe13 2>/dev/null
    ;;
  *)
    ;;
esac
exit $?
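
Steps 1 and 2 above can be sketched as a small helper that reports the GPE with the highest count instead of scanning the `grep` output by eye. It assumes each `gpeXX` file starts with the interrupt count as its first whitespace-separated field; the function name `busiest_gpe` and its optional directory argument are hypothetical.

```shell
# busiest_gpe [DIR]: print "<count> <gpe name>" for the gpeXX file with the
# highest count under DIR (default: /sys/firmware/acpi/interrupts).
# (Hypothetical helper; read-only, it does not disable anything.)
busiest_gpe() {
    dir="${1:-/sys/firmware/acpi/interrupts}"
    for f in "$dir"/gpe[0-9A-Fa-f][0-9A-Fa-f]; do
        [ -r "$f" ] || continue
        # The first field on the first line is the interrupt count.
        awk -v name="${f##*/}" 'NR == 1 { print $1, name }' "$f"
    done | sort -rn | head -n 1
}
```

Whether disabling the reported GPE is safe depends on what it is wired to on a given machine, so the result is only a starting point for the remaining steps.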

fabrixx (fabrixx) wrote :

Thank you, MShepanski (mjs7231). I had the same problem on my new Core i7 6700.
Your fix works on my Debian system.

Fuujuhi (fuujuhi) wrote :

For info, I have the same symptoms as the OP (AceLan Kao) reported (same perf report), on Ubuntu 16.04 with kernel 4.4.0-59.80 generic. This is on a laptop (HP EliteBook G1 480) on a docking station.

I could apparently fix the problem by undocking and re-docking the laptop while it was still powered on.
The problem actually appeared after docking the laptop while it was sleeping, then waking it up.

To be exact, here is the complete sequence I followed:
- Dock the PC, and wake it up.
- Notice the CPU usage, and start 'perf' to debug.
- Undock the PC.
- Plug many USB devices into all ports. No effect.
- Remove the USB devices.
- Close the lid (for suspend), then open it again. No effect.
- Dock the PC again (while powered up) --> problem went away.

AceLan Kao (acelankao) on 2018-07-24
Changed in linux (Ubuntu):
status: In Progress → Confirmed
assignee: AceLan Kao (acelankao) → nobody
Changed in hwe-next:
assignee: AceLan Kao (acelankao) → nobody
danj (dan-julius) wrote :

Had a similar issue using Ubuntu 18.04 on my desktop.
Upgraded the kernel to 4.18.11; it still happens occasionally, usually after the screen saver.
As soon as I plugged USB devices into all ports, the CPU usage dropped down again.
