Ubuntu 18.04 is overheating after upgrade from 16.04

Bug #1768976 reported by Caramba
This bug affects 33 people
Affects                              Status     Importance  Assigned to  Milestone
linux (Ubuntu)                       Confirmed  High        Unassigned
xserver-xorg-video-nouveau (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Ubuntu is overheating on my laptop. Opening YouTube in Firefox is enough to trigger a critical-temperature shutdown.

Monitoring with lm-sensors on 18.04, the temperature varies between 70 and 85°C with only Firefox or Chrome open and doing nothing.
On my old 16.04 with the same usage, the temperature varied between 55 and 70°C.
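
For anyone who wants to compare readings between releases without lm-sensors, the kernel also exposes temperatures through the thermal sysfs interface. The following is only a minimal sketch, assuming the usual /sys/class/thermal layout where `temp` is given in millidegrees Celsius; the function name `read_thermal_zones` is made up for illustration and is not part of any package.

```shell
#!/bin/sh
# Print "<zone type> <temperature in °C>" for every thermal zone.
# The optional directory argument exists only so the function can be pointed
# at a test fixture; it defaults to the real sysfs location.
read_thermal_zones() {
    base="${1:-/sys/class/thermal}"
    for zone in "$base"/thermal_zone*; do
        [ -r "$zone/temp" ] || continue      # also skips an unmatched glob
        ztype=$(cat "$zone/type" 2>/dev/null || echo unknown)
        millideg=$(cat "$zone/temp")
        # sysfs reports millidegrees Celsius; integer division is enough here
        echo "$ztype $((millideg / 1000))"
    done
}
```

Calling `read_thermal_zones` once under 16.04 and once under 18.04 with the same applications open gives numbers that can be compared directly.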

My first thought was that the nouveau driver was the problem. In the end I was able to install without a thermal shutdown by adding "nouveau.modeset=0" to the live CD boot options.

After installing, I disabled nouveau through the modprobe blacklist, but the system keeps overheating and shutting down under basic usage.
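
The report does not show the blacklist file itself. As a hedged illustration, a common way to blacklist nouveau looks like the sketch below. The file name `blacklist-nouveau.conf` and the helper `write_nouveau_blacklist` are assumptions, not taken from the report.

```shell
#!/bin/sh
# Write a modprobe configuration that keeps nouveau from loading.
# The target path is a parameter so the helper can be tried on a scratch file;
# the default file name is an assumption (any *.conf in /etc/modprobe.d is read).
write_nouveau_blacklist() {
    conf="${1:-/etc/modprobe.d/blacklist-nouveau.conf}"
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0' > "$conf"
}
```

On a real system this would be run as root, followed by `update-initramfs -u` and a reboot so the initramfs no longer loads the module.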

I have no idea what's happening with Bionic on my laptop.

My laptop is a Samsung RF411 with a 2nd-generation i5 and a GeForce 540M.

ProblemType: Bug
DistroRelease: Ubuntu 18.04
Package: ubuntu-release-upgrader-core 1:18.04.17
ProcVersionSignature: Ubuntu 4.15.0-20.21-generic 4.15.17
Uname: Linux 4.15.0-20-generic x86_64
NonfreeKernelModules: wl
ApportVersion: 2.20.9-0ubuntu7
Architecture: amd64
CrashDB: ubuntu
CurrentDesktop: ubuntu:GNOME
Date: Thu May 3 16:22:40 2018
InstallationDate: Installed on 2018-04-27 (6 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Release amd64 (20180426)
PackageArchitecture: all
SourcePackage: ubuntu-release-upgrader
Symptom: dist-upgrade
UpgradeStatus: No upgrade log present (probably fresh install)
VarLogDistupgradeAptHistorylog:
 Start-Date: 2018-04-27 15:46:02
 End-Date: 2018-04-27 15:46:02
VarLogDistupgradeAptlog:
 Log time: 2018-04-27 15:45:39.753331
 Starting pkgProblemResolver with broken count: 0
 Starting 2 pkgProblemResolver with broken count: 0
 Done
 Log time: 2018-04-27 15:46:04.859979
VarLogDistupgradeApttermlog:
 Log started: 2018-04-27 15:46:02
 Log ended: 2018-04-27 15:46:02
---
.tmp.unity_support_test.0:

ApportVersion: 2.20.9-0ubuntu7.1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: edir 2354 F.... pulseaudio
CompositorRunning: None
DistUpgraded: Fresh install
DistroCodename: bionic
DistroRelease: Ubuntu 18.04
DistroVariant: ubuntu
DkmsStatus:
 bcmwl, 6.30.223.271+bdcom, 4.15.0-20-generic, x86_64: installed
 bcmwl, 6.30.223.271+bdcom, 4.15.0-22-generic, x86_64: installed
GraphicsCard:
 Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0116] (rev 09) (prog-if 00 [VGA controller])
   Subsystem: Samsung Electronics Co Ltd 2nd Generation Core Processor Family Integrated Graphics Controller [144d:c0a5]
   Subsystem: Samsung Electronics Co Ltd GF108M [GeForce GT 540M] [144d:c0a5]
HibernationDevice: RESUME=UUID=e7a61aee-64c2-4c88-b4e1-4de481d0f88d
InstallationDate: Installed on 2018-04-27 (36 days ago)
InstallationMedia: Ubuntu 18.04 LTS "Bionic Beaver" - Release amd64 (20180426)
MachineType: SAMSUNG ELECTRONICS CO., LTD. RF511/RF411/RF711
NonfreeKernelModules: wl
Package: xserver-xorg-video-nouveau 1:1.0.15-2
PackageArchitecture: amd64
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.15.0-22-generic root=UUID=db38a22c-0e9f-4e1a-b9f7-f7aac2544394 ro quiet splash nouveau.runpm=0
ProcVersionSignature: Ubuntu 4.15.0-22.24-generic 4.15.17
PulseList:
 Error: command ['pacmd', 'list'] failed with exit code 1: Home directory not accessible: Permission denied
 No PulseAudio daemon running, or not running as session daemon.
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-22-generic N/A
 linux-backports-modules-4.15.0-22-generic N/A
 linux-firmware 1.173.1
Tags: bionic ubuntu
Uname: Linux 4.15.0-22-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True
dmi.bios.date: 04/26/2011
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 10HX.M034.20110426.SSH
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: RF511/RF411/RF711
dmi.board.vendor: SAMSUNG ELECTRONICS CO., LTD.
dmi.board.version: 10HX
dmi.chassis.asset.tag: No Asset Tag
dmi.chassis.type: 9
dmi.chassis.vendor: SAMSUNG ELECTRONICS CO., LTD.
dmi.chassis.version: N/A
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr10HX.M034.20110426.SSH:bd04/26/2011:svnSAMSUNGELECTRONICSCO.,LTD.:pnRF511/RF411/RF711:pvr10HX:rvnSAMSUNGELECTRONICSCO.,LTD.:rnRF511/RF411/RF711:rvr10HX:cvnSAMSUNGELECTRONICSCO.,LTD.:ct9:cvrN/A:
dmi.product.family: To be filled by O.E.M.
dmi.product.name: RF511/RF411/RF711
dmi.product.version: 10HX
dmi.sys.vendor: SAMSUNG ELECTRONICS CO., LTD.
version.compiz: compiz 1:0.9.13.1+18.04.20180302-0ubuntu1
version.libdrm2: libdrm2 2.4.91-2
version.libgl1-mesa-dri: libgl1-mesa-dri 18.0.0~rc5-1ubuntu1
version.libgl1-mesa-glx: libgl1-mesa-glx 18.0.0~rc5-1ubuntu1
version.xserver-xorg-core: xserver-xorg-core 2:1.19.6-1ubuntu4
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev N/A
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:18.0.1-1
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917+git20171229-1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.15-2

Mauro (mascia-mauro) wrote :

I have the same heating issue after a clean install of Ubuntu 18.04. Before that I was using Ubuntu 16.04, without this problem.

The fans are more active than before and the laptop is constantly overheated, at 70-80 degrees while doing nothing, with only Chrome or even just the shell open.

The Nvidia drivers have been removed (I can live with the integrated Intel).
Kworker issue (https://bugs.launchpad.net/ubuntu/+source/linux/+bug/887793) already fixed.

This happens on a Samsung 7 Chronos NP700Z7C (CPU i7 3rd gen)

affects: ubuntu-release-upgrader (Ubuntu) → linux (Ubuntu)
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1768976

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Stefan (pinguinpanic) wrote :

Same problem here. I have a Lenovo E570
cpu Intel® Core™ i5-7200U CPU @ 2.50GHz × 4
graphics GeForce GTX 950M/PCIe/SSE2

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in xserver-xorg-video-nouveau (Ubuntu):
status: New → Confirmed
Stefan (pinguinpanic) wrote :

I had a similar problem. It turns out it was actual high load that forced the CPU clock speed to stay above 3GHz on all (!) cores. Namely, xorg (>100%), systemd-journald (>50%) and rsyslogd (>30%) were consuming a lot of CPU power. In the journal I could see the following line over and over again:

    SynPS/2 Synaptics TouchPad: Read error 9

After a bit of digging I found that the synaptics driver should be replaced with xserver-xorg-input-libinput (which I already had installed; maybe that was the problem).

see: https://wiki.ubuntuusers.de/Touchpad/

So after removing the package xserver-xorg-input-synaptics the CPU load went down significantly, and so did the CPU clock and heat.

Problem solved for me.
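
To check whether a machine is hit by the same log flood, the message can simply be counted in the current boot's journal. This is a small sketch, not the commenter's own procedure; `count_matches` is a hypothetical helper name.

```shell
#!/bin/sh
# Count the lines on stdin that contain a fixed string.
count_matches() {
    pattern="$1"
    # grep exits with status 1 when the count is 0; that is not an error here
    grep -c -F -- "$pattern" || true
}

# Example on a live system:
#   journalctl -b --no-pager | count_matches 'SynPS/2 Synaptics TouchPad: Read error 9'
# If the count is large, removing the package named in the comment would be:
#   sudo apt remove xserver-xorg-input-synaptics
```

A count in the thousands within a few minutes of uptime would be consistent with the journald and rsyslogd load described above.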

Caramba (ecaramba) wrote : apport information

Attached: AlsaInfo.txt, BootLog.txt, CRDA.txt, CurrentDmesg.txt, Dependencies.txt, DpkgLog.txt, IwConfig.txt, LightdmDisplayLog.txt, LightdmLog.txt, Lspci.txt, Lsusb.txt, ProcCpuinfo.txt, ProcCpuinfoMinimal.txt, ProcEnviron.txt, ProcInterrupts.txt, ProcModules.txt, RfKill.txt, UdevDb.txt, UnitySupportTest.txt, WifiSyslog.txt, XorgLog.txt, XorgLogOld.txt, Xrandr.txt, xdpyinfo.txt

tags: added: apport-collected ubuntu
description: updated

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.17 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.17

Changed in linux (Ubuntu):
importance: Undecided → High
status: Confirmed → Incomplete
tags: added: kernel-da-key
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
status: Confirmed → Incomplete
viny (vinyoliver-ti) wrote :

Facing the same problem here... Dell inspiron 7000.

jmkeller (jmkeller) wrote :

Same issue started after the 18.04 upgrade. My AMD FX at 4.7 GHz goes into thermal shutdown about a minute after booting into the desktop. I installed cpufreqd and set the policy to "power save high" (limiting the frequency scaling to about 50%) in order to keep the CPU temperature below the thermal cut-off limit.
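
A similar cap can be applied without cpufreqd by writing to the cpufreq sysfs files directly. The sketch below is a different technique from the one in the comment, and it assumes the standard `cpuinfo_max_freq` and `scaling_max_freq` files are present. `cap_cpu_freq` is a hypothetical name, and the setting does not survive a reboot.

```shell
#!/bin/sh
# Limit every CPU to a percentage of its hardware maximum frequency.
# Usage: cap_cpu_freq PERCENT [SYSFS_CPU_DIR]   (needs root on a real system)
cap_cpu_freq() {
    percent="$1"
    base="${2:-/sys/devices/system/cpu}"
    for policy in "$base"/cpu[0-9]*/cpufreq; do
        [ -r "$policy/cpuinfo_max_freq" ] || continue
        max_khz=$(cat "$policy/cpuinfo_max_freq")      # value is in kHz
        echo $((max_khz * percent / 100)) > "$policy/scaling_max_freq"
    done
}
```

For example, `cap_cpu_freq 50` roughly matches the 50% limit mentioned above.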

Filip Sabo (filopator) wrote :

Same here, fresh installation of Ubuntu, Samsung npv300.

László Tóth (premissa72) wrote :

Same problem: Lenovo Z50, Radeon R6 R7 graphics, AMD FX 7500 CPU. The bug definitely corresponds to the new Ubuntu version, because the previous 16.04 version worked almost smoothly. (I changed it because of a very slow boot process; it is a bit faster in 18.04.)

Besides the overheating, the system often freezes. The desktop stops and freezes, and after half a minute everything is almost fine again. I was able to check, and at the same time CPU usage was extremely high.

The problem occurs under both Wayland and Xorg, but under Xorg it is less frequent.

Eugene Minov (minov-eug) wrote :

I'm having the same problem. Laptop Asus N551JX. Intel® Core™ i7-4720HQ CPU @ 2.60GHz × 8
Fresh install; tried with and without the nvidia drivers, tried stopping thermald, didn't help.
/var/log/kern.log keeps reporting a critical CPU package temperature or something similar before shutdown...
It feels like the critical halt temperature has been lowered in the kernel.
Returned to 16.04 in the meantime.
Loved 18.04 very much!! Please fix this soon.

DJ (dj-dj) wrote :

The same problem here.
Upgrade from 16 to 18
Dell XPS

Caramba (ecaramba) wrote : Re: [Bug 1768976] Re: Ubuntu 18.04 is overheating after upgrade from 16.04

Hi,

I tested the latest kernel version, 4.17, and the same kernel as 16.04, 4.4.0, and
the result is the same problem.
Now I think it might be some module such as nvidia, or some kernel setting that is
overclocking even with thermald and tlp.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Kai-Heng Feng (kaihengfeng) wrote :

Would it be possible for you to do a kernel bisection?

First, find the last good -rc kernel and the first bad -rc kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/

Then,
$ sudo apt build-dep linux
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
$ cd linux
$ git bisect start
$ git bisect good GOOD_VERSION   # replace GOOD_VERSION with the good version you found
$ git bisect bad BAD_VERSION     # replace BAD_VERSION with the bad version you found
$ make localmodconfig
$ make -j`nproc` deb-pkg
Install the newly built kernel, then reboot with it.
If the issue still happens,
$ git bisect bad
Otherwise,
$ git bisect good
Repeat from "make -j`nproc` deb-pkg" until you find the commit that causes the regression.

misha (misha-manulis) wrote :

Having the same issue on a Precision 5520. It started happening after the upgrade from 16.04 to 18.04 as well. I don't have the Synaptics drivers installed; I'm using libinput.

Running the same kernel as for 16.04, 4.16.18-041618-generic installed from http://kernel.ubuntu.com/~kernel-ppa/mainline/

What kind of logs would you need to help debug this?

vasanag (vasanag) wrote :

same problem here. Laptop is overheating.

HP Pavilion Power Laptop 15-cb0xx

Simon Ádám (adtewa) wrote :

I had the same issue with clean install of Ubuntu 18.04.
16.04 was working fine with my laptop.
Laptop: Dell Studio XPS 1340, Processor: Intel Core 2 Duo P8600, VC: GeForce 9400MGE

esl (esl) wrote :

The same problem. Clean install. Asus S46CM.

Renaud Denis (renaud.denis) wrote :

Same problem for me.
Upgrade to 18.04.1 from 16.04.x.
Dell Precision M3800

Nvidia drivers 340 and 390 (both tested).
But I'm always on the Intel Graphics.

Temp regularly above 70°C. Never had this problem before.

Simon Ádám (adtewa) wrote :

UPDATE:
I wanted to try installing 18.04.1.
After starting the live mode and opening the menu, my laptop suddenly shut down, and it is hot now.
It also somehow messes up the fan: I have to restart from Windows to make it sound and work fine again.
Windows runs like a charm, no overheating.

Cyril Soler (csoler-users) wrote :

Same for me: ubuntu 18.04 just installed on Dell Precision M3800.

Before I was on 16.04 and CPU temp during normal session was about 47-55 deg.

Now, it's constantly at 63-80 deg.

I tried lots of tweaks (installing thermald, tlp, cpufreq-selector, etc.). Nothing has worked so far. I'm seriously considering going back to 16.04, or switching to anything else that would not slowly kill my hardware.

Eko Eryanto (ekoeryanto) wrote :

I have the same issue, migrated with fresh install from windows 10 (with no overheat problem)

Mostafa Aghajani (aghajani) wrote :

Same for me: Ubuntu 18.04 LTS fresh install on Surface Book 2017
Constant high CPU usage (> 90%) on kworker/0:1, which sometimes results in overheating and the screen going blank for seconds.

anewbie (anewbie) wrote :

I have the same problem (not overheating, but xorg is now constantly spinning). Prior to the upgrade from 16.04 to 18.04 I never had this issue. At first I thought it might be related to an application, but even when nothing is running the X server just spins.

A small detail: I actually have a 2500K, but there was a regression in the X server with 18.04. 16.04 worked fine, but 18.04 caused a bug to reoccur that existed in 14.04 (I think; I had this problem after 12.04, but it was a long time ago), where the Intel graphics would cause crashes. So I purchased an R72402364P. Anyway, an strace shows this small tidbit with the X server spinning.

I filtered out other stuff the X server is doing, but as you can see from the timing it is spinning on ioctl calls:
--
1540153775.186129 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3f50) = 0
1540153775.186613 ioctl(16, DRM_IOCTL_RADEON_GEM_CREATE, 0x7ffd14ed45c0) = 0
1540153775.186650 ioctl(16, DRM_IOCTL_RADEON_GEM_VA, 0x7ffd14ed45a0) = 0
1540153775.186690 ioctl(16, DRM_IOCTL_RADEON_GEM_VA, 0x7ffd14ed4650) = 0
1540153775.186723 ioctl(16, DRM_IOCTL_GEM_CLOSE, 0x7ffd14ed4648) = 0
1540153775.186757 ioctl(16, DRM_IOCTL_RADEON_GEM_VA, 0x7ffd14ed4650) = 0
1540153775.186788 ioctl(16, DRM_IOCTL_GEM_CLOSE, 0x7ffd14ed4648) = 0
1540153775.186818 ioctl(16, DRM_IOCTL_RADEON_GEM_VA, 0x7ffd14ed4650) = 0
1540153775.186849 ioctl(16, DRM_IOCTL_GEM_CLOSE, 0x7ffd14ed4648) = 0
1540153775.186993 ioctl(16, DRM_IOCTL_RADEON_GEM_WAIT_IDLE, 0x7ffd14ed46e0) = 0
1540153775.187389 ioctl(16, DRM_IOCTL_RADEON_GEM_CREATE, 0x7ffd14ed45c0) = 0
1540153775.187425 ioctl(16, DRM_IOCTL_RADEON_GEM_VA, 0x7ffd14ed45a0) = 0
1540153775.187460 ioctl(16, DRM_IOCTL_RADEON_GEM_VA, 0x7ffd14ed4650) = 0
1540153775.187492 ioctl(16, DRM_IOCTL_GEM_CLOSE, 0x7ffd14ed4648) = 0
1540153775.187611 ioctl(16, DRM_IOCTL_RADEON_GEM_WAIT_IDLE, 0x7ffd14ed46e0) = 0
1540153775.188095 ioctl(16, DRM_IOCTL_RADEON_GEM_CREATE, 0x7ffd14ed45c0) = 0
1540153775.188130 ioctl(16, DRM_IOCTL_RADEON_GEM_VA, 0x7ffd14ed45a0) = 0
1540153775.188264 ioctl(16, DRM_IOCTL_RADEON_GEM_WAIT_IDLE, 0x7ffd14ed46e0) = 0
1540153775.188579 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188605 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188628 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188652 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188675 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188698 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188721 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188744 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188768 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188793 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188816 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188840 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188863 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188904 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd14ed3df0) = 0
1540153775.188930 ioctl(16, DRM_IOCTL_RADEON_GEM_BUSY, 0x7ffd1...
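
A trace like the one above is easier to judge when the calls are tallied by request name. The following is a rough sketch under the assumption that the trace has the same layout as the excerpt (timestamp, then `ioctl(fd, REQUEST, ...`), which is what `strace -ttt` prints; `count_ioctls` is a hypothetical helper name.

```shell
#!/bin/sh
# Tally ioctl requests in strace output read from stdin, most frequent first.
count_ioctls() {
    awk '$2 ~ /^ioctl\(/ {
             sub(/,$/, "", $3)      # field 3 is "REQUEST_NAME," ; drop the comma
             n[$3]++
         }
         END { for (k in n) print n[k], k }' | sort -rn
}

# Example on a live system (PID is the X server process id):
#   sudo strace -ttt -e trace=ioctl -p PID -o xorg.trace    # stop with Ctrl-C
#   count_ioctls < xorg.trace
```

A tally dominated by a single polling request such as DRM_IOCTL_RADEON_GEM_BUSY would support the busy-wait reading of the excerpt.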

gamer (random45726) wrote :

I want to add an observation. I upgraded from 16.04 to 18.04, and already on the first boot the system was sluggish and the mouse pointer was slow. At first I suspected a baloo_file process to be the cause. It's not. I noticed the fans of my ThinkPad spinning up. I saw no load on the CPU (as a normal user), and the disk I/O is low too. So what is causing this? The sensors still show high temperatures and it got hot. I also noticed the GPU temperature is high. Then I disabled Wayland and rebooted. It is still slow. Then I booted into the recovery root shell only and noticed that a kworker/u16:0 process, visible only as root, produces high load. So now I suspect some kernel-level process is out of control, but I don't know what this kworker process belongs to.

gamer (random45726) wrote :

Well, the new 4.15 kernel has that problem. I still have a 4.4 kernel as alternative, which does not have that problem. Something in between has gone bad. This does not really narrow it down enough to know the cause, but it's not the drivers or the hardware or other software alone.

gamer (random45726) wrote :

I want to add one more info, since most people might focus on the gpu or drivers. The gpu temp might only indirectly show cpu temps for my thinkpad, since they both are connected to the same heatpipe. Hot cpu => warm gpu. Does not mean gpu is in use.

gamer (random45726) wrote :

And another observation. After booting the old kernel several times.. the 4.15 kernel starts without problems also, currently it works without the heat and performance problems. I did not change anything. I have no idea what happened there. Weird.

gamer (random45726) wrote :

Aaand it's back again. Overheating. No real load. Hot as it can get.

Kai-Heng Feng (kaihengfeng) wrote :

Is it possible to try the thermald package in Xenial?

Simon Ádám (adtewa) wrote :

I have the same issue with Linux Mint 19.1, which I read is based on Ubuntu 18.04.

Simon Ádám (adtewa) wrote :

The issue persists in version 18.10

Eugene Minov (minov-eug) wrote :

So, is downgrading to the 4.4 kernel a working workaround?
Can someone else confirm it please?

Cause I can't wait to upgrade my 16.04... again

Kai-Heng Feng (kaihengfeng) wrote :

If we can confirm thermald is not the culprit, we can do a kernel bisection to find the regression commit.

First, find the last good -rc kernel and the first bad -rc kernel from http://kernel.ubuntu.com/~kernel-ppa/mainline/

Then,
$ sudo apt build-dep linux
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
$ cd linux
$ git bisect start
$ git bisect good GOOD_VERSION   # replace GOOD_VERSION with the good version you found
$ git bisect bad BAD_VERSION     # replace BAD_VERSION with the bad version you found
$ make localmodconfig
$ make -j`nproc` deb-pkg
Install the newly built kernel, then reboot with it.
If the issue still happens,
$ git bisect bad
Otherwise,
$ git bisect good
Repeat from "make -j`nproc` deb-pkg" until you find the commit that causes the regression.

Simon Ádám (adtewa) wrote :

Ubuntu offered me a security update. It had been offering it for a long time and I didn't want to take it, because I didn't want to mess things up. But I installed everything, and it did mess things up with kernel 4.15: it couldn't install the 340 nvidia driver I need into the new kernel. Booting the new kernel version (4.15) became an endless loop. I read somewhere that this might be because of the video driver. I started the last working Linux kernel and activated nouveau. The 4.15 kernel started this way, but overheated within a minute and shut down.

What I did: started Linux in command-line mode and blacklisted nouveau, downloaded the 340 nvidia driver from the nvidia website and installed it manually (without DKMS, because that didn't work).
Now there is no overheating. Nouveau is the one to blame, imo.

Simon Ádám (adtewa) wrote :

Left out: the kernel that is working now is the same one that overheated with nouveau: 4.15

RastaPopoulos (flbl) wrote :

Same symptoms here and no nvidia GPU, only "simple" Intel.

Since this version, the temperature is very high: between 70 and 80 by default, up to 90 and more with a YouTube video, and even more with a 3D game like Minetest… and then it shuts down (after 5-10 min at most on Minetest).

It was clearly not the case with previous versions. Last year I remember watching hours of videos or playing hours of Tuxracer or Minetest with my son, and no overheating nor shutdowns. Never.

Processor : Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
Memory : 12180MB (4073MB used)
Machine Type : Notebook
Operating System : Ubuntu 18.04.1 LTS
Resolution : 1600x900 pixels
OpenGL Renderer : Mesa DRI Intel(R) Ivybridge Mobile
X11 Vendor : The X.Org Foundation

indigocat (indigocat) wrote :

Same problem here:

Lenovo ThinkPad T410s
Intel i5 (1st gen, 2.53Ghz)
NVidia NVS3100M graphics

Kernel 4.4 in 16.04 idles at 35°C - 37°C (fresh, high-quality thermal paste applied a week ago).

Kernel 4.15 in 18.04 idles at 48°C - 52°C.

Only Dropbox and Megasync in the background; htop shows very low CPU use from these 2 apps.

Overall system CPU usage (idle) doesn't go above 1.3% - 2% range.

Same workflow from 16.04 + kernel 4.4 runs 20°C hotter in 18.04 + kernel 4.15.

Regular operation with Chromium and Emacs drives temperature to 76°C - 80°C.

Critical temperature is 95°C before shutdown.

This is running really hot.

indigocat (indigocat) wrote :

I'll leave my hardware specs here, it could help pinning down the problem cause.
I'm using NVidia driver 340.107.

The overheating occurs under kernel 4.15, both in 16.04 and 18.04, *but* 18.04 runs about 8°C - 10°C hotter.

Kernel 4.18 runs just as hot in 18.04.

indigocat (indigocat) wrote :

According to this link, disabling Intel pstate governor brings the temperature down back to 16.04 levels:

https://askubuntu.com/questions/1063363/laptop-cpugpu-overheating-after-update-to-18-04-lts/1064534#1064534

Radovan (chicany) wrote :

I confirm that disabling the Intel pstate driver helps to solve the overheating. See the instructions below.

https://brezular.com/2019/02/05/ubuntu-18-04-overheating/

indigocat (indigocat) wrote :

Followed instructions from #67.

Running 'cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver' to check for frequency scaling driver shows 'acpi-cpufreq' instead of 'intel_pstate', even *before* adding 'intel_pstate=disable' to /etc/default/grub.

After adding 'intel_pstate=disable' to /etc/default/grub and rebooting, 'cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver' shows 'acpi-cpufreq' as frequency scaling driver, too.

In my case, using kernel 4.15.0-45-generic, 'intel_pstate' frequency scaling driver has never been shown as active, yet the processor overheating continues.
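
The check and the GRUB change described above can be scripted roughly as follows. This is a sketch under assumptions: both function names are hypothetical, the GRUB value is assumed to be wrapped in double quotes, and `update-grub` plus a reboot are still needed afterwards.

```shell
#!/bin/sh
# Print the frequency-scaling driver of each CPU (same check as the cat command above).
show_scaling_drivers() {
    base="${1:-/sys/devices/system/cpu}"
    for f in "$base"/cpu[0-9]*/cpufreq/scaling_driver; do
        [ -r "$f" ] && cat "$f"
    done
    return 0
}

# Append a kernel parameter to GRUB_CMDLINE_LINUX_DEFAULT unless it is already there.
# The parameter must not contain characters special to sed ("/" or "&").
add_kernel_param() {
    param="$1"
    grub_file="${2:-/etc/default/grub}"
    grep -q "^GRUB_CMDLINE_LINUX_DEFAULT=.*$param" "$grub_file" && return 0
    tmp=$(mktemp)
    sed "s/^GRUB_CMDLINE_LINUX_DEFAULT=\"\(.*\)\"/GRUB_CMDLINE_LINUX_DEFAULT=\"\1 $param\"/" \
        "$grub_file" > "$tmp" && cat "$tmp" > "$grub_file"
    rm -f "$tmp"
}

# Example on a live system (as root):
#   add_kernel_param intel_pstate=disable && update-grub
```

If `show_scaling_drivers` already prints `acpi-cpufreq` before the change, as in this comment, the parameter cannot be expected to make a difference.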

My laptop's processor is an Intel i5-540M @ 2.53GHz, to be precise:
https://ark.intel.com/en/products/43544/Intel-Core-i5-540M-Processor-3M-Cache-2-53-GHz-

Video is NVidia Quadro NVS3100M with binary driver 340.107:
https://www.nvidia.com/object/nvs_techspecs.html

viny (vinyoliver-ti) wrote :

Recently I formatted and reinstalled version 18.04.1. Now the overheating problem is gone. Everything is working just fine.

Brian Wharton (brnwharton) wrote :

I have the same problem on my Vostro 3500. I'm using the 4.15.0-47-generic kernel with Gnome 3.28.2 on 18.04.2 LTS. Temperatures are at 67-81C and using Win10 on the same laptop the temperatures are 45-57C with no apps running in either OS.

Simon Ádám (adtewa) wrote :

UPDATE:
After blacklisting nouveau and installing the nvidia driver, there was no overheat shutdown, but the CPU is at full speed all the time and the temperature is constantly 87 Celsius.
I tried disabling pstate and using tlp; neither has worked so far.
Here is a usual frame from top:

top - 21:19:30 up 27 min, 1 user, load average: 2,84, 2,52, 2,26
Tasks: 251 total, 3 running, 197 sleeping, 0 stopped, 1 zombie
%Cpu(s): 65,2 us, 20,7 sy, 0,1 ni, 13,3 id, 0,1 wa, 0,0 hi, 0,7 si, 0,0 st
KiB Mem : 3780788 total, 152620 free, 2611464 used, 1016704 buff/cache
KiB Swap: 8000508 total, 7999228 free, 1280 used. 794276 avail Mem

  PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
  275 root 20 0 227032 182460 3284 R 94,4 4,8 24:46.69 systemd-udevd
  283 root 20 0 45892 5740 4568 R 16,7 0,2 5:12.21 systemd-udevd
  758 root 20 0 1077448 26772 12596 S 11,1 0,7 1:48.09 snapd
 4408 adtewa 20 0 52876 3992 3396 R 11,1 0,1 0:00.02 top
    7 root 20 0 0 0 0 S 5,6 0,0 0:02.51 ksoftirqd/0
  261 root 19 -1 164940 55772 54552 S 5,6 1,5 1:07.56 systemd-journal
    1 root 20 0 159928 9212 6696 S 0,0 0,2 0:01.86 systemd

systemd-udevd is running near 100% all the time.
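
Picking the offenders out of a `top` frame like this one can be automated. This is a small sketch, assuming the default column order shown above (%CPU in column 9, COMMAND in column 12) and accepting a decimal comma as in this locale; `busy_processes` is a hypothetical name.

```shell
#!/bin/sh
# Print "COMMAND %CPU" for every process at or above a CPU threshold.
# Reads batch-mode top output on stdin.
busy_processes() {
    threshold="$1"
    awk -v t="$threshold" '
        $1 ~ /^[0-9]+$/ && NF >= 12 {      # process rows start with a numeric PID
            cpu = $9
            sub(/,/, ".", cpu)             # accept "94,4" as well as "94.4"
            if (cpu + 0 >= t + 0) print $12, cpu
        }'
}

# Example on a live system:
#   top -b -n 1 | busy_processes 50
```

Run repeatedly, this shows whether the same process, here systemd-udevd, stays at the top.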

Simon Ádám (adtewa) wrote :

UPDATE:

Solved by disabling bluetooth in bios according to:

https://askubuntu.com/questions/1028883/ubuntu-18-04-systemd-udevd-uses-high-cpu-conflict-with-wifi

It is silent and cool now.

Under the link you can find people saying that this 100% CPU usage by systemd-udevd is caused by the Dell Bluetooth. Others might have similar issues. There is also a probable solution there if you want to keep using your Bluetooth; I didn't try it, though.

In summary: blacklisting nouveau, installing nvidia driver and disabling bluetooth solved the overheat problem for me.

I feel this is a workaround though, and linux shouldn't overheat in any case.

Chris Stone (habile) wrote :

I'm not sure if this is related but just in case. I upgraded from 16.04.1 LTS to 18.04.2 - straight after I started getting alarms from my SNMP monitoring, low voltage VBat etc. Upon checking the sensors output I can see it's all messed up:

Linux host.no.domain 4.15.0-48-generic #51-Ubuntu SMP Wed Apr 3 08:28:49 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
root@host:/# sensors
it8728-isa-0a20
Adapter: ISA adapter
in0: +2.24 V (min = +2.32 V, max = +0.77 V) ALARM
in1: +2.24 V (min = +2.83 V, max = +0.67 V) ALARM
in2: +3.06 V (min = +1.68 V, max = +1.54 V) ALARM
+3.3V: +3.41 V (min = +0.77 V, max = +0.31 V) ALARM
in4: +2.24 V (min = +0.84 V, max = +0.65 V) ALARM
in5: +2.24 V (min = +1.58 V, max = +1.64 V) ALARM
in6: +2.24 V (min = +0.05 V, max = +2.36 V)
3VSB: +3.41 V (min = +1.66 V, max = +4.73 V)
Vbat: +0.14 V
fan1: 0 RPM (min = 13 RPM) ALARM
fan2: 2027 RPM (min = 11 RPM)
fan3: 0 RPM (min = 25 RPM) ALARM
temp1: +54.0°C (low = +101.0°C, high = +14.0°C) ALARM sensor = thermal diode
temp2: +55.0°C (low = +40.0°C, high = +7.0°C) ALARM sensor = Intel PECI
temp3: -128.0°C (low = +17.0°C, high = +26.0°C) sensor = disabled
intrusion0: ALARM

acpitz-virtual-0
Adapter: Virtual device
temp1: +27.8°C (crit = +106.0°C)
temp2: +29.8°C (crit = +106.0°C)

coretemp-isa-0000
Adapter: ISA adapter
Package id 0: +57.0°C (high = +87.0°C, crit = +105.0°C)
Core 0: +55.0°C (high = +87.0°C, crit = +105.0°C)
Core 1: +56.0°C (high = +87.0°C, crit = +105.0°C)

CPU temperatures and fan speed are probably OK but voltages are certainly not.

This may be a red herring; I just thought I should bring it up.
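
For anyone who wants to check their own `sensors` output for the same symptom, the inverted limits above (a minimum greater than the maximum) can be flagged mechanically. This is a minimal Python sketch, assuming the text layout shown in this comment; the function name `inverted_limits` and the regular expression are illustrative and not part of lm-sensors.

```python
import re

# Matches lines such as "in0: +2.24 V (min = +2.32 V, max = +0.77 V) ALARM"
# or "temp1: +54.0°C (low = +101.0°C, high = +14.0°C)". Lines that only
# carry "high"/"crit" limits (e.g. coretemp) are deliberately not matched.
_LIMITS = re.compile(
    r"^(?P<label>[^:]+):\s+[+-]?\d+(?:\.\d+)?.*?"
    r"\(\s*(?:min|low)\s*=\s*(?P<lo>[+-]?\d+(?:\.\d+)?)[^,]*,"
    r"\s*(?:max|high)\s*=\s*(?P<hi>[+-]?\d+(?:\.\d+)?)"
)


def inverted_limits(sensors_text):
    """Return the labels of readings whose lower limit exceeds the upper one."""
    bad = []
    for line in sensors_text.splitlines():
        match = _LIMITS.match(line.strip())
        if match and float(match["lo"]) > float(match["hi"]):
            bad.append(match["label"])
    return bad
```

Feeding it the text printed by `sensors` returns the labels worth a closer look. It only reports the inconsistency and says nothing about its cause.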

Brad Figg (brad-figg)
tags: added: cscc
xtro (grailswebdev) wrote :

Have the same overheating problem on all my Lenovo laptops.

T520i (Intel graphics), Kernel: 4.6.0-040600
W510 (Intel/Nvidia), Kernel: 5.0.0-25
T530 (Intel/Nvidia), Kernel: 4.4.0-157

intel_pstate is not solving the problem. But I noticed it only overheats when watching videos in fullscreen mode.

cat /sys/devices/system/cpu/cpu*/cpufreq/scaling_driver
gives -> acpi-cpufreq

Pavel Zharov (trother555) wrote :

Same problem. XPS 15 9570, freshly installed Ubuntu 18.04.3. 70-99 degrees while using browser + IDE. Disabling turbo boost keeps it under 80 degrees, but that's not a solution. intel_pstate is enabled, disabling it does not help.

Davide Sangalli (davide-sangalli) wrote :

Same problem here.
Ubuntu 18.04.3 LTS
I tried kernels 4.15, 5.0 and 5.3 and the problem persists.
With the default installation parameters, the CPU temperature goes above 95°C very easily.

The only way I managed to keep the CPU temperature down was:
a) Setting
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_pstate=disable"
and then choosing "PowerSave" mode in cpufreq.

b) Installing laptop-mode-tools.
Edited file: /etc/laptop-mode/laptop-mode.conf
This changed /proc/sys/vm/laptop_mode to 2.
After that, there were no significant changes in temperature when moving from AC to battery.

Attached is hardinfo_report.html, generated after applying the procedures above.

information type: Public → Public Security
Artem (pathfinder1010) wrote :

Acer Aspire A715-71G. Same problem. The laptop overheats when I switch on the NVIDIA proprietary driver. On Ubuntu 17.10 it worked fine.

Tomoki Kosugi (bp-kosugi) wrote :

Same problem. I use a laptop PC with an Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz and two GeForce GTX 1080 GPUs.
DISTRIB_DESCRIPTION="Ubuntu 18.04.3 LTS".
Linux kernel version is "Linux anomalydetect 4.15.0-66-generic".

I tried "cpupower", which controls the CPU clock frequency.
cf. https://wiki.archlinux.org/index.php/CPU_frequency_scaling

# cpupower frequency-set -u clock_freq

I tried the above with "clock_freq" set to "3GHz". (My CPU's default max frequency is "4.5GHz".)

With that, "cpupower" kept the CPU temperature low without slowing down my application!
What do you think about this solution?
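
As a rough illustration of what such a cap amounts to, the sketch below writes an upper limit to the cpufreq sysfs files directly. It assumes the standard per-CPU `scaling_max_freq` files (values in kHz) under /sys/devices/system/cpu and root privileges. `parse_freq_khz` and `cap_max_freq` are hypothetical helper names, and treating a bare number as kHz is an assumption of this sketch.

```python
from pathlib import Path

# Multipliers to convert a unit suffix to kHz, the unit used by cpufreq sysfs.
_UNITS_KHZ = {"khz": 1, "mhz": 1_000, "ghz": 1_000_000}


def parse_freq_khz(text):
    """Convert strings such as '3GHz' or '2400MHz' to an integer kHz value."""
    value = text.strip().lower()
    for suffix, factor in _UNITS_KHZ.items():
        if value.endswith(suffix):
            return int(round(float(value[: -len(suffix)]) * factor))
    return int(value)  # assumption: a bare number is already in kHz


def cap_max_freq(limit_khz, cpu_root=Path("/sys/devices/system/cpu")):
    """Write limit_khz to every cpuN/cpufreq/scaling_max_freq under cpu_root."""
    changed = []
    for path in sorted(cpu_root.glob("cpu[0-9]*/cpufreq/scaling_max_freq")):
        path.write_text(str(limit_khz))
        changed.append(path)
    return changed
```

Calling `cap_max_freq(parse_freq_khz("3GHz"))` as root would apply a 3 GHz ceiling until the next reboot or until another tool changes the value. The `cpu_root` parameter exists only so the function can be tried against a scratch directory first.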

Davide Sangalli (davide-sangalli) wrote :

An update. After some digging, my conclusion is that the problem in my case is not Linux but might be the Asus BIOS. Even just entering the BIOS, the CPU temperature rises to 70-80 degrees. I've also upgraded the BIOS, but without significant improvement.

I have the impression that the BIOS has problems dealing with the Intel Turbo Boost technology of my 8th-generation i7. After disabling turbo from the Linux side (I couldn't from the BIOS), the temperatures of my laptop are now fine.
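
The comment does not say which mechanism was used to disable turbo. One possibility, assuming the intel_pstate driver is active, is its `no_turbo` switch in sysfs, where writing 1 disables turbo and 0 allows it again. The helper names below are illustrative, and writing to the file requires root.

```python
from pathlib import Path

# intel_pstate's global turbo switch; only present when that driver is in use.
NO_TURBO = Path("/sys/devices/system/cpu/intel_pstate/no_turbo")


def turbo_disabled(path=NO_TURBO):
    """Return True if the no_turbo switch currently reads 1."""
    return path.read_text().strip() == "1"


def set_turbo_disabled(disabled, path=NO_TURBO):
    """Write 1 (turbo off) or 0 (turbo allowed) to the no_turbo switch."""
    path.write_text("1" if disabled else "0")
```

The setting does not survive a reboot, so it would have to be reapplied at boot if it turns out to help.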

Davide Sangalli (davide-sangalli) wrote :

I should also mention that, monitoring the clock of my CPU with i7z, I get different frequencies for each core when Intel turbo is not active:
        Core [core-id] :Actual Freq (Mult.) C0% Halt(C1)% C3 % C6 % Temp VCore
        Core 1 [0]: 1126.19 (11.23x) 7.99 93.9 1 1 40 0.6274
        Core 2 [1]: 1239.87 (12.36x) 3.2 95.1 1 2.11 41 0.6208
        Core 3 [2]: 1577.66 (15.73x) 3.7 95.4 1 1 38 0.6257
        Core 4 [3]: 1316.89 (13.13x) 7.23 81.9 1 12.8 39 0.6257
        Core 5 [4]: 1572.58 (15.68x) 10.2 87 1 4.73 39 0.6257
        Core 6 [5]: 1916.98 (19.11x) 8.02 90.6 1 1.44 39 0.6266

However, as soon as I load one core to 100%, all cores go to the maximum frequency (here I'm limiting the maximum, and the core is also undervolted, to keep the temperature stable):
        Core [core-id] :Actual Freq (Mult.) C0% Halt(C1)% C3 % C6 % Temp VCore
        Core 1 [0]: 2891.95 (28.83x) 100 0 0 0 52 0.8234
        Core 2 [1]: 2891.77 (28.83x) 2.41 92.2 1 3.6 47 0.8229
        Core 3 [2]: 2891.72 (28.83x) 1.73 94.4 1 2.37 44 0.8225
        Core 4 [3]: 2891.81 (28.83x) 1 91.5 1 6.46 44 0.8230
        Core 5 [4]: 2892.39 (28.83x) 1 96.9 1 1.37 44 0.8242
        Core 6 [5]: 2891.94 (28.83x) 25 62.9 1 3.3 43 0.8243

Doug Smythies (dsmythies) wrote :

@Davide Sangalli: What you describe and show in your comment #80 is correct and exactly what should happen. Your processor is spending an extraordinary amount of time in C1 as opposed to deeper idle states. At what sampling frequency do you run i7z? I would suggest once every 15 seconds or so, so that you don't wake up CPUs just to ask them status stuff.

Everyone with this problem: If you don't bisect the kernel to isolate the offending commit, as has been asked for a couple of times now, then there will never be any progress for this bug report.

Davide Sangalli (davide-sangalli) wrote :

@Doug Smythies: thanks for the reply.
I'm sampling every second.

Anyway, just to better understand.
When my system is idle, all cores spend less than 10% of the time in C0, and the frequency of each core reported by i7z is different and below 2.2GHz. When I run a single-core (and single-thread) task, only core "0" is in the C0 state 100% of the time (as it should be), but I would have expected only that core to reach ~2.9GHz and the others to remain below. Instead, all cores go to ~2.9GHz.
https://www.youtube.com/watch?v=HBKfkiOgIv4 (sorry for the oversimplified reference)

Doug Smythies (dsmythies) wrote :

There is only one PLL (Phase Locked Loop) in the processor, and all CPUs get the resulting clock. When they go into deep idle states (deeper than C1, at least for my processor), they give up their vote into the PLL as to what the frequency should be. Your single 100% task is dictating the CPU frequency for all of them, for when they are active.
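
The voting rule described here can be written down as a toy model. This is only a sketch of the explanation in this comment, not of the real hardware logic: it ignores turbo limits and everything else the processor takes into account, and the function name and the example ratios are made up for illustration.

```python
def shared_clock_ratio(requests, deep_idle):
    """Toy model: the highest request among voting CPUs sets the shared clock.

    requests  -- requested frequency ratio (multiplier) per CPU
    deep_idle -- per CPU, True if it is in an idle state deeper than C1 and
                 therefore has no vote
    Returns the winning ratio, or None if no CPU is voting.
    """
    votes = [req for req, idle in zip(requests, deep_idle) if not idle]
    return max(votes) if votes else None


# Illustrative numbers only: CPU 2 runs a 100% load and asks for 29x, CPU 0 is
# awake but lightly loaded, and the others are in deep idle.
print(shared_clock_ratio([12, 12, 29, 12], [False, True, False, True]))
```

In this model CPU 0 also runs at the ratio requested by CPU 2 whenever it is awake, which matches the observation that all cores report about the same frequency once one core is fully loaded.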

Davide Sangalli (davide-sangalli) wrote :

I guess you mean "there is only one PLL in the CPU, all cores get the resulting clock". This would mean that the i7z readout in the idle case, where each core has a different multiplier and a different clock, is fake.

Doug Smythies (dsmythies) wrote :

Each CPU can have a different request into the PLL (phase locked loop), but the highest one wins, and their vote does not count if they are in an idle state deeper than C1. You can observe this manually by reading the pstate request and granted MSRs directly (requires msr-tools, and the msr module must already be loaded). Example (requested, granted, 100% load on CPU 3):

root@s15:/home/doug/temp-git-git# rdmsr --bitfield 15:8 -d -a 0x199
16
16
16
38
16
16
16
16
root@s15:/home/doug/temp-git-git# rdmsr --bitfield 15:8 -d -a 0x198
36
35
36
36
36
36
36
36

Where 38 is the max and 16 is the min for my i7-2600K.
Why only 36 granted? I guess because three cores were actually active at the time of the sample. From turbostat:

cpu0: MSR_TURBO_RATIO_LIMIT: 0x23242526
35 * 100.0 = 3500.0 MHz max turbo 4 active cores
36 * 100.0 = 3600.0 MHz max turbo 3 active cores
37 * 100.0 = 3700.0 MHz max turbo 2 active cores
38 * 100.0 = 3800.0 MHz max turbo 1 active cores
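
The turbostat lines above can be reproduced by splitting the raw MSR value into bytes. This is a minimal Python sketch, assuming the layout implied by that output (the lowest byte is the limit for one active core, the next byte for two, and so on) and a 100 MHz base clock; the function name is illustrative.

```python
def decode_turbo_ratio_limit(msr_value, bclk_mhz=100.0, max_cores=4):
    """Split MSR_TURBO_RATIO_LIMIT into {active_cores: (ratio, mhz)}.

    Assumes one byte per active-core count, lowest byte = 1 active core.
    """
    limits = {}
    for active in range(1, max_cores + 1):
        ratio = (msr_value >> (8 * (active - 1))) & 0xFF
        limits[active] = (ratio, ratio * bclk_mhz)
    return limits


# Print in the same order as turbostat (most active cores first).
for active, (ratio, mhz) in sorted(
    decode_turbo_ratio_limit(0x23242526).items(), reverse=True
):
    print(f"{ratio} * 100.0 = {mhz:.1f} MHz max turbo {active} active cores")
```

With three cores active, the lookup gives a ratio of 36, which is the value seen in the granted column.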

Note: I ran the above as root because there is a lot of overhead running it under "sudo", which tends to result in another CPU requesting a high pstate by the time the sample is actually taken.

Now, the different reported CPU frequencies, for your example where there is no dominant load, are another story. While there is only one PLL, your mostly idle CPUs would be "awake" at different times, with the PLL ramping up and down based on which incoming votes and requests are being used.

Hope this helps, and apologies for this digression on this bug report.

Davide Sangalli (davide-sangalli) wrote :

Yep, it helps and it is quite clear.
One more remark, and sorry for continuing the digression here.
If there is a more appropriate channel, we can move there.

On my old laptop, when I load it with a single-threaded application, I get:

  TURBO ENABLED on 2 Cores, Hyper Threading ON
  Max Frequency without considering Turbo 1894.72 MHz (99.72 x [19])
  Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is 27x/25x/25x/25x
  Real Current Frequency 2663.14 MHz [99.72 x 26.71] (Max of below)
      Core [core-id] :Actual Freq (Mult.) C0% Halt(C1)% C3 % C6 % C7 % Temp VCore
      Core 1 [0]: 2663.14 (26.71x) 99.3 0 0 0 0 58 0.9207
      Core 2 [2]: 2492.89 (25.00x) 4.8 5.78 1 1 85.6 55 0.9207

The remarkable difference is that the second core spends 86% of the time in the C7 state.
On the new laptop, on the other hand, all cores stay in the C1 state most of the time.
The same happens without load (see my previous post for data on the new laptop).

  TURBO ENABLED on 2 Cores, Hyper Threading ON
  Max Frequency without considering Turbo 1894.72 MHz (99.72 x [19])
  Max TURBO Multiplier (if Enabled) with 1/2/3/4 Cores is 27x/25x/25x/25x
  Real Current Frequency 1111.72 MHz [99.72 x 11.15] (Max of below)
      Core [core-id] :Actual Freq (Mult.) C0% Halt(C1)% C3 % C6 % C7 % Temp VCore
      Core 1 [0]: 1019.47 (10.22x) 11.9 9.92 1 0 82.3 46 0.7955
      Core 2 [2]: 1111.72 (11.15x) 12 16.1 1 0 75.5 50 0.7955

In both cases the driver is intel_pstate, and in both cases I'm monitoring with i7z under sudo at a 1-second sampling interval.
Old processor is Intel Core i5-3337U
New processor is Intel Core i7-8750H
