Fanless systems with DPTF shutdown before using any passive cooling device

Bug #1803360 reported by Kai-Heng Feng
This bug affects 1 person
Affects            Status        Importance  Assigned to     Milestone
thermald (Ubuntu)  Fix Released  High        Colin Ian King
  Xenial           Won't Fix     Undecided   Unassigned
  Bionic           Fix Released  High        Colin Ian King
  Cosmic           Fix Released  High        Colin Ian King
  Disco            Fix Released  High        Colin Ian King

Bug Description

== SRU Justification, Cosmic, Bionic, Xenial ==

Some new fanless platforms use DPTF's virtual sensor instead of INT340X
devices.

Because of that, _PSV is no longer in use, at least not directly, so its
value may be set higher than _CRT. On a fanless system this means that no
cooling device is activated before _CRT is reached, so the Linux kernel
considers the system overheated and shuts it down.
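
The trip-point ordering described above can be checked from user space. The following is a minimal Python sketch, not part of thermald or of the upstream fix. It assumes the kernel exposes each zone's trips as trip_point_N_type and trip_point_N_temp files under /sys/class/thermal, with temperatures in millidegrees Celsius. The function names are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def read_trips(zone_dir):
    """Return {trip_type: [temps in millidegrees C]} for one thermal zone directory."""
    trips = {}
    for type_file in sorted(Path(zone_dir).glob("trip_point_*_type")):
        # trip_point_0_type pairs with trip_point_0_temp, and so on.
        temp_file = type_file.with_name(type_file.name.replace("_type", "_temp"))
        try:
            trip_type = type_file.read_text().strip()
            temp = int(temp_file.read_text().strip())
        except (OSError, ValueError):
            continue  # skip trips that cannot be read or parsed
        trips.setdefault(trip_type, []).append(temp)
    return trips


def passive_above_critical(trips):
    """True when a critical trip exists and no passive trip sits below it.

    A zone with a critical trip but no passive trip at all also counts.
    """
    critical = trips.get("critical")
    if not critical:
        return False
    lowest_critical = min(critical)
    return all(p >= lowest_critical for p in trips.get("passive", []))


def find_suspect_zones(base="/sys/class/thermal"):
    """List thermal zone names whose passive trips cannot fire before critical."""
    return [
        zone.name
        for zone in sorted(Path(base).glob("thermal_zone*"))
        if passive_above_critical(read_trips(zone))
    ]


if __name__ == "__main__":
    for name in find_suspect_zones():
        print(name)
```

A zone reported by this sketch is only a candidate for the problem in this bug: whether the machine actually shuts down also depends on whether anything else throttles it before the critical temperature.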

== Fix ==

Upstream fix: https://github.com/intel/thermal_daemon/commit/97976782dd26b4d592ccb97eb89c2a3a871a22a9

== Testing ==

Exercise the CPUs on a fanless INT340X device with ACPI _CRT objects and try to reach the trip point. Without the fix, thermal overrun occurs and trips a CPU shutdown. With the fix, thermald starts throttling the system and brings it out of the thermal overrun zone.
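
While running such a test, it can help to watch how close a zone gets to its critical trip. The following Python sketch is one possible way to do that and is not part of the SRU test plan. It assumes the standard sysfs thermal attributes (temp, trip_point_N_type, trip_point_N_temp, all in millidegrees Celsius). The zone path passed on the command line and the function names are hypothetical; which zone corresponds to the processor differs between machines.

```python
import sys
import time
from pathlib import Path


def critical_temp(zone_dir):
    """Return the lowest critical trip of a zone in millidegrees C, or None."""
    temps = []
    for type_file in Path(zone_dir).glob("trip_point_*_type"):
        if type_file.read_text().strip() != "critical":
            continue
        temp_file = type_file.with_name(type_file.name.replace("_type", "_temp"))
        temps.append(int(temp_file.read_text().strip()))
    return min(temps) if temps else None


def headroom(zone_dir):
    """Return critical trip minus current temperature, or None without a critical trip."""
    crit = critical_temp(zone_dir)
    if crit is None:
        return None
    current = int((Path(zone_dir) / "temp").read_text().strip())
    return crit - current


def watch(zone_dir, samples=10, interval=1.0):
    """Print the remaining headroom in degrees C once per interval."""
    for _ in range(samples):
        margin = headroom(zone_dir)
        if margin is None:
            print("no critical trip found")
        else:
            print(f"headroom: {margin / 1000:.1f} C")
        time.sleep(interval)


if __name__ == "__main__":
    # Example: python3 watch_zone.py /sys/class/thermal/thermal_zone0
    watch(sys.argv[1])
```

With the fix in place, the headroom should stop shrinking once thermald starts throttling. Without it, the value is expected to keep falling toward zero until the kernel shuts the system down.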

== Regression Potential ==

This modifies the behavior only for INT340X devices with ACPI _CRT objects: thermald now honors this setting. Only a small subset of devices have these objects, and the change will in fact make thermald catch systems before they hit thermal overrun, so the risk of regression is small. The fix has also been reviewed by the thermal experts at Intel, so it seems like a very reasonable workaround for these specific use cases.

Kai-Heng Feng (kaihengfeng) wrote :
Changed in thermald (Ubuntu):
assignee: nobody → Colin Ian King (colin-king)
importance: Undecided → High
status: New → In Progress
Colin Ian King (colin-king) wrote :

Pending on what upstream says about this fix.

Kai-Heng Feng (kaihengfeng) wrote :

@cking,

The PR is now merged, please backport it to Ubuntu's thermald.

Thanks!

Colin Ian King (colin-king) wrote :

OK, will sort that out this week.

description: updated
Changed in thermald (Ubuntu Xenial):
assignee: nobody → Colin Ian King (colin-king)
Changed in thermald (Ubuntu Bionic):
assignee: nobody → Colin Ian King (colin-king)
Changed in thermald (Ubuntu Cosmic):
assignee: nobody → Colin Ian King (colin-king)
status: New → In Progress
importance: Undecided → High
Changed in thermald (Ubuntu Bionic):
importance: Undecided → High
Changed in thermald (Ubuntu Xenial):
importance: Undecided → High
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package thermald - 1.8.0-1ubuntu1

---------------
thermald (1.8.0-1ubuntu1) disco; urgency=medium

  * Honor ACPI _CRT for processor thermal zone (LP: #1803360)
    There are some new fanless platforms use DPTF's virtual sensor
    instead of INT340X devices. Because of that, the _PSV is no
    longer in use, at least not directly, hence its value may set
    higher then _CRT. To a fanless system that means no cooling
    device gets activated before _CRT, so the system will be
    considered overheated by Linux kernel, and gets shutdown by the
    kernel. Upstream fix:
     - 97976782dd26 Honor ACPI _CRT for processor thermal zone

 -- Colin King <email address hidden> Mon, 14 Jan 2019 23:29:41 +0000

Changed in thermald (Ubuntu Disco):
status: In Progress → Fix Released
Colin Ian King (colin-king) wrote :

The Xenial version of thermald does not support INT340X, so I'm reluctant to backport the fix, as I would first need to add INT340X support, which is quite a few changes. As this adds extra functionality to thermald, I believe it falls outside the remit of an SRU.

Changed in thermald (Ubuntu Bionic):
status: New → In Progress
Kai-Heng Feng (kaihengfeng) wrote :

Ok, then we can leave Xenial out.

Changed in thermald (Ubuntu Xenial):
status: New → Won't Fix
assignee: Colin Ian King (colin-king) → nobody
importance: High → Undecided
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Kai-Heng, or anyone else affected,

Accepted thermald into cosmic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/thermald/1.7.0-8ubuntu1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-cosmic to verification-done-cosmic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-cosmic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in thermald (Ubuntu Cosmic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-cosmic
Changed in thermald (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed-bionic
Brian Murray (brian-murray) wrote :

Hello Kai-Heng, or anyone else affected,

Accepted thermald into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/thermald/1.7.0-5ubuntu2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Colin Ian King (colin-king) wrote :

@Kai-Heng,

do you mind giving these packages a test now?

Thanks.

Alex Tu (alextu) wrote :

Verified on LNG5-DVT1-C2 for Bionic; it works as expected.
image: X36
BIOS:0.5.4
package: thermald 1.7.0-5ubuntu2

tags: added: verification-done-bionic
removed: verification-needed-bionic
Colin Ian King (colin-king) wrote :

@Alex, I don't have the H/W to verify this update for xenial. Do you mind testing it for Xenial too so I can get this released? Thanks!

Colin Ian King (colin-king) wrote :

@Alex, ignore the message in comment #12, I meant to ask you to test it for Cosmic. That would be really helpful. Thanks!

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package thermald - 1.7.0-5ubuntu2

---------------
thermald (1.7.0-5ubuntu2) bionic; urgency=medium

  * Honor ACPI _CRT for processor thermal zone (LP: #1803360)
    There are some new fanless platforms use DPTF's virtual sensor
    instead of INT340X devices. Because of that, the _PSV is no
    longer in use, at least not directly, hence its value may set
    higher then _CRT. To a fanless system that means no cooling
    device gets activated before _CRT, so the system will be
    considered overheated by Linux kernel, and gets shutdown by the
    kernel. Upstream fixes:
     - 7af3eef07dc7 Ignore _TRT and B0D4 device if passive 1 UUID not present
       (prerequisite)
     - 97976782dd26 Honor ACPI _CRT for processor thermal zone

 -- Colin King <email address hidden> Mon, 14 Jan 2019 23:29:41 +0000

Changed in thermald (Ubuntu Bionic):
status: Fix Committed → Fix Released
Adam Conrad (adconrad) wrote : Update Released

The verification of the Stable Release Update for thermald has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

tags: added: verification-done-cosmic
removed: verification-needed-cosmic
Colin Ian King (colin-king) wrote :

I see that the verification-done-cosmic tag has been set, so I'll set the verification-done tag too.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package thermald - 1.7.0-8ubuntu1

---------------
thermald (1.7.0-8ubuntu1) cosmic; urgency=medium

  * Honor ACPI _CRT for processor thermal zone (LP: #1803360)
    There are some new fanless platforms use DPTF's virtual sensor
    instead of INT340X devices. Because of that, the _PSV is no
    longer in use, at least not directly, hence its value may set
    higher then _CRT. To a fanless system that means no cooling
    device gets activated before _CRT, so the system will be
    considered overheated by Linux kernel, and gets shutdown by the
    kernel. Upstream fixes:
     - 7af3eef07dc7 Ignore _TRT and B0D4 device if passive 1 UUID not present
       (prerequisite)
     - 97976782dd26 Honor ACPI _CRT for processor thermal zone

 -- Colin King <email address hidden> Mon, 14 Jan 2019 23:29:41 +0000

Changed in thermald (Ubuntu Cosmic):
status: Fix Committed → Fix Released