[SRF] performance: hwmon: (coretemp) Fix core count limitation

Bug #2058668 reported by Thibf
This bug affects 1 person
Affects         Status   Importance  Assigned to  Milestone
linux (Ubuntu)  New      Undecided   Unassigned
Jammy           Invalid  Undecided   Thibf

Bug Description

This is a public version of https://bugs.launchpad.net/bugs/2049792

Backport: [SRF] performance: hwmon: (coretemp) Fix core count limitation (merged upstream in 6.9) to jammy

[Impact]

In Linux 6.8 the coretemp driver supports at most 128 cores per package.
Cores beyond that limit lose their core temperature information.

An upstream patch set adds support for more than 128 cores per package.
It has been applied to linux-next, and then to Noble.

We should apply the patch set to the Jammy 5.15 kernel, so that we can
properly support systems with a large number of cores per package.

[Test case]

Read temperature info from /sys/class/hwmon on a system with > 128 cores
per package (that means we don't have a proper test case to verify the
fix at the moment).
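
For whoever does get access to such a system, the read step above could be sketched roughly as follows. This is only an illustrative helper, not part of the patch set: the function name read_coretemp and its root parameter are made up here, and it assumes the standard hwmon sysfs attributes (name, tempN_label, tempN_input in millidegrees Celsius) with coretemp labels starting with "Core ".

```python
import glob
import os


def read_coretemp(root="/sys/class/hwmon"):
    """Return {(hwmon_dir, label): degrees_celsius} for coretemp devices.

    Hypothetical helper for illustration; 'root' is a parameter only so
    the function can be pointed at a fake tree for testing.
    """
    readings = {}
    for hwmon in sorted(glob.glob(os.path.join(root, "hwmon*"))):
        # Skip hwmon devices that are not driven by coretemp.
        try:
            with open(os.path.join(hwmon, "name")) as f:
                if f.read().strip() != "coretemp":
                    continue
        except OSError:
            continue
        for label_path in sorted(glob.glob(os.path.join(hwmon, "temp*_label"))):
            input_path = label_path[: -len("_label")] + "_input"
            try:
                with open(label_path) as f:
                    label = f.read().strip()
                with open(input_path) as f:
                    millideg = int(f.read().strip())
            except (OSError, ValueError):
                # A sensor that cannot be read is simply left out.
                continue
            readings[(os.path.basename(hwmon), label)] = millideg / 1000.0
    return readings


if __name__ == "__main__":
    temps = read_coretemp()
    for (dev, label), value in sorted(temps.items()):
        print(f"{dev} {label}: {value:.1f} C")
    # Count per-core sensors for each hwmon device (one device per package).
    per_device = {}
    for dev, label in temps:
        if label.startswith("Core "):
            per_device[dev] = per_device.get(dev, 0) + 1
    for dev, count in sorted(per_device.items()):
        print(f"{dev}: {count} core sensors")
```

On a fixed kernel, the per-device core sensor count would be expected to match the number of physical cores in that package, including when it exceeds 128.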

[Fix]

The following series of patches makes up this improvement:

1a793caf6f69 hwmon: (coretemp) Use dynamic allocated memory for core temp_data
18b24a5f9ca3 hwmon: (coretemp) Remove redundant temp_data->is_pkg_data
326241f71f3d hwmon: (coretemp) Split package temp_data and core temp_data
b0b01414a261 hwmon: (coretemp) Abstract core_temp helpers
87eb801925a0 hwmon: (coretemp) Remove redundant pdata->cpu_map[]
18d8f5583388 hwmon: (coretemp) Replace sensor_device_attribute with device_attribute
25f8e01baa05 hwmon: (coretemp) Remove unnecessary dependency of array index
c8c2074020a8 hwmon: (coretemp) Introduce enum for attr index

Some additional patches are required to make the backport apply cleanly:

34cf8c657cf03 hwmon: (coretemp) Enlarge per package core count limit
fdaf0c8629d45 hwmon: (coretemp) Fix bogus core_id to attr name mapping
4e440abc89458 hwmon: (coretemp) Fix out-of-bounds memory access
a2930f6dc90f0 hwmon: (coretemp) Delete an obsolete comment
6c2b659913ad9 hwmon: (coretemp) Delete tjmax debug message
0f8b916bc5b5d hwmon: (coretemp) avoid RDMSR interrupts to isolated CPUs
fae30e3c203e0 hwmon: (coretemp) Add support for dynamic ttarget
c0c67f8761cec hwmon: (coretemp) Add support for dynamic tjmax
2bc0e6d07ee50 hwmon: (coretemp) rearrange tjmax handing code
5c0e64dde80ff hwmon: (coretemp) Remove obsolete temp_data->valid

Only 5c0e64dde80ff has to be modified, because it deletes a variable whose
type was changed by a refactoring.

There are a number of commits, but they all change only one file.

[Regression potential]

We may experience hwmon-related regressions, either systems reading
incorrect temperature information or even bugs/crashes when accessing
data from /sys/class/hwmon.

Thibf (thibf)
description: updated
Changed in linux (Ubuntu Jammy):
status: New → In Progress
assignee: nobody → Thibf (thibf)
Thibf (thibf) wrote :

Patchset submitted to the mailing list

Thibf (thibf)
Changed in linux (Ubuntu Jammy):
status: In Progress → Invalid