aticonfig hangs if discrete GPU is disabled, causing indicator-sensors to hang

Bug #1016896 reported by htrex on 2012-06-23
This bug affects 4 people
Affects: Hardware Sensors Indicator; Importance: Undecided; Assigned to: Unassigned
Affects: fglrx; Status: Confirmed; Importance: Medium

Bug Description

I installed 0.2-1 from the PPA. After a reboot, nothing appears on the panel if the discrete GPU is disabled.

I'm using a multi-GPU laptop with an Intel i7 Sandy Bridge + ATI 6770 and catalyst 12.04. Fortunately, the HP DV6-6xxx laptop I'm using has a BIOS switch to set fixed-mode GPU switching, and I can successfully change GPU just by restarting X.

When I launch indicator-sensors from the command line, it hangs while searching for ATI sensors and uses 100% of one CPU core indefinitely. There's no problem if the discrete GPU is enabled, but I can't use it most of the time, as working with the integrated GPU is usually enough.

aticonfig --list-adapters
* 0. 01:00.0 AMD Radeon HD 6700M Series

* - Default adapter

and

aticonfig --px-list-active-gpu
PowerXpress: Integrated GPU is active (Power-Saving mode).

It could be useful to check whether indicator-sensors is running on a multi-GPU setup and whether the discrete GPU is active.

Please have a look at it, a 40°C summer is here! :)

indicator-sensors
[manager] WARNING: Failed to load sensor configs from file /home/htrex/.config/indicator-sensors/sensors: No such file or directory
[nvidia] DEBUG: searching for sensors
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sdc3 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sdc4 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sdc5 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sdc does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sdc6 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sdc7 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sdb does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sdc8 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sr0 does not support SMART monitoring, ignoring...
[store] DEBUG: inserted sensor udisks/sda with label SAMSUNG SSD 830 Series
[application] MESSAGE: Unable to restore saved label for sensor udisks/sda: Key file does not have group 'udisks/sda'
[application] MESSAGE: Unable to restore saved alarm-value for sensor udisks/sda: Key file does not have group 'udisks/sda'
[application] MESSAGE: Unable to restore saved alarm-mode for sensor udisks/sda: Key file does not have group 'udisks/sda'
[application] MESSAGE: Unable to restore saved low-value for sensor udisks/sda: Key file does not have group 'udisks/sda'
[application] MESSAGE: Unable to restore saved high-value for sensor udisks/sda: Key file does not have group 'udisks/sda'
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda1 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda2 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda3 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda4 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda5 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda6 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sdc1 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sdc2 does not support SMART monitoring, ignoring...
[libsensors] DEBUG: searching for sensors
[store] DEBUG: inserted sensor libsensors/acpitz-virtual-0/0 with label temp1
[application] MESSAGE: Unable to restore saved label for sensor libsensors/acpitz-virtual-0/0: Key file does not have group 'libsensors/acpitz-virtual-0/0'
[application] MESSAGE: Unable to restore saved alarm-value for sensor libsensors/acpitz-virtual-0/0: Key file does not have group 'libsensors/acpitz-virtual-0/0'
[application] MESSAGE: Unable to restore saved alarm-mode for sensor libsensors/acpitz-virtual-0/0: Key file does not have group 'libsensors/acpitz-virtual-0/0'
[application] MESSAGE: Unable to restore saved low-value for sensor libsensors/acpitz-virtual-0/0: Key file does not have group 'libsensors/acpitz-virtual-0/0'
[application] MESSAGE: Unable to restore saved high-value for sensor libsensors/acpitz-virtual-0/0: Key file does not have group 'libsensors/acpitz-virtual-0/0'
[aticonfig] DEBUG: searching for sensors

Are you sure it has hung? It could be that it just hasn't detected any valid ATI GPU.

It works perfectly when I switch to the discrete GPU and hangs when I use the integrated one. So I'm guessing that when the discrete GPU is disabled, indicator-sensors still checks for its sensor but doesn't receive any response because it's turned off?

Alex Murray (alexmurray) wrote :

I think you are misunderstanding the issue - aticonfig lists only one adapter:

aticonfig --list-adapters
* 0. 01:00.0 AMD Radeon HD 6700M Series

Which is the discrete ATI GPU - so it's not a bug that it isn't detected when you disable this GPU.

The integrated GPU is the Intel one and hence can't be read by aticonfig anyway and so again it isn't surprising that aticonfig can't read any temperature when the discrete GPU is disabled.

indicator-sensors hasn't hung either, BTW - it just hasn't detected any ATI GPUs when the discrete one is disabled, as there is no active ATI GPU to detect and hence no temperature to display.

Closing as invalid, please reopen if you think this is not the case.

Changed in indicator-sensors:
status: New → Invalid
htrex (hantarex) wrote :

On my setup indicator-sensors works only when the discrete GPU is active, while when the integrated GPU is active it doesn't start at all and eats 100% of a CPU core.

I'm not expecting indicator-sensors to read the discrete GPU temperature when it is inactive. On the contrary, it shouldn't try to read the sensor when that GPU is switched off; this seems to be the cause of the hang I'm seeing.

aticonfig --px-list-active-gpu

says

PowerXpress: Integrated GPU is active (Power-Saving mode).

when the discrete GPU is not in use; in this case indicator-sensors should give up looking for the ATI sensor.

Hope the problem is clearer now.
Thanks

Changed in indicator-sensors:
status: Invalid → New
Alex Murray (alexmurray) wrote :

Ahh okay sorry I misunderstood.

Can you please post the full output of the following commands when the discrete GPU is disabled and I'll see if I can reproduce it using this output and try and fix the problem:

aticonfig --list-adapters
aticonfig --od-gettemperature --adapter=0
aticonfig --pplib-cmd "get fanspeed 0"

htrex (hantarex) wrote :

Here are the results:

aticonfig --list-adapters
* 0. 01:00.0 AMD Radeon HD 6700M Series

* - Default adapter

aticonfig --od-gettemperature --adapter=0
^^^ this command doesn't give any result; it hangs and uses 100% CPU until I stop it with CTRL+C
^^^ (this line is a comment of mine, not actual command output!)

aticonfig --pplib-cmd "get fanspeed 0"
PPLIB command execution has failed!
ati_pplib_cmd: execute "get" failed!

Alex Murray (alexmurray) wrote :

Ahh, so it's aticonfig itself that is hanging, which in turn causes indicator-sensors to hang, as indicator-sensors is waiting for aticonfig to return. I'll see what I can do to work around this and stop aticonfig from hanging indicator-sensors. I suggest you report a bug to ATI if possible, since their tool is obviously buggy.

summary: - indicator-sensors hangs on a multi-GPU laptop
+ aticonfig hangs if discrete GPU is disabled, causing indicator-sensors
+ to hang
htrex (hantarex) wrote :

I'll try to report the bug, but I'm not sure where, or whether they care enough to fix it; my real hopes are in you! ;)

For the workaround, the most solid way I can think of to determine whether a GPU is active is to look for "VGA controller" next to it in the lspci output.

----------------------------
DISCRETE ENABLED
----------------------------
lspci -vnnn | grep VGA
00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0116] (rev 09) (prog-if 00 [VGA controller])
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices [AMD] nee ATI Whistler XT [AMD Radeon HD 6700M Series] [1002:6740] (prog-if 00 [VGA controller])

----------------------------
DISCRETE DISABLED
----------------------------
lspci -vnnn | grep VGA
00:02.0 VGA compatible controller [0300]: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0116] (rev 09) (prog-if 00 [VGA controller])
01:00.0 VGA compatible controller [0300]: Advanced Micro Devices [AMD] nee ATI Whistler XT [AMD Radeon HD 6700M Series] [1002:6740] (rev ff) (prog-if ff)

From the command output on my laptop it is clear that the Intel integrated VGA is never really disabled (from other literature we know that it routes the VGA signal to the monitor in any case), so the most solid evidence is "VGA controller", which is not present on a disabled GPU.
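The lspci check described above could be sketched roughly as follows. This is only an illustration in Python, not code from indicator-sensors; the function name `discrete_gpu_enabled` and the default slot `01:00.0` (taken from the output pasted in this comment) are assumptions, and the check relies on the "[VGA controller]" text observed on this one laptop.

```python
def discrete_gpu_enabled(lspci_output: str, slot: str = "01:00.0") -> bool:
    """Guess whether the GPU at `slot` is enabled from `lspci -vnnn` text.

    Assumption (from the output in this report): an enabled GPU shows
    "(prog-if 00 [VGA controller])", while a disabled one shows
    "(rev ff) (prog-if ff)" with no "[VGA controller]" text.
    """
    for line in lspci_output.splitlines():
        # lspci starts each device line with its bus address.
        if line.startswith(slot):
            return "[VGA controller]" in line
    # Device not listed at all: treat it as not enabled.
    return False
```

The text passed in would be the output of `lspci -vnnn | grep VGA` as shown above; other machines may report a disabled GPU differently, so this would need checking on each setup.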

Hope this helps and thanks again for looking into that :)

htrex (hantarex) wrote :

I've created a bug for aticonfig on the unofficial bug tracker http://ati.cchtml.com/show_bug.cgi?id=543 listed on the official AMD Catalyst for linux page http://support.amd.com/us/gpudownload/linux/Pages/radeon_linux.aspx

It will probably sit there forever as NEW, though.

Alex Murray (alexmurray) wrote :

Hopefully ATI can fix it but I'll see what I can do about implementing a work-around nonetheless.

htrex (hantarex) wrote :

Great Alex, consider me as your first beta tester ;)

htrex (hantarex) wrote :

update: catalyst 12.8 was released and I can confirm this nifty bug is still there, no joy so far....

Nick S (pinkcoons) wrote :

Can confirm this on ATI 7730M. Guess we'll have to wait until ATI fixes this in Catalyst (i.e. forever). Until then, maybe you could just set a timeout for the aticonfig command?

In the meantime I've "fixed" it by moving the aticonfig plugin:

sudo mv /usr/lib/indicator-sensors/plugins/aticonfig/aticonfig.plugin{,.broken}

Alex Murray (alexmurray) wrote :

If I could easily set a timeout for the command I would, but there is no existing API to do this. Instead, a separate thread has to be created, the command run from there, and the result passed back, which is a reasonable bit of work to implement. Hopefully I can implement a workaround soon though; I just need to find the time.
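For comparison only, here is what the timeout idea looks like in a language whose standard library does provide one. This is a Python sketch, not the plugin's C code and not the approach taken for this bug; the helper name `run_with_timeout` is hypothetical, and whether killing a hung aticonfig process actually works on an affected machine is untested here.

```python
import subprocess
from typing import Optional, Sequence


def run_with_timeout(cmd: Sequence[str], timeout_s: float) -> Optional[str]:
    """Run `cmd` and return its stdout, or None on failure or timeout."""
    try:
        # subprocess.run kills the child and waits for it if the timeout expires.
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except (subprocess.TimeoutExpired, OSError):
        # Timed out, or the command could not be started at all.
        return None
    if result.returncode != 0:
        return None
    return result.stdout
```

A caller could then try something like `run_with_timeout(["aticonfig", "--od-gettemperature", "--adapter=0"], 2.0)` and skip the ATI sensor when it returns `None`; the two-second limit is an arbitrary choice for illustration.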

htrex (hantarex) wrote :

tnx for the workaround Nick.

Alex what about looking for this?
aticonfig --pplib-cmd "get fanspeed 0"

exits with an error when the discrete GPU is disabled.

Alex Murray (alexmurray) wrote :

Yeah actually it turns out that even if we run the command in a separate thread there is still no nice way to kill it if it hasn't returned.

@htrex - what if there is a valid card which doesn't support reading the fan speed? In that case we would get the error from running "get fanspeed", but the card would still be valid and trying to read its temperature would still work.

Nick Andrik (andrikos) wrote :

@Alex:
What about if you run BOTH
> aticonfig --pxl
PowerXpress: Integrated GPU is active (Power-Saving mode).
AND
> aticonfig --pplib-cmd "get fanspeed 0"
PPLIB command execution has failed!
ati_pplib_cmd: execute "get" failed!

This should be enough indication not to try and get the ATI GPU temperature, no?

Personally, I would check only the first one, since it shows that a hybrid system exists and that the discrete (ATI) GPU is disabled.

What do you think?

Changed in fglrx:
importance: Unknown → Medium
status: Unknown → Confirmed
Alex Murray (alexmurray) wrote :

Okay, I've added a workaround[1] as you suggest: it simply checks whether the output of aticonfig --pxl contains the expression "Integrated GPU is active", in which case we bail. This should be in the next daily build[2], so if you could please test that and let me know if it avoids this issue, that'd be great.

[1] https://github.com/alexmurray/indicator-sensors/commit/e2f06c4fa272d69604b311fe611bdc0452548ae3
[2] https://code.launchpad.net/~alexmurray/+archive/indicator-sensors-daily
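The logic of that check can be illustrated with a short Python sketch. The real workaround lives in the plugin's C code (see the commit in [1]); the function names below are hypothetical and only mirror the substring test described in this comment.

```python
def integrated_gpu_active(pxl_output: str) -> bool:
    """True if `aticonfig --pxl` output says the integrated GPU is active."""
    return "Integrated GPU is active" in pxl_output


def should_probe_ati_sensors(pxl_output: str) -> bool:
    """Only query ATI temperature and fan sensors when the check passes.

    Output without the marker string (for example from a non-hybrid
    system) is treated as "go ahead and probe".
    """
    return not integrated_gpu_active(pxl_output)
```

With the output quoted earlier in this report, "PowerXpress: Integrated GPU is active (Power-Saving mode).", `should_probe_ati_sensors` returns `False`, so the call that hangs is never made.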

Nick Andrik (andrikos) wrote :

Hi Alex,

I tried your new version, but the indicator still hangs.
My guess is that this happens due to the get_temperature call in is-aticonfig-plugin.c:328

I have also verified that this command never returns when iGPU is enabled:
aticonfig --od-gettemperature --adapter=0

Alex Murray (alexmurray) wrote :

@Nick - can you try running indicator-sensors from a terminal and attaching the output? I've just committed some changes that should add a bit more debugging output, so the next daily build may be more useful, but if you could post the output you're currently getting, that'd be great.


This is what I get when I have the discrete gpu enabled

Nikos

> indicator-sensors
[indicator] DEBUG: new primary sensor path libsensors/acpitz-virtual-0/0 (previously (null))
[indicator] DEBUG: Setting primary sensor path to: libsensors/acpitz-virtual-0/0
[nvidia] DEBUG: searching for sensors
[libsensors] DEBUG: searching for sensors
[store] DEBUG: inserted sensor libsensors/acpitz-virtual-0/0 with label temp1
[indicator] DEBUG: Creating menu item for newly enabled sensor libsensors/acpitz-virtual-0/0
[indicator] DEBUG: Using sensor with path libsensors/acpitz-virtual-0/0 as primary
[indicator] DEBUG: Checking new primary sensor item
[store] DEBUG: inserted sensor libsensors/acpitz-virtual-0/2 with label temp2
[indicator] DEBUG: Creating menu item for newly enabled sensor libsensors/acpitz-virtual-0/2
[store] DEBUG: inserted sensor libsensors/coretemp-isa-0000/0 with label Physical id 0
[store] DEBUG: inserted sensor libsensors/coretemp-isa-0000/4 with label Core 0
[store] DEBUG: inserted sensor libsensors/coretemp-isa-0000/8 with label Core 1
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda1 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda2 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda3 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda4 does not support SMART monitoring, ignoring...
[store] DEBUG: inserted sensor udisks/sda with label TOSHIBA MK6459GSXP
[indicator] DEBUG: Creating menu item for newly enabled sensor udisks/sda
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda5 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/mmcblk0 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda7 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sda6 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/sr0 does not support SMART monitoring, ignoring...
[udisks] DEBUG: drive /org/freedesktop/UDisks/devices/mmcblk0p1 does not support SMART monitoring, ignoring...
[aticonfig] DEBUG: searching for sensors
[store] DEBUG: inserted sensor aticonfig/GPU0Temperature with label AMD Radeon 6600M and 6700M Series
[indicator] DEBUG: Creating menu item for newly enabled sensor aticonfig/GPU0Temperature
ati_pplib_cmd: execute "get" failed!
[aticonfig] WARNING: Error getting fanpeed for adapter 0: Error reading fanspeed value for GPU 0
[dbus-plugin] DEBUG: Acquired a message bus connection

[dbus-plugin] DEBUG: Creating an ActiveSensor at path
/com/github/alexmurray/IndicatorSensors/ActiveSensors/libsensors/acpitz_virtual_0/0

[dbus-plugin] DEBUG: Creating an ActiveSensor at path
/com/github/alexmurray/IndicatorSensors/ActiveSensors/libsensors/acpitz_virtual_0/2

[dbus-plugin] DEBUG: Creating an ActiveSensor at path
/com/github/alexmurray/IndicatorSensors/ActiveSensors/udisks/sda

[dbus-plugin] DEBUG: Creating an ActiveSensor at path
/com/github/alexmurray/IndicatorSensors/ActiveSens...


Alex Murray (alexmurray) wrote :

@Nick - it looks like you're still running the version without the workaround since it should say something like:

[aticonfig] DEBUG: Checking for hybrid system with integrated GPU active

Can you make sure you're running the latest version from the daily ppa - also this bug relates to aticonfig hanging when the INTEGRATED GPU is active so can you please make sure you test it with the integrated one active, not the discrete one.

Nick Andrik (andrikos) wrote :

I attach both logs, i.e. with the discrete and with the integrated GPU selected.
In both cases the applet works OK (no more 100% CPU).
If integrated is selected, I get no temperature sensor for AMD in the applet.

Alex Murray (alexmurray) wrote :

@Nick- great, this is the expected result. Am closing this bug then as 'fixed' against indicator-sensors, but if the real bug in aticonfig gets fixed then we can look at backing out this workaround...

Changed in indicator-sensors:
status: New → Fix Committed
Nick Andrik (andrikos) wrote :

Please let us know when you will have a new official release out, so
that we can upgrade.
Actually it would be perfect if the daily builds have a higher version
than the official ones ;)

Thanks for your project
Nikos

Alex Murray (alexmurray) on 2013-02-09
Changed in indicator-sensors:
status: Fix Committed → Fix Released