Dell 7140 average power consumption in idle increased by 1.3-1.5 times due to events by INT3432

Bug #1719795 reported by RussianNeuroMancer
Affects         Status   Importance  Assigned to  Milestone
Linux           Invalid  Medium
linux (Ubuntu)  Invalid  Medium      Unassigned

Bug Description

On the Dell Venue 11 Pro 7140, average idle power consumption increased by 1.3-1.5 times due to events coming from INT343A. According to powertop, since Linux 4.10 INT3432:00 generates around two hundred events per second on average. In /sys/devices/pci0000:00/INT3432:00/i2c-6 there are two devices: INT343A and SMO91D0. AFAIK INT343A is rt286.

With Linux 4.9.0-4.9.45 and Linux 4.11.0-4.11.12 there are around 100 wakeups per second in total in idle, and the battery discharge rate is around 3-3.5 W.
But with Linux 4.9.46-4.9.51, Linux 4.10.0-4.10.17 and Linux 4.12.0rc1-4.13.3 there are around 300 wakeups per second on average, due to events coming from INT3432:00. With Linux 4.13.3 the battery discharge rate is around 4.5 W.
Probably some commit was backported to Linux 4.9 between the .45 and .46 releases.
I have no idea why the issue is not reproducible on any Linux 4.11 release I tried.

Sometimes the INT3432 event rate falls from two hundred to one hundred for short periods of time (for example, I observe this right now on Linux 4.10.0 while removing/installing packages).

Messages like this sometimes appear in dmesg:
[ 731.226730] i2c_hid i2c-SMO91D0:00: i2c_hid_get_input: incomplete report (53/13568)

Complete dmesg with Linux 4.13.3 is attached.

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 258613
dmesg with Linux 4.13.3

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

> Sometimes events rate fall from two hundred to one hundred for shorts period
> of time

Correction: here I am talking about events coming specifically from INT343A, not the total event rate.

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Sorry, another correction, just to be sure:

> Sometimes events rate fall from two hundred to one hundred for shorts period
> of time

Here I am talking about events coming specifically from *INT3432*, not the total event rate.

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.14 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.14-rc2/

Changed in linux (Ubuntu):
importance: Undecided → Medium
RussianNeuroMancer (russianneuromancer) wrote :

Hello, Joseph!

Issue is still reproducible with 4.14rc3.

tags: added: kernel-bug-exists-upstream
In , kai.heng.feng (kai.heng.feng-linux-kernel-bugs) wrote :

Can you do a bisect between 4.9.45 and 4.9.46?

In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

since there are not too many changes between 4.9.45 and 4.9.46, please do git bisect to find out which commit introduces the problem.

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

> Can you do a bisect between 4.9.45 and 4.9.46?

> since there are not too many changes between 4.9.45 and 4.9.46, please do git
> bisect to find out which commit introduces the problem.

Thanks for the advice! I'll try to do so as soon as possible. (There are some issues with the hardware I usually use for building kernels.)

In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

any updates?

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Not yet. The issues mentioned above remain unresolved, so I still can't rebuild the kernel.

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

The hardware I usually use for building kernels is operational again, so I hope to do a git bisect between 4.9.45 and 4.9.46 in the next couple of weeks.

In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

ping ...

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Hello!

Although I started with an (unexpected for me) delay and it is going slowly (due to the age of the hardware I use for building kernels), the bisect is in progress right now.

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

At every step I tested three times, but unfortunately the commit I arrived at doesn't seem to make any sense for this issue:

5f81b1f51b9cfcbfbe7a1abea09962c91bf485e7 is the first bad commit
commit 5f81b1f51b9cfcbfbe7a1abea09962c91bf485e7
Author: Florian Westphal <email address hidden>
Date: Fri Jul 7 13:07:17 2017 +0200

    netfilter: nat: fix src map lookup

    commit 97772bcd56efa21d9d8976db6f205574ea602f51 upstream.

    When doing initial conversion to rhashtable I replaced the bucket
    walk with a single rhashtable_lookup_fast().
...

I will redo the bisect, doing ten tests at every step this time.
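
For an intermittent issue like this, one common way to run the repeated tests is to record a wakeup rate for each trial and mark the build bad if any trial shows the storm. The following is only a sketch under assumptions: the `classify_build` helper is hypothetical, the rates are typed in by hand from powertop, and the threshold of 200 wakeups per second is an arbitrary value chosen between the roughly 100 and 300 wakeups per second reported above.

```shell
#!/bin/sh
# classify_build THRESHOLD RATE...
# Prints "bad" if any measured rate exceeds THRESHOLD, otherwise "good".
# Assumption: a single storm among the trials is enough to call a build bad.
classify_build() {
    threshold="$1"
    shift
    for rate in "$@"; do
        if [ "$rate" -gt "$threshold" ]; then
            echo bad
            return 0
        fi
    done
    echo good
}

# Example session (not run here):
#   git bisect start
#   git bisect bad v4.9.46
#   git bisect good v4.9.45
#   ... build, boot and measure N times, then:
#   git bisect "$(classify_build 200 95 110 310)"
```

The verdict words match the `git bisect good` and `git bisect bad` subcommands, so the output can be passed straight to `git bisect`. Whether one storm in N trials should really count as bad is a judgment call and not something this thread settles.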

In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

so I suppose we will have some update about the bisect?

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Yes, with additional tests I found that my assumption about Linux 4.9.45 was wrong: 4.9.45 is affected too. The commit to blame seems to be somewhere between 4.9.0 and 4.9.8. (Sorry for the slow progress on this; the build hardware is old and slow, and every test takes much more time now.)

In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

any updates?

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

As I proceed with the bisect (for various reasons I now have to build inside a virtual machine instead of on bare metal, which slows building down several times), I have difficulty determining which builds should be marked as good and which as bad. For example, with 4.9.44 I have seen the issue reproduced a couple of times, but most of the time it doesn't happen with this release. With 4.9.3 the issue doesn't seem to happen at all, but if I boot 4.17.0 and then reboot into 4.9.3, it's there. With 4.9.8 it seems to be the same, but I can't be 100% sure, as the situation could be similar to 4.9.44, where the issue is rare but happens sometimes.
But with 4.9.46 the issue happens every time.

In your opinion, do I need to chase the commit that makes the issue reproduce on every boot, in 100% of attempts, or is it better to search for the commit that makes the issue reproduce at least once? Should I care about reboots from an affected kernel into an unaffected one, as in 4.17.0->4.9.3, or is it possible that the newer kernel somehow puts the hardware into a failed state, which makes the issue happen on unaffected kernels too?

Changed in linux:
importance: Unknown → Medium
status: Unknown → Incomplete
In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

(In reply to RussianNeuroMancer from comment #15)
> As I proceed with bisect (due to various reasons now I have to build inside
> virtual machine instead of bare metal hardware, so this slow down building
> by few times) I have difficulties with determining what build have to be
> marked as good, and what build have to be marked as bad. For example, with
> 4.9.44 I seen issue reproduced couple of times, but most of the time it
> doesn't happen with this release. With 4.9.3 issue seems like doesn't happen
> at all, but if I boot 4.17.0 and then reboot to 4.9.3 - it's there.

this is important, please attach the output of "turbostat --debug" and "powertop --html=foo" for both good and bad case, in 4.9.3 kernel.

As the problem can also be reproduced on 4.9.3, remove the regression flag for now.

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 276553
turbostat on Linux 4.9.3 normal boot

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 276555
powertop on Linux 4.9.3 normal boot

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 276557
turbostat on Linux 4.9.3 boot after Linux 4.17.0

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 276559
powertop on Linux 4.9.3 boot after Linux 4.17.0

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

> hmm, can you please attach the acpidump output, and also the output of
> "cat /proc/interrupts" and "grep . /sys/firmware/acpi/interrupts/*" for both
> good and bad case.

Sure, all data is uploaded below:

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 277031
acpidump output on Linux 4.9.3 normal boot

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 277033
/proc/interrupts content on Linux 4.9.3 normal boot

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 277035
/sys/firmware/acpi/interrupts/ content on Linux 4.9.3 normal boot

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 277037
acpidump output on Linux 4.9.3 boot after Linux 4.17.0

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 277039
/proc/interrupts content on Linux 4.9.3 boot after Linux 4.17.0

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 277041
/sys/firmware/acpi/interrupts/ content on Linux 4.9.3 boot after Linux 4.17.0

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

If continuing the bisect could be helpful, please clarify how to proceed with it; the relevant question is in Comment 15.

In , rui.zhang (rui.zhang-linux-kernel-bugs) wrote :

for Linux 4.9.3 normal boot
  7: 4945 1476 1820 275 IR-IO-APIC 7-fasteoi INT3432:00, INT3433:00

for Linux 4.9.3 boot after Linux 4.17.0
 7: 327339 48015 802942 21344 IR-IO-APIC 7-fasteoi INT3432:00, INT3433:00

yes. there is indeed an interrupt storm, and this could increase the power consumption easily.

It is very likely that the I2C bus is not powered off cleanly during reboot.

so, when you say normal boot, you mean a cold boot, say, in 4.17.0 kernel, shutdown the machine, and then power on the machine manually to boot into 4.9.3 kernel, right?

If this is true, we are still able to tell good kernels from bad ones by doing a cold boot every time, right?

As this seems to be a driver issue, reassign to I2C experts anyway.
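
The comparison above can be repeated on any boot by summing the per-CPU counters of the matching /proc/interrupts line and sampling them twice. This is a rough sketch, assuming a POSIX shell with awk; the helper names `irq_total` and `irq_rate` are made up for illustration. For the two lines quoted above, the sums are 8516 interrupts on the normal boot and 1199640 on the boot after 4.17.0.

```shell
#!/bin/sh
# irq_total NAME [FILE]
# Sum the per-CPU counters of every line in FILE (default /proc/interrupts)
# that mentions NAME, for example INT3432:00.
irq_total() {
    awk -v dev="$1" '
        index($0, dev) {
            # Field 1 is the IRQ label ("7:"); the numeric fields that
            # follow are the per-CPU counts.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
        }
        END { printf "%.0f\n", total }
    ' "${2:-/proc/interrupts}"
}

# irq_rate NAME [SECONDS]
# Average interrupts per second over the interval (default 10 seconds).
irq_rate() {
    interval="${2:-10}"
    before=$(irq_total "$1")
    sleep "$interval"
    after=$(irq_total "$1")
    echo $(( (after - before) / interval ))
}
```

For example, `irq_rate INT3432:00 10` prints the average rate over ten seconds. Note that IRQ 7 is shared by INT3432:00 and INT3433:00 here, so the figure covers both controllers.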

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Thank you for looking into logs!

> so, when you say normal boot, you mean a cold boot, say, in 4.17.0 kernel,
> shutdown the machine, and then power on the machine manually to boot into
> 4.9.3 kernel, right?

Yes.

> If this is true, we are still able to confirm the good and bad kernel, by do
> cold boot every time, right?

So, if I cold boot some build, for example, 10-20 times and the interrupt storm happens at least once, then I should mark it as bad?

Does it count if I reboot (instead of cold booting) the same build again and again and get an interrupt storm after many attempts?

In , mika.westerberg (mika.westerberg-linux-kernel-bugs) wrote :

Can you also attach contents of /sys/bus/i2c/devices/*? It would be nice to know all devices connected to I2C buses.

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 278237
i2c-ls

> Can you also attach contents of /sys/bus/i2c/devices/*?

Please look into attached file.

Is the "ls /sys/bus/i2c/devices/*" output sufficient, or is some additional info required?

In , jarkko.nikula (jarkko.nikula-linux-kernel-bugs) wrote :

I see SMO91D0:00 (Sensor Hub) is also generating some amount of interrupts. Maybe something is generating a lot of events from there and that causes a lot of I2C traffic from drivers?

In , mika.westerberg (mika.westerberg-linux-kernel-bugs) wrote :

Indeed. I wonder if you can unload (or blacklist) those drivers and see if the interrupt count goes low?

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 278485
/proc/interrupts content on Linux 4.18.6 normal boot with i2c_hid module blacklisted

On boot with i2c_hid blacklisted there are 40-60 wakeups per second instead of 300+; the /proc/interrupts content is attached.

In , mika.westerberg (mika.westerberg-linux-kernel-bugs) wrote :

OK, thanks. I kind of suspect that the sensor hub is the one generating those interrupts. Could you blacklist just hid-sensor-hub and see if you still see the interrupt storm?

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

> Could you blacklist just hid-sensor-hub and see if you still see the
> interrupt storm?

I blacklisted hid-sensor-hub and got the same result as with blacklisting i2c_hid: no interrupt storm. Power consumption is below 3 W in idle.
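
For anyone repeating this test, a module is usually blacklisted with a `blacklist` line in a file under /etc/modprobe.d. A minimal sketch follows; the `blacklist_module` helper and the file name it generates are hypothetical, and the optional directory argument exists only so the function can be tried outside /etc.

```shell
#!/bin/sh
# blacklist_module MODULE [DIR]
# Write "blacklist MODULE" to DIR/blacklist-MODULE.conf
# (DIR defaults to /etc/modprobe.d, which needs root).
blacklist_module() {
    module="$1"
    dir="${2:-/etc/modprobe.d}"
    printf 'blacklist %s\n' "$module" > "$dir/blacklist-$module.conf"
}

# Example (as root), followed by a reboot:
#   blacklist_module hid_sensor_hub
```

The `blacklist` directive only stops automatic loading by alias, so the module can still be loaded by hand with modprobe. If the module is part of the initramfs, the initramfs may also need regenerating (on Ubuntu, `update-initramfs -u`) before the change takes effect.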

In , mika.westerberg (mika.westerberg-linux-kernel-bugs) wrote :

Thanks. I guess this is not related to the I2C host controller driver then. Sensors generate lots of traffic if they are enabled (not sure if there is a way to disable certain sensors from the UI).

In , mika.westerberg (mika.westerberg-linux-kernel-bugs) wrote :

Added Srinivas who knows this area better.

In , srinivas.pandruvada (srinivas.pandruvada-linux-kernel-bugs) wrote :

You can disable iio_sensor_proxy service and reboot. Then look at
value of /sys/bus/iio/devices/iio:device*/buffer/enable
They all should be 0. Also better to note the sensor name corresponding to each iio:device*. There is an attribute called "name" under each iio:device*.

Now measure power and see if you still have issue.

If you don't see issue, we can adjust some settings for sensor report interval.
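
The check described above can be done in one pass over the IIO sysfs directory. This is a small sketch, assuming the layout named in this comment (a "name" attribute and a buffer/enable file under each iio:device*); the `list_iio_buffers` helper is hypothetical.

```shell
#!/bin/sh
# list_iio_buffers [BASE]
# For each iio:device* under BASE (default /sys/bus/iio/devices), print the
# device node, its sensor name and the current value of buffer/enable.
list_iio_buffers() {
    base="${1:-/sys/bus/iio/devices}"
    for dev in "$base"/iio:device*; do
        [ -d "$dev" ] || continue
        name=$(cat "$dev/name" 2>/dev/null)
        enable=$(cat "$dev/buffer/enable" 2>/dev/null)
        # "?" marks an attribute that could not be read.
        printf '%s %s buffer/enable=%s\n' "${dev##*/}" "${name:-?}" "${enable:-?}"
    done
}
```

Running `list_iio_buffers` with no argument prints one line per sensor, which makes it easy to see which buffers are still enabled after iio-sensor-proxy has been stopped.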

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

Created attachment 279003
iio devices list

Thank you for looking into this issue.

The names and status of the sensors with iio-sensor-proxy enabled are attached.

> Now measure power and see if you still have issue.

The issue is not reproducible with iio-sensor-proxy removed (for some reason disabling iio-sensor-proxy.service does not work: it remains enabled and starts after reboot, so I removed it).

> If you don't see issue, we can adjust some settings for sensor report
> interval.

Is there a patch that I could test?

In , srinivas.pandruvada (srinivas.pandruvada-linux-kernel-bugs) wrote :

I don't think you need a patch. Enable iio-sensor-proxy again. You probably want to change the hysteresis. This determines how much the sample data must change before data is sent to user space. Alternatively, reduce the sampling frequency.

Most probably this is accel_3d, which in your case is /sys/bus/iio/devices/iio:device3. Recheck with the "name" attribute.

Try adjusting
in_accel_hysteresis to some higher value and read back if this is accepted by the sensor.
For example
#echo 0.000010 > in_accel_hysteresis

Also try to reduce in_accel_sampling_frequency.
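
Writing a value and reading it back, as suggested above, can be wrapped in a small helper. This is only a sketch: `set_and_read_back` is a hypothetical name, and the device path in the example is the one mentioned in this comment and may differ on other machines.

```shell
#!/bin/sh
# set_and_read_back ATTR VALUE
# Write VALUE to the sysfs attribute ATTR, then print what the attribute
# reports afterwards, so a rejected or rounded value is easy to spot.
set_and_read_back() {
    attr="$1"
    value="$2"
    printf '%s\n' "$value" > "$attr" || return 1
    printf 'requested=%s actual=%s\n' "$value" "$(cat "$attr")"
}

# Example (as root):
#   cd /sys/bus/iio/devices/iio:device3
#   set_and_read_back in_accel_hysteresis 0.000010
#   set_and_read_back in_accel_sampling_frequency 2
```

The two printed values are left for the reader to compare rather than checked automatically, since the sensor may report an accepted value in a slightly different format.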

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

I tried 0.000010 in_accel_hysteresis, then tried 0.000005 and 0.000001. I also tried to reduce in_accel_sampling_frequency from 10 to 10. Unfortunately, all of this doesn't make noticeable difference.

In , srinivas.pandruvada (srinivas.pandruvada-linux-kernel-bugs) wrote :

"in_accel_sampling_frequency from 10 to 10": maybe you meant something else.

You have two other devices also. First try this:

For all the devices where
/sys/bus/iio/devices/iio:device*/buffer/enable = 1
set it to 0:

for f in /sys/bus/iio/devices/iio:device*/buffer/enable; do echo 0 > "$f"; done

Then I think you will be fine. Then enable them one by one and see which device has the issue.
Then play with those parameters on the problem device.

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

> "in_accel_sampling_frequency from 10 to 10", may be you mean something.

Sorry, I mean 10 to 1.

> Then I think you will be fine.

Yes, events stopped, power consumption back to normal.

> Then enable 1 by 1 and see which device has issue.

magn_3d and accel_3d

> Then play with those parameters in problem device.

0.010000 in_accel_hysteresis and 2 in_accel_sampling_frequency produce reasonable power consumption (below 3 watts in idle with screen and wifi enabled) and don't seem to affect the tablet's automatic screen rotation. With 1 in_accel_sampling_frequency rotation is noticeably slower. The default 10 in_accel_sampling_frequency consumes more power (above 3 watts most of the time) without noticeable improvement to automatic screen rotation.

With the magnetometer it's more difficult. I found that 1.000000 in_magn_hysteresis and 0.1 in_magn_sampling_frequency are good for power consumption, but I have no idea how to verify whether the magnetometer is still usable.

In , srinivas.pandruvada (srinivas.pandruvada-linux-kernel-bugs) wrote :

These parameters should be set by user space based on the application's requirements; the kernel can't set them.

I think geoclue is a service that uses the magnetometer.

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

> These parameters should be set by user space based on the application
> requirement, kernel can't set.

Then why does the kernel version make a difference? And how can this bug actually be solved?

> I think geoclue is some service uses magnetometer.

I'm not sure what correct behaviour should look like, but with untouched Linux kernel magn_3d parameters Gnome Maps shows my location as if the laptop were constantly being rotated.

In , srinivas.pandruvada (srinivas.pandruvada-linux-kernel-bugs) wrote :

These settings go directly to firmware, and this part of the code has not been touched for a long time. Did you update the BIOS recently?
Try to revert commits 6f92253024d9d947a4f454654840ce479e251376
and f1664eaacec31035450132c46ed2915fd2b2049a.

They should have been backported to older kernels too. If this fixes the issue, it means that sensors were not powered up in your other builds, as the user space program iio-sensor-proxy has a race condition and failed to power up the sensors.

I think you are able to reproduce the condition even during cold boot not just reboots.

In , russianneuromancer (russianneuromancer-linux-kernel-bugs) wrote :

You are right: on Linux 4.9.0, where power consumption was low and there were no interrupts coming from INT343A, monitor-sensor can't detect orientation and can't get light sensor data.

Changed in linux:
status: Incomplete → Invalid
Changed in linux (Ubuntu):
status: New → Invalid
Brad Figg (brad-figg)
tags: added: cscc