8086:1502 [Lenovo ThinkPad W530] e1000e module sometimes prevents suspend to ram

Bug #1250476 reported by James Hewitt on 2013-11-12
This bug affects 1 person
Affects: linux (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

Sometimes, usually towards the end of the day, I find that the e1000e module does not allow the system to suspend.

WORKAROUND: "rmmod e1000e" and I can suspend again. I do use the ethernet adapter, and the problem has only occurred after the machine has been connected to ethernet.
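The workaround above can be sketched as a small helper that decides which commands to run around a suspend. This is only an illustration: the function names are hypothetical, and reloading the module with modprobe after resume is an assumption that is not part of the original report.

```python
MODULE = "e1000e"  # driver named in this report


def is_module_loaded(proc_modules_text, name=MODULE):
    """Return True if `name` is listed in the given /proc/modules content."""
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first whitespace-separated field of each line is the module name.
        if fields and fields[0] == name:
            return True
    return False


def workaround_commands(proc_modules_text, name=MODULE):
    """Return (before, after) lists of argv lists to run as root around suspend."""
    if not is_module_loaded(proc_modules_text, name):
        return [], []
    before = [["rmmod", name]]    # the workaround described in the report
    after = [["modprobe", name]]  # assumption: reload after resume to get ethernet back
    return before, after
```

The text would come from reading /proc/modules, and each returned argv list could be passed to subprocess.run(argv, check=True) with root privileges.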

The following is seen in dmesg:
Nov 12 12:08:38 pk0k4dr kernel: [33964.643792] PM: Syncing filesystems ... done.
Nov 12 12:08:38 pk0k4dr kernel: [33965.021021] PM: Preparing system for mem sleep
Nov 12 12:08:42 pk0k4dr kernel: [33965.134863] Freezing user space processes ... (elapsed 0.001 seconds) done.
Nov 12 12:08:42 pk0k4dr kernel: [33965.136441] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
Nov 12 12:08:42 pk0k4dr kernel: [33965.137766] PM: Entering mem sleep
Nov 12 12:08:42 pk0k4dr kernel: [33965.137838] Suspending console(s) (use no_console_suspend to debug)
Nov 12 12:08:42 pk0k4dr kernel: [33965.137889] xhci_hcd 0000:00:14.0: power state changed by ACPI to D0
Nov 12 12:08:42 pk0k4dr kernel: [33965.238268] xhci_hcd 0000:00:14.0: setting latency timer to 64
Nov 12 12:08:42 pk0k4dr kernel: [33965.238338] ehci-pci 0000:00:1a.0: power state changed by ACPI to D0
Nov 12 12:08:42 pk0k4dr kernel: [33965.342280] ehci-pci 0000:00:1a.0: setting latency timer to 64
Nov 12 12:08:42 pk0k4dr kernel: [33965.342306] ehci-pci 0000:00:1d.0: power state changed by ACPI to D0
Nov 12 12:08:42 pk0k4dr kernel: [33965.446254] ehci-pci 0000:00:1d.0: setting latency timer to 64
Nov 12 12:08:42 pk0k4dr kernel: [33965.578114] sd 0:0:0:0: [sda] Synchronizing SCSI cache
Nov 12 12:08:42 pk0k4dr kernel: [33965.578476] sd 0:0:0:0: [sda] Stopping disk
Nov 12 12:08:42 pk0k4dr kernel: [33965.802057] nouveau [ DRM] suspending fbcon...
Nov 12 12:08:42 pk0k4dr kernel: [33965.802061] nouveau [ DRM] suspending display...
Nov 12 12:08:42 pk0k4dr kernel: [33965.802087] nouveau [ DRM] unpinning framebuffer(s)...
Nov 12 12:08:42 pk0k4dr kernel: [33965.802480] mei_me 0000:00:16.0: suspend
Nov 12 12:08:42 pk0k4dr kernel: [33965.803108] nouveau [ DRM] evicting buffers...
Nov 12 12:08:42 pk0k4dr kernel: [33965.827282] nouveau [ DRM] waiting for kernel channels to go idle...
Nov 12 12:08:42 pk0k4dr kernel: [33965.827309] nouveau [ DRM] suspending client object trees...
Nov 12 12:08:42 pk0k4dr kernel: [33965.835924] nouveau [ DRM] suspending kernel object tree...
Nov 12 12:08:42 pk0k4dr kernel: [33966.209804] i915 0000:00:02.0: power state changed by ACPI to D3cold
Nov 12 12:08:42 pk0k4dr kernel: [33966.429324] pci_pm_suspend(): e1000_suspend+0x0/0x20 [e1000e] returns -2
Nov 12 12:08:42 pk0k4dr kernel: [33966.429328] dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns -2
Nov 12 12:08:42 pk0k4dr kernel: [33966.429330] PM: Device 0000:00:19.0 failed to suspend async: error -2
Nov 12 12:08:42 pk0k4dr kernel: [33967.221549] nouveau 0000:01:00.0: power state changed by ACPI to D3cold
Nov 12 12:08:42 pk0k4dr kernel: [33967.221613] PM: Some devices failed to suspend, or early wake event detected
Nov 12 12:08:42 pk0k4dr kernel: [33967.221661] i915 0000:00:02.0: power state changed by ACPI to D0
Nov 12 12:08:42 pk0k4dr kernel: [33967.221684] xhci_hcd 0000:00:14.0: setting latency timer to 64
Nov 12 12:08:42 pk0k4dr kernel: [33967.221737] mei_me 0000:00:16.0: irq 45 for MSI/MSI-X
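The relevant lines in a log like the one above are the pci_pm_suspend() callback result and the "failed to suspend" message. A minimal sketch for pulling them out of saved log text follows. The function name and regular expressions are hypothetical, written against the line formats shown in this report, and the errno lookup assumes the kernel convention of returning negative errno values (so -2 would correspond to ENOENT).

```python
import errno
import re

# e.g. "pci_pm_suspend(): e1000_suspend+0x0/0x20 [e1000e] returns -2"
CALLBACK_RE = re.compile(
    r"pci_pm_suspend\(\): (?P<func>\w+)\+\S+ \[(?P<module>\w+)\] returns (?P<code>-?\d+)"
)
# e.g. "PM: Device 0000:00:19.0 failed to suspend async: error -2"
FAILED_RE = re.compile(
    r"PM: Device (?P<device>\S+) failed to suspend(?: async)?: error (?P<code>-?\d+)"
)


def find_suspend_failures(log_text):
    """Return (source, code, errno_name) tuples for failed suspend callbacks.

    `source` is the module name for callback lines and the PCI address for
    "failed to suspend" lines.
    """
    failures = []
    for line in log_text.splitlines():
        match = CALLBACK_RE.search(line)
        source_group = "module"
        if match is None:
            match = FAILED_RE.search(line)
            source_group = "device"
        if match is None:
            continue
        code = int(match.group("code"))
        # Look up the symbolic errno name for the absolute value of the code.
        name = errno.errorcode.get(abs(code), "unknown")
        failures.append((match.group(source_group), code, name))
    return failures
```

Run against the log above, this would report the e1000e module and device 0000:00:19.0, both with code -2.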

ProblemType: Bug
DistroRelease: Ubuntu 13.10
Package: linux-image-3.11.0-13-generic 3.11.0-13.20
ProcVersionSignature: Ubuntu 3.11.0-13.20-generic 3.11.6
Uname: Linux 3.11.0-13-generic x86_64
ApportVersion: 2.12.5-0ubuntu2.1
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: jammy 3673 F.... pulseaudio
Date: Tue Nov 12 13:42:10 2013
HibernationDevice: RESUME=UUID=36cdf9ac-9cc9-4649-9be0-ed0a81c29b26
InstallationDate: Installed on 2013-10-16 (26 days ago)
InstallationMedia: Ubuntu 13.10 "Saucy Salamander" - Beta amd64 (20130925.1)
MachineType: LENOVO 24491D1
MarkForUpload: True
ProcFB:
 0 inteldrmfb
 1 nouveaufb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.11.0-13-generic.efi.signed root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.11.0-13-generic N/A
 linux-backports-modules-3.11.0-13-generic N/A
 linux-firmware 1.116
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 05/24/2013
dmi.bios.vendor: LENOVO
dmi.bios.version: G5ET93WW (2.53 )
dmi.board.asset.tag: Not Available
dmi.board.name: 24491D1
dmi.board.vendor: LENOVO
dmi.board.version: Win8 Pro DPK TPG
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvrG5ET93WW(2.53):bd05/24/2013:svnLENOVO:pn24491D1:pvrThinkPadW530:rvnLENOVO:rn24491D1:rvrWin8ProDPKTPG:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 24491D1
dmi.product.version: ThinkPad W530
dmi.sys.vendor: LENOVO

James Hewitt (jammy) wrote :

This is probably a duplicate of bug 1213035, but I am raising a new bug on the advice of penalcvh (now subscribed).

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.12 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.12-trusty/

Changed in linux (Ubuntu):
importance: Undecided → Medium
James Hewitt (jammy) wrote :

I put on 3.12 but the system hung twice. Take a look at the syslog from 12 Nov at about 19:30 to see where lightdm hung.

It's not stable enough to test the bug here, and it didn't stay up long enough to run apport-collect. Should I raise another bug against 3.12, and if so, what do I need to collect?

tags: added: bios-outdated-2.55 needs-upstream-testing regression-potential
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
description: updated
Changed in linux (Ubuntu):
importance: Medium → Low
summary: - e1000e module sometimes prevents suspend to ram
+ 8086:1502 [Lenovo ThinkPad W530] e1000e module sometimes prevents
+ suspend to ram
James Hewitt (jammy) wrote :

I had a look at the e1000e driver and installed a newer version of it. It suspended fine today (in circumstances that have previously caused a problem). I'll run like this for a week before updating the BIOS. I think saucy ships version 2.3.2; I'm not sure why it is so far behind, as there is also a 2.5.4 stable version.

jammy@pk0k4dr:~$ sudo ethtool -i eth0
driver: e1000e
version: 2.4.14-NAPI
firmware-version: 0.13-3

There are a few power related changes:
* Upstream - commit e60b22c5b7e59db09a7c9490b1e132c7e49ae904 (e1000e: fix accessing to suspended device)
* Upstream - commit 66148babe728f3e00e13c56f6b0ecf325abd80da (e1000e: fix runtime power management transitions)
* Cleanup/refactor - Runtime Power Management flow
* Refactor/fix system hibernate flow

James Hewitt (jammy) wrote :

This hasn't occurred again since upgrading the driver. Just to be certain I have downgraded it again today and upgraded my bios.

Assuming it is the driver level, what needs to happen to this item?

(Version in saucy:
jammy@pk0k4dr:~$ sudo ethtool -i eth0
driver: e1000e
version: 2.3.2-k
firmware-version: 0.13-3)
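To compare the driver versions quoted in this thread (2.3.2-k in saucy against the separately installed 2.4.14-NAPI), the `ethtool -i` output can be parsed and the numeric part of the version compared. This is a rough sketch: the helper names are hypothetical, and it assumes the version string starts with dotted numbers followed by an optional suffix, as in the outputs shown here.

```python
import re


def parse_ethtool_info(text):
    """Parse `ethtool -i` output into a dict of key/value strings."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as PCI addresses survive.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def numeric_version(version):
    """Return the leading dotted-number part of a version as a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("no numeric version prefix in %r" % version)
    return tuple(int(part) for part in match.group(0).split("."))


def is_older(version_a, version_b):
    """Return True if version_a is numerically older than version_b."""
    return numeric_version(version_a) < numeric_version(version_b)
```

With the values from this thread, is_older("2.3.2-k", "2.4.14-NAPI") is True. The suffixes (-k, -NAPI) are ignored by the comparison.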

James Hewitt (jammy) wrote :

jammy@pk0k4dr:~$ sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date
G5ET95WW (2.55 )
09/13/2013

tags: added: latest-bios-2.55
removed: bios-outdated-2.55
James Hewitt (jammy) wrote :

Happened again today, so definitely not the bios.

[78915.536665] i915 0000:00:02.0: power state changed by ACPI to D3cold
[78915.769180] pci_pm_suspend(): e1000_suspend+0x0/0x20 [e1000e] returns -2
[78915.769184] dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns -2
[78915.769185] PM: Device 0000:00:19.0 failed to suspend async: error -2
[78916.536411] nouveau 0000:01:00.0: power state changed by ACPI to D3cold
[78916.536520] PM: Some devices failed to suspend, or early wake event detected

I would put my money on it being the intel driver level, and will reinstall that on my system.

What is next for this item?

James Hewitt, could you please test the latest upstream kernel available (not the daily folder) following https://wiki.ubuntu.com/KernelMainlineBuilds ? It will allow additional upstream developers to examine the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this bug is fixed in the mainline kernel, please add the following tags:
kernel-fixed-upstream
kernel-fixed-upstream-VERSION-NUMBER

where VERSION-NUMBER is the version number of the kernel you tested. For example:
kernel-fixed-upstream-v3.12

This can be done by clicking on the yellow circle with a black pencil icon next to the word Tags located at the bottom of the bug description. As well, please remove the tag:
needs-upstream-testing

If the mainline kernel does not fix this bug, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-VERSION-NUMBER

As well, please remove the tag:
needs-upstream-testing

Once testing of the upstream kernel is complete, please mark this bug's Status as Confirmed. Please let us know your results. Thank you for your understanding.

James Hewitt (jammy) wrote :

Hi Chris,

3.12 was unstable on my machine so I could not test it.

Do you know what version of e1000e ships with 3.12?

James Hewitt (jammy) wrote :

This is still affecting me.

According to the source in 3.12, it still ships with 2.3.2-k of the intel driver, so I don't think the 3.12 kernel would solve the problem even if it were stable.

http://lxr.free-electrons.com/source/drivers/net/ethernet/intel/e1000e/netdev.c?a=m68knommu#L56

James Hewitt, thank you for your comment. Could you please provide the missing information following https://wiki.ubuntu.com/DebuggingKernelSuspend ?

tags: added: bios-outdated-2.56
removed: latest-bios-2.55
James Hewitt (jammy) wrote :

There are no stickers on the laptop. It's a Lenovo ThinkPad W530, but that's already in the collected information.

$ cat /proc/acpi/wakeup
Device  S-state  Status      Sysfs node
LID     S4       *enabled
SLPB    S3       *enabled
IGBE    S4       *enabled    pci:0000:00:19.0
EXP3    S4       *disabled   pci:0000:00:1c.2
XHCI    S3       *enabled    pci:0000:00:14.0
EHC1    S3       *enabled    pci:0000:00:1d.0
EHC2    S3       *enabled    pci:0000:00:1a.0
HDEF    S4       *disabled   pci:0000:00:1b.0
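
In the table above, IGBE maps to pci:0000:00:19.0, the same address that fails to suspend in the logs. A small sketch for reading this table programmatically follows. The function name is hypothetical, and whether the wakeup setting has anything to do with the failure is not established in this report.

```python
def parse_wakeup(text):
    """Parse /proc/acpi/wakeup content into a list of dicts."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines and the header row.
        if len(fields) < 3 or fields[0] == "Device":
            continue
        device, s_state, status = fields[:3]
        entries.append({
            "device": device,
            "s_state": s_state,
            # Status is printed as "*enabled" or "*disabled".
            "enabled": status.lstrip("*") == "enabled",
            # Some entries (e.g. LID) have no sysfs node.
            "sysfs": fields[3] if len(fields) > 3 else None,
        })
    return entries
```

Filtering the result on entry["sysfs"] == "pci:0000:00:19.0" would pick out the IGBE row.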

Will run the full trace next time I encounter the problem. When it goes wrong, it's reliably wrong until I reboot or rmmod e1000e.

James Hewitt (jammy) wrote :

Full trace doesn't help - the system never gets to suspended state so the suspend trace doesn't kick in.

We know which is the buggy driver anyway from dmesg, it is e1000e.

I will get bios 2.56, but then I'd really like to know what the next steps for this are. So far, your suggestions have not helped. We have identified the driver that contains the problem, and the driver in the kernel is outdated. Does this need to be an upstream fix?

James Hewitt (jammy) wrote :

jammy@pk0k4dr:~$ sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date
G5ET96WW (2.56 )
11/27/2013

James Hewitt (jammy) wrote :

Bug reproduced on latest bios, an rmmod of e1000e fixed it again:

[59789.643934] i915 0000:00:02.0: power state changed by ACPI to D3cold
[59789.876727] pci_pm_suspend(): e1000_suspend+0x0/0x20 [e1000e] returns -2
[59789.876731] dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns -2
[59789.876732] PM: Device 0000:00:19.0 failed to suspend async: error -2
[59790.667673] nouveau 0000:01:00.0: power state changed by ACPI to D3cold
[59790.667838] PM: Some devices failed to suspend, or early wake event detected
[59790.667925] i915 0000:00:02.0: power state changed by ACPI to D0

What is next? How do we get e1000e updated in the kernel?

tags: added: latest-bios-2.56
removed: bios-outdated-2.56
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired

James Hewitt, could you please confirm this issue exists with the latest development release of Ubuntu? ISO images are available from http://cdimage.ubuntu.com/daily-live/current/ . If the issue remains please just make a comment to this.

Changed in linux (Ubuntu):
importance: Low → Medium
status: Expired → Incomplete
James Hewitt (jammy) wrote :

Have updated the BIOS and upgraded to the final trusty. It hasn't happened yet, but I will make another comment if the bug still exists.

jammy@pk0k4dr:~$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 14.04 LTS
Release: 14.04
Codename: trusty
jammy@pk0k4dr:~$ uname -a
Linux pk0k4dr 3.13.0-24-generic #46-Ubuntu SMP Thu Apr 10 19:11:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
jammy@pk0k4dr:~$ sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date
G5ET97WW (2.57 )
02/25/2014

James Hewitt (jammy) wrote :

Just for completeness, e1000e driver in this kernel is the same:
jammy@pk0k4dr:~$ sudo ethtool -i eth0
driver: e1000e
version: 2.3.2-k
firmware-version: 0.13-3

tags: added: latest-bios-2.57
removed: latest-bios-2.56
James Hewitt (jammy) wrote :

Happened again today.

[228215.830259] PM: Entering mem sleep
[228215.830326] Suspending console(s) (use no_console_suspend to debug)
[228215.953981] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[228215.954146] sd 0:0:0:0: [sda] Stopping disk
[228216.173167] nouveau [ DRM] suspending display...
[228216.173285] nouveau [ DRM] unpinning framebuffer(s)...
[228216.173928] nouveau [ DRM] evicting buffers...
[228216.174918] nouveau [ DRM] waiting for kernel channels to go idle...
[228216.174939] nouveau [ DRM] suspending client object trees...
[228216.176857] pci_pm_suspend(): e1000_suspend+0x0/0x20 [e1000e] returns -2
[228216.176861] dpm_run_callback(): pci_pm_suspend+0x0/0x150 returns -2
[228216.176862] PM: Device 0000:00:19.0 failed to suspend async: error -2
[228216.183169] nouveau [ DRM] suspending kernel object tree...
[228217.504658] PM: Some devices failed to suspend, or early wake event detected

James Hewitt, could you please test the latest mainline kernel via http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.15-rc5-utopic/ and advise on the results?

James Hewitt (jammy) wrote :

OK.

But I still think this is to do with power management in the e1000e driver, and that a newer version of the e1000e driver needs merging into the kernel.

James Hewitt (jammy) wrote :

jammy@pk0k4dr:~$ uname -a
Linux pk0k4dr 3.15.0-031500rc5-generic #201405091635 SMP Fri May 9 20:36:31 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
jammy@pk0k4dr:~$ sudo ethtool -i eth0
driver: e1000e
version: 2.3.2-k
firmware-version: 0.13-3
bus-info: 0000:00:19.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: no

Will make another comment if it happens again.

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired