Gigabyte P35-DS3: does not stay suspended with default "ug" wake-on-lan setting

Bug #1450396 reported by Jan Rathmann on 2015-04-30
This bug affects 3 people
Affects Status Importance Assigned to Milestone
Linux
Unknown
Unknown
linux (Ubuntu)
Medium
Unassigned
systemd (Ubuntu)
Medium
Unassigned

Bug Description

=== Update: This bug was reported with BIOS version F13 of my board. In the meantime I upgraded to the most recent version, F14, but this had no impact on the bug. ===

Hello,

using Suspend-to-RAM on my system has stopped working reliably on Vivid when booting with Systemd. When I try to suspend my PC the following happens:

- PC turns off
- PC wakes up (turns on) again by itself, although it should not. The delay before it turns on again ranges from about a second (almost immediately) to two or three minutes.
- In a few cases (<25%), the PC stays suspended properly.

A way to work around this problem is to boot my PC with Upstart instead of Systemd; then the problem never happens (and it has never happened in any of the previous Ubuntu versions since 2008). Thus it seems to be a bug somehow related to Systemd.

Other steps I have tried to solve the problem (without any success):

- Checking BIOS settings: all options are set so that the PC should never wake up on any event other than pressing the power button.
- Checking with "cat /proc/acpi/wakeup | grep enabled" whether any device could cause wake-up events. This lists my USB controller, but deactivating it (with commands like "echo USB0 > /proc/acpi/wakeup") did not fix the problem. Also, suspend while running Upstart works fine despite the USB controller being listed as "enabled".
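The wake-up check in the list above can be scripted. The following is only a sketch: the function name enabled_wakeup_devices is made up for illustration, and it assumes the usual column layout of /proc/acpi/wakeup (device name in the first column, status in the third, shown as "enabled" or "*enabled"). Keep in mind that writing a device name to /proc/acpi/wakeup toggles its state, so writing it twice turns wake-up back on.

```shell
#!/bin/sh
# enabled_wakeup_devices (hypothetical helper name):
# read /proc/acpi/wakeup-style text on stdin and print the name of every
# device whose status column is "enabled" or "*enabled".
enabled_wakeup_devices() {
    awk '$3 ~ /^\*?enabled$/ { print $1 }'
}

# Typical use on a live system:
#   enabled_wakeup_devices < /proc/acpi/wakeup
```

Matching on the status column, rather than grepping the whole line for "enabled", avoids false hits from other columns.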

In the system logs I could see no hint what causes the undesired wake-ups.

There is one factor, though, which seems to have a certain influence: if I close programs like Pidgin, Skype or Firefox that cause continuous network traffic, the chance is higher that my PC will stay suspended (while running Systemd).

Kind regards,
Jan

ProblemType: Bug
DistroRelease: Ubuntu 15.04
Package: systemd 219-7ubuntu3
ProcVersionSignature: Ubuntu 3.19.0-15.15-generic 3.19.3
Uname: Linux 3.19.0-15-generic x86_64
NonfreeKernelModules: nvidia
ApportVersion: 2.17.2-0ubuntu1
Architecture: amd64
CurrentDesktop: Unity
Date: Thu Apr 30 11:24:46 2015
InstallationDate: Installed on 2015-04-16 (13 days ago)
InstallationMedia: Ubuntu 15.04 "Vivid Vervet" - Beta amd64 (20150415)
MachineType: Gigabyte Technology Co., Ltd. P35-DS3
ProcKernelCmdLine: BOOT_IMAGE=/@/boot/vmlinuz-3.19.0-15-generic root=/dev/mapper/internal--ssd-root ro rootflags=subvol=@ quiet splash
SourcePackage: systemd
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/10/2008
dmi.bios.vendor: Award Software International, Inc.
dmi.bios.version: F13
dmi.board.name: P35-DS3
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.type: 3
dmi.chassis.vendor: Gigabyte Technology Co., Ltd.
dmi.modalias: dmi:bvnAwardSoftwareInternational,Inc.:bvrF13:bd07/10/2008:svnGigabyteTechnologyCo.,Ltd.:pnP35-DS3:pvr:rvnGigabyteTechnologyCo.,Ltd.:rnP35-DS3:rvrx.x:cvnGigabyteTechnologyCo.,Ltd.:ct3:cvr:
dmi.product.name: P35-DS3
dmi.sys.vendor: Gigabyte Technology Co., Ltd.

Jan Rathmann (kaiserclaudius) wrote :

Further information:

When my system is booted with Systemd, the bug does _not_ appear when I use "pm-suspend" to suspend my PC. If I use "systemctl suspend" instead, the bug appears. This seems to confirm that it is indeed an issue caused by a component of Systemd.

It also happens on Debian Jessie with Systemd and "systemctl suspend", so this bug does not seem to be specific to Ubuntu.

Martin Pitt (pitti) on 2015-06-02
summary: - Systemd prevents computer from staying suspended
+ Gigabyte P35-DS3: needs suspend quirks

The main difference here is that under upstart we still run pm-suspend with its quirks, while under systemd no quirks are run any more. As suspend quirks have been supposed to be obsolete for many years and should be fixed properly in drivers/kernel, I am adding a linux task.

To find out which quirks you need, can you please run

  sudo pm-suspend --store-quirks-as-lkw

confirm that this actually still works properly (i.e. leave it suspended for a few minutes), and then attach /var/cache/pm-utils/last_known_working.quirkdb ? Thanks!

summary: - Gigabyte P35-DS3: needs suspend quirks
+ Gigabyte P35-DS3: does not stay suspended

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

Jan Rathmann (kaiserclaudius) wrote :

Martin, thanks for looking into this, I have attached the requested file.

Martin Pitt (pitti) wrote :

Interesting, pm-suspend uses no quirks at all. Can you double-check that running "sudo pm-suspend" is reliable while "sudo systemctl suspend" is not? Without quirks the two do exactly the same thing, i.e. write "mem" into /sys/power/state.
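To make the "writing mem into /sys/power/state" step concrete, here is a small sketch. The helper name supports_sleep_state is hypothetical; it only assumes that /sys/power/state holds a single line of space-separated state names, which is how the kernel exposes the supported sleep states.

```shell
#!/bin/sh
# supports_sleep_state STATE [FILE]  (hypothetical helper name)
# Succeeds if STATE is listed in FILE (default: /sys/power/state).
supports_sleep_state() {
    state_file="${2:-/sys/power/state}"
    # The file holds one line of space-separated state names.
    for s in $(cat "$state_file" 2>/dev/null); do
        if [ "$s" = "$1" ]; then
            return 0
        fi
    done
    return 1
}

# Without any hooks, the suspend itself boils down to this (needs root):
#   supports_sleep_state mem && echo mem > /sys/power/state
```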

Jan Rathmann (kaiserclaudius) wrote :

I did check it again yesterday and today using 'systemctl suspend'. The bug does not always happen, e.g. yesterday I was able to suspend my system properly 4 or 5 times in a row, but today after the first attempt the PC woke up again by itself ~2 seconds after I suspended it. So yes, the bug is still there unfortunately. I have used 'pm-suspend' almost exclusively during the last two months, and this bug has never happened with 'pm-suspend' during that time (or in earlier Ubuntu releases).

Martin Pitt (pitti) wrote :

OK, then it might be one of the other hooks in /usr/lib/pm-utils/sleep.d that pm-suspend runs. According to your dmesg it's unlikely that you have the "alx" module loaded ("lsmod | grep alx" should be empty).

Can you please attach your /var/log/pm-suspend.log?

60_wpa_supplicant could be a likely cause -- can you see if this works:

  sudo wpa_cli suspend ; sudo systemctl suspend; sudo wpa_cli resume

75modules is a less likely cause, I'll see if that does anything on your system in pm-suspend.log. The other hooks seem rather unlikely to me, so let's check these two first.

Changed in systemd (Ubuntu):
status: New → Incomplete
Jan Rathmann (kaiserclaudius) wrote :

Yes, the output of 'lsmod|grep alx' is indeed empty.

Since my system doesn't have wifi, I don't think anything WPA related could be the cause. I ran the commands anyway:

  sudo wpa_cli suspend ; sudo systemctl suspend; sudo wpa_cli resume

and it has no positive impact. The output of wpa_cli is:

"Failed to connect to non-global ctrl_ifname: (null) error: No such file or directory"

Martin Pitt (pitti) wrote :

Thanks. So we need to find out which hook does the magic. For each value of <hook> in the below list, can you please run

  sudo /usr/lib/pm-utils/sleep.d/<hook> suspend ; sudo systemctl suspend; sudo /usr/lib/pm-utils/sleep.d/<hook> resume

In descending order of likelihood, I recommend the following list for <hook>:

  99video
  95hdparm-apm
  00powersave
  94cpufreq
  95led

(The others seem *really* unrelated to suspend..)

tags: added: bios-outdated-f14
Changed in linux (Ubuntu):
importance: Undecided → Low
Jan Rathmann (kaiserclaudius) wrote :

Ok, then I'll test those hooks. It may take a few days before I can give definitive feedback, because it will take quite a few suspend-resume cycles to validate whether one of the hooks really makes the problem go away.

Martin Pitt (pitti) wrote :

Thanks Jan. Note that you can make this a little faster too -- you can start with testing (i. e. starting with "suspend") all five hooks at the same time, to confirm whether it's actually any of these five. If it still doesn't help, then the problem is somewhere entirely different. If it does help, you can drop two or three hooks, and check whether it still works. If not, one of the 2 or 3 dropped hooks is the "good" one, if it still works it's one of the remaining hooks. That'll reduce the number of tests a bit. (This is called "bisecting").
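The halving step of that bisection can be sketched in shell. This is an illustration only: split_hooks is a made-up helper, not part of pm-utils, and it does nothing but split a list of hook names into two halves so that one half can be tested at a time.

```shell
#!/bin/sh
# split_hooks HALF HOOK...  (hypothetical helper name)
# Print the "first" or "second" half of the given hook names, one per line.
split_hooks() {
    half="$1"
    shift
    # When the count is odd, the first half gets the extra item.
    mid=$(( ($# + 1) / 2 ))
    i=0
    for hook in "$@"; do
        i=$((i + 1))
        if [ "$half" = first ] && [ "$i" -le "$mid" ]; then
            printf '%s\n' "$hook"
        elif [ "$half" = second ] && [ "$i" -gt "$mid" ]; then
            printf '%s\n' "$hook"
        fi
    done
}

# Example: pick the half to test next
#   split_hooks first 99video 95hdparm-apm 00powersave 94cpufreq 95led
```

If suspend is still reliable with only the first half applied, the "good" hook is in that half; otherwise it is in the second half. Repeat with the remaining names until one is left.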

Jan Rathmann (kaiserclaudius) wrote :

Thanks for the hint to try more than one hook at once, that's really useful!

I'm done with the testing faster than I thought, because unfortunately none of the hooks makes the bug go away. I even tested applying all five of them together, but no success.

The only notable thing is that /bin/sh prints errors for three of the hook scripts:

/usr/lib/pm-utils/sleep.d/00powersave: 3: .: : not found
/usr/lib/pm-utils/sleep.d/94cpufreq: 6: .: : not found
/usr/lib/pm-utils/sleep.d/99video: 22: .: : not found

I'm not sure if this has any implication for this bug (I tested executing the hooks with /bin/bash instead, but that just led to more syntax errors).
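A possible explanation for those three messages, offered as an assumption rather than something confirmed in this report: the hooks appear to source a helper file through a variable that pm-suspend sets (PM_FUNCTIONS in pm-utils), and when a hook is run by hand that variable is empty, so the "." command fails before the hook does any work. The sketch below reproduces the effect with a hypothetical variable name, HELPER_FILE.

```shell
#!/bin/sh
# run_hook_like (hypothetical): mimic a hook whose first step is to source
# a helper file named by an environment variable. HELPER_FILE is a made-up
# stand-in for the variable the real hooks use.
run_hook_like() {
    # Run in a child shell so that a failing "." cannot terminate the caller.
    # dash aborts the script when "." fails; the && makes the sketch stop
    # at the same point in shells that would otherwise carry on.
    sh -c '. "${HELPER_FILE}" && echo "hook body ran"' 2>&1
}

# With HELPER_FILE unset, the body never runs:
#   run_hook_like
```

If this is what happened, running the hooks by hand before 'systemctl suspend' would not have applied any of their settings.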

Jan Rathmann (kaiserclaudius) wrote :

Some progress: After I failed to identify the responsible hook by running the hooks before 'systemctl suspend', I had the idea to test the other way round: whether I could reproduce the undesired wake-ups with pm-suspend by successively disabling its hooks (= moving the files in /usr/lib/pm-utils/sleep.d to another directory).

And I was successful: the hook that seems to be responsible for correct suspend on my system is 00powersave. If all other hooks are enabled except this one, the bug also appears with pm-suspend. On the other hand, if 00powersave is enabled (even if it is the only hook enabled), I was not able to reproduce the bug with pm-suspend.
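Disabling hooks by moving them aside, as described above, could look like the following sketch. The function names and the stash directory are hypothetical; only the hook directory /usr/lib/pm-utils/sleep.d comes from this report.

```shell
#!/bin/sh
# stash_hook HOOK_DIR STASH_DIR NAME  (hypothetical helper)
# Move one hook out of HOOK_DIR so pm-suspend no longer runs it.
stash_hook() {
    mkdir -p "$2" && mv "$1/$3" "$2/$3"
}

# restore_hook HOOK_DIR STASH_DIR NAME  (hypothetical helper)
# Put a previously stashed hook back in place.
restore_hook() {
    mv "$2/$3" "$1/$3"
}

# Example (as root; /root/sleep.d.disabled is an arbitrary stash location):
#   stash_hook   /usr/lib/pm-utils/sleep.d /root/sleep.d.disabled 00powersave
#   restore_hook /usr/lib/pm-utils/sleep.d /root/sleep.d.disabled 00powersave
```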

Martin Pitt (pitti) wrote :

Ah, great. So can you confirm that this works:

  sudo pm-powersave false; sudo systemctl suspend; sudo pm-powersave true

? If so, can you repeat the bisecting exercise with the hooks in /usr/lib/pm-utils/power.d/ to find out which one is the important one here?

Thanks!

Jan Rathmann (kaiserclaudius) wrote :

Yes, 'pm-powersave false; systemctl suspend; pm-powersave true' seems to work, and I think I have identified the responsible hook:

/usr/lib/pm-utils/power.d/disable_wol

The bug appears when "disable_wol" is missing, even if all other hooks are there, and it has not occurred so far when "disable_wol" is the only hook enabled.

Martin Pitt (pitti) wrote :

Ah, thanks! That's a bit weird -- on powersave false wake-on-lan is *enabled*. So it seems that with WOL disabled your computer doesn't stay suspended, but with WOL enabled it does.

Cross-check:

  sudo ethtool -s eth0 wol g ; sudo systemctl suspend

-> that enables WOL on the usual magic packet. With that it stays suspended, right?

  sudo ethtool -s eth0 wol d; sudo systemctl suspend

-> that disables WOL. With that it wakes up again?
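When running this kind of cross-check it helps to read back the current setting. The sketch below extracts it from the output of "ethtool <interface>"; the function name wol_mode is made up, and the sketch relies on ethtool printing a "Wake-on:" line for the active setting next to a separate "Supports Wake-on:" line.

```shell
#!/bin/sh
# wol_mode (hypothetical helper name):
# read the output of "ethtool <interface>" on stdin and print the active
# wake-on-lan flags, e.g. "g", "d" or "ug".
wol_mode() {
    # "Supports Wake-on: ..." has "Supports" as its first field, so only
    # the line for the active setting matches here.
    awk '$1 == "Wake-on:" { print $2 }'
}

# Typical use on a live system:
#   sudo ethtool eth0 | wol_mode
```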

Jan Rathmann (kaiserclaudius) wrote :

Martin, I did check that, and I think I can give some preliminary results that are quite interesting:

- If I run the ethtool command before 'systemctl suspend', the bug hasn't appeared so far - and it does not seem to matter if I run ethtool with the 'wol g' (enable WOL) or with the 'wol d' flag!

- After I have suspended my PC successfully by running e.g. 'ethtool -s eth0 wol g ; systemctl suspend', the bug seems to stay away until the next reboot, even if I simply use 'systemctl suspend' without ethtool on the next attempts to suspend.

Thus it seems that for proper suspend on my system it only matters that ethtool is "poking" the network card once after system startup, regardless of whether I use it to enable or disable WOL. I'll do further testing to see if it is indeed sufficient to put the ethtool command e.g. in /etc/rc.local to make the bug disappear without any additional steps.

(Although it has no influence on the bug, there is one relevant difference between running ethtool to disable or enable WOL before systemctl: If I disable WOL, my network is offline after resume, and I have to run 'ifdown eth0; ifup eth0' to make it work again. If I enable WOL instead, my network is up after suspend at least most of the time.)

Martin Pitt (pitti) wrote :

Hello Jan,

Jan Rathmann [2015-06-15 14:06 -0000]:
> - If I run the ethtool command before 'systemctl suspend', the bug
> hasn't appeared so far - and it does not seem to matter if I run ethtool
> with the 'wol g' (enable WOL) or with the 'wol d' flag!

That's indeed interesting -- After a clean boot, if you just do
"sudo ethtool eth0", it should show you the default WOL status; for me
it is "g". So running "ethtool -s eth0 wol g" *should* be a no-op, but
maybe it actually isn't.

> - After I suspended my PC successfully by running e.g. 'ethtool -s eth0
> wol g ; systemctl suspend', the bug seems to stay away until the next
> reboot even if I simply use 'systemctl suspend' without ethtool on the next
> attempts to suspend.

Yes, that's expected. The driver ought to remember the WOL status
across suspends.

> Thus it seems that for proper suspend on my system it only matters that
> ethtool is "poking" the network card one-time after system startup,
> regardless if I use it to enable or disable WOL. I'll do further testing
> if it is indeed sufficient to put the ethtool command e.g. in
> /etc/rc.local to make the bug disappear without any additional steps.

That'll be interesting indeed! It'll also show the kernel developers
where to fix this.

> (Although it has no influence on the bug there is one relevant
> difference between running ethtool to disable or enable WOL before
> systemctl: If I disable WOL, my network is offline after resume, and I
> have to run 'ifdown eth0; ifup eth0' to make it work again. If I enable
> WOL instead, my network is up after suspend at least most of the time.)

This could also explain bug 1270257, and its more recent duplicates
like bug 1431582.

Thanks for your investigations!

Jan Rathmann (kaiserclaudius) wrote :

Hello Martin,

I think I have found the real cause of the bug: It seems to always happen when the wol flag is set to 'ug' (instead of 'g' or 'd'). I made a few reboots and checked the wol value with 'ethtool eth0', and it was always set to 'ug' by default after system startup. (I don't know if this _really_ is always the case by default, because there were a few times in the past where suspend worked without manually applying 'ethtool wol g'. I'll try to check that.) I also cross-checked that if I first manually set wol to 'g', do a successful suspend, then manually set wol to 'ug' and suspend again, the bug indeed appears.

This explains why the bug never happened in all the years of suspending with pm-suspend: The disable_wol hook always sets the flag explicitly to 'g' before suspend, and thus the fact that the flag has the value 'ug' by default at system startup has no negative impact.

If I add 'ethtool -s eth0 wol g' to /etc/rc.local, this is indeed a good workaround for the bug - it seems to change the state of the wol flag reliably.
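For context, the ethtool manual page documents the wol letters as follows: 'u' wakes on unicast messages, 'g' wakes on a MagicPacket, and 'd' disables wake-on-lan. With 'ug', ordinary unicast traffic addressed to the machine can therefore wake it, which might also explain why programs with continuous network traffic made the problem more likely. The following sketch flags such a setting; the function name wol_needs_reset is hypothetical.

```shell
#!/bin/sh
# wol_needs_reset MODE  (hypothetical helper name)
# Succeeds if the wake-on-lan flags in MODE include 'u' (wake on unicast
# messages), the setting this report found to cause unwanted wake-ups.
wol_needs_reset() {
    case "$1" in
        *u*) return 0 ;;
        *)   return 1 ;;
    esac
}

# Example for /etc/rc.local (as root), assuming the interface is eth0:
#   mode=$(ethtool eth0 | awk '$1 == "Wake-on:" { print $2 }')
#   if wol_needs_reset "$mode"; then ethtool -s eth0 wol g; fi
```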

Martin, I really want to thank you for supporting me so far in finding the cause and also a workaround :-)

Martin Pitt (pitti) wrote :

Thanks for your patience! I believe this is sufficiently understood now. I retitled the bug accordingly; this should indeed be fixed properly in the driver. In the meantime, putting that workaround into rc.local or /lib/systemd/system-sleep/ (see man systemd-suspend.service) is fine.
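A system-sleep hook for this workaround might look like the sketch below. It is not the script attached to this report; it is a minimal illustration that assumes the interface is eth0 and relies on systemd calling each executable in /lib/systemd/system-sleep/ with "pre" or "post" as the first argument. The IFACE and ETHTOOL variables and the function name are made up here; ETHTOOL exists only so the hook can be dry-run with a stub.

```shell
#!/bin/sh
# Hypothetical /lib/systemd/system-sleep/ hook: force wake-on-lan to
# MagicPacket-only ("g") right before the system goes to sleep.
IFACE="${IFACE:-eth0}"          # assumption: the wired interface is eth0
ETHTOOL="${ETHTOOL:-ethtool}"   # overridable for dry runs

wol_sleep_hook() {
    # $1 is "pre" before sleeping and "post" after waking up.
    if [ "${1:-}" = pre ]; then
        "$ETHTOOL" -s "$IFACE" wol g
    fi
}

wol_sleep_hook "$@"
```

The file has to be executable for systemd to run it. Nothing is done on "post", because the setting is applied again before each suspend anyway.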

summary: - Gigabyte P35-DS3: does not stay suspended
+ Gigabyte P35-DS3: does not stay suspended with default "ug" wake-on-lan
+ setting
Changed in systemd (Ubuntu):
importance: Undecided → Medium
status: Incomplete → Triaged
Jan Rathmann (kaiserclaudius) wrote :

I'm attaching the script here that I'm using as a workaround under Vivid and Wily for proper suspend (put into /lib/systemd/system-sleep).

Kind regards,
Jan

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Jan Rathmann (kaiserclaudius) wrote :

Hello Christopher,

I updated the BIOS to F14 a few months ago to test if this makes any change on the bug, but it didn't.

Here's the output of sudo dmidecode -s bios-version && sudo dmidecode -s bios-release-date:
F14
06/18/2009

Kind regards,
Jan

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
description: updated

Jan Rathmann, could you please test the latest upstream kernel, which is listed on the first line at the top of the page at http://kernel.ubuntu.com/~kernel-ppa/mainline/?C=N;O=D (the release names are irrelevant for testing, and please do not test the daily folder)? Install instructions are available at https://wiki.ubuntu.com/Kernel/MainlineBuilds . This will allow additional upstream developers to examine the issue.

If the latest kernel does not allow you to test the issue (e.g. you cannot boot into the OS), please say so in a comment on this report and continue with the next most recent kernel version until you can test the issue. Once you've tested the upstream kernel, please comment on which kernel version specifically you tested. If this issue is fixed in the mainline kernel, please add the following tags by clicking on the yellow circle with a black pencil icon, next to the word Tags, located at the bottom of the report description:
kernel-fixed-upstream
kernel-fixed-upstream-X.Y-rcZ

Where X, Y, and Z are numbers corresponding to the kernel version.

If the mainline kernel does not fix the issue, please add the following tags:
kernel-bug-exists-upstream
kernel-bug-exists-upstream-X.Y-rcZ

Please note, a failure to install the kernel does not fit the criteria of kernel-bug-exists-upstream.

Once testing of the latest upstream kernel is complete, please mark this report's Status as Confirmed. Please let us know your results.

Thank you for your understanding.

tags: added: latest-bios-f14
removed: bios-outdated-f14
Changed in linux (Ubuntu):
importance: Low → Medium
status: Confirmed → Incomplete
Jan Rathmann (kaiserclaudius) wrote :

The latest upstream kernel I could test was 4.2.3, because with 4.3 the compilation and installation of the Nvidia kernel module fails, and I can only test with the proprietary Nvidia driver because suspend doesn't work properly with Nouveau on my card.

tags: added: kernel-bug-exists-upstream kernel-bug-exists-upstream-4.2.3
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Jan Rathmann (kaiserclaudius) wrote :

Disregard my last comment - I did a test anyway with 4.3-rc4 and Nouveau, and suspending with Nouveau worked at least once, so I could reproduce and confirm the bug for 4.3-rc4.

tags: added: kernel-bug-exists-upstream-4.3-rc4
removed: kernel-bug-exists-upstream-4.2.3

Jan Rathmann, to clarify:
1) If you use nvidia with 4.2.3, are you able to suspend with the WORKAROUND noted in https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1450396/comments/22 ?

2) If you use nouveau with 4.3-rc4, are you able to suspend with the WORKAROUND noted in https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1450396/comments/22 ?

3) Could you please provide the missing information following https://wiki.ubuntu.com/DebuggingKernelSuspend ?

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Jan Rathmann (kaiserclaudius) wrote :

Christopher, yes, my workaround seems to work in both cases 1) and 2) ( =PC stays suspended properly and doesn't power itself on unwantedly). Btw. it seems that which graphics driver is in use is completely unrelated to this bug.

Regarding 3):
- It doesn't seem to matter if I suspend by pressing power button, session menu in Unity or executing 'systemctl suspend'. As noted before, the bug never appears when suspending with 'pm-suspend', because that implicitly does the same thing as my workaround (=explicitly setting wol flag to 'g' on my network card).

- Is it really necessary to provide a resume trace? As far as I understand, this is for cases where resume fails or doesn't work properly, but this bug is _not_ about resume failing: resume works correctly, but it is triggered automatically (and undesirably) by wake-on-lan.

Kind regards,
Jan

Christopher wrote :

Jan Rathmann:
>"- Is it really necessary to provide a resume trace?"
Yes, the trace is helpful with the latest mainline kernel (now 4.3-rc4). Just narrowing it down to WOL isn't necessarily enough for a developer to fix it quickly. However, the requested trace may reveal which file and line of a buggy driver contains the offending code.

Jan Rathmann (kaiserclaudius) wrote :

Ok, I made a resume trace with my workaround deactivated on Kernel 4.3-rc4, while the bug (= waking up automatically) appeared.

I have attached two dmesg files: The first one contains the output directly after resume. The second one contains the output directly after the next reboot. I'm not sure which one is more useful in my case.

Kind regards,
Jan

Christopher wrote :

Jan Rathmann, the issue you are reporting is an upstream one. Could you please report this problem following the instructions verbatim at https://wiki.ubuntu.com/Bugs/Upstream/kernel to the appropriate venue (linux-pm)?

Please provide a direct URL to your newly made report when it becomes available so that it may be tracked.

Thank you for your understanding.

Changed in linux (Ubuntu):
status: Incomplete → Triaged
Jan Rathmann (kaiserclaudius) wrote :

Here are the links to the upstream bug report that I posted to the linux-pm and the netdev mailing list:

https://marc.info/?l=linux-pm&m=144420765316442&w=2
http://www.spinics.net/lists/netdev/msg346751.html

affects: systemd → linux
Xwarman (xwarman) wrote :

I just tested it, just by installing the 4.8.0 Kernel in Ubuntu 16.04, without any workaround and Nvidia drivers enabled. Nothing changed so far.
