Various instabilities after resuming from standby

Bug #1509326 reported by Joachim Durchholz
This bug affects 1 person
Affects: nvidia-graphics-drivers-352 (Ubuntu)
Status: New
Importance: Medium
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

None of these problems is readily reproducible; I have a roughly 30:70 chance of a successful standby. Things that I have seen happen:

1) Click on Whisker menu -> Logoff ("Abmelden") button -> Logoff/Restart/Reboot/Standby dialog (no Hibernate option, I think it's blacklisted for my machine) -> Click on "Shutdown" -> no shutdown, I get the login screen.

2) Trying to go to standby from the Logoff button in the login screen after seeing (1) -> monitor switches off, then on, login screen again.

3) After (2), close the lid, open it, standby procedure from login screen -> works. Sometimes.

4) After (2), remove machine from docking station, standby procedure from login screen -> works. Sometimes.

5) Machine is in standby, I press the power button, I get some text mode messages, monitor flashes and shows graphic content (mostly a mouse cursor on some background, too fast to properly see), then the screen goes dark. Ctrl-Alt-F1 works, Ctrl-Alt-F7 simply makes the screen dark again (both the builtin laptop screen and one of the external, connected-through-docking-station screens).

6) The effects from (5) can also happen right after cold booting the machine.

7) I *think* on a few occasions, I tried to shut down the machine and nothing happened at all.

8) On some rarer occasions, nothing would happen after the shutdown command, and trying to issue another shutdown resulted in a message amounting to "can't do another action while the previous action is in progress"; I remember pstree reported some hung process that was connected to shutdown, and I couldn't kill it (not even kill -9). I cannot reproduce this, and I didn't take notes so I can't be more specific.
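
A process that ignores even kill -9, as in (8), is often stuck in uninterruptible sleep (state D in ps), typically waiting inside a kernel or driver call. That is only one possible explanation here. A minimal sketch for capturing such processes the next time it happens; the function name list_d_state is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print processes in uninterruptible sleep (state D).
# Expects the output of `ps -eo pid,stat,wchan:32,comm` on stdin.
list_d_state() {
    # Skip the header line; keep rows whose STAT column starts with D.
    awk 'NR > 1 && $2 ~ /^D/ { print }'
}

# Typical use on a live system:
#   ps -eo pid,stat,wchan:32,comm | list_d_state
```

The WCHAN column names the kernel function the process is waiting in, which may hint at the subsystem involved.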

I have no idea which of these effects are related to a common cause and which are independent.
I have been unable to find out anything given that no individual phenomenon is reproducible with any reliability.
I hope the attached debug info will help, but I fully expect this to require much more diagnosis.

Hardware:
- Docking station PR02X
- Two 2560x1440 Dell U2515H monitors, one on each video output of the docking station
- nVidia Quadro 2000M, using the proprietary driver (Nouveau cannot drive two 2560x1440 screens)

ProblemType: Bug
DistroRelease: Ubuntu 15.04
Package: xorg 1:7.7+7ubuntu4
ProcVersionSignature: Ubuntu 3.19.0-31.36-generic 3.19.8-ckt7
Uname: Linux 3.19.0-31-generic x86_64
NonfreeKernelModules: nvidia
.proc.driver.nvidia.registry: Binary: ""
.proc.driver.nvidia.version:
 NVRM version: NVIDIA UNIX x86_64 Kernel Module 346.96 Sun Aug 23 22:29:21 PDT 2015
 GCC version: gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu13)
ApportVersion: 2.17.2-0ubuntu1.5
Architecture: amd64
CurrentDesktop: XFCE
Date: Fri Oct 23 13:40:40 2015
DistUpgraded: Fresh install
DistroCodename: vivid
DistroVariant: ubuntu
DkmsStatus:
 bbswitch, 0.7, 3.19.0-28-generic, x86_64: installed
 bbswitch, 0.7, 3.19.0-30-generic, x86_64: installed
 bbswitch, 0.7, 3.19.0-31-generic, x86_64: installed
 nvidia-346, 346.96, 3.19.0-31-generic, x86_64: installed
GraphicsCard:
 NVIDIA Corporation GF106GLM [Quadro 2000M] [10de:0dda] (rev a1) (prog-if 00 [VGA controller])
   Subsystem: Dell Device [1028:04a3]
InstallationDate: Installed on 2015-08-24 (59 days ago)
InstallationMedia: Xubuntu 15.04 "Vivid Vervet" - Release amd64 (20150422.1)
LightdmGreeterLogOld:
 upstart: indicator-sound main process (11841) killed by TERM signal
 upstart: indicator-application main process (11843) killed by TERM signal
MachineType: Dell Inc. Precision M4600
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.19.0-31-generic root=/dev/mapper/xubuntu--vg-root ro quiet splash
SourcePackage: xorg
UdevLog: Error: [Errno 2] No such file or directory: '/var/log/udev'
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 12/26/2013
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A16
dmi.board.name: 08V9YG
dmi.board.vendor: Dell Inc.
dmi.board.version: A00
dmi.chassis.type: 9
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA16:bd12/26/2013:svnDellInc.:pnPrecisionM4600:pvr01:rvnDellInc.:rn08V9YG:rvrA00:cvnDellInc.:ct9:cvr:
dmi.product.name: Precision M4600
dmi.product.version: 01
dmi.sys.vendor: Dell Inc.
version.compiz: compiz N/A
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.60-2
version.libgl1-mesa-dri: libgl1-mesa-dri 10.5.9-2ubuntu1~vivid2
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 10.5.9-2ubuntu1~vivid2
version.nvidia-graphics-drivers: nvidia-graphics-drivers N/A
version.xserver-xorg-core: xserver-xorg-core 2:1.17.1-0ubuntu3.1
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.9.0-1ubuntu2
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:7.5.0-1ubuntu2
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.917-1~exp1ubuntu2.2
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.11-1ubuntu2build1
xserver.bootTime: Thu Oct 22 10:57:03 2015
xserver.configfile: default
xserver.errors: open /dev/fb0: No such file or directory
xserver.logfile: /var/log/Xorg.0.log
xserver.outputs:

xserver.version: 2:1.17.1-0ubuntu3.1

Revision history for this message
Joachim Durchholz (jo-durchholz) wrote :

Added Christopher M. Penalver (penalvch) as requested in #1313847.
Sorry for coming back to you so late, I've been trying to find out more but couldn't determine anything.

Note that ubuntu-bug reported this to me when I started it from the command line:

ubuntu-bug xorg
Gtk-Message: GtkDialog mapped without a transient parent. This is discouraged.
ERROR: hook /usr/share/apport/package-hooks/source_xorg.py crashed:
Traceback (most recent call last):
  File "/usr/lib/python3/dist-packages/apport/report.py", line 197, in _run_hook
    symb['add_info'](report, ui)
  File "/usr/share/apport/package-hooks/source_xorg.py", line 677, in add_info
    attach_nvidia_info(report, ui)
  File "/usr/share/apport/package-hooks/source_xorg.py", line 521, in attach_nvidia_info
    attach_file(report, logfile)
  File "/usr/lib/python3/dist-packages/apport/hookutils.py", line 115, in attach_file
    report[key] = read_file(path, force_unicode=force_unicode)
  File "/usr/lib/python3/dist-packages/problem_report.py", line 627, in __setitem__
    assert k.replace('.', '').replace('-', '').replace('_', '').isalnum()
AssertionError

(process:7289): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed

penalvch (penalvch)
tags: added: latest-bios-a16
summary: - Various instabilities around hibernate, standby, and display control
+ Various instabilities after resuming from standby
penalvch (penalvch)
description: updated
Revision history for this message
penalvch (penalvch) wrote :

Joachim Durchholz, thank you for reporting this and helping make Ubuntu better.

As per http://www.nvidia.com/download/driverResults.aspx/92826/en-us , the latest version of the nvidia driver available for your card is 352.

Hence, could you please update to Wily, and advise if this is still reproducible with https://launchpad.net/ubuntu/+source/nvidia-graphics-drivers-352 ?

Changed in xorg (Ubuntu):
importance: Undecided → Medium
status: New → Incomplete
Revision history for this message
Joachim Durchholz (jo-durchholz) wrote :

It's my main and only work machine, I can upgrade if and only if I can easily revert any upgrade.
I don't expect to be able to revert from Wily to Vivid, so that's not an option unless I can have guarantees.
A driver update would be fine, I can always go back to Nouveau, then reinstall the current Nvidia driver.

Given #1488206 and #1507328, and that I've been having these problems for years now, I do not expect the driver update to fix anything, but I can still check and see what's happening. (My bets are on a race condition somewhere in the wakeup software stack since the symptoms are so irreproducible, but I don't know how to validate that.)

More symptoms:

9) Sometimes (rarely), X stops recognizing the screen. I get a "default" screen @1024x768 instead of the VGA-0, LVDS-0, and DP-0...DP-6 connections in xrandr --verbose. It does not help to reboot, not even when disconnecting from all power including battery. What does help is installing Nouveau, then reinstalling nvidia-current.

10) Sometimes (slightly more often than symptom nr. 9), I get the same as nr. 8, shutdown does not complete. I cannot get a console through Ctrl-Alt-F1, but I can Ctrl-Alt-F7 and get a graphic screen with a greyed-out desktop and an empty dialog window (there's a white title bar and a grey dialog area, but no title, no icons, no text, no buttons, no border, entirely unresponsive).
What I could do was to Ctrl-Alt-Del to get to the lock screen. From there, I could resume my normal session (just to restore the same greyed-out-with-unresponsive-dialog behaviour). I could also start a guest session, but I didn't try anything there because I believe guest isn't in the sudoers list.
Logging in with explicit user name gave me a black screen and unresponsiveness to Ctrl-Alt-F1 and Ctrl-Alt-F7, so I power cycled the machine and called it a day.

Revision history for this message
Joachim Durchholz (jo-durchholz) wrote :

11) Did a reboot because graphics performance had dropped. Rebooting through the XFCE desktop made the screen go black, but pstree showed that the desktop was still alive. The reboot command from the console worked, but the machine booted into a black screen (after a brief display of a background and something that might have been the mouse pointer). Closing and opening the lid did not help. Detaching the laptop from the docking station and pressing Ctrl-Alt-F7 restored the XFCE desktop, and reattaching to the docking station gave access to the docking-station-connected screens.
So... something wasn't properly initialized with the docking station attached during boot.

For reference, I have placed the following script on a hotkey to switch from laptop to docking-station-connected screens.

#!/bin/sh
# Switch from the laptop panel to the two docking-station screens,
# appending all xrandr output to a log file.
LOG=/home/jo/xrandr.log
echo "==================================================" >>"$LOG"
echo docked.sh >>"$LOG"
date >>"$LOG"
echo >>"$LOG"
# Enable the first external screen, turn the laptop panel off,
# then enable the second external screen to the left as primary.
xrandr --output DP-6 --auto >>"$LOG" 2>&1
xrandr --output LVDS-0 --off >>"$LOG" 2>&1
xrandr --output DP-3 --auto --left-of DP-6 --primary >>"$LOG" 2>&1

It's extremely rigid and unable to deal with any variation in the hardware situation, and sometimes it fails to disable LVDS-0 and then cannot activate DP-3 (I think because there are only two CRTCs). But if all hardware is properly detected, it works reliably on the second attempt, which is barely good enough that I don't have to redo the Display configuration every time auto-reconfiguration fails when attaching to or detaching from the docking station.
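
The rigidity described above could be reduced by checking which outputs xrandr reports as connected before switching, and by passing all changes to a single xrandr invocation so that LVDS-0 is released in the same step that enables the external screens. This is a sketch under assumptions, not a tested fix for this machine: the helper names connected_outputs and is_connected are hypothetical, and whether a single invocation avoids the two-CRTC problem on this driver is untested.

```shell
#!/bin/sh
# Print the names of outputs that xrandr reports as connected.
# Expects the output of `xrandr --query` on stdin.
connected_outputs() {
    awk '$2 == "connected" { print $1 }'
}

# Succeed if output $1 appears in the newline-separated list $2.
is_connected() {
    printf '%s\n' "$2" | grep -qxF -- "$1"
}

# Sketch of the switch itself (commented out; output names are taken
# from the script above):
#
#   outputs=$(xrandr --query | connected_outputs)
#   if is_connected DP-6 "$outputs" && is_connected DP-3 "$outputs"; then
#       xrandr --output LVDS-0 --off \
#              --output DP-6 --auto \
#              --output DP-3 --auto --left-of DP-6 --primary
#   else
#       echo "dock screens not detected: $outputs"
#   fi
```

Logging the detected list on every run, as the original script does for xrandr's own output, would also show whether a failed switch coincides with a missing output.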

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for xorg (Ubuntu) because there has been no activity for 60 days.]

Changed in xorg (Ubuntu):
status: Incomplete → Expired
Revision history for this message
Joachim Durchholz (jo-durchholz) wrote :

Please unexpire. This bug report is waiting for feedback from support.

Revision history for this message
Joachim Durchholz (jo-durchholz) wrote :

Issues are persistent in Wily BTW. Can't say if they are exactly the same since the behaviour is so unstable.
Some of the old behaviour seems gone, but I'm getting new weirdnesses, such as lightdm spontaneously restarting after unsuspending.

That particular incident:
Unsuspend started the machine, final X log message after loading was
[ 32016.399] (**) Option "xkb_layout" "de"
I switched to console 1, started investigating pstree|less. After a while, the graphic screen came up with the login screen (which I had configured to never show up, but ah well).
X log has this to say:
[ 32024.769] (II) evdev: Dell WMI hotkeys: Close
Not sure whether that means "lid closed" or the power button, but I was typing on the USB keyboard, which has neither.
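
To line up X log entries with what happened around a resume, the entries can be cut down to a time window using the seconds value in the log's bracketed prefix. A minimal sketch; the function name xorg_log_window is hypothetical. As an aside, and only as far as the evdev driver's behaviour is known here, "Close" appears to be what the driver logs when the X server closes an input device, which would fit a server restart better than a key press; treat that reading as an assumption.

```shell
#!/bin/sh
# Hypothetical helper: print Xorg log lines whose timestamp lies in a window.
# $1 = start, $2 = end, both in seconds as shown in the "[ 32016.399]" prefix.
# Expects the log (e.g. /var/log/Xorg.0.log) on stdin.
xorg_log_window() {
    awk -v from="$1" -v to="$2" '
        /^\[/ {
            t = $0
            sub(/^\[ */, "", t)   # drop the opening bracket and padding
            sub(/\].*/, "", t)    # drop everything from the closing bracket on
            if (t + 0 >= from && t + 0 <= to) print
        }'
}

# Typical use:
#   xorg_log_window 32010 32030 </var/log/Xorg.0.log
```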

Not sure what to do next, any advice welcome.

Revision history for this message
penalvch (penalvch) wrote :

Joachim Durchholz, regarding https://bugs.launchpad.net/ubuntu/+source/xorg/+bug/1509326/comments/3 what version of the nvidia driver are you using?

tags: added: wily
Revision history for this message
Joachim Durchholz (jo-durchholz) wrote :

I already have nvidia-352 installed.

Revision history for this message
Joachim Durchholz (jo-durchholz) wrote :

Can you give me some instructions how to nail down the actual cause of these problems?
The symptoms are so varied and inscrutable, it almost looks as if some race conditions deep down in the software stack leave stuff in inconsistent state.
Logging pertinent status information for each startup might be a good start.
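
One way to collect such status information on each suspend and resume is a systemd sleep hook: systemd-suspend.service runs every executable in /lib/systemd/system-sleep/ with two arguments, pre or post and the action (suspend, hibernate, hybrid-sleep). The sketch below assumes suspend goes through systemd on this install; the file name, the log path, and the choice of what to record are all assumptions.

```shell
#!/bin/sh
# Hypothetical hook, e.g. saved as /lib/systemd/system-sleep/50-log-state
# and made executable. The log path can be overridden via SLEEP_LOG.
LOG=${SLEEP_LOG:-/var/log/sleep-state.log}

log_sleep_event() {
    # $1 = pre|post, $2 = suspend|hibernate|hybrid-sleep
    {
        echo "=================================================="
        echo "$(date '+%Y-%m-%d %H:%M:%S') phase=$1 action=$2"
        echo "kernel: $(uname -r)"
        # This file exists only while the proprietary module is loaded.
        if [ -r /proc/driver/nvidia/version ]; then
            head -n 1 /proc/driver/nvidia/version
        else
            echo "nvidia module: not loaded"
        fi
        # Lid state, where the kernel exposes it (directory name varies).
        for f in /proc/acpi/button/lid/*/state; do
            if [ -r "$f" ]; then
                echo "$f: $(cat "$f")"
            fi
        done
    } >>"$LOG" 2>&1
}

# systemd passes exactly two arguments; do nothing otherwise.
if [ $# -eq 2 ]; then
    log_sleep_event "$1" "$2"
fi
```

Comparing the entries for a good resume with those for a failed one may show which piece of state differs.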

The system is showing the login screen even though everything has been configured to not show the login screen. Finding out the reason why would be a good start, then I'd look for the reasons for the reason, until the root cause is found.
(This might even be a hardware glitch, but without any pointers to a concrete failure I can't turn in the machine for fixing. Also, if it's a hardware problem, it would be worth submitting a bug report to whatever software subsystem should have logged it.)

Changed in xorg (Ubuntu):
status: Expired → Incomplete
Revision history for this message
penalvch (penalvch) wrote :

Joachim Durchholz, to advise, the only issue that this report is scoped to is the resume from standby. For each issue different from this (ex. resuming from hibernate, issues that occur after restarting or coming back from a shutdown, etc.), you would want to file separate reports, one per issue.

Despite this, if you uninstall the nvidia driver, is the resume from standby issue still reproducible?

Revision history for this message
Joachim Durchholz (jo-durchholz) wrote :

I can't file reports for different issues because I can't even name a single issue. All I'm seeing is sporadic but nonreproducible, somehow related failures. Some of them may be unrelated, some of them might be misconfiguration leftovers from previous installs or attempts at getting the machine to work reliably, but I have no way of determining which is which.

Trying nouveau now. It used to deliver insufficient performance to drive both of my monitors though; maybe it got upgraded, but I'm not confident.

Revision history for this message
Joachim Durchholz (jo-durchholz) wrote :

Well, I'm living with a single monitor for a while to see whether that fixes the problems.

Revision history for this message
Joachim Durchholz (jo-durchholz) wrote :

Okay... I'm seeing some glitches with Nouveau, but not all.
1) Spontaneous wake-ups, not going to sleep when told to, and similar are gone.
2) There is still improper detection of monitor configuration changes between suspend and resume. Closing and reopening the lid fixed that in one instance; I don't know whether that is a general rule.

Now that monitor detection is clearly identified as an unrelated problem, I agree it should go to a separate problem report.
How do I proceed about the spontaneous wake-ups?
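
One candidate that can be ruled in or out for spontaneous wake-ups is the set of devices armed as ACPI wake-up sources, which the kernel lists in /proc/acpi/wakeup. Since the wake-ups went away under Nouveau, this may well not be the cause here. A minimal sketch; the function name enabled_wakeup_devices is hypothetical:

```shell
#!/bin/sh
# Hypothetical helper: print the ACPI devices currently allowed to wake
# the machine. Expects the contents of /proc/acpi/wakeup on stdin.
enabled_wakeup_devices() {
    # Columns: Device, S-state, Status, Sysfs node. Skip the header line.
    awk 'NR > 1 && $3 == "*enabled" { print $1 }'
}

# Typical use:
#   enabled_wakeup_devices </proc/acpi/wakeup
```

Recording this list before a suspend that later wakes up on its own would show which devices were armed at the time.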

penalvch (penalvch)
affects: xorg (Ubuntu) → nvidia-graphics-drivers-352 (Ubuntu)
Changed in nvidia-graphics-drivers-352 (Ubuntu):
status: Incomplete → New