graphic system freezes after a while, can still ssh into machine
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
xorg (Ubuntu) | Expired | Low | Unassigned | | |
Bug Description
Ubuntu 16.04 desktop with all updates up to date
Gigabyte GA-Z170 motherboard, Intel i7-6700 3.4 GHz, 64 GB RAM, Samsung 840 SSD, Gigabyte GeForce GT710 graphics (nvidia chipset)
Very light load. A few terminal windows, a few Firefox pages, Thunderbird, and VirtualBox (with the extension pack loaded) running one small Linux VM with 512 MB memory (very lightly loaded as it's only running bind9). I think (but am not certain) that some of the crashes have happened without VirtualBox running. Also running rsync so that backuppc on a separate machine can back it up over the network (backuppc has been a real lifesaver with all the reinstallations!)
After booting, the system runs happily for a while, then the GUI freezes. This can take a few hours or up to a day or two. You can still ssh into the system, and if you run "top" it typically shows 100% CPU load on systemd-timesync.
The same happens whether I use nouveau or the proprietary nvidia driver (361).
I have tried a complete reinstallation (4 times already, starting to get quite frustrating) using the proprietary drivers and NOT using the proprietary drivers and the same still happens. The most recent time was with nouveau, and Xorg.log has a number of entries like this:
(EE) [mi] EQ overflowing. Additional events will be discarded until existing events are processed.
(EE)
(EE) Backtrace:
(EE) 0: /usr/lib/xorg/Xorg (xorg_backtrace
(EE) 1: /usr/lib/xorg/Xorg (mieqEnqueue+0x253) [0x5652cde14083]
(EE) 2: /usr/lib/xorg/Xorg (QueuePointerEv
(EE) 3: /usr/lib/
(EE) 4: /usr/lib/
(EE) 5: /usr/lib/xorg/Xorg (0x5652cdc80000
(EE) 6: /usr/lib/xorg/Xorg (0x5652cdc80000
(EE) 7: /lib/x86_
(EE) 8: /usr/lib/xorg/Xorg (GiveUp+0x0) [0x5652cde37330]
(EE) 9: /lib/x86_
(EE) 10: /lib/x86_
(EE) 11: /usr/lib/xorg/Xorg (WaitForSomethi
(EE) 12: /usr/lib/xorg/Xorg (0x5652cdc80000
(EE) 13: /usr/lib/xorg/Xorg (0x5652cdc80000
(EE) 14: /lib/x86_
(EE) 15: /usr/lib/xorg/Xorg (_start+0x29) [0x5652cdcc1f59]
(EE)
(EE) [mi] These backtraces from mieqEnqueue may point to a culprit higher up the stack.
(EE) [mi] mieq is *NOT* the cause. It is a victim.
(EE) [mi] EQ overflow continuing. 100 events have been dropped.
Then at the end the last few are:
[ 6265.543] [mi] Increasing EQ size to 1024 to prevent dropped events.
[ 6265.543] [mi] EQ processing has resumed after 1070 dropped events.
[ 6265.543] [mi] This may be caused by a misbehaving driver monopolizing the server's resources.
What can the problem possibly be, especially as 16.04 is meant to be a stable release and an Nvidia GT710 card is hardly bleeding-edge? Help greatly appreciated!
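To compare freezes across reinstallations, the [mi] EQ messages quoted above can be tallied from the log with a small script. This is only a sketch: the function name summarize_eq_events and the shape of the returned dictionary are made up for illustration, and the patterns are based solely on the message wording shown in this report, so other Xorg versions may word them differently.

```python
import re

# Patterns taken from the message wording quoted in this report (assumption:
# other Xorg versions may phrase these lines differently).
OVERFLOW_START = re.compile(r"\[mi\] EQ overflowing")
OVERFLOW_CONT = re.compile(
    r"\[mi\] EQ overflow continuing\.\s+(\d+) events have been dropped"
)
RESUMED = re.compile(r"\[mi\] EQ processing has resumed after (\d+) dropped events")
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def summarize_eq_events(lines):
    """Tally event-queue overflow messages from an iterable of log lines.

    Hypothetical helper, not part of any Xorg tooling.
    """
    summary = {"overflow_starts": 0, "max_dropped_reported": 0, "resumed": []}
    for line in lines:
        if OVERFLOW_START.search(line):
            summary["overflow_starts"] += 1
        cont = OVERFLOW_CONT.search(line)
        if cont:
            summary["max_dropped_reported"] = max(
                summary["max_dropped_reported"], int(cont.group(1))
            )
        resumed = RESUMED.search(line)
        if resumed:
            # The leading "[ seconds ]" stamp is optional in the quoted lines.
            stamp = TIMESTAMP.match(line)
            seconds = float(stamp.group(1)) if stamp else None
            summary["resumed"].append((seconds, int(resumed.group(1))))
    return summary
```

It can be run over the log named in xserver.logfile, for example by opening /var/log/Xorg.0.log with errors="replace" and passing the file object to summarize_eq_events.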
ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: xorg 1:7.7+13ubuntu3
ProcVersionSign
Uname: Linux 4.4.0-22-generic x86_64
.tmp.unity_
ApportVersion: 2.20.1-0ubuntu2.1
Architecture: amd64
CompizPlugins: No value set for `/apps/
CompositorRunning: compiz
CompositorUnred
CompositorUnred
Date: Wed Jun 1 13:22:13 2016
DistUpgraded: Fresh install
DistroCodename: xenial
DistroVariant: ubuntu
DkmsStatus: virtualbox, 5.0.18, 4.4.0-22-generic, x86_64: installed
ExtraDebuggingI
GraphicsCard:
NVIDIA Corporation Device [10de:128b] (rev a1) (prog-if 00 [VGA controller])
Subsystem: Gigabyte Technology Co., Ltd Device [1458:36ec]
InstallationDate: Installed on 2016-06-01 (0 days ago)
InstallationMedia: Ubuntu 16.04 LTS "Xenial Xerus" - Release amd64 (20160420.1)
MachineType: Gigabyte Technology Co., Ltd. Z170-D3H
ProcEnviron:
LANGUAGE=en_HK:en
PATH=(custom, no user)
LANG=en_HK.UTF-8
SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=
SourcePackage: xorg
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 07/24/2015
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: F2
dmi.board.
dmi.board.name: Z170-D3H-CF
dmi.board.vendor: Gigabyte Technology Co., Ltd.
dmi.board.version: x.x
dmi.chassis.
dmi.chassis.type: 3
dmi.chassis.vendor: To Be Filled By O.E.M.
dmi.chassis.
dmi.modalias: dmi:bvnAmerican
dmi.product.name: Z170-D3H
dmi.product.
dmi.sys.vendor: Gigabyte Technology Co., Ltd.
version.compiz: compiz 1:0.9.12.
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.67-1
version.
version.
version.
version.
version.
version.
version.
version.
xserver.bootTime: Wed Jun 1 13:14:17 2016
xserver.configfile: default
xserver.devices:
input Power Button KEYBOARD, id 6
input Power Button KEYBOARD, id 7
input Sleep Button KEYBOARD, id 8
input Microsoft Microsoft Wireless Optical Desktop® 2.10 KEYBOARD, id 9
input Microsoft Microsoft Wireless Optical Desktop® 2.10 KEYBOARD, id 10
xserver.errors:
Failed to load module "nvidia" (module does not exist, 0)
Failed to load module "nvidia" (module does not exist, 0)
xserver.logfile: /var/log/Xorg.0.log
xserver.version: 2:1.18.3-1ubuntu2.2
xserver.
Changed in xorg (Ubuntu):
status: Confirmed → In Progress
status: In Progress → Confirmed
information type: Public → Private
information type: Private → Public
Changed in xorg (Ubuntu):
importance: Undecided → Critical
One more observation - when the display is in the frozen state (frozen windows, no response to mouse or keyboard input) and you ssh into the machine, certain things don't work. For example, 'sudo shutdown now' or 'sudo shutdown -r now' produce no result.
Sometimes there is a timeout message after a few minutes which says:
"Failed to start poweroff.target: Connection timed out
See system logs and 'systemctl status poweroff.target' for details"
Likewise 'kill -9 <process ID>' produces inconsistent results - sometimes the process in question is successfully killed and other times not.
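One possible explanation for the inconsistent 'kill -9' results, not confirmed anywhere in this report, is that some of the target processes are in uninterruptible sleep (state D), where SIGKILL is not acted on until the process leaves that state. A minimal sketch for checking this over ssh while the display is frozen follows; the helper names parse_proc_state and uninterruptible_pids are hypothetical, and it assumes a Linux /proc filesystem.

```python
import os


def parse_proc_state(stat_text):
    """Return the one-letter state field from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the LAST ')' before reading the state.
    after_comm = stat_text.rsplit(")", 1)[1]
    return after_comm.split()[0]


def uninterruptible_pids(proc_root="/proc"):
    """List PIDs currently in state 'D' (uninterruptible sleep)."""
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return []  # no /proc here (not Linux, or not mounted)
    found = []
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as handle:
                state = parse_proc_state(handle.read())
        except (OSError, IndexError):
            continue  # process exited while scanning, or unreadable entry
        if state == "D":
            found.append(int(entry))
    return sorted(found)
```

If the PIDs that refuse to die show up in the list returned by uninterruptible_pids(), that would point towards a stuck driver or I/O path rather than a problem with the kill command itself.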