service networking restart causes Xorg crash

Bug #1235516 reported by Paul
This bug affects 8 people
Affects: xorg-server (Ubuntu)
Status: Confirmed
Importance: Undecided
Assigned to: Unassigned

Bug Description

NetworkManager stopped being able to connect to Wi-Fi access points, so I attempted to restart it. There was no /etc/init.d/network-manager script (presumably removed during the upgrade to Saucy), so I ran:

service networking restart

This caused all GUI programs to be killed (several left crash logs, which I did not submit to Launchpad for obvious reasons), leaving me with just a wallpaper and a mouse pointer, still moveable. Attempting to switch to a VT displayed first a black screen with the classic X pointer, then finally a VT, but the system became unresponsive to any key. Pressing the power button caused the system to shut down.

This is how the end of syslog looks:

Oct 5 09:14:23 hostname dbus[393]: [system] Successfully activated service 'org.freedesktop.login1'
Oct 5 09:14:23 hostname whoopsie[996]: Could not get the list of active connections: GDBus.Error:org.freedesktop.DBus.Error.NoReply: Message did not receive a reply (timeout by message bus)
Oct 5 09:14:23 hostname modem-manager[656]: <info> Caught signal 15, shutting down...
Oct 5 09:14:23 hostname whoopsie[29365]: whoopsie 0.2.23 starting up.
Oct 5 09:14:23 hostname whoopsie[29365]: Using lock path: /var/lock/whoopsie/lock
Oct 5 09:14:23 hostname whoopsie[29366]: Could not connect to the system bus: Could not connect: No such file or directory
Oct 5 09:14:23 hostname ntpdate[29407]: Can't find host ntp.ubuntu.com: Name or service not known (-2)
Oct 5 09:14:23 hostname ntpdate[29407]: no servers can be used, exiting
Oct 5 09:14:36 hostname kernel: [170242.480128] usb 6-2: USB disconnect, device number 2
Oct 5 09:14:47 hostname acpid: client 1031[0:0] has disconnected
Oct 5 09:14:47 hostname acpid: client connected from 1031[0:0]
Oct 5 09:14:47 hostname acpid: 1 client rule loaded
Oct 5 09:15:01 hostname CRON[29460]: (root) CMD (command -v debian-sa1 > /dev/null && debian-sa1 1 1)
Oct 5 09:15:04 hostname kernel: [170270.268321] [drm:intel_enable_lvds] *ERROR* timed out waiting for panel to power on

ProblemType: Crash
DistroRelease: Ubuntu 13.10
Package: xserver-xorg-video-intel 2:2.99.903-0ubuntu1
ProcVersionSignature: Ubuntu 3.11.0-11.17-generic 3.11.3
Uname: Linux 3.11.0-9-generic x86_64
.tmp.unity.support.test.0:

ApportVersion: 2.12.5-0ubuntu1
Architecture: amd64
Chipset: gm45
CompizPlugins: No value set for `/apps/compiz-1/general/screen0/options/active_plugins'
CompositorRunning: compiz
CompositorUnredirectDriverBlacklist: '(nouveau|Intel).*Mesa 8.0'
CompositorUnredirectFSW: true
Date: Wed Oct 2 19:49:29 2013
DistUpgraded: 2013-10-02 08:02:08,593 DEBUG enabling apt cron job
DistroCodename: saucy
DistroVariant: ubuntu
DuplicateSignature: [gm45] GPU lockup IPEHR: 0x0a8da72e Ubuntu 13.10
ExecutablePath: /usr/share/apport/apport-gpu-error-intel.py
ExtraDebuggingInterest: Yes
GpuHangFrequency: This is the first time
GraphicsCard:
 Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a42] (rev 07) (prog-if 00 [VGA controller])
   Subsystem: Acer Incorporated [ALI] Device [1025:029b]
   Subsystem: Acer Incorporated [ALI] Device [1025:029b]
InstallationDate: Installed on 2013-04-15 (172 days ago)
InstallationMedia: Ubuntu 13.04 "Raring Ringtail" - Alpha amd64 (20130413)
InterpreterPath: /usr/bin/python3.3
MachineType: Acer Aspire 1810T
MarkForUpload: True
ProcCmdline: /usr/bin/python3 /usr/share/apport/apport-gpu-error-intel.py
ProcEnviron:

ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.11.0-11-generic root=UUID=d74ea524-a93a-4851-9da7-fef5443fd44b ro quiet splash vt.handoff=7
RelatedPackageVersions:
 xserver-xorg 1:7.7+1ubuntu5
 libdrm2 2.4.46-1
 xserver-xorg-video-intel 2:2.21.14-4ubuntu4
SourcePackage: xserver-xorg-video-intel
Title: [gm45] GPU lockup IPEHR: 0x0a8da72e
UpgradeStatus: Upgraded to saucy on 2013-10-01 (3 days ago)
UserGroups:

dmi.bios.date: 08/31/2010
dmi.bios.vendor: INSYDE
dmi.bios.version: v1.3314
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: JM11-MS
dmi.board.vendor: Acer
dmi.board.version: Base Board Version
dmi.chassis.type: 1
dmi.chassis.vendor: Chassis Manufacturer
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnINSYDE:bvrv1.3314:bd08/31/2010:svnAcer:pnAspire1810T:pvrv1.3314:rvnAcer:rnJM11-MS:rvrBaseBoardVersion:cvnChassisManufacturer:ct1:cvrChassisVersion:
dmi.product.name: Aspire 1810T
dmi.product.version: v1.3314
dmi.sys.vendor: Acer
version.compiz: compiz 1:0.9.10+13.10.20131004-0ubuntu1
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.46-1
version.libgl1-mesa-dri: libgl1-mesa-dri 9.2-1ubuntu3
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 9.2-1ubuntu3
version.xserver-xorg-core: xserver-xorg-core 2:1.14.3-3ubuntu1
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.7.3-0ubuntu3.1
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:7.2.0-0ubuntu9
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.99.903-0ubuntu1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:1.0.9-2ubuntu1
xserver.bootTime: Sat Oct 5 09:16:44 2013
xserver.configfile: default
xserver.errors:

xserver.logfile: /var/log/Xorg.0.log
xserver.outputs:
 product id 4352
 vendor CMO
xserver.version: 2:1.14.3-3ubuntu1

Paul (i41bktob-launchpad-net) wrote :
tags: removed: need-duplicate-check
Paul (i41bktob-launchpad-net) wrote :

Tried it again, another crash, though this one had different symptoms: application windows remained running but without decoration, and the window manager appeared to be caught in a restart loop.

summary: - [gm45] GPU lockup IPEHR: 0x0a8da72e
+ service networking restart causes Xorg crash
Chris Wilson (ickle) wrote :

This will fix the crash, but really it is just the last step in a long journey of fail.

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "0001-glx-glxdri2-Unwrap-EnterVT-LeaveVT-upon-CloseScreen.patch" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Chris Wilson (ickle) wrote :

The original cause seems to be bad RAM - there is a single invalid byte inside an uncached memory region, which is most likely due to a physical error, not software.

affects: xserver-xorg-video-intel (Ubuntu) → xorg-server (Ubuntu)
Paul (i41bktob-launchpad-net) wrote :

Interesting! How did you manage to determine that?

I just ran MemTest86 for 10 hours with no errors, but of course other sources of memory corruption are possible. E.g. the computer's manufacturer did not expect more than 4GB of RAM to be used and may have mapped the GPU's hardware registers just above 4GB.

Chris Wilson (ickle) wrote :

In the crash dump, the instruction that blows up has a single byte incorrect. That is a constant written by the kernel into uncached memory to be read by the GPU, so the write should be fine. That leaves two possibilities: either it was overwritten by something else, or there was a physical defect. Since I could see only a single invalid byte in the crash dump, a stray invalid write did not look likely, which led me to suspect hardware.
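
The single-byte reasoning above can be sketched as a byte-wise comparison between the value the kernel is expected to have written and the value found in the dump. This is only an illustration, not the tool used for the diagnosis: the function name `differing_bytes` and both byte strings are hypothetical and are not taken from the actual crash dump.

```python
def differing_bytes(expected, observed):
    """Return (offset, expected_byte, observed_byte) for every mismatched byte."""
    if len(expected) != len(observed):
        raise ValueError("buffers must have the same length")
    return [
        (offset, exp, obs)
        for offset, (exp, obs) in enumerate(zip(expected, observed))
        if exp != obs
    ]


# Hypothetical data: what the kernel should have written vs. what the dump shows.
expected = bytes.fromhex("1122334455667788")
observed = bytes.fromhex("11223344ff667788")

mismatches = differing_bytes(expected, observed)
for offset, exp, obs in mismatches:
    print(f"offset {offset}: expected 0x{exp:02x}, found 0x{obs:02x}")

# One isolated mismatch is the pattern described above; a stray software
# write would more plausibly clobber several adjacent bytes.
print("single-byte corruption" if len(mismatches) == 1 else "multi-byte difference")
```

A single mismatch does not prove a hardware fault; it only makes a stray multi-byte write look less likely, which is the inference drawn in the comment.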

Ryan Schoppmeyer (rjs3275) wrote :

I actually am having the exact same problem, right down to the lead-up. After resuming from suspend I found myself unable to connect to wireless networks, so I tried restarting networking with 'service networking restart'. lightdm and everything under it immediately went down hard. I was able to get to a VT, and when I tried to 'service lightdm start', the computer stopped responding to keyboard input, and shut down when I tapped the power button. Upon restarting, I tried to restart networking immediately after logging in, and the window manager ended up trapped in a restart loop. I then tried booting to the login screen, switching to a VT, stopping lightdm, restarting networking, and starting lightdm, but lightdm fails to start.

Cursory examination of the logs provided by the original reporter indicates that our hardware is radically different. This leads me to be skeptical of the memory hardware diagnosis. It seems outlandishly improbable that two completely different computers would experience reproducible corruption of the same byte of kernel memory without any other discernible memory errors.

This is 100% reproducible on my system. Do you need anything from me?

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in xorg-server (Ubuntu):
status: New → Confirmed
Christian González (droetker) wrote :

Exact same behaviour here. I don't know if it helps (or whether it is another bug), but connecting to a DSL network crashes network-manager as well, sometimes along with lightdm.
If it seems to be another bug, please ignore my comment.

John Wiggins (jcwiggi) wrote :

Same issues here. My setup is a bonx6 from System76. No network (neither WiFi nor ETH), and any attempt to start/restart networking crashes lightdm. Guess it's as good a time as any to try out Arch...

infolock (jhibbard) wrote :

Same exact issue here.

infolock (jhibbard) wrote :

The issue I had was resolved; however, I can still recreate the problem. The problem is twofold for me:
1) User error in setting invalid values in /etc/network/interfaces
2) System error, in that it locks up the machine completely and requires a reboot in order to resolve issue 1.

Side note: The thing I was doing was pretty routine - I was setting a static IP to the machine.

Anyways, here is what I did to cause the system to lock up as is reported above:

1) edit /etc/network/interfaces

2) add the following configuration to the file:

# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback

# The primary network interface
iface eth0 inet static
address 192.168.1.130
gateway 192.168.1.1
netmask 255.255.255.0
dns-nameservers 8.8.8.8 8.8.4.4

# Two things to note from the above:
# a) auto eth0 is missing,
# b) the dns-nameservers was incorrect for me.

3) Save the file, and issue:
  sudo service networking restart

4) Observe that the system will lock up completely due to an invalid configuration

To resolve the issue and successfully restart the networking service without it freezing:

1) edit /etc/network/interfaces

2) add the following configuration to the file:

# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback

# The primary network interface
auto eth0
iface eth0 inet static
        address 192.168.1.105
        netmask 255.255.255.0
        gateway 192.168.1.1
        network 192.168.1.0
        broadcast 192.168.1.255
        dns-nameservers 8.8.8.8 192.168.1.1

# Note that the above now has the missing auto eth0, valid dns-nameservers,
# and also added (for good measure) the broadcast address

3) Save the file
4) Restart the service:
  sudo service networking restart
5) Observe that the service restarts, the system does not hang up, and all is well once again in the world

Again, the main issue for me was user error. I'm not really sure the entire system should lock up and become unresponsive when this happens. Maybe it should - but that's up to the development team and community to decide I guess.
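
One of the two differences between the failing and working files above is the missing `auto eth0` line. A minimal sketch of a pre-flight check for that one omission, to run before restarting networking, could look like the following. The function name `interfaces_missing_auto` and the `BROKEN`/`FIXED` strings are illustrative assumptions; the check ignores backslash line continuations and `source` directives, does not validate addresses, and a stanza without `auto` is not necessarily an error, since an interface may be brought up manually on purpose.

```python
def interfaces_missing_auto(config_text):
    """Return iface names that are defined but not listed on any auto/allow-* line."""
    auto_started = set()
    defined = []
    for raw in config_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        words = line.split()
        if words[0] == "auto" or words[0].startswith("allow-"):
            # "auto" and "allow-hotplug" lines may name several interfaces
            auto_started.update(words[1:])
        elif words[0] == "iface" and len(words) >= 2:
            if words[1] not in defined:
                defined.append(words[1])
    return [name for name in defined if name not in auto_started]


# Abbreviated copy of the failing configuration from the steps above.
BROKEN = """\
auto lo
iface lo inet loopback

iface eth0 inet static
address 192.168.1.130
gateway 192.168.1.1
netmask 255.255.255.0
"""

# Same file with the missing "auto eth0" line restored.
FIXED = BROKEN.replace("iface eth0", "auto eth0\niface eth0")

print(interfaces_missing_auto(BROKEN))  # ['eth0']
print(interfaces_missing_auto(FIXED))   # []
```

Nothing in this thread establishes that the missing `auto` line, rather than the nameserver values, is what triggers the lock-up, so a clean result from this check is no guarantee that the restart is safe.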

Bart Janssens (bartholomeus-j) wrote :

Same here (fully updated Saucy, no PPAs, default repos).
I edited /etc/network/interfaces to add a bridge for an interface that was already in /etc/network/interfaces (not managed by network-manager).
After running service networking restart I had a grey screen with a black square in the middle. Restarting lightdm from another tty failed, and Xorg was labeled defunct in the process list.

Only a reboot fixed it, since I couldn't kill Xorg.

Stu (stu-axon) wrote :

This bug definitely happens on Ubuntu inside VirtualBox, with GNOME installed. I generally don't restart networking now, since it kills the GUI.

Interestingly, this time it seemed to leave one of my programmes running in the GUI, but it killed GNOME Shell and everything else.
