X trying to start before plymouth has finished using the drm driver

Bug #982889 reported by Tomas Vanderka
This bug affects 96 people
Affects                 Status         Importance   Assigned to      Milestone
OEM Priority Project    Fix Released   High         James M. Leddy
  Precise               Fix Released   High         James M. Leddy
gdm (Ubuntu)            Fix Released   Medium       Tim Lunn
  Precise               Fix Released   Medium       Unassigned
  Raring                Fix Released   Medium       Unassigned
  Saucy                 Fix Released   Medium       Tim Lunn
lightdm (Ubuntu)        Fix Released   Critical     Timo Aaltonen
  Precise               Fix Released   High         Timo Aaltonen
  Raring                Fix Released   Critical     Timo Aaltonen
  Saucy                 Fix Released   High         Unassigned
plymouth (Ubuntu)       Fix Released   Critical     Timo Aaltonen
  Precise               Fix Released   High         Timo Aaltonen
  Raring                Fix Released   Critical     Timo Aaltonen
  Saucy                 Fix Released   High         Unassigned

Bug Description

The X server fails to start the first time after boot; it works fine when I start it again.

Looks like a race condition with Intel drm initialization: I guess X tries to start before the drm driver has finished initializing, so it fails.

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: xorg 1:7.6+12ubuntu1
ProcVersionSignature: Ubuntu 3.2.0-23.36-generic 3.2.14
Uname: Linux 3.2.0-23-generic x86_64
.tmp.unity.support.test.0:

ApportVersion: 2.0.1-0ubuntu3
Architecture: amd64
CompizPlugins: [core,composite,opengl,compiztoolbox,decor,vpswitch,snap,mousepoll,resize,place,move,wall,grid,regex,imgpng,session,gnomecompat,animation,fade,unitymtgrabhandles,workarounds,scale,expo,ezoom,unityshell]
CompositorRunning: compiz
Date: Mon Apr 16 10:35:28 2012
DistUpgraded: Fresh install
DistroCodename: precise
DistroVariant: ubuntu
ExtraDebuggingInterest: Yes, whatever it takes to get this fixed in Ubuntu
GraphicsCard:
 Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0102] (rev 09) (prog-if 00 [VGA controller])
   Subsystem: Micro-Star International Co., Ltd. Device [1462:7750]
 Advanced Micro Devices [AMD] nee ATI Barts XT [ATI Radeon HD 6800 Series] [1002:6738] (prog-if 00 [VGA controller])
   Subsystem: Giga-byte Technology Device [1458:21fa]
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Beta amd64 (20120301)
MachineType: MSI MS-7750
ProcEnviron:
 LANGUAGE=en_US:en
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-3.2.0-23-generic root=/dev/mapper/ssd-ubuntu--precise ro quiet splash
SourcePackage: xorg
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 08/25/2011
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: V4.0
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: Z68A-G43 (G3) (MS-7750)
dmi.board.vendor: MSI
dmi.board.version: 1.0
dmi.chassis.asset.tag: To Be Filled By O.E.M.
dmi.chassis.type: 3
dmi.chassis.vendor: MSI
dmi.chassis.version: 1.0
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrV4.0:bd08/25/2011:svnMSI:pnMS-7750:pvr1.0:rvnMSI:rnZ68A-G43(G3)(MS-7750):rvr1.0:cvnMSI:ct3:cvr1.0:
dmi.product.name: MS-7750
dmi.product.version: 1.0
dmi.sys.vendor: MSI
version.compiz: compiz 1:0.9.7.6-0ubuntu1
version.ia32-libs: ia32-libs N/A
version.libdrm2: libdrm2 2.4.32-1ubuntu1
version.libgl1-mesa-dri: libgl1-mesa-dri 8.0.2-0ubuntu3
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 8.0.2-0ubuntu3
version.xserver-xorg-core: xserver-xorg-core 2:1.11.4-0ubuntu10
version.xserver-xorg-input-evdev: xserver-xorg-input-evdev 1:2.7.0-0ubuntu1
version.xserver-xorg-video-ati: xserver-xorg-video-ati 1:6.14.99~git20111219.aacbd629-0ubuntu2
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.17.0-1ubuntu4
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau 1:0.0.16+git20111201+b5534a1-1build2

Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :
affects: ubuntu → xorg (Ubuntu)
Bryce Harrington (bryce)
affects: xorg (Ubuntu) → xserver-xorg-video-intel (Ubuntu)
Revision history for this message
Bryce Harrington (bryce) wrote :

We've seen this (or something akin) with the binary drivers. We speculated that perhaps having the driver provide some sort of "all ready" signal that upstart can listen for would help. Short of that, as a workaround you might test adding some sleeps in front of lightdm.

Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
Bryce Harrington (bryce)
summary: - xorg fails to start after boot on core i5
+ X trying to start faster than drm driver is ready
Revision history for this message
Bryce Harrington (bryce) wrote : Re: X trying to start faster than drm driver is ready

Meanwhile, can you post the output of sudo lshw?

Changed in xserver-xorg-video-intel (Ubuntu):
status: Triaged → Incomplete
Revision history for this message
Bryce Harrington (bryce) wrote :

Also, please attach your /var/log/udev from after reproducing the bug.

Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → New
status: New → Incomplete
Revision history for this message
Bryce Harrington (bryce) wrote :

A workaround is to add some sleep to /etc/init/lightdm.conf:

    sleep 10
    exec lightdm

Revision history for this message
Bryce Harrington (bryce) wrote :

<slangasek> bryceh: line 4396 of the udev log shows drm card0 coming up at 1.901920 after boot; the udev doesn't start on the root filesystem until 2.244658 according to BootDmesg.txt
<slangasek> bryceh: so the drm device *was* there for coldplugging, which makes this a kernel bug
<slangasek> fundamentally, there is still a race in how we're handling video at boot... but as my earlier surprise indicates, I think it's incredibly unlikely we'll hit it in practice

Revision history for this message
Bryce Harrington (bryce) wrote :

From the X log, it is finding the dri card device file ok:

[ 2.446] drmOpenDevice: node name is /dev/dri/card0
[ 2.446] drmOpenDevice: open result is 9, (OK)
[ 2.497] drmOpenByBusid: Searching for BusID pci:0000:00:02.0
[ 2.497] drmOpenDevice: node name is /dev/dri/card0
[ 2.497] drmOpenDevice: open result is 9, (OK)
[ 2.497] drmOpenByBusid: drmOpenMinor returns 9

But then it runs into an interface version error:

[ 2.497] drmOpenByBusid: Interface 1.4 failed, trying 1.1

Then it tries opening card1-15, then finally gives up, goes back to 0, and acts confused:

[ 2.547] drmOpenDevice: node name is /dev/dri/card14
[ 2.551] drmOpenByBusid: drmOpenMinor returns -1
[ 2.551] drmOpenDevice: node name is /dev/dri/card15
[ 2.555] drmOpenByBusid: drmOpenMinor returns -1
[ 2.555] drmOpenDevice: node name is /dev/dri/card0
[ 2.555] drmOpenDevice: open result is 9, (OK)
[ 2.555] drmOpenDevice: node name is /dev/dri/card0
[ 2.555] drmOpenDevice: open result is 9, (OK)
[ 2.555] drmGetBusid returned ''
[ 2.555] (EE) intel(0): [drm] failed to set drm interface version.
[ 2.555] (EE) intel(0): Failed to become DRM master.

Revision history for this message
Bryce Harrington (bryce) wrote :

The code that does the kernel module loading is in libdrm; reassigning.

Probably best to forward this upstream for advice before we start hacking loops into libdrm...

affects: xserver-xorg-video-intel (Ubuntu) → libdrm (Ubuntu)
Changed in libdrm (Ubuntu):
status: Incomplete → Triaged
Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :

Putting "sleep 1" in /etc/init/lightdm.conf was enough.

When I look at dmesg it says:
[ 2.263168] [drm] Initialized drm 1.1.0 20060810
[ 2.766592] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0

So I guess Xorg tries to use it at [ 2.497], while it is still in some partially initialized state.
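The timing comparison above can be checked mechanically. Here is a minimal sketch, assuming log lines carry the bracketed seconds prefix shown in this report; the helper name `stamp` is made up for illustration, and the result is only a rough hint, since the kernel log and the X log may not share an exact clock.

```python
import re

# Matches the "[ 2.766592]" style prefix seen in dmesg and Xorg.0.log lines.
STAMP = re.compile(r"\[\s*(\d+\.\d+)\]")

def stamp(line):
    """Return the first bracketed timestamp in a log line as seconds, or None."""
    match = STAMP.search(line)
    return float(match.group(1)) if match else None

# Both lines are copied from the logs quoted in this report.
i915_ready = stamp("[ 2.766592] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0")
x_open = stamp("[ 2.497] drmOpenByBusid: Searching for BusID pci:0000:00:02.0")

# A positive gap means X touched the device before i915 reported ready.
gap = i915_ready - x_open
print(f"X searched for the device {gap:.3f}s before i915 reported ready")
```

The same helper also works on syslog-style lines, because it looks for the first bracketed decimal number and ignores tags such as `[drm]`.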

Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :
Revision history for this message
Bryce Harrington (bryce) wrote :

<apw> slangasek, we know that the drm driver can be opened by splash before its ready. we had to add an 'EAGAIN' failure there
<apw> slangasek, we may or may not cope with that in userspace
<apw> commit 6d74feca6235b463ade4ecddd1dfdb73d30a2ff7
<apw> Author: Andy Whitcroft <email address hidden>
<apw> Date: Thu Jul 29 16:48:21 2010 +0100
<apw> UBUNTU: SAUCE: drm -- stop early access to drm devices
<apw> slangasek, ^^
<apw> bryceh, its a race, we need to know the minors to tell the load method, but till its run we can't actually safely open them
<apw> which is why we have to tell you to hang fire a sec
<apw> though any open can return EAGAIN and you really should damn well listen :)
<slangasek> upstart should not open the drm device at all - it's up to whatever wants to use it to handle this
<slangasek> (assuming the kernel really can't defer announcing it until it's initialized)
<apw> right, we presumably run plymouth or X and its crapping self cause the open failed
<bryceh> ok, so then _X_ should block and retry on the device until it gets something working?
<slangasek> yes
<apw> that'd be helpful if it could
<bryceh> hum
<apw> as EAGAIN really means, ooops could you try that again please
<apw> bryceh, well it should only do it for open == -1 and errno == EAGAIN

Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :

udev log and corresponding dmesg and xorg log

In this case it got to a state where lightdm/X thought it was doing fine, but I got a black screen with a blinking cursor and a mouse pointer.

Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :
Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :
Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :

I looked at the code and it seems to fail somewhere in the kernel drm_setversion ioctl after being called from libdrm's drmSetInterfaceVersion.
I guess it's because the drm driver load hasn't finished yet. There are no useful return values in the code involved, so there's no way for libdrm to know it should try again.

Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :

Maybe related to #927684 #899725

Revision history for this message
Andy Whitcroft (apw) wrote :

OK, it is clear something odd is going on with the drm ioctls. As there is little information in the X logs, I have put some debug in the kernel to try to help us understand this. Could those of you who can reproduce this issue try out the following kernels? When you have successfully reproduced it, please report back here, including the output of 'dmesg':

    http://people.canonical.com/~apw/lp982889-precise/

Thanks in advance.

Andy Whitcroft (apw)
Changed in linux (Ubuntu):
assignee: nobody → Andy Whitcroft (apw)
status: New → Incomplete
Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :

So I tried a few things with the drm.debug=1 kernel parameter.

When I reproduce the problem, something (plymouth?) does drm stuff before Xorg, and Xorg then gets an EACCES error from the drm_setversion ioctl (nr=0x07). dmesg looks like this:

Apr 21 02:25:35 kujoniq kernel: [ 0.000000] Command line: BOOT_IMAGE=/vmlinuz-3.2.0-24-generic root=/dev/mapper/ssd-ubuntu--precise ro quiet drm.debug=1
Apr 21 02:25:35 kujoniq kernel: [ 0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-3.2.0-24-generic root=/dev/mapper/ssd-ubuntu--precise ro quiet drm.debug=1
Apr 21 02:25:35 kujoniq kernel: [ 2.487567] [drm] Initialized drm 1.1.0 20060810
Apr 21 02:25:35 kujoniq kernel: [ 2.497942] [drm:drm_pci_init],
Apr 21 02:25:35 kujoniq kernel: [ 2.497952] [drm:drm_get_pci_dev],
Apr 21 02:25:35 kujoniq kernel: [ 2.497977] [drm:drm_get_minor],
Apr 21 02:25:35 kujoniq kernel: [ 2.498099] [drm:drm_get_minor], new minor assigned 64
Apr 21 02:25:35 kujoniq kernel: [ 2.498101] [drm:drm_get_minor],
Apr 21 02:25:35 kujoniq kernel: [ 2.498161] [drm:drm_get_minor], new minor assigned 0
Apr 21 02:25:35 kujoniq kernel: [ 2.552305] [drm] Supports vblank timestamp caching Rev 1 (10.10.2010).
Apr 21 02:25:35 kujoniq kernel: [ 2.552306] [drm] Driver supports precise vblank timestamp query.
Apr 21 02:25:35 kujoniq kernel: [ 2.552830] [drm:drm_sysfs_connector_add], adding "VGA-1" to sysfs
Apr 21 02:25:35 kujoniq kernel: [ 2.552960] [drm:drm_sysfs_hotplug_event], generating hotplug event
Apr 21 02:25:35 kujoniq kernel: [ 2.567174] [drm:drm_sysfs_connector_add], adding "HDMI-A-1" to sysfs
Apr 21 02:25:35 kujoniq kernel: [ 2.567194] [drm:drm_sysfs_hotplug_event], generating hotplug event
Apr 21 02:25:35 kujoniq kernel: [ 2.567201] [drm:drm_sysfs_connector_add], adding "DP-1" to sysfs
Apr 21 02:25:35 kujoniq kernel: [ 2.567235] [drm:drm_sysfs_hotplug_event], generating hotplug event
Apr 21 02:25:35 kujoniq kernel: [ 2.676194] [drm:drm_irq_install], irq=51
Apr 21 02:25:35 kujoniq kernel: [ 2.783104] fbcon: inteldrmfb (fb0) is primary device
Apr 21 02:25:35 kujoniq kernel: [ 2.783492] [drm:drm_vblank_get], enabling vblank on crtc 0, ret: -22
Apr 21 02:25:36 kujoniq kernel: [ 2.951148] [drm:drm_calc_timestamping_constants], crtc 3: hwmode: htotal 2080, vtotal 1235, vdisplay 1200
Apr 21 02:25:36 kujoniq kernel: [ 2.951151] [drm:drm_calc_timestamping_constants], crtc 3: clock 154000 kHz framedur 16679910 linedur 13506, pixeldur 6
Apr 21 02:25:36 kujoniq kernel: [ 2.957757] fb0: inteldrmfb frame buffer device
Apr 21 02:25:36 kujoniq kernel: [ 2.957757] drm: registered panic notifier
Apr 21 02:25:36 kujoniq kernel: [ 2.957806] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
Apr 21 02:25:36 kujoniq kernel: [ 2.997507] [drm:drm_stub_open],
Apr 21 02:25:36 kujoniq kernel: [ 2.997510] [drm:drm_open_helper], pid = 286, minor = 0
Apr 21 02:25:36 kujoniq kernel: [ 2.997514] [drm:drm_setup],
Apr 21 02:25:36 kujoniq kernel: [ 2.997517] [drm:drm_ioctl], pid=286, cmd=0xc0406400, nr=0x00, dev 0xe200, auth=1
Apr 21 02:25:36 kujoniq kernel: [ 2.997520] [drm:drm_ioctl], pid=286, cmd=0xc0406400, nr=0x00, dev...

Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :

Everything starts fine if nothing touches drm before X, or if whatever does touch it finishes before X tries to start, I guess:

Apr 21 02:28:24 kujoniq kernel: [ 2.802193] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
Apr 21 02:28:24 kujoniq kernel: [ 2.826851] [drm:drm_stub_open],
Apr 21 02:28:24 kujoniq kernel: [ 2.826854] [drm:drm_open_helper], pid = 1317, minor = 0
Apr 21 02:28:24 kujoniq kernel: [ 2.826857] [drm:drm_setup],
Apr 21 02:28:24 kujoniq kernel: [ 2.826865] [drm:drm_ioctl], pid=1317, cmd=0xc0406400, nr=0x00, dev 0xe200, auth=1
Apr 21 02:28:24 kujoniq kernel: [ 2.826869] [drm:drm_ioctl], pid=1317, cmd=0xc0406400, nr=0x00, dev 0xe200, auth=1
Apr 21 02:28:24 kujoniq kernel: [ 2.826873] [drm:drm_release], open_count = 1
Apr 21 02:28:24 kujoniq kernel: [ 2.826875] [drm:drm_release], pid = 1317, device = 0xe200, open_count = 1
Apr 21 02:28:24 kujoniq kernel: [ 2.826878] [drm:drm_lastclose],
Apr 21 02:28:24 kujoniq kernel: [ 2.826890] [drm:drm_lastclose], driver lastclose completed
Apr 21 02:28:24 kujoniq kernel: [ 2.826892] [drm:drm_lastclose], lastclose completed
Apr 21 02:28:24 kujoniq kernel: [ 2.826905] [drm:drm_stub_open],
Apr 21 02:28:24 kujoniq kernel: [ 2.826906] [drm:drm_open_helper], pid = 1317, minor = 0
Apr 21 02:28:24 kujoniq kernel: [ 2.826908] [drm:drm_setup],
Apr 21 02:28:24 kujoniq kernel: [ 2.826917] [drm:drm_ioctl], pid=1317, cmd=0xc0106407, nr=0x07, dev 0xe200, auth=1
Apr 21 02:28:24 kujoniq kernel: [ 2.826919] APW: drm_setversion called
Apr 21 02:28:24 kujoniq kernel: [ 2.826921] APW: drm_setversion returned 0
Apr 21 02:28:24 kujoniq kernel: [ 2.826922] [drm:drm_ioctl], pid=1317, cmd=0xc0106401, nr=0x01, dev 0xe200, auth=1
Apr 21 02:28:24 kujoniq kernel: [ 2.826924] [drm:drm_ioctl], pid=1317, cmd=0xc0106401, nr=0x01, dev 0xe200, auth=1
Apr 21 02:28:24 kujoniq kernel: [ 2.826934] [drm:drm_ioctl], pid=1317, cmd=0xc0106407, nr=0x07, dev 0xe200, auth=1
Apr 21 02:28:24 kujoniq kernel: [ 2.826935] APW: drm_setversion called
Apr 21 02:28:24 kujoniq kernel: [ 2.826937] APW: drm_setversion returned 0
Apr 21 02:28:24 kujoniq kernel: [ 2.826939] [drm:drm_ioctl], pid=1317, cmd=0xc0106446, nr=0x46, dev 0xe200, auth=1
Apr 21 02:28:24 kujoniq kernel: [ 2.826989] [drm:drm_ioctl], pid=1317, cmd=0x80106463, nr=0x63, dev 0xe200, auth=1
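The `cmd=` values in these traces can be split into their parts to see which call each line refers to. This is a small sketch assuming the generic Linux ioctl encoding used on x86 (8 bits of number, 8 bits of type, 14 bits of size, 2 bits of direction); the function name `decode_ioctl` is made up for illustration.

```python
def decode_ioctl(cmd):
    """Split a 32-bit ioctl command into its direction, size, type and number."""
    directions = {0: "none", 1: "write", 2: "read", 3: "read/write"}
    return {
        "dir": directions[(cmd >> 30) & 0x3],
        "size": (cmd >> 16) & 0x3FFF,    # size of the argument struct in bytes
        "type": chr((cmd >> 8) & 0xFF),  # subsystem letter; 0x64 is 'd'
        "nr": cmd & 0xFF,                # same value as the nr= field in the log
    }

# Command values copied from the traces in this report.
for cmd in (0xC0406400, 0xC0106407, 0x80106463):
    print(hex(cmd), decode_ioctl(cmd))
```

For 0xc0106407 this yields number 7, which is the set-version call that the thread identifies as nr=0x07.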

Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :

After disabling plymouth-splash I can't reproduce this anymore.

It's a race between plymouth and Xorg. Plymouth holds DRM master (drm_setmaster_ioctl) while Xorg tries to start (drm_setversion), and Xorg fails with EACCES because it needs DRM_MASTER for that call.

It works fine if
1. plymouth never starts
2. plymouth calls drm_dropmaster_ioctl before X starts

Another thing: the Xorg.log timestamps can't really be trusted.

Changed in linux (Ubuntu):
importance: Undecided → Medium
Revision history for this message
Bryce Harrington (bryce) wrote :

Tomas, did you ever get a chance to test the kernel apw posted in comment #17?

http://people.canonical.com/~apw/lp982889-precise/

Changed in libdrm (Ubuntu):
status: Triaged → Incomplete
Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :

Yes, the dmesg output from when X fails to start with that kernel is in comment #18; #19 is from when it starts fine.

But when X failed to start, it never even reached your debug code, because it failed in drm_ioctl with EACCES, as can be seen from the drm debug output.

This is where it tries to call drm_setversion and fails with EACCES:

Apr 21 02:25:36 kujoniq kernel: [ 3.134782] [drm:drm_ioctl], pid=1207, cmd=0xc0106407, nr=0x07, dev 0xe200, auth=1
Apr 21 02:25:36 kujoniq kernel: [ 3.134784] [drm:drm_ioctl], ret = fffffff3
Apr 21 02:25:36 kujoniq kernel: [ 3.134790] [drm:drm_ioctl], pid=1207, cmd=0xc0106407, nr=0x07, dev 0xe200, auth=1
Apr 21 02:25:36 kujoniq kernel: [ 3.134792] [drm:drm_ioctl], ret = fffffff3

So it's not really a drm problem. Xorg tries to start too soon, while plymouth-splash is still doing stuff with drm. After I disabled plymouth-splash, I never reproduced this again.
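For anyone reading the `ret = fffffff3` lines: the value is a negative errno printed as unsigned 32-bit hex. A minimal sketch of the conversion follows; the helper name `decode_ret` is made up for illustration, and the errno names come from the platform's own table.

```python
import errno

def decode_ret(hex_text):
    """Interpret a 32-bit hex return value as a signed int and name the errno."""
    value = int(hex_text, 16)
    if value >= 0x80000000:       # two's complement: high bit set means negative
        value -= 0x100000000
    if value >= 0:
        return value, "no error"
    return value, errno.errorcode.get(-value, "unknown")

print(decode_ret("fffffff3"))
```

On Linux this gives -13, which is `EACCES` (permission denied), the error discussed above.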

Revision history for this message
Bryce Harrington (bryce) wrote :

bug #966868 may be a dupe of this issue

summary: - X trying to start faster than drm driver is ready
+ X trying to start before plymouth has finished using the drm driver
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in xorg-server (Ubuntu):
status: New → Confirmed
Revision history for this message
ben (roboben) wrote :

I can confirm this bug on a recent Ubuntu 12.10. The workaround from Bryce Harrington (comment #5) fixed the problem for me. Thanks!

Revision history for this message
Justyn Butler (justyn) wrote :

I had this bug on 12.04 and now have it on 12.10.
64-bit Thinkpad X220, Core i7 (2nd gen) and Intel 320 SSD, with external monitor.

Bryce's workaround in comment #5 seems to have "fixed" it.

Timo Aaltonen (tjaalton)
tags: added: quantal
Bryce Harrington (bryce)
Changed in xorg-server (Ubuntu):
assignee: nobody → Bryce Harrington (bryce)
tags: added: raring
Revision history for this message
Bryce Harrington (bryce) wrote :

Bug #1037518 is a dupe.

Changed in xorg-server (Ubuntu):
status: Confirmed → Triaged
importance: Undecided → Critical
Changed in linux (Ubuntu):
status: Incomplete → New
Changed in libdrm (Ubuntu):
status: Incomplete → Triaged
assignee: nobody → Bryce Harrington (bryce)
importance: Medium → Critical
Revision history for this message
Brad Figg (brad-figg) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
Revision history for this message
Eddie B (fasteddieb216) wrote :

I hope this info helps... I also had this problem (and another possibly related problem) until I replaced my new Radeon HD 6450 video card with an old GeForce 7900 GS card. You can see more details of my problem here:
https://answers.launchpad.net/ubuntu/+source/apt/+question/221246

I would much rather use the Radeon card if possible.

Revision history for this message
Bryce Harrington (bryce) wrote :
tags: added: patch
Revision history for this message
Bryce Harrington (bryce) wrote :

Similar patch, for xorg-server.

Changed in xorg-server (Ubuntu):
status: Triaged → In Progress
Revision history for this message
Franck (alci) wrote :
Revision history for this message
Chris Wilson (ickle) wrote :

No, the race here is in the global_mutex used for serialising setting up, tearing down and opening the device. My preferred workaround is exactly what Bryce implemented.

Revision history for this message
Bryce Harrington (bryce) wrote :

I've packaged the patch in the following PPA:

   https://launchpad.net/~bryce/+archive/lp982889

Since I've so far not reproduced the bug on my own hw, I need someone else who does see this problem to install and run the ppa and verify it fixes the issue.

Potentially we may need to adjust the countdown timer or the error code, but I think this should work.

Revision history for this message
Bryce Harrington (bryce) wrote :

Oh, and if you do run the PPA, I'd like to see your Xorg.0.log, regardless of whether it worked or not, so I can verify the error code being passed up.

Revision history for this message
Franck (alci) wrote :

I have just installed xserver-common from the PPA, but it did not solve the issue here.

Here is my Xorg.0.log on failure.

Revision history for this message
Adam Bruce (brucey-99-deactivatedaccount) wrote :

I have added the PPA and installed the update

First restart went to low graphics mode

I then went to recovery mode to fix the errors, restarting after that booted successfully

Restarting again boots to low graphics mode

I go to recovery mode again to fix the errors, restarting after that boots successfully

This repeats every time

I have attached /var/log/Xorg.0.log

Revision history for this message
Bryce Harrington (bryce) wrote :

Thanks for the logs.

Unfortunately, neither of these matches the OP's sequence of events, so neither hits the problematic drm loading race that the original reporter was seeing. So it's not surprising that this patch does nothing for your cases. I'd like someone who has Tomas' bug to test this. Specifically, if on startup you see in /var/log/Xorg.0.log a sequence of messages like this:

[ 2.555] drmOpenDevice: open result is 9, (OK)
[ 2.555] drmGetBusid returned ''
[ 2.555] (EE) intel(0): [drm] failed to set drm interface version.
[ 2.555] (EE) intel(0): Failed to become DRM master.
[ 2.555] (II) intel(0): Creating default Display subsectio

Then it's worth testing the patch in the PPA, and I think it will fix that case (or at least give us more specific data).

@Adam, I'd want to see the Xorg.0.log from the first boot, when you get dumped into low graphics mode. And maybe Xorg.0.log.old. The log you posted appears to be from a successful session so that doesn't illuminate why the failed session failed.

@Franck, it's possible your bug (#1129220) is distinct and was incorrectly duped here, but it does seem that your issue belongs with this general "class" of problems. However I think a different kind of fix will be needed in your case.

Revision history for this message
Adam Bruce (brucey-99-deactivatedaccount) wrote :

Sorry, please see attached. I see no mention of drm like you said

But it does seem to have done something as I could never boot before without either that xorg config file or the sleep delay

Revision history for this message
Adam Bruce (brucey-99-deactivatedaccount) wrote :
Revision history for this message
Adam Bruce (brucey-99-deactivatedaccount) wrote :

I don't know what boot this is from

Revision history for this message
Bryce Harrington (bryce) wrote :

@Adam, interesting. Indeed your problem also appears to be distinct from the OP. It looks like you have a hybrid graphics system and both intel and nouveau drivers are loading. It sounds like the nouveau driver isn't detecting any connected outputs and eventually unloads itself. The modesetting driver decides to load up after that for some reason, and immediately proceeds to segfault.

From your comment #39 it sounds like with the patch, X is getting further along? Looking at your original bug (#1125759) you had been seeing this problem early on, which broke everything subsequent to it:

[ 3.579] (II) config/udev: Adding drm device (/dev/dri/card1)
[ 4.377] (II) config/udev: Adding drm device (/dev/dri/card0)
[ 4.377] setversion 1.4 failed <--- oops
[ 4.379] (--) PCI:*(0:0:2:0) 8086:0126:1028:0446 rev 9, Mem @ 0xf1400000/4194304, 0xe0000000/268435456, I/O @ 0x00004000/64
[ 4.379] (--) PCI: (0:1:0:0) 10de:0df5:1028:0446 rev 161, Mem @ 0xf0000000/16777216, 0xc0000000/268435456, 0xd0000000/33554432, I/O @ 0x00003000/128, BIOS @ 0x????????/524288

Now you're seeing this:

[ 3.610] (II) config/udev: Adding drm device (/dev/dri/card1)
[ 5.309] (II) config/udev: Adding drm device (/dev/dri/card0)
[ 5.311] (--) PCI:*(0:0:2:0) 8086:0126:1028:0446 rev 9, Mem @ 0xf1400000/4194304, 0xe0000000/268435456, I/O @ 0x00004000/64
[ 5.311] (--) PCI: (0:1:0:0) 10de:0df5:1028:0446 rev 161, Mem @ 0xf0000000/16777216, 0xc0000000/268435456, 0xd0000000/33554432, I/O @ 0x00003000/128, BIOS @ 0x????????/524288

I would expect the patch to show some debug info if it is applying its fix; it should display a stream of "drm device not ready (#), sleeping for 20us" messages, and then the system should boot up to fully functional (or some permutation thereof). So, I'm not sure we can say the patch fixed it for you; it could be just coincidental. This is a race condition after all, and as such can be influenced by all manner of random things, and maybe it just had lucky timing on that boot.
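To make the expected behaviour concrete, here is a rough sketch of a countdown loop of the kind described above. It is not the actual patch: the function name, the default countdown length, and the guess that the `(#)` in the message is the return code are all assumptions, and `set_version` stands in for whatever call is being retried.

```python
import time

def call_with_countdown(set_version, tries=2000, pause=20e-6, log=print):
    """Call set_version() until it returns 0 or the countdown runs out."""
    ret = set_version()
    while ret != 0 and tries > 0:
        # Assumed message format, modelled on the one quoted above.
        log(f"drm device not ready ({ret}), sleeping for {pause * 1e6:.0f}us")
        time.sleep(pause)
        tries -= 1
        ret = set_version()
    return ret  # 0 on success, otherwise the last error code seen
```

Because the loop logs every non-zero return rather than filtering on one error code, the log also shows which code actually comes back when the race is hit.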

As to the modesetting segfault, that would be worth tracking as its own bug. Did a crash file get generated for that in /var/crash/ ?

Revision history for this message
Adam Bruce (brucey-99-deactivatedaccount) wrote :

Yes it did

Reported: bug 1130667

Revision history for this message
Franck (alci) wrote :

OK, I have reopened bug report #1129220, as it is not an exact duplicate...

Revision history for this message
Tomas Vanderka (tomas-vanderka) wrote :

I don't think the patch from #31 will fix this, because X is really getting EACCES (0xfffffff3), not EAGAIN, as can be seen in #22.

I also think this is probably an issue in plymouth, because plymouth is told to quit and release drm before X starts, but it somehow does not always do this in time. Maybe there should be "plymouth --wait quit" instead of "plymouth quit" in /etc/init/plymouth-stop.conf? I tried to find out how exactly "plymouth quit" is handled (especially whether the plymouth daemon really releases drm before that command returns), but I never got to the answer.

You can try changing the X patch in #31 to also spin on EACCES (a bad solution, imho); otherwise plymouth should be fixed to really release drm when told to quit.

And I also had this problem in Fedora, someone else reported it at https://bugzilla.redhat.com/show_bug.cgi?id=855677

I reenabled plymouth-splash and will play a little with it again, but it happens randomly in maybe 1/10 reboots so it's not easy to hit.

Revision history for this message
Bryce Harrington (bryce) wrote : Re: [Bug 982889] Re: X trying to start before plymouth has finished using the drm driver

On Wed, Feb 20, 2013 at 02:18:25PM -0000, Tomas Vanderka wrote:
> I don't think the patch from #31 will fix this because X is really
> getting EACCESS (0xfffffff3) not EAGAIN as can be seen in #22

No it will; I don't check for EAGAIN specifically yet, but just a
non-zero return. I want to make it check for specific error codes but
want to see what actually is coming through when the patch's code gets
hit. I agree that while the kernel guys say it should be EAGAIN, the
data I've seen so far suggests a different error code (maybe EACCES).

> I also think this is probably an issue in plymouth because plymouth is
> told to quit and release drm before X starts but it somehow does not
> allways do this in time. Maybe there should be "plymouth --wait quit"
> instead of "plymouth quit" in /etc/init/plymouth-stop.conf? I tried to
> find how exactly "plymouth quit" is handled (especially if the plymouth
> daemon really releases drm before that command returns) but I never got
> to the answer.

Well maybe, however we've seen people disable plymouth but still hit the
race condition. My goal with this patch is to fix it on the X side so
it'll work in the general case of anything holding the drm device too
long.

> You can try to change the X patch in #31 to spin also on EACCESS (bad
> solution imho) or plymouth should be fixed to really release drm when
> told to quit.
>
> And I also had this problem in Fedora, someone else reported it at
> https://bugzilla.redhat.com/show_bug.cgi?id=855677

Thanks.

> I reenabled plymouth-splash and will play a little with it again, but it
> happens randomly in maybe 1/10 reboots so it's not easy to hit.

Yeah, and it's possible there are several different failure modes
depending on how the race is run, each needing a separate fix.

Revision history for this message
Bryce Harrington (bryce) wrote :

@Adam, thanks; will follow up on bug 1130667 for the crash.

@Franck, thanks for reopening #1129220; will re-review and follow up there.

Revision history for this message
Franck (alci) wrote :

@Bryce: Chris Wilson commented on bug #1129220 saying both bugs are really related. I quote:
"It is that bug. What's changed in the meantime is Xorg now has a platform probe first so hits the error even earlier."

Revision history for this message
Bryce Harrington (bryce) wrote :

@Franck, yes the bugs belong in the same general class of issue (race condition with drm), but the stacktraces and code paths differ so the patch does not address the type of fault you saw.

Revision history for this message
Dimitri John Ledkov (xnox) wrote :

In the ubiquity-dm installer, we pre-empt lightdm. We then execute "plymouth deactivate" followed by "plymouth quit" if plymouth does not have an active-vt. Then we start Xorg straight away; if Xorg fails to start, we retry with modified 'fallback' configs using fbdev and finally vesa. I guess we can hit this race in ubiquity-dm, as there is nothing else running on the system. Should we loop and try starting X a couple of times before trying the fallback configs?
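The loop being asked about could be sketched as follows. This is purely illustrative and not ubiquity-dm code: `start_x` is a hypothetical callable that returns True when X comes up, and the retry count is an arbitrary assumption.

```python
def start_x_with_fallbacks(start_x, retries=3, fallbacks=("fbdev", "vesa")):
    """Try the default config a few times, then each fallback driver once.

    start_x(config) is a hypothetical callable returning True on success;
    config is "default" for the normal setup or a fallback driver name.
    """
    for _ in range(retries):
        if start_x("default"):
            return "default"     # the default config eventually worked
    for driver in fallbacks:
        if start_x(driver):
            return driver        # running on a fallback driver
    raise RuntimeError("X failed with the default and all fallback configs")
```

Retrying the default config first means a lost race costs a short delay rather than dropping the session onto fbdev or vesa.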

Revision history for this message
Amit Kucheria (amitk) wrote :

On 13 Feb 19, Bryce Harrington wrote:
> Oh, and if you do run the PPA, I'd like to see your Xorg.0.log,
> regardless of whether it worked or not, so I can verify the error code
> being passed up.

Please find my Xorg.0.log and Xorg.0.log.old attached. I see Xorg win
sometimes (1 out of 5 reboots?).

Revision history for this message
Franck (alci) wrote :

I also just had a success on rebooting! Here is the relevant Xorg.0.log. Hope that helps.

Revision history for this message
Jason Robinson (jaywink) wrote :

Yesterday it booted the first time; today it took 5 tries. The computer has an SSD, in case that has something to do with it (timings).

Xorg logs attached. Interesting ending to the .old log :)

Revision history for this message
Jason Robinson (jaywink) wrote :
Revision history for this message
krissetto (chris-1414) wrote :

Mine has an SSD too, an OCZ Agility 3 120GB. Maybe loading certain parts of
the system too fast makes this happen if there aren't any checks.

Revision history for this message
Franck (alci) wrote :

Yes, SSD too here, but I guess this isn't extraordinary nowadays... Also, this most probably has an impact, as it seems to be a race condition...

Revision history for this message
Mark (mark-murphy-48ad) wrote :

I just have a plain old hdd, no ssd.

Revision history for this message
Bryce Harrington (bryce) wrote :

I need someone to install the xserver from this ppa, and verify it fixes the issue:

   https://launchpad.net/~bryce/+archive/lp982889

All the Xorg.0.log files posted since I mentioned this PPA are from the stock xserver (or older). What we need is an Xorg.0.log from a session with this PPA installed (and rebooted). What I expect to see in the log is something like:

...
(WW) drm device not ready (8), sleeping for 20us
(WW) drm device not ready (8), sleeping for 20us
(WW) drm device not ready (8), sleeping for 20us
(WW) drm device not ready (8), sleeping for 20us
(WW) drm device not ready (8), sleeping for 20us
...
(--) PCI:*(0:0:2:0) 8086:0166:144d:c0d7 rev 9, Mem @ 0xf0000000/4194304, 0xe0000000/268435456, I/O @ 0x00003000/64

There should be no mention of "setversion 1.4 failed".

If anyone can verify the above on their system, I can push this fix.
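
For anyone who wants to check their own log against the expectation above, a rough sketch is shown here. The two marker strings are taken from this comment; the function name is made up, and the exact wording of the messages may differ between builds of the patch.

```python
def summarize_drm_log(log_text: str) -> dict:
    """Count retry warnings and flag the failure message in Xorg.0.log text."""
    lines = log_text.splitlines()
    return {
        # Number of times the patched server reported waiting on the drm device.
        "not_ready_retries": sum("drm device not ready" in line for line in lines),
        # True if the message the patch is meant to avoid still shows up.
        "setversion_failed": any("setversion 1.4 failed" in line for line in lines),
    }
```

A log from a boot where the patch did its job would be expected to show a non-zero retry count and no setversion failure.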

Revision history for this message
Jason Robinson (jaywink) wrote :

Bryce, I would love to try this, but I'm a bit worried: this is a raring PPA with packages based on, for example, xserver-common 2:1.13.2-0ubuntu3, while my system (12.10) has 2:1.13.0-0ubuntu6.1 installed. Warning bells in my head say upgrading could break my work computer.

What do you think? Any chance of backporting the fix to quantal version?

Revision history for this message
Mark Murphy (mokmeister) wrote :

Installed PPA, rebooted, aok. Rebooted / Shutdown another couple of times to be sure, every restart successful. It looks like the PPA works for me, Raring Ringtail development branch with Intel GM965 graphics.

Revision history for this message
Franck (alci) wrote :

Tested the PPA: it still fails here. Attached is the log...
I'm available to test whatever is needed.

Revision history for this message
Franck (alci) wrote :

Sorry, wrong attachment file extension... here it is.

Revision history for this message
Bryce Harrington (bryce) wrote :

On Fri, Mar 01, 2013 at 08:03:17AM -0000, Mark Murphy wrote:
> Installed PPA, rebooted, aok. Rebooted / Shutdown another couple of
> times to be sure, every restart successful. It looks like the PPA works
> for me, Raring Ringtail development branch with Intel GM965 graphics.
>
> ** Attachment added: "Xorg.0.log"
> https://bugs.launchpad.net/ubuntu/+source/libdrm/+bug/982889/+attachment/3552446/+files/Xorg.0.log

[ 4.182] Build Date: 09 February 2013 07:24:35AM
[ 4.182] xorg-server 2:1.13.2-0ubuntu2 (For technical support please
see http://www.ubuntu.com/support)

Nope, that was with the stock xserver, not the PPA xserver.

Make sure to add the ppa, then do a full dist-upgrade, and restart the
xserver (i.e. log out and log back in.)

You can verify that you have the right xserver running like this:

  grep -A1 'Build Date' /var/log/Xorg.0.log
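
The same check can be scripted. The sketch below pulls the package version out of the log text and looks for the PPA suffix; it assumes the PPA build reports a version containing `~lp982889~` in the log (the suffix seen in the package version elsewhere in this thread), and the function names are made up for illustration.

```python
import re
from typing import Optional

# Matches the package version on the line after "Build Date",
# e.g. "xorg-server 2:1.13.2-0ubuntu2 (For technical support ...".
VERSION_LINE = re.compile(r"xorg-server\s+(\d+:\S+)")


def xserver_version(log_text: str) -> Optional[str]:
    """Return the xorg-server package version found in the log, if any."""
    match = VERSION_LINE.search(log_text)
    return match.group(1) if match else None


def is_ppa_build(log_text: str, marker: str = "~lp982889~") -> bool:
    """True if the logged version carries the (assumed) PPA version suffix."""
    version = xserver_version(log_text)
    return version is not None and marker in version
```

Typical use would be `is_ppa_build(open("/var/log/Xorg.0.log").read())` after logging out and back in.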

Revision history for this message
Bryce Harrington (bryce) wrote :

On Fri, Mar 01, 2013 at 08:03:17AM -0000, Mark Murphy wrote:
> Installed PPA, rebooted, aok. Rebooted / Shutdown another couple of
> times to be sure, every restart successful. It looks like the PPA works
> for me, Raring Ringtail development branch with Intel GM965 graphics.
>
> ** Attachment added: "Xorg.0.log"
> https://bugs.launchpad.net/ubuntu/+source/libdrm/+bug/982889/+attachment/3552446/+files/Xorg.0.log

That's great to hear, and a very good sign that it works repeatably.
Unfortunately this log appears to be from one of the non-bugged cases.
Could I bother you to reboot a bunch more times and tarball up the
Xorg.0.logs from each of those?

If you hit any failed boots, I would be particularly interested in those
logs.

Revision history for this message
Franck (alci) wrote :

I was quite sure I activated the PPA, and it failed. But looking at the version in Xorg.0.log, it seems I didn't get the right version. I did not pin any specific version... Looking at what is installed, I only have xserver-common coming from the PPA...

And looking at the PPA web page, it seems that the amd64 build was cancelled. So, am I wrong, or is the amd64 build missing from the PPA?

Revision history for this message
Bryce Harrington (bryce) wrote :

On Sat, Mar 02, 2013 at 04:34:00PM -0000, Franck wrote:
> I was quite sure I activated the PPA, and it failed. But looking at the
> version in Xorg.0.log, it seems I didn't get the right version. I did
> not pin any specific version... Looking at what is installed, I only
> have xserver-common coming from the PPA...
>
> And looking at the PPA web page, it seems that the amd64 build was
> cancelled. So, am I wrong, or is the amd64 build missing from the PPA?

Ah, good find. I've requested a rebuild. Should be available in a few
hours.

Revision history for this message
Mark (mark-murphy-48ad) wrote :

Hmmm, I was fairly sure I added the PPA, it's in my sources.list. I did
another apt-get update and dist-upgrade this morning though and the Xorg
build date didn't change, so I'm not sure what I'm doing wrong. I can
tarball up the Xorg.0.logs as they are if you wish but I'm not sure if they
would be of any use to you?

Having said all that, the problem for me has gone away since 1st March. Up
to that I was getting the (to paraphrase) "running in low res mode" error
most of the time on start up. I wonder would my dpkg log from that date be
of any use to you?


Revision history for this message
Mark (mark-murphy-48ad) wrote :

Just looking at the dpkg log myself and noticed this:

2013-03-01 07:31:31 upgrade xserver-xorg-core:i386 2:1.13.2-0ubuntu2 2:1.13.2-0ubuntu3~lp982889~1
2013-03-01 07:31:31 status half-configured xserver-xorg-core:i386 2:1.13.2-0ubuntu2
2013-03-01 07:31:31 status unpacked xserver-xorg-core:i386 2:1.13.2-0ubuntu2
2013-03-01 07:31:31 status half-installed xserver-xorg-core:i386 2:1.13.2-0ubuntu2
2013-03-01 07:31:31 status half-installed xserver-xorg-core:i386 2:1.13.2-0ubuntu2
2013-03-01 07:31:31 status half-installed xserver-xorg-core:i386 2:1.13.2-0ubuntu2
2013-03-01 07:31:31 status unpacked xserver-xorg-core:i386 2:1.13.2-0ubuntu3~lp982889~1
2013-03-01 07:31:31 status unpacked xserver-xorg-core:i386 2:1.13.2-0ubuntu3~lp982889~1
2013-03-01 07:31:33 upgrade xserver-common:all 2:1.13.2-0ubuntu2 2:1.13.2-0ubuntu3~lp982889~1
2013-03-01 07:31:33 status half-configured xserver-common:all 2:1.13.2-0ubuntu2
2013-03-01 07:31:33 status unpacked xserver-common:all 2:1.13.2-0ubuntu2
2013-03-01 07:31:33 status half-installed xserver-common:all 2:1.13.2-0ubuntu2
2013-03-01 07:31:34 status half-installed xserver-common:all 2:1.13.2-0ubuntu2
2013-03-01 07:31:34 status half-installed xserver-common:all 2:1.13.2-0ubuntu2
2013-03-01 07:31:34 status unpacked xserver-common:all 2:1.13.2-0ubuntu3~lp982889~1
2013-03-01 07:31:34 status unpacked xserver-common:all 2:1.13.2-0ubuntu3~lp982889~1
2013-03-01 07:34:12 configure xserver-common:all 2:1.13.2-0ubuntu3~lp982889~1 <none>
2013-03-01 07:34:12 status unpacked xserver-common:all 2:1.13.2-0ubuntu3~lp982889~1
2013-03-01 07:34:12 status half-configured xserver-common:all 2:1.13.2-0ubuntu3~lp982889~1
2013-03-01 07:34:12 status installed xserver-common:all 2:1.13.2-0ubuntu3~lp982889~1
2013-03-01 07:34:12 configure xserver-xorg-core:i386 2:1.13.2-0ubuntu3~lp982889~1 <none>
2013-03-01 07:34:12 status unpacked xserver-xorg-core:i386 2:1.13.2-0ubuntu3~lp982889~1
2013-03-01 07:34:12 status half-configured xserver-xorg-core:i386 2:1.13.2-0ubuntu3~lp982889~1
2013-03-01 07:34:12 status installed xserver-xorg-core:i386 2:1.13.2-0ubuntu3~lp982889~1


Revision history for this message
Franck (alci) wrote :

The amd64 build has been running for twelve and a half hours and is still ongoing... is that expected?

Revision history for this message
Bryce Harrington (bryce) wrote :

On Sun, Mar 03, 2013 at 10:50:43AM -0000, Mark wrote:
> Hmmm, I was fairly sure I added the PPA, it's in my sources.list. I did
> another apt-get update and dist-upgrade this morning though and the Xorg
> build date didn't change, so I'm not sure what I'm doing wrong. I can
> tarball up the Xorg.0.logs as they are if you wish but I'm not sure if they
> would be of any use to you?

You can run `apt-cache policy xserver-xorg-core` to see if it's
installed. It won't be reflected in your Xorg.0.log until you restart.

> Having said all that, the problem for me has gone away since 1st March. Up
> to that I was getting the (to paraphrase) "running in low res mode" error
> most of the time on start up. I wonder would my dpkg log from that date be
> of any use to you?

I can take a look at it.

Revision history for this message
Mark Murphy (mokmeister) wrote :

I think I was a bit confused this morning!

I've done a few cold boots and restarts since then, and I've included the logs in the attached file. I included all the Xorg log files in the first directory. The first four directories in the tar contain log files from cold boots; the last one is from a reboot.

apt-cache policy xserver-xorg-core prints the following:
xserver-xorg-core:
  Installed: 2:1.13.2-0ubuntu3~lp982889~1
  Candidate: 2:1.13.2-0ubuntu3~lp982889~1
  Version table:
 *** 2:1.13.2-0ubuntu3~lp982889~1 0
        500 http://ppa.launchpad.net/bryce/lp982889/ubuntu/ raring/main i386 Packages
        100 /var/lib/dpkg/status
     2:1.13.2-0ubuntu2 0
        500 http://ie.archive.ubuntu.com/ubuntu/ raring/main i386 Packages

Hope this helps.
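
In case it saves anyone some squinting, the "Installed" line of that output can be read programmatically. A minimal sketch, assuming the text comes from `apt-cache policy xserver-xorg-core`; the helper name is invented for this example.

```python
from typing import Optional


def installed_version(policy_output: str) -> Optional[str]:
    """Return the value of the 'Installed:' line from apt-cache policy output."""
    for line in policy_output.splitlines():
        # Split on the first colon only, since versions contain an epoch colon.
        key, _, value = line.strip().partition(":")
        if key == "Installed":
            return value.strip()
    return None
```

For the printout above this yields `2:1.13.2-0ubuntu3~lp982889~1`, i.e. the PPA package.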

Revision history for this message
Ernie 07 (ernestboyd) wrote :

64-bit 3.8.0-9-generic #18-Ubuntu SMP Thu Feb 28 17:02:06 UTC 2013
Desktop with Intel e8500 cpu, Nvidia GT9500 and default (non-Nvidia) video driver.

I get this problem frequently following a restart from 12.04.

Pressing the reset button or executing a FULL power shutdown prior to completion of booting 13.04 will usually eliminate this problem. This would imply that a hardware state that is properly initialized following a reset or full power shutdown is NOT being properly initialized during a warm reboot.

On occasion, I have also had to update/install grub, either from 12.04 or via "fix a broken system" from a server CD.

Revision history for this message
Bryce Harrington (bryce) wrote :

On Sun, Mar 03, 2013 at 04:06:47PM -0000, Franck wrote:
> The amd64 build has been running for twelve and a half hours and is still
> ongoing... is that expected?

Yes, unfortunately. PPA rebuilds are executed at the lowest priority,
and it's possible all the ARM virtualized builds for phone stuff may be
taking precedence.

Revision history for this message
Bryce Harrington (bryce) wrote :

On Sun, Mar 03, 2013 at 06:39:55PM -0000, Mark Murphy wrote:
> I think I was a bit confused this morning!
>
> I've done a few cold boots and restarts since then, I've included them
> in the attached file. I included all the Xorg log files in the first
> directory. The first four directories in the tar contain log files from
> cold boots, the last one is a reboot.

Aha, perfect. So, it looks like the bug hit on boots 2, 3 and 4, the
patch took effect and delayed the boot for 12 sec or so waiting on drm.
There is the "setversion 1.4 failed" (which is the thing we're trying to
avoid), but the rest of the log looks otherwise normal.

Can you confirm whether boots 2, 3, 4 produced a usable X session or if
it froze or crashed or something? By 'cold boot' do you mean you had to
manually power cycle (hold down the power button)?

If boots 2,3,4 worked, then I think we're golden. If not, then back to
the drawing board... maybe we need ickle's advice on what to look at next.

Revision history for this message
Mark (mark-murphy-48ad) wrote :

All 5 boots produced a usable X session. I haven't had a problem booting up
successfully since the 1st. By cold boot I just mean a normal power on
after a normal power off. In actual fact I haven't had to do a hard reset
(or manually hold down the power button to power off and then power on
again), with my current setup at all since the 1st of March, and even then
it was rare that I would have to have done it before that. And just to go
off topic a bit I find it amazing how quickly it does shut down, in
seconds, it's really cool!


Revision history for this message
Franck (alci) wrote :

Still no amd64...

From the PPA status page:

Build status
[CANCELLED] Cancelled build
    Started on 2013-03-03
    Finished 8 hours ago (took 20 hours, 38 minutes, 42.3 seconds)

Revision history for this message
Franck (alci) wrote :

I have compiled xorg-server from the PPA locally. So now I have tried it, and it "kind of" works. Not really satisfactory, though.
What happens is that lightdm ends up starting, BUT in the meantime I have a TTY console showing (i.e. no plymouth), and it takes a looong time.

Here is the log of such a case.

Revision history for this message
Adam Bruce (brucey-99-deactivatedaccount) wrote :

I ran apt-cache policy xserver-xorg-core yesterday and realised the patch hadn't been applied when I originally posted my Xorg logs.

I saw an update today from your PPA and now I see

xserver-xorg-core:
  Installed: 2:1.13.2-0ubuntu3~lp982889~1

So it has now been applied. I guess this was due to the building problem for amd? It now shows as built successfully.

I know you said my error may be slightly different to the OP, but I thought I would post as it may be useful and this time I can see the "drm device not ready (#), sleeping for 20us" messages you mentioned.

Mine seems similar to Franck's. I'm also seeing 'setversion 1.4 failed'.

It did not boot successfully though and went to Low Graphics mode. There is a crash report if you want me to upload that as a report. But I don't know if it will just be the same as bug 1130667. Let me know if you want me to post it.

Revision history for this message
Adam Bruce (brucey-99-deactivatedaccount) wrote :
Revision history for this message
Bryce Harrington (bryce) wrote :

On Tue, Mar 05, 2013 at 01:13:39AM -0000, Adam Bruce wrote:
>
> It did not boot successfully though and went to Low Graphics mode. There
> is a crash report if you want me to upload that as a report. But I don't
> know if it will just be the same as: bug 1130667 - let me know if you
> want me to post it

Yeah post it, in the chance the logs have something interesting; worst
case I can just dupe it.

Revision history for this message
Adam Bruce (brucey-99-deactivatedaccount) wrote :

Apport won't let me report as it's not an official package

I don't know if you can make use of the crash file itself?

Otherwise let me know how I can report it to you

Revision history for this message
Bryce Harrington (bryce) wrote :

On Tue, Mar 05, 2013 at 02:57:55AM -0000, Adam Bruce wrote:
> Apport won't let me report as it's not an official package
>
> I don't know if you can make use of the crash file itself?
>
> Otherwise let me know how I can report it to you

Thanks, yes I know how to unpack .crash files. The file contains the
core dump and basic info. It lacks the various logs and such but I
probably don't need those. I'll review it tomorrow and let you know.

Revision history for this message
Bryce Harrington (bryce) wrote :

Thanks everyone for the recent testing. In comparing everyone's logs the following observations can be drawn:

1. The assertions made in comment #11 are incorrect. We are not seeing EAGAIN (error code 11) but apparently EACCES (error code 13).

2. The patch I prepared is suited for handling EAGAIN. The reason it works for EACCES as well is simply because it's throwing some sleep in there.

We already know from comment #5 that tossing in some sleep before X starts works around the problem, so that isn't surprising. Still, this patch is probably a better workaround since it only comes into play when the bug has actually happened; on non-bugged boots you'll get a faster non-sleepy startup. For this reason, I'll proceed with polishing up the patch and add it to the archive.
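
To make the "sleep only when the bug actually happens" point concrete, here is a rough sketch of the retry pattern in Python. It is not the xserver patch itself (that is C inside the server); the function name, timeout and interval are assumptions chosen for illustration, and `operation` stands in for whatever call fails on the drm device.

```python
import errno
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Error codes treated as "device not ready yet": EAGAIN and EACCES.
RETRYABLE = {errno.EAGAIN, errno.EACCES}


def call_with_retry(
    operation: Callable[[], T],
    timeout_s: float = 2.0,       # assumed overall budget
    interval_s: float = 0.2,      # assumed pause between attempts
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Run operation, sleeping and retrying only after a retryable OSError."""
    deadline = clock() + timeout_s
    while True:
        try:
            # On a non-bugged boot this returns at once, with no sleep at all.
            return operation()
        except OSError as exc:
            if exc.errno not in RETRYABLE or clock() >= deadline:
                raise
            sleep(interval_s)
```

The contrast with a fixed `sleep` before starting X is that the happy path pays nothing; only boots that hit the race wait.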

Revision history for this message
Bryce Harrington (bryce) wrote :

I've polished up the patch and tweaked it to give more informative error messages. I've stuck the patch into the PPA if anyone would like to test.

Revision history for this message
Jason Robinson (jaywink) wrote :

Would love to apply but running quantal

Revision history for this message
Franck (alci) wrote :

Here is my log with latest from PPA.

Revision history for this message
Adam Bruce (brucey-99-deactivatedaccount) wrote :

xserver-xorg-core:
  Installed: 2:1.13.2-0ubuntu3~lp982889~2

Which I guess means the newer build is installed?

I'm having more success with this one. Before it was failing every time, now it fails about 50% of the time.

I have got logs of the successful boot and failed boot if that helps

This attachment is the successful boot log

Revision history for this message
Adam Bruce (brucey-99-deactivatedaccount) wrote :

This is the failed log

Revision history for this message
Adam Bruce (brucey-99-deactivatedaccount) wrote :

This is the failed crash file

Revision history for this message
Bryce Harrington (bryce) wrote :

Thanks Adam and Bruce - for those logs it looks like you just lucked out and didn't hit "the bug", but that at least verifies the patch doesn't introduce any regressions. I'll go ahead and upload it. It's probably going to be largely random whether or not you hit the bug, but with the patch hopefully you just won't need to notice anymore.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package xorg-server - 2:1.13.2-0ubuntu3

---------------
xorg-server (2:1.13.2-0ubuntu3) raring; urgency=low

  * Add drm_device_keep_trying.patch: When kernel reports drm device is
    not available, don't give up immediately, but keep retrying for a
    little bit. Fixes boot failures due to a race condition with plymouth
    or the kernel. Typical symptom is xserver error exit, "Cannot run in
    framebuffer mode" and Xorg.0.log messages about "setversion 1.4
    failed".
    (LP: #982889)
 -- Bryce Harrington <email address hidden> Tue, 19 Feb 2013 07:58:24 -0800

Changed in xorg-server (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Franck (alci) wrote :

thanks Bryce.
One question: this workaround results in a flip-flap between console mode and graphical mode during boot, and a 5 to 20 second delay in the boot. This is better than lightdm failing to start, but not the best we can aim for...
So my question is: should I open another bug report, maybe mentioning the EACCES error code? Or are we just going on with this workaround, waiting for Mir to fix everything (and probably break some things too :-) )?

Revision history for this message
Bryce Harrington (bryce) wrote :

The flip/flap I don't know about; this patch shouldn't *cause* that afaict. Maybe that relates to the underlying bug though.

The patch does cause a delay in the boot, just the nature of the beast. However it's set to 2 sec. If you're seeing a 5-20 sec delay, that's longer than I would think it should be. But again, maybe that relates to the underlying bug.

Despite what apw and slangasek said earlier, I can find no sign of where EACCES would come from. The code doesn't appear to be passing it, and as seen in the logs there's no evidence of it appearing in X. Makes me wonder if there's a kernel patch missing or something. Fact is we really don't understand what the kernel is doing behind the scenes here.

I don't think we need another bug opened. While the xserver task for this bug is closed, there is still a task open against the kernel. I am hopeful we can get some of their time to help on this.

I'm not sure what the next step for this bug should be. It would probably help to develop a synthetic test case, so that developers can repro the problem 100% locally. mlankhorst took a shot at this today but found his experiments just exposed other unrelated kernel bugs. I've suggested he and apw brainstorm on this.

Regarding Mir - near as I can tell this is *not* an X bug. It's something lower down. We're just trying to paper over the underlying problem at the X layer. So, it would not surprise me at all that this same issue will hit Mir. But Mir has a much cleaner design for graphics bring up, so maybe not.

Revision history for this message
Franck (alci) wrote :

Here are some materials about what I see happen here:

- a video showing how the console shows after plymouth and before X comes back
- a bootchart
- syslog
- Xorg.0.log

There is indeed a huge gap (10 seconds) without any activity at the beginning of the boot, maybe not related to X, but still there.
Then X will take another 10 sec until lightdm is ready.

Does this fit into this bug report ?

Revision history for this message
Franck (alci) wrote :
Revision history for this message
Franck (alci) wrote :
Revision history for this message
Franck (alci) wrote :
Revision history for this message
Bryce Harrington (bryce) wrote :

[ 14.278] (II) config/udev: Adding drm device (/dev/dri/card0)
[ 14.716] drm device access denied
[ 15.140] drm device access denied
[ 15.568] drm device access denied
[ 15.992] drm device access denied
[ 16.415] drm device access denied
[ 16.844] drm device access denied
[ 17.268] drm device access denied
[ 17.691] drm device access denied
[ 18.115] drm device access denied
[ 18.538] drm device access denied
[ 18.538] setversion 1.4 failed

Gratifying to see the patch's behavior "in the wild". :-)

Anyway, you can see there that the patch introduced a 4-second delay. There's also another similar delay later in the Xorg.0.log:

[ 19.031] (II) config/udev: Adding drm device (/dev/dri/card0)
[ 19.455] drm device access denied
[ 19.880] drm device access denied
[ 20.305] drm device access denied
[ 20.729] drm device access denied
[ 21.157] drm device access denied
[ 21.582] drm device access denied
[ 22.008] drm device access denied
[ 22.432] drm device access denied
[ 22.857] drm device access denied
[ 23.282] drm device access denied
[ 23.282] setversion 1.4 failed

So, between the two invocations of the patched behavior that accounts for ~8-9 sec of your 10 sec delay. The good news is we can dial that up or down however we want, the bad news is it looks like doing so wouldn't substantially affect the behavior...
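
The arithmetic behind those numbers can be reproduced from the log timestamps. The sketch below is only a convenience for reading such logs; the function name is made up, and it assumes each retry burst starts at an "Adding drm device" line and ends at the next "setversion 1.4 failed" line, as in the excerpts above.

```python
import re
from typing import List

# Xorg.0.log lines start with a bracketed timestamp, e.g. "[ 14.278] message".
TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")


def retry_delays(log_text: str) -> List[float]:
    """Seconds from each 'Adding drm device' to the following setversion failure."""
    delays = []
    start = None
    for line in log_text.splitlines():
        match = TIMESTAMP.match(line.strip())
        if not match:
            continue
        stamp, message = float(match.group(1)), match.group(2)
        if "Adding drm device" in message:
            start = stamp
        elif "setversion 1.4 failed" in message and start is not None:
            delays.append(round(stamp - start, 3))
            start = None
    return delays
```

For the two excerpts above this gives roughly 4.26 s and 4.25 s, about 8.5 s in total, which matches the ~8-9 sec estimate.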

Revision history for this message
Bryce Harrington (bryce) wrote :

Alrighty, I've posted an update to the PPA for folks to test. https://launchpad.net/~bryce/+archive/lp982889/

As before, it looks like i386 has built, but amd64 is having some troubles - I've restarted it and hope it'll build by morning. Those of you on amd64 who are comfortable building packages locally may want to just grab the .dsc and roll your own .debs. Otherwise, hopefully the PPA will rebuild amd64 by tomorrow.

Maarten Lankhorst theorized that the failure to 'reset' may be because we need to forcefully set xserver as master of the DRM, so this patch adds a call to drmSetMaster(). Unfortunately, with the limited testing we've done so far we did not see this as truly fixing the bug. Maarten believes plymouth may not be handing the drm off properly, and has been roughing up a plymouth patch to try to investigate that angle.

Anyway, along with the drmSetMaster() call, I've done a rewrite of the patch, to make the logic a bit more presentable. I also knocked the timeout down to 2 seconds. So, worst case, for those of you suffering from this bug, this PPA should give you a slightly faster boot and otherwise no change in behavior. I've verified on my own (non-bugged) hardware that at least the xserver boots up and works properly in the non-bugged case.

For those of you wanting to do more testing on this issue, one thing to try is to disable plymouth completely. It seems that simply specifying nosplash on the kernel command line is insufficient to fully disable plymouth, and it's thought that this may account for why some people found the same troubles even though they turned the splash screen off.

Revision history for this message
Franck (alci) wrote :

Here is a Xorg.0.log with new patch.
Regarding the whole boot process, I didn't get the 10 sec stall at the very beginning of the boot process that was in my latest bootchart graph... but this might be totally unrelated, I don't know.

Revision history for this message
Franck (alci) wrote :
Revision history for this message
Franck (alci) wrote :

Sorry for duplicate post... can't seem to attach my log. Last try :-)

Revision history for this message
Maarten Lankhorst (mlankhorst) wrote :

I posted a new version of the patch that should fix xorg-server, it should be in 1.13.3-0ubuntu1.

However it might not be enough to make things work, since there might still be bugs in plymouth and lightdm

Changed in plymouth (Ubuntu):
status: New → Confirmed
Changed in linux (Ubuntu):
status: Confirmed → Invalid
Changed in libdrm (Ubuntu):
status: Triaged → Invalid
Changed in lightdm (Ubuntu):
status: New → Confirmed
assignee: nobody → Maarten Lankhorst (mlankhorst)
Changed in plymouth (Ubuntu):
assignee: nobody → Maarten Lankhorst (mlankhorst)
Revision history for this message
Franck (alci) wrote :

The problem(s) seem to be fixed for me. No more boot delay, lightdm starts fine, no console appearing...
My boot time is under 10 sec.
No warning or unavailable drm in Xorg.0.log... Kind of miraculous :-)

Attached are my logs, in case this matters.

Revision history for this message
Franck (alci) wrote :
Revision history for this message
Maarten Lankhorst (mlankhorst) wrote :

Yes, an Xorg log from a working boot would be nice, but I don't see it attached. ;-)

Revision history for this message
Franck (alci) wrote :

@Maarten. Sorry, forgot to attach it and did it on a second post. Here it is.

Revision history for this message
Maarten Lankhorst (mlankhorst) wrote :

Whoops, judging from the activity log it looks like you did attach it; it just doesn't show up in the comments somehow.

Revision history for this message
Alexander Kops (alexkops) wrote :

I'm also affected by this #969489 which I solved by always restarting lightdm in my /etc/rc.local
But I still get an Xorg crash report after starting once in a while, so maybe those two bugs are also related?

Revision history for this message
Franck (alci) wrote :

It was just fixed for me, but it seems that the last update to lightdm and/or upstart broke things again. The symptoms are not quite the same, but the issues might be related...

Now X fails with these messages:
[ 4.746] (WW) xf86OpenConsole: setpgid failed: Operation not permitted
[ 4.746] (WW) xf86OpenConsole: setsid failed: Operation not permitted
[ 4.748] (WW) Falling back to old probe method for vesa
[ 4.748] (WW) Falling back to old probe method for modesetting
[ 4.748] (WW) Falling back to old probe method for fbdev
[ 4.748] (EE) No devices detected.
[ 4.748]
Fatal server error:
[ 4.748] no screens found

I opened bug #1159099 against lightdm (but maybe it's a duplicate after all).
I also found references to the same type of problems in #799069.

All related to a race condition / problem with upstart events between lightdm / plymouth.

Revision history for this message
Steve Langasek (vorlon) wrote :

Those errors show that X is running without the correct privileges; that has nothing to do with Plymouth.

Revision history for this message
Maarten Lankhorst (mlankhorst) wrote :

Wrong; in any case the race happens because plymouth-splash is racing against lightdm. If you add 'and started plymouth-splash' after the 'and dbus' line in /etc/init/lightdm.conf, the race should be gone.

Changed in lightdm (Ubuntu):
status: Confirmed → Invalid
status: Invalid → Confirmed
Revision history for this message
Franck (alci) wrote :

I can confirm that adding 'and started plymouth-splash' after the 'and dbus' line in /etc/init/lightdm.conf , as Maarten suggests, does fix the problem for me.

Revision history for this message
Steve Langasek (vorlon) wrote :

+1 for this change. It should be safe because plymouth-splash is only ever started once at boot, so there's no risk of 'and started plymouth-splash' causing a maintainer script hang at package upgrade time.

Since plymouth-splash itself waits for the video device before starting, I think the lightdm start condition can probably be reduced to:

start on ((filesystem
           and runlevel [!06]
           and started dbus
           and started plymouth-splash)
          or runlevel PREVLEVEL=S)

Revision history for this message
Franck (alci) wrote :

Just a question: what if the no-splash is passed as a boot option ? Or if plymouth is removed ?

Revision history for this message
Steve Langasek (vorlon) wrote :

On Wed, Mar 27, 2013 at 07:14:39AM -0000, Franck wrote:
> Just a question: what if the no-splash is passed as a boot option ?

That only controls whether you have a graphical splash, not whether the
plymouth-splash job runs.

> Or if plymouth is removed ?

That is absolutely unsupported.

Changed in lightdm (Ubuntu):
importance: Undecided → Critical
Robert Hooker (sarvatt)
Changed in lightdm (Ubuntu):
status: Confirmed → Triaged
Steve Magoun (smagoun)
Changed in oem-priority:
importance: Undecided → Critical
Steve Magoun (smagoun)
Changed in oem-priority:
status: New → In Progress
Revision history for this message
Ubuntu QA Website (ubuntuqa) wrote :

This bug has been reported on the Ubuntu ISO testing tracker.

A list of all reports related to this bug can be found here:
http://iso.qa.ubuntu.com/qatracker/reports/bugs/982889

tags: added: iso-testing
Revision history for this message
Robert Hooker (sarvatt) wrote :

plymouth:debug log from a failed boot from tjaalton

Revision history for this message
Timo Aaltonen (tjaalton) wrote :

So, the proposed change from #115 doesn't work if plymouth is started from the initramfs, like when cryptsetup is used. In such a case no 'started plymouth-splash' is ever emitted, so something else needs to be used:

00:34 < tjaalton> slangasek: heh, yeah. I was wondering if there was some more generic event to abuse here
00:36 < slangasek> tjaalton: not really, we need to create one - either by fixing it in upstart, or by adding a
                   secondary job that's 'start on startup or started plymouth-splash', checks for plymouth-splash
                   running already, and emits an appropriate common event
00:36 < tjaalton> slangasek: right, that could work
00:36 < tjaalton> as an interim solution
00:37 < slangasek> tjaalton: and by 'checks for plymouth splash running', I mean checking the output of 'status
                   plymouth-splash'

fixing upstart means "synthesizing 'started' events for jobs started from initramfs", but in the meantime the other approach could be used.
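
The interim approach from the IRC log above can be sketched as the script body of such a secondary job. This is a minimal sketch under stated assumptions, not the job file that was eventually shipped: the function name `announce_plymouth` and the `STATUS_CMD` / `EMIT_CMD` hooks are hypothetical (they exist only so the logic can be tried without upstart), the event name `plymouth-ready` is taken from the changelog entries later in this report, and the `start/running` pattern assumes upstart's usual `status` output of the form "job goal/state".

```shell
#!/bin/sh
# Sketch of the logic for a secondary job that is
# "start on startup or started plymouth-splash".
# STATUS_CMD and EMIT_CMD are illustrative hooks; on a real system they
# would be upstart's `status` and `initctl emit`.
STATUS_CMD=${STATUS_CMD:-status}
EMIT_CMD=${EMIT_CMD:-initctl emit}
EVENT_NAME=${EVENT_NAME:-plymouth-ready}

# announce_plymouth TRIGGER
# TRIGGER is the event that started the job: "startup" or "started".
announce_plymouth() {
    trigger=$1
    if [ "$trigger" = startup ]; then
        # On plain startup, only announce if plymouth-splash is already
        # running (for example because it was started from the initramfs).
        # Otherwise stay quiet and let the 'started plymouth-splash'
        # instance of the job do the announcing later.
        $STATUS_CMD plymouth-splash 2>/dev/null \
            | grep -q 'start/running' || return 0
    fi
    # Emit the common event that login managers can wait for.
    $EMIT_CMD "$EVENT_NAME"
}
```

With this shape, a display manager only needs to wait for the one common event, regardless of whether plymouth came up from the initramfs or from the normal job.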

Revision history for this message
manatorg (manatorg) wrote :

Hi,

had the same problem.

 /etc/init/lightdm.conf:
   ...
    sleep 2
    exec lightdm
    ...

sleep 2 was enough to solve the problem for me.

I am using Intel graphics on an L520 with an SSD (mSATA).

Revision history for this message
James M. Leddy (jm-leddy) wrote :

Moving to High prio in oem-priority since we have a workaround.

Changed in oem-priority:
assignee: nobody → James M. Leddy (jm-leddy)
importance: Critical → High
Revision history for this message
Simon Baconnais (smon-deactivatedaccount) wrote :

Same problem here. A sleep 10 workaround solved it, but can't we have a fix?

Timo Aaltonen (tjaalton)
no longer affects: libdrm (Ubuntu)
Revision history for this message
Timo Aaltonen (tjaalton) wrote :

Ok I have the necessary changes tested locally, and they seem to work when plymouth is started from the initrd too.

no longer affects: xorg-server (Ubuntu)
no longer affects: linux (Ubuntu)
Changed in plymouth (Ubuntu Raring):
assignee: Maarten Lankhorst (mlankhorst) → Timo Aaltonen (tjaalton)
importance: Undecided → Critical
status: Confirmed → In Progress
Changed in lightdm (Ubuntu Raring):
assignee: Maarten Lankhorst (mlankhorst) → Timo Aaltonen (tjaalton)
status: Triaged → In Progress
Revision history for this message
Timo Aaltonen (tjaalton) wrote :

For the record, plymouth and lightdm updates are available in raring-proposed for testing.

Changed in lightdm (Ubuntu Raring):
status: In Progress → Fix Committed
Changed in plymouth (Ubuntu Raring):
status: In Progress → Fix Committed
Robert Hooker (sarvatt)
tags: added: verification-needed
Revision history for this message
Kamal Mostafa (kamalmostafa) wrote :

I have verified that these new package versions from raring-proposed do fix the problem. Thanks!:

  lightdm (1.6.0-0ubuntu2.1)
  plymouth (0.8.8-0ubuntu6.1)

tags: added: verification-done
removed: verification-needed
Revision history for this message
Adam Conrad (adconrad) wrote : Update Released

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lightdm - 1.6.0-0ubuntu2.1

---------------
lightdm (1.6.0-0ubuntu2.1) raring-proposed; urgency=low

  * lightdm.upstart: Add a start condition on plymouth-ready, and
    drop conditions already handled by plymouth-splash (LP: #982889).
  * control: Depend on the new plymouth version that provides plymouth-ready.
 -- Timo Aaltonen <email address hidden> Tue, 23 Apr 2013 12:10:28 +0300

Changed in lightdm (Ubuntu Raring):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package plymouth - 0.8.8-0ubuntu6.1

---------------
plymouth (0.8.8-0ubuntu6.1) raring-proposed; urgency=low

  * plymouth-ready.conf: Send an event to indicate plymouth is up. Needed
    to inform login managers that they can start without racing with
    plymouth-splash. (LP: #982889)
 -- Timo Aaltonen <email address hidden> Fri, 19 Apr 2013 21:55:18 +0300

Changed in plymouth (Ubuntu Raring):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lightdm - 1.6.0-0ubuntu2.1

---------------
lightdm (1.6.0-0ubuntu2.1) raring-proposed; urgency=low

  * lightdm.upstart: Add a start condition on plymouth-ready, and
    drop conditions already handled by plymouth-splash (LP: #982889).
  * control: Depend on the new plymouth version that provides plymouth-ready.
 -- Timo Aaltonen <email address hidden> Tue, 23 Apr 2013 12:10:28 +0300

Changed in lightdm (Ubuntu Saucy):
status: New → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package plymouth - 0.8.8-0ubuntu6.1

---------------
plymouth (0.8.8-0ubuntu6.1) raring-proposed; urgency=low

  * plymouth-ready.conf: Send an event to indicate plymouth is up. Needed
    to inform login managers that they can start without racing with
    plymouth-splash. (LP: #982889)
 -- Timo Aaltonen <email address hidden> Fri, 19 Apr 2013 21:55:18 +0300

Changed in plymouth (Ubuntu Saucy):
status: New → Fix Released
Timo Aaltonen (tjaalton)
Changed in lightdm (Ubuntu Precise):
assignee: nobody → Timo Aaltonen (tjaalton)
importance: Undecided → High
status: New → Triaged
Changed in plymouth (Ubuntu Precise):
assignee: nobody → Timo Aaltonen (tjaalton)
importance: Undecided → High
status: New → Triaged
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Tomas, or anyone else affected,

Accepted plymouth into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/plymouth/0.8.2-2ubuntu31.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in plymouth (Ubuntu Precise):
status: Triaged → Fix Committed
tags: removed: verification-done
tags: added: verification-needed
Changed in lightdm (Ubuntu Precise):
status: Triaged → Fix Committed
Revision history for this message
Brian Murray (brian-murray) wrote :

Hello Tomas, or anyone else affected,

Accepted lightdm into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/lightdm/1.2.3-0ubuntu2.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Revision history for this message
Miklos Juhasz (mjuhasz) wrote :

The proposed version of lightdm/plymouth did not fix the issue for me. Lightdm sometimes fails to start, see Xorg log attached.
For now I had to put back the "and stopped udevtrigger" workaround into /etc/init/lightdm.conf. With that lightdm never failed to start for me.

tags: added: verification-failed
removed: verification-needed
Revision history for this message
Steve Langasek (vorlon) wrote :

Miklos, please confirm the version numbers of both lightdm and plymouth that you have installed, and please verify that all of the upstart job files for both of these packages are the unmodified versions from the package. If you already had a modified /etc/init/lightdm.conf on your system, perhaps you didn't get the new version correctly installed?

Revision history for this message
Miklos Juhasz (mjuhasz) wrote :

I reinstalled the proposed plymouth/lightdm packages, confirmed that I have the unmodified files from both packages.
I rebooted my desktop 30 times in a row, not a single failure. If I happen to experience the issue again I'll report back.

Steve Langasek (vorlon)
tags: added: verification-needed
removed: verification-failed
Tim Lunn (darkxst)
Changed in gdm (Ubuntu Saucy):
assignee: nobody → Tim (darkxst)
Revision history for this message
Miklos Juhasz (mjuhasz) wrote :

After all, the proposed packages work fine here.
I've installed them on 2 laptops and a desktop (all suffering from this issue) and have been using them for a week. I started up these hosts several times a day, and for the sake of testing I also rebooted them a couple of times every day, and each time lightdm started as expected.
I also installed the packages on 2 other laptops (not suffering from this bug) and I did not spot any regression.

Steve Langasek (vorlon)
tags: added: verification-done
removed: verification-needed
Martin Pitt (pitti)
Changed in gdm (Ubuntu Saucy):
status: New → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gdm - 3.6.1-0ubuntu6

---------------
gdm (3.6.1-0ubuntu6) saucy; urgency=low

  * Merge changes from lightdm to fix plymouth race (LP: #982889)
    - lightdm.upstart: Add a start condition on plymouth-ready, and
      drop conditions already handled by plymouth-splash.
    - control: Depend on the new plymouth version that provides
      plymouth-ready.
 -- Tim Lunn <email address hidden> Mon, 27 May 2013 12:41:07 +0200

Changed in gdm (Ubuntu Saucy):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package plymouth - 0.8.2-2ubuntu31.1

---------------
plymouth (0.8.2-2ubuntu31.1) precise-proposed; urgency=low

  * plymouth-ready.conf: Send an event to indicate plymouth is up. Needed
    to inform login managers that they can start without racing with
    plymouth-splash. (LP: #982889)
 -- Timo Aaltonen <email address hidden> Fri, 03 May 2013 10:49:34 +0300

Changed in plymouth (Ubuntu Precise):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package lightdm - 1.2.3-0ubuntu2.1

---------------
lightdm (1.2.3-0ubuntu2.1) precise-proposed; urgency=low

  * lightdm.upstart: Add a start condition on plymouth-ready, and
    drop conditions already handled by plymouth-splash (LP: #982889).
  * control: Depend on the new plymouth version that provides plymouth-ready.
 -- Timo Aaltonen <email address hidden> Fri, 03 May 2013 11:30:33 +0300

Changed in lightdm (Ubuntu Precise):
status: Fix Committed → Fix Released
Changed in oem-priority:
status: In Progress → Fix Released
Revision history for this message
William Pietri (william-launchpad-net) wrote :

Hi! I suspect this fix may be causing a problem for me. I'm using an up to date 12.04.2 LTS setup that worked fine until now.

My symptom is that upon reboot, I end up with a black screen. By hitting Control-Alt-F1, I can get things working properly with "sudo service lightdm start". Looking at the logs, everything seems fine, except that /var/log/upstart/plymouth-ready-startup.log says "status: Unknown job: plymouth-splash".

Running "initctl check-config -w" gets me similar comments:
  setvtrgb
    start on: unknown job plymouth-splash
  plymouth-ready
    start-on: unknown job plymouth splash

I'm glad to debug further, but I can't quite figure out how this is supposed to work; the correct relationship between events is opaque to me.

Revision history for this message
Steve Langasek (vorlon) wrote :

William, you have a broken upstart config on your system. The /etc/init/plymouth-splash.conf system file has apparently been removed, by you, another admin, or a filesystem error. To restore it, you should reinstall the plymouth package with the '--force-confmiss' option.

Revision history for this message
William Pietri (william-launchpad-net) wrote :

Thanks, Steve. The box was set up by a vendor (Emperor Linux), so perhaps they did something custom with the splash screens when I got it 18 months ago.

Aesthetically, making something vital (UI startup) depend on something decorative (splash screens) seems odd. Perhaps future versions of this can be made more robust.

In case somebody else with a similar problem finds this bug report, I verified the problem by using debsums:

  % sudo debsums -a plymouth
  [most package items ok]
  debsums: missing file /etc/init/plymouth-log.conf (from plymouth package)
  debsums: missing file /etc/init/plymouth-upstart-bridge.conf (from plymouth package)
  /etc/init/plymouth.conf OK
  debsums: missing file /etc/init/plymouth-stop.conf (from plymouth package)
  /etc/init/plymouth-ready.conf OK
  debsums: missing file /etc/init/plymouth-splash.conf (from plymouth package)

And the precise command I used to fix the problem was:
  % sudo apt-get --reinstall -o Dpkg::Options::=--force-confmiss install plymouth

That put back the missing files.

I also did "sudo debsums -s" to make sure nothing else important was missing.

Revision history for this message
Steve Langasek (vorlon) wrote : Re: [Bug 982889] Re: X trying to start before plymouth has finished using the drm driver

On Wed, Jun 05, 2013 at 05:39:07PM -0000, William Pietri wrote:
> Aesthetically, making something vital (UI startup) depend on something
> decorative (splash screens) seems odd. Perhaps future versions of this
> can be made more robust.

To clarify, plymouth is not "merely decorative". It is the boot-time
interface used for interacting with the user regarding any boot problems; so
it's always present, and it needs to hand off control of the console
reliably to lightdm, which means that the lightdm job can't start until it
knows for sure plymouth has started up.

Revision history for this message
Johann Gail (johann-gail) wrote :

Either this fix isn't complete or I have a similar problem.

On my fast hardware (a Clevo W370ET with SSD and a quad core i7) I observe the same effects. Even after updating to latest versions I get the fault. I'm running raring (XUbuntu 13.04).

On about 50% of boots I see a black screen with a blinking cursor and a mouse pointer. I can switch to a terminal console, and by entering the command "sudo service lightdm restart" I get a running graphical session.

The workaround from comment #5, inserting a sleep 1 in lightdm.conf, also helps. But I don't like the idea of solving a race condition by inserting a sleep, especially on a fast-booting SSD notebook.

Should I open a new bug report or should I continue this one?

Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Tomas, or anyone else affected,

Accepted gdm into raring-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/gdm/3.6.1-0ubuntu4.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in gdm (Ubuntu Raring):
status: New → Fix Committed
tags: removed: verification-done
tags: added: verification-needed
Revision history for this message
the foodmonkey (foodmonkey) wrote :

ok - the fix might have been released but can someone please let those idiots at intel open technologies know about the fix - i used their repos to install the latest intel graphics stack - which then results in a distro upgrade and whammo - had to put the "sleep 10" workaround into /etc/init/lightdm.conf so it wouldn't boot into low graphics mode - seems like the boys from intel are using an old baseline for their kernel mods

tags: added: saucy
Revision history for this message
Steve Langasek (vorlon) wrote :

Hello Tomas, or anyone else affected,

Accepted gdm into precise-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/gdm/3.0.4-0ubuntu15.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in gdm (Ubuntu Precise):
status: New → Fix Committed
Revision history for this message
Philippe Coval (rzr) wrote :

Wondering why this bug has not been forwarded upstream?

https://bugs.freedesktop.org/show_bug.cgi?id=42972

does not mention:

     setversion 1.4 failed

nor does https://bugzilla.redhat.com/show_bug.cgi?id=855677

Also, I feel the patch is worth forwarding upstream too:

https://launchpadlibrarian.net/131169748/0001-If-drm-device-couldn-t-be-opened-keep-trying-for-a-s.patch

Revision history for this message
Timo Aaltonen (tjaalton) wrote :

because it's not the same bug

Revision history for this message
Philippe Coval (rzr) wrote :

Well, I forwarded the patch to that closed bug anyway, since it looks like a userspace race too...

The next step would be to open a new one and update that patch (see TODO) to see if upstream wants to integrate this sleep workaround or can give hints on fixing this elsewhere...

Revision history for this message
Franz Hsieh (franz-hsieh) wrote :

We have the same problem on a platform running Ubuntu 12.04.2 with the quantal backport.

I reviewed the raring patch from https://launchpad.net/ubuntu/+source/xorg-server/2:1.13.2-0ubuntu3 and tried to backport it to 12.04.2.
It appears to work fine on our platform.

Revision history for this message
Steve Langasek (vorlon) wrote :

Maarten, is this something you could have a look at?

Changed in xorg-server-lts-quantal (Ubuntu Raring):
status: New → Invalid
Changed in xorg-server-lts-quantal (Ubuntu Saucy):
status: New → Invalid
Changed in xorg-server-lts-quantal (Ubuntu Precise):
assignee: nobody → Maarten Lankhorst (mlankhorst)
milestone: none → ubuntu-12.04.4
status: New → Triaged
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gdm - 3.0.4-0ubuntu15.2

---------------
gdm (3.0.4-0ubuntu15.2) precise; urgency=low

  * Merge changes from lightdm to fix plymouth race (LP: #982889)
    - lightdm.upstart: Add a start condition on plymouth-ready, and
      drop conditions already handled by plymouth-splash.
    - control: Depend on the new plymouth version that provides plymouth-ready.
 -- Tim Lunn <email address hidden> Thu, 23 May 2013 17:45:44 +1000

Changed in gdm (Ubuntu Precise):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package gdm - 3.6.1-0ubuntu4.1

---------------
gdm (3.6.1-0ubuntu4.1) raring; urgency=low

  * Merge changes from lightdm to fix plymouth race (LP: #982889)
    - lightdm.upstart: Add a start condition on plymouth-ready, and
      drop conditions already handled by plymouth-splash.
    - control: Depend on the new plymouth version that provides plymouth-ready.
 -- Tim Lunn <email address hidden> Thu, 23 May 2013 17:45:44 +1000

Changed in gdm (Ubuntu Raring):
status: Fix Committed → Fix Released
Revision history for this message
Mathew Hodson (mhodson) wrote :

The gdm packages in raring and precise-proposed have been released, so I am removing the verification-needed tag.

tags: removed: verification-needed
Revision history for this message
Rolf Leggewie (r0lf) wrote :

raring has seen the end of its life and is no longer receiving any updates. Marking the raring task for this ticket as "Won't Fix".

Changed in kde-workspace (Ubuntu Raring):
status: New → Won't Fix
Revision history for this message
Rolf Leggewie (r0lf) wrote :

saucy has seen the end of its life and is no longer receiving any updates. Marking the saucy task for this ticket as "Won't Fix".

Changed in kde-workspace (Ubuntu Saucy):
status: New → Won't Fix
Revision history for this message
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in kde-workspace (Ubuntu Precise):
status: New → Confirmed
Changed in kde-workspace (Ubuntu):
status: New → Confirmed
Revision history for this message
gna (nagy-gergely) wrote :

I know this is a rather old thing, but it has started irritating me now. One bit of info:

This shows that plymouth-ready comes after lightdm...

/var/lib/plymouth/boot-duration
0.978:avahi-daemon
0.979:statd
0.979:udev-finish
0.993:rc-sysinit
0.993:rc
0.993:tty4
0.993:tty5
0.993:acpid
0.993:anacron
0.993:tty2
0.993:tty3
0.993:dmesg
0.993:tty6
0.993:plymouth-stop
0.993:cron
0.993:atd
0.993:irqbalance
0.993:whoopsie
0.993:libvirt-bin
0.994:apport
0.995:qemu-kvm
0.998:lightdm
0.999:plymouth-ready
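
For anyone wanting to repeat this check on their own machine, the comparison can be scripted. The helper below is a hypothetical sketch (`listed_before` is not an existing tool) that only assumes the "fraction:job" line format shown in the listing above; it reports the order of entries in the file and nothing more, so whether that order reflects the actual upstart event order is a separate question.

```shell
#!/bin/sh
# listed_before FILE FIRST SECOND
# Succeeds (exit 0) if job FIRST appears on an earlier line than job SECOND
# in a file of "fraction:job" lines; fails if either job is missing or the
# order is reversed. Hypothetical helper for illustration.
listed_before() {
    awk -F: -v a="$2" -v b="$3" '
        $2 == a && !pa { pa = NR }   # remember first line naming job a
        $2 == b && !pb { pb = NR }   # remember first line naming job b
        END { exit !(pa && pb && pa < pb) }
    ' "$1"
}
```

For example, `listed_before /var/lib/plymouth/boot-duration lightdm plymouth-ready` would succeed for the listing above, since lightdm is on the line before plymouth-ready.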

Revision history for this message
Stu (stu-axon) wrote :

Just confirming that on 15.04 I get the symptoms mentioned in

https://bugs.launchpad.net/ubuntu/+source/xorg-server/+bug/1115177

Where Nouveau won't load, which is apparently because of this bug

[ 27074.122] (II) [drm] nouveau interface version: 1.2.1
[ 27074.122] (II) Loading sub module "dri2"
[ 27074.122] (II) LoadModule: "dri2"
[ 27074.122] (II) Module "dri2" already built-in
[ 27074.122] (EE) NOUVEAU(0): [drm] failed to set drm interface version.
[ 27074.122] (EE) NOUVEAU(0): [drm] error opening the drm
[ 27074.122] (EE) NOUVEAU(0): 904:
[ 27074.122] (II) UnloadModule: "nouveau"
[ 27074.122] (EE) Screen(s) found, but none have a usable configuration.
[ 27074.122] (EE)
Fatal server error:
[ 27074.122] (EE) no screens found(EE)
[ 27074.122] (EE)

Mathew Hodson (mhodson)
Changed in xorg-server-lts-quantal (Ubuntu Precise):
assignee: Maarten Lankhorst (mlankhorst) → nobody
milestone: ubuntu-12.04.4 → none
status: Triaged → Won't Fix
no longer affects: xorg-server-lts-quantal (Ubuntu Saucy)
no longer affects: xorg-server-lts-quantal (Ubuntu Raring)
Mathew Hodson (mhodson)
no longer affects: xorg-server-lts-quantal (Ubuntu Precise)
no longer affects: xorg-server-lts-quantal (Ubuntu)
no longer affects: kde-workspace (Ubuntu Saucy)
no longer affects: kde-workspace (Ubuntu Raring)
no longer affects: kde-workspace (Ubuntu Precise)
no longer affects: kde-workspace (Ubuntu)
Mathew Hodson (mhodson)
Changed in lightdm (Ubuntu Saucy):
importance: Undecided → High
Changed in plymouth (Ubuntu Saucy):
importance: Undecided → High
Changed in gdm (Ubuntu):
importance: Undecided → High
Changed in gdm (Ubuntu Precise):
importance: Undecided → Medium
Changed in gdm (Ubuntu):
importance: High → Medium
Changed in gdm (Ubuntu Raring):
importance: Undecided → Medium
Changed in gdm (Ubuntu Saucy):
importance: Undecided → Medium