plymouthd crashed with SIGABRT in __assert_fail_base()

Bug #966868 reported by Florin Broasca
This bug affects 9 people
Affects             Status      Importance  Assigned to  Milestone
xf86-video-intel    Unknown     High
libdrm (Ubuntu)     Incomplete  High        Unassigned
plymouth (Ubuntu)   Invalid     High        Unassigned

Bug Description

After a forced restart of Ubuntu, I got a System Crash error after logging in.

lsb_release -rd
Description: Ubuntu precise (development branch)
Release: 12.04

Thanks!

ProblemType: Crash
DistroRelease: Ubuntu 12.04
Package: plymouth 0.8.2-2ubuntu28
ProcVersionSignature: Ubuntu 3.2.0-20.33-generic 3.2.12
Uname: Linux 3.2.0-20-generic x86_64
ApportVersion: 1.95-0ubuntu1
Architecture: amd64
Date: Wed Mar 28 09:33:16 2012
DefaultPlymouth: /lib/plymouth/themes/ubuntu-logo/ubuntu-logo.plymouth
ExecutablePath: /sbin/plymouthd
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Alpha amd64 (20120322)
MachineType: LENOVO 4284BZ4
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-20-generic root=UUID=f1bb4518-a890-49c0-9339-ecc3d8bd2658 ro quiet splash vt.handoff=7
ProcCmdline: /sbin/plymouthd --mode=boot --attach-to-session
ProcEnviron:
 TERM=linux
 PATH=(custom, no user)
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-20-generic root=UUID=f1bb4518-a890-49c0-9339-ecc3d8bd2658 ro quiet splash vt.handoff=7
Signal: 6
SourcePackage: plymouth
TextPlymouth: /lib/plymouth/themes/ubuntu-text/ubuntu-text.plymouth
Title: plymouthd crashed with SIGABRT in raise()
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

dmi.bios.date: 01/19/2012
dmi.bios.vendor: LENOVO
dmi.bios.version: 8BET56WW (1.36 )
dmi.board.asset.tag: Not Available
dmi.board.name: 4284BZ4
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr8BET56WW(1.36):bd01/19/2012:svnLENOVO:pn4284BZ4:pvrThinkPadW520:rvnLENOVO:rn4284BZ4:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 4284BZ4
dmi.product.version: ThinkPad W520
dmi.sys.vendor: LENOVO

Florin Broasca (florin.broasca) wrote :
Apport retracing service (apport) wrote :

StacktraceTop:
 __assert_fail_base (fmt=<optimized out>, assertion=0x7f83e335d118 "0", file=0x7f83e335d7a8 "../../intel/intel_bufmgr_gem.c", line=<optimized out>, function=<optimized out>) at assert.c:94
 __GI___assert_fail (assertion=0x7f83e335d118 "0", file=0x7f83e335d7a8 "../../intel/intel_bufmgr_gem.c", line=2783, function=0x7f83e335ddf0 "drm_intel_bufmgr_gem_init") at assert.c:103
 drm_intel_bufmgr_gem_init (fd=<optimized out>, batch_size=4096) at ../../intel/intel_bufmgr_gem.c:2783
 create_driver (device_fd=11) at ./ply-renderer-i915-driver.c:85
 load_driver (backend=0x250f7b0) at ./plugin.c:512

Apport retracing service (apport) wrote : Stacktrace.txt
Apport retracing service (apport) wrote : ThreadStacktrace.txt
Changed in plymouth (Ubuntu):
importance: Undecided → Medium
summary: - plymouthd crashed with SIGABRT in raise()
+ plymouthd crashed with SIGABRT in __assert_fail_base()
tags: removed: need-amd64-retrace
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in plymouth (Ubuntu):
status: New → Confirmed
Steve Langasek (vorlon)
visibility: private → public
Changed in plymouth (Ubuntu):
importance: Medium → High
Bryce Harrington (bryce) wrote :

This looks more like a libdrm bug. There's a race condition with the i915 device not being ready by the time plymouth is starting. Possibly it's because it doesn't have drm master.

<Sarvatt> apparently chromeos works around it with http://git.chromium.org/gitweb/?p=chromiumos/third_party/kernel.git;a=commit;h=32a8c5b67163a6ae211ff2683c999b6ad2c76d1f but that's just working around the problem..

Googling intel/intel_bufmgr_gem.c:2783 turns up a lot of hits.

The code in question with the assert is:

        if (IS_GEN2(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 2;
        else if (IS_GEN3(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 3;
        else if (IS_GEN4(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 4;
        else if (IS_GEN5(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 5;
        else if (IS_GEN6(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 6;
        else if (IS_GEN7(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 7;
        else
                assert(0);

$ xpci 8086:0126
snb-m-gt2+ (8086:0126) sandybridge

So it should be going into the IS_GEN6 branch.
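
The else branch above is only reachable when pci_device matches none of the IS_GENn() checks. One possible reading (an assumption, not something confirmed in this report) is that the device ID was never filled in because the device was not ready. Below is a minimal, self-contained sketch of that shape. It is not libdrm's code: gen_from_pci_id(), gen_from_pci_id_or_abort(), and the one-entry table are hypothetical, and only the 8086:0126 to gen 6 mapping is taken from the output above. The first function returns 0 for an unrecognised ID so that a caller could fail gracefully instead of aborting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table: only 0x0126 (Sandy Bridge, gen 6) comes from this report. */
struct gen_entry {
        uint16_t pci_device;
        int gen;
};

static const struct gen_entry known_devices[] = {
        { 0x0126, 6 },
};

/* Returns the hardware generation, or 0 if the ID is not in the table. */
int gen_from_pci_id(uint16_t pci_device)
{
        size_t n = sizeof(known_devices) / sizeof(known_devices[0]);

        for (size_t i = 0; i < n; i++) {
                if (known_devices[i].pci_device == pci_device)
                        return known_devices[i].gen;
        }
        return 0;
}

/* Mirrors the abort-on-unknown behaviour of the snippet quoted above. */
int gen_from_pci_id_or_abort(uint16_t pci_device)
{
        int gen = gen_from_pci_id(pci_device);

        assert(gen != 0);
        return gen;
}
```

With this shape, an ID of 0 (standing in for a value that was never read) takes the "unknown" path, which is where the quoted code calls assert(0).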

In , Bryce Harrington (bryce) wrote :

Forwarding this bug from Ubuntu reporter Florin:
http://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-intel/+bug/966868

[Problem]
On Ubuntu there appears to be a race condition in libdrm during boot. It appears the i915 drm device exists but isn't fully initialized at the time plymouth wants to use it.

Note I'm filing this against -intel just because it's the intel portion of libdrm where the code is passing through; I think this is really a libdrm bug.

[Original Description]
After a forced restart of Ubuntu, I got a System Crash error after logging in.

lsb_release -rd
Description: Ubuntu precise (development branch)
Release: 12.04

This looks more like a libdrm bug. There's a race condition with the i915 device not being ready by the time plymouth is starting. Possibly it's because it doesn't have drm master.

<Sarvatt> apparently chromeos works around it with http://git.chromium.org/gitweb/?p=chromiumos/third_party/kernel.git;a=commit;h=32a8c5b67163a6ae211ff2683c999b6ad2c76d1f but that's just working around the problem..

Googling intel/intel_bufmgr_gem.c:2783 turns up a lot of hits.

The code in question with the assert is:

        if (IS_GEN2(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 2;
        else if (IS_GEN3(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 3;
        else if (IS_GEN4(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 4;
        else if (IS_GEN5(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 5;
        else if (IS_GEN6(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 6;
        else if (IS_GEN7(bufmgr_gem->pci_device))
                bufmgr_gem->gen = 7;
        else
                assert(0);

$ xpci 8086:0126
snb-m-gt2+ (8086:0126) sandybridge

So it should be going into the IS_GEN6 branch.

Thanks!

ProblemType: Crash
DistroRelease: Ubuntu 12.04
Package: plymouth 0.8.2-2ubuntu28
ProcVersionSignature: Ubuntu 3.2.0-20.33-generic 3.2.12
Uname: Linux 3.2.0-20-generic x86_64
ApportVersion: 1.95-0ubuntu1
Architecture: amd64
Date: Wed Mar 28 09:33:16 2012
DefaultPlymouth: /lib/plymouth/themes/ubuntu-logo/ubuntu-logo.plymouth
ExecutablePath: /sbin/plymouthd
InstallationMedia: Ubuntu 12.04 LTS "Precise Pangolin" - Alpha amd64 (20120322)
MachineType: LENOVO 4284BZ4
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-20-generic root=UUID=f1bb4518-a890-49c0-9339-ecc3d8bd2658 ro quiet splash vt.handoff=7
ProcCmdline: /sbin/plymouthd --mode=boot --attach-to-session
ProcEnviron:
 TERM=linux
 PATH=(custom, no user)
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-20-generic root=UUID=f1bb4518-a890-49c0-9339-ecc3d8bd2658 ro quiet splash vt.handoff=7
Signal: 6
SourcePackage: plymouth
TextPlymouth: /lib/plymouth/themes/ubuntu-text/ubuntu-text.plymouth
Title: plymouthd crashed with SIGABRT in raise()
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

dmi.bios.date: 01/19/2012
dmi.bios.vendor: LENOVO
dmi.bios.version: 8BET56WW (1.36 )
dmi.board.asset.tag: Not Available
dmi.board.name: 4284BZ4
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
...


Changed in plymouth (Ubuntu):
status: Confirmed → Invalid
Bryce Harrington (bryce)
Changed in plymouth (Ubuntu):
status: Invalid → Triaged
Changed in xserver-xorg-video-intel (Ubuntu):
importance: Undecided → High
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Bryce Harrington (bryce) wrote :

I've forwarded this bug upstream to https://bugs.freedesktop.org/show_bug.cgi?id=48894 - please subscribe yourself to this bug, in case they need further information or wish you to test something. Thanks ahead of time!

affects: xserver-xorg-video-intel (Ubuntu) → libdrm (Ubuntu)
Changed in libdrm (Ubuntu):
status: New → Triaged
Changed in libdrm (Ubuntu):
status: New → Confirmed
In , Bryce Harrington (bryce) wrote :

Created attachment 60277
BootDmesg.txt

In , Bryce Harrington (bryce) wrote :

Created attachment 60278
CurrentDmesg.txt

In , Bryce Harrington (bryce) wrote :

Created attachment 60279
Lspci.txt

In , Bryce Harrington (bryce) wrote :

Created attachment 60280
ProcModules.txt

In , Bryce Harrington (bryce) wrote :

Created attachment 60281
ProcModules.txt

In , Bryce Harrington (bryce) wrote :

Created attachment 60282
ThreadStacktrace.txt

Steve Langasek (vorlon)
Changed in plymouth (Ubuntu):
status: Triaged → Invalid
In , Bryce Harrington (bryce) wrote :

This is another bug that we think is the same root cause:

  https://bugs.launchpad.net/ubuntu/+source/libdrm/+bug/982889

In this one, X comes up before the drm device is ready, and so trips on a different chunk of code.

Comparing timestamps in Xorg.0.log and dmesg shows when drm is accessed versus when it reports itself ready.
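
A minimal sketch of that comparison, assuming both logs prefix each line with a bracketed timestamp in seconds (for example "[    12.345]") taken from the same clock. The function name parse_log_timestamp() is hypothetical, and no sample values here are taken from the attached logs.

```c
#include <assert.h>
#include <stdio.h>

/* Parse a leading "[ seconds ]" prefix; returns 0 on success, -1 otherwise. */
int parse_log_timestamp(const char *line, double *seconds)
{
        char close_bracket;

        assert(line != NULL && seconds != NULL);

        /* Expect '[', a number, then the next non-blank character. */
        if (sscanf(line, " [ %lf %c", seconds, &close_bracket) != 2)
                return -1;

        /* The character after the number must close the bracket. */
        return close_bracket == ']' ? 0 : -1;
}
```

A caller could parse the line where X first opens the drm device and the line where the kernel reports the driver ready, then check which value is smaller.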

We've got a couple of ideas on how to fix this in the distro. One is to wrap the code paths where the failures occur in a loop that keeps retrying for some number of seconds, but that feels like a big hack. The other is to have an event that indicates the driver is ready for use, which we could listen for to delay plymouth, X, etc. until it is received. But we don't know how feasible that is.

In , Bryce Harrington (bryce) wrote :

We suspect this happens because of an Ubuntu kernel patch that was added to work around other crashes during boot:

http://kernel.ubuntu.com/git?p=ubuntu/ubuntu-precise.git;a=commitdiff;h=6d74feca6235b463ade4ecddd1dfdb73d30a2ff7;hp=e29a4668d7441aa88d8015da51674a7e8159312b

"When a drm driver is initialised we first allocate and initialise the
drm minor numbers including creating the sysfs files, then we trigger
the driver load method. The act of creating the sysfs files triggers the
uevent. This means udev may start programs which open /dev/dri/card0 and
other interfaces, this can occur before the load method has even started
and thus before the driver has fully initialised its data structures.
In the case of plymouthd this leads to it opening and closing (in disgust)
the interface, which in turn leads to a kernel panic as the mutexes are
yet to be initialised.

"This patch delays the linking up of the drm devices minor numbers until
the driver is fully initialised. As it is possible for consumers of
these interfaces to reach them before they are fully initialised we
arrange for opens of these devices to return EAGAIN until the device is
fully initialised."
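
Given the EAGAIN behaviour described in the patch text, the retry idea raised earlier in this thread could look roughly like the sketch below. It is only an illustration, not code from plymouth, libdrm, or the kernel patch: open_retry_eagain() and its parameters are hypothetical, and it assumes a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <time.h>

/* Hypothetical helper: retry open() while it fails with EAGAIN, up to max_tries. */
int open_retry_eagain(const char *path, int flags, int max_tries, long delay_ms)
{
        struct timespec delay = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };

        assert(path != NULL && max_tries > 0 && delay_ms >= 0);

        for (int attempt = 0; attempt < max_tries; attempt++) {
                int fd = open(path, flags);

                /* Success, or an error that retrying will not fix. */
                if (fd >= 0 || errno != EAGAIN)
                        return fd;

                /* Device not ready yet: wait before the next attempt. */
                if (attempt + 1 < max_tries)
                        nanosleep(&delay, NULL);
        }

        /* Every attempt failed with EAGAIN. */
        errno = EAGAIN;
        return -1;
}
```

For example, open_retry_eagain("/dev/dri/card0", O_RDWR, 50, 100) would keep trying for about five seconds before giving up. The bound matters: without it, a device that never finishes initialising would stall boot indefinitely.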

Changed in xserver-xorg-video-intel:
importance: Unknown → High
status: Unknown → Confirmed
In , Bryce Harrington (bryce) wrote :

<jbarnes> so for 48894 I'd open a separate bug against drm for the core issue: if you access the device too early you get a crash
<jbarnes> there's a similar bug with accessing the dpms status files in sysfs
<jbarnes> if the module is unloading at the time, you can panic the kernel
<jbarnes> also a kernel bug

I'll move this bug to drm, as I think the core issue is what we're really looking for advice on here.

In , Bryce Harrington (bryce) wrote :

<jbarnes> ok looks like a core drm kernel bug
<jbarnes> we don't lock properly around initialization

Bryce Harrington (bryce) wrote :

Florin, are you able to reproduce this issue at all?

If so, would you mind testing this kernel that apw thinks might help, and then letting us know if it can't be reproduced after that?

http://people.canonical.com/~apw/lp982889-precise/

Changed in libdrm (Ubuntu):
status: Triaged → Incomplete
In , Gitlab-migration (gitlab-migration) wrote :

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further in the new bug via this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/drm/issues/8.

Changed in xserver-xorg-video-intel:
status: Confirmed → Unknown