bbswitch prevents suspend and taints kernel

Bug #1318040 reported by Guillaume Millet
This bug affects 3 people
Affects: bbswitch (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Peter Wu
Milestone: (none)

Bug Description

When initiating suspend (to RAM), the kernel gets tainted. This is a regression that appeared after upgrading from Kubuntu 13.10 to 14.04. My computer has a discrete nVidia G210M card (hybrid SLI system, on ASUS UL80VT). Here is an extract from dmesg:

PM: Preparing system for mem sleep
BUG: unable to handle kernel NULL pointer dereference at 00000000000000b8
IP: [<ffffffff8139ba5b>] pci_bus_read_config_dword+0x4b/0x90

---
ApportVersion: 2.14.1-0ubuntu3
Architecture: amd64
CurrentDesktop: KDE
DistroRelease: Ubuntu 14.04
InstallationDate: Installed on 2010-01-19 (1571 days ago)
InstallationMedia: Kubuntu 9.10 "Karmic Koala" - Release amd64 (20091027)
Package: bbswitch-dkms 0.7-2ubuntu1
PackageArchitecture: amd64
ProcVersionSignature: Ubuntu 3.13.0-26.48-generic 3.13.11
Tags: trusty
Uname: Linux 3.13.0-26-generic x86_64
UpgradeStatus: Upgraded to trusty on 2014-05-07 (2 days ago)
UserGroups: adm admin audio bumblebee cdrom dialout fuse lpadmin plugdev sambashare src users vboxusers
_MarkForUpload: True

Guillaume Millet (guimillet) wrote : Dependencies.txt

apport information

affects: xorg (Ubuntu) → bbswitch (Ubuntu)
summary: - Suspend does not work
+ bbswitch prevents suspend and taints kernel
tags: added: apport-collected
description: updated
Guillaume Millet (guimillet) wrote : ProcEnviron.txt

apport information

Guillaume Millet (guimillet) wrote :

dmesg just after trying to suspend

description: updated
Guillaume Millet (guimillet) wrote :

Still appears with linux-image-3.15.0-031500rc5-generic.

For now, I am staying with the 3.11 kernel from Kubuntu 13.10.

Peter Wu (lekensteyn) wrote :

Disabling the graphics card seems to trigger an ACPI hotplug event which seems to remove the device from the system.

Could you prevent bbswitch from loading at boot (nvidia/nouveau should not load either), then report the output of the following steps including a dmesg log:

sudo modprobe bbswitch load_state=-1
lspci -nnvvd10de:
sudo tee /proc/acpi/bbswitch <<<OFF
lspci -nnvvd10de:
sudo tee /proc/acpi/bbswitch <<<ON
lspci -nnvvd10de:
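
As a complement to the lspci output requested above, the following sketch lists PCI devices by vendor ID straight from sysfs, which may help show whether the discrete card disappears from the bus after writing OFF. The function name pci_devices_by_vendor and its optional second argument (an alternative sysfs root, used only to make the function testable) are illustrative assumptions and are not part of bbswitch or of the original request.

```shell
# List PCI addresses whose sysfs "vendor" attribute matches a vendor ID.
# $1: vendor ID in sysfs notation, e.g. 0x10de for NVIDIA
# $2: sysfs root; defaults to /sys (hypothetical parameter, for testing only)
pci_devices_by_vendor() {
    local vendor="$1" root="${2:-/sys}" dev
    for dev in "$root"/bus/pci/devices/*; do
        # Skip entries without a readable vendor file (also covers an
        # unmatched glob, which is left as a literal pattern).
        [ -r "$dev/vendor" ] || continue
        if [ "$(cat "$dev/vendor")" = "$vendor" ]; then
            basename "$dev"
        fi
    done
}
```

Running `pci_devices_by_vendor 0x10de` before and after each write to /proc/acpi/bbswitch gives a quick comparison. If 0000:01:00.0 is listed before OFF but not after, that would be consistent with the hotplug-removal theory, though the lspci and dmesg output remain the information actually asked for.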

Changed in bbswitch (Ubuntu):
assignee: nobody → Peter Wu (lekensteyn)
status: New → Incomplete
Guillaume Millet (guimillet) wrote :

$ sudo modprobe bbswitch load_state=-1
dmesg log:
[ 2780.584599] bbswitch: module verification failed: signature and/or required key missing - tainting kernel
[ 2780.584809] bbswitch: version 0.7
[ 2780.584818] bbswitch: Found integrated VGA device 0000:00:02.0: \_SB_.PCI0.VGA_
[ 2780.584830] bbswitch: Found discrete VGA device 0000:01:00.0: \_SB_.PCI0.P0P1.VGA_
[ 2780.584843] ACPI Warning: \_SB_.PCI0.P0P1.VGA_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20131115/nsarguments-95)
[ 2780.584892] ACPI Warning: \_SB_.PCI0.P0P1.VGA_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20131115/nsarguments-95)
[ 2780.584982] bbswitch: detected a nVidia _DSM function
[ 2780.584999] pci 0000:01:00.0: enabling device (0000 -> 0003)
[ 2780.585046] bbswitch: Succesfully loaded. Discrete card 0000:01:00.0 is on
[ 2790.596471] init: Failed to spawn nvidia-persistenced main process: unable to execute: No such file or directory

$ lspci -nnvvd10de:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GT218M [GeForce G210M] [10de:0a74] (rev a2) (prog-if 00 [VGA controller])
        Subsystem: ASUSTeK Computer Inc. Device [1043:1bc2]
        Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at fd000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Region 3: Memory at fa000000 (64-bit, prefetchable) [size=32M]
        Region 5: I/O ports at dc00 [size=128]
        Expansion ROM at fe980000 [disabled] [size=512K]
        Capabilities: <access denied>

01:00.1 Audio device [0403]: NVIDIA Corporation High Definition Audio Controller [10de:0be3] (rev a1)
        Subsystem: ASUSTeK Computer Inc. Device [1043:1bc2]
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at fe97c000 (32-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: snd_hda_intel

$ sudo tee /proc/acpi/bbswitch <<<OFF
OFF

dmesg log:
[ 3058.083676] bbswitch: disabling discrete graphics
[ 3058.096117] ACPI Warning: \_SB_.PCI0.P0P1.VGA_._DSM: Argument #4 type mismatch - Found [Buffer], ACPI requires [Package] (20131115/nsarguments-95)
[ 3058.200492] vgaarb: device changed decodes: PCI:0000:00:02.0,olddecodes=none,decodes=io+mem:owns=none
[ 3058.312204] ------------[ cut here ]------------
[ 3058.312216] WARNING: CPU: 0 PID: 4 at /build/buildd/linux-3.13.0/fs/sysfs/group.c:214 sysfs_remove_group+0xc6/0xd0()
[ 3058.312219] sysfs group ffffffff81cabae0 not found for kobject 'acpi_video1'
[ 3058.312221] Modules linked in: bbswitch(OF) hid_generic cdc_acm usbhid hid snd_hrtimer ip6table_filter ip6_tables iptable_filter ip_tables...


Guillaume Millet (guimillet) wrote :

What other information is needed?

I have tried the daily kernel 3.17.0-999.201409272205 and the tainting still occurs. The syslog trace for the suspend event is slightly different (dis_dev_get+0x15/0x40 [bbswitch] instead of the pci_bus_read_config_dword+0x64/0x90 in the initial report with kernel 3.13):
[ 3342.022581] init: anacron main process (13582) killed by TERM signal
[ 3342.297888] PM: Syncing filesystems ... done.
[ 3342.489577] PM: Preparing system for mem sleep
[ 3342.489787] general protection fault: 0000 [#1] SMP
[ 3342.489988] Modules linked in: bbswitch(OE) ip6table_filter ip6_tables iptable_filter ip_tables x_tables snd_hrtimer cuse rfcomm bnep bluetooth binfmt_misc dm_crypt snd_hda_codec_hdmi arc4 ath9k ath9k_common ath9k_hw snd_hda_codec_realtek snd_hda_codec_generic ath snd_hda_intel snd_hda_controller snd_hda_codec mac80211 snd_hwdep uvcvideo snd_pcm videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common snd_seq_midi snd_seq_midi_event kvm_intel videodev media snd_rawmidi snd_seq cfg80211 snd_seq_device kvm snd_timer snd parport_pc mxm_wmi ppdev soundcore coretemp joydev lp lpc_ich serio_raw parport mac_hid asus_laptop wmi sparse_keymap input_polldev uas usb_storage psmouse ahci libahci i915 atl1c i2c_algo_bit drm_kms_helper drm video
[ 3342.491087] CPU: 1 PID: 13498 Comm: pm-suspend Tainted: G W OE 3.17.0-999-generic #201409272205
[ 3342.491296] Hardware name: ASUSTeK Computer Inc. UL80VT /UL80VT , BIOS 214 01/17/2011
[ 3342.491505] task: ffff8800b8d1a800 ti: ffff880035860000 task.ti: ffff880035860000
[ 3342.491713] RIP: 0010:[<ffffffffc08ae035>] [<ffffffffc08ae035>] dis_dev_get+0x15/0x40 [bbswitch]
[ 3342.491928] RSP: 0018:ffff880035863d10 EFLAGS: 00010202
[ 3342.492133] RAX: 6e6f697461727564 RBX: 0000000000000003 RCX: 0000000000000002
[ 3342.492335] RDX: 0000000000000000 RSI: 0000000000000003 RDI: ffffffffc08b0000
[ 3342.492535] RBP: ffff880035863d18 R08: ffffffff814e00e0 R09: 0000000000000000
[ 3342.492737] R10: 0000000000000000 R11: 0000000fffffffe0 R12: 0000000000000000
[ 3342.492939] R13: 00000000fffffffa R14: ffffffff81cd3f90 R15: 0000000000000000
[ 3342.493142] FS: 00007f8b8e80d740(0000) GS:ffff88013fd00000(0000) knlGS:0000000000000000
[ 3342.493345] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 3342.493529] CR2: 00007f8b8f73cf10 CR3: 000000009c68b000 CR4: 00000000000407e0
[ 3342.493529] Stack:
[ 3342.493529] ffffffffc08ae545 ffff880035863d58 ffffffff81093f2d ffffffff81ec685c
[ 3342.493529] ffffffff81c521c0 0000000000000003 0000000000000000 00000000ffffffff
[ 3342.493529] 0000000000000000 ffff880035863da8 ffffffff81094188 ffff880035863d48
[ 3342.493529] Call Trace:
[ 3342.493529] [<ffffffffc08ae545>] ? bbswitch_pm_handler+0x55/0x70 [bbswitch]
[ 3342.493529] [<ffffffff81093f2d>] notifier_call_chain+0x4d/0x70
[ 3342.493529] [<ffffffff81094188>] __blocking_notifier_call_chain+0x58/0x80
[ 3342.493529] [<ffffffff810941c6>] blocking_notifier_call_chain+0x16/0x20
[ 3342.493529] [<ffffffff810bd5fa>] pm_notifier_call_chain+0x1a/0x30
[ 3342.493529] [<ffffffff8178aafa>] suspend_prepare+0x46/0xcd
[ 3342.493529] [<ffffffff810be83f>] enter_state+0x16f/0x250
...
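
One observation on the trace above, offered as a hint rather than a diagnosis: the RAX value 6e6f697461727564 does not look like a kernel address. Read as little-endian bytes it is printable ASCII, which the sketch below decodes. The helper name reg_to_ascii is hypothetical, and the decoding assumes every byte is a printable character.

```shell
# Decode a register value from an oops as little-endian ASCII.
# $1: an even number of hex digits, e.g. the 16 digits printed for RAX
reg_to_ascii() {
    hex="$1"
    out=""
    # x86-64 is little-endian, so the last hex pair is the first byte in
    # memory: consume the string two digits at a time from the right.
    while [ "${#hex}" -ge 2 ]; do
        pair="${hex#"${hex%??}"}"   # last two hex digits
        hex="${hex%??}"             # drop them from the remainder
        # Convert the pair to an octal escape and let printf emit the byte.
        out="$out$(printf "\\$(printf '%03o' "0x$pair")")"
    done
    printf '%s\n' "$out"
}

reg_to_ascii 6e6f697461727564   # prints: duration
```

A register holding string data at the point of a general protection fault would be consistent with dis_dev_get following a pointer that no longer refers to a valid device, but the trace alone does not establish that.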


mcsan (octagonhead)
Changed in bbswitch (Ubuntu):
status: Incomplete → New
Peter Wu (lekensteyn) wrote :

This problem may occur when your compiler (gcc) is different from the one used to compile the kernel. Have you installed an older gcc version (maybe via CUDA)? What does `gcc -v` show?
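
The comparison asked for here can be sketched as a small script. It assumes the "gcc version X.Y.Z" wording that appears in /proc/version on the kernels in this report (other kernels may format this differently), and the function names gcc_version_of and check_gcc_match are illustrative only.

```shell
# Print the first "gcc version X.Y.Z" number found in text on stdin.
gcc_version_of() {
    sed -n 's/.*gcc version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Compare the compiler recorded in /proc/version with the default gcc.
# Not called automatically; run check_gcc_match on the affected machine.
check_gcc_match() {
    kernel_gcc=$(gcc_version_of < /proc/version)
    local_gcc=$(gcc -v 2>&1 | gcc_version_of)   # gcc -v writes to stderr
    if [ "$kernel_gcc" = "$local_gcc" ]; then
        echo "match: gcc $local_gcc"
    else
        echo "mismatch: kernel built with gcc $kernel_gcc, installed gcc is $local_gcc"
    fi
}
```

This only checks the default `gcc` on the PATH. If DKMS is configured to build with a different compiler, that one would need to be checked instead.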

mcsan (octagonhead) wrote :

On my UL30VT with Ubuntu 15.04, it shows:
 gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu13)

And cat /proc/version
Linux version 3.19.0-15-generic (buildd@komainu) (gcc version 4.9.2 (Ubuntu 4.9.2-10ubuntu13) ) #15-Ubuntu SMP Thu Apr 16 23:32:37 UTC 2015

Nevertheless, I found and removed some old gcc versions and reinstalled all dkms packages, but I still see the same behaviour as detailed above. I wonder if there are any ul*0vt laptops that are still working out there?
I will try installing bbswitch on a clean install of Ubuntu when I have time.

Peter Wu (lekensteyn) wrote :

Do you have a full dmesg since the first boot? I have seen machines having issues due to hotplugging where the device is gone during suspend.

Guillaume Millet (guimillet) wrote :

Still facing the issue with kernel 4.2.0-23-generic, gcc version 5.2.1 20151010 (Ubuntu 5.2.1-22ubuntu2), bbswitch-dkms 0.7-2ubuntu1, no bumblebee package installed. I attach a full dmesg (from kern.log) from the boot to the freeze. After booting, I logged into tty1 and entered the following commands:
sudo service sddm stop
sudo rmmod nouveau
sudo modprobe bbswitch load_state=0
sudo service sddm start
Ctrl-Alt-F7 and closed the lid
