compiz terminates with "intel_do_flush_locked failed: Device or resource busy"

Bug #1094173 reported by Daniel Gnoutcheff on 2012-12-28
This bug affects 2 people
Affects: Linux
  Status: Fix Released
  Importance: Medium
Affects: linux (Ubuntu)
  Importance: Undecided
  Assigned to: Herton R. Krzesinski
Affects: Quantal
  Importance: Undecided
  Assigned to: Herton R. Krzesinski
Affects: Raring
  Importance: Undecided
  Assigned to: Herton R. Krzesinski

Bug Description

I'm running unity on Intel GM965 graphics. Occasionally, unity/compiz will terminate with

  intel_do_flush_locked failed: Device or resource busy

If I then attempt to restart compiz, it initially starts up successfully, but if I "use" it a little (e.g. if I open a new window or move an existing window around), then within a second or so it terminates again with the same error. Every attempt to restart compiz fails in this way, and it continues to fail even if I restart the xserver. However, once I reboot the system, compiz runs normally again.

I have reported this upstream at https://bugs.freedesktop.org/show_bug.cgi?id=58732, and I've attached various debug logs there.

Chris Wilson believes that commit b4a98e57fc27854b5938fc8b08b68e5e68b91e1f should fix this. I am in the process of verifying that, and since that commit is believed to fix several other bugs in kernels 3.5 and later, it has been submitted for inclusion in 3.7.y stable kernels. I will submit a backport (context update, really) for 3.5.7.y shortly.

ProblemType: Bug
DistroRelease: Ubuntu 12.10
Package: linux-image-3.5.0-21-generic 3.5.0-21.32
ProcVersionSignature: Ubuntu 3.5.0-21.32-generic 3.5.7.1
Uname: Linux 3.5.0-21-generic x86_64
ApportVersion: 2.6.1-0ubuntu9
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: gnoutchd 2653 F.... pulseaudio
Date: Thu Dec 27 20:34:55 2012
HibernationDevice: RESUME=UUID=15733ed1-09e8-4ba7-aa20-c7f0136412a5
MachineType: LENOVO 7733A82
MarkForUpload: True
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcFB: 0 inteldrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.5.0-21-generic root=UUID=d03dbee7-efb8-45ae-bdbc-e06fb7d61b04 ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.5.0-21-generic N/A
 linux-backports-modules-3.5.0-21-generic N/A
 linux-firmware 1.95
RfKill:
 0: phy0: Wireless LAN
  Soft blocked: no
  Hard blocked: no
SourcePackage: linux
UpgradeStatus: Upgraded to quantal on 2012-10-20 (68 days ago)
WpaSupplicantLog:

dmi.bios.date: 04/08/2010
dmi.bios.vendor: LENOVO
dmi.bios.version: 7LETC7WW (2.27 )
dmi.board.name: 7733A82
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7LETC7WW(2.27):bd04/08/2010:svnLENOVO:pn7733A82:pvrThinkPadR61/R61i:rvnLENOVO:rn7733A82:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 7733A82
dmi.product.version: ThinkPad R61/R61i
dmi.sys.vendor: LENOVO


Created attachment 72086
dmesg from 2012-12-24 occurrence

I use Ubuntu's compiz-based Unity desktop. Occasionally, compiz will terminate with

  intel_do_flush_locked failed: Device or resource busy

If I then attempt to restart compiz, it initially starts up successfully, but if I "use" it a little (e.g. if I open a new window or move an existing window around), then within a second or so it terminates again with the same error. Every attempt to restart compiz fails in this way, and it continues to fail even if I restart the xserver. However, once I reboot the system, compiz runs normally again.

I'm attaching dmesg output, Xorg log, compiz stderr+stdout with LIBGL_DEBUG=verbose, and intel_reg_dumper output. dmesg contains interesting-looking stack traces; the rest looks unenlightening.

System environment:
-- chipset: GM965
-- system architecture: x86_64
-- xf86-video-intel: 2.20.9 (Ubuntu package version 2:2.20.9-0ubuntu2)
-- xserver: 1.13.0 (Ubuntu package version 2:1.13.0-0ubuntu6.1)
-- mesa: 9.0 (Ubuntu package version 9.0-0ubuntu1)
-- libdrm: 2.4.39 (Ubuntu package version 2.4.39-0ubuntu)
-- kernel: 3.5.? (Ubuntu package version 3.5.0-21-generic)
-- Linux distribution: Ubuntu 12.10
-- Machine or mobo model: Lenovo ThinkPad R61 7733A82
-- Display connector: internal LVDS

Here's the interesting-looking dmesg stacktrace:

  [210169.687137] ------------[ cut here ]------------
  [210169.687186] WARNING: at /build/buildd/linux-3.5.0/drivers/gpu/drm/i915/i915_gem.c:3047 i915_gem_object_pin+0x15d/0x1b0 [i915]()
  [210169.687189] Hardware name: 7733A82
  [210169.687190] Modules linked in: hid_generic usbhid hid mmc_block usblp snd_seq_dummy snd_hrtimer nls_iso8859_1 cdc_ether cdc_phonet usbnet phonet cdc_acm usb_storage pci_stub vboxpci(O) vboxnetadp(O) vboxnetflt(O) vboxdrv(O) deflate zlib_deflate twofish_generic ctr twofish_x86_64_3way twofish_x86_64 twofish_common camellia_generic camellia_x86_64 serpent_sse2_x86_64 cryptd lrw serpent_generic xts gf128mul blowfish_generic blowfish_x86_64 blowfish_common cast5 nfsd des_generic xcbc nfs rmd160 sha512_generic crypto_null af_key lockd xfrm_algo fscache auth_rpcgss bnep nfs_acl rfcomm parport_pc sunrpc bluetooth ppdev binfmt_misc dm_crypt arc4 snd_hda_codec_analog snd_hda_intel joydev snd_hda_codec coretemp thinkpad_acpi snd_hwdep snd_pcm ath5k r852 kvm_intel snd_seq_midi sm_common snd_rawmidi nand ath snd_seq_midi_event nand_ids snd_seq mtd kvm nand_bch snd_timer tp_smapi(O) bch r592 thinkpad_ec(O) mac80211 snd_seq_device pcmcia firewire_sbp2 memstick nand_ecc psmouse cfg80211 yenta_socket dm_multipath pcmcia_rsrc snd pcmcia_core scsi_dh snd_page_alloc microcode mac_hid soundcore serio_raw lp nvram lpc_ich tpm_tis parport sdhci_pci sdhci firewire_ohci firewire_core crc_itu_t i915 drm_kms_helper drm i2c_algo_bit e1000e video wmi
  [210169.687280] Pid: 2911, comm: compiz Tainted: G O 3.5.0-21-generic #32-Ubuntu
  [210169.687282] Call Trace:
  [210169.687290] [<ffffffff81051c1f>] warn_slowpath_common+0x7f/0xc0
  [210169.687294] [<ffffffff81051c7a>] warn_slowpath_null+0x1a/0x20
  [210169.687308] [<ffffffffa00da93d>] i915_gem_object_pin+0x15d/0x1b0 [i915]
  [210169.687329] [...
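
For triage, the WARNING line in a trace like the one above can be picked apart mechanically. The following is only a sketch: the regular expression is an assumption based on the single 3.5-era line quoted here (other kernel versions format warnings differently), and `parse_warning` is a hypothetical helper name. It also tolerates a hard wrap in the middle of the source path, which pasted dmesg output sometimes has.

```python
import re

# Assumed format, taken from the one line quoted in this report:
#   WARNING: at <path>:<line> <func>+0x<offset>/0x<size> [<module>]()
WARNING_RE = re.compile(
    r"WARNING: at (?P<path>\S+):(?P<line>\d+) "
    r"(?P<func>\w+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)


def parse_warning(text):
    """Return details of the first kernel WARNING found in text, or None."""
    # Re-join hard-wrapped lines: drop a newline (and any indent after it)
    # unless the next line starts with a "[timestamp]" prefix.
    joined = re.sub(r"\n(?!\s*\[)\s*", "", text)
    match = WARNING_RE.search(joined)
    if match is None:
        return None
    info = match.groupdict()
    info["line"] = int(info["line"])
    return info


sample = (
    "[210169.687186] WARNING: at /build/buildd/linux-3.5.0/drivers/gpu/drm/i915/i915\n"
    "  _gem.c:3047 i915_gem_object_pin+0x15d/0x1b0 [i915]()\n"
    "  [210169.687189] Hardware name: 7733A82"
)
info = parse_warning(sample)
print(info["path"], info["line"], info["func"], info["module"])
```

The extracted file, line and function are what one would search for in the upstream tracker when looking for duplicates.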


Created attachment 72087
compiz output from 2012-12-24 occurrence

Created attachment 72088
Xorg log from 2012-12-24 occurrence

Created attachment 72089
intel_reg_dumper output after 2012-12-24 occurrence

Should be fixed in 3.7:

commit b4a98e57fc27854b5938fc8b08b68e5e68b91e1f
Author: Chris Wilson <email address hidden>
Date: Thu Nov 1 09:26:26 2012 +0000

    drm/i915: Flush outstanding unpin tasks before pageflipping

> Should be fixed in 3.7:
>
> commit b4a98e57fc27854b5938fc8b08b68e5e68b91e1f

Thanks, I see that commit is in kernel 3.8-rc1, though I can't find it
in 3.7.1 or in the stable-queue for 3.7.2. Are there any plans to
include that commit in stable kernels?

In any case, I've cherry-picked that commit on top of 3.5.7.2 (from
Ubuntu's extended longterm series), and I'm running that now. I'll keep
running it for a month or so to verify that it fixes this bug.
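
A backport in a stable tree gets a new commit ID, so looking for the upstream SHA among the commits themselves will not find it. One possible way to check, sketched below, is to search commit messages instead. This assumes that the stable tree follows the usual convention of citing the upstream commit ID in the backport's message, and that the tags used in the example range exist in the local clone; `backport_search_cmd` and `find_backports` are hypothetical helper names.

```python
import subprocess

UPSTREAM_SHA = "b4a98e57fc27854b5938fc8b08b68e5e68b91e1f"


def backport_search_cmd(upstream_sha, rev_range):
    """Build a git command listing commits in rev_range that mention upstream_sha."""
    return ["git", "log", "--oneline", "--grep=" + upstream_sha, rev_range]


def find_backports(repo_dir, upstream_sha, rev_range):
    """Run the search in repo_dir and return the matching one-line summaries."""
    result = subprocess.run(
        backport_search_cmd(upstream_sha, rev_range),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.splitlines()


# Example range only; adjust it to the releases being compared.
print(" ".join(backport_search_cmd(UPSTREAM_SHA, "v3.7.1..v3.7.2")))
```

An empty result from `find_backports` only means that no commit message in that range cites the SHA; it does not prove that the change is absent.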

Sigh, that is one that Daniel should have earmarked for stable. Can you please test the patch with 3.7 and if it works send an email to <email address hidden> containing the commit id, a reference to this bugzilla and a tested-by.

> Sigh, that is one that Daniel should have earmarked for stable. Can you please
> test the patch with 3.7 and if it works send an email to <email address hidden>
> containing the commit id, a reference to this bugzilla and a tested-by.

OK, applied on 3.7 and verified that nothing blows up. It'll be a while
longer before I can be sure that it fixes this bug, but since it's
already known to fix other bugs, I've submitted it to stable.

I'll follow up with the Ubuntu kernel team about getting it included in
their extended stable kernel as well.

Daniel Gnoutcheff (gnoutchd) wrote :
Changed in linux:
importance: Unknown → Medium
status: Unknown → Fix Released
Tim Gardner (timg-tpi) on 2013-01-02
Changed in linux (Ubuntu):
status: New → In Progress
assignee: nobody → Herton R. Krzesinski (herton)
Changed in linux (Ubuntu Raring):
status: In Progress → Fix Released
Changed in linux (Ubuntu Quantal):
assignee: nobody → Herton R. Krzesinski (herton)
status: New → In Progress

On Fri, Dec 28, 2012 at 3:31 AM, <email address hidden> wrote:
> --- Comment #8 from Daniel Gnoutcheff <email address hidden> ---
> Reported to Ubuntu kernel team at:
> https://bugs.launchpad.net/linux/+bug/1094173

If the patch works on 3.7 you only need to send the commit sha1 (in
the 3.8-rc upstream) of it with a link to this bugzilla and a
tested-by to <email address hidden>. It will then automatically trickle
down to stable kernels and distro kernel builds.

Running raring with 3.7.0-7 Generic with Intel GM915. Problem still exists, so I don't know if the fix has gone into effect yet.

> If the patch works on 3.7 you only need to send the commit sha1 (in
> the 3.8-rc upstream) of it with a link to this bugzilla and a
> tested-by to <email address hidden>. It will then automatically trickle
> down to stable kernels and distro kernel builds.

Thanks for the advice. I have had an experience, though, with a stable
kernel maintainer who wanted the submitted patches to apply with no
manual effort, not even trivial context changes. The patch in question
wouldn't apply to 3.5.7.y without context changes, so I wanted to be safe.

BTW, is this patch relevant for kernels 3.4 and older? The linked
bugzillas gave me the impression that the bugs addressed by this commit
have only been witnessed in kernels 3.5 and later, and of those, it
appears that only 3.7 is maintained on kernel.org. (3.5.7.y comes from
Ubuntu's "extended longterm kernel" effort).

Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel for Quantal in -proposed solves the problem (3.5.0-24.37). Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-quantal' to 'verification-done-quantal'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-quantal

Alright, I am officially convinced that the aforementioned commit does indeed fix this bug. During the past month and a half, I made sure that I was always running a kernel that included that patch, and I have not seen this bug in that time.

For the record, the patch was released in longterm kernel v3.7.3 as commit
  41c8765e911cf54ad0c71bf3a1642b918af937e8
and in Ubuntu's extended longterm kernel v3.5.7.3 as commit
  13938a31f36fa72027928eddb159327ab5568a46

Thanks!

On 02/08/2013 05:31 AM, Luis Henriques wrote:
> This bug is awaiting verification that the kernel for Quantal in
> -proposed solves the problem (3.5.0-24.37). Please test the kernel and
> update this bug with the results. If the problem is solved, change the
> tag 'verification-needed-quantal' to 'verification-done-quantal'.

OK, I have installed that kernel and I am running it now. Recall,
however, that this bug is not reliably reproducible and can take several
weeks to manifest.

That said, here's what we have now:

- Upstream believes that commit b4a98e57 fixes this bug.
- A backport of that commit was released in v3.7.3.
- After a month and a half, I can vouch for the fact that this bug is
  fixed as of v3.7.3. (I didn't check if it was present in v3.7.2,
  though.)
- A backport of the same commit was also released in v3.5.7.3, and I'm
  of the understanding this should have trickled down into this
  -proposed kernel.

Based on that, I'm around 95% confident that this kernel does indeed fix
this bug. If this isn't good enough, then I would need to run this
kernel for at least a month to empirically verify this.
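
The version reasoning above can be written down as a small check. This is a sketch using only the release numbers mentioned in this thread (v3.7.3, v3.5.7.3, and mainline from 3.8 on); `FIRST_FIXED`, `parse_version` and `has_fix` are hypothetical names, and the function deliberately returns None for any series the thread says nothing definite about. It expects the upstream version string (for example the last field of ProcVersionSignature), not the Ubuntu package version.

```python
# First release in each stable series known from this thread to carry the fix.
FIRST_FIXED = {
    (3, 5): (3, 5, 7, 3),  # Ubuntu's extended longterm series
    (3, 7): (3, 7, 3),
}


def parse_version(version):
    """Turn a plain dotted version such as 'v3.7.3' into a tuple of ints."""
    return tuple(int(part) for part in version.lstrip("v").split("."))


def has_fix(version):
    """Return True/False if the thread settles the question, else None."""
    parsed = parse_version(version)
    if parsed[:2] >= (3, 8):
        return True  # the commit is in mainline as of 3.8-rc1
    first_fixed = FIRST_FIXED.get(parsed[:2])
    if first_fixed is None:
        return None  # series not covered by this thread
    return parsed >= first_fixed


for candidate in ("3.5.7.1", "3.5.7.3", "3.7.2", "3.7.3", "3.4.1"):
    print(candidate, has_fix(candidate))
```

Release-candidate strings such as "3.8-rc1" are not handled by `parse_version`; they would need extra parsing.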

> If verification is not done by one week from today, this fix will be
> dropped from the source code, and this bug will be closed.

I'm not sure what the standards are here -- I'm tempted to just declare
this "verified", but I don't know if that's justified. If it isn't,
then a week will not be enough time. Please advise.

Luis Henriques (henrix) wrote :

Daniel, thanks a lot for your detailed report on this issue. Since this is an upstream backport, and given the facts presented in comment #14, I'm tagging this bug as verified in Quantal.

tags: added: verification-done-quantal
removed: verification-needed-quantal

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.5.0-24.37

---------------
linux (3.5.0-24.37) quantal-proposed; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1117492

  [ Tim Gardner ]

  * [Config] CONFIG_ALX=m for x86 only
    - LP: #927782

linux (3.5.0-24.36) quantal-proposed; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1116501

  [ Adam Lee ]

  * [Config] Enable RTSX_PCI modules
    - LP: #1057089

  [ Andy Whitcroft ]

  * [Config] enable various HVC consoles
    - LP: #1102206

  [ Brad Figg ]

  * Revert "SAUCE: samsung-laptop: disable in UEFI mode"
    - LP: #1111689

  [ Herton Ronaldo Krzesinski ]

  * [Config] updateconfigs for 3.5.7.3 stable update
  * d-i: Add mellanox ethernet drivers to nic-modules
    - LP: #1015339

  [ Kamal Mostafa ]

  * SAUCE: alx driver import script
    - LP: #927782

  [ Qualcomm Atheros, Inc ]

  * SAUCE: alx: Update to heads/master
    - LP: #927782

  [ Seth Forshee ]

  * SAUCE: samsung-laptop: Add quirk for broken acpi_video backlight on
    N250P
    - LP: #1086921

  [ Stefan Bader ]

  * (config) Move 9p modules into generic package
    - LP: #1107658

  [ Tim Gardner ]

  * [debian] Remove dangling symlink from headers package
    - LP: #1112442
  * [config] CONFIG_ALX=m
    - LP: #927782
  * [Config] Add alx to d-i nic-modules
    - LP: #927782

  [ Upstream Kernel Changes ]

  * Revert "8139cp: revert "set ring address before enabling receiver""
    - LP: #1102417
  * Revert "ath9k_hw: Update AR9003 high_power tx gain table"
    - LP: #1102417
  * Revert "drm/i915: no lvds quirk for Zotac ZDBOX SD ID12/ID13"
    - LP: #1102417
  * Revert "ALSA: hda - Shut up pins at power-saving mode with Conexnat
    codecs"
    - LP: #1106966, #886975
  * be2net: don't call vid_config() when there's no vlan config
    - LP: #1083088
  * be2net: cleanup be_vid_config()
    - LP: #1083088
  * be2net: do not modify PCI MaxReadReq size
    - LP: #1083088
  * be2net: fix reporting number of actual rx queues
    - LP: #1083088
  * be2net: do not use SCRATCHPAD register
    - LP: #1083088
  * be2net: Fix driver load for VFs for Lancer
    - LP: #1083088
  * be2net: Explicitly clear the reserved field in the Tx Descriptor
    - LP: #1083088
  * be2net: Regression bug wherein VFs creation broken for multiple cards.
    - LP: #1083088
  * be2net: Fix to trim skb for padded vlan packets to workaround an ASIC
    Bug
    - LP: #1083088
  * be2net: Fix Endian
    - LP: #1083088
  * be2net: Fix error while toggling autoneg of pause parameters
    - LP: #1083088
  * be2net : Fix die temperature stat for Lancer
    - LP: #1083088
  * be2net: Fix initialization sequence for Lancer
    - LP: #1083088
  * be2net: Activate new FW after FW download for Lancer
    - LP: #1083088
  * be2net: Fix cleanup path when EQ creation fails
    - LP: #1083088
  * be2net: Enable RSS UDP hashing for Lancer and Skyhawk
    - LP: #1083088
  * be2net: dont pull too much data in skb linear part
    - LP: #1083088
  * be2net: Fix VF driver load for Lancer
    - LP: #1083088
  * be2net: Ignore physical link async event for Lancer
    - LP: #1083088
  * be2net: Fix to parse RSS hash from Receive compl...

Changed in linux (Ubuntu Quantal):
status: In Progress → Fix Released