Radeon VDPAU clients crash at vlVdpDecoderCreate with 1080p videos

Bug #1316689 reported by Marco Trevisan (Treviño)
This bug affects 1 person
Affects          Status        Importance  Assigned to  Milestone
Linux            Fix Released  Medium
linux (Ubuntu)   Fix Released  Medium      Unassigned
  Trusty         Fix Released  Undecided   Unassigned

Bug Description

I've enabled VDPAU support in flash player, and testing it with normal youtube videos (up to 720p) gives good results.

However, once I switch to 1080p quality, the flash player crashes. The crash surfaces at the mesa vdpau driver level as a Bus Error, because the CPU can't properly access the GPU VRAM.

The same happens loading a 1080p video using VLC or MPlayer (see vlc backtrace http://paste.ubuntu.com/7405255/).

Here's the problem as explained by Christian König:
> It's not the VDPAU driver that's failing here, it's the kernel.
>
> When the kernel can't place a buffer into visible VRAM the buffer
> should be moved into GART instead for CPU access. But instead we
> just return a SIGBUS to the application effectively crashing it.

I'm attaching the patch that fixes this problem, provided by Christian and tested successfully on ubuntu Trusty kernel.

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: linux-image-3.13.0-24-generic 3.13.0-24.47
ProcVersionSignature: Ubuntu 3.13.0-24.47-generic 3.13.9
Uname: Linux 3.13.0-24-generic x86_64
ApportVersion: 2.14.1-0ubuntu3
Architecture: amd64
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: marco 2988 F.... pulseaudio
 /dev/snd/pcmC1D3p: marco 2988 F...m pulseaudio
 /dev/snd/controlC0: marco 2988 F.... pulseaudio
CurrentDesktop: Unity
Date: Tue May 6 18:27:23 2014
HibernationDevice: RESUME=UUID=534ddf57-b0f4-4367-ae56-d2878b2f614f
InstallationDate: Installed on 2010-07-10 (1396 days ago)
InstallationMedia: Ubuntu 10.04 LTS "Lucid Lynx" - Release amd64 (20100429)
MachineType: Acer Aspire 4820TG
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.13.0-24-generic root=UUID=4811a166-63ca-4702-8e63-0c357cc2e2f7 ro quiet splash radeon.audio=1 radeon.dpm=1 crashkernel=384M-2G:64M,2G-:128M crashkernel=384M-2G:64M,2G-:128M crashkernel=384M-:128M crashkernel=384M-:128M crashkernel=384M-:128M crashkernel=384M-:128M crashkernel=384M-:128M vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.13.0-24-generic N/A
 linux-backports-modules-3.13.0-24-generic N/A
 linux-firmware 1.127
SourcePackage: linux
UpgradeStatus: Upgraded to trusty on 2012-10-10 (573 days ago)
dmi.bios.date: 03/16/2011
dmi.bios.vendor: INSYDE
dmi.bios.version: V1.25
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: JM41_CP
dmi.board.vendor: Acer
dmi.board.version: Base Board Version
dmi.chassis.type: 10
dmi.chassis.vendor: Chassis Manufacturer
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnINSYDE:bvrV1.25:bd03/16/2011:svnAcer:pnAspire4820TG:pvrV1.25:rvnAcer:rnJM41_CP:rvrBaseBoardVersion:cvnChassisManufacturer:ct10:cvrChassisVersion:
dmi.product.name: Aspire 4820TG
dmi.product.version: V1.25
dmi.sys.vendor: Acer

In , Marco Trevisan (Treviño) (3v1n0) wrote :

Created attachment 98475
Flash player crash using VDPAU with full-hd videos

I've enabled VDPAU support in flash player, and testing it with normal youtube videos (up to 720p) gives good results (congrats!).

However, once I switch to 1080p quality, the flash player crashes, and this seems to happen at the mesa vdpau driver level (for reference, this works using Intel, although that's a different story, but I can't test nouveau here).

Anyway, I'm attaching here the backtrace. I've got it using mesa stock from ubuntu, thus I've linked on the bt a paste of the exact files referenced by the debugger.
Let me know if you need further debugging.

This is what happens in Ubuntu 14.04, but the same seems to happen in Arch Linux (check https://bbs.archlinux.org/viewtopic.php?pid=1409666).

In , Marco Trevisan (Treviño) (3v1n0) wrote :

Created attachment 98480
Flash player crash using VDPAU with full-hd videos

In , Deathsimple (deathsimple) wrote :

What version of mesa is that? From the backtrace it looks like allocating a buffer fails and then we crash because we want to clear the buffer.

But the line numbers look quite outdated to me.

In , Marco Trevisan (Treviño) (3v1n0) wrote :

Created attachment 98482
VLC crash using VDPAU with full-hd videos

This is actually more generic than I thought, as it happens also with VLC and Mplayer.

Attaching the VLC backtrace here (very similar to flash and to mplayer).

In , Marco Trevisan (Treviño) (3v1n0) wrote :

(In reply to comment #2)
> What version of mesa is that? From the backtrace it looks like allocating a
> buffer fails and then we crash because we want to clear the buffer.
>
> But the line numbers look quite outdated to me.

As I said, I used the ubuntu debug version, so in the backtrace (the last one attached, not the one I deprecated) I've linked the relevant files:

src/gallium/state_trackers/vdpau/decode.c: http://paste.ubuntu.com/7398157/
src/gallium/drivers/radeon/radeon_uvd.c: http://paste.ubuntu.com/7398148/

I might test this with a more recent version as well, but since this seems to happen in Arch as well, I suppose it's in git version too.

In , Deathsimple (deathsimple) wrote :

(In reply to comment #4)
> (In reply to comment #2)
> > What version of mesa is that? From the backtrace it looks like allocating a
> > buffer fails and then we crash because we want to clear the buffer.
> >
> > But the line numbers look quite outdated to me.
>
> As said I've used the ubuntu debug version, so in the backtrace (the last
> one attached, not the one I deprecated) I've linked the relative files:
>
> src/gallium/state_trackers/vdpau/decode.c: http://paste.ubuntu.com/7398157/
> src/gallium/drivers/radeon/radeon_uvd.c: http://paste.ubuntu.com/7398148/
>
> I might test this with a more recent version as well, but since this seems
> to happen in Arch as well, I suppose it's in git version too.

Source files don't help at all; I need a version number or, better, the git hash of the revision this is.

It looks like allocating the decoded picture buffer works, but mapping it returns a NULL pointer. Is there any other error message? Kernel or stderr?

In , Marco Trevisan (Treviño) (3v1n0) wrote :

(In reply to comment #5)
> Source files don't help at all; I need a version number or, better, the git
> hash of the revision this is.

So, as I reported, the ubuntu package is based on mesa 10.1, and the same is on debian git, so the git hash seems to be 4a86465 (and the line numbers reported on these backports match).

The full tree, with ubuntu patches applied, is available at http://bazaar.launchpad.net/~ubuntu-branches/ubuntu/trusty/mesa/trusty/files (or bzr branch lp:ubuntu/mesa).

> It looks like allocating the decoded picture buffer works, but mapping it
> returns a NULL pointer. Is there any other error message? Kernel or stderr?

Mh, no I'm just getting this (on VLC):

VLC media player 2.1.2 Rincewind (revision 2.1.2-0-ga4c4876)
[0x15adaa8] main interface error: no suitable interface module
[0x13a9148] main libvlc: Running vlc with the default interface. Use 'cvlc' to use vlc without interface.
libva info: VA-API version 0.35.0
libva info: va_getDriverName() returns 0
libva info: User requested driver 'vdpau'
libva info: Trying to open /usr/lib/x86_64-linux-gnu/dri/vdpau_drv_video.so
libva info: Found init function __vaDriverInit_0_32
libva info: va_openDriver() returns 0
[0x7f4358c28e98] avcodec decoder: Using VA API version 0.35 for hardware decoding.
Bus Error (core dumped)

And pretty similar on mplayer (http://pastebin.ubuntu.com/7398928/), nothing more kernel side.

In , Deathsimple (deathsimple) wrote :

(In reply to comment #6)
> [0x7f4358c28e98] avcodec decoder: Using VA API version 0.35 for hardware
> decoding.
> Bus Error (core dumped)

Oh! Well that's interesting. It's not a segmentation fault at all.

It means that the CPU can't properly access VRAM. Please provide full dmesg output.
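For anyone triaging a similar report, the two signals can be told apart from a small wrapper around the player process. This is only a minimal sketch, assuming the client is started as a child process on a POSIX system; `describe_exit` is a hypothetical helper name, not part of VLC, MPlayer, or mesa.

```python
import signal


def describe_exit(returncode: int) -> str:
    """Describe how a child process ended, given subprocess's returncode.

    On POSIX, subprocess reports death-by-signal as a negative return code.
    """
    if returncode >= 0:
        return f"exited with status {returncode}"
    sig = signal.Signals(-returncode)
    if sig == signal.SIGBUS:
        # Typically: the address is mapped, but the memory behind it
        # could not be accessed
        return "SIGBUS (bus error)"
    if sig == signal.SIGSEGV:
        # Typically: the address is not validly mapped at all
        return "SIGSEGV (segmentation fault)"
    return sig.name
```

The value to pass in would be the `returncode` attribute of a finished `subprocess.run(...)` call.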

In , Marco Trevisan (Treviño) (3v1n0) wrote :

Created attachment 98499
dmesg

(In reply to comment #7)
> (In reply to comment #6)
> > [0x7f4358c28e98] avcodec decoder: Using VA API version 0.35 for hardware
> > decoding.
> > Bus Error (core dumped)
>
> Oh! Well that's interesting. It's not a segmentation fault at all.
>
> It means that the CPU can't properly access VRAM. Please provide full dmesg
> output.

Here you are...
Nothing seems directly related to this, though.

In , Deathsimple (deathsimple) wrote :

That's the problem:

[ 1.714537] [drm] Detected VRAM RAM=1024M, BAR=128M

Your PCI BAR is smaller than usual; for a REDWOOD, 256M is normal. Going to hack up a patch for this.
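To check another machine for the same condition, the dmesg line quoted above can be parsed to see how much VRAM lies outside the CPU-visible BAR. A minimal sketch, assuming the message keeps exactly the format shown in this report; the function names are made up for illustration.

```python
import re

# Matches the radeon line quoted above, e.g.
# "[ 1.714537] [drm] Detected VRAM RAM=1024M, BAR=128M"
VRAM_LINE = re.compile(r"Detected VRAM RAM=(\d+)M, BAR=(\d+)M")


def parse_vram_line(line: str):
    """Return (total_vram_mib, bar_mib), or None if the line does not match."""
    match = VRAM_LINE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def cpu_invisible_mib(total_mib: int, bar_mib: int) -> int:
    """VRAM that the CPU cannot reach through the PCI BAR, in MiB."""
    return max(total_mib - bar_mib, 0)


line = "[ 1.714537] [drm] Detected VRAM RAM=1024M, BAR=128M"
total, bar = parse_vram_line(line)
print(total, bar, cpu_invisible_mib(total, bar))  # prints: 1024 128 896
```

On the reporter's machine this leaves 896M of the 1024M of VRAM unreachable by direct CPU access.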

In , Marco Trevisan (Treviño) (3v1n0) wrote :

(In reply to comment #9)
> That's the problem:
>
> [ 1.714537] [drm] Detected VRAM RAM=1024M, BAR=128M
>
> Your PCI BAR is smaller than usual; for a REDWOOD, 256M is normal. Going
> to hack up a patch for this.

Nice to hear!
But in these cases where there's not enough memory, shouldn't the vdpau driver fail by returning an error (making the client fall back to software rendering), instead of crashing?

In , Deathsimple (deathsimple) wrote :

(In reply to comment #10)
> (In reply to comment #9)
> > That's the problem:
> >
> > [ 1.714537] [drm] Detected VRAM RAM=1024M, BAR=128M
> >
> > Your PCI BAR is smaller than usual; for a REDWOOD, 256M is normal. Going
> > to hack up a patch for this.
>
> Nice to hear!
> But in these cases where there's not enough memory, shouldn't the vdpau
> driver fail by returning an error (making the client fall back to software
> rendering), instead of crashing?

It's not the VDPAU driver that's failing here, it's the kernel.

When the kernel can't place a buffer into visible VRAM the buffer should be moved into GART instead for CPU access. But instead we just return a SIGBUS to the application effectively crashing it.
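The behaviour described here can be pictured with a toy model. This is not the kernel code and not the attached patch; it only illustrates the decision being discussed, and every name and size in it is an assumption made up for the example.

```python
from enum import Enum


class Domain(Enum):
    VISIBLE_VRAM = "visible VRAM"  # reachable by the CPU through the PCI BAR
    GART = "GART"                  # system memory mapped for the GPU


class BusError(Exception):
    """Stands in for the SIGBUS delivered to the application."""


def place_for_cpu_access(size_mib, free_visible_mib, free_gart_mib,
                         fallback_to_gart=True):
    """Toy model: choose where a buffer lives when the CPU touches it."""
    if size_mib <= free_visible_mib:
        return Domain.VISIBLE_VRAM
    if fallback_to_gart and size_mib <= free_gart_mib:
        # Intended behaviour: move the buffer instead of failing the access
        return Domain.GART
    # Behaviour reported in this bug when no fallback happens
    raise BusError(f"cannot map {size_mib} MiB for CPU access")
```

With `fallback_to_gart=False` the model raises `BusError` as soon as visible VRAM runs out, which mirrors the crash seen with 1080p streams; with the fallback enabled the same request lands in GART.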

In , Deathsimple (deathsimple) wrote :

Created attachment 98501
Possible fix.

Please try the attached kernel patch.

Only tested with 3.15-rc2, but should apply to 3.14 as well.

In , Marco Trevisan (Treviño) (3v1n0) wrote :

(In reply to comment #12)
> Created attachment 98501
> Possible fix.
>
> Please try the attached kernel patch.
>
> Only tested with 3.15-rc2, but should apply to 3.14 as well.

Thanks, it applies to 3.13 as well, but still same issue :(

Exactly same crash (http://paste.ubuntu.com/7405255/) and dmesg (http://paste.ubuntu.com/7405259/)

In , Deathsimple (deathsimple) wrote :

(In reply to comment #13)
> (In reply to comment #12)
> > Created attachment 98501
> > Possible fix.
> >
> > Please try the attached kernel patch.
> >
> > Only tested with 3.15-rc2, but should apply to 3.14 as well.
>
> Thanks, it applies to 3.13 as well, but still same issue :(
>
> Exactly same crash (http://paste.ubuntu.com/7405255/) and dmesg
> (http://paste.ubuntu.com/7405259/)

Mhm, your dmesg looks like you are still booting the standard Ubuntu kernel:
[ 0.000000] Linux version 3.13.0-24-generic (buildd@batsu) (gcc version 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014 (Ubuntu 3.13.0-24.47-generic 3.13.9)

Grub usually selects the kernel with the highest version number for boot. Please double check that you selected the self compiled version in grub while booting.

In , Marco Trevisan (Treviño) (3v1n0) wrote :

(In reply to comment #14)
> Mhm, your dmesg looks like you are still booting the standard Ubuntu kernel:
> [ 0.000000] Linux version 3.13.0-24-generic (buildd@batsu) (gcc version
> 4.8.2 (Ubuntu 4.8.2-19ubuntu1) ) #47-Ubuntu SMP Fri May 2 23:30:00 UTC 2014
> (Ubuntu 3.13.0-24.47-generic 3.13.9)
>
> Grub usually selects the kernel with the highest version number for boot.
> Please double check that you selected the self compiled version in grub
> while booting.

Sorry, that wasn't the problem since I patched the ubuntu kernel itself, but I forgot to update the initramfs, so it was still loading the old module -_-.

Anyway, once I fixed it this works like a charm!
Thanks a lot (and sorry again for the trouble)!
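The initramfs mistake is easy to repeat when testing a module patch, so a quick staleness check may help. A minimal sketch: it only compares file modification times, and the two paths in the usage note are assumptions about a typical Ubuntu layout rather than something taken from this report.

```python
import os


def initramfs_is_stale(module_path: str, initramfs_path: str) -> bool:
    """True if the module file was modified after the initramfs image."""
    return os.path.getmtime(module_path) > os.path.getmtime(initramfs_path)
```

A hypothetical call would be `initramfs_is_stale("/lib/modules/3.13.0-24-generic/kernel/drivers/gpu/drm/radeon/radeon.ko", "/boot/initrd.img-3.13.0-24-generic")`. If it returns True, regenerating the image (on Ubuntu, `update-initramfs -u`) before rebooting should make the patched module the one that gets loaded.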

In , Deathsimple (deathsimple) wrote :

Thanks for the help.

Marco Trevisan (Treviño) (3v1n0) wrote :
Changed in linux (Ubuntu):
status: New → Confirmed
Changed in linux:
importance: Unknown → Medium
status: Unknown → Fix Released
tags: added: patch
Changed in linux (Ubuntu):
importance: Undecided → Medium
Joseph Salisbury (jsalisbury) wrote :

I don't see the attached patch in Linus' tree as of yet. Do you plan on submitting the patch for inclusion in the mainline tree? If so, it would be best to also cc stable, so it lands in all of the stable trees, including upstream 3.13, which Trusty is based on.

Changed in linux (Ubuntu):
status: Confirmed → Triaged
tags: added: kernel-da-key
In , Marco Trevisan (Treviño) (3v1n0) wrote :

(In reply to comment #16)
> Thanks for the help.

Thank you!

Once the patch gets into a drm branch, could you also please forward this to stable? It would be very nice to get this included by distros as a stable update.

penalvch (penalvch)
tags: added: latest-bios-1.25
madbiologist (me-again) wrote :

This patch is in the 3.15 kernel. Ubuntu 14.10 "Utopic Unicorn" is based on the 3.16 kernel.

Changed in linux (Ubuntu):
status: Triaged → Fix Released
madbiologist (me-again) wrote :

This change is also in the 3.14.7 kernel.

Tim Gardner (timg-tpi) wrote :

git describe --contains 7daea9b011b50e53feee4156be2c636ee3dbbd2a
Ubuntu-3.13.0-31.55~91
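The `~91` suffix means the commit sits 91 generations before the tag `Ubuntu-3.13.0-31.55`, so that tag contains it. To check whether a given Trusty kernel package is at least that version, something like the following could be used. It is a rough sketch: the helper names are invented, and the comparison is only meaningful within the same 3.13.0 Ubuntu series.

```python
import re


def parse_ubuntu_kernel_version(text: str):
    """Extract (major, minor, patch, abi, upload) from strings such as
    'Ubuntu-3.13.0-31.55~91' or '3.13.0-24.47'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)-(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no kernel version found in {text!r}")
    return tuple(int(part) for part in match.groups())


def has_fix(running: str, first_fixed: str) -> bool:
    """True if the running version is at or past the first fixed one."""
    return (parse_ubuntu_kernel_version(running)
            >= parse_ubuntu_kernel_version(first_fixed))


# The reporter's kernel from the description predates the fix
print(has_fix("3.13.0-24.47", "Ubuntu-3.13.0-31.55~91"))  # prints: False
```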

Changed in linux (Ubuntu Trusty):
status: New → Fix Released