apport-gpu-error-intel.py has never succeeded on gpu error on this computer

Bug #776580 reported by steubens
This bug affects 1 person
Affects: xdiagnose (Ubuntu)
Status: Fix Released
Importance: High
Assigned to: Bryce Harrington

Bug Description

Binary package hint: xserver-xorg-video-intel

as far as i know for sure this script hasn't succeeded since maverick

this time in particular it was a page allocation failure:
 [237412.088904] [drm:i915_report_and_clear_eir] *ERROR* EIR stuck: 0x00000010, masking
 [237414.369027] intel_gpu_dump: page allocation failure. order:8, mode:0x40d0
 [237414.369035] Pid: 7250, comm: intel_gpu_dump Not tainted 2.6.38-8-generic #42-Ubuntu
 [237414.369037] Call Trace:
 [237414.369048] [<ffffffff811147c4>] ? __alloc_pages_nodemask+0x604/0x840
 [237414.369053] [<ffffffff81149f45>] ? alloc_pages_current+0xa5/0x110
 [237414.369057] [<ffffffff811107ee>] ? __get_free_pages+0xe/0x50
 [237414.369061] [<ffffffff81154e7f>] ? kmalloc_order_trace+0x3f/0xb0
 [237414.369064] [<ffffffff81155dfa>] ? __kmalloc+0x13a/0x160
 [237414.369068] [<ffffffff81117fa6>] ? put_page+0x36/0x40
 [237414.369072] [<ffffffff8118607d>] ? seq_read+0x1bd/0x3f0
 [237414.369076] [<ffffffff81164f73>] ? vfs_read+0xc3/0x180
 [237414.369079] [<ffffffff81165081>] ? sys_read+0x51/0x90
 [237414.369084] [<ffffffff8100c002>] ? system_call_fastpath+0x16/0x1b
 [237414.369086] Mem-Info:
 [237414.369089] Node 0 DMA per-cpu:
 [237414.369091] CPU 0: hi: 0, btch: 1 usd: 0
 [237414.369094] CPU 1: hi: 0, btch: 1 usd: 0
 [237414.369095] Node 0 DMA32 per-cpu:
 [237414.369098] CPU 0: hi: 186, btch: 31 usd: 0
 [237414.369100] CPU 1: hi: 186, btch: 31 usd: 0
 [237414.369105] active_anon:465972 inactive_anon:136408 isolated_anon:0
 [237414.369106] active_file:30754 inactive_file:32344 isolated_file:0
 [237414.369107] unevictable:8 dirty:341 writeback:0 unstable:0
 [237414.369108] free:31889 slab_reclaimable:9965 slab_unreclaimable:7911
 [237414.369109] mapped:16354 shmem:187401 pagetables:9624 bounce:0
 [237414.369112] Node 0 DMA free:11844kB min:36kB low:44kB high:52kB active_anon:1708kB inactive_anon:1856kB active_file:96kB inactive_file:344kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15684kB mlocked:0kB dirty:0kB writeback:0kB mapped:168kB shmem:12kB slab_reclaimable:16kB slab_unreclaimable:40kB kernel_stack:0kB pagetables:4kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
 [237414.369122] lowmem_reserve[]: 0 2946 2946 2946
 [237414.369126] Node 0 DMA32 free:115712kB min:6924kB low:8652kB high:10384kB active_anon:1862180kB inactive_anon:543776kB active_file:122920kB inactive_file:129032kB unevictable:32kB isolated(anon):0kB isolated(file):0kB present:3017444kB mlocked:32kB dirty:1364kB writeback:0kB mapped:65248kB shmem:749592kB slab_reclaimable:39844kB slab_unreclaimable:31604kB kernel_stack:3240kB pagetables:38492kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
 [237414.369137] lowmem_reserve[]: 0 0 0 0
 [237414.369140] Node 0 DMA: 1*4kB 2*8kB 3*16kB 2*32kB 5*64kB 1*128kB 2*256kB 1*512kB 2*1024kB 2*2048kB 1*4096kB = 11844kB
 [237414.369151] Node 0 DMA32: 3396*4kB 2314*8kB 4530*16kB 280*32kB 2*64kB 0*128kB 2*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 115712kB
 [237414.369161] 273144 total pagecache pages
 [237414.369163] 22621 pages in swap cache
 [237414.369166] Swap cache stats: add 2387571, delete 2364950, find 708789/931626
 [237414.369168] Free swap = 4996032kB
 [237414.369170] Total swap = 6291452kB
 [237414.382559] 769008 pages RAM
 [237414.382562] 13805 pages reserved
 [237414.382564] 148898 pages shared
 [237414.382565] 624613 pages non-shared

and the last few times on natty it was the same, page allocation failure

i'd have to grep old logs on the hd i took out of this machine to find out what it was on mav, let me know if that would be useful

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: xserver-xorg-video-intel 2:2.14.0-4ubuntu7.1
ProcVersionSignature: Ubuntu 2.6.38-8.42-generic 2.6.38.2
Uname: Linux 2.6.38-8-generic x86_64
Architecture: amd64
CompizPlugins: [core,crashhandler,composite,opengl,compiztoolbox,imgjpeg,decor,neg,mousepoll,move,resize,regex,imgpng,grid,text,extrawm,copytex,shift,obs,animation,dbus,put,commands,resizeinfo,place,gnomecompat,imgsvg,workarounds,session,wobbly,notification,cube,rotate,scale,expo,scaleaddon,ezoom,staticswitcher]
CompositorRunning: compiz
DRM.card0.DP.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.DP.2:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.HDMI.A.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
DRM.card0.LVDS.1:
 status: connected
 enabled: enabled
 dpms: On
 modes: 1366x768
 edid-base64: AP///////wBMo1E0AAAAAAASAQOAIhN4Cof1lFdPjCcnUFQAAAABAQEBAQEBAQEBAQEBAQEBEhtWclAADDAwICUAWMIQAAAZAAAADwAAAAAAAAAAAB60AnQAAAAA/gBTQU1TVU5HCiAgICAgAAAA/gAxNTZBVDAxLUgwMQogABs=
DRM.card0.VGA.1:
 status: disconnected
 enabled: disabled
 dpms: Off
 modes:
 edid-base64:
Date: Tue May 3 10:18:40 2011
DistUpgraded: Log time: 2011-03-17 02:39:21.770457
DistroCodename: natty
DistroVariant: ubuntu
GraphicsCard:
 Intel Corporation Mobile 4 Series Chipset Integrated Graphics Controller [8086:2a42] (rev 07) (prog-if 00 [VGA controller])
   Subsystem: Hewlett-Packard Company Device [103c:360b]
   Subsystem: Hewlett-Packard Company Device [103c:360b]
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release amd64 (20101007)
MachineType: Hewlett-Packard Compaq Presario CQ60 Notebook PC
ProcEnviron:
 LANGUAGE=en_US:en
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.38-8-generic root=UUID=d332acb1-e554-4637-84fe-731b5e39212b ro quiet splash vt.handoff=7
Renderer: Unknown
SourcePackage: xserver-xorg-video-intel
UpgradeStatus: Upgraded to natty on 2011-03-17 (47 days ago)
dmi.bios.date: 12/15/2010
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: F.65
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: 3612
dmi.board.vendor: Hewlett-Packard
dmi.board.version: 09.67
dmi.chassis.asset.tag: Chassis Asset Tag
dmi.chassis.type: 10
dmi.chassis.vendor: Hewlett-Packard
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnHewlett-Packard:bvrF.65:bd12/15/2010:svnHewlett-Packard:pnCompaqPresarioCQ60NotebookPC:pvrPCID:rvnHewlett-Packard:rn3612:rvr09.67:cvnHewlett-Packard:ct10:cvrChassisVersion:
dmi.product.name: Compaq Presario CQ60 Notebook PC
dmi.product.version: PCID
dmi.sys.vendor: Hewlett-Packard
version.compiz: compiz 1:0.9.4+bzr20110415-0ubuntu2
version.ia32-libs: ia32-libs 20090808ubuntu13
version.libdrm2: libdrm2 2.4.23-1ubuntu6
version.libgl1-mesa-dri: libgl1-mesa-dri 7.10.2-0ubuntu2
version.libgl1-mesa-dri-experimental: libgl1-mesa-dri-experimental N/A
version.libgl1-mesa-glx: libgl1-mesa-glx 7.10.2-0ubuntu2
version.xserver-xorg: xserver-xorg 1:7.6+4ubuntu3
version.xserver-xorg-video-ati: xserver-xorg-video-ati N/A
version.xserver-xorg-video-intel: xserver-xorg-video-intel 2:2.14.0-4ubuntu7.1
version.xserver-xorg-video-nouveau: xserver-xorg-video-nouveau N/A

Bryce Harrington (bryce) wrote :

Yeah the problem is that the dump tool needs to allocate memory to create the dump info but sometimes the system is in such a state that it can't do it, so the tool obviously fails.
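To make the size of the failed request concrete, here is a minimal sketch that works through the numbers in the log above. It assumes 4 kB pages (the usual x86_64 base page size); the helper names `order_to_kb` and `parse_buddy_line` are made up for illustration and are not part of any tool mentioned in this report.

```python
import re

PAGE_KB = 4  # assumption: 4 kB base pages, as on a typical x86_64 kernel


def order_to_kb(order, page_kb=PAGE_KB):
    """Size in kB of one physically contiguous block of the given order."""
    return (1 << order) * page_kb


def parse_buddy_line(line):
    """Turn a 'Node 0 DMA32: 3396*4kB 2314*8kB ...' line into {block_kb: count}."""
    return {int(kb): int(count) for count, kb in re.findall(r"(\d+)\*(\d+)kB", line)}


# Free-list line for the DMA32 zone, copied from the dmesg output in this report.
dma32_line = (
    "Node 0 DMA32: 3396*4kB 2314*8kB 4530*16kB 280*32kB 2*64kB 0*128kB "
    "2*256kB 1*512kB 1*1024kB 0*2048kB 0*4096kB = 115712kB"
)

blocks = parse_buddy_line(dma32_line)
needed_kb = order_to_kb(8)  # "order:8" from the failure message
total_kb = sum(kb * count for kb, count in blocks.items())
big_enough = sum(count for kb, count in blocks.items() if kb >= needed_kb)

print("order:8 request needs one contiguous block of", needed_kb, "kB")
print("free memory in DMA32:", total_kb, "kB")
print("free blocks of at least that size:", big_enough)
```

The point of the sketch is that plenty of memory was free in total, but nearly all of it sat in small blocks, which is why a single large contiguous request could fail.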

But for GPU lockup bugs with Intel graphics, all you really need to collect is the output of 'dmesg' and your /sys/kernel/debug/dri/0/i915_error_state file. Both of these must be collected while the machine is locked up (e.g. by sshing into the sick machine over ethernet), but that should work even if the dump tool is failing. See https://wiki.ubuntu.com/X/Troubleshooting/Freeze for additional info.
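The manual collection described above could be scripted roughly as follows. This is only a sketch: the function name `collect_freeze_info`, the output file names, and the `dmesg_cmd` parameter are assumptions for illustration, and reading debugfs usually requires root.

```python
import shutil
import subprocess
from pathlib import Path

# Path named in the comment above; card index 0 is taken from that path.
ERROR_STATE = "/sys/kernel/debug/dri/0/i915_error_state"


def collect_freeze_info(out_dir, error_state=ERROR_STATE, dmesg_cmd=("dmesg",)):
    """Save dmesg output and the i915 error state into out_dir.

    Returns the two paths that were written.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Capture the kernel log while the machine is still in the locked-up state.
    dmesg_file = out / "dmesg.txt"
    with open(dmesg_file, "wb") as f:
        subprocess.run(list(dmesg_cmd), stdout=f, check=True)

    # Stream the error state file instead of relying on its reported size,
    # since debugfs files typically report a size of zero.
    state_file = out / "i915_error_state.txt"
    with open(error_state, "rb") as src, open(state_file, "wb") as dst:
        shutil.copyfileobj(src, dst)

    return dmesg_file, state_file
```

It would be run over ssh on the affected machine, for example `collect_freeze_info("/tmp/gpu-freeze")`, and the two resulting files attached to the new bug report.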

If you do that, please file a NEW bug report for the freeze. We'll focus this bug report on the issue that apport-gpu-error-intel.py fails under these conditions. I need to rejigger the script to stop using the dump tool and instead use the error file directly.

affects: xserver-xorg-video-intel (Ubuntu) → xdiagnose (Ubuntu)
Changed in xdiagnose (Ubuntu):
assignee: nobody → Bryce Harrington (bryce)
importance: Undecided → High
milestone: none → oneiric-alpha-1
status: New → Triaged
steubens (steubens) wrote :

i tend to ask the intel or x ubuntu guys directly if i have a problem they already know of (they almost always do), but thanks for the information

Changed in xdiagnose (Ubuntu):
milestone: oneiric-alpha-1 → oneiric-alpha-2
Martin Pitt (pitti)
Changed in xdiagnose (Ubuntu):
milestone: oneiric-alpha-2 → oneiric-alpha-3
Bryce Harrington (bryce) wrote :

xdiagnose 0.3 (or thereabouts) switches the apport gpu hook from using the intel dumper tool (which has this memory allocation bug in certain cases) to copying the kernel gpu file directly. That won't be subject to memory allocation errors, thus this bug should no longer occur.
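As a rough illustration of the "copy the kernel file directly" approach, the sketch below reads the error state into a report mapping without calling any external dump tool. It is not the actual xdiagnose code: the function name `add_gpu_error_state` and the report key are hypothetical, and a plain dict stands in for the apport report object.

```python
# Path taken from the earlier comment in this report.
ERROR_STATE = "/sys/kernel/debug/dri/0/i915_error_state"


def add_gpu_error_state(report, path=ERROR_STATE, key="i915_error_state"):
    """Attach the contents of the kernel's GPU error file to the report.

    A failed read is recorded in the report instead of raising, so the
    rest of the report can still be filed.
    """
    try:
        with open(path, "r", errors="replace") as f:
            report[key] = f.read()
    except OSError as exc:
        report[key] = "Error: could not read %s: %s" % (path, exc)
    return report
```

Recording the error text rather than raising is a design choice of this sketch: whatever goes wrong while reading the file, the report still says what happened.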

Changed in xdiagnose (Ubuntu):
status: Triaged → Fix Released