[i965q] Fujitsu Siemens Esprimo E: changing resolution results in non working X

Bug #586325 reported by Torsten Spindler on 2010-05-27
This bug affects 2 people
Affects Status Importance Assigned to Milestone
xf86-video-intel
Fix Released
Medium
linux (Ubuntu)
Medium
Unassigned
Lucid
Undecided
Unassigned

Bug Description

SRU Justification

Impact: when changing resolution, X hangs because of a GPU hang

Fix Description: handles the case where the cursor is off the active area, ensuring the cursor ends up valid; an invalid cursor can lead to a GPU hang

Patch: TBA

Risks: non-affected systems may be impacted by the change which is in generic i915 code.

TEST CASE: Place the cursor in the bottom right and reduce resolution

===

NOTE: the machine is not necessarily hung: in some cases the user reports being able to see the logout dialog after pressing the power button (albeit all corrupted), and there are also reports of ssh'ing into the machine.

===

Binary package hint: xserver-xorg-video-intel

When changing the resolution on a Fujitsu Siemens Esprimo with a Fujitsu Siemens 24" display, the X server hangs and shows colored graphic blocks.

Steps to reproduce:
1) Open gnome-display-properties
2) Select 1680x1050 or 1024x768 (instead of 1920x1200)
3) The screen may turn black, start to flicker, or do one after the other

From the system itself only a hard power off is possible. It may be accessible via ssh but it cannot be tested because of a restricted testing environment.

Created an attachment (id=30728)
Output of dmesg

Created an attachment (id=30729)
Output of intel_gpu_dump (before resolution change)

Created an attachment (id=30730)
Output of intel_gpu_dump (after resolution change)

Created an attachment (id=30731)
Xorg.0.log

Created an attachment (id=30732)
Screenshot

Hi Christian,

Thanks for the bug report. I'm curious if the output of "xrandr --verbose" is
different between when things work and when things don't. Could you look at
that and perhaps attach the output if different.

I'll assign this bug to <email address hidden> who has some experience in this
area.

Thanks,

-Carl

(In reply to comment #6)
Hi Carl,

> Hi Christian,
>
> Thanks for the bug report. I'm curious if the output of "xrandr --verbose" is
> different between when things work and when things don't. Could you look at
> that and perhaps attach the output if different.

does xrandr work over ssh? How can I get the information when the graphics output is disturbed?

(In reply to comment #6)
> I'm curious if the output of "xrandr --verbose" is
> different between when things work and when things don't. Could you look at
> that and perhaps attach the output if different.

The output is nearly identical for both cases. Only the two lines with "Timestamp:" have different values.

For me it looks like the chance that resolution switching does not work is a little bit higher when I use krandrtray instead of xrandr.

(In reply to comment #6)
> I'll assign this bug to <email address hidden> who has some experience in this
> area.

Hi Carl,

is there any progress with this issue? Can I expect that this bug will be fixed in the near future? My housemate told me that he has similar problems on his laptop so I assume that more people are suffering from this error.

regards
Christian

Hi, Christian
     Sorry for the late response.
     Will you please try the latest Linux kernel (for example, 2.6.32.2) and see whether the issue still exists? (Please add the boot option "drm.debug=0x06" and boot the system with KMS enabled.)
     Will you please add the "modedebug" option to xorg.conf and attach the output of Xorg.0.log?
     >Option "modedebug" "True"

     It will be great if you can attach the output of intel_reg_dumper when the issue happens.

Thanks.

(In reply to comment #11)
> ping Chris...
>

Sorry, I was on holiday for the previous days...

Unfortunately it's difficult for me to test with a particular kernel/Xorg driver because I made the investigations with a Live-CD of my Linux distro. My real working environment is a little bit outdated and I think it might be difficult to update all components.

My housemate uses Gentoo, so for him it should be much easier to run the tests on his laptop. Unfortunately he's also on holiday until 8 January...

Do you know whether there's another Live-CD available which uses the versions of Kernel/xorg-driver which I shall test for you? I could also install the most recent Kernel on my machine but I think it would be difficult to update the X-Server and other required components...

regards
Christian

Created an attachment (id=32515)
Output of dmesg (linux-2.6.32.2)

Created an attachment (id=32516)
Xorg.0.log (linux-2.6.32.2)

Created an attachment (id=32517)
Output of intel_reg_dumper (linux-2.6.32.2)

(In reply to comment #10)
> Will you please try the latest linux kernel(for example:2.6.32.2) and see
> whether the issue still exists?(Please add the boot option of "drm.debug=0x06"
> and boot the system with KMS enabled).
> Will you please add the "modedebug" option in xorg.conf and attach the
> output of Xorg.0.log?
> >Option "modedebug" "True"
>
> It will be great if you can attach the output of intel_reg_dumper when the
> issue happens.

In the meantime I've installed openSUSE 11.2 on hard disk. I hope I can respond faster now if you need further input.

Even after upgrading to 2.6.32.2 ("SUSE Kernel Of The Day"), the problem persists (the probability that it happens on mode switching may even be higher!). The result is the same on the second PC (laptop with Gentoo).

regards
Christian

Hi,

any news...?

regards
Christian

(In reply to comment #17)
> Hi,
> any news...?
> regards
> Christian

Sorry for the late response.

Can you try the following patch set on 2.6.33-rc5 kernel and see whether
the issue still exists?
    >http://lists.freedesktop.org/archives/intel-gfx/2010-January/005505.html

BTW: the patch 1 can be skipped as it is already shipped in 2.6.33-rc5 kernel.

Thanks.
   Yakui

(In reply to comment #18)
> Can you try the following patch set on 2.6.33-rc5 kernel and see whether
> the issue still exists?
> >http://lists.freedesktop.org/archives/intel-gfx/2010-January/005505.html

Dear Yakui,

kernel compilation stressed my hard drive a little bit. Maybe I'll have to replace it...

I hope I can continue testing the next days and can give you the result end of this week.

For now I've tested with "vanilla" 2.6.33-rc5+your patches. With i915.modeset=1 the monitor goes to standby mode in the middle of the boot process (maybe when X starts).

regards
Christian

(In reply to comment #16)
> (In reply to comment #10)
> Also after upgrading to 2.6.32.2 ("SUSE Kernel Of The Day"), the problem still
> persists (perhaps the possibility that the problem happens on mode switching
> may be even higher!). The same result is for the second PC (Laptop with
> Gentoo).
>

Do you mean you have the same problem on another PC (laptop with Gentoo)? What is its HW configuration? Is it the same as the first PC? Thanks.

> regards
> Christian
>

(In reply to comment #20)
Dear Michael,
>
> Do you mean you have same problem another PC (laptop with Gentoo)? What's it HW
> configuration ? Is it the same as the first PC? thanks.

1st PC: Intel mainboard with i965G chipset
2nd PC: Laptop with GM965 (X3100) chipset

All test outputs are generated on the first system, but the symptoms are the same on the second system.

I'll try to test with kernel 2.6.33-RC5 the next days after replacing my defective hard drive.

regards
Christian

(In reply to comment #18)
> Can you try the following patch set on 2.6.33-rc5 kernel and see whether
> the issue still exists?
> >http://lists.freedesktop.org/archives/intel-gfx/2010-January/005505.html
>
After serious hard drive problems I've reinstalled everything and tested 3 configurations (on System "1"):

a) 2.6.31 (openSUSE, w/o your patches)
b) 2.6.33-rc5 (vanilla, w/o your patches)
c) 2.6.33-rc5 (vanilla, w/ your patches)

Results:
a) Problems when the resolution is changed with krandrtray (original problem, looks like "loss of sync" on CRTs)
b) Similar to a), but usually the monitor enters standby mode instead of showing the "loss of sync" pattern
c) The monitor enters standby mode during booting. It seems that this happens when the i915 module is loaded (not when starting X as previously guessed). So I could not test what happens when resolution is changed with krandrtray.

So something has changed between 2.6.31 and 2.6.33, but this doesn't solve the problem. And unfortunately your patches also didn't help.

regards
Christian

I'll be on vacation between Feb. 12 and Feb. 28

d) 2.6.33-rc7 (vanilla, w/o your patches)

Results:
d) Same as b)

(In reply to comment #23)
2.6.33 (final) also doesn't work.

Is there any chance that this bug will be fixed in the near future?

regards
Christian

If it helps in any way: I have the same problem with

System environment:
 -- chipset: G965
 -- system architecture: x86_64
 -- xf86-video-intel: git snapshot with last commit 8ece6cf5afa1bb0d8d9328696422f42f3c3adbd6 from Sat Mar 6 14:09:12 2010 -0500
 -- xserver: Server 1.7.5
 -- mesa: 7.7
 -- libdrm: 2.4.17
 -- kernel: 2.6.31-gentoo-r6
 -- Linux distribution: Gentoo
 -- Machine or mobo model: Intel DG965SS
 -- Display connector: VGA/DE-15

I have *two* working resolutions with a monitor "Captiva E1902W" (cheap flatscreen): 1440x900@59.9 and 1280x1024@75.0. I can switch between these two modes and everything works fine. For any other tested mode (1024x768, 800x600, 640x480), I encounter the same out-of-sync problem as described above. From that state, the only way to recover I have found is a reboot. Switching back to a working mode does not help: the screen stays corrupted. Switching to a VT makes things even worse (black screen, sometimes a freeze when switching back to X afterwards).

Please let me know if you need any further information from me.

I have tested this again with a recent git snapshot of xf86-video-intel, but with a different monitor (acer AL1917). As far as I remember, this problem existed with both monitors, the Captiva E1902W and the acer AL1917.

However, with the acer monitor, I just made about 20 mode switches and I did not experience any problems. Maybe this has been fixed during the last weeks?

I will check with the Captiva monitor as soon as I get my hands on it again. :)

Changes that happened to my system configuration:

-- xf86-video-intel: git snapshot from one day ago (last commit
362a49e71fc41541b6dc121660d98e29da4b14e8)
-- xserver: Server 1.7.6
-- mesa: git snapshot from one day ago (last commit
9eaadfeaa54d15fc3eb90d4137795ace4f920b2f)
-- libdrm: 2.4.19
-- kernel: 2.6.31-gentoo-r10

(In reply to comment #26)
> Changes that happened to my system configuration:
>
> -- xf86-video-intel: git snapshot from one day ago (last commit
> 362a49e71fc41541b6dc121660d98e29da4b14e8)
> -- xserver: Server 1.7.6
> -- mesa: git snapshot from one day ago (last commit
> 9eaadfeaa54d15fc3eb90d4137795ace4f920b2f)
> -- libdrm: 2.4.19
> -- kernel: 2.6.31-gentoo-r10
>
I've updated my openSUSE 11.2 to the following package versions:
-- xf86-video-intel: git snapshot from 2010-03-10
-- xserver: Server 1.8.0 RC2
-- mesa: 7.7.99
-- libdrm: 2.4.19
-- kernel: 2.6.33

Result: No changes. The problem is still present as before. Btw: Does it make any sense to update components other than the kernel?

> However, with the acer monitor, I just made about 20 mode switches and I did
> not experience any problems. Maybe this has been fixed during the last weeks?

This doesn't surprise me. Sometimes everything seems to work, but I only need to reboot once and the problem is present again.

regards
Christian

> >
> I've updated my openSUSE 11.2 to the following package versions:
> -- xf86-video-intel: git snapshot from 2010-03-10
> -- xserver: Server 1.8.0 RC2
> -- mesa: 7.7.99
> -- libdrm: 2.4.19
> -- kernel: 2.6.33
>
> Result: No changes. The problem is still present as before. Btw: Does it make
> any sense to update components other than the kernel?

Sorry for the late response.
Can you add the boot option of "drm.debug=0x04" and attach the output of dmesg, xrandr -q --verbose?

Do you have an opportunity to try another monitor and see whether the issue still can be reproduced?

Thanks.

Created an attachment (id=34533)
dmesg (for comment #28)

Created an attachment (id=34534)
xrandr -q --verbose (for comment #28)

(In reply to comment #28)
> Sorry for the late response.
> Can you add the boot option of "drm.debug=0x04" and attach the output of dmesg,
> xrandr -q --verbose?
Done

> Do you have an opportunity to try another monitor and see whether the issue
> still can be reproduced?

Same problem with Monitor A, B and A+B.

A: HP LP2065, 20.1" LCD, DVI-D (SDVO card), 9.2 kg
B: Viewsonic E790, 19" CRT, VGA, 22.5 kg (~30 kg when moving from/to the basement)

regards
Christian

Created an attachment (id=34537)
try the debug patch that updates the self-refresh watermark on 965 platform

From the log in comment #29 it seems that the SR watermark is 1. It is incorrect.

Will you please try the attached debug patch on 2.6.33 kernel and see whether the issue still exists?

thanks.

Created an attachment (id=34538)
try the debug patch that updates the self-refresh watermark on 965 platform

Sorry for the typo.

Please try the updated patch.

Created an attachment (id=34575)
dmesg (for comment #33)

Applied your patch to 2.6.33. The result is (at least nearly) the same: after switching resolution from 1600x1200 to 1024x768, the monitor went to standby mode. After a few seconds I pressed ESC and krandrtray switched back to the previous settings.

Then I tried another mode change which produced the "out of sync" pattern. Some other guy has posted a screenshot which shows the result [1]. Reverting to the previous mode didn't work anymore.

After that I tried to switch to the console, which filled the screen completely with a single color (also the same behavior as before).

[1]
http://lists.freedesktop.org/archives/intel-gfx/2009-September/004362.html

(In reply to comment #27)
> This doesn't surprise me. Sometimes everything seems to work, but I only need
> to reboot once and the problem is present again.

You are right. A reboot broke things again. I made a lot of reboots now and I have not seen any pattern in when things break and when not.

I compared the dmesg output with drm.debug=0x04 from a working case and from a broken case: I did not see any differences besides minor changes in detected cpu frequency, BogoMIPS and the order of some lines about hard disks and network interfaces.

However, I guess I am missing something in my kernel configuration, because I see far less drm output than in the log file in comment #29.

$ grep drm dmesg
Command line: root=/dev/sdb1 drm.debug=0x04
Kernel command line: root=/dev/sdb1 drm.debug=0x04
[drm] Initialized drm 1.1.0 20060810
[drm] DAC-6: set mode 1280x1024 17
[drm] fb0: inteldrmfb frame buffer device
[drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0

(In reply to comment #35)
> You are right. A reboot broke things again. I made a lot of reboots now and I
> have not seen any pattern in when things break and when not.

Did you use krandrtray or xrandr? Please try "xrandr -q --verbose" and then switch through all available modes

# xrandr --output DVI1 --mode 0x45
# xrandr --output DVI1 --mode 0x44
# xrandr --output DVI1 --mode 0x46
...

In my case, for instance mode 0x49 (1024x768@85Hz) doesn't work. But this mode is chosen by xrandr when no explicit refresh rate is given. With xrandr I could successfully revert to the previous (working) mode by blindly using the bash history.

Assumptions:
- Maybe krandrtray (which I used for nearly all previous tests) also chooses 85Hz for 1024x768 regardless of the refresh rate actually selected by the user.

- Maybe krandrtray has a bug in the "restore to the previous mode" function.

Could you please check this?

regards
Christian

Hi, Christian
     From the dmesg log it seems that the VESA fb driver is also loaded.
    > vesafb: framebuffer at 0x80000000, mapped to 0xffffc90008980000, using 7500k, total 7616k
     > vesafb: mode is 1600x1200x16, linelength=3200, pages=1

     And after the i915 driver is loaded, the following message is logged:
     >fb: conflicting fb hw usage inteldrmfb vs VESA VGA - removing generic driver

     Can you disable the vesa fb driver in kernel configuration and see whether the issue still exists?

Thanks.


Created an attachment (id=34795)
dmesg (for comment #38)

> Can you disable the vesa fb driver in kernel configuration and see whether the issue still exists?

Without the vesa fb driver there is no big difference, only the "vga=" kernel parameter doesn't work anymore. The display switches from text mode to the higher resolution later, when the i915 module is loaded.

Yesterday I had again the situation where I could NOT recover from a "bad mode" (0x49) to a working one. After running "xrandr --output DVI1 --mode 0x49" the monitor went to standby mode. Then I tried to revert to mode 0x43, which resulted in the "out of sync" pattern (see attached dmesg).

Intermediary result:
At least some modes which are shown by "xrandr -q --verbose" are "bad". Switching to these modes puts the monitor in standby mode. Sometimes it's possible to switch back to a working mode by blindly using the shell history. Other times this causes the "out of sync" pattern.

For the moment I'm not sure whether the problem can only happen with specific "bad modes" or also with other ones.

regards
Christian

(In reply to comment #38)
> Created an attachment (id=34795) [details]
> dmesg (for comment #38)
>
> > Can you disable the vesa fb driver in kernel configuration and see whether the issue still exists?
>

Thanks for the testing.

> Without the vesa fb driver there is no big difference, only the "vga=" kernel
> parameter doesn't work anymore. The display switches from text mode to higher
> resolution later when the i915 module is loaded.

It seems that the issue still exists after removing the vesa fb driver.

>
> Yesterday I had again the situation where I could NOT recover from a "bad mode"
> (0x49) to a working one. After running "xrandr --output DVI1 --mode 0x49" the
> monitor went to standby mode. Then I tried to revert to mode 0x43, which
> resulted in the "out of sync" pattern (see attached dmesg).

The message of "out of sync" is related to SDVO DVI. I am not sure whether the issue is related to SDVO.

Can you connect this monitor using the VGA connector and see whether the issue can also be reproduced?

Thanks
   Yakui

Created an attachment (id=34811)
try the debug patch that dumps the output pixel clock range of sdvo device

From the dmesg log we can get one message related to SDVO.
    > [drm:intel_sdvo_debug_write], SDVOB: W: 16 48 3F 40 30 62 B0 32 40 (SDVO_CMD_SET_OUTPUT_TIMINGS_PART1)
    > [drm:intel_sdvo_debug_response], SDVOB: R: (Not supported)

    It seems that this SDVO device can't support the command of setting the output timing of SDVO device. I am not sure whether the high resolution is supported by this SDVO device.

    Will you please try the debug patch and attach the output of dmesg?

Thanks.
   Yakui

Bryce Harrington (bryce) on 2010-05-27
tags: added: needs-xorglog
tags: added: needs-lspci-vvnn
Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Incomplete
Bryce Harrington (bryce) on 2010-05-27
tags: removed: needs-xorglog
tags: removed: needs-lspci-vvnn
Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → Confirmed
summary: - changing resolution results in non working X
+ Fujitsu Siemens Esprimo E: changing resolution results in non working X
Bryce Harrington (bryce) on 2010-05-27
tags: added: hardy
Stenten (stenten) on 2010-05-29
tags: added: 965q i386 lucid
removed: hardy
Stenten (stenten) on 2010-05-29
summary: - Fujitsu Siemens Esprimo E: changing resolution results in non working X
+ [965q] Fujitsu Siemens Esprimo E: changing resolution results in non
+ working X
Changed in xserver-xorg-video-intel (Ubuntu):
status: Confirmed → Incomplete
Bryce Harrington (bryce) on 2010-05-30
tags: added: resolution
Bryce Harrington (bryce) on 2010-05-30
tags: added: hardy
Stenten (stenten) on 2010-06-02
summary: - [965q] Fujitsu Siemens Esprimo E: changing resolution results in non
+ [i965q] Fujitsu Siemens Esprimo E: changing resolution results in non
working X
Changed in xserver-xorg-video-intel (Ubuntu):
importance: Undecided → Low
status: Incomplete → Confirmed
Changed in xserver-xorg-video-intel (Ubuntu):
status: Confirmed → Incomplete
Changed in xserver-xorg-video-intel (Ubuntu):
status: Incomplete → Confirmed
Torsten Spindler (tspindler) wrote :

Attached are two dmesg outputs, given with drm debug set to 4. One is where the xrandr -s 1024x768 --rate 60 worked, the other one where it failed.

Torsten Spindler (tspindler) wrote :

Created an attachment (id=36780)
Break the mouse cursor, fix resolution changing

I suspect that the resolution changes are only a part of this problem. Now that I've got some hardware to test on I noticed that moving the mouse changes the behaviour of the broken screen, so I suspected that the hardware cursor might be involved.

Investigating this, I came up with the attached patch which simply causes all calls to set the cursor to set a null cursor. This fixes resolution changing for me - I can cycle through the xrandr mode list to my heart's content.

Of course, this patch also ensures that you can never see a mouse cursor, so it's hardly a fix.

Wow, never would have suspected that. The next step would seem to be to test disabling the cursor around modesetting.

Chris Halse Rogers (raof) wrote :

I've now got a Q965 system to test locally, and my isn't this bug fun!

I've come to suspect that this might be a hardware-cursor issue. For me, when this bug gets triggered, moving the mouse up and down changes the piece of the framebuffer that's displayed, and moving the mouse left and right changes how fast it flickers - to the point where moving the mouse to a certain x-position results in a stable (albeit vertically shifted) display.

I'll test whether fiddling with this cursor code makes a difference.

(In reply to comment #53)
> Created an attachment (id=36780) [details]
> Break the mouse cursor, fix resolution changing
>
Thank you for giving the hint for the real source of the problem:

http://fstatic1.mtb-news.de/img/photos/3/3/8/8/0/_/large/ScheeseamAbgrund.JPG

The bike is the cursor, the cliff is your screen. Now reduce the size of the cliff about 10-50% without moving the bike...

How can we move the bike BEFORE reducing the size of the cliff?

regards
Christian

Yup, just disabling the cursor around modesetting works.

(In reply to comment #56)
> Yup, just disabling the cursor around modesetting works.

Does this also work when the cursor is in an area which will be "outside" the new resolution? Where will the cursor be positioned when it's re-enabled after switching?

Instead of disabling the cursor it may be sufficient to ensure that the cursor will not be outside the new resolution. I "parked" my cursor on the upper left of the screen and cycled resolution about fifty times without any problems.
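
The "park the cursor first" idea above can be sketched roughly as follows. This is only an illustration under stated assumptions: `clamp_coord`, `park_cursor` and `struct cursor_pos` are hypothetical names and do not come from the X server or the i915 driver.

```c
#include <assert.h>

/* Hypothetical helper (not driver code): pull one coordinate back inside
 * a display that is `size` pixels long along that axis. */
static int clamp_coord(int pos, int size)
{
    assert(size > 0);
    if (pos < 0)
        return 0;
    if (pos >= size)
        return size - 1;
    return pos;
}

/* Illustrative pointer position in screen pixels. */
struct cursor_pos {
    int x;
    int y;
};

/* Return a position that is guaranteed to lie inside the new mode,
 * so the pointer can be moved there before the mode switch. */
struct cursor_pos park_cursor(struct cursor_pos cur, int new_w, int new_h)
{
    struct cursor_pos parked;

    parked.x = clamp_coord(cur.x, new_w);
    parked.y = clamp_coord(cur.y, new_h);
    return parked;
}
```

With the pointer at the bottom right of a 1920x1200 screen, parking it for a 1024x768 mode would move it to (1023, 767); a pointer already inside the new area is left where it is.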

I'll be on vacation from Thursday until Sunday, 18th. If you would provide a patch, I'll test either today or in the week after my vacation.

regards
Christian

(In reply to comment #57)
> (In reply to comment #56)
> > Yup, just disabling the cursor around modesetting works.
>
> Does this also work when the cursor is in an area which will be "outside" the
> new resolution? Where will the cursor be positioned when it's re-enabled after
> switching?

It does indeed work - until you move the pointer.

>
> Instead of disabling the cursor it may be sufficient to ensure that the cursor
> will not be outside the new resolution. I "parked" my cursor on the upper left
> of the screen and cycled resolution about fifty times without any problems.
>

I've also just noticed this. So, the problem manifests when the pointer is outside the framebuffer and gets touched. Is this as simple as the cursor scribbling on memory it's not meant to?

Created an attachment (id=36827)
Unset cursor if out of bounds.

Following on from Christopher's hint is this patch that should disable the cursor on a mode change if it results in an invalid cursor position.

Chris Halse Rogers (raof) wrote :

Fiddling with the cursor code does indeed make a difference. Hiding the cursor before changing resolution and showing it afterwards fixes this for me.

Once I've got a driver that does this properly (currently it unconditionally re-shows the cursor after mode switch) I'll attach it here for testing.
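
A minimal sketch of "hide around the mode switch, then restore the previous state rather than unconditionally showing the cursor". The structures and the function name are made up for illustration, and the mode switch itself is reduced to a plain assignment; a real driver would do considerably more here.

```c
#include <stdbool.h>

/* Toy model only: these types are illustrative, not driver structures. */
struct mode {
    int w;
    int h;
};

struct display {
    bool cursor_visible;
    struct mode mode;
};

/* Hide the cursor, switch modes, then put the cursor back into the state
 * it had before, instead of always re-showing it. */
void set_mode_with_cursor_hidden(struct display *d, struct mode new_mode)
{
    bool was_visible = d->cursor_visible; /* remember the old state */

    d->cursor_visible = false;            /* hide before the switch */
    d->mode = new_mode;                   /* placeholder for the real modeset */
    d->cursor_visible = was_visible;      /* restore, do not force-show */
}
```

A cursor that was deliberately hidden before the switch therefore stays hidden afterwards.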

The changes should be small and self-contained, making this appropriate for an SRU.

Created an attachment (id=36828)
Unset cursor if out of bounds.

I scanned through the docs I have on hand and the only caveat for cursor positioning is that the VGA popup cursor must entirely be within the bounds of the pipe. Since we don't use that cursor...

(In reply to comment #61)
> I scanned through the docs I have on hand and the only caveat for cursor
> positioning is that the VGA popup cursor must entirely be within the bounds of
> the pipe. Since we don't use that cursor...

It's in the X/Y sign bit register documentation for CURAPOS and CURBPOS - “For normal high resolution display modes, the cursor must have at least a single pixel positioned over the active screen.” (p143, p148 of the hardware registers docs).
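
Read literally, the quoted requirement is a rectangle-overlap test. The sketch below assumes that (x, y) is the top-left corner of the cursor image and may be negative, and that the active screen starts at (0, 0); the function name and parameters are hypothetical, not taken from the driver.

```c
#include <stdbool.h>

/* True if at least one pixel of a cur_w x cur_h cursor image whose
 * top-left corner is at (x, y) lies over a scr_w x scr_h active area. */
bool cursor_touches_screen(int x, int y, int cur_w, int cur_h,
                           int scr_w, int scr_h)
{
    /* The cursor covers [x, x + cur_w); the screen covers [0, scr_w). */
    bool overlaps_x = x < scr_w && x + cur_w > 0;
    /* Same reasoning for the vertical axis. */
    bool overlaps_y = y < scr_h && y + cur_h > 0;

    return overlaps_x && overlaps_y;
}
```

For a 64x64 cursor on a 1024x768 screen, x = 1023 still leaves one column on screen, while x = 1024 or x = -64 leaves none.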

(In reply to comment #60)
> Created an attachment (id=36828) [details]
> Unset cursor if out of bounds.

Ding! This works. Thanks for fixing this while I slept :)

Tested-by: Christopher Halse Rogers <email address hidden>

(In reply to comment #60)
> Created an attachment (id=36828) [details]
> Unset cursor if out of bounds.

Thank you!!!

This patch works great for xrandr but not for krandrtray. With krandrtray the problem is still present when the mouse cursor is outside the new resolution.

When I move the control bar (and systray) to the upper left of the screen, everything is fine. When the krandrtray icon is on the bottom right, the graphics crash.

Perhaps there's a bug in krandrtray, because even if the graphics doesn't crash, the window sizes are not adjusted to the new resolution (in contrast to xrandr). But even in this case that should not provoke a crash of the graphics.

I'll be on vacation for the next 10 days, so I cannot provide any test feedback during this time. If the problem with krandrtray is too complicated, I suggest applying the current patch immediately and treating the krandrtray problem as a different bug.

regards
Christian

Chris Halse Rogers (raof) wrote :

Moving this to the kernel. There's a kernel patch available https://bugs.freedesktop.org/attachment.cgi?id=36828 which I've tested fixes this, and corresponds with what the hardware docs say.

I'll get in contact with the kernel team to get this moving on their end.

affects: xserver-xorg-video-intel (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
importance: Low → Medium
Chris Halse Rogers (raof) wrote :

The upstream patch needs a tiny bit of futzing to apply to the 2.6.33 drm we've got in Lucid's 2.6.32-24 kernel. Attached is a git commit applying it to our kernel tree.

tags: added: kernel-graphics kernel-needs-review

Christian, I've seen other reports where krandrtray behaves differently than xrandr, so I'm inclined to believe that therein lie a few other bugs.

I'll send this off to Eric with the tested-by, thanks!

The patch is still slightly broken:

In intel_crtc_update_cursor you have
...
if (crtc->fb) {
 base = intel_crtc->cursor_addr;
 if (x > crtc->fb->width)
  base = 0;
...
with x a signed int and crtc->fb->width an unsigned integer, and similarly for y and height. This makes the cursor disappear near the far left and top of the screen (as (unsigned)-1 > 1440) - there's no guarantee the hot point is the top-left of the image, and indeed for me it's not.
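
The signedness pitfall can be shown in isolation. In the sketch below the function names are hypothetical, and the "safe" variant is just one possible way to avoid the conversion, not necessarily what the revised patch does. The explicit cast in the first function spells out the conversion that C applies implicitly when an int is compared with an unsigned int.

```c
#include <stdbool.h>

/* Same shape as the check quoted above: x is signed, width is unsigned.
 * The usual arithmetic conversions turn x into an unsigned value first,
 * so a small negative x becomes a very large number. */
bool past_right_edge_buggy(int x, unsigned int width)
{
    return (unsigned int)x > width;
}

/* One signedness-safe variant (an assumption for illustration): negative
 * positions are handled before any conversion takes place. */
bool past_right_edge_safe(int x, unsigned int width)
{
    if (x < 0)
        return false;
    return (unsigned int)x > width;
}
```

With x = -1 and a width of 1440, the first function reports the cursor as out of bounds, which matches the disappearing cursor near the left edge; the second does not.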

Also, reading the docs it seems that it's required that CUR*BASE be written to update any of the cursor regs (vol 3, p142). I presume that I'm reading it incorrectly, since the cursor appears to update fine without doing that.

I forgot to check whether fb->width was signed or not. Hmm, should check with sparse more often I guess. Thanks for spotting that.

CURAPOS:
"This register specifies the physical memory address at which the cursor image data is located. Writes to this register acts like a trigger that enables atomic updates of the cursor registers. When updating the cursor registers, this register should be written last in the sequence. This register should be written even if the actual contents did not change to allow the holding registers to move to the active registers on the next VBLANK."

Hmm, but if the cursor is enabled first with an invalid position what happens? Yes, the update to base and the pos updated is buffered until the next vblank, but that smells like a race and elsewhere the docs say that CUR*BASE should be written last to trigger the updates...

(In reply to comment #66)
> CURAPOS:

Idiot left in charge of keyboard, again. This is CURABASE:

> "This register specifies the physical memory address at which the cursor image
> data is located. Writes to this register acts like a trigger that enables
> atomic updates of the cursor registers. When updating the cursor registers,
> this register should be written last in the sequence. This register should be
> written even if the actual contents did not change to allow the holding
> registers to move to the active registers on the next VBLANK."

CURAPOS:

"This register can be loaded atomically (requires that the base address be
written) and is double buffered."
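
To make the quoted wording concrete, here is a toy software model of "holding" and "active" registers in which a write to the base register arms the update and the values move across on the next vblank. It only models the documentation text as quoted; all names are invented, and real hardware may well behave differently (as noted above, the cursor appears to update fine without the base write).

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative register set; names are not the real register names. */
struct cursor_regs {
    uint32_t cntr;
    uint32_t pos;
    uint32_t base;
};

struct cursor_hw {
    struct cursor_regs holding; /* what software has written */
    struct cursor_regs active;  /* what the display engine uses */
    bool armed;                 /* set by a write to base */
};

void cursor_hw_reset(struct cursor_hw *hw)
{
    hw->holding.cntr = hw->holding.pos = hw->holding.base = 0;
    hw->active = hw->holding;
    hw->armed = false;
}

void write_cntr(struct cursor_hw *hw, uint32_t v) { hw->holding.cntr = v; }
void write_pos(struct cursor_hw *hw, uint32_t v)  { hw->holding.pos = v; }

/* Writing base last acts as the trigger, even if the value is unchanged. */
void write_base(struct cursor_hw *hw, uint32_t v)
{
    hw->holding.base = v;
    hw->armed = true;
}

/* On vblank, armed holding registers become active all at once. */
void vblank(struct cursor_hw *hw)
{
    if (hw->armed) {
        hw->active = hw->holding;
        hw->armed = false;
    }
}
```

In this model a position write on its own never reaches the active registers; it only takes effect at the first vblank after a base write.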

Andy Whitcroft (apw) on 2010-07-08
tags: added: kernel-candidate kernel-reviewed
removed: kernel-needs-review
Chris Halse Rogers (raof) wrote :

As noted on the upstream bug this patch does not quite completely resolve this. I've only managed to reproduce this once, but it seems that there's probably an off-by-one error in the bounds checking.

I'm tracking this down now.

Andy Whitcroft (apw) on 2010-07-08
description: updated
tags: added: patch
Chris Halse Rogers (raof) wrote :

Ok. There was a problem with a signed/unsigned comparison in the initial patch. I'm testing the revised patch now.

Chris Halse Rogers (raof) wrote :

The revised patch also has problems - the cursor will sometimes be replaced with a corrupted pixmap when it's unhidden. I think I've identified where it behaves differently to the existing code, and I haven't yet found any problems with the new patch.

Chris Halse Rogers (raof) wrote :

I've added my Reviewed-by and Tested-by to the revised patch on the intel-gfx mailing list here: http://lists.freedesktop.org/archives/intel-gfx/2010-July/007379.html . This should now go into the mainline kernel and the stable trees, and we could pick it up as an SRU.

tags: removed: kernel-candidate
Changed in linux (Ubuntu):
status: Confirmed → Triaged

I think this one is fixed now.

Chris Halse Rogers (raof) wrote :

And is now in linux-stable¹, in 2.6.35.1, so this is fixed in Maverick.

Can we get this patch also applied to Lucid?

1: git commit is http://git.kernel.org/?p=linux/kernel/git/hpa/linux-2.6-allstable.git;a=commit;h=5078304217e1e87bc7ffe8d7a4076e4cb0c0a318

Changed in linux (Ubuntu):
status: Triaged → Fix Released
Andy Whitcroft (apw) on 2010-08-18
tags: added: upstream-stable-patch
Andy Whitcroft (apw) wrote :

The patch suggested above has hit v2.6.35.1 stable.

Changed in linux (Ubuntu Lucid):
status: New → Triaged
Changed in xserver-xorg-video-intel:
importance: Unknown → Medium
status: Unknown → Fix Released
Torsten Spindler (tspindler) wrote :

Can this bug fix be applied to Lucid per SRU? The customer would like to use the standard kernel again.

Andy Whitcroft (apw) wrote :

@Torsten -- the patch has now come back down from stable to v2.6.35.x. I have backported it to the v2.6.33 drm that we have in Lucid, but that did involve some manual application. Could you test the kernels at the URL below and confirm whether they work OK for you:

    http://people.canonical.com/~apw/lp586325-lucid/

Please test and report back here. Thanks!

Torsten Spindler (tspindler) wrote :

I have the kernel up and running on a Fujitsu-Siemens Esprimo E and a Lenovo Thinkpad T61. On both systems I can change the screen resolution without any problems. I also tested suspend/resume and it continues to work fine. I've asked the customer to test the kernel as well.

I also tested the given kernel on a Fujitsu-Siemens Esprimo E and it looks good. I was able to change the resolution without any problem.

Andy Whitcroft (apw) on 2010-10-14
description: updated

Accepted linux into lucid-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in linux (Ubuntu Lucid):
status: Triaged → Fix Committed
tags: added: verification-needed
Torsten Spindler (tspindler) wrote :

Running the kernel since last week on my laptop (nvidia graphics though) without detecting any regressions.

Tested the kernel on a Fujitsu-Siemens Esprimo E today for 1.5 hours, running a loop over xrandr:
while true
do
   xrandr -s 1024x768
   sleep 30
   xrandr -s 1920x1200
   sleep 30
done

During the loop I move the mouse from time to time and from edge to edge. All resolution changes work.

Running the kernel since Friday on my Lenovo x61 without any problems. Looks good.

Thanks for testing. Marking as verification-done.

tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.32-26.47

---------------
linux (2.6.32-26.47) lucid-proposed; urgency=low

  [ Steve Conklin ]

  * Revert "SAUCE: ALSA: HDA: Enable internal mic on Dell E6410 and Dell
    E6510"
  * Revert "[Config] Added be2net, be2scsi to udebs"

  [ Upstream Kernel Changes ]

  * Revert "(ore-stable) ALSA: hda - Apply ALC269 VAIO fix-up to all Sony
    laptops with ALC269"
  * Revert "(pre-stable) ALSA: HDA: Correctly apply position_fix quirks for
    ATI and VIA controllers"
  * Revert "ALSA: hda: Use LPIB for another mainboard"
  * Revert "ALSA: hda: Use LPIB for ASUS M2V"
  * Revert "ALSA: hda: Use LPIB for an ASUS device"
  * Buglink Fixup for reverted unverified fixes

linux (2.6.32-26.46) lucid-proposed; urgency=low

  [ Brad Figg ]

  * SAUCE: ALSA: HDA: Enable internal mic on Dell E6410 and Dell E6510
    - See: #605047, #628961

  [ Tim Gardner ]

  * [Config] Added be2net, be2scsi to udebs
    - See: #628776

  [ Upstream Kernel Changes ]

  * Revert "(pre-stable) drm/i915: add PANEL_UNLOCK_REGS definition"
    - LP: #645444
  * Revert "(pre-stable) drm/i915: make sure we shut off the panel in eDP
    configs"
    - LP: #645444
  * Revert "(pre-stable) drm/i915: make sure eDP panel is turned on"
    - LP: #645444
  * Revert "(pre-stable) drm/radeon/kms: initialize set_surface_reg reg for
    rs600 asic"
    - LP: #645371
  * Revert "drm/nouveau: Fix fbcon corruption with font width not divisible
    by 8"
    - LP: #663176
  * mmc: fix all hangs related to mmc/sd card insert/removal during
    suspend/resume
    - LP: #477106
  * mmc: build fix: mmc_pm_notify is only available with CONFIG_PM=y
    - LP: #477106
  * hwmon: (k8temp) Differentiate between AM2 and ASB1
    - LP: #644694
  * xen: handle events as edge-triggered
    - LP: #644694
  * xen: use percpu interrupts for IPIs and VIRQs
    - LP: #644694
  * ALSA: hda - Rename iMic to Int Mic on Lenovo NB0763
    - LP: #605101, #644694
  * sata_mv: fix broken DSM/TRIM support (v2)
    - LP: #644694
  * x86, tsc, sched: Recompute cyc2ns_offset's during resume from sleep
    states
    - LP: #644694
  * PCI: MSI: Remove unsafe and unnecessary hardware access
    - LP: #644694
  * PCI: MSI: Restore read_msi_msg_desc(); add get_cached_msi_msg_desc()
    - LP: #644694
  * sched: kill migration thread in CPU_POST_DEAD instead of CPU_DEAD
    - LP: #644694
  * sched: revert stable c6fc81a sched: Fix a race between ttwu() and
    migrate_task()
    - LP: #644694
  * staging: hv: Fix missing functions for net_device_ops
    - LP: #644694
  * staging: hv: Fixed bounce kmap problem by using correct index
    - LP: #644694
  * staging: hv: Fixed the value of the 64bit-hole inside ring buffer
    - LP: #644694
  * staging: hv: Increased storvsc ringbuffer and max_io_requests
    - LP: #644694
  * staging: hv: Fixed lockup problem with bounce_buffer scatter list
    - LP: #644694
  * fuse: flush background queue on connection close
    - LP: #644694
  * ath9k_hw: fix parsing of HT40 5 GHz CTLs
    - LP: #644694
  * ocfs2: Fix incorrect checksum validation error
    - LP: #644694
  * USB: ehci-ppc-of: problems in unwind
    - LP: #644694
  * USB: Fix kernel oo...

Changed in linux (Ubuntu Lucid):
status: Fix Committed → Fix Released
Changed in xserver-xorg-video-intel:
importance: Medium → Unknown
Changed in xserver-xorg-video-intel:
importance: Unknown → Medium
tags: added: testcase

Unfortunately it seems that with openSUSE 12.2 (kernel 3.4.11) the problem (or a similar one) is present again. Switching to another mode crashes the graphics system with a probability of about 90 percent!

I've filed a new bug report here:
https://bugs.freedesktop.org/show_bug.cgi?id=59066

Unfortunately there hasn't been much progress in the last few weeks, so I hope someone who has been involved with this bug can help.

Could somebody please check whether there's at least a chance to fix it, or whether it would be wiser to purchase another graphics adapter?

Thank you very much
Christian

I can confirm that the same problem, or something very similar, has returned in Debian Wheezy.

Displaying first 40 and last 40 comments. View all 128 comments or add a comment.