[gen3] Bad GPU performance whilst CPU is in deep sleep

Bug #1087582 reported by Benjamin Laisure
This bug affects 1 person
Affects                            Status        Importance  Assigned to  Milestone
xf86-video-intel                   Fix Released  Medium
xserver-xorg-video-intel (Ubuntu)  Incomplete    Low         Unassigned

Bug Description

I have an Acer AOD250(KAV60) netbook.
On Ubuntu 12.10 with glxgears I get 50-60 fps, while on Lubuntu I get half that and sometimes worse.
The odd thing about this is that when I start moving my mouse around it straightens up and goes back to roughly 55-60 fps, and once I stop moving the mouse, the performance drops again.
I know the chipset and GPU aren't the best out there, but they shouldn't behave like this (before, I was able to use Celestia with ease, but now I get below 5 fps).

Latest available BIOS (v1.29)
Intel Atom N270 1.6 GHz (1 core, 2 threads)
1024 MB RAM
Intel 945GMA (8 MB VRAM from BIOS)

Tested glxgears from mesa-utils "8.0.1+git20110129+d8f7d6b-0ubuntu2"

ProblemType: Bug
DistroRelease: Ubuntu 12.10
Package: xserver-xorg-video-intel 2:2.20.9-0ubuntu2
ProcVersionSignature: Ubuntu 3.5.0-19.30-generic 3.5.7
Uname: Linux 3.5.0-19-generic i686
ApportVersion: 2.6.1-0ubuntu6
Architecture: i386
Date: Fri Dec 7 02:37:04 2012
InstallationDate: Installed on 2012-12-07 (0 days ago)
InstallationMedia: Lubuntu 12.10 "Quantal Quetzal" - Release i386 (20121017.1)
MarkForUpload: True
SourcePackage: xserver-xorg-video-intel
UpgradeStatus: No upgrade log present (probably fresh install)

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

Created attachment 38937
xorg log

Chipset: 945GM
Kernel version: 2.6.35 (same problem in 2.6.34 but not in 2.6.33)
Arch: i686
xorg-server: 1.8.1.902
mesa / intel-dri: 7.8.2
xf86-video-intel: 2.12.0
libdrm version: 2.4.21

Linux distribution: Arch linux (similar issues reported for fedora too)
Machine model: Asus 1005HA
Display Connector: LVDS (happens on VGA too)

Reproducible: Always

Steps to reproduce:
-------------------
Compile mesa demos
Launch teapot
Observe the framerate
Move the mouse around
Observe the framerate jumping
(in my case it went from 14~125 to 30 just by putting my finger on the touchpad)

Roll back to kernel 2.6.33
Launch teapot
Observe that there is no difference in framerate if you move the mouse around, notice the framerate is "high" (30fps for me) in any case.

Even though just changing the kernel version makes the bug disappear, I think it is more logical to file the bug report here.
If I made a mistake, I apologize.

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

Created attachment 38938
dmesg log

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

Created attachment 38939
xorg.conf

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

Sorry for the typo:
"(in my case it went from 14~125[..]
Is:
"(in my case it went from 14~15[..]

Revision history for this message
In , Chris Wilson (ickle) wrote :

Swapbuffer vs interrupts.

Revision history for this message
In , Chris Wilson (ickle) wrote :

Is the teapot fullscreen? Are page-flips enabled? Is something disabling the interrupts on your system?

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Looks like the processor c-state issue we have on some other 945 machines. If you boot with processor.max_cstate=1 does the problem go away?

The issue is that we rely on vblank interrupts arriving at the correct frequency, and on some platforms when the CPU is in a deep sleep state, it won't wake up when a vblank interrupt arrives, but it will wake up when other device interrupts arrive. That's why you see the performance increase when you move the mouse.

I still don't know the root cause, but if the above works for you, then it's a duplicate of a known bug at least.
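
To check whether the boot parameter is actually in effect and which idle states the CPU is still entering, something like the following Python sketch may help. It assumes a Linux system that exposes the cpuidle sysfs directory at `/sys/devices/system/cpu/cpu0/cpuidle`; the function names are made up for this illustration and are not part of any tool mentioned in this report.

```python
import re
from pathlib import Path

# Assumed location of the cpuidle sysfs entries for the first CPU.
CPUIDLE_DIR = Path("/sys/devices/system/cpu/cpu0/cpuidle")


def max_cstate_from_cmdline(cmdline):
    """Return the processor.max_cstate value from a kernel command line, or None."""
    match = re.search(r"(?:^|\s)processor\.max_cstate=(\d+)(?:\s|$)", cmdline)
    return int(match.group(1)) if match else None


def list_idle_states(cpuidle_dir=CPUIDLE_DIR):
    """Return (name, exit latency in us, usage count) for each cpuidle state."""
    state_dirs = [p for p in cpuidle_dir.glob("state*") if p.name[5:].isdigit()]
    states = []
    # Sort numerically so that state10 comes after state9.
    for state in sorted(state_dirs, key=lambda p: int(p.name[5:])):
        name = (state / "name").read_text().strip()
        latency = int((state / "latency").read_text())
        usage = int((state / "usage").read_text())
        states.append((name, latency, usage))
    return states
```

Calling `max_cstate_from_cmdline(Path("/proc/cmdline").read_text())` after a reboot should return 1 if the option was picked up, and the usage counters from `list_idle_states()` show whether deeper states are still being entered.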

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

(In reply to comment #5)
> Is the teapot fullscreen? Are page-flips enabled? Is something disabling the
> interrupts on your system?

I don't know how to make teapot run in fullscreen mode; the best I could do was launch it from xterm on an empty X screen without any WM, and the problem persists.
With compiz enabled (and unredirect fullscreen windows) I was also able to make it fullscreen with a shortcut; the problem persists.

Page flips were disabled on my system because of stability issues (enabled in kernel, disabled in X). I recompiled the driver to enable them again for a while and retried, with no success.

I can't say whether something is blocking interrupts on my system, sorry. Anyway, listening to an mp3 in the background helped a bit (fps went from 15 to 20 in teapot).

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

(In reply to comment #6)
> Looks like the processor c-state issue we have on some other 945 machines. If
> you boot with processor.max_cstate=1 does the problem go away?
>
> The issue is that we rely on vblank interrupts arriving at the correct
> frequency, and on some platforms when the CPU is in a deep sleep state, it
> won't wake up when a vblank interrupt arrives, but it will wake up when other
> device interrupts arrive. That's why you see the performance increase when you
> move the mouse.
>
> I still don't know the root cause, but if the above works for you, then it's a
> duplicate of a known bug at least.

I tried that boot option and the problem disappeared.
Unfortunately, and as expected, it makes my netbook power hungry: using powertop I noticed that it went from ~6.5..7W to ~8W+ just idling, and the expected battery life dropped from about ~10hrs to ~8.

Out of curiosity, is a vblank interrupt still needed when one doesn't need (or doesn't care) about vsync?

At least, thank you very much for answering and clarifying things. At this point it is clear that this is a duplicate bug; could you mark it as a duplicate of the right one?

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

(In reply to comment #8)
> I tried that boot option and the problem disappeared.
> Unfortunately, and as expected that thing makes my netbook power hungry, using
> powertop i noticed that it went from ~6.5..7W to ~8W+ just idling, and expected
> uptime battery life dropped from about ~10hrs to ~8.

Yeah, it's unfortunate. I don't think they see this problem on Windows because they probably can't reach a deep enough sleep state to be affected (Windows and its applications tend to have lots of timers running that keep the CPU awake).

> Out of curiosity, is a vblank interrupt still needed when one doesn't need (or
> doesn't care) about vsync?

Yes, if you don't have apps waiting for vsync or doing buffer swaps, you shouldn't need the vblank interrupt (the kernel will shut it off). But anything using GL will do buffer swaps and thus need the vsync interrupt, unless you disable it entirely using vblank_mode=0 in your dri configuration file (.drirc or /etc/drirc iirc).

> At least, thank you very much for answering and clarifying things, at this point
> it is clear that this is a duplicate bug, if could you mark it to the right
> one?

Actually I don't think we have a bug open on this, so we'll use this one. :) All the discussion of this so far has just been on the mailing lists.

Revision history for this message
In , anarsoul (anarsoul) wrote :

Jesse, what about the pm_qos stuff mentioned on the mailing list?

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

I don't have a tool to set that from userspace, and I didn't see a good way of doing it from within the kernel, but I expect it just limits the processor max c state, just like the boot param.

Another thing to try, that worked on my aspireone, is to boot with maxcpus=1.
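
On the pm_qos question: mainline kernels expose a userspace request interface through the `/dev/cpu_dma_latency` device, which accepts a 32-bit latency bound in microseconds and honours it only while the file stays open. The Python sketch below uses that interface; it needs root, the function names are hypothetical, and the right latency value for a given machine is not known from this report, so it is left as a parameter.

```python
import os
import struct
import time

# Kernel PM QoS device for the CPU wakeup latency bound (requires root).
PM_QOS_DEVICE = "/dev/cpu_dma_latency"


def encode_latency(latency_us):
    """Pack the latency bound as the 32-bit native integer the device expects."""
    return struct.pack("i", latency_us)


def hold_latency_bound(latency_us, seconds, device=PM_QOS_DEVICE):
    """Request a CPU wakeup latency bound and hold it for `seconds`.

    The request only lasts while the file descriptor stays open, so the
    bound is dropped again as soon as this function returns.
    """
    fd = os.open(device, os.O_WRONLY)
    try:
        os.write(fd, encode_latency(latency_us))
        time.sleep(seconds)
    finally:
        os.close(fd)
```

For example, running `hold_latency_bound(20, 60)` as root in one terminal while glxgears runs in another would show whether a tighter latency bound changes the frame rate; the value 20 is only a placeholder.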

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

(In reply to comment #9)

Anyway, the same driver on kernel 2.6.33 performs just fine for me (low power consumption and correct vblank interrupts), so I think this problem definitely has a solution lying around.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

2.6.33 doesn't support vblank events, so you wouldn't be able to run the code that exposes this problem. I'm sure the interrupt issue still exists on 2.6.33 though, you just don't see it because you're not running code that's sensitive to interrupt latency.

Revision history for this message
In , anarsoul (anarsoul) wrote :

I tried using ShadowFB as workaround, and found that it works _much_ better with KDE 4.5 and latest intel driver :) (at least konsole is not jerky)

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

(In reply to comment #13)
> 2.6.33 doesn't support vblank events, so you wouldn't be able to run the code
> that exposes this problem. I'm sure the interrupt issue still exists on 2.6.33
> though, you just don't see it because you're not running code that's sensitive
> to interrupt latency.

Please excuse in advance my ignorance and what is probably a stupid question, but what are the advantages (if any) of running that code?
I'm asking because I didn't notice any performance or tearing difference with 2.6.35+processor.max_cstate=1 compared to 2.6.33.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

The new code has some potential performance benefits (it allows page flipping and won't waste GPU time on frames that won't be displayed), and adds back several missing GL features.

You can get the same behavior with current code as in 2.6.33 by disabling the new features. You can do this by setting vblank_mode=0 in your environment or drirc config file.
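
The environment-variable route can be scripted so that the same test is repeated with different settings. The following Python sketch builds the command and environment for glxgears with a given `vblank_mode` (Mesa accepts the values 0 to 3) and runs it for a fixed time. The helper names and the 16-second default are choices made for this illustration only.

```python
import os
import subprocess


def glxgears_command(vblank_mode, fullscreen=False):
    """Return (argv, env) for running glxgears with a given Mesa vblank_mode."""
    if vblank_mode not in (0, 1, 2, 3):
        raise ValueError("vblank_mode must be 0, 1, 2 or 3")
    env = dict(os.environ)
    env["vblank_mode"] = str(vblank_mode)
    argv = ["glxgears"]
    if fullscreen:
        argv.append("-fullscreen")
    return argv, env


def run_glxgears(vblank_mode, seconds=16):
    """Run glxgears for roughly `seconds` and return whatever it printed."""
    argv, env = glxgears_command(vblank_mode)
    proc = subprocess.Popen(argv, env=env, stdout=subprocess.PIPE, text=True)
    try:
        out, _ = proc.communicate(timeout=seconds)
    except subprocess.TimeoutExpired:
        # glxgears runs until closed, so stop it once the time is up.
        proc.terminate()
        out, _ = proc.communicate()
    return out
```

Comparing `run_glxgears(0)` with `run_glxgears(2)` on an otherwise idle machine should show whether the slowdown only appears when buffer swaps wait for vblank.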

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

(In reply to comment #16)
> The new code has some potential performance benefits (it allows page flipping
> and won't waste GPU time on frames that won't be displayed), and adds back
> several missing GL features.
>
> You can get the same behavior with current code as in 2.6.33 by disabling the
> new features. You can do this by setting vblank_mode=0 in your environment or
> drirc config file.

I just read this answer by Vasily Khoruzhick on the mailing list:
"That doesn't help, glxgears shows ~1000fps, but its output is jerky"

Anyway, thank you for the suggestion; I'll try it myself as soon as possible.

Revision history for this message
In , Chris Wilson (ickle) wrote :

(In reply to comment #17)
> I just read this answer by Vasily Khoruzhick on the mailing list:
> "That doesn't help, glxgears shows ~1000fps, but its output is jerky"
>
> Anyway thank you for the suggestion, i'll try by myself as soon as possible.

If I've got it right, that should be fixed on -next with the per-process throttling.

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

(In reply to comment #18)
> (In reply to comment #17)
> > I just read this answer by Vasily Khoruzhick on the mailing list:
> > "That doesn't help, glxgears shows ~1000fps, but its output is jerky"
> >
> > Anyway thank you for the suggestion, i'll try by myself as soon as possible.
>
> If I've got it right, that should be fixed on -next with the per-process
> throttling.

I can't fully understand what you said, but let's wait for the next release then.

Revision history for this message
In , anarsoul (anarsoul) wrote :

(In reply to comment #18)
> (In reply to comment #17)
> > I just read this answer by Vasily Khoruzhick on the mailing list:
> > "That doesn't help, glxgears shows ~1000fps, but its output is jerky"
> >
> > Anyway thank you for the suggestion, i'll try by myself as soon as possible.
>
> If I've got it right, that should be fixed on -next with the per-process
> throttling.

Please give a link to commit/patch when it's ready. Thanks

Revision history for this message
In , anarsoul (anarsoul) wrote :

(In reply to comment #18)
> If I've got it right, that should be fixed on -next with the per-process
> throttling.

Tried drm-intel-next from today, bug still remains.

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Created attachment 39347
ICH7 LPC debug driver

Can you load this driver and tell me what it outputs? I wonder if BM_BREAK_EN is 0 on your machine as well...

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

This patch on top of the last attachment should let the CPU wake up much more frequently, assuming the break register is 0. Give it a try and see if it helps your performance problem.

diff --git a/drivers/platform/x86/intel_lpc.c b/drivers/platform/x86/intel_lpc.c
index d3c5ef5..3be93c1 100644
--- a/drivers/platform/x86/intel_lpc.c
+++ b/drivers/platform/x86/intel_lpc.c
@@ -50,6 +50,8 @@ static int lpc_probe(struct pci_dev *dev, const struct pci_dev
        dev_err(&dev->dev, "ACPI_CX_STATE_CONF: 0x%02x\n", cxstate);
        dev_err(&dev->dev, "ACPI_BM_BREAK_EN: 0x%02x\n", break_en);

+       pci_write_config_byte(dev, ACPI_BM_BREAK_EN, 0xf3);
+
 out:
        return ret;
 }

Revision history for this message
In , anarsoul (anarsoul) wrote :

[ 565.573458] intel lpc 0000:00:1f.0: ACPI_CX_STATE_CONF: 0x1c
[ 565.573464] intel lpc 0000:00:1f.0: ACPI_BM_BREAK_EN: 0x00

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

(In reply to comment #23)
> This patch on top of the last attachment should let the CPU wake up much more
> frequently, assuming the break reg is 0, give it a try and see if it helps your
> performance problem.

I haven't tried the patch yet because I'm not very familiar with kernel patching and we need this netbook daily.

But I was wondering whether it is possible (and how) to use setpci to try different configurations of the BM_BREAK_EN register at runtime.

Thank you very much for your efforts.

Revision history for this message
In , anarsoul (anarsoul) wrote :

Bug is reproducible on following machines:

Lenovo 3000 N100 laptop, Core 2 Duo T5500 CPU,
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03), pciid: 8086:27a2

Acer Aspire AOA110 netbook, Atom N270 CPU,
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GME Express Integrated Graphics Controller (rev 03), pciid: 8086:27ae

Revision history for this message
In , anarsoul (anarsoul) wrote :

Also reproducible on Acer extensa 5513 laptop, with C2D T5500 CPU,
00:02.0 VGA compatible controller: Intel Corporation Mobile 945GM/GMS, 943/940GML Express Integrated Graphics Controller (rev 03), pciid: 8086:27a2

Revision history for this message
In , Oleksij Rempel (olerem) wrote :

Just to add my two cents:
I have the same issue with Atom + 945GM.
If I add more load on the CPU, the frame rate grows too.
The processor.max_cstate option didn't change anything; powertop shows there is still C4 (maybe some other kernel bug).

maxcpus=1 solves the problem: it works with C4, power saving and better performance.

So how about a problem with the scheduler or IRQ balancing on SMP?

I'll test the patch from Jesse ASAP.

Revision history for this message
In , Oleksij Rempel (olerem) wrote :

The patch from comment 23 does not make any difference for me. Disabling SMP is the best configuration for me.

Revision history for this message
In , T-artem (t-artem) wrote :

This bug probably affects Intel HD Graphics too:

glxgears with idle CPU:

4925 frames in 5.0 seconds = 984.918 FPS
4941 frames in 5.0 seconds = 988.052 FPS
4996 frames in 5.0 seconds = 999.137 FPS
4973 frames in 5.0 seconds = 994.512 FPS

glxgears with 100% loaded CPU (one thread only):

7544 frames in 5.0 seconds = 1508.685 FPS
7458 frames in 5.0 seconds = 1491.536 FPS
7378 frames in 5.0 seconds = 1475.574 FPS
7415 frames in 5.0 seconds = 1482.973 FPS

roughly 50%(!) faster.
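
The "roughly 50%" figure can be checked from the numbers above. This Python sketch parses the glxgears lines quoted in this comment, averages each run, and computes the relative difference; the regular expression and variable names are only illustrative.

```python
import re
from statistics import mean

# Matches glxgears output such as "4925 frames in 5.0 seconds = 984.918 FPS".
FPS_LINE = re.compile(r"(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS")


def parse_fps(text):
    """Return the FPS value from every glxgears report line in `text`."""
    return [float(m.group(3)) for m in FPS_LINE.finditer(text)]


IDLE = """
4925 frames in 5.0 seconds = 984.918 FPS
4941 frames in 5.0 seconds = 988.052 FPS
4996 frames in 5.0 seconds = 999.137 FPS
4973 frames in 5.0 seconds = 994.512 FPS
"""

LOADED = """
7544 frames in 5.0 seconds = 1508.685 FPS
7458 frames in 5.0 seconds = 1491.536 FPS
7378 frames in 5.0 seconds = 1475.574 FPS
7415 frames in 5.0 seconds = 1482.973 FPS
"""

idle_fps = mean(parse_fps(IDLE))
loaded_fps = mean(parse_fps(LOADED))
speedup_percent = (loaded_fps / idle_fps - 1.0) * 100.0
print(f"idle {idle_fps:.1f} FPS, loaded {loaded_fps:.1f} FPS, +{speedup_percent:.1f}%")
```

The averages come out at about 992 FPS idle and 1490 FPS under load, which is consistent with the 50% estimate.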

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

Today I tried 2.6.36, and obviously the results are the same, so I'm still using 2.6.33.
For me disabling a core or hyperthreading is not an option due to the higher power consumption and the shorter battery life.

If I understood properly, the issue still appears to be unresolved, and the hypotheses made so far haven't led to anything really useful.

I understand that the new code aims to provide a "gain" in performance, but my proposal now is to add some kind of workaround for the specific chipsets that expose the problem, so that at least their users will be able to upgrade to newer kernels without suffering any performance "loss".

Could such a thing be done in the video driver itself, or does it require patches or special config options in the kernel?

Revision history for this message
In , Chris Wilson (ickle) wrote :

Created attachment 40923
Use PM QoS latency to prevent dropping below C2 on Atom

Proof-of-principle?

Revision history for this message
In , T-artem (t-artem) wrote :

(In reply to comment #32)
> Created an attachment (id=40923) [details]
> Use PM QoS latency to prevent dropping below C2 on Atom
>
> Proof-of-principle?

This patch helped only marginally (10% better than without it in idle mode):

$ glxgears (power savings on, CPU running @ 1.2GHz)
5601 frames in 5.0 seconds = 1120.104 FPS
5612 frames in 5.0 seconds = 1122.275 FPS
5603 frames in 5.0 seconds = 1120.483 FPS
5606 frames in 5.0 seconds = 1121.091 FPS
5587 frames in 5.0 seconds = 1117.238 FPS

$ glxgears (power savings off, CPU running @ 3.2GHz)
7089 frames in 5.0 seconds = 1417.741 FPS
7068 frames in 5.0 seconds = 1413.511 FPS
7082 frames in 5.0 seconds = 1416.285 FPS
7079 frames in 5.0 seconds = 1415.792 FPS
7057 frames in 5.0 seconds = 1411.390 FPS

P.S. I have Intel HD 1st generation graphics.

Revision history for this message
In , Kokoko3k (kokoko3k) wrote :

As the .drirc configuration file is finally honoured in the latest intel-dri/mesa (I have 7.9.0.git20101207), setting vblank_mode=0 (as explicitly suggested by Jesse Barnes) now works and the issue is gone for me.
Strangely enough, i can't see any tearing in glxgears.

I know this is a workaround, but on such poor hardware enabling vsync would be a bad idea anyway.

Revision history for this message
In , anarsoul (anarsoul) wrote :

(In reply to comment #32)
> Created an attachment (id=40923) [details]
> Use PM QoS latency to prevent dropping below C2 on Atom
>
> Proof-of-principle?

As I stated on IRC, it does not help in my case - glxgears still shows 30-40fps instead of 60. I want to note that it's not only a tearing/vblank issue: response to user actions in KDE with effects enabled is not good (it was much better earlier).

Revision history for this message
In , Lambchop468 (lambchop468) wrote :

Created attachment 41720
Use PM QoS latency to keep CPU from dropping below C1 when vblanks enabled

(In reply to comment #32)
> Created an attachment (id=40923) [details]
> Use PM QoS latency to prevent dropping below C2 on Atom
>
> Proof-of-principle?

Here is a variant of that patch I tried that does fix the issue on my hardware:
Acer Aspire One 9" Netbook AOA150, 945GSE and Intel N270 Processor

It does produce a few WARNs because I am calling pm_qos_add_request from an interrupt disabled context. (also attached)

testcase used is vblank_mode=2 glxgears

Revision history for this message
In , Lambchop468 (lambchop468) wrote :

Created attachment 41721
WARNs from using "Use PM QoS latency to keep CPU from dropping below C1 when vblanks enabled"

Revision history for this message
In , Chris Wilson (ickle) wrote :

*** Bug 32916 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Chris Wilson (ickle) wrote :

Created attachment 41796
Use PM QoS to prevent C-State starvation of gen3 GPU

Raise you a work function.

Revision history for this message
In , anarsoul (anarsoul) wrote :

(In reply to comment #39)
> Created an attachment (id=41796) [details]
> Use PM QoS to prevent C-State starvation of gen3 GPU
>
> Raise you a work function.

It does not apply on top of 2.6.37; could you please prepare a version for the stable kernel?

Revision history for this message
In , Lambchop468 (lambchop468) wrote :

Created attachment 41814
Use PM QoS to prevent C-State starvation of gen3 GPU for 2.6.37

(In reply to comment #40)
> (In reply to comment #39)
> > Created an attachment (id=41796) [details]
> > Use PM QoS to prevent C-State starvation of gen3 GPU
> >
> > Raise you a work function.
>
> It does not apply on top of 2.6.37, could you please prepare version for stable
> kernel?

Chris's patch mangled to work with 2.6.37 (two changes, s/irq_lock/user_irq_lock/ in two places)

Revision history for this message
In , Lambchop468 (lambchop468) wrote :

(In reply to comment #39)
> Created an attachment (id=41796) [details]
> Use PM QoS to prevent C-State starvation of gen3 GPU
>
> Raise you a work function.

Confirming that this works on 2.6.37 on:

Acer Aspire One 9" Netbook AOA150, 945GSE and Intel N270 Processor

testcase
vblank_mode=2 glxgears

(I probably should test with -next but don't have time at the moment)

Revision history for this message
In , anarsoul (anarsoul) wrote :

(In reply to comment #41)
> Created an attachment (id=41814) [details]
> Use PM QoS to prevent C-State starvation of gen3 GPU for 2.6.37
>
> Chris's patch mangled to work with 2.6.37 (two changes,
> s/irq_lock/user_irq_lock/ in two places)

Thanks, looks like it works.

Revision history for this message
In , anarsoul (anarsoul) wrote :

(In reply to comment #43)

> Thanks, looks like it works.

But it does not work after update to xf86-video-intel-2.14.0 :( 20-30 fps in glxgears instead of 60.

Revision history for this message
In , Lambchop468 (lambchop468) wrote :

(In reply to comment #44)
> (In reply to comment #43)
>
> > Thanks, looks like it works.
>
> But it does not work after update to xf86-video-intel-2.14.0 :( 20-30 fps in
> glxgears instead of 60.

I'm not seeing this with xf86-video-intel-2.14.0

Hmm...

libdrm-git version: bad5242a
xf86-video-intel version: 2.14.0
mesa version: 7.10
xorg-server: 1.9.3.901-1
kernel: (not vanilla) 2.6.37 + patch in attachment 41814

Revision history for this message
In , Jesse Barnes (jbarnes-virtuousgeek) wrote :

Reassigning back to Chris; doesn't look like we'll be able to find a hardware solution to this one.

Revision history for this message
In , Chris Wilson (ickle) wrote :

I've applied Alexander's patch to drm-intel-next, so please give that branch a thorough testing!

Revision history for this message
In , Chris Wilson (ickle) wrote :

Tentatively closing with the patch landing in -next.

Things to look out for:

1. fps stuttering (i.e. the reoccurrence of the original bug);

2. obscene power consumption;

3. aliens.

Revision history for this message
In , Chris Wilson (ickle) wrote :

Created attachment 42959
Twiddle INSTPM bit11

New patch time!

Revision history for this message
In , Oleksij Rempel (olerem) wrote :

I tested the last patch (replace vblank PM QoS with "Interrupt-Based AGPBUSY#").

It brings back the first issue, fps stuttering.
Power usage is OK.

Revision history for this message
In , Chris Wilson (ickle) wrote :

Created attachment 44065
Move INSTPM bit twiddling to intel_mark_busy

How about with this patch?

Revision history for this message
In , Oleksij Rempel (olerem) wrote :

no noticeable difference.

Revision history for this message
In , Lambchop468 (lambchop468) wrote :

(In reply to comment #51)
> Created an attachment (id=44065) [details]
> Move INSTPM bit twiddling to intel_mark_busy
>
> How about with this patch?

plain drm-intel-next (47ae63e) with and without this patch resulted in missing vblanks & stuttery glxgears.

As discussed on IRC, my BIOS doesn't set INSTPM_AGPBUSY_DIS (INSTPM bit 11), so this won't fix it anyway.

Revision history for this message
In , Chris Wilson (ickle) wrote :

*** Bug 37966 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Chris Wilson (ickle) wrote :
Revision history for this message
Benjamin Laisure (axxon95) wrote :
bugbot (bugbot)
tags: added: performance
Revision history for this message
In , Rodrigo-vivi (rodrigo-vivi) wrote :

Is this issue still there on newer kernels? What is the latest kernel on which this issue was seen?

Has anyone tested this better-gpu_cpufreq branch?

Revision history for this message
In , anarsoul (anarsoul) wrote :

Still here on 3.6; I will test on 3.7 as soon as it gets into the Arch Linux repos.

Revision history for this message
In , Chris Wilson (ickle) wrote :

No need, it's a known design feature of the power management hardware. The only question is whether we can find an acceptable workaround.

Revision history for this message
In , Chris Wilson (ickle) wrote :

*** Bug 59895 has been marked as a duplicate of this bug. ***

Revision history for this message
In , Sergio Callegari (callegar) wrote :

Thanks for pointing out so quickly the status of Bug 59895 as a duplicate of this one! This thread was an interesting read.

Changed in xserver-xorg-video-intel:
importance: Unknown → Medium
status: Unknown → Confirmed
Chris Wilson (ickle)
Changed in xserver-xorg-video-intel (Ubuntu):
status: New → Confirmed
Chris Wilson (ickle)
summary: - Ubuntu vs. Lubuntu odd GPU Performance
+ [gen3] Bad GPU performance whilst CPU is in deep sleep
Revision history for this message
In , Daniel-ffwll (daniel-ffwll) wrote :

I guess it's time to give up - the only approach with restricting the deep sleep states resulted in horrid power consumption figures ... Just wiggle your mouse a bit :(

Revision history for this message
penalvch (penalvch) wrote :

Benjamin Laisure, this bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? If so, could you please test for this with the latest development release of Ubuntu? ISO images are available from http://cdimage.ubuntu.com/daily-live/current/ .

If it remains an issue, could you please run the following command in the development release from a Terminal (Applications->Accessories->Terminal), as it will automatically gather and attach updated debug information to this report:

apport-collect -p xserver-xorg-video-intel REPLACE-WITH-BUG-NUMBER

Please note, given that the information from the prior release is already available, doing this on a release prior to the development one would not be helpful.

Thank you for your understanding.

Helpful bug reporting tips:
https://wiki.ubuntu.com/ReportingBugs

Changed in xserver-xorg-video-intel (Ubuntu):
importance: Undecided → Low
status: Confirmed → Incomplete
Revision history for this message
In , Lambchop468 (lambchop468) wrote :
Revision history for this message
In , Oleksij Rempel (olerem) wrote :

I'll be able to test them in 2-3 weeks.

Changed in xserver-xorg-video-intel:
status: Confirmed → Won't Fix
Revision history for this message
In , Chris Wilson (ickle) wrote :

As a reminder to myself, my only surviving non-pnv machine (915gm) has a processor that does not support C-states (only speedstep). I tried the patches and only keeping the CPU at maximum is sufficient to hit glxgears vrefresh.

Changed in xserver-xorg-video-intel:
status: Won't Fix → Incomplete
Revision history for this message
In , Oleksij Rempel (olerem) wrote :

So, I can test it.
Is there any place where I can pull all the patches together? On top of which branch should I test?

Revision history for this message
In , Ville-syrjala-e (ville-syrjala-e) wrote :

(In reply to comment #65)
> So, i can test it.
> Are there any place where i can pull all patches together? On top of which
> branch should i test?

I pushed the patches here:
git://gitorious.org/vsyrjala/linux.git agpbusy

I also reorganized them so it's easy to revert the top commit, which is something you might as well try in case there's no improvement with the branch as is.

Revision history for this message
In , Oleksij Rempel (olerem) wrote :

Hmm... I do not see any noticeable changes.
I tested these patches on Ubuntu 13.10 with the Unity/Compiz desktop.
Glxgears shows the same performance before and after the patches - about 58fps.
C4ATM usage seems to be identical too.

Do you have any suggestions for what I should test?

Revision history for this message
In , Chris Wilson (ickle) wrote :

It would be easier to reproduce on a bare X.

If you do from a vt:

sudo service lightdm stop
sudo Xorg -ac -noreset & sleep 3; DISPLAY=:0 xterm

then launch glxgears from the xterm, does it show the behaviour we need to fix?
i.e. runs at below refresh rate unless there is another source of interrupts (e.g. wiggling the mouse)?

If you can reproduce that, we can begin to test the patches.

Revision history for this message
In , Oleksij Rempel (olerem) wrote :

No, I can't reproduce the initial bug.
After powertop optimisation I get about 20 wk/s, just to make sure the system is idle.
On plain Xorg I get 125fps, without any glitches.
With and without the patches I get the same results.

Revision history for this message
In , Chris Wilson (ickle) wrote :

(In reply to comment #69)
> On plain Xorg i get 125fps. Without any glitches.
> With and without patches i get same results.

Ah, that's broken - we are not using vsync. Presumably it failed to get permission to open /dev/dri/card0 and so is using indirect rendering (which does not respect vsync).

Try "LIBGL_DEBUG=1 glxinfo" and see if (a) reports indirect rendering and (b) why.
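
When collecting the glxinfo output for a report, the two things asked for here can be pulled out mechanically. The Python sketch below looks for the `direct rendering:` line that glxinfo prints and collects the `libGL` diagnostic lines that `LIBGL_DEBUG` produces on stderr; the helper names are hypothetical.

```python
import re


def direct_rendering(glxinfo_output):
    """Return True/False from glxinfo's 'direct rendering:' line, or None if absent."""
    match = re.search(r"^direct rendering:\s*(Yes|No)\b", glxinfo_output, re.MULTILINE)
    if match is None:
        return None
    return match.group(1) == "Yes"


def libgl_messages(glxinfo_stderr):
    """Collect the libGL diagnostic lines printed when LIBGL_DEBUG is set."""
    return [line for line in glxinfo_stderr.splitlines() if line.startswith("libGL")]
```

If `direct_rendering()` returns False, the `libGL` lines usually say why the DRI driver or device could not be opened.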

Revision history for this message
In , Oleksij Rempel (olerem) wrote :

You were right, there was no access to dri.
Now i tested it with sudo glxgears.
So results are absolutely unusable. with moving mouse fps will drop to 5fps. With moving mouse - 60fps. Results are same, before and after this patch set.

Revision history for this message
In , Oleksij Rempel (olerem) wrote :

Typo in previous comment:
without mouse - 5fps
with mouse - 60fps

Revision history for this message
In , Oleksij Rempel (olerem) wrote :

If it would somehow help, I can give ssh access to this machine.

Revision history for this message
In , Ville-syrjala-e (ville-syrjala-e) wrote :

(In reply to comment #71)
> You were right, there was no access to dri.
> Now i tested it with sudo glxgears.
> So results are absolutely unusable. with moving mouse fps will drop to 5fps.
> With moving mouse - 60fps. Results are same, before and after this patch set.

Hmm. Was it running fullscreen or under a GL compositor that page flips?

Something like: 'vblank_mode=3 glxgears -fullscreen' should force it to do what we want, assuming your wm isn't totally crap.

Revision history for this message
In , Chris Wilson (ickle) wrote :

(In reply to comment #74)
> (In reply to comment #71)
> > You were right, there was no access to dri.
> > Now i tested it with sudo glxgears.
> > So results are absolutely unusable. with moving mouse fps will drop to 5fps.
> > With moving mouse - 60fps. Results are same, before and after this patch set.
>
> Hmm. Was it running fullscreen or under a GL compositor that page flips?

Windowed under bare X.

> Something like: 'vblank_mode=3 glxgears -fullscreen' should force it to do
> what we want, assuming your wm isn't totally crap.

We don't need to force fullscreen to cause us to lose vblank interrupts whilst the processor is asleep (and so render very slowly).

Revision history for this message
In , Ville-syrjala-e (ville-syrjala-e) wrote :

(In reply to comment #75)
> We don't need to force fullscreen to cause us to lose vblank interrupts
> whilst the processor is asleep (and so render very slowly).

Oh right. Not sure where I got the idea that we wouldn't use vblank irqs unless fullscreen.

After thinking about this for a while I started to question why we're frobbing the AGPBUSY bit all the time. It won't force an exit from C3 unless there's a pending interrupt, so we should just be able to leave it on all the time.

I pushed that idea here:
git://gitorious.org/vsyrjala/linux.git agpbusy2

I guess the chances of it working are slim, but we might as well try.

Revision history for this message
In , Oleksij Rempel (olerem) wrote :

kernel 3.13.0-00966-gec441a0, same result. 5-10fps on idle system, and 60fps with moving mouse.

Revision history for this message
In , Daniel-ffwll (daniel-ffwll) wrote :

Yeah I guess that's it, time to give up on this one. Wiggling the mouse or running with wayland should fix this.

Thanks for reporting this bug and testing ideas, sorry that we couldn't make this work :(

Changed in xserver-xorg-video-intel:
status: Incomplete → Won't Fix
Revision history for this message
In , Ville-syrjala-e (ville-syrjala-e) wrote :

I got fed up with my 945gm not being capable of 60fps glxgears.

commit d938da6b132a2d6addeba4c57a67ec3c07824843
Author: Ville Syrjälä <email address hidden>
Date: Fri Mar 22 20:08:03 2019 +0200

    drm/i915: Disable C3 when enabling vblank interrupts on i945gm

The main difference compared to the older pm_qos attempts is that I found a way to dig out the exact c3 disable latency, so we should have a reasonable guarantee that we do disable c3 but not c2. The power cost of not using c3 seems to be about 0.7W on my machine (with the display on), so this isn't exactly cheap :(

I did spend quite a bit of time at some point digging through the chipset docs (such as they are). It's been a while since I did that but I'll try to summarize what I recall; Gen3 introduced some kind of new mechanism by which the gmch can wake up the CPU. The old AGPBUSY/PM_BUSY involved the ICH as well IIRC, whereas the new mechanism supposedly does not. IIRC the new mechanism already appears in the i915gm docs, but my theory is that i945gm is where it actually got into use and either it is broken or we're missing some magic undocumented bit somewhere. I did try (blindly if necessary) poking at various registers that seemed relevant. Alas, I was unable to find a magic bit to make C3+vblank interrupts cooperate.
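
The idea of disabling C3 while keeping C2 through a latency bound can be illustrated with a small calculation. The Python sketch below is not the code from the commit above. It assumes that the idle governor only enters states whose exit latency does not exceed the requested bound, and the state names and latencies in the usage note are hypothetical values chosen for the example.

```python
def latency_bound_excluding(states, first_excluded):
    """Return (bound_us, allowed_state_names) that rules out `first_excluded`.

    `states` is a list of (name, exit_latency_us) pairs. The bound is one
    microsecond below the exit latency of the state to exclude, so that
    state and every state with an equal or longer exit latency fail the
    assumed "exit latency <= bound" check.
    """
    latencies = dict(states)
    if first_excluded not in latencies:
        raise KeyError(first_excluded)
    bound = latencies[first_excluded] - 1
    if bound < 0:
        raise ValueError("cannot exclude a state with zero exit latency")
    allowed = [name for name, latency in states if latency <= bound]
    return bound, allowed
```

With hypothetical latencies `[("C1", 1), ("C2", 20), ("C3", 57)]`, calling `latency_bound_excluding(states, "C3")` gives a bound of 56 us and leaves C1 and C2 available, which matches the goal of disabling C3 but not C2.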

Changed in xserver-xorg-video-intel:
status: Won't Fix → Fix Released