NVIDIA X driver causes flickering, crashes or hangs

Bug #717873 reported by Chris on 2011-02-12
This bug affects 6 people
Affects Status Importance Assigned to Milestone
NVIDIA Drivers Ubuntu
nvidia-graphics-drivers (Ubuntu)

Bug Description

While I was doing nothing special, my display flickered and these messages appeared in the log files:

[ 25121.917] (WW) NVIDIA(0): The NVIDIA X driver has encountered too many errors. Falling
[ 25121.917] (WW) NVIDIA(0): back to write-back cached memory.

Feb 12 21:26:57 obiwan kernel: [25078.679920] NVRM: Xid (0001:00): 6, PE0001
Feb 12 21:26:57 obiwan kernel: [25078.701158] NVRM: Xid (0001:00): 6, PE0001
Feb 12 21:26:57 obiwan kernel: [25078.721347] NVRM: Xid (0001:00): 6, PE0001
Feb 12 21:26:57 obiwan kernel: [25078.741324] NVRM: Xid (0001:00): 6, PE0001
Feb 12 21:26:57 obiwan kernel: [25078.761275] NVRM: Xid (0001:00): 6, PE0001
Feb 12 21:26:57 obiwan kernel: [25078.781157] NVRM: Xid (0001:00): 6, PE0001
Feb 12 21:26:57 obiwan kernel: [25078.801296] NVRM: Xid (0001:00): 6, PE0001
Feb 12 21:26:57 obiwan kernel: [25078.821163] NVRM: Xid (0001:00): 6, PE0001

The graphics card is (output of lspci -vv):
01:00.0 VGA compatible controller: nVidia Corporation G92 [Quadro FX 2800M] (rev a2) (prog-if 00 [VGA controller])
        Subsystem: Dell Device 02ef
        Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Interrupt: pin A routed to IRQ 16
        Region 0: Memory at f5000000 (32-bit, non-prefetchable) [size=16M]
        Region 1: Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Region 3: Memory at f2000000 (64-bit, non-prefetchable) [size=32M]
        Region 5: I/O ports at df00 [size=128]
        [virtual] Expansion ROM at f6e00000 [disabled] [size=128K]
        Capabilities: [60] Power Management version 3
                Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
                Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
                Address: 0000000000000000 Data: 0000
        Capabilities: [78] Express (v2) Endpoint, MSI 00
                DevCap: MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <4us
                        ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
                DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
                        RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+
                        MaxPayload 128 bytes, MaxReadReq 256 bytes
                DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
                LnkCap: Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Latency L0 <512ns, L1 <1us
                        ClockPM- Surprise- LLActRep- BwNot-
                LnkCtl: ASPM L1 Enabled; RCB 128 bytes Disabled- Retrain- CommClk+
                        ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
                LnkSta: Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
                DevCap2: Completion Timeout: Not Supported, TimeoutDis+
                DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
                LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-, Selectable De-emphasis: -6dB
                         Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
                         Compliance De-emphasis: -6dB
                LnkSta2: Current De-emphasis Level: -6dB
        Capabilities: [100 v1] Virtual Channel
                Caps: LPEVC=0 RefClk=100ns PATEntryBits=1
                Arb: Fixed- WRR32- WRR64- WRR128-
                Ctrl: ArbSelect=Fixed
                Status: InProgress-
                VC0: Caps: PATOffset=00 MaxTimeSlots=1 RejSnoopTrans-
                        Arb: Fixed- WRR32- WRR64- WRR128- TWRR128- WRR256-
                        Ctrl: Enable+ ID=0 ArbSelect=Fixed TC/VC=01
                        Status: NegoPending- InProgress-
        Capabilities: [128 v1] Power Budgeting <?>
        Capabilities: [600 v1] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: nvidia
        Kernel modules: nvidia-current, nvidia-173, nouveau, nvidiafb

System is Kubuntu 10.10:
#uname -a
Linux obiwan 2.6.35-25-generic #44-Ubuntu SMP Fri Jan 21 17:40:44 UTC 2011 x86_64 GNU/Linux

The driver is
nvidia-current 270.18-0ubuntu1~maverick~xup1

bugbot (bugbot) on 2011-02-16
tags: added: kubuntu
Chris (mail-christianmayer) wrote :

The X server also sometimes resets itself (closing the current X session) or even freezes the system. These are the syslog entries at the relevant time:

Feb 18 23:07:08 obiwan kernel: [ 2582.965683] NVRM: Xid (0001:00): 6, PE0001
Feb 18 23:07:08 obiwan kernel: [ 2582.985837] NVRM: Xid (0001:00): 6, PE0001
Feb 18 23:07:08 obiwan kernel: [ 2583.005110] NVRM: Xid (0001:00): 13, 0001 00000000 00005097 0000020c b357e460 0000000c
Feb 18 23:08:12 obiwan kernel: [ 2646.812524] NVRM: Xid (0001:00): 6, PE0001
Feb 18 23:08:12 obiwan kernel: [ 2646.835663] NVRM: Xid (0001:00): 13, 0001 00000000 00005097 000015e4 ffe9e8e8 00000005
Feb 18 23:09:53 obiwan kernel: [ 2747.659517] NVRM: Xid (0001:00): 6, PE0001
Feb 18 23:09:53 obiwan kernel: [ 2747.680985] NVRM: Xid (0001:00): 6, PE0001
Feb 18 23:10:20 obiwan kernel: [ 2774.711218] dell-wmi: Received unknown WMI event (0x11)
Feb 18 23:10:21 obiwan kdm[1256]: X server for display :0 terminated unexpectedly
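
For anyone triaging similar logs, the NVRM lines above can be tallied per device and Xid code with a short script. This is only a minimal sketch: the helper name count_xid_codes and the regular expression are assumptions derived from the line format quoted in this report, and the script does not try to interpret what any Xid code means.

```python
import re
from collections import Counter

# Assumed line shape, taken from the syslog excerpts in this report:
#   "... NVRM: Xid (0001:00): 6, PE0001"
XID_RE = re.compile(r"NVRM: Xid \(([0-9a-fA-F:.]+)\): (\d+),")


def count_xid_codes(lines):
    """Return a Counter mapping (device, xid_code) to the number of occurrences."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__":
    # Sample lines copied from the excerpts in this report.
    sample = [
        "Feb 18 23:07:08 obiwan kernel: [ 2582.965683] NVRM: Xid (0001:00): 6, PE0001",
        "Feb 18 23:07:08 obiwan kernel: [ 2583.005110] NVRM: Xid (0001:00): 13, "
        "0001 00000000 00005097 0000020c b357e460 0000000c",
        "Feb 18 23:10:21 obiwan kdm[1256]: X server for display :0 terminated unexpectedly",
    ]
    for (device, code), n in sorted(count_xid_codes(sample).items()):
        print(f"device {device}: Xid {code} seen {n} time(s)")
```

Any iterable of lines works as input, so an open log file can be passed directly. The path /var/log/syslog is the usual Ubuntu location and is an assumption here, not something stated in this report.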

b2ag (thomas-b2ag) wrote :

I got this "NVRM: Xid (0001:00): 6, PE0001" in dmesg for the second time now, and I think it is related to faulty hardware. My laptop began to behave oddly shortly after a service by the manufacturer. I have also had a few freezes, with the laptop not returning from suspend or hanging when entering suspend. These problems occur randomly on a machine that was quite stable before the last service. It also seems pretty normal for this manufacturer to send people who damage my hardware when they should be fixing things by replacing parts.

I did some searching on this "NVRM: Xid ..." message and every result pointed to hardware-related issues:

Sorry for the bad news.

pepre (me-pepre) wrote :

After upgrading to 8 GB of RAM, I got the same error (0) and X segfaulted.

Following (1), I switched off dual-channel mode and the problem disappeared.

So I think this is not, as assumed above, a hardware failure, but a problem in the nvidia module, which is confused by large amounts of RAM in dual-channel mode.

(0) NVRM: Xid (0000:03:00): 6, PE0001
(1) https://bbs.archlinux.de/viewtopic.php?id=18836

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in nvidia-graphics-drivers (Ubuntu):
status: New → Confirmed
linas (linasvepstas) wrote :

Recently (over the last few weeks) a desktop here has been hanging in the middle of the night. The last thing in syslog before the hang is "NVRM: Xid (0001:00): 6, PE0001". This is on Lucid.

linas (linasvepstas) wrote :

Out of 4 hangs, it was the last message twice. It does not appear anywhere else in a week's worth of logs.

linas (linasvepstas) wrote :

For the other mystery hangs, the last message in syslog is this:

 type=1503 audit(1349642278.078:46): operation="open" pid=5234 parent=5145 profile="/usr/lib/firefox/firefox{,*[^s][^h]}" requested_mask="::rw" denied_mask="::rw" fsuid=1002 ouid=0 name="/dev/nvidiactl"

I have no clue why "audit" is reporting a denied(?) request by Firefox to write to the nvidia device just before the hang. I don't know what to make of this.

summary: - NVIDIA X driver causes flickering
+ NVIDIA X driver causes flickering, crashes or hangs
Calin Chiorean (celin) wrote :

I can confirm what pepre wrote. With kernel 3.9, nvidia-drivers-310 and Ubuntu 13.04, the PC hung no matter which solution I tried.
I removed 2 GB of RAM and it worked immediately.
