if only VGA is connected, Nouveau fails to initialise

Bug #1013270 reported by C de-Avillez
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Triaged
Importance: High
Assigned to: Unassigned
Milestone: (none)

Bug Description

Running on an HP Z400 with an nVidia GF100 Quadro 4000 card. If only VGA is connected, system initialisation never completes, and the log shows the following:

Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.029026] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 128
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.029030] Raw EDID:
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.029032] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.029034] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.029036] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.029038] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.029040] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.029042] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.029044] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.029045] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.127639] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 128
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.127643] Raw EDID:
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.127644] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.127646] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.127648] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.127650] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.127652] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.127654] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.127656] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.127658] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.226422] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 128
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.226426] Raw EDID:
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.226428] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.226431] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.226433] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.226435] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.226437] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.226439] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.226441] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.226444] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324953] [drm:drm_edid_block_valid] *ERROR* EDID checksum is invalid, remainder is 128
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324957] Raw EDID:
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324958] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324961] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324963] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324965] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324967] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324969] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324971] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324973] ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324977] nouveau 0000:0f:00.0: DVI-I-1: EDID block 0 invalid.
Jun 13 17:58:45 precise-server-x86-64 kernel: [ 7.324979] [drm] nouveau 0000:0f:00.0: DDC responded, but no EDID for DVI-I-1

This sequence repeats every few seconds.

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image-3.2.0-25-generic 3.2.0-25.40
ProcVersionSignature: Ubuntu 3.2.0-25.40-generic 3.2.18
Uname: Linux 3.2.0-25-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.24.
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.0.1-0ubuntu8
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/by-path', '/dev/snd/controlC1', '/dev/snd/hwC1D0', '/dev/snd/hwC1D1', '/dev/snd/hwC1D2', '/dev/snd/hwC1D3', '/dev/snd/pcmC1D3p', '/dev/snd/pcmC1D7p', '/dev/snd/pcmC1D8p', '/dev/snd/pcmC1D9p', '/dev/snd/controlC0', '/dev/snd/hwC0D0', '/dev/snd/pcmC0D0c', '/dev/snd/pcmC0D0p', '/dev/snd/pcmC0D2c', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info: Error: [Errno 2] No such file or directory
Card0.Amixer.values: Error: [Errno 2] No such file or directory
Card1.Amixer.info: Error: [Errno 2] No such file or directory
Card1.Amixer.values: Error: [Errno 2] No such file or directory
Date: Thu Jun 14 16:01:33 2012
HibernationDevice: RESUME=UUID=799ee7fe-d3c9-4b53-ae26-2e97ac78ec4a
IwConfig: Error: [Errno 2] No such file or directory
MachineType: Hewlett-Packard HP Z400 Workstation
ProcFB: 0 nouveaufb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.2.0-25-generic root=UUID=76296454-3f76-4cb7-9cfd-1380c3a120c2 ro quiet
RelatedPackageVersions:
 linux-restricted-modules-3.2.0-25-generic N/A
 linux-backports-modules-3.2.0-25-generic N/A
 linux-firmware 1.79
RfKill: Error: [Errno 2] No such file or directory
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 11/02/2011
dmi.bios.vendor: Hewlett-Packard
dmi.bios.version: 786G3 v03.54
dmi.board.asset.tag: 2UA2200P3V
dmi.board.name: 0B4Ch
dmi.board.vendor: Hewlett-Packard
dmi.board.version: D
dmi.chassis.asset.tag: 2UA2200P3V
dmi.chassis.type: 6
dmi.chassis.vendor: Hewlett-Packard
dmi.modalias: dmi:bvnHewlett-Packard:bvr786G3v03.54:bd11/02/2011:svnHewlett-Packard:pnHPZ400Workstation:pvr:rvnHewlett-Packard:rn0B4Ch:rvrD:cvnHewlett-Packard:ct6:cvr:
dmi.product.name: HP Z400 Workstation
dmi.sys.vendor: Hewlett-Packard

Brad Figg (brad-figg)
Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.5 kernel [0] (not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag (only that one tag; please leave the other tags). This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.5-rc2-quantal/

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: needs-upstream-testing
Changed in linux (Ubuntu):
status: Confirmed → Incomplete
tags: added: kernel-da-key
C de-Avillez (hggdh2) wrote :

This is similar to bug 712075; the most important difference is that here the EDID is all 0xff (which is to say, no real EDID).

I will try the mainline kernel shortly; right now I am regressing through the available Ubuntu versions. So far it happens on Quantal, Precise, and Oneiric.
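
The "remainder is 128" in the kernel message is consistent with a block that is entirely 0xff. As a minimal sketch, assuming the usual EDID rule that the 128 bytes of a block must sum to 0 modulo 256 and that a base block starts with the header 00 ff ff ff ff ff ff 00: 128 bytes of 0xff sum to 32640, which leaves 128 modulo 256. The function names below are hypothetical and this is not the kernel's own code.

```python
EDID_BLOCK_SIZE = 128
# Fixed 8-byte header expected at the start of an EDID base block.
EDID_HEADER = bytes([0x00, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0x00])


def edid_checksum_remainder(block: bytes) -> int:
    """Return the sum of all bytes in the block modulo 256 (0 means the checksum is good)."""
    if len(block) != EDID_BLOCK_SIZE:
        raise ValueError(f"expected {EDID_BLOCK_SIZE} bytes, got {len(block)}")
    return sum(block) % 256


def looks_like_valid_base_block(block: bytes) -> bool:
    """Illustrative check only: correct header and a zero checksum remainder."""
    return block[:8] == EDID_HEADER and edid_checksum_remainder(block) == 0


# The block dumped in the log above: every byte is 0xff.
all_ff = bytes([0xFF]) * EDID_BLOCK_SIZE
print(edid_checksum_remainder(all_ff))       # 128
print(looks_like_valid_base_block(all_ff))   # False
```

An all-0xff read fails both checks, which fits the reading that no real EDID was returned over the DVI<->VGA path.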

C de-Avillez (hggdh2) wrote :

Fails with Ubuntu kernels from Natty to Quantal; succeeds on Lucid, but probably because it defaults to the VESA driver.

Mainline kernel also fails; I have attached the mainline dmesg.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report at bugzilla.kernel.org [1]? That will allow the upstream developers to examine the issue, and may provide a quicker resolution to the bug.

If you are comfortable with opening a bug upstream, it would be great if you could report back the upstream bug number in this bug report. That will allow us to link this bug to the upstream report.

[1] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Changed in linux (Ubuntu):
status: Confirmed → Triaged
tags: removed: needs-upstream-testing
Gema Gomez (gema) wrote :

This issue is a blocker for testing LTS HWE. Is there any chance that you guys could speed it up rather than having to wait for a fix from upstream? Or do you have contact with someone there who could help expedite this?

Changed in linux (Ubuntu):
importance: Medium → High
tags: added: kernel-key
tags: added: quantal
Leann Ogasawara (leannogasawara) wrote :

Just adding some initial thoughts which I've also relayed via IRC/mumble discussions...

It would appear to me that the scope of this issue is much wider than just the LTS HWE Quantal testing in Precise. Based on comment #4, I'd expect this issue to present for all releases from Natty->Quantal. That being said, it would also seem to me that the most obvious workaround in the meantime would be to make sure a monitor is connected. That would at least allow this system to re-enter the testing pool.

I am now however receiving completely contradictory information between this bug report and IRC conversations:

From #ubuntu-kernel on FreeNode:
[10:12:56] <ogasawara> hggdh: and just so I have it clear in my head, when exhibiting this issue, the system in question is just connected to a kvm swtich, but no monitor is attached to that switch correct?
[10:13:36] <hggdh> ogasawara: the switch does have a monitor
[10:28:03] <ogasawara> hggdh: so I wanted to post a comment to the bug, but I'm still a little confused as to where the "if no monitor is connected..." part comes into play then.
[10:30:31] <hggdh> ogasawara: the machines connect to a KVM (in this case using a DVI<-> VGA converter; the KVM has a local monitor that allows one to connect and use a system; it also allows one to remotely connect to the KVM, and select a system to work on
[10:33:03] <hggdh> amd firepro V5900 prints out a few EDID errors and is done; system can be used from the console. nVidia keeps on printing EDID errors, making console usage impossible (server images, 3.5.0-10, now installing desktop)
[10:45:58] <ogasawara> hggdh: so a monitor *is* connected (albeit via a kvm switch) when you're seeing these repeated EDID errors.
[10:55:46] <hggdh> ogasawara: indeed
[10:56:45] <ogasawara> hggdh: so the bug report then is completely misleading with the title and description being "if no monitor is connected..."
[10:57:43] <hggdh> ogasawara: I agree

It would be helpful to get a clear and updated bug title and description of what is actually happening.

In the meantime, it would seem there might be a proposed patch upstream which may help. I'll try building a test kernel.

https://patchwork.kernel.org/patch/1301341/

C de-Avillez (hggdh2) wrote :

Based on our chat you could also have updated the title and description.

Anyway. Behaviour under 3.5.0-10 is different -- I only see a sequence of EDID errors on either server or desktop early in the boot process; server console is usable after install; desktop does complete boot and shows the login screen, but the KVM shows the screen losing sync and repainting every few seconds. Both server and desktop are SSH-accessible.

summary: - if no monitor is connected, Nouveau fails to initialise
+ if only VGA is connected, Nouveau fails to initialise
description: updated
C de-Avillez (hggdh2) wrote :

Of the machines I tested today on magners, only the nVidia Quadro still shows continuous EDID errors (but is accessible via SSH, so initialisation completed). This was on Quantal; other versions are still to be installed.

Leann Ogasawara (leannogasawara) wrote :

I'm slightly confused again by comments #9 and #10. Comment #9 says "I only see a sequence of EDID errors on either server or desktop early in the boot process; server console is usable after install;..." This led me to believe the repeated flood of EDID messages to the console is no longer present with the 3.5.0-10 Quantal kernel, i.e. you'd only see a finite number of messages upon boot. However, comment #10 seems to indicate just the opposite, i.e. "the nVidia Quadro still shows continuous EDID errors (but is accessible via SSH, so initialisation completed). This was on Quantal...". I'm not quite sure what to make of these contradictory statements. Could you maybe clarify? Is comment #9 referring to different hardware, e.g. the AMD FirePro V5900?

If you are still seeing the repeated flood of EDID messages to the console with the latest 3.5.0-10.10 Quantal kernel, please give the following Quantal test kernel a try and let us know your results. Thanks.

http://people.canonical.com/~ogasawara/lp1013270/amd64/

Leann Ogasawara (leannogasawara) wrote :

Please note, if you install the Quantal test kernel from comment #11, please make sure to install both the linux-image and linux-image-extra packages.

C de-Avillez (hggdh2) wrote :

Yes, there are different video cards. So far only the nVidia Quadro keeps on showing issues. I have not tested all hardware, though.

C de-Avillez (hggdh2) wrote :

This is on the nVidia Quadro system.

Behaviour changed from -10 to -11: I no longer see the EDID dumps, only a series of lines like:

Aug 16 20:20:21 nvidia-quadro4000 kernel: [ 2721.010452] [drm] nouveau 0000:0f:00.0: DDC responded, but no EDID for DVI-I-1

The screen shown by the KVM still flickers and resyncs every few seconds.

The attached syslog shows both.
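
To compare kernels by how often the message recurs, the syslog can be scanned for the "DDC responded, but no EDID" line. This is only a rough sketch: the regular expression is modelled on the single line quoted above, the function names are hypothetical, and other kernel versions may word the message differently.

```python
import re
from collections import Counter

# Pattern modelled on the one log line quoted in this report (an assumption,
# not a guaranteed format for every kernel version).
NO_EDID_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+\[drm\] nouveau (?P<pci>\S+): "
    r"DDC responded, but no EDID for (?P<connector>\S+)"
)


def find_no_edid_events(lines):
    """Return (uptime_seconds, connector) for every matching syslog line."""
    events = []
    for line in lines:
        match = NO_EDID_RE.search(line)
        if match:
            events.append((float(match.group("uptime")), match.group("connector")))
    return events


def count_by_connector(events):
    """Count how many no-EDID messages were logged per connector."""
    return Counter(connector for _, connector in events)


sample = [
    "Aug 16 20:20:21 nvidia-quadro4000 kernel: [ 2721.010452] [drm] nouveau "
    "0000:0f:00.0: DDC responded, but no EDID for DVI-I-1",
]
events = find_no_edid_events(sample)
print(events)
print(count_by_connector(events))
```

Differences between consecutive uptime values would give the repeat interval, which could be checked against the "every few seconds" resync seen on the KVM.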

tags: removed: kernel-key
Para Siva (psivaa)
tags: added: rls-q-incoming
penalvch (penalvch)
tags: added: bios-outdated-v03.57
tags: added: oneiric
tags: added: natty