[Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu

Bug #1152484 reported by TienFu Chen on 2013-03-08
This bug affects 17 people
Affects: HWE Next | Importance: Medium | Assigned to: Unassigned
Affects: linux (Ubuntu) | Importance: Medium | Assigned to: Unassigned
Nominated for Precise by James M. Leddy
Nominated for Quantal by James M. Leddy

Bug Description

suspend_30_cycles.log is attached.

ProblemType: Bug
DistroRelease: Ubuntu 12.04
Package: linux-image-3.5.0-25-generic 3.5.0-25.39~precise1
ProcVersionSignature: Ubuntu 3.5.0-25.39~precise1-generic 3.5.7.4
Uname: Linux 3.5.0-25-generic x86_64
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.25.
ApportVersion: 2.0.1-0ubuntu17.1
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 1: Generic [HD-Audio Generic], device 0: STAC92xx Analog [STAC92xx Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC1: ubuntu 1577 F.... pulseaudio
 /dev/snd/controlC0: ubuntu 1577 F.... pulseaudio
Card0.Amixer.info:
 Card hw:0 'HDMI'/'HDA ATI HDMI at 0xf0344000 irq 46'
   Mixer name : 'ATI R6xx HDMI'
   Components : 'HDA:1002aa01,00aa0100,00100300'
   Controls : 6
   Simple ctrls : 1
Card0.Amixer.values:
 Simple mixer control 'IEC958',0
   Capabilities: pswitch pswitch-joined penum
   Playback channels: Mono
   Mono: Playback [on]
Card1.Amixer.info:
 Card hw:1 'Generic'/'HD-Audio Generic at 0xf0340000 irq 16'
   Mixer name : 'IDT 92HD99BXX'
   Components : 'HDA:111d76e5,103c1937,00100303'
   Controls : 18
   Simple ctrls : 10
Date: Fri Mar 8 02:43:22 2013
HibernationDevice: RESUME=UUID=b745d689-656e-45a5-b635-c37b6ac2888f
InstallationMedia: Ubuntu 12.04.2 LTS "Precise Pangolin" - Release amd64 (20130213)
MachineType: Hewlett-Packard HP Pavilion Sleekbook 14 PC
MarkForUpload: True
ProcEnviron:
 TERM=xterm
 PATH=(custom, no user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 VESA VGA
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-3.5.0-25-generic root=UUID=7b5bc75d-d89d-454f-84f6-18291d5a1735 ro quiet splash initcall_debug vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-3.5.0-25-generic N/A
 linux-backports-modules-3.5.0-25-generic N/A
 linux-firmware 1.79.1
SourcePackage: linux
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 11/01/2012
dmi.bios.vendor: Insyde
dmi.bios.version: F.13
dmi.board.asset.tag: Base Board Asset Tag
dmi.board.name: 1937
dmi.board.vendor: Hewlett-Packard
dmi.board.version: 87.0B
dmi.chassis.type: 10
dmi.chassis.vendor: Hewlett-Packard
dmi.chassis.version: Chassis Version
dmi.modalias: dmi:bvnInsyde:bvrF.13:bd11/01/2012:svnHewlett-Packard:pnHPPavilionSleekbook14PC:pvr0892110000005900000320100:rvnHewlett-Packard:rn1937:rvr87.0B:cvnHewlett-Packard:ct10:cvrChassisVersion:
dmi.product.name: HP Pavilion Sleekbook 14 PC
dmi.product.version: 0892110000005900000320100
dmi.sys.vendor: Hewlett-Packard

TienFu Chen (ctf) wrote :

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v3.9 kernel[0] (Not a kernel in the daily directory) and install both the linux-image and linux-image-extra .deb packages.

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

If you are unable to test the mainline kernel, for example it will not boot, please add the tag: 'kernel-unable-to-test-upstream'.
Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.9-rc1-raring/

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Jeff Lane (bladernr) wrote :

The system did not fail suspend 30 times; what the log shows are FWTS errors.

Changed in linux (Ubuntu):
assignee: nobody → Firmware Test Suite Bug Team (firmware-bug-team)
Changed in linux (Ubuntu):
assignee: Firmware Test Suite Bug Team (firmware-bug-team) → Ivan Hu (ivan.hu)
Anthony Wong (anthonywong) wrote :

"[Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu"

This is a BIOS bug, very common on AMD platforms.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Ara Pulido (ara) wrote :

We need confirmation from one of the BIOS engineers on whether this issue matters for certification.

James M. Leddy (jm-leddy) wrote :

Hi Ivan,

Would you comment as to whether this bug should block system certification or not?

Ivan Hu (ivan.hu) wrote :

In this case the BIOS assigns both the MCE (0xf9) and IBS (0x400) vectors to LVT offset 0, which is why the second APIC setup (IBS) failed.

According to the AMD64 Architecture Programmer's Manual, Volume 2, IBS is a software debugging and performance monitoring facility. Since S3 works normally, I see no reason for this to block certification.

Colin Ian King (colin-king) wrote :

@Ivan, if this is a common "false positive" it may be worth adding some more intelligence into the pattern matching in the klog database to explain these warning messages and maybe to lower the warning level.
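
For illustration, here is a minimal sketch in Python of the kind of pattern such a klog entry could match. The regular expression and the helper name `parse_lvt_conflict` are assumptions made for this example; they are not taken from the FWTS klog database.

```python
import re

# Hypothetical pattern for the LVT offset conflict message quoted in this report.
LVT_CONFLICT = re.compile(
    r"\[Firmware Bug\]: cpu (?P<cpu>\d+), try to use APIC(?P<reg>[0-9A-Fa-f]+) "
    r"\(LVT offset (?P<offset>\d+)\) for vector (?P<new>0x[0-9a-fA-F]+), "
    r"but the register is already in use for vector (?P<old>0x[0-9a-fA-F]+)"
)


def parse_lvt_conflict(line):
    """Return the fields of an LVT offset conflict message, or None if no match."""
    m = LVT_CONFLICT.search(line)
    if m is None:
        return None
    return {
        "cpu": int(m.group("cpu")),
        "register": m.group("reg"),
        "offset": int(m.group("offset")),
        "new_vector": int(m.group("new"), 16),
        "old_vector": int(m.group("old"), 16),
    }


# Sample line copied from the dmesg output in this report.
line = (
    "[ 180.567295] [Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) "
    "for vector 0x400, but the register is already in use for vector 0xf9 "
    "on another cpu"
)
print(parse_lvt_conflict(line))
```

A matcher like this could recognise the MCE threshold (0xf9) versus IBS (0x400) pairing and attach a lower severity or an explanatory note to it, while leaving other LVT conflicts at the default level.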

TienFu Chen (ctf) wrote :

As confirmed by Ivan, removing blocks-hwcert tags.

tags: removed: blocks-hwcert blocks-hwcert-enablement
Robert Richter (rric) wrote :

This occurs after suspend/resume on AMD family 10h systems. The upstream kernel (v3.9-rc3) should also be affected.

The BIOS is known to report wrong (zeroed) LVT offsets for MCE and IBS, both set to the same value of 0. The BIOS won't be fixed. There is a quirk in the kernel that fixes this (force_ibs_eilvt_setup()). The fix reassigns the IBS LVT offset to 1 (instead of 0, which conflicts with the MCE threshold). The register reassignment must be applied on each CPU.

The current quirk does this at boot time, but after suspend/resume the register contents are lost and are not reinitialized. Thus, after resume the APIC setup code detects the offset conflict caused by the reset registers and prints the [Firmware Bug] message.

As a result, the IBS LVT offset is not properly initialized, and using IBS (used for hardware monitoring by the perf and oprofile tools) may fail. The system should otherwise function correctly.

I will try to submit a patch upstream in the coming weeks that fixes the problem.

-Robert

tags: added: blocks-hwcert-enablement
Changed in hwe-next:
status: New → Confirmed
importance: Undecided → Medium
Changed in linux (Ubuntu):
assignee: Ivan Hu (ivan.hu) → nobody
summary: - system fails suspend_30_cycles
+ [Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector
+ 0x400, but the register is already in use for vector 0xf9 on another cpu

I have this issue on two systems, both AMD (socket AM3) on ASUS boards, each with a Phenom II 965 CPU and an NVIDIA GPU.

~$ dmesg | grep Firmware
[ 180.567295] [Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
[ 180.580493] [Firmware Bug]: cpu 2, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
[ 180.593659] [Firmware Bug]: cpu 3, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
[ 1084.886504] [Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
[ 1084.899717] [Firmware Bug]: cpu 2, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
[ 1084.912870] [Firmware Bug]: cpu 3, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
~$ uname -r
3.9.0-030900-generic

The same happens on the 3.9.2 kernel:
~$ dmesg | grep -i firmware
[ 100.816643] [Firmware Bug]: cpu 1, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
[ 100.829849] [Firmware Bug]: cpu 2, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
[ 100.843014] [Firmware Bug]: cpu 3, try to use APIC500 (LVT offset 0) for vector 0x400, but the register is already in use for vector 0xf9 on another cpu
~$ uname -r
3.9.2-030902-generic

James M. Leddy (jm-leddy) wrote :

Since this will only affect profile information and that is not required for cert, I'm marking this invalid.

Changed in linux (Ubuntu):
status: Confirmed → Invalid
Changed in hwe-next:
status: Confirmed → Invalid
Samic (i-samic) on 2013-12-11
Changed in hwe-next:
status: Invalid → Confirmed
Changed in linux (Ubuntu):
status: Invalid → Confirmed
Changed in hwe-next:
status: Confirmed → Invalid

Samic, so that your hardware can be tracked, please file a new report by running the following in a terminal while booted into an Ubuntu repository kernel (not a mainline one):
ubuntu-bug linux

For more on this, please read the official Ubuntu documentation:
Ubuntu Bug Control and Ubuntu Bug Squad: https://wiki.ubuntu.com/Bugs/BestPractices#X.2BAC8-Reporting.Focus_on_One_Issue
Ubuntu Kernel Team: https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Filing_Kernel_Bug_reports
Ubuntu Community: https://help.ubuntu.com/community/ReportingBugs#Bug_reporting_etiquette

When opening up the new report, please feel free to subscribe me to it.

Thank you for your understanding.

Changed in linux (Ubuntu):
status: Confirmed → Invalid
Robert Richter (rric) wrote :

I posted a fix today on lkml:

 https://<email address hidden>

-Robert

Changed in linux (Ubuntu):
status: Invalid → Confirmed
Colin Ian King (colin-king) wrote :

Upstream commit has Robert's fix: bee09ed91cacdbffdbcd3b05de8409c77ec9fcd6

Hello :)

Is it really fixed?

I am on kernel 4.3.0 and my AMD Trinity CPU is often throttling at 100%. I have not managed to notice when this happens, only that the computer gets hot.

Dec 07 16:23:08 NORMANDY kernel: smpboot: CPU 2 is now offline
Dec 07 16:23:08 NORMANDY kernel: smpboot: CPU 3 is now offline
Dec 07 16:23:08 NORMANDY kernel: ACPI: Low-level resume complete
Dec 07 16:23:08 NORMANDY kernel: ACPI : EC: EC started
Dec 07 16:23:08 NORMANDY kernel: PM: Restoring platform NVS memory
Dec 07 16:23:08 NORMANDY kernel: Using NULL legacy PIC
Dec 07 16:23:08 NORMANDY kernel: LVT offset 0 assigned for vector 0x400
Dec 07 16:23:08 NORMANDY kernel: [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0xf9, but the register is already in use for vect
Dec 07 16:23:08 NORMANDY kernel: [Firmware Bug]: cpu 0, failed to setup threshold interrupt for bank 4, block 0 (MSR00000413=0xc000000001000000)
Dec 07 16:23:08 NORMANDY kernel: [Firmware Bug]: cpu 0, try to use APIC500 (LVT offset 0) for vector 0xf9, but the register is already in use for vect
Dec 07 16:23:08 NORMANDY kernel: [Firmware Bug]: cpu 0, failed to setup threshold interrupt for bank 4, block 1 (MSRC0000408=0xc000000001000000)
Dec 07 16:23:08 NORMANDY kernel: Enabling non-boot CPUs ...
Dec 07 16:23:08 NORMANDY kernel: x86: Booting SMP configuration:
Dec 07 16:23:08 NORMANDY kernel: smpboot: Booting Node 0 Processor 1 APIC 0x11
Dec 07 16:23:08 NORMANDY kernel: [Firmware Info]: CPU: Re-enabling disabled Topology Extensions Support.

Dec 07 16:23:08 NORMANDY kernel: xhci_hcd 0000:00:10.0: port 0 resume PLC timeout

Robert Richter (rric) wrote :

The kernel fix is only for AMD family 10h cpus.

Trinity is family 15h, so your [Firmware Bug] message indicates a real firmware bug. What you are seeing is different vectors assigned to the same LVT offset across CPUs. On all CPUs, the same vector must be assigned to the same offset. Firmware should assign vector 0xf9 (MCE threshold) to offset 1 instead, since offset 0 is already used for vector 0x400 (IBS NMI).

-Robert
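
The rule stated in the comment above (one vector per LVT offset, identical on every CPU) can be checked with a short Python sketch. The function name `find_offset_conflicts` and the `(cpu, offset, vector)` tuple layout are assumptions made for this example; the two assignment lists are transcribed from the log excerpt and from the suggested firmware behaviour, not read from real registers.

```python
from collections import defaultdict


def find_offset_conflicts(assignments):
    """Return {offset: sorted vectors} for offsets bound to more than one vector.

    `assignments` is an iterable of (cpu, offset, vector) tuples.
    """
    vectors_by_offset = defaultdict(set)
    for _cpu, offset, vector in assignments:
        vectors_by_offset[offset].add(vector)
    return {
        offset: sorted(vectors)
        for offset, vectors in vectors_by_offset.items()
        if len(vectors) > 1
    }


# As seen in the log: offset 0 is assigned to 0x400, then requested for 0xf9.
reported = [(0, 0, 0x400), (0, 0, 0xF9)]

# As suggested above: the MCE threshold vector moves to offset 1.
suggested = [(0, 0, 0x400), (0, 1, 0xF9)]

print(find_offset_conflicts(reported))   # offset 0 carries two vectors
print(find_offset_conflicts(suggested))  # no conflicts
```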

Johansen (johansense) wrote :

I had a functioning system for about a month after a fresh install of 14.04 before Ubuntu stopped suspending.
The CPU is an A6-5400K, running Ubuntu 14.04. I am currently upgrading to 15.10.

The error reported is the same as in comment #14, but only cpu0 is listed. I will report back if 15.10 has the same problem.

Johansen (johansense) wrote :

I upgraded to 15.10 and see the same problem, although the system did suspend successfully a few times. After startup, the system can be suspended once and only once.

The system is currently held up by about 45 instances of gst-plugin-scanner, an unkillable process.

Linux (palswim) wrote :

Sounds like rric has fixed a harmless error message, but that a firmware bug still exists, from comment: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1152484/comments/20

Are you suggesting that we should open a new bug report for the actual bug? That would cause no small amount of confusion because searching for the error text will lead directly to this report.

I am running an A8-4500M AMD processor.
