x86: add support for AMD Rome

Bug #1819485 reported by Kim Naru on 2019-03-11
This bug affects 2 people
Affects: linux (Ubuntu) | Importance: Undecided | Assigned to: Andrea Righi
Affects: Bionic | Importance: Undecided | Assigned to: Unassigned

Bug Description

[Impact]

* Several upstream patches needed to properly support the AMD Rome architecture are missing; without them, affected systems cannot work correctly.

[Test Case]

* No test case provided (tests have been made by the bug reporter using the affected platform)

[Fix]

* The following commits are required to properly support this architecture:

818b7587b4d3 x86: irq_remapping: Move irq remapping mode enum
e881dbd5d4a6 iommu/amd: Add support for higher 64-bit IOMMU Control Register
90fcffd9cf5e iommu/amd: Add support for IOMMU XT mode
210ba1201ff9 hwmon/k10temp: Add support for AMD family 17h, model 30h CPUs
be3518a16ef2 x86/amd_nb: Add PCI device IDs for family 17h, model 30h
556e4c62baff x86/amd_nb: Add support for newer PCI topologies
dedf7dce4cec hwmon/k10temp, x86/amd_nb: Consolidate shared device IDs
60c8144afc28 x86/MCE/AMD: Fix the thresholding machinery initialization order
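
One way to confirm that a kernel tree carries all of the commits listed above is to compare commit subjects rather than hashes, since a backport normally gets a new hash. The following is a minimal sketch under that assumption; the helper names `git_subjects` and `missing_subjects` are hypothetical and not part of any Ubuntu tooling, and a backport whose subject was edited would be reported as missing.

```python
import subprocess

# Upstream subjects copied from the [Fix] list above.
REQUIRED_SUBJECTS = [
    "x86: irq_remapping: Move irq remapping mode enum",
    "iommu/amd: Add support for higher 64-bit IOMMU Control Register",
    "iommu/amd: Add support for IOMMU XT mode",
    "hwmon/k10temp: Add support for AMD family 17h, model 30h CPUs",
    "x86/amd_nb: Add PCI device IDs for family 17h, model 30h",
    "x86/amd_nb: Add support for newer PCI topologies",
    "hwmon/k10temp, x86/amd_nb: Consolidate shared device IDs",
    "x86/MCE/AMD: Fix the thresholding machinery initialization order",
]


def git_subjects(repo="."):
    """Return the subject line of every commit reachable from HEAD in repo."""
    result = subprocess.run(
        ["git", "-C", repo, "log", "--format=%s"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.splitlines()


def missing_subjects(log_subjects, required=REQUIRED_SUBJECTS):
    """Return the required subjects that do not appear in log_subjects."""
    present = {subject.strip() for subject in log_subjects}
    return [subject for subject in required if subject not in present]
```

Calling `missing_subjects(git_subjects("/path/to/bionic-tree"))` (the path is a placeholder) would list any subject from the [Fix] section that is not yet in the tree.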

[Regression Potential]

* All the patches have been applied upstream and they have been tested on the newly supported platform with positive feedback.

[Original bug report]

The following patches will bring Rome support into 18.04. The patches are all localized to AMD source code.

CCP/PSP:
dcbc0c6e4aa1ef269179351ac615fd08ddefc849 crypto: ccp - Add support for new CCP/PSP device ID
ad01a984f512c42c9f4fe79d36b9cddbc6156a3f crypto: ccp - Support register differences between PSP devices

A few others:
210ba1201ff950b3d05bfd8fa5d47540cea393c0 hwmon/k10temp: Add support for AMD family 17h, model 30h CPUs
be3518a16ef270e3b030a6ae96055f83f51bd3dd x86/amd_nb: Add PCI device IDs for family 17h, model 30h
556e4c62baffa71e2045a298379db7e57dd47f3d x86/amd_nb: Add support for newer PCI topologies
dedf7dce4cec5c0abe69f4fa6938d5100398220b hwmon/k10temp, x86/amd_nb: Consolidate shared device IDs
60c8144afc287ef09ce8c1230c6aa972659ba1bb x86/MCE/AMD: Fix the thresholding machinery initialization order
---
ProblemType: Bug
ApportVersion: 2.20.9-0ubuntu7.5
Architecture: amd64
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
CurrentDesktop: ubuntu:GNOME
DistroRelease: Ubuntu 18.04
InstallationDate: Installed on 2019-02-28 (10 days ago)
InstallationMedia: Ubuntu 18.04.2 LTS "Bionic Beaver" - Release amd64 (20190210)
IwConfig:
 enp33s0f0 no wireless extensions.

 lo no wireless extensions.

 enp33s0f1 no wireless extensions.
MachineType: AMD Corporation DAYTONA_X
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 astdrmfb
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.18.0-15-generic root=UUID=6500165b-63c3-4a3e-bbee-ca9f43bb1784 ro quiet splash vt.handoff=1
ProcVersionSignature: Ubuntu 4.18.0-15.16~18.04.1-generic 4.18.20
RelatedPackageVersions:
 linux-restricted-modules-4.18.0-15-generic N/A
 linux-backports-modules-4.18.0-15-generic N/A
 linux-firmware 1.173.3
RfKill:

Tags: bionic
Uname: Linux 4.18.0-15-generic x86_64
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups: adm cdrom dip lpadmin plugdev sambashare sudo
_MarkForUpload: True
dmi.bios.date: 01/18/2019
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: RDY0071B
dmi.board.asset.tag: To be filled by O.E.M.
dmi.board.name: DAYTONA_X
dmi.board.vendor: AMD Corporation
dmi.board.version: To be filled by O.E.M.
dmi.chassis.asset.tag: To be filled by O.E.M.
dmi.chassis.type: 2
dmi.chassis.vendor: To be filled by O.E.M.
dmi.chassis.version: To be filled by O.E.M.
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvrRDY0071B:bd01/18/2019:svnAMDCorporation:pnDAYTONA_X:pvrTobefilledbyO.E.M.:rvnAMDCorporation:rnDAYTONA_X:rvrTobefilledbyO.E.M.:cvnTobefilledbyO.E.M.:ct2:cvrTobefilledbyO.E.M.:
dmi.product.family: Default string
dmi.product.name: DAYTONA_X
dmi.product.sku: Default string
dmi.product.version: To be filled by O.E.M.
dmi.sys.vendor: AMD Corporation

apport information

tags: added: apport-collected bionic
description: updated
Kim Naru (kim-naru) wrote : CRDA.txt

apport information

Kim Naru (kim-naru) wrote : Lspci.txt

apport information

Kim Naru (kim-naru) wrote : Lsusb.txt

apport information

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed

Does this work in Disco w/ 5.0 kernel?

Kim Naru (kim-naru) wrote :

Jeff, yes, this works with Disco and the 5.0 kernel. My objective is to submit these patches to 18.04.1 LTS. There are 3 other patches as well (Launchpad: 1816669).

I was going to submit the patches under 18.10 (cosmic) to the kernel team and request inclusion in 18.04.1 (bionic). Would that be the right way to do it?

--thanks
--kim

Kim Naru (kim-naru) wrote :

SRU

[Impact]
Some housecleaning, plus new PCI device IDs for Rome. The PCI topology patch ensures that the correct root PCI device is used for each SMU access. With the PCI topology patch, k10temp (thermal monitoring) accurately reports the temperature for both sockets. The MCE patch adds more robustness.

[Test Case]

 On a Rome system, load k10temp; the sysfs entries then appear under:
localhost:/sys/bus/pci/drivers/k10temp/0000:00:*.3/hwmon/hwmonN/
If sensors is installed, it should display the temperature information as well.
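
For a scripted version of this check, the sysfs files can be read directly. The sketch below assumes the standard hwmon convention that `temp*_input` files hold millidegrees Celsius; the function name `read_k10temp` and the `sysfs_root` parameter are hypothetical and exist only so the reader can be pointed at a different root for testing.

```python
import glob
import os


def read_k10temp(sysfs_root="/sys"):
    """Return {path: degrees Celsius} for each k10temp temp*_input file found."""
    pattern = os.path.join(
        sysfs_root, "bus/pci/drivers/k10temp",
        "0000:00:*.3", "hwmon", "hwmon*", "temp*_input",
    )
    readings = {}
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            # hwmon reports temperatures in millidegrees Celsius.
            readings[path] = int(f.read().strip()) / 1000.0
    return readings
```

On a two-socket Rome system with the patches applied, `read_k10temp()` should return readings from a separate PCI device for each socket; an empty result suggests k10temp is not loaded or did not bind.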

[Regression Potential]
Risk is low, as the code is localized to AMD-specific hardware and what is added is PCI awareness for devices on Rome.

[Other Info]
 All these patches have already been accepted upstream.

Kim Naru (kim-naru) wrote :

The following patches will not be submitted:

CCP/PSP:
dcbc0c6e4aa1ef269179351ac615fd08ddefc849 crypto: ccp - Add support for new CCP/PSP device ID
ad01a984f512c42c9f4fe79d36b9cddbc6156a3f crypto: ccp - Support register differences between PSP devices

Michael Reed (mreed8855) wrote :

Hi Kim,

I have tested this on a system using Disco with 5.0, and I am currently seeing only 255 cores. Has support landed in Disco yet?

Kim Naru (kim-naru) wrote :

Michael,
You will need the following for the 256 Cores:

https://bugs.launchpad.net/amd/+bug/1816669

--kim
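
For reference, the 255 versus 256 count discussed above can be checked from the CPU list the kernel exposes in /sys/devices/system/cpu/online, which uses a range format such as `0-255`. This is a minimal sketch; the function name `count_cpus` is hypothetical.

```python
def count_cpus(cpulist):
    """Count CPUs in a kernel cpulist string such as '0-3,8-11'."""
    total = 0
    for part in cpulist.strip().split(","):
        if not part:
            continue
        first, _, last = part.partition("-")
        # A single CPU ("5") has no range end, so it counts as one.
        total += int(last or first) - int(first) + 1
    return total
```

Passing the contents of /sys/devices/system/cpu/online to `count_cpus` gives the number of online CPUs.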

Andrea Righi (arighi) wrote :

Hi Kim,

I've applied the following patches to the bionic kernel (4.15):

[from LP #1816669]
818b7587b4d3 x86: irq_remapping: Move irq remapping mode enum
e881dbd5d4a6 iommu/amd: Add support for higher 64-bit IOMMU Control Register
90fcffd9cf5e iommu/amd: Add support for IOMMU XT mode

[from LP #1819485]
210ba1201ff9 hwmon/k10temp: Add support for AMD family 17h, model 30h CPUs
be3518a16ef2 x86/amd_nb: Add PCI device IDs for family 17h, model 30h
556e4c62baff x86/amd_nb: Add support for newer PCI topologies
dedf7dce4cec hwmon/k10temp, x86/amd_nb: Consolidate shared device IDs
60c8144afc28 x86/MCE/AMD: Fix the thresholding machinery initialization order

I haven't applied the following crypto patches yet (because they require additional backporting in bionic):
dcbc0c6e4aa1 crypto: ccp - Add support for new CCP/PSP device ID
ad01a984f512 crypto: ccp - Support register differences between PSP devices

And here is a test kernel:
https://kernel.ubuntu.com/~arighi/LP-1819485/

Could you test it and see if it's working as intended? Thanks!

Kim Naru (kim-naru) wrote :

Andrea,
Thank you, I will test.

Please IGNORE the CCP/PSP patches; we do not want them backported. See comment #17.

--kim

Kim Naru (kim-naru) wrote :

Andrea,
I have confirmed it is working as intended.

-kim

Andrea Righi (arighi) on 2019-05-21
summary: - AMD Rome : Additional patches
+ x86: add support to AMD Rome CPU
Andrea Righi (arighi) on 2019-05-21
description: updated
Changed in linux (Ubuntu):
assignee: nobody → Andrea Righi (arighi)
Andrea Righi (arighi) on 2019-05-21
summary: - x86: add support to AMD Rome CPU
+ x86: add support for AMD Rome
Michael Reed (mreed8855) wrote :

I have also tested this using the test kernel in comment #20 and it is working as intended.

Changed in linux (Ubuntu Bionic):
status: New → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
Kim Naru (kim-naru) on 2019-06-19
tags: added: verification-done-bionic
removed: apport-collected bionic verification-needed-bionic
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-55.60

---------------
linux (4.15.0-55.60) bionic; urgency=medium

  * linux: 4.15.0-55.60 -proposed tracker (LP: #1834954)

  * Request backport of ceph commits into bionic (LP: #1834235)
    - ceph: use atomic_t for ceph_inode_info::i_shared_gen
    - ceph: define argument structure for handle_cap_grant
    - ceph: flush pending works before shutdown super
    - ceph: send cap releases more aggressively
    - ceph: single workqueue for inode related works
    - ceph: avoid dereferencing invalid pointer during cached readdir
    - ceph: quota: add initial infrastructure to support cephfs quotas
    - ceph: quota: support for ceph.quota.max_files
    - ceph: quota: don't allow cross-quota renames
    - ceph: fix root quota realm check
    - ceph: quota: support for ceph.quota.max_bytes
    - ceph: quota: update MDS when max_bytes is approaching
    - ceph: quota: add counter for snaprealms with quota
    - ceph: avoid iput_final() while holding mutex or in dispatch thread

  * QCA9377 isn't being recognized sometimes (LP: #1757218)
    - SAUCE: USB: Disable USB2 LPM at shutdown

  * hns: fix ICMP6 neighbor solicitation messages discard problem (LP: #1833140)
    - net: hns: fix ICMP6 neighbor solicitation messages discard problem
    - net: hns: fix unsigned comparison to less than zero

  * Fix occasional boot time crash in hns driver (LP: #1833138)
    - net: hns: Fix probabilistic memory overwrite when HNS driver initialized

  * use-after-free in hns_nic_net_xmit_hw (LP: #1833136)
    - net: hns: fix KASAN: use-after-free in hns_nic_net_xmit_hw()

  * hns: attempt to restart autoneg when disabled should report error
    (LP: #1833147)
    - net: hns: Restart autoneg need return failed when autoneg off

  * systemd 237-3ubuntu10.14 ADT test failure on Bionic ppc64el (test-seccomp)
    (LP: #1821625)
    - powerpc: sys_pkey_alloc() and sys_pkey_free() system calls
    - powerpc: sys_pkey_mprotect() system call

  * [UBUNTU] pkey: Indicate old mkvp only if old and curr. mkvp are different
    (LP: #1832625)
    - pkey: Indicate old mkvp only if old and current mkvp are different

  * [UBUNTU] kernel: Fix gcm-aes-s390 wrong scatter-gather list processing
    (LP: #1832623)
    - s390/crypto: fix gcm-aes-s390 selftest failures

  * System crashes on hot adding a core with drmgr command (4.15.0-48-generic)
    (LP: #1833716)
    - powerpc/numa: improve control of topology updates
    - powerpc/numa: document topology_updates_enabled, disable by default

  * Kernel modules generated incorrectly when system is localized to a non-
    English language (LP: #1828084)
    - scripts: override locale from environment when running recordmcount.pl

  * [UBUNTU] kernel: Fix wrong dispatching for control domain CPRBs
    (LP: #1832624)
    - s390/zcrypt: Fix wrong dispatching for control domain CPRBs

  * CVE-2019-11815
    - net: rds: force to destroy connection if t_sock is NULL in
      rds_tcp_kill_sock().

  * Sound device not detected after resume from hibernate (LP: #1826868)
    - drm/i915: Force 2*96 MHz cdclk on glk/cnl when audio power is enabled
    - drm/i915: Save the old CDCLK atomic state
...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released