Comment 74 for bug 1690085

Kai-Heng Feng (kaihengfeng) wrote : Re: [Bug 1690085] Re: Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks

> On 5 Feb 2018, at 7:21 PM, Peridot <email address hidden> wrote:
>
> This is still an issue even with the bionic dailies. The easiest way
> (read: without recompiling kernels) I have found to get it to work is to
> disable IOMMU in your BIOS and add "iommu=soft" to the kernel booting
> options in grub.
>
> Linux can then detect everything properly (all cores) and I've had zero
> crashes. The only issue is that it's using software IOMMU.
>
> Without these options you will either get crashes or hangs relating to
> the ACPI table (booting with acpi=off will only show a single core),
> along with lots of AMD-Vi logged events and IRQ crashes.
>
> I believe this is related to
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1671360

This particular bug is about an interrupt storm from the AMD GPIO driver.
Can you file a separate bug instead?

>
> I'm on a Ryzen 1800X and Biostar B350GT5.
>
> --
> You received this bug notification because you are subscribed to linux
> in Ubuntu.
> https://bugs.launchpad.net/bugs/1690085
>
> Title:
> Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks
>
> Status in Linux:
> Unknown
> Status in linux package in Ubuntu:
> Confirmed
>
> Bug description:
> Hi,
>
>
> We are getting various kernel crashes on a pretty new config.
> We're using a Ryzen 1800X CPU with an X370 Gaming Pro Carbon MB (7A32V1) using the latest BIOS available (1.52).
>
> We are running Ubuntu 17.04 (amd64). We've tried different kernel versions: the native one and releases from http://kernel.ubuntu.com/~kernel-ppa/mainline/ too.
> Tested kernel versions:
>
> native 17.04 kernel
> 4.10.15
>
> The issues are the same: we're getting random freezes on the machine.
>
> Here is the kern.log entry when it happens:
>
> May 10 22:41:56 dev2 kernel: [24366.186246] INFO: rcu_sched detected stalls on CPUs/tasks:
> May 10 22:41:56 dev2 kernel: [24366.187618] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=913449
> May 10 22:41:56 dev2 kernel: [24366.188977] (detected by 12, t=1860207 jiffies, g=10001, c=10000, q=4656)
> May 10 22:41:56 dev2 kernel: [24366.190344] Task dump for CPU 0:
> May 10 22:41:56 dev2 kernel: [24366.190345] swapper/0 R running task 0 0 0 0x00000008
> May 10 22:41:56 dev2 kernel: [24366.190348] Call Trace:
> May 10 22:41:56 dev2 kernel: [24366.190354] ? native_safe_halt+0x6/0x10
> May 10 22:41:56 dev2 kernel: [24366.190355] ? default_idle+0x20/0xd0
> May 10 22:41:56 dev2 kernel: [24366.190358] ? arch_cpu_idle+0xf/0x20
> May 10 22:41:56 dev2 kernel: [24366.190360] ? default_idle_call+0x23/0x30
> May 10 22:41:56 dev2 kernel: [24366.190362] ? do_idle+0x16f/0x200
> May 10 22:41:56 dev2 kernel: [24366.190364] ? cpu_startup_entry+0x71/0x80
> May 10 22:41:56 dev2 kernel: [24366.190366] ? rest_init+0x77/0x80
> May 10 22:41:56 dev2 kernel: [24366.190368] ? start_kernel+0x464/0x485
> May 10 22:41:56 dev2 kernel: [24366.190369] ? early_idt_handler_array+0x120/0x120
> May 10 22:41:56 dev2 kernel: [24366.190371] ? x86_64_start_reservations+0x24/0x26
> May 10 22:41:56 dev2 kernel: [24366.190372] ? x86_64_start_kernel+0x14d/0x170
> May 10 22:41:56 dev2 kernel: [24366.190373] ? start_cpu+0x14/0x14
> May 10 22:44:56 dev2 kernel: [24546.188093] INFO: rcu_sched detected stalls on CPUs/tasks:
> May 10 22:44:56 dev2 kernel: [24546.189461] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=935027
> May 10 22:44:56 dev2 kernel: [24546.190823] (detected by 14, t=1905212 jiffies, g=10001, c=10000, q=4740)
> May 10 22:44:56 dev2 kernel: [24546.192191] Task dump for CPU 0:
> May 10 22:44:56 dev2 kernel: [24546.192192] swapper/0 R running task 0 0 0 0x00000008
> May 10 22:44:56 dev2 kernel: [24546.192195] Call Trace:
> May 10 22:44:56 dev2 kernel: [24546.192199] ? native_safe_halt+0x6/0x10
> May 10 22:44:56 dev2 kernel: [24546.192201] ? default_idle+0x20/0xd0
> May 10 22:44:56 dev2 kernel: [24546.192203] ? arch_cpu_idle+0xf/0x20
> May 10 22:44:56 dev2 kernel: [24546.192204] ? default_idle_call+0x23/0x30
> May 10 22:44:56 dev2 kernel: [24546.192206] ? do_idle+0x16f/0x200
> May 10 22:44:56 dev2 kernel: [24546.192208] ? cpu_startup_entry+0x71/0x80
> May 10 22:44:56 dev2 kernel: [24546.192210] ? rest_init+0x77/0x80
> May 10 22:44:56 dev2 kernel: [24546.192211] ? start_kernel+0x464/0x485
> May 10 22:44:56 dev2 kernel: [24546.192213] ? early_idt_handler_array+0x120/0x120
> May 10 22:44:56 dev2 kernel: [24546.192214] ? x86_64_start_reservations+0x24/0x26
> May 10 22:44:56 dev2 kernel: [24546.192215] ? x86_64_start_kernel+0x14d/0x170
> May 10 22:44:56 dev2 kernel: [24546.192217] ? start_cpu+0x14/0x14
>
> Depending on the kernel version, we also get NMI watchdog errors about a stuck CPU (they mention the CPU core id, which is random).
> The crashes happen randomly, but generally after some hours (3-4h).
>
> Now, we've installed kernel 4.11.0-041100-generic #201705041534 this morning and are waiting for a crash...
> For now, the machine is not "used", at least, it's not CPU stressed...
>
>
> Thanks
> ---
> ApportVersion: 2.20.4-0ubuntu4
> Architecture: amd64
> DistroRelease: Ubuntu 17.04
> InstallationDate: Installed on 2017-05-09 (1 days ago)
> InstallationMedia: Ubuntu-Server 17.04 "Zesty Zapus" - Release amd64 (20170412)
> Package: linux (not installed)
> ProcEnviron:
> TERM=xterm-256color
> PATH=(custom, no user)
> XDG_RUNTIME_DIR=<set>
> LANG=fr_FR.UTF-8
> SHELL=/bin/bash
> Tags: zesty
> Uname: Linux 4.11.0-041100-generic x86_64
> UnreportableReason: The running kernel is not an Ubuntu kernel
> UpgradeStatus: No upgrade log present (probably fresh install)
> UserGroups:
>
> _MarkForUpload: True
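
For anyone comparing their own kern.log against the rcu_sched stall reports quoted in the bug description, here is a minimal Python sketch that pulls out the "(detected by ...)" line of each report. The names parse_stalls and DETECTED_RE are hypothetical, made up for illustration. Only the log line layout comes from the report above, and other kernel versions may print these lines differently.

```python
import re

# Matches the "(detected by N, t=J jiffies, ...)" line of an rcu_sched stall
# report, as it appears in the kern.log excerpt from the bug description.
# Assumption: the layout is the one printed by the kernels in this report.
DETECTED_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\(detected by (?P<cpu>\d+), t=(?P<jiffies>\d+) jiffies"
)


def parse_stalls(log_text):
    """Return one dict per stall report found in log_text."""
    stalls = []
    for line in log_text.splitlines():
        match = DETECTED_RE.search(line)
        if match:
            stalls.append({
                # Kernel timestamp in seconds since boot.
                "uptime_s": float(match.group("ts")),
                # CPU that noticed the stall, not the stalled CPU.
                "detected_by_cpu": int(match.group("cpu")),
                # The t= value from the report, in jiffies.
                "stalled_jiffies": int(match.group("jiffies")),
            })
    return stalls


# Two lines copied from the log excerpt in the bug description.
SAMPLE = (
    "May 10 22:41:56 dev2 kernel: [24366.188977] (detected by 12, "
    "t=1860207 jiffies, g=10001, c=10000, q=4656)\n"
    "May 10 22:44:56 dev2 kernel: [24546.190823] (detected by 14, "
    "t=1905212 jiffies, g=10001, c=10000, q=4740)\n"
)

for stall in parse_stalls(SAMPLE):
    print(stall)
```

To run it on a real log, pass the contents of the file to parse_stalls instead of SAMPLE. Because the pattern uses a search rather than a full-line match, lines with a quoting prefix such as "> " still match.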