Ryzen 1800X freeze - rcu_sched detected stalls on CPUs/tasks

Bug #1690085 reported by Vincent on 2017-05-11
This bug affects 36 people
Affects          Status    Importance  Assigned to  Milestone
Linux            Expired   Medium
linux (Ubuntu)             High        Unassigned

Bug Description

Hi,

We are getting various kernel crashes on a fairly new configuration.
We're using a Ryzen 1800X CPU with an X370 Gaming Pro Carbon motherboard (7A32V1) and the latest BIOS available (1.52).

We are running Ubuntu 17.04 (amd64). We've tried different kernel versions: the native one and releases from http://kernel.ubuntu.com/~kernel-ppa/mainline/ too.
Tested kernel versions:

native 17.04 kernel
4.10.15

The issue is the same on both: the machine freezes at random.

Here is the kern.log entry from when it happens:

May 10 22:41:56 dev2 kernel: [24366.186246] INFO: rcu_sched detected stalls on CPUs/tasks:
May 10 22:41:56 dev2 kernel: [24366.187618] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=913449
May 10 22:41:56 dev2 kernel: [24366.188977] (detected by 12, t=1860207 jiffies, g=10001, c=10000, q=4656)
May 10 22:41:56 dev2 kernel: [24366.190344] Task dump for CPU 0:
May 10 22:41:56 dev2 kernel: [24366.190345] swapper/0 R running task 0 0 0 0x00000008
May 10 22:41:56 dev2 kernel: [24366.190348] Call Trace:
May 10 22:41:56 dev2 kernel: [24366.190354] ? native_safe_halt+0x6/0x10
May 10 22:41:56 dev2 kernel: [24366.190355] ? default_idle+0x20/0xd0
May 10 22:41:56 dev2 kernel: [24366.190358] ? arch_cpu_idle+0xf/0x20
May 10 22:41:56 dev2 kernel: [24366.190360] ? default_idle_call+0x23/0x30
May 10 22:41:56 dev2 kernel: [24366.190362] ? do_idle+0x16f/0x200
May 10 22:41:56 dev2 kernel: [24366.190364] ? cpu_startup_entry+0x71/0x80
May 10 22:41:56 dev2 kernel: [24366.190366] ? rest_init+0x77/0x80
May 10 22:41:56 dev2 kernel: [24366.190368] ? start_kernel+0x464/0x485
May 10 22:41:56 dev2 kernel: [24366.190369] ? early_idt_handler_array+0x120/0x120
May 10 22:41:56 dev2 kernel: [24366.190371] ? x86_64_start_reservations+0x24/0x26
May 10 22:41:56 dev2 kernel: [24366.190372] ? x86_64_start_kernel+0x14d/0x170
May 10 22:41:56 dev2 kernel: [24366.190373] ? start_cpu+0x14/0x14
May 10 22:44:56 dev2 kernel: [24546.188093] INFO: rcu_sched detected stalls on CPUs/tasks:
May 10 22:44:56 dev2 kernel: [24546.189461] 0-...: (1 GPs behind) idle=49b/1/0 softirq=28561/28563 fqs=935027
May 10 22:44:56 dev2 kernel: [24546.190823] (detected by 14, t=1905212 jiffies, g=10001, c=10000, q=4740)
May 10 22:44:56 dev2 kernel: [24546.192191] Task dump for CPU 0:
May 10 22:44:56 dev2 kernel: [24546.192192] swapper/0 R running task 0 0 0 0x00000008
May 10 22:44:56 dev2 kernel: [24546.192195] Call Trace:
May 10 22:44:56 dev2 kernel: [24546.192199] ? native_safe_halt+0x6/0x10
May 10 22:44:56 dev2 kernel: [24546.192201] ? default_idle+0x20/0xd0
May 10 22:44:56 dev2 kernel: [24546.192203] ? arch_cpu_idle+0xf/0x20
May 10 22:44:56 dev2 kernel: [24546.192204] ? default_idle_call+0x23/0x30
May 10 22:44:56 dev2 kernel: [24546.192206] ? do_idle+0x16f/0x200
May 10 22:44:56 dev2 kernel: [24546.192208] ? cpu_startup_entry+0x71/0x80
May 10 22:44:56 dev2 kernel: [24546.192210] ? rest_init+0x77/0x80
May 10 22:44:56 dev2 kernel: [24546.192211] ? start_kernel+0x464/0x485
May 10 22:44:56 dev2 kernel: [24546.192213] ? early_idt_handler_array+0x120/0x120
May 10 22:44:56 dev2 kernel: [24546.192214] ? x86_64_start_reservations+0x24/0x26
May 10 22:44:56 dev2 kernel: [24546.192215] ? x86_64_start_kernel+0x14d/0x170
May 10 22:44:56 dev2 kernel: [24546.192217] ? start_cpu+0x14/0x14
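
The two stall reports above can be cross-checked with a little arithmetic. What follows is only a minimal sketch in Python and was not part of the original report: it pulls the "detected by" lines out of the kern.log excerpt and estimates the kernel tick rate from how far the stall counter advanced between the two reports. The function names are made up for illustration, and the regular expression assumes nothing beyond the line layout visible in this log.

```python
import re

# The two "detected by" lines copied from the kern.log excerpt above.
SAMPLE = """\
May 10 22:41:56 dev2 kernel: [24366.188977] (detected by 12, t=1860207 jiffies, g=10001, c=10000, q=4656)
May 10 22:44:56 dev2 kernel: [24546.190823] (detected by 14, t=1905212 jiffies, g=10001, c=10000, q=4740)
"""

# Matches the kernel timestamp, the reporting CPU and the stall age in jiffies.
STALL_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\(detected by (?P<cpu>\d+), t=(?P<t>\d+) jiffies"
)


def parse_stall_reports(log_text):
    """Return one dict per 'detected by' line found in log_text."""
    reports = []
    for line in log_text.splitlines():
        match = STALL_RE.search(line)
        if match:
            reports.append({
                "uptime_s": float(match.group("ts")),
                "detected_by": int(match.group("cpu")),
                "stall_jiffies": int(match.group("t")),
            })
    return reports


def estimate_hz(first, second):
    """Estimate ticks per second from two reports of the same stall."""
    jiffies = second["stall_jiffies"] - first["stall_jiffies"]
    seconds = second["uptime_s"] - first["uptime_s"]
    return jiffies / seconds


if __name__ == "__main__":
    first, second = parse_stall_reports(SAMPLE)
    hz = estimate_hz(first, second)
    print("estimated tick rate: %.1f Hz" % hz)
    print("stall age at first report: %.0f s" % (first["stall_jiffies"] / hz))
```

For this excerpt the estimate comes out at roughly 250 ticks per second, so t=1860207 jiffies means CPU 0 had already been stalled for about two hours when the first of these two reports was logged.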

Depending on the kernel version, we also get NMI watchdog errors about a stuck CPU (the CPU core id they mention varies from one crash to the next).
The crash happens at random, but generally after a few hours (3-4h).

This morning we installed kernel 4.11.0-041100-generic #201705041534 and are now waiting for a crash...
For now, the machine is not really in use; at least, it is not under CPU load...

Thanks
---
ApportVersion: 2.20.4-0ubuntu4
Architecture: amd64
DistroRelease: Ubuntu 17.04
InstallationDate: Installed on 2017-05-09 (1 days ago)
InstallationMedia: Ubuntu-Server 17.04 "Zesty Zapus" - Release amd64 (20170412)
Package: linux (not installed)
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=fr_FR.UTF-8
 SHELL=/bin/bash
Tags: zesty
Uname: Linux 4.11.0-041100-generic x86_64
UnreportableReason: The running kernel is not an Ubuntu kernel
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: True

Vincent (hvincent13) wrote :

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1690085

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete

apport information

tags: added: apport-collected zesty
description: updated

apport information

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Lipido (lipido) wrote :

Hi Vincent,

Did you experience more crashes with kernel 4.11?

Thank you!

Vincent (hvincent13) wrote :

Hi Lipido,

Yes, I'm still experiencing crashes, even with 4.11.

Please find the full kern.log attached.

Regards,

Lipido (lipido) wrote :

Ok, me too :-(

Crashes appear after 24-48h of operation.

My hardware is:
- ASUS PRIME B350-Plus
- AMD Ryzen 5 1600

OS: Ubuntu 16.04

Tried:
- Update BIOS to the latest (v 609)
- Kernel 4.10
- Disable SMT in bios (from 12 threads to 6 threads)
- Boot clocksource=tsc iommu=soft
- Disable IOMMU in bios

None of these work. I was waiting for 4.11 to be available for Ubuntu as an official package, but it seems that this will not work either.

Another link talking about what seems to be the same issue:

https://forum.level1techs.com/t/ryzen-vs-ubuntu/115715/22

Joseph Salisbury (jsalisbury) wrote :

Did this issue start happening after an update/upgrade? Was there a prior kernel version where you were not having this particular problem?

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.12 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12-rc1/

Changed in linux (Ubuntu):
importance: Undecided → High
status: Confirmed → Incomplete
tags: added: kernel-da-key
Vincent (hvincent13) wrote :

Hi Joseph,

I've just installed 4.12-rc1.
Now waiting for a crash... or not (hope so !)

Regards,

Vincent (hvincent13) wrote :

Just got a kernel panic...

How do I add the tag kernel-bug-exists-upstream?

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-bug-exists-upstream
Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since you tested the latest upstream kernel. Would it be possible for you to open an upstream bug report[0]? That will allow the upstream Developers to examine the issue, and may provide a quicker resolution to the bug.

Please follow the instructions on the wiki page[0]. The first step is to email the appropriate mailing list. If no response is received, then a bug may be opened on bugzilla.kernel.org.

Once this bug is reported upstream, please add the tag: 'kernel-bug-reported-upstream'.

[0] https://wiki.ubuntu.com/Bugs/Upstream/kernel

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Vincent (hvincent13) wrote :

Hi,

Could you please tell me which address I have to send the mail to?
I don't really understand how to file this bug report on the mailing list.

Thanks,

Vincent (hvincent13) wrote :

Hi,

It looks like the system is stable after removing the "nouveau" driver.
I'll wait 24-48h and post again.

Regards,

camparijet (iichikolamp) wrote :

Hi Vincent,

I also hit the problem, and your advice to remove "nouveau" worked as a workaround to stabilize the system. Before, it froze every 3-5 hours after booting, but it has now been working without problems for 2 days.

My environment is:
- cpu: Ryzen 1700
- motherboard: ASUS B350M-A
- graphics card: NVIDIA GK208
- kernel: 4.11.0-041100rc8-generic
- dist: 17.04

Vincent (hvincent13) wrote :

Hi,

Update after 5 days: no crash so far.

camparijet => did you install the NVIDIA driver instead?

Regards,

camparijet (iichikolamp) wrote :

Hi Vincent,

> camparijet => did you installed NVIDIA driver instead ?

No. I don't need the card for my purposes, so I simply disabled it.

Alex Jones (blenheimears) wrote :

I'm seeing this crash even with the Nvidia official driver.

Alex Jones (blenheimears) wrote :

This is a hardware bug in the CPU. This ticket should be closed as invalid.

Changed in linux (Ubuntu):
status: Triaged → Invalid
Alex Jones (blenheimears) wrote :

Reopening because even though this is a known issue with the CPU we could still implement a workaround. One workaround is to disable address space layout randomization:

echo 0 >/proc/sys/kernel/randomize_va_space

However, that would be disabling a security feature.

Changed in linux (Ubuntu):
status: Invalid → New

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: artful
Marc Rene Schädler (suaefar) wrote :

Was this issue officially confirmed to be a hardware bug in Ryzen processors by AMD?
If so, could you provide a link to the statement?

Disabling address space layout randomization (ASLR) seems to alleviate the problem, but does it solve it?

I am investigating unstable behavior under load which could be related.
There, disabling ASLR is not sufficient!
People suggest increasing SOC voltages, using specific kernel versions, and the like.
See https://community.amd.com/thread/215773?start=135&tstart=0 for more info.

Alex Jones (blenheimears) wrote :

AMD has not publicly commented on this issue that I'm aware of. This issue has been seen on many different operating systems. DragonFlyBSD includes a workaround for this issue. The workaround on Linux is to compile the kernel with CONFIG_RCU_NOCB_CPU, CONFIG_RCU_NOCB_CPU_ALL, and disable ASLR using "echo 0 >/proc/sys/kernel/randomize_va_space". This can be put into rc.local. It's also possible to hardcode ASLR as disabled into the kernel, but this requires modifying the kernel source, not just the config file. There is a new AGESA update released about a week ago, 1.0.0.6a, although I have not tested whether the new AGESA alone (without any kernel changes) solves the issue.
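
To check whether a given machine already has the three pieces of this workaround in place, something like the following Python sketch could be used. It is only an illustration and was not posted in the original thread: the function name is hypothetical, and it assumes the kernel config is available as plain text with one `CONFIG_...=y` entry per line.

```python
import re

# The two build options named in the comment above.
REQUIRED_OPTIONS = ("CONFIG_RCU_NOCB_CPU", "CONFIG_RCU_NOCB_CPU_ALL")


def missing_workarounds(config_text, randomize_va_space):
    """Return the workaround pieces that are NOT in place.

    config_text        -- contents of a kernel config file
    randomize_va_space -- contents of /proc/sys/kernel/randomize_va_space
    """
    # Only options built in with "=y" count as enabled; lines such as
    # "# CONFIG_FOO is not set" are ignored.
    enabled = set(re.findall(r"^(CONFIG_\w+)=y$", config_text, flags=re.MULTILINE))
    missing = [opt for opt in REQUIRED_OPTIONS if opt not in enabled]
    # ASLR counts as disabled only when the sysctl reads exactly 0.
    if randomize_va_space.strip() != "0":
        missing.append("randomize_va_space")
    return missing
```

On Ubuntu the config of the running kernel is normally installed as /boot/config-<kernel release>, so the function can be fed the contents of that file together with the contents of /proc/sys/kernel/randomize_va_space. An empty result means both options are built in and ASLR is off.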

Alex Jones (blenheimears) wrote :

I just tested 1.0.0.6a AGESA, and it does not solve this issue. In some cases just one program will crash, and in other cases the entire system will crash. I will test the workaround above later.

Alex Jones (blenheimears) wrote :

If any of your RAM timings are odd (eg. 17), setting them to the next even number (eg. 18) helps a lot. Recompiling the kernel with CONFIG_RCU_NOCB_CPU and CONFIG_RCU_NOCB_CPU_ALL, and disabling ASLR is still necessary though. It may also be a good idea to give the SOC slightly more voltage, but not more than 1.2 V.

Kai-Heng Feng (kaihengfeng) wrote :

Alex,

Can you provide a link showing how DragonFlyBSD fixed this issue?

Alex Jones (blenheimears) wrote :

This is the DragonFlyBSD commit. https://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/b48dd28447fc8ef62fbc963accd301557fd9ac20

It appears that there are two different ways that the system can crash, which is why it is necessary to both disable ASLR and to compile the kernel with CONFIG_RCU_NOCB_CPU and CONFIG_RCU_NOCB_CPU_ALL. The ASLR-related crash usually results in a single or a few programs crashing (although if an important program crashes it can bring down the entire system) and happens under heavy load. The other crash only happens if the kernel was compiled without CONFIG_RCU_NOCB_CPU and CONFIG_RCU_NOCB_CPU_ALL (Ubuntu's kernel is compiled without these options) and happens when the system is idle or nearly idle, and results in a complete system crash.

Alex Jones (blenheimears) wrote :

If the motherboard allows it, disabling the OpCache will completely prevent (or at least greatly reduce the probability of) the ASLR-related crash, even if ASLR is enabled in the kernel. As far as I'm aware it has no effect on the other type of crash.

Alex Jones (blenheimears) wrote :

I've also determined that changing the CPU, memory, or SOC voltages or timings has little or no effect on either type of crash.

Kai-Heng Feng (kaihengfeng) wrote :

This is beyond my expertise - let's see what upstream can do.

Torge (cyslider) wrote :

I have also been experiencing this problem since I updated yesterday.

I am using Kubuntu 16.04 with KDE backports enabled.
I also have a Ryzen 1800X and an Asrock X370 Gaming professional

I experience this either when I boot and don't log in promptly or when I enter the lockscreen.

After reading this page I tried to run
  seq 1 | xargs -P0 -n1 md5sum /dev/zero &

as root to always keep one CPU core busy and indeed this seems to prevent the lockscreen problem...

I would also like to note that the system was not 100% frozen: once every few minutes it seemed to respond for a short time, which let me switch to the TTY. There I always saw some processes hanging at 100% that I could not kill, or the kill was delayed for a long time. I always see the errors from the initial post during that time. Once entered, though, the TTY seems to work fine.

David Wilson (plottt) wrote :

I've run into this problem several times. After disabling C-states in the motherboard settings, my system has been stable for ~5 weeks of uptime so far.

Torge (cyslider) wrote :

Ok, this trick did not help and my PC did not make it through the night without freezing again.

However, I finally found the real problem. I installed oibaf a while back because my Kubuntu was flickering all over the place. Now it seems to be the source of my problem. I uninstalled oibaf again and now everything seems fine. No more lock screen freezing for me.

Stuart Page (sdpagent) wrote :

Just wanting to report that I am also experiencing this issue with a freshly installed Ubuntu Server 16.04.3 with 4.10 kernel on the 22nd August 2017. All the updates applied and running as a KVM host with hardware:

Ryzen 1700 (non-x)
Motherboard: Asus prime B350-Plus
Bios: - Version 0805
 - Disabled SMT
 - Disabled c-states

xb5i7o (xb5i7o) wrote :

Hi guys,

I'm getting the same issues on a brand new build.

Ryzen 1800x
Asrock X370 Taichi
BIOS: 3.10 (latest)
I tried disabling cool n quiet and c-state

However, there is another option under Advanced, for GLOBAL C-States, that I just disabled today, and I am waiting to see what happens.

On BIOS v3.00 the PC would freeze after 6 hours of idling. I would come back to find keyboard, mouse and everything else frozen.
With BIOS v3.10 I now get random reboots at least once a day.

I'm running Ubuntu 16.04.3 with kernel 4.4.0-93.

Does anyone know where the SoC voltage is in the BIOS? Is it the VDD SoC?
Does anyone know how to update my kernel to 4.10 if it tells me I have the latest?

Stuart Page (sdpagent) wrote :

I finally managed to figure out how to compile a kernel with RCU_NOCB and disabled ASLR as Alex Jones mentioned, and it appears to have worked for me and another guy who helped me put the tutorial together:

http://blog.programster.org/ubuntu-16-04-compile-custom-kernel-for-ryzen

Kai-Heng Feng (kaihengfeng) wrote :

Stuart,

Does this issue also happen on latest mainline kernel?

Franck Charras (franckc) wrote :

Hi,
I'm getting the same issues on several identical builds, with an ASUS prime X370-PRO motherboard.
It's very hard to analyze since it happens randomly every other week, and it leaves no logs. It seems that the freeze happens at idle after very high memory load (observation after logging CPU and RAM loads).
Compiling the kernel as suggested in this thread didn't work (new freeze this morning).

tgui (eric-c-morgan) wrote :

I too still have random computer shutdowns without logs. Uptime varies from a couple days to a week or so. It happens it seems after being relatively idle for a long period. I do have high memory usage because of the VMs I run.

I've disabled C-states, cool and quiet, tested memory, and so forth. I also compiled a new kernel as mentioned by Stuart. I do not have segfaults with compilations.

I am willing to run tests and provide info. Please let me know if anyone wants something.

Ubuntu 16.04
Asrock x370 ITX
32GB Ram
Ryzen 1700

Franck Charras (franckc) wrote :

Ryzen CPUs manufactured before week 24 of this year were known to have issues (especially the segfault issue). It was officially fixed for all Ryzen CPUs manufactured after week 30. All my Ryzen CPUs are pre-week-24 parts and they all have this silent crash issue. Does it also happen with post-week-30 Ryzen CPUs? (@tgui which is yours?)

AS (as2008) wrote :

I can confirm the issue with a week 33 Ryzen 1700 on Asus Prime B350 Plus.
I had segfaults with my previous 1700 (week 22 iirc). No more segfaults with the week 33 CPU but still random crashes on (long time) idle.

tags: removed: kernel-da-key
information type: Public → Public Security
information type: Public Security → Public
Changed in linux:
importance: Unknown → Medium
status: Unknown → Confirmed
Brad Figg (brad-figg) on 2019-07-24
tags: added: cscc

I as the thread starter fully agree with Borislav and set this bug report to status resolved. Thank you all very much for the interesting discussion!

Having started with massive stability problems on my new Ryzen build, I reported this issue to address a potential problem in the Linux kernel. In the hundreds of comments, which surely could be very interesting for everybody affected, I found a workaround: disabling C6 states with the python script attached to a comment above. Since then the system has been perfectly stable. A test with the "kill Ryzen" script also showed me that the CPU was not affected by the other huge problem early adopters had.
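
The script mentioned above is not reproduced here. As a different and read-only check, the kernel's cpuidle sysfs interface lists the idle states Linux knows about for each CPU and whether each one is currently disabled. The sketch below is only an illustration with a hypothetical function name. How the listed states map to the hardware C6 state depends on the firmware and the idle driver, and that is not something this code can tell you.

```python
from pathlib import Path


def list_idle_states(cpu_dir="/sys/devices/system/cpu/cpu0"):
    """Return (directory, state name, disabled?) for each cpuidle state.

    cpu_dir is the sysfs directory of one CPU; it is a parameter so the
    function can also be pointed at a copy of the tree.
    """
    states = []
    for state_dir in sorted(Path(cpu_dir).glob("cpuidle/state*")):
        name = (state_dir / "name").read_text().strip()
        # The "disable" file holds 1 when the state has been switched off.
        disabled = (state_dir / "disable").read_text().strip() == "1"
        states.append((state_dir.name, name, disabled))
    return states
```

Calling `list_idle_states()` with no argument reads the states of CPU 0; on a system without cpuidle support the list is simply empty.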

A few days ago I finally found the time to update the AB350 BIOS to the latest version and set the "typical idle current" in the options. I had only 2 issues since then: The bootloader entry for my OS was broken after the update and network manager did not bring up the LAN interface anymore. But after resolving these issues: No more stability problems. Of course I deactivated my systemd service for c6 states. I am still on the stable branch of Manjaro and use the latest 5.5 series kernel.

See you!

Changed in linux:
status: Confirmed → Expired

(In reply to hoper from comment #33)
> Just a small message to say "me too". I spend the last 4 months spending a
> lot of time and money to change the motherboard, the cpu, the power...
> Before doing research and find that these freezes are software related :(
>
> I tried (and managed) to compile my own kernel with CONFIG_RCU_NOCB_CPU=y and
> so on. Before that, my server always crashed after 2 or 3 days. With this
> custom kernel, yes, it's better... The freezes only appear after 8 or 9 days.
> But the freezes are still here. And I guess I will just sell all this stuff
> and go back to Intel :(
>
> I can't understand why this information, "LINUX + RYZEN = NOT STABLE", is not
> spread everywhere. Lots of people out there are losing lots of time and
> money, I'm sure of that. Of course I'm grateful to all open source
> developers (I'm also sharing what I can :) and I really hope that the root
> cause of this bug will be found and corrected in the next few months... We
> need to be able to use Linux on Ryzen! (A perfect CPU for servers.)
>
> If someone manages to make this bug disappear (and has an uptime > 30 days),
> please share how you did it, with enough detail for a beginner like me :)

Hi folks!

For those who are looking for a solution or already found a solution, there is a new update of AGESA rolling out. The new version 1.0.0.4 claims:

* Improved system stability when switching through ACPI power states.

It arrived for my Asus PRIME B350M a few days ago. I upgraded and kept the default optimised BIOS settings. When I used ZenState to check the CPU status, the C6 state was now:

C6 State - Package - Enabled
C6 State - Core - Disabled

Previously both were enabled, so I guess the new AGESA release has finally solved this problem at the source? I will report back in a few days on whether the system still hangs.

(In reply to Charles Lim from comment #673)

> For those who are looking for a solution or already found a solution, there
> is a new update of AGESA rolling out. The new version 1.0.0.4 claims:
>
> * Improved system stability when switching through ACPI power states.
>
> It has arrived on my Asus PRIME B350M days ago, I have upgraded and kept
> default optimised BIOS settings. When I used ZenState to check cpu status,
> the C6 state is now:
>
> C6 State - Package - Enabled
> C6 State - Core - Disabled
>
> Previously they were all enabled, I guessed the new AGESA release finally
> solved this problem from source? I will report back days later to see if the
> system still hangs.

Are you sure about the AGESA version string? For the Dell OptiPlex 5055, firmware version 1.1.20 [1], running it through *Dell PFS BIOS Extractor* [2], and then grepping for `AGESA!` in the strings/hexdump, it says it includes AGESA version 1.0.0.7a.

    $ strings 1\ --\ 1\ OptiPlex\ System\ BIOS\ v1.1.20.bin | grep -A1 AGESA!
    %pAGESA!V9
    SummitPI-AM4 1.0.0.7a

[1]: https://www.dell.com/support/home/de-de/drivers/driversdetails?driverid=w6mw5&oscode=wt64a&productcode=optiplex-5055-ryzen-cpu
[2]: https://github.com/platomav/BIOSUtilities
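
For anyone without `strings` at hand, the same lookup can be sketched in Python. This is only an illustration with a hypothetical function name: it assumes, as in the output above, that the version label is the printable string immediately following the one that contains the `AGESA!` marker, which may not hold for every firmware image.

```python
import re

# Runs of at least four printable ASCII characters, similar to what
# `strings` reports with its default minimum length.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")

# Tiny stand-in for an extracted firmware image, built from the two
# strings shown in the output above.
SAMPLE = b"\x00\x01%pAGESA!V9\x00SummitPI-AM4 1.0.0.7a\x00\xff"


def agesa_versions(firmware):
    """Return the printable string that follows each 'AGESA!' marker."""
    runs = [m.group().decode("ascii") for m in PRINTABLE_RUN.finditer(firmware)]
    return [runs[i + 1] for i, run in enumerate(runs[:-1]) if "AGESA!" in run]


if __name__ == "__main__":
    print(agesa_versions(SAMPLE))
```

For a real image, pass the bytes of the extracted BIOS file instead of `SAMPLE`.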

(In reply to Paul Menzel from comment #674)

1.0.0.7a is for Zen+
1.0.0.4 is for Zen2.

AMD resets versioning every new CPU generation. 1.0.0.4 is newer than 1.0.0.7

(In reply to Rafal Kupiec from comment #675)
> (In reply to Paul Menzel from comment #674)
> > (In reply to Charles Lim from comment #673)
> >
> > > For those who are looking for a solution or already found a solution,
> there
> > > is a new update of AGESA rolling out. The new version 1.0.0.4 claims:
> > >
> > > * Improved system stability when switching through ACPI power states.

[…]

> > Are you sure about the AGESA version string? For the Dell OptiPlex 5055,
> > firmware version 1.1.20 [1], running it through *Dell PFS BIOS Extractor*,
> > and then grepping for `AGESA!` in the strings/hexdump, it says it includes
> > AGESA version 1.0.0.7a.

[…]

> 1.0.0.7a is for Zen+
> 1.0.0.4 is for Zen2.
>
> AMD resets versioning every new CPU generation. 1.0.0.4 is newer than 1.0.0.7

Thank you for the clarification. I am confused now, though, as I thought you could use Zen2 devices in "Zen+ boards" (boards originally made for Zen+). So, does AGESA 1.0.0.4 for Zen2 also support the predecessor generation?

There is no such thing as a Zen+ or Zen2 board... There are AM4 motherboards based on different chipsets. All of them support Zen, Zen+ and Zen2. If you want to install a Zen2 CPU in such a board, you must first make sure it has been flashed with a BIOS that supports the new CPU. The piece of software that includes the CPU microcode is called AGESA. It is responsible for initializing at least the CPU and memory, and it is redistributed by motherboard manufacturers as a BIOS upgrade.

So if you want to install a Zen2 CPU on an AM4 motherboard, you need to make sure it has been flashed with a BIOS containing AGESA 1.0.0.1, which brings support for Zen2 CPUs. That AGESA 1.0.0.1 is newer than 1.0.0.7. It still supports Zen and Zen+, but since it is dedicated to Zen2, the versioning restarts from 1.0.0.0, which confuses people.
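
The practical consequence is that two AGESA version numbers can only be ordered when they belong to the same release line. The Python sketch below illustrates that rule under stated assumptions: the function names are hypothetical, and it treats the platform label in front of the number (as in the "SummitPI-AM4 1.0.0.7a" string quoted earlier) as the identifier of the release line, which is an assumption rather than documented AMD policy.

```python
import re


def parse_agesa(label):
    """Split e.g. 'SummitPI-AM4 1.0.0.7a' into (platform, numbers, suffix)."""
    platform, version = label.rsplit(" ", 1)
    match = re.fullmatch(r"(\d+(?:\.\d+)*)([a-z]?)", version)
    if match is None:
        raise ValueError("unrecognised AGESA version: %r" % version)
    numbers = tuple(int(part) for part in match.group(1).split("."))
    return platform, numbers, match.group(2)


def compare_agesa(a, b):
    """Return -1, 0 or 1 like a classic comparator, or None when the two
    labels belong to different platforms and cannot be ordered."""
    platform_a, numbers_a, suffix_a = parse_agesa(a)
    platform_b, numbers_b, suffix_b = parse_agesa(b)
    if platform_a != platform_b:
        return None
    key_a, key_b = (numbers_a, suffix_a), (numbers_b, suffix_b)
    return (key_a > key_b) - (key_a < key_b)
```

With this, `compare_agesa("SummitPI-AM4 1.0.0.7a", "ThreadRipperPI-SP3r2 1.1.0.2")` returns `None` instead of pretending that one of the two is newer.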

Hi folks again!

I have been testing the new AGESA 1.0.0.4 these past few days. My uptime is currently over 3 days, which was impossible before: the system always hung within 2 days. Hence I consider that the new firmware has indeed solved this soft lockup issue!

I'm testing on Ubuntu Focal, acpi related settings are all left default. On my Asus PRIME B350M, default optimised settings are being used.

In addition, the CPU C6 State seems also normal, during idle, CPU enters its idle state with very low power consumption (measured externally). k10temp reports Vcore 850mV(approx.).

>Thank you for the clarification. Though I am confused now, as I thought you
>could use Zen2 devices in “Zen+ boards” (boards original for Zen+). So, AGESA
>1.0.0.4 for Zen2 also support the predecessor generation?

It seems that, to address this versioning mess, AMD claims that this 1.0.0.4 release reunited all architectures into one codebase.[1] I guess they also "reunited" the version number. The AGESA on my motherboard was also 1.0.0.7 before.

---REFERENCES---
[1]: https://www.reddit.com/r/Amd/comments/dtgutp/an_update_on_the_am4_platform_agesa_1004/

I've been facing the same issue since I've bought this machine and setting my BIOS power settings to "Typical current idle" has only partially fixed things (the system still crashes every 1-2 weeks during idle: I leave my desk, screensaver fires up, then screens go to sleep, and by the time I get back to my desk, the system no longer responds to any input and the only option is a hard reset.).

At the time of writing, I can't tell if the AGESA version has made it into my motherboard: https://www.asrock.com/mb/AMD/X399M%20Taichi/index.asp#BIOS

But I'd be willing to give this a try. I also just yesterday requested an RMA from AMD though: I spoke to their support and they asked me to try setting the power settings in the BIOS settings. Once I reported back that the power settings only reduced the frequency of these crashes, their immediate response was to open an RMA request. It seemed like they were aware of the issue and they seemed quite confident that an RMA will fix this (I sure hope so since the RMA means at least 2 weeks of downtime for me).

(In reply to Ashesh Ambasta from comment #680)
> I've been facing the same issue since I've bought this machine and setting
> my BIOS power settings to "Typical current idle" has only partially fixed
> things (the system still crashes every 1-2 weeks during idle: I leave my
> desk, screensaver fires up, then screens go to sleep, and by the time I get
> back to my desk, the system no longer responds to any input and the only
> option is a hard reset.).
>
> At the time of writing, I can't tell if the AGESA version has made it into
> my motherboard: https://www.asrock.com/mb/AMD/X399M%20Taichi/index.asp#BIOS
>
> But I'd be willing to give this a try. I also just yesterday requested an
> RMA from AMD though: I spoke to their support and they asked me to try
> setting the power settings in the BIOS settings. Once I reported back that
> the power settings only reduced the frequency of these crashes, their
> immediate response was to open an RMA request. It seemed like they were
> aware of the issue and they seemed quite confident that an RMA will fix this
> (I sure hope so since the RMA means at least 2 weeks of downtime for me).

I wish you luck. I think this is all I can tell you.
I had exactly the same problem with a B450 board. I opened an RMA request and got my money back. I bought an X570-based motherboard and this solved the issue for me.

Does this mean the issue could be motherboard related? I actually received a reply from AMD and they’re asking me for pictures of the processor installed on the motherboard for some reason.

Best,

Ashesh Ambasta

(In reply to Ashesh Ambasta from comment #680)
> I've been facing the same issue since I've bought this machine and setting
> my BIOS power settings to "Typical current idle" has only partially fixed
> things (the system still crashes every 1-2 weeks during idle: I leave my
> desk, screensaver fires up, then screens go to sleep, and by the time I get
> back to my desk, the system no longer responds to any input and the only
> option is a hard reset.).
>
> At the time of writing, I can't tell if the AGESA version has made it into
> my motherboard: https://www.asrock.com/mb/AMD/X399M%20Taichi/index.asp#BIOS

[I have an MSI board, but checked the firmware update files for you.]

The description for version 3.60 includes:

> Update AMD AGESA ThreadRipperPI-SP3r2 1.1.0.2

According to the ASRock forum thread *What happened to Fatality X399 Bios 1.60?* [1], it’s enough to run the firmware update file through a hex editor. I searched for *GES* and got lucky with version 3.80.

    $ hexdump -C X399MT3.80 | less
    […]
    00e34ce0 00 00 00 00 00 00 00 00 00 00 00 00 be 2c 63 17 |.............,c.|
    00e34cf0 80 6c b6 49 82 07 12 b5 3d 9b 25 70 41 47 45 53 |.l.I....=.%pAGES|
    00e34d00 41 21 56 39 00 54 68 72 65 61 64 52 69 70 70 65 |A!V9.ThreadRippe|
    00e34d10 72 50 49 2d 53 50 33 72 32 2d 31 2e 31 2e 30 2e |rPI-SP3r2-1.1.0.|
    00e34d20 32 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |2...............|
    […]

So, it still has AGESA version 1.1.0.2.
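
For anyone who wants to repeat that check without paging through hexdump output, here is a minimal Python sketch. It assumes the layout seen in the dump above (the marker `AGESA!V9`, a NUL byte, then a NUL-terminated ASCII version string); other firmware images may store the string differently, and the function names are only illustrative.

```python
import re

# Assumption taken from the hexdump above: "AGESA!V<digit>", a NUL byte,
# then a printable-ASCII version string terminated by another NUL byte.
AGESA_RE = re.compile(rb"AGESA!V\d\x00([\x20-\x7e]+)\x00")


def find_agesa_versions(image):
    """Return every AGESA version string found in a firmware image (bytes)."""
    return [m.group(1).decode("ascii") for m in AGESA_RE.finditer(image)]


def find_agesa_versions_in_file(path):
    """Read a firmware update file from disk and scan it for AGESA strings."""
    with open(path, "rb") as fh:
        return find_agesa_versions(fh.read())
```

Calling `find_agesa_versions_in_file("X399MT3.80")` on the downloaded update file should then list whatever version strings the image contains.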

[…]

[1]: http://forum.asrock.com/forum_posts.asp?TID=6165&PN=4&title=what-happened-to-fatality-x399-bios-160

I see. But that is assuming that the 1.0.0.4 version “fixes” this issue. At this point it’s only a guess but it seems likely given the previous replies I’ve seen.
My god what a nightmare this has been. :-)

Best,

Ashesh Ambasta

I ran several experiments with a Ryzen 1700 on a Gigabyte X370 Gaming K7 board. The best setting was "Typical current idle", but the latest BIOS didn't have that option. So I bought an R5 3600, flashed the latest BIOS onto the board, and all the problems went away. Rock solid now. I think the first generation of Ryzen has some problems with Linux; mine was RMA'ed because of the compiler error (see comment 195).
Currently running gcc-9.3.0, kernel 5.7.0, BIOS F50a.

(In reply to Paul Menzel from comment #683)

Hi,
AGESA versioning for TR (Threadripper) is not the same as for AM4.

I’ve tried that option of “Typical current idle” to no avail.

Best,

Ashesh Ambasta

I've also found https://www.reddit.com/r/Amd/comments/8yzvxz/ryzen_c6_state_sleep_power_supply_common_current/e33lc6s/ which seems to be promising. It is, however, appalling to see this kind of response from AMD and I'm quite sure this is my last AMD system.

I'm currently testing my system with "Deep sleep" enabled and will report whether it's stable. The "Typical current idle" fix certainly didn't fix things for me, and in any case, the best I can say about that "fix" is that it's an ugly hack and not really a fix from AMD.

I'm bewildered to see so many reports on this thread and no word from AMD. I'm also convinced they are aware of this issue and are doing nothing but proposing cheap hacks.

AMD has also approved my RMA request and I'll send this processor back if the "deep sleep" "fix" doesn't fix things for me. If the RMA'd processor doesn't work, I'm going to sell it.

(In reply to raulvior.bcn from comment #653)
> (In reply to txrx from comment #651)
>
> Typical Current Idle might not be working. Read the sensor output. If
> voltage is not higher than without enabling it, try to increase the core
> voltage.
>
> My Ryzen 7 1800X seems to not produce hangs since I upgraded to 1003ABB with
> an ASUS Crosshair VI Hero and enabled Typical current idle.
>
>
> > I was able to update my BIOS to version 18, but my system still locks up.
> > I tried the following with the new BIOS:
> > - use factory defaults
> > - disable SMT
> > - disable SMT with Typical Current Idle
> > - all of the above with SVM disabled/enabled
> > Right now I set the power supply idle control to "Low ..." and will report
> > back.
> >

The motherboard kept hanging. I had to remove the Vitals GNOME Extension. It seems that polling voltage values hangs the motherboard... Still, there are times that the computer does not come back from suspend. There's something wrong with the BIOS/UEFI.


As a last resort, I've tried `idle=halt` on this machine. And yet my
system just crashed after 3 weeks of uptime.
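
To confirm that a parameter such as `idle=halt` actually reached the running kernel, the boot command line can be inspected. A rough Python sketch follows; the helper names are made up for illustration, and the parsing is deliberately naive (it splits on whitespace and ignores quoted values).

```python
from pathlib import Path
from typing import Dict, Optional


def parse_cmdline(cmdline: str) -> Dict[str, str]:
    """Split a kernel command line into {name: value}; bare flags map to ''."""
    params = {}
    for token in cmdline.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params


def idle_setting(cmdline: str) -> Optional[str]:
    """Return the value of the idle= parameter, or None if it is not set."""
    return parse_cmdline(cmdline).get("idle")


def running_idle_setting(path: str = "/proc/cmdline") -> Optional[str]:
    """Look up idle= on the command line the current kernel was booted with."""
    return idle_setting(Path(path).read_text())
```

If `running_idle_setting()` returns `None`, the parameter was not applied, for example because the bootloader configuration was not regenerated after editing it.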

I'm done with AMD. I will RMA this processor to try things out, but
overall, if that doesn't work, this thing is headed to the junkyard and
I'm going to live with Intel.

At least in the 13 or so Intel systems I've tried, I've not had
exasperating issues like these where the company is positively trying
to ignore this ongoing issue.

This is disgusting from AMD.

Before you discard the CPU: if your BIOS has a PBO option, turn it off. With the latest BIOS (version 5406) on my B350 motherboard, PBO is on by default, and apart from causing high temperatures it led to this kind of crash and reboot with my Ryzen 7. Turning off PBO fixed it, and I am now running a Ryzen 9 on the same board; everything works fine with PBO off.

I’ll try that, but I doubt if at this stage it will have any effect.

Best,

Ashesh Ambasta

Moreover, from a bit of reading I see that PBO is related to how fast the processor clock speeds are boosted. How is this related to crashes when idle?

Best,

Ashesh Ambasta

Did you ask for an RMA? Did it work?
(In reply to Ashesh Ambasta from comment #692)

Created attachment 290243
attachment-20700-0.html

I haven't, and to be honest, I've been procrastinating on this issue.

As a very ugly hack/workaround, I've disabled screen power management in
xscreensaver, so the CPU keeps drawing graphics on my screen instead of
my displays going to sleep.

That way, my CPU never really enters the idle states for the crashes to
occur.
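
One way to check whether the CPU really stays out of the deeper idle states is to read the per-state counters that the kernel's cpuidle framework exposes in sysfs. The sketch below is only an illustration: it assumes the usual `cpuidle/stateN/{name,usage,time}` layout under a CPU directory such as `/sys/devices/system/cpu/cpu0`, and the function name is made up.

```python
from pathlib import Path


def read_idle_states(cpu_dir):
    """Return [(state name, times entered, total time in microseconds)]."""
    state_dirs = sorted(
        Path(cpu_dir, "cpuidle").glob("state[0-9]*"),
        key=lambda p: int(p.name[len("state"):]),  # numeric order: state2 before state10
    )
    states = []
    for state in state_dirs:
        name = (state / "name").read_text().strip()
        usage = int((state / "usage").read_text())
        time_us = int((state / "time").read_text())
        states.append((name, usage, time_us))
    return states
```

Reading the counters twice, some minutes apart, and comparing the `usage` values shows which idle states the core entered in between.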

I understand that this is /far/ from a satisfactory solution; but I
didn't want to try my luck with the RMA anymore. As long as my system
doesn't crash, I can live with this CPU (albeit this continues to
frustrate me). I may lose patience in the coming months and go for an
RMA anyway. But I'm deterred by the mixed reports for the RMA as well:
some people claim that an RMA fixes their issues; some people say it
makes no difference. I've even read reports of the RMA'd CPU actually
turning out to be worse.

I don't think I'm prepared for the gamble. I've been burnt pretty bad
with AMD at the moment. For now, I'm just making this work. The next
time I'm buying a CPU, I'll do my research more thoroughly and stay away
from AMD.

AMD did publish an errata in which they claim an issue like this exists;
but a solution is ruled out. Which is further bad news. There was some
discussion on a fix being at the kernel level, but that isn't anywhere
in sight either. I believe these CPU's are plagued by several issues,
which probably makes a fix for this at the kernel level hard. However,
Windows seems to have managed to fix it.

Anyway; rants aside, this is my current take on the CPU.


(In reply to Ashesh Ambasta from comment #699)

> Created attachment 290243 [details]
> attachment-20700-0.html

I am getting:

> Sorry, you are not authorized to access attachment #290243.

[…]

> AMD did publish an errata in which they claim an issue like this exists;
> but a solution is ruled out. Which is further bad news. There was some
> discussion on a fix being at the kernel level, but that isn't anywhere
> in sight either. I believe these CPU's are plagued by several issues,
> which probably makes a fix for this at the kernel level hard. However,
> Windows seems to have managed to fix it.

That’s interesting. Could you please share the URL to these statements?

[…]

raulvior.bcn, Ashesh: Please always remove the cited parts, as otherwise the Web issue page gets even more convoluted. Thanks.

I'm also experiencing severe and constant crashes with a Ryzen 5 3600X. I also had problems with my Ryzen 7 1700, and I was confident that I was never going to have problems again with my Ryzen 5 3600X, but I was wrong.

For the first months everything was fine, and after reinstalling the system, everything went wrong. Crashes daily and constantly. I had a critical Git repository corrupted (luckily I could recover things in time), but it completely destroyed my experience with my PC and has actually become a financial setback, as I use the computer for work.

Not sure what to do. I've decided to give it until the end of this year to get rid of the crashes, but as a Linux user, I'm unsure I will ever think about buying AMD again. I will always remember the hassle and the lost years I had with this.

Sometimes all you want to do is to have a great experience with your Linux computer and BOOM, a hard freeze that requires manually turning the computer off. It's ridiculous. To be honest, I expected much more from AMD. I hope they'll fix it ASAP, we can't stand this anymore. Intel becomes our only option with this problem.

I also opened an issue where I'm posting updates about my problems: https://bugzilla.kernel.org/show_bug.cgi?id=208615

Have you tried the suggested fixes in this thread? And what changed
between that reinstall and earlier?

Even with screensavers enabled and displays never going to sleep, I just
experienced a crash on idle.

Here are the logs (https://pastebin.com/D7SEJn9J). I did change the
display configuration just before leaving my desk (with xrandr), which
explains some of these logs. But I can see nothing from the kernel etc.

The system just dies.
