Comment 14 for bug 1857413

fan jinke (fanjinke) wrote :

So sorry for the late reply.

The debs at https://people.canonical.com/~phlin/kernel/lp-1857413-ras-err-msg/ work well, and the result is the same as with Ubuntu 19.10 server.

dmesg log:
[ 316.984470] mce: [Hardware Error]: Machine check events logged
[ 316.984475] [Hardware Error]: Corrected error, no action required.
[ 316.984537] [Hardware Error]: CPU:0 (18:0:2) MC16_STATUS[Over|CE|MiscV|-|AddrV|-|-|SyndV|-|CECC]: 0xdc2040000000011b
[ 316.984610] [Hardware Error]: Error Addr: 0x00000007de33d040
[ 316.984654] [Hardware Error]: IPID: 0x0000009600150f00, Syndrome: 0x000040100a400f00
[ 316.984712] [Hardware Error]: Unified Memory Controller Extended Error Code: 0
[ 316.984765] [Hardware Error]: Unified Memory Controller Error: DRAM ECC error.
[ 316.984881] WARNING: CPU: 0 PID: 109 at drivers/edac/edac_mc.c:1243 edac_mc_handle_error+0x53f/0x590
[ 316.984883] Modules linked in: msr nls_iso8859_1 dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua amd64_edac_mod ipmi_ssif edac_mce_amd kvm_amd ccp kvm irqbypass ipmi_si input_leds ipmi_devintf ipmi_msghandler k10temp mac_hid sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid ast ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ahci igb drm libahci dca i2c_algo_bit
[ 316.984938] CPU: 0 PID: 109 Comm: kworker/0:2 Not tainted 5.0.0-38-generic #41
[ 316.984939] Hardware name: Sugon HygonH210/HygonH210, BIOS 210ER119 03/15/2019
[ 316.984946] Workqueue: events mce_gen_pool_process
[ 316.984951] RIP: 0010:edac_mc_handle_error+0x53f/0x590
[ 316.984953] Code: 77 6e 20 41 b9 72 79 00 00 49 89 84 24 88 05 00 00 48 8b 45 b8 c7 40 08 6d 65 6d 6f 66 44 89 48 0c c6 40 0e 00 e9 6c fd ff ff <0f> 0b 49 c7 82 b0 06 00 00 01 00 00 00 31 c0 e9 48 fe ff ff 40 84
[ 316.984955] RSP: 0018:ffffb03743b33c68 EFLAGS: 00010246
[ 316.984958] RAX: 0000000000000000 RBX: ffffffff8a9b81f1 RCX: 0000000000000001
[ 316.984959] RDX: 0000000000000000 RSI: ffffffff8a9b81f7 RDI: ffff9e7219335c9a
[ 316.984960] RBP: ffffb03743b33ce8 R08: ffffffff8a973dc8 R09: 000000007568c237
[ 316.984961] R10: ffff9e7219335800 R11: ffff9e7219335c99 R12: 0000000000000002
[ 316.984962] R13: ffff9e7219335c9a R14: ffff9e7219335800 R15: 00000000ffffffff
[ 316.984964] FS: 0000000000000000(0000) GS:ffff9e721d000000(0000) knlGS:0000000000000000
[ 316.984965] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 316.984967] CR2: 000055d228ad3d90 CR3: 0000000852780000 CR4: 00000000003406f0
[ 316.984968] Call Trace:
[ 316.984991] __log_ecc_error+0x62/0x90 [amd64_edac_mod]
[ 316.984995] decode_umc_error+0xac/0x190 [amd64_edac_mod]
[ 316.985002] amd_decode_mce.cold.27+0xa7c/0xa81 [edac_mce_amd]
[ 316.985011] notifier_call_chain+0x4c/0x70
[ 316.985014] blocking_notifier_call_chain+0x43/0x60
[ 316.985016] mce_gen_pool_process+0x41/0x70
[ 316.985023] process_one_work+0x20f/0x410
[ 316.985025] worker_thread+0x34/0x400
[ 316.985028] kthread+0x120/0x140
[ 316.985031] ? process_one_work+0x410/0x410
[ 316.985033] ? __kthread_parkme+0x70/0x70
[ 316.985043] ret_from_fork+0x22/0x40
[ 316.985046] ---[ end trace 324c2dc485143f45 ]---
[ 316.985053] EDAC MC0: 1 CE on mc#0csrow#0channel#1 (csrow:0 channel:1 page:0x85e33d offset:0x40 grain:1 syndrome:0x4010)
[ 316.985054] [Hardware Error]: cache level: L3/GEN, tx: GEN, mem-tx: RD
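For anyone cross-checking the MC16_STATUS line above, here is a minimal sketch that decodes the raw status value by hand. It assumes the bit positions of the MCI_STATUS_* masks in the Linux kernel's arch/x86/include/asm/mce.h and the 6-bit extended error code field at bits 21:16 used on SMCA systems. The names FLAG_BITS and decode_mca_status are made up for this illustration and are not part of any kernel or tool API.

```python
# Assumed bit positions, taken from the MCI_STATUS_* masks in
# arch/x86/include/asm/mce.h (hypothetical table name, for illustration only).
FLAG_BITS = {
    "Val": 63, "Over": 62, "UC": 61, "En": 60,
    "MiscV": 59, "AddrV": 58, "PCC": 57,
    "TCC": 55, "SyndV": 53,
    "CECC": 46, "UECC": 45, "Deferred": 44, "Poison": 43,
}


def decode_mca_status(status):
    """Return (list of set flag names, extended error code) for a raw status value."""
    flags = [name for name, bit in FLAG_BITS.items() if (status >> bit) & 1]
    # Extended error code: bits 21:16 (6-bit mask assumed, as on SMCA systems).
    ext_code = (status >> 16) & 0x3F
    return flags, ext_code


flags, ext_code = decode_mca_status(0xdc2040000000011b)
print("|".join(flags), "extended error code:", ext_code)
```

For the value in this log, the set bits are Val, Over, En, MiscV, AddrV, SyndV and CECC, with extended error code 0. That agrees with the flags the kernel printed (UC is clear, so it reports CE) and with the "Extended Error Code: 0" line.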

uname -a
Linux ubuntu 5.0.0-38-generic #41 SMP Thu Dec 26 09:14:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux