Test results on QDF2400 showing a recoverable DDR error, a correctable vendor-specific error, a correctable ARM cache error, and a fatal vendor-specific error. All functionality appears to be working properly.

ubuntu@null-8cfdf006a3ef:~$ uname -a
Linux null-8cfdf006a3ef 4.10.0-29-generic #33~lp1706141+build.2-Ubuntu SMP Tue Jul 25 19:12:22 UTC 2017 aarch64 aarch64 aarch64 GNU/Linux
ubuntu@null-8cfdf006a3ef:~$ dmesg | grep -i -E 'hest|ghes|edac|hardware'
[ 0.000000] ACPI: HEST 0x0000000008A60000 000288 (v01 QCOM QDF2400 00000001 INTL 20150515)
[ 0.538984] HEST: Table parsing has been initialized.
[ 3.854385] EDAC MC: Ver: 3.0.0
[ 5.537078] ghes_edac: This EDAC driver relies on BIOS to enumerate memory and get error reports.
[ 5.545952] ghes_edac: Unfortunately, not all BIOSes reflect the memory layout correctly.
[ 5.554123] ghes_edac: So, the end result of using this driver varies from vendor to vendor.
[ 5.562555] ghes_edac: If you find incorrect reports, please contact your hardware vendor
[ 5.570727] ghes_edac: to correct its BIOS.
[ 5.574905] ghes_edac: This system has 6 DIMM sockets.
[ 5.580205] EDAC MC0: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 5.589763] EDAC MC1: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 5.599319] EDAC MC2: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 5.608867] EDAC MC3: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 5.618416] EDAC MC4: Giving out device to module ghes_edac.c controller ghes_edac: DEV ghes (INTERRUPT)
[ 5.628018] GHES: APEI firmware first mode is enabled by APEI bit and WHEA _OSC.
[ 6.573372] qcom-emac QCOM8070:00 eth0: hardware id 64.1, hardware version 1.3.0
[ 224.669058] {1}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[ 224.677330] {1}[Hardware Error]: event severity: recoverable
[ 224.682992] {1}[Hardware Error]: precise tstamp: 2017-07-26 15:58:19
[ 224.689437] {1}[Hardware Error]: Error 0, type: recoverable
[ 224.695097] {1}[Hardware Error]: section_type: memory error
[ 224.700846] {1}[Hardware Error]: error_status: 0x00000000000c0400
[ 224.707113] {1}[Hardware Error]: physical_address: 0x0000000000204e10
[ 224.713726] {1}[Hardware Error]: physical_address_mask: 0x00000fffffffffff
[ 224.720776] {1}[Hardware Error]: node: 0 card: 1 module: 0 rank: 0 bank: 0 device: 0 row: 4 column: 306
[ 224.730427] {1}[Hardware Error]: error_type: 3, multi-bit ECC
[ 224.736356] EDAC MC0: 1 UE Multi-bit ECC on unknown label (node:0 card:1 module:0 rank:0 bank:0 row:4 col:306 page:0x204 offset:0xe10 grain:-4096 - status(0x00000000000c0400): Storage error in DRAM memory)
[ 224.736358] [Firmware Warn]: GHES: Invalid address in generic error data: 0x204e10
[ 251.685322] {2}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
[ 251.685324] {2}[Hardware Error]: It has been corrected by h/w and requires no further action
[ 251.685336] {2}[Hardware Error]: event severity: corrected
[ 251.685341] {2}[Hardware Error]: precise tstamp: 2017-07-26 15:58:30
[ 251.685342] {2}[Hardware Error]: Error 0, type: corrected
[ 251.685348] {2}[Hardware Error]: section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b
[ 251.685349] {2}[Hardware Error]: section length: 0x238
[ 251.685355] {2}[Hardware Error]: 00000000: 4d415201 4d492031 453a4d45 435f4343 .RAM1 IMEM:ECC_C
[ 251.685358] {2}[Hardware Error]: 00000010: 53515f45 44525f42 00000000 00000000 E_QSB_RD........
[ 251.685361] {2}[Hardware Error]: 00000020: 00000000 00000000 00000000 00000000 ................
[ 251.685364] {2}[Hardware Error]: 00000030: 00000000 00000000 01010000 01010000 ................
[ 251.685367] {2}[Hardware Error]: 00000040: 00000000 00000000 00000005 00000000 ................
[ 251.685369] {2}[Hardware Error]: 00000050: 01010000 00000000 00000001 00010100 ................
[ 251.685372] {2}[Hardware Error]: 00000060: 00000000 00000000 00000000 00000000 ................
[ 251.685375] {2}[Hardware Error]: 00000070: 00000000 00000000 00000000 00000000 ................
[ 251.685378] {2}[Hardware Error]: 00000080: 00000000 00000000 00000000 00000000 ................
[ 251.685381] {2}[Hardware Error]: 00000090: 00000000 00000000 00000000 00000000 ................
[ 251.685384] {2}[Hardware Error]: 000000a0: 00000000 00000000 00000000 00000000 ................
[ 251.685387] {2}[Hardware Error]: 000000b0: 00000000 00000000 00000000 00000000 ................
[ 251.685389] {2}[Hardware Error]: 000000c0: 00000000 00000000 00000000 00000000 ................
[ 251.685392] {2}[Hardware Error]: 000000d0: 00000000 00000000 00000000 00000000 ................
[ 251.685395] {2}[Hardware Error]: 000000e0: 00000000 00000000 00000000 00000000 ................
[ 251.685398] {2}[Hardware Error]: 000000f0: 00000000 00000000 00000000 00000000 ................
[ 251.685402] {2}[Hardware Error]: 00000100: 00000000 00000000 00000000 00000000 ................
[ 251.685405] {2}[Hardware Error]: 00000110: 00000000 00000000 00000000 00000000 ................
[ 251.685408] {2}[Hardware Error]: 00000120: 00000000 00000000 00000000 00000000 ................
[ 251.685410] {2}[Hardware Error]: 00000130: 00000000 00000000 00000000 00000000 ................
[ 251.685413] {2}[Hardware Error]: 00000140: 00000000 00000000 00000000 00000000 ................
[ 251.685416] {2}[Hardware Error]: 00000150: 00000000 00000000 00000000 00000000 ................
[ 251.685419] {2}[Hardware Error]: 00000160: 00000000 00000000 00000000 00000000 ................
[ 251.685423] {2}[Hardware Error]: 00000170: 00000000 00000000 00000000 00000000 ................
[ 251.685426] {2}[Hardware Error]: 00000180: 00000000 00000000 00000000 00000000 ................
[ 251.685429] {2}[Hardware Error]: 00000190: 00000000 00000000 00000000 00000000 ................
[ 251.685432] {2}[Hardware Error]: 000001a0: 00000000 00000000 00000000 00000000 ................
[ 251.685434] {2}[Hardware Error]: 000001b0: 00000000 00000000 00000000 00000000 ................
[ 251.685437] {2}[Hardware Error]: 000001c0: 00000000 00000000 00000000 00000000 ................
[ 251.685440] {2}[Hardware Error]: 000001d0: 00000000 00000000 00000000 00000000 ................
[ 251.685443] {2}[Hardware Error]: 000001e0: 00000000 00000000 00000000 00000000 ................
[ 251.685446] {2}[Hardware Error]: 000001f0: 00000000 00000000 00000000 00000000 ................
[ 251.685449] {2}[Hardware Error]: 00000200: 00000000 00000000 00000000 00000000 ................
[ 251.685451] {2}[Hardware Error]: 00000210: 00000000 00000000 00000000 00000000 ................
[ 251.685454] {2}[Hardware Error]: 00000220: 00000000 00000000 00000000 00000000 ................
[ 251.685457] {2}[Hardware Error]: 00000230: 00000000 00000000 ........
[ 357.701494] {3}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 2
[ 357.701496] {3}[Hardware Error]: event severity: info
[ 357.701508] {3}[Hardware Error]: precise tstamp: 2017-07-26 16:00:12
[ 357.701510] {3}[Hardware Error]: Error 0, type: info
[ 357.701513] {3}[Hardware Error]: section_type: ARM processor error
[ 357.701515] {3}[Hardware Error]: MIDR: 0x00000000510f8000
[ 357.701518] {3}[Hardware Error]: Multiprocessor Affinity Register (MPIDR): 0x0000000000000000
[ 357.701520] {3}[Hardware Error]: error affinity level: 2
[ 357.701522] {3}[Hardware Error]: running state: 0x1
[ 357.701524] {3}[Hardware Error]: Power State Coordination Interface state: 0
[ 357.701527] {3}[Hardware Error]: Error info structure 0:
[ 357.701529] {3}[Hardware Error]: num errors: 1
[ 357.701531] {3}[Hardware Error]: first error captured
[ 357.701533] {3}[Hardware Error]: last error captured
[ 357.701535] {3}[Hardware Error]: error_type: 0, cache error
[ 357.701538] {3}[Hardware Error]: error_info: 0x0000000000c20058
ubuntu@null-8cfdf006a3ef:~$
ubuntu@null-8cfdf006a3ef:~$ [ 403.857832] {4}[Hardware Error]: Hardware error from APEI Generic Hardware Error Source: 1
[ 403.866103] {4}[Hardware Error]: event severity: fatal
[ 403.871244] {4}[Hardware Error]: precise tstamp: 2017-07-26 16:01:18
[ 403.877690] {4}[Hardware Error]: Error 0, type: fatal
[ 403.882831] {4}[Hardware Error]: section type: unknown, d2e2621c-f936-468d-0d84-15a4ed015c8b
[ 403.891445] {4}[Hardware Error]: section length: 0x238
[ 403.896762] {4}[Hardware Error]: 00000000: 4d415201 4d492031 453a4d45 555f4343 .RAM1 IMEM:ECC_U
[ 403.905721] {4}[Hardware Error]: 00000010: 53515f45 44525f42 00000000 00000000 E_QSB_RD........
[ 403.914682] {4}[Hardware Error]: 00000020: 00000000 00000000 00000000 00000000 ................
[ 403.923644] {4}[Hardware Error]: 00000030: 00000000 00000000 01010000 01010000 ................
[ 403.932605] {4}[Hardware Error]: 00000040: 00000000 00000000 00000005 00000000 ................
[ 403.941566] {4}[Hardware Error]: 00000050: 02020000 00000000 00000001 00c6c600 ................
[ 403.950531] {4}[Hardware Error]: 00000060: 00000000 00000000 00000000 00000000 ................
[ 403.959489] {4}[Hardware Error]: 00000070: 00000000 00000000 00000000 00000000 ................
[ 403.968450] {4}[Hardware Error]: 00000080: 00000000 00000000 00000000 00000000 ................
[ 403.977413] {4}[Hardware Error]: 00000090: 00000000 00000000 00000000 00000000 ................
[ 403.986374] {4}[Hardware Error]: 000000a0: 00000000 00000000 00000000 00000000 ................
[ 403.995339] {4}[Hardware Error]: 000000b0: 00000000 00000000 00000000 00000000 ................
[ 404.004302] {4}[Hardware Error]: 000000c0: 00000000 00000000 00000000 00000000 ................
[ 404.013263] {4}[Hardware Error]: 000000d0: 00000000 00000000 00000000 00000000 ................
[ 404.022223] {4}[Hardware Error]: 000000e0: 00000000 00000000 00000000 00000000 ................
[ 404.031183] {4}[Hardware Error]: 000000f0: 00000000 00000000 00000000 00000000 ................
[ 404.040143] {4}[Hardware Error]: 00000100: 00000000 00000000 00000000 00000000 ................
[ 404.049104] {4}[Hardware Error]: 00000110: 00000000 00000000 00000000 00000000 ................
[ 404.058064] {4}[Hardware Error]: 00000120: 00000000 00000000 00000000 00000000 ................
[ 404.067025] {4}[Hardware Error]: 00000130: 00000000 00000000 00000000 00000000 ................
[ 404.075986] {4}[Hardware Error]: 00000140: 00000000 00000000 00000000 00000000 ................
[ 404.084946] {4}[Hardware Error]: 00000150: 00000000 00000000 00000000 00000000 ................
[ 404.093907] {4}[Hardware Error]: 00000160: 00000000 00000000 00000000 00000000 ................
[ 404.102867] {4}[Hardware Error]: 00000170: 00000000 00000000 00000000 00000000 ................
[ 404.111828] {4}[Hardware Error]: 00000180: 00000000 00000000 00000000 00000000 ................
[ 404.120788] {4}[Hardware Error]: 00000190: 00000000 00000000 00000000 00000000 ................
[ 404.129752] {4}[Hardware Error]: 000001a0: 00000000 00000000 00000000 00000000 ................
[ 404.138710] {4}[Hardware Error]: 000001b0: 00000000 00000000 00000000 00000000 ................
[ 404.147673] {4}[Hardware Error]: 000001c0: 00000000 00000000 00000000 00000000 ................
[ 404.156632] {4}[Hardware Error]: 000001d0: 00000000 00000000 00000000 00000000 ................
[ 404.165593] {4}[Hardware Error]: 000001e0: 00000000 00000000 00000000 00000000 ................
[ 404.174555] {4}[Hardware Error]: 000001f0: 00000000 00000000 00000000 00000000 ................
[ 404.183516] {4}[Hardware Error]: 00000200: 00000000 00000000 00000000 00000000 ................
[ 404.192476] {4}[Hardware Error]: 00000210: 00000000 00000000 00000000 00000000 ................
[ 404.201438] {4}[Hardware Error]: 00000220: 00000000 00000000 00000000 00000000 ................
[ 404.210398] {4}[Hardware Error]: 00000230: 00000000 00000000 ........
[ 404.218665] Kernel panic - not syncing: Fatal hardware error!
[ 404.224406] CPU: 0 PID: 217 Comm: kworker/0:1 Not tainted 4.10.0-29-generic #33~lp1706141+build.2-Ubuntu
[ 404.233876] Hardware name: Qualcomm Qualcomm Centriq(TM) 2400 Development Platform/ABW|SYS|CVR,1DPC|V3 , BIOS XBL.DF.2.0.R1-00512 QDF2400_REL CR
[ 404.247695] Workqueue: kacpi_notify acpi_os_execute_deferred
[ 404.253347] Call trace:
[ 404.255790] [] dump_backtrace+0x0/0x2b0
[ 404.261182] [] show_stack+0x24/0x30
[ 404.266230] [] dump_stack+0x9c/0xbc
[ 404.271276] [] panic+0x140/0x2b0
[ 404.276061] [] ghes_proc+0x1d8/0x568
[ 404.281191] [] ghes_notify_sci+0x44/0x70
[ 404.286670] [] notifier_call_chain+0x5c/0xa0
[ 404.292495] [] __blocking_notifier_call_chain+0x58/0xa0
[ 404.299274] [] blocking_notifier_call_chain+0x3c/0x50
[ 404.305883] [] acpi_hed_notify+0x24/0x30
[ 404.311361] [] acpi_device_notify+0x30/0x40
[ 404.317101] [] acpi_ev_notify_dispatch+0x4c/0x70
[ 404.323274] [] acpi_os_execute_deferred+0x24/0x38
[ 404.329535] [] process_one_work+0x158/0x478
[ 404.335273] [] worker_thread+0x50/0x4a8
[ 404.340665] [] kthread+0x108/0x138
[ 404.345622] [] ret_from_fork+0x10/0x30
[ 404.350934] SMP: stopping secondary CPUs
[ 404.356117] Starting crashdump kernel...
[ 404.360034] Bye!
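
Two values in the log above can be cross-checked by hand. The Python sketch below was not part of the test run. It makes two assumptions: 4 KiB pages, and that the vendor-specific dump prints 32-bit words whose bytes are stored little-endian (which the dump's own ASCII column suggests). The helper names decode_words and printable are made up for illustration.

```python
import struct

PAGE_SHIFT = 12  # assumption: 4 KiB pages


def decode_words(words):
    """Pack hex-printed 32-bit words as little-endian and return the raw bytes."""
    return b"".join(struct.pack("<I", int(w, 16)) for w in words)


def printable(raw):
    """Render bytes the way a hex dump's ASCII column does ('.' for non-printables)."""
    return "".join(chr(b) if 32 <= b < 127 else "." for b in raw)


# First six words of the vendor-specific section in records {2} and {4}.
corrected_words = "4d415201 4d492031 453a4d45 435f4343 53515f45 44525f42".split()
fatal_words = "4d415201 4d492031 453a4d45 555f4343 53515f45 44525f42".split()

corrected_text = printable(decode_words(corrected_words))
fatal_text = printable(decode_words(fatal_words))
print(corrected_text)  # .RAM1 IMEM:ECC_CE_QSB_RD
print(fatal_text)      # .RAM1 IMEM:ECC_UE_QSB_RD

# physical_address from record {1}, split into page number and in-page offset.
physical_address = 0x0000000000204E10
page = physical_address >> PAGE_SHIFT
offset = physical_address & ((1 << PAGE_SHIFT) - 1)
print(hex(page), hex(offset))  # 0x204 0xe10
```

The two decoded strings differ only in one character (ECC_CE versus ECC_UE), which lines up with the corrected and fatal severities reported for records {2} and {4}. The page and offset match the page:0x204 offset:0xe10 fields in the EDAC MC0 line.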