Activity log for bug #1774964

Date Who What changed Old value New value Message
2018-06-04 09:39:23 bugproxy bug added bug
2018-06-04 09:39:26 bugproxy tags architecture-ppc64le bugnameltc-167176 severity-high targetmilestone-inin1804
2018-06-04 09:39:28 bugproxy ubuntu: assignee Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage)
2018-06-04 09:39:31 bugproxy affects ubuntu linux (Ubuntu)
2018-06-04 14:53:50 Andrew Cloke bug task added ubuntu-power-systems
2018-06-04 14:54:25 Andrew Cloke tags architecture-ppc64le bugnameltc-167176 severity-high targetmilestone-inin1804 architecture-ppc64le bugnameltc-167176 p9 severity-high targetmilestone-inin1804 triage-g
2018-06-04 14:54:31 Andrew Cloke ubuntu-power-systems: importance Undecided High
2018-06-04 14:54:42 Andrew Cloke ubuntu-power-systems: assignee Canonical Kernel Team (canonical-kernel-team)
2018-06-04 19:50:58 Joseph Salisbury linux (Ubuntu): status New In Progress
2018-06-04 19:51:00 Joseph Salisbury linux (Ubuntu): importance Undecided High
2018-06-04 19:51:03 Joseph Salisbury linux (Ubuntu): assignee Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) Joseph Salisbury (jsalisbury)
2018-06-04 19:51:09 Joseph Salisbury nominated for series Ubuntu Cosmic
2018-06-04 19:51:09 Joseph Salisbury bug task added linux (Ubuntu Cosmic)
2018-06-04 19:51:09 Joseph Salisbury nominated for series Ubuntu Bionic
2018-06-04 19:51:09 Joseph Salisbury bug task added linux (Ubuntu Bionic)
2018-06-04 19:51:17 Joseph Salisbury linux (Ubuntu Bionic): status New In Progress
2018-06-04 19:51:21 Joseph Salisbury linux (Ubuntu Bionic): importance Undecided High
2018-06-04 19:51:24 Joseph Salisbury linux (Ubuntu Bionic): assignee Joseph Salisbury (jsalisbury)
2018-06-04 20:44:43 Frank Heimes ubuntu-power-systems: status New In Progress
2018-06-05 16:14:26 Joseph Salisbury bug task deleted linux (Ubuntu Cosmic)
2018-06-05 16:23:20 Joseph Salisbury description == Comment: #0 - PAVAMAN SUBRAMANIYAM <> - 2018-04-25 01:59:10 == ---Problem Description--- WARNING: CPU: 97 PID: 11965 at /build/linux-0zaMZw/linux-4.15.0/kernel/sched/core.c:1189 set_task_cpu+0x240/0x250 ---uname output--- Linux ltc-wspoon8 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux Machine Type = P9 ---Debugger--- A debugger is not configured ---Steps to Reproduce--- Install a P9 Open Power Hardware with the latest OP910.20 Firmware images. root@witherspoon:~# cat /etc/os-release ID="openbmc-phosphor" NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro)" VERSION="ibm-v2.0" VERSION_ID="ibm-v2.0-0-r46-0-gbed584c" PRETTY_NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro) ibm-v2.0" BUILD_ID="ibm-v2.0-0-r46" root@witherspoon:~# cat /var/lib/phosphor-software-manager/pnor/ro/VERSION open-power-witherspoon-v1.21.2-251-ge2e9363-dirty buildroot-2017.11-5-g65679be skiboot-v5.10.3-op910-1-p240231e hostboot-0aa5bed linux-4.14.24-openpower1-p3e84190 petitboot-v1.6.6-pd7224b4 machine-xml-22224af occ-8c5b727 hostboot-binaries-9bd4056 capp-ucode-p9-dd2-v3 sbe-7e02c23 Then we have installed the Ubuntu 18.04 OS on the machine. 
root@ltc-wspoon8:~# uname -a Linux ltc-wspoon8 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux root@ltc-wspoon8:~# cat /etc/os-release NAME="Ubuntu" VERSION="18.04 LTS (Bionic Beaver)" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 18.04 LTS" VERSION_ID="18.04" HOME_URL="https://www.ubuntu.com/" SUPPORT_URL="https://help.ubuntu.com/" BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" VERSION_CODENAME=bionic UBUNTU_CODENAME=bionic root@ltc-wspoon8:~# cat /proc/cpuinfo | tail cpu : POWER9, altivec supported clock : 2300.000000MHz revision : 2.1 (pvr 004e 1201) timebase : 512000000 platform : PowerNV model : 8335-GTC........ machine : PowerNV 8335-GTC........ firmware : OPAL MMU : Radix root@ltc-wspoon8:~# kdump-config show DUMP_MODE: kdump USE_KDUMP: 1 KDUMP_SYSCTL: kernel.panic_on_oops=1 KDUMP_COREDIR: /var/crash crashkernel addr: /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.15.0-20-generic kdump initrd: /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.15.0-20-generic current state: ready to kdump kexec command: /sbin/kexec -p --command-line="root=UUID=a2cd572c-9047-4f0a-843b-6996fae3e999 ro quiet splash nr_cpus=1 systemd.unit=kdump-tools.service irqpoll noirqdistrib nousb" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz root@ltc-wspoon8:~# ps -ef | grep opal root 880 2 0 01:25 ? 00:00:00 [kopald] root 3604 1 2 01:25 ? 00:00:03 /usr/sbin/opal-prd root 4858 4278 0 01:28 pts/0 00:00:00 grep --color=auto opal root@ltc-wspoon8:~# service opal-prd status 
● opal-prd.service - OPAL PRD daemon Loaded: loaded (/lib/systemd/system/opal-prd.service; enabled; vendor preset: enabled) Active: active (running) since Wed 2018-04-25 01:25:48 CDT; 2min 43s ago Docs: man:opal-prd(8) Main PID: 3604 (opal-prd) Tasks: 1 (limit: 22118) CGroup: /system.slice/opal-prd.service └─3604 /usr/sbin/opal-prd Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: IMAGE: hbrt_init complete, version 0290000000000000 Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: hservices_init done Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: calling enable_attns Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: ATTN_SLOW:I>>>ATTN_RT::enableAttns Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: ATTN_SLOW:I>Service::enableAttns() enter Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: ATTN_SLOW:I>Service::enableAttns() exit Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: ATTN_SLOW:I><<ATTN_RT::enableAttns rc: 0 Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: calling get_ipoll_events Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: enabling IPOLL events 0x5b90000000000000 Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: FW: writing init message We try to inject the Machine Check Memory UE error using scom utilities. 
root@ltc-wspoon8:~# ./probe_cpus.sh -L CHIP ID: 0 CORE ID: 0 THREADS: 4 CPUs: 0 1 2 3 CHIP ID: 0 CORE ID: 1 THREADS: 4 CPUs: 4 5 6 7 CHIP ID: 0 CORE ID: 2 THREADS: 4 CPUs: 8 9 10 11 CHIP ID: 0 CORE ID: 3 THREADS: 4 CPUs: 12 13 14 15 CHIP ID: 0 CORE ID: 4 THREADS: 4 CPUs: 16 17 18 19 CHIP ID: 0 CORE ID: 5 THREADS: 4 CPUs: 20 21 22 23 CHIP ID: 0 CORE ID: 8 THREADS: 4 CPUs: 24 25 26 27 CHIP ID: 0 CORE ID: 9 THREADS: 4 CPUs: 28 29 30 31 CHIP ID: 0 CORE ID: 10 THREADS: 4 CPUs: 32 33 34 35 CHIP ID: 0 CORE ID: 11 THREADS: 4 CPUs: 36 37 38 39 CHIP ID: 0 CORE ID: 14 THREADS: 4 CPUs: 40 41 42 43 CHIP ID: 0 CORE ID: 15 THREADS: 4 CPUs: 44 45 46 47 CHIP ID: 0 CORE ID: 16 THREADS: 4 CPUs: 48 49 50 51 CHIP ID: 0 CORE ID: 17 THREADS: 4 CPUs: 52 53 54 55 CHIP ID: 0 CORE ID: 18 THREADS: 4 CPUs: 56 57 58 59 CHIP ID: 0 CORE ID: 19 THREADS: 4 CPUs: 60 61 62 63 CHIP ID: 0 CORE ID: 22 THREADS: 4 CPUs: 64 65 66 67 CHIP ID: 0 CORE ID: 23 THREADS: 4 CPUs: 68 69 70 71 CHIP ID: 8 CORE ID: 0 THREADS: 4 CPUs: 72 73 74 75 CHIP ID: 8 CORE ID: 1 THREADS: 4 CPUs: 76 77 78 79 CHIP ID: 8 CORE ID: 2 THREADS: 4 CPUs: 80 81 82 83 CHIP ID: 8 CORE ID: 3 THREADS: 4 CPUs: 84 85 86 87 CHIP ID: 8 CORE ID: 4 THREADS: 4 CPUs: 88 89 90 91 CHIP ID: 8 CORE ID: 5 THREADS: 4 CPUs: 92 93 94 95 CHIP ID: 8 CORE ID: 6 THREADS: 4 CPUs: 96 97 98 99 CHIP ID: 8 CORE ID: 7 THREADS: 4 CPUs: 100 101 102 103 CHIP ID: 8 CORE ID: 10 THREADS: 4 CPUs: 104 105 106 107 CHIP ID: 8 CORE ID: 11 THREADS: 4 CPUs: 108 109 110 111 CHIP ID: 8 CORE ID: 12 THREADS: 4 CPUs: 112 113 114 115 CHIP ID: 8 CORE ID: 13 THREADS: 4 CPUs: 116 117 118 119 CHIP ID: 8 CORE ID: 14 THREADS: 4 CPUs: 120 121 122 123 CHIP ID: 8 CORE ID: 15 THREADS: 4 CPUs: 124 125 126 127 CHIP ID: 8 CORE ID: 16 THREADS: 4 CPUs: 128 129 130 131 CHIP ID: 8 CORE ID: 17 THREADS: 4 CPUs: 132 133 134 135 CHIP ID: 8 CORE ID: 18 THREADS: 4 CPUs: 136 137 138 139 CHIP ID: 8 CORE ID: 19 THREADS: 4 CPUs: 140 141 142 143 ----------------------------- p[0] eq[0,1,2,3,4,5] 
ex[0,1,2,4,5,7,8,9,11] c[0,1,2,3,4,5,8,9,10,11,14,15,16,17,18,19,22,23] p[8] eq[0,1,2,3,4] ex[0,1,2,3,5,6,7,8,9] c[0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17,18,19] ----------------------------- ----------Processor Layout------------------- p[0] +---EQ00----+ +---EQ02----+ +---EQ04----+ |EX-0 C0 | |EX-4 C8 | |EX-8 C16| + - - - - - + + - - - - - + + - - - - - + |EX-0 C1 | |EX-4 C9 | |EX-8 C17| + - - - - - + + - - - - - + + - - - - - + |EX-1 C2 | |EX-5 C10| |EX-9 C18| + - - - - - + + - - - - - + + - - - - - + |EX-1 C3 | |EX-5 C11| |EX-9 C19| +-----------+ +-----------+ +-----------+ +---EQ01----+ +---EQ03----+ +---EQ05----+ |EX-2 C4 | | | | | + - - - - - + + - - - - - + + - - - - - + |EX-2 C5 | | | | | + - - - - - + + - - - - - + + - - - - - + | | |EX-7 C14| |EX-11 C22| + - - - - - + + - - - - - + + - - - - - + | | |EX-7 C15| |EX-11 C23| +-----------+ +-----------+ +-----------+ p[8] +---EQ00----+ +---EQ02----+ +---EQ04----+ |EX-0 C0 | | | |EX-8 C16| + - - - - - + + - - - - - + + - - - - - + |EX-0 C1 | | | |EX-8 C17| + - - - - - + + - - - - - + + - - - - - + |EX-1 C2 | |EX-5 C10| |EX-9 C18| + - - - - - + + - - - - - + + - - - - - + |EX-1 C3 | |EX-5 C11| |EX-9 C19| +-----------+ +-----------+ +-----------+ +---EQ01----+ +---EQ03----+ +---EQ05----+ |EX-2 C4 | |EX-6 C12| | | + - - - - - + + - - - - - + + - - - - - + |EX-2 C5 | |EX-6 C13| | | + - - - - - + + - - - - - + + - - - - - + |EX-3 C6 | |EX-7 C14| | | + - - - - - + + - - - - - + + - - - - - + |EX-3 C7 | |EX-7 C15| | | +-----------+ +-----------+ +-----------+ root@ltc-wspoon8:~# ./statedisable.sh ./statedisable.sh: line 10: /sys/devices/system/cpu/cpu*/cpuidle/state7/disable: No such file or directory ./statedisable.sh: line 11: /sys/devices/system/cpu/cpu*/cpuidle/state8/disable: No such file or directory root@ltc-wspoon8:~# ./run_workload.sh root@ltc-wspoon8:~# ./scom_addr_p9.sh 0x1001080c 7 EQ[ 1]: 0x1101080c EX[ 3]: 0x11010c0c C[ 7]: 0x3701080c root@ltc-wspoon8:~# ./skiboot/external/xscom-utils/getscom -c 0x8 
0x11010c0c 0000000000000000 root@ltc-wspoon8:~# ./skiboot/external/xscom-utils/putscom -c 0x8 0x11010c0c 0c00000000000000 0c00000000000000 We see the following call traces in the kernel and there is no MCE recovered messages which was the expected output. Ubuntu 18.04 LTS ltc-wspoon8 hvc0 ltc-wspoon8 login: [ 191.741142] Severe Machine check interrupt [Not recovered] [ 191.741160] NIP [c000000000181b08]: osq_lock+0xb8/0x210 [ 191.741161] Initiator: CPU [ 191.741163] Error type: UE [Load/Store] [ 191.741166] opal: Hardware platform error: Unrecoverable Machine Check exception [ 191.741172] CPU: 123 PID: 11888 Comm: find Tainted: G M 4.15.0-20-generic #21-Ubuntu [ 191.741174] NIP: c000000000181b08 LR: c000000000cfa740 CTR: c000000000497f90 [ 191.741177] REGS: c000000007963d80 TRAP: 0200 Tainted: G M (4.15.0-20-generic) [ 191.741178] MSR: 9000000000209033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24002882 XER: 00000000 [ 191.741188] CFAR: c000000000181b54 DAR: 00002018faf69194 DSISR: 00008000 SOFTE: 1 [ 191.741188] GPR00: c000000000cfa740 c000201857d47a30 c0000000016eae00 c0000000015c6b2c [ 191.741188] GPR04: 0000000000000000 0000000000000000 c0000000017807c0 c000000007a20000 [ 191.741188] GPR08: c0002018faf69180 c0002018fb5e9180 c0002018faf69180 0000000000000000 [ 191.741188] GPR12: 0000000084002888 c000000007a74900 00000d02693c2b80 0000000000000000 [ 191.741188] GPR16: 0000000000000000 ffffffffffffff9c 00007fffc6e73f68 00000d02693e9510 [ 191.741188] GPR20: 0000000000000001 0000000000000000 fffffffffffffff6 0000000000000000 [ 191.741188] GPR24: c000201857d47c90 c0002018cdad201c fffffffffffff000 0000000000000004 [ 191.741188] GPR28: 0000000000000002 c0000000015c6b2c 0000000000000001 c0000000015c6b20 [ 191.741219] NIP [c000000000181b08] osq_lock+0xb8/0x210 [ 191.741224] LR [c000000000cfa740] __mutex_lock.isra.0+0x440/0x6e0 [ 191.741225] Call Trace: [ 191.741229] [c000201857d47a30] [c000000000cfa338] __mutex_lock.isra.0+0x38/0x6e0 (unreliable) [ 191.741234] [c000201857d47ac0] 
[c000000000497fe0] kernfs_iop_permission+0x50/0xb0 [ 191.741238] [c000201857d47b00] [c0000000003e43f4] __inode_permission+0x1a4/0x270 [ 191.741241] [c000201857d47b50] [c0000000003e8bcc] link_path_walk+0x62c/0x6c0 [ 191.741243] [c000201857d47bf0] [c0000000003eacbc] path_openat+0xac/0x3e0 [ 191.741247] [c000201857d47c70] [c0000000003ec570] do_filp_open+0x80/0x120 [ 191.741253] [c000201857d47da0] [c0000000003cfae8] do_sys_open+0x248/0x3f0 [ 191.741257] [c000201857d47e30] [c00000000000b184] system_call+0x58/0x6c [ 191.741259] Instruction dump: [ 191.741261] 81490010 2faa0000 409e0160 782a0464 e94a0080 714a0004 40820068 3cc20009 [ 191.741267] 38c659c0 60420000 e9490008 e8e60000 <814a0014> 394affff 7d4a07b4 1d4a0b00 [ 191.743669] Severe Machine check interrupt [Recovered] [ 191.743706] NIP [c000000000181b3c]: osq_lock+0xec/0x210 [ 191.743740] Initiator: CPU [ 191.743766] Error type: UE [Load/Store] [ 191.743811] WARNING: CPU: 97 PID: 11965 at /build/linux-0zaMZw/linux-4.15.0/kernel/sched/core.c:1189 set_task_cpu+0x240/0x250 [ 191.743885] Modules linked in: binfmt_misc ofpart cmdlinepart idt_89hpesx at24 opal_prd powernv_flash ipmi_powernv ipmi_devintf mtd vmx_crypto uio_pdrv_genirq ipmi_msghandler uio ibmpowernv sch_fq_codel ib_iser rdma_cm iw_cm ib_cm iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mlx5_ib ib_core nouveau ast i2c_algo_bit mlx5_core ttm drm_kms_helper syscopyarea sysfillrect uas sysimgblt fb_sys_fops usb_storage ahci mlxfw crct10dif_vpmsum crc32c_vpmsum drm tg3 libahci devlink [ 191.744292] CPU: 97 PID: 11965 Comm: find Tainted: G M 4.15.0-20-generic #21-Ubuntu [ 191.744350] NIP: c00000000014d6e0 LR: c00000000014e30c CTR: c00000000015a240 [ 191.744401] REGS: c00020185d9eb1e0 TRAP: 0700 Tainted: G M (4.15.0-20-generic) [ 191.744458] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> 
CR: 28008284 XER: 00000000 [ 191.744516] CFAR: c00000000014d54c SOFTE: 0 [ 191.744516] GPR00: c00000000014e30c c00020185d9eb460 c0000000016eae00 c000001f14647300 [ 191.744516] GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 191.744516] GPR08: c000000001721ee0 0000000000000000 0000000000000000 9000000000001003 [ 191.744516] GPR12: 0000000028008224 c000000007a62b00 000003b558292b80 0000000000000000 [ 191.744516] GPR16: 0000000000000000 ffffffffffffff9c 00007fffdde92a38 000003b5582bbef0 [ 191.744516] GPR20: 0000000000000001 0000000000000000 fffffffffffffff6 c00020185d9eb5e0 [ 191.744516] GPR24: c000001f14647728 c00000000171dd78 c0000000011d8580 0000000000000000 [ 191.744516] GPR28: 0000000000000004 0000000000000000 0000000000000000 c000001f14647300 [ 191.749630] NIP [c00000000014d6e0] set_task_cpu+0x240/0x250 [ 191.749709] LR [c00000000014e30c] try_to_wake_up+0x1bc/0x660 [ 191.749804] Call Trace: [ 191.749846] [c00020185d9eb460] [c0000000011d8580] runqueues+0x0/0xc00 (unreliable) [ 191.749943] [c00020185d9eb4a0] [c00000000014e30c] try_to_wake_up+0x1bc/0x660 [ 191.750072] [c00020185d9eb520] [c0000000001725d8] autoremove_wake_function+0x28/0x70 [ 191.750199] [c00020185d9eb550] [c000000000171b60] __wake_up_common+0xd0/0x200 [ 191.750316] [c00020185d9eb5c0] [c000000000171d4c] __wake_up_common_lock+0xbc/0x110 [ 191.750444] [c00020185d9eb650] [c00000000018ea40] wake_up_klogd_work_func+0x60/0xc0 [ 191.750573] [c00020185d9eb680] [c000000000295d10] irq_work_run_list+0xb0/0x100 [ 191.750713] [c00020185d9eb6c0] [c000000000024ab4] __timer_interrupt+0x254/0x260 [ 191.750841] [c00020185d9eb710] [c000000000024d08] timer_interrupt+0x98/0xe0 [ 191.750949] [c00020185d9eb740] [c000000000009014] decrementer_common+0x114/0x120 [ 191.751079] --- interrupt: 901 at osq_lock+0xec/0x210 [ 191.751079] LR = __mutex_lock.isra.0+0x440/0x6e0 [ 191.751255] [c00020185d9eba30] [c000000000cfa338] __mutex_lock.isra.0+0x38/0x6e0 (unreliable) [ 191.751402] [c00020185d9ebac0] 
[c000000000497fe0] kernfs_iop_permission+0x50/0xb0 [ 191.751530] [c00020185d9ebb00] [c0000000003e43f4] __inode_permission+0x1a4/0x270 [ 191.751658] [c00020185d9ebb50] [c0000000003e8bcc] link_path_walk+0x62c/0x6c0 [ 191.751785] [c00020185d9ebbf0] [c0000000003eacbc] path_openat+0xac/0x3e0 [ 191.751894] [c00020185d9ebc70] [c0000000003ec570] do_filp_open+0x80/0x120 [ 191.752003] [c00020185d9ebda0] [c0000000003cfae8] do_sys_open+0x248/0x3f0 [ 191.752112] [c00020185d9ebe30] [c00000000000b184] system_call+0x58/0x6c [ 191.752229] Instruction dump: [ 191.752299] 7faa3670 7d4a0194 57a706be 7d4a07b4 794a1f24 7d28502a 7d293c36 71290001 [ 191.752441] 4082fe80 60000000 60000000 60420000 <0fe00000> 4bfffe6c 60000000 60420000 [ 191.752584] ---[ end trace 032f502244013ba3 ]--- [ 309.237017153,0] OPAL: Reboot requested due to Platform error. [ 309.237089038,3] OPAL: Reboot requested due to Platform error.[ 309.237145569,5] Software initiated checkstop disabled. [ 309.237200666,5] OPAL: Reboot request... [ 309.247531874,5] Unable to log error Stack trace output: [ 191.749804] Call Trace: [ 191.749846] [c00020185d9eb460] [c0000000011d8580] runqueues+0x0/0xc00 (unreliable) [ 191.749943] [c00020185d9eb4a0] [c00000000014e30c] try_to_wake_up+0x1bc/0x660 [ 191.750072] [c00020185d9eb520] [c0000000001725d8] autoremove_wake_function+0x28/0x70 [ 191.750199] [c00020185d9eb550] [c000000000171b60] __wake_up_common+0xd0/0x200 [ 191.750316] [c00020185d9eb5c0] [c000000000171d4c] __wake_up_common_lock+0xbc/0x110 [ 191.750444] [c00020185d9eb650] [c00000000018ea40] wake_up_klogd_work_func+0x60/0xc0 [ 191.750573] [c00020185d9eb680] [c000000000295d10] irq_work_run_list+0xb0/0x100 [ 191.750713] [c00020185d9eb6c0] [c000000000024ab4] __timer_interrupt+0x254/0x260 [ 191.750841] [c00020185d9eb710] [c000000000024d08] timer_interrupt+0x98/0xe0 [ 191.750949] [c00020185d9eb740] [c000000000009014] decrementer_common+0x114/0x120 [ 191.751079] --- interrupt: 901 at osq_lock+0xec/0x210 [ 191.751079] LR = 
__mutex_lock.isra.0+0x440/0x6e0 [ 191.751255] [c00020185d9eba30] [c000000000cfa338] __mutex_lock.isra.0+0x38/0x6e0 (unreliable) [ 191.751402] [c00020185d9ebac0] [c000000000497fe0] kernfs_iop_permission+0x50/0xb0 [ 191.751530] [c00020185d9ebb00] [c0000000003e43f4] __inode_permission+0x1a4/0x270 [ 191.751658] [c00020185d9ebb50] [c0000000003e8bcc] link_path_walk+0x62c/0x6c0 [ 191.751785] [c00020185d9ebbf0] [c0000000003eacbc] path_openat+0xac/0x3e0 [ 191.751894] [c00020185d9ebc70] [c0000000003ec570] do_filp_open+0x80/0x120 [ 191.752003] [c00020185d9ebda0] [c0000000003cfae8] do_sys_open+0x248/0x3f0 [ 191.752112] [c00020185d9ebe30] [c00000000000b184] system_call+0x58/0x6c == Comment: #1 - PAVAMAN SUBRAMANIYAM <> - 2018-04-25 02:03:31 == I had a discussion with Mahesh about this bug and he has suggested to try out with the Patch which has been posted upstream in the below link: http://patchwork.ozlabs.org/patch/902735/ == Comment: #8 - PAVAMAN SUBRAMANIYAM <> - 2018-06-01 03:09:16 == Can we have the patch http://patchwork.ozlabs.org/patch/902735/ which is in upstream to be merged to Ubuntu 18.04 release. == SRU Justification == IBM reports seeing the following during their testing: WARNING: CPU: 97 PID: 11965 at /build/linux-0zaMZw/linux-4.15.0/kernel/sched/core.c:1189 set_task_cpu+0x240/0x250 This is a regression and was introduced by the following two commits in v4.15-rc1: 01eaac2b0591 ("powerpc/mce: Hookup ierror (instruction) UE errors") ba41e1e1ccb9 ("powerpc/mce: Hookup derror (load/store) UE errors") This regression is fixed by commit 75ecfb49516c in v4.17-rc3. The commit was also cc'd to upstream stable, but it is being SRU'd to get the fix into Ubuntu without waiting for it to come down via stable updates. == Fix == 75ecfb49516c ("powerpc/mce: Fix a bug where mce loops on memory UE.") == Regression Potential == Low. Limited to powerpc. The commit was also cc'd to upstream stable so it will receive additional upstream stable review. 
== Test Case == A test kernel was built with this patch and tested by the original bug reporter. The bug reporter states the test kernel resolved the bug. == Original Bug Descriptions == == Comment: #0 - PAVAMAN SUBRAMANIYAM <> - 2018-04-25 01:59:10 == ---Problem Description--- WARNING: CPU: 97 PID: 11965 at /build/linux-0zaMZw/linux-4.15.0/kernel/sched/core.c:1189 set_task_cpu+0x240/0x250 ---uname output--- Linux ltc-wspoon8 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux Machine Type = P9 ---Debugger--- A debugger is not configured ---Steps to Reproduce--- Install a P9 Open Power Hardware with the latest OP910.20 Firmware images. root@witherspoon:~# cat /etc/os-release ID="openbmc-phosphor" NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro)" VERSION="ibm-v2.0" VERSION_ID="ibm-v2.0-0-r46-0-gbed584c" PRETTY_NAME="Phosphor OpenBMC (Phosphor OpenBMC Project Reference Distro) ibm-v2.0" BUILD_ID="ibm-v2.0-0-r46" root@witherspoon:~# cat /var/lib/phosphor-software-manager/pnor/ro/VERSION open-power-witherspoon-v1.21.2-251-ge2e9363-dirty         buildroot-2017.11-5-g65679be         skiboot-v5.10.3-op910-1-p240231e         hostboot-0aa5bed         linux-4.14.24-openpower1-p3e84190         petitboot-v1.6.6-pd7224b4         machine-xml-22224af         occ-8c5b727         hostboot-binaries-9bd4056         capp-ucode-p9-dd2-v3         sbe-7e02c23 Then we have installed the Ubuntu 18.04 OS on the machine. 
root@ltc-wspoon8:~# uname -a Linux ltc-wspoon8 4.15.0-20-generic #21-Ubuntu SMP Tue Apr 24 06:14:44 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux root@ltc-wspoon8:~# cat /etc/os-release NAME="Ubuntu" VERSION="18.04 LTS (Bionic Beaver)" ID=ubuntu ID_LIKE=debian PRETTY_NAME="Ubuntu 18.04 LTS" VERSION_ID="18.04" HOME_URL="https://www.ubuntu.com/" SUPPORT_URL="https://help.ubuntu.com/" BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/" PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy" VERSION_CODENAME=bionic UBUNTU_CODENAME=bionic root@ltc-wspoon8:~# cat /proc/cpuinfo | tail cpu : POWER9, altivec supported clock : 2300.000000MHz revision : 2.1 (pvr 004e 1201) timebase : 512000000 platform : PowerNV model : 8335-GTC........ machine : PowerNV 8335-GTC........ firmware : OPAL MMU : Radix root@ltc-wspoon8:~# kdump-config show DUMP_MODE: kdump USE_KDUMP: 1 KDUMP_SYSCTL: kernel.panic_on_oops=1 KDUMP_COREDIR: /var/crash crashkernel addr:    /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinux-4.15.0-20-generic kdump initrd:    /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-4.15.0-20-generic current state: ready to kdump kexec command:   /sbin/kexec -p --command-line="root=UUID=a2cd572c-9047-4f0a-843b-6996fae3e999 ro quiet splash nr_cpus=1 systemd.unit=kdump-tools.service irqpoll noirqdistrib nousb" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz root@ltc-wspoon8:~# ps -ef | grep opal root 880 2 0 01:25 ? 00:00:00 [kopald] root 3604 1 2 01:25 ? 00:00:03 /usr/sbin/opal-prd root 4858 4278 0 01:28 pts/0 00:00:00 grep --color=auto opal root@ltc-wspoon8:~# service opal-prd status 
● opal-prd.service - OPAL PRD daemon    Loaded: loaded (/lib/systemd/system/opal-prd.service; enabled; vendor preset: enabled)    Active: active (running) since Wed 2018-04-25 01:25:48 CDT; 2min 43s ago      Docs: man:opal-prd(8)  Main PID: 3604 (opal-prd)     Tasks: 1 (limit: 22118)    CGroup: /system.slice/opal-prd.service            └─3604 /usr/sbin/opal-prd Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: IMAGE: hbrt_init complete, version 0290000000000000 Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: hservices_init done Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: calling enable_attns Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: ATTN_SLOW:I>>>ATTN_RT::enableAttns Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: ATTN_SLOW:I>Service::enableAttns() enter Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: ATTN_SLOW:I>Service::enableAttns() exit Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: ATTN_SLOW:I><<ATTN_RT::enableAttns rc: 0 Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: calling get_ipoll_events Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: HBRT: enabling IPOLL events 0x5b90000000000000 Apr 25 01:25:52 ltc-wspoon8 opal-prd[3604]: FW: writing init message We try to inject the Machine Check Memory UE error using scom utilities. 
root@ltc-wspoon8:~# ./probe_cpus.sh -L CHIP ID: 0 CORE ID: 0 THREADS: 4 CPUs: 0 1 2 3 CHIP ID: 0 CORE ID: 1 THREADS: 4 CPUs: 4 5 6 7 CHIP ID: 0 CORE ID: 2 THREADS: 4 CPUs: 8 9 10 11 CHIP ID: 0 CORE ID: 3 THREADS: 4 CPUs: 12 13 14 15 CHIP ID: 0 CORE ID: 4 THREADS: 4 CPUs: 16 17 18 19 CHIP ID: 0 CORE ID: 5 THREADS: 4 CPUs: 20 21 22 23 CHIP ID: 0 CORE ID: 8 THREADS: 4 CPUs: 24 25 26 27 CHIP ID: 0 CORE ID: 9 THREADS: 4 CPUs: 28 29 30 31 CHIP ID: 0 CORE ID: 10 THREADS: 4 CPUs: 32 33 34 35 CHIP ID: 0 CORE ID: 11 THREADS: 4 CPUs: 36 37 38 39 CHIP ID: 0 CORE ID: 14 THREADS: 4 CPUs: 40 41 42 43 CHIP ID: 0 CORE ID: 15 THREADS: 4 CPUs: 44 45 46 47 CHIP ID: 0 CORE ID: 16 THREADS: 4 CPUs: 48 49 50 51 CHIP ID: 0 CORE ID: 17 THREADS: 4 CPUs: 52 53 54 55 CHIP ID: 0 CORE ID: 18 THREADS: 4 CPUs: 56 57 58 59 CHIP ID: 0 CORE ID: 19 THREADS: 4 CPUs: 60 61 62 63 CHIP ID: 0 CORE ID: 22 THREADS: 4 CPUs: 64 65 66 67 CHIP ID: 0 CORE ID: 23 THREADS: 4 CPUs: 68 69 70 71 CHIP ID: 8 CORE ID: 0 THREADS: 4 CPUs: 72 73 74 75 CHIP ID: 8 CORE ID: 1 THREADS: 4 CPUs: 76 77 78 79 CHIP ID: 8 CORE ID: 2 THREADS: 4 CPUs: 80 81 82 83 CHIP ID: 8 CORE ID: 3 THREADS: 4 CPUs: 84 85 86 87 CHIP ID: 8 CORE ID: 4 THREADS: 4 CPUs: 88 89 90 91 CHIP ID: 8 CORE ID: 5 THREADS: 4 CPUs: 92 93 94 95 CHIP ID: 8 CORE ID: 6 THREADS: 4 CPUs: 96 97 98 99 CHIP ID: 8 CORE ID: 7 THREADS: 4 CPUs: 100 101 102 103 CHIP ID: 8 CORE ID: 10 THREADS: 4 CPUs: 104 105 106 107 CHIP ID: 8 CORE ID: 11 THREADS: 4 CPUs: 108 109 110 111 CHIP ID: 8 CORE ID: 12 THREADS: 4 CPUs: 112 113 114 115 CHIP ID: 8 CORE ID: 13 THREADS: 4 CPUs: 116 117 118 119 CHIP ID: 8 CORE ID: 14 THREADS: 4 CPUs: 120 121 122 123 CHIP ID: 8 CORE ID: 15 THREADS: 4 CPUs: 124 125 126 127 CHIP ID: 8 CORE ID: 16 THREADS: 4 CPUs: 128 129 130 131 CHIP ID: 8 CORE ID: 17 THREADS: 4 CPUs: 132 133 134 135 CHIP ID: 8 CORE ID: 18 THREADS: 4 CPUs: 136 137 138 139 CHIP ID: 8 CORE ID: 19 THREADS: 4 CPUs: 140 141 142 143 ----------------------------- p[0]    eq[0,1,2,3,4,5]    
ex[0,1,2,4,5,7,8,9,11]     c[0,1,2,3,4,5,8,9,10,11,14,15,16,17,18,19,22,23] p[8]    eq[0,1,2,3,4]    ex[0,1,2,3,5,6,7,8,9]     c[0,1,2,3,4,5,6,7,10,11,12,13,14,15,16,17,18,19] ----------------------------- ----------Processor Layout------------------- p[0]         +---EQ00----+ +---EQ02----+ +---EQ04----+         |EX-0 C0 | |EX-4 C8 | |EX-8 C16|         + - - - - - + + - - - - - + + - - - - - +         |EX-0 C1 | |EX-4 C9 | |EX-8 C17|         + - - - - - + + - - - - - + + - - - - - +         |EX-1 C2 | |EX-5 C10| |EX-9 C18|         + - - - - - + + - - - - - + + - - - - - +         |EX-1 C3 | |EX-5 C11| |EX-9 C19|         +-----------+ +-----------+ +-----------+         +---EQ01----+ +---EQ03----+ +---EQ05----+         |EX-2 C4 | | | | |         + - - - - - + + - - - - - + + - - - - - +         |EX-2 C5 | | | | |         + - - - - - + + - - - - - + + - - - - - +         | | |EX-7 C14| |EX-11 C22|         + - - - - - + + - - - - - + + - - - - - +         | | |EX-7 C15| |EX-11 C23|         +-----------+ +-----------+ +-----------+ p[8]         +---EQ00----+ +---EQ02----+ +---EQ04----+         |EX-0 C0 | | | |EX-8 C16|         + - - - - - + + - - - - - + + - - - - - +         |EX-0 C1 | | | |EX-8 C17|         + - - - - - + + - - - - - + + - - - - - +         |EX-1 C2 | |EX-5 C10| |EX-9 C18|         + - - - - - + + - - - - - + + - - - - - +         |EX-1 C3 | |EX-5 C11| |EX-9 C19|         +-----------+ +-----------+ +-----------+         +---EQ01----+ +---EQ03----+ +---EQ05----+         |EX-2 C4 | |EX-6 C12| | |         + - - - - - + + - - - - - + + - - - - - +         |EX-2 C5 | |EX-6 C13| | |         + - - - - - + + - - - - - + + - - - - - +         |EX-3 C6 | |EX-7 C14| | |         + - - - - - + + - - - - - + + - - - - - +         |EX-3 C7 | |EX-7 C15| | |         +-----------+ +-----------+ +-----------+ root@ltc-wspoon8:~# ./statedisable.sh ./statedisable.sh: line 10: /sys/devices/system/cpu/cpu*/cpuidle/state7/disable: No such file or directory ./statedisable.sh: 
line 11: /sys/devices/system/cpu/cpu*/cpuidle/state8/disable: No such file or directory root@ltc-wspoon8:~# ./run_workload.sh root@ltc-wspoon8:~# ./scom_addr_p9.sh 0x1001080c 7 EQ[ 1]: 0x1101080c EX[ 3]: 0x11010c0c  C[ 7]: 0x3701080c root@ltc-wspoon8:~# ./skiboot/external/xscom-utils/getscom -c 0x8 0x11010c0c 0000000000000000 root@ltc-wspoon8:~# ./skiboot/external/xscom-utils/putscom -c 0x8 0x11010c0c 0c00000000000000 0c00000000000000 We see the following call traces in the kernel and there is no MCE recovered messages which was the expected output. Ubuntu 18.04 LTS ltc-wspoon8 hvc0 ltc-wspoon8 login: [ 191.741142] Severe Machine check interrupt [Not recovered] [ 191.741160] NIP [c000000000181b08]: osq_lock+0xb8/0x210 [ 191.741161] Initiator: CPU [ 191.741163] Error type: UE [Load/Store] [ 191.741166] opal: Hardware platform error: Unrecoverable Machine Check exception [ 191.741172] CPU: 123 PID: 11888 Comm: find Tainted: G M 4.15.0-20-generic #21-Ubuntu [ 191.741174] NIP: c000000000181b08 LR: c000000000cfa740 CTR: c000000000497f90 [ 191.741177] REGS: c000000007963d80 TRAP: 0200 Tainted: G M (4.15.0-20-generic) [ 191.741178] MSR: 9000000000209033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24002882 XER: 00000000 [ 191.741188] CFAR: c000000000181b54 DAR: 00002018faf69194 DSISR: 00008000 SOFTE: 1 [ 191.741188] GPR00: c000000000cfa740 c000201857d47a30 c0000000016eae00 c0000000015c6b2c [ 191.741188] GPR04: 0000000000000000 0000000000000000 c0000000017807c0 c000000007a20000 [ 191.741188] GPR08: c0002018faf69180 c0002018fb5e9180 c0002018faf69180 0000000000000000 [ 191.741188] GPR12: 0000000084002888 c000000007a74900 00000d02693c2b80 0000000000000000 [ 191.741188] GPR16: 0000000000000000 ffffffffffffff9c 00007fffc6e73f68 00000d02693e9510 [ 191.741188] GPR20: 0000000000000001 0000000000000000 fffffffffffffff6 0000000000000000 [ 191.741188] GPR24: c000201857d47c90 c0002018cdad201c fffffffffffff000 0000000000000004 [ 191.741188] GPR28: 0000000000000002 c0000000015c6b2c 0000000000000001 
c0000000015c6b20 [ 191.741219] NIP [c000000000181b08] osq_lock+0xb8/0x210 [ 191.741224] LR [c000000000cfa740] __mutex_lock.isra.0+0x440/0x6e0 [ 191.741225] Call Trace: [ 191.741229] [c000201857d47a30] [c000000000cfa338] __mutex_lock.isra.0+0x38/0x6e0 (unreliable) [ 191.741234] [c000201857d47ac0] [c000000000497fe0] kernfs_iop_permission+0x50/0xb0 [ 191.741238] [c000201857d47b00] [c0000000003e43f4] __inode_permission+0x1a4/0x270 [ 191.741241] [c000201857d47b50] [c0000000003e8bcc] link_path_walk+0x62c/0x6c0 [ 191.741243] [c000201857d47bf0] [c0000000003eacbc] path_openat+0xac/0x3e0 [ 191.741247] [c000201857d47c70] [c0000000003ec570] do_filp_open+0x80/0x120 [ 191.741253] [c000201857d47da0] [c0000000003cfae8] do_sys_open+0x248/0x3f0 [ 191.741257] [c000201857d47e30] [c00000000000b184] system_call+0x58/0x6c [ 191.741259] Instruction dump: [ 191.741261] 81490010 2faa0000 409e0160 782a0464 e94a0080 714a0004 40820068 3cc20009 [ 191.741267] 38c659c0 60420000 e9490008 e8e60000 <814a0014> 394affff 7d4a07b4 1d4a0b00 [ 191.743669] Severe Machine check interrupt [Recovered] [ 191.743706] NIP [c000000000181b3c]: osq_lock+0xec/0x210 [ 191.743740] Initiator: CPU [ 191.743766] Error type: UE [Load/Store] [ 191.743811] WARNING: CPU: 97 PID: 11965 at /build/linux-0zaMZw/linux-4.15.0/kernel/sched/core.c:1189 set_task_cpu+0x240/0x250 [ 191.743885] Modules linked in: binfmt_misc ofpart cmdlinepart idt_89hpesx at24 opal_prd powernv_flash ipmi_powernv ipmi_devintf mtd vmx_crypto uio_pdrv_genirq ipmi_msghandler uio ibmpowernv sch_fq_codel ib_iser rdma_cm iw_cm ib_cm iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mlx5_ib ib_core nouveau ast i2c_algo_bit mlx5_core ttm drm_kms_helper syscopyarea sysfillrect uas sysimgblt fb_sys_fops usb_storage ahci mlxfw crct10dif_vpmsum crc32c_vpmsum drm tg3 libahci devlink [ 
191.744292] CPU: 97 PID: 11965 Comm: find Tainted: G M 4.15.0-20-generic #21-Ubuntu [ 191.744350] NIP: c00000000014d6e0 LR: c00000000014e30c CTR: c00000000015a240 [ 191.744401] REGS: c00020185d9eb1e0 TRAP: 0700 Tainted: G M (4.15.0-20-generic) [ 191.744458] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28008284 XER: 00000000 [ 191.744516] CFAR: c00000000014d54c SOFTE: 0 [ 191.744516] GPR00: c00000000014e30c c00020185d9eb460 c0000000016eae00 c000001f14647300 [ 191.744516] GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [ 191.744516] GPR08: c000000001721ee0 0000000000000000 0000000000000000 9000000000001003 [ 191.744516] GPR12: 0000000028008224 c000000007a62b00 000003b558292b80 0000000000000000 [ 191.744516] GPR16: 0000000000000000 ffffffffffffff9c 00007fffdde92a38 000003b5582bbef0 [ 191.744516] GPR20: 0000000000000001 0000000000000000 fffffffffffffff6 c00020185d9eb5e0 [ 191.744516] GPR24: c000001f14647728 c00000000171dd78 c0000000011d8580 0000000000000000 [ 191.744516] GPR28: 0000000000000004 0000000000000000 0000000000000000 c000001f14647300 [ 191.749630] NIP [c00000000014d6e0] set_task_cpu+0x240/0x250 [ 191.749709] LR [c00000000014e30c] try_to_wake_up+0x1bc/0x660 [ 191.749804] Call Trace: [ 191.749846] [c00020185d9eb460] [c0000000011d8580] runqueues+0x0/0xc00 (unreliable) [ 191.749943] [c00020185d9eb4a0] [c00000000014e30c] try_to_wake_up+0x1bc/0x660 [ 191.750072] [c00020185d9eb520] [c0000000001725d8] autoremove_wake_function+0x28/0x70 [ 191.750199] [c00020185d9eb550] [c000000000171b60] __wake_up_common+0xd0/0x200 [ 191.750316] [c00020185d9eb5c0] [c000000000171d4c] __wake_up_common_lock+0xbc/0x110 [ 191.750444] [c00020185d9eb650] [c00000000018ea40] wake_up_klogd_work_func+0x60/0xc0 [ 191.750573] [c00020185d9eb680] [c000000000295d10] irq_work_run_list+0xb0/0x100 [ 191.750713] [c00020185d9eb6c0] [c000000000024ab4] __timer_interrupt+0x254/0x260 [ 191.750841] [c00020185d9eb710] [c000000000024d08] timer_interrupt+0x98/0xe0 [ 191.750949] 
[c00020185d9eb740] [c000000000009014] decrementer_common+0x114/0x120 [ 191.751079] --- interrupt: 901 at osq_lock+0xec/0x210 [ 191.751079] LR = __mutex_lock.isra.0+0x440/0x6e0 [ 191.751255] [c00020185d9eba30] [c000000000cfa338] __mutex_lock.isra.0+0x38/0x6e0 (unreliable) [ 191.751402] [c00020185d9ebac0] [c000000000497fe0] kernfs_iop_permission+0x50/0xb0 [ 191.751530] [c00020185d9ebb00] [c0000000003e43f4] __inode_permission+0x1a4/0x270 [ 191.751658] [c00020185d9ebb50] [c0000000003e8bcc] link_path_walk+0x62c/0x6c0 [ 191.751785] [c00020185d9ebbf0] [c0000000003eacbc] path_openat+0xac/0x3e0 [ 191.751894] [c00020185d9ebc70] [c0000000003ec570] do_filp_open+0x80/0x120 [ 191.752003] [c00020185d9ebda0] [c0000000003cfae8] do_sys_open+0x248/0x3f0 [ 191.752112] [c00020185d9ebe30] [c00000000000b184] system_call+0x58/0x6c [ 191.752229] Instruction dump: [ 191.752299] 7faa3670 7d4a0194 57a706be 7d4a07b4 794a1f24 7d28502a 7d293c36 71290001 [ 191.752441] 4082fe80 60000000 60000000 60420000 <0fe00000> 4bfffe6c 60000000 60420000 [ 191.752584] ---[ end trace 032f502244013ba3 ]--- [ 309.237017153,0] OPAL: Reboot requested due to Platform error. [ 309.237089038,3] OPAL: Reboot requested due to Platform error.[ 309.237145569,5] Software initiated checkstop disabled. [ 309.237200666,5] OPAL: Reboot request... 
[ 309.247531874,5] Unable to log error Stack trace output:  [ 191.749804] Call Trace: [ 191.749846] [c00020185d9eb460] [c0000000011d8580] runqueues+0x0/0xc00 (unreliable) [ 191.749943] [c00020185d9eb4a0] [c00000000014e30c] try_to_wake_up+0x1bc/0x660 [ 191.750072] [c00020185d9eb520] [c0000000001725d8] autoremove_wake_function+0x28/0x70 [ 191.750199] [c00020185d9eb550] [c000000000171b60] __wake_up_common+0xd0/0x200 [ 191.750316] [c00020185d9eb5c0] [c000000000171d4c] __wake_up_common_lock+0xbc/0x110 [ 191.750444] [c00020185d9eb650] [c00000000018ea40] wake_up_klogd_work_func+0x60/0xc0 [ 191.750573] [c00020185d9eb680] [c000000000295d10] irq_work_run_list+0xb0/0x100 [ 191.750713] [c00020185d9eb6c0] [c000000000024ab4] __timer_interrupt+0x254/0x260 [ 191.750841] [c00020185d9eb710] [c000000000024d08] timer_interrupt+0x98/0xe0 [ 191.750949] [c00020185d9eb740] [c000000000009014] decrementer_common+0x114/0x120 [ 191.751079] --- interrupt: 901 at osq_lock+0xec/0x210 [ 191.751079] LR = __mutex_lock.isra.0+0x440/0x6e0 [ 191.751255] [c00020185d9eba30] [c000000000cfa338] __mutex_lock.isra.0+0x38/0x6e0 (unreliable) [ 191.751402] [c00020185d9ebac0] [c000000000497fe0] kernfs_iop_permission+0x50/0xb0 [ 191.751530] [c00020185d9ebb00] [c0000000003e43f4] __inode_permission+0x1a4/0x270 [ 191.751658] [c00020185d9ebb50] [c0000000003e8bcc] link_path_walk+0x62c/0x6c0 [ 191.751785] [c00020185d9ebbf0] [c0000000003eacbc] path_openat+0xac/0x3e0 [ 191.751894] [c00020185d9ebc70] [c0000000003ec570] do_filp_open+0x80/0x120 [ 191.752003] [c00020185d9ebda0] [c0000000003cfae8] do_sys_open+0x248/0x3f0 [ 191.752112] [c00020185d9ebe30] [c00000000000b184] system_call+0x58/0x6c == Comment: #1 - PAVAMAN SUBRAMANIYAM <> - 2018-04-25 02:03:31 == I discussed this bug with Mahesh, and he suggested trying the patch that has been posted upstream at the link below: http://patchwork.ozlabs.org/patch/902735/ == Comment: #8 - PAVAMAN SUBRAMANIYAM <> - 2018-06-01 03:09:16 == Can we have the 
patch http://patchwork.ozlabs.org/patch/902735/, which is upstream, merged into the Ubuntu 18.04 release?
2018-06-07 19:14:40 Khaled El Mously linux (Ubuntu Bionic): status In Progress Fix Committed
2018-06-07 20:21:57 Frank Heimes ubuntu-power-systems: status In Progress Fix Committed
2018-07-19 19:12:02 Joseph Salisbury linux (Ubuntu): status In Progress Fix Committed
2018-09-19 12:49:06 Joseph Salisbury linux (Ubuntu Bionic): status Fix Committed Fix Released
2018-09-19 12:49:09 Joseph Salisbury linux (Ubuntu): status Fix Committed Fix Released
2018-09-19 12:49:17 Joseph Salisbury ubuntu-power-systems: status Fix Committed Fix Released
2019-07-24 21:05:12 Brad Figg tags architecture-ppc64le bugnameltc-167176 p9 severity-high targetmilestone-inin1804 triage-g architecture-ppc64le bugnameltc-167176 cscc p9 severity-high targetmilestone-inin1804 triage-g