Comment 2 for bug 1774964

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-06-05 01:26 EDT-------
I downloaded the test kernel from the link and installed it.

root@ltc-wspoon8:~# dpkg -i linux-modules-4.15.0-22-generic_4.15.0-22.25~lp1774964_ppc64el.deb
Selecting previously unselected package linux-modules-4.15.0-22-generic.
(Reading database ... 74250 files and directories currently installed.)
Preparing to unpack linux-modules-4.15.0-22-generic_4.15.0-22.25~lp1774964_ppc64el.deb ...
Unpacking linux-modules-4.15.0-22-generic (4.15.0-22.25~lp1774964) ...
Setting up linux-modules-4.15.0-22-generic (4.15.0-22.25~lp1774964) ...

root@ltc-wspoon8:~# dpkg -i linux-image-unsigned-4.15.0-22-generic_4.15.0-22.25~lp1774964_ppc64el.deb
Selecting previously unselected package linux-image-unsigned-4.15.0-22-generic.
(Reading database ... 80003 files and directories currently installed.)
Preparing to unpack linux-image-unsigned-4.15.0-22-generic_4.15.0-22.25~lp1774964_ppc64el.deb ...
Unpacking linux-image-unsigned-4.15.0-22-generic (4.15.0-22.25~lp1774964) ...
Setting up linux-image-unsigned-4.15.0-22-generic (4.15.0-22.25~lp1774964) ...
I: /boot/vmlinux is now a symlink to vmlinux-4.15.0-22-generic
I: /boot/initrd.img is now a symlink to initrd.img-4.15.0-22-generic
Processing triggers for linux-image-unsigned-4.15.0-22-generic (4.15.0-22.25~lp1774964) ...
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-4.15.0-22-generic
W: Possible missing firmware /lib/firmware/ast_dp501_fw.bin for module ast
/etc/kernel/postinst.d/kdump-tools:
kdump-tools: Generating /var/lib/kdump/initrd.img-4.15.0-22-generic
W: Possible missing firmware /lib/firmware/ast_dp501_fw.bin for module ast
/etc/kernel/postinst.d/zz-update-grub:
Generating grub configuration file ...
Found linux image: /boot/vmlinux-4.15.0-22-generic
Found initrd image: /boot/initrd.img-4.15.0-22-generic
Found linux image: /boot/vmlinux-4.15.0-20-generic
Found initrd image: /boot/initrd.img-4.15.0-20-generic
done

root@ltc-wspoon8:~# dpkg -i linux-modules-extra-4.15.0-22-generic_4.15.0-22.25~lp1774964_ppc64el.deb
(Reading database ... 80006 files and directories currently installed.)
Preparing to unpack linux-modules-extra-4.15.0-22-generic_4.15.0-22.25~lp1774964_ppc64el.deb ...
Unpacking linux-modules-extra-4.15.0-22-generic (4.15.0-22.25~lp1774964) over (4.15.0-22.25~lp1774964) ...
Setting up linux-modules-extra-4.15.0-22-generic (4.15.0-22.25~lp1774964) ...
Processing triggers for linux-image-unsigned-4.15.0-22-generic (4.15.0-22.25~lp1774964) ...
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-4.15.0-22-generic
W: Possible missing firmware /lib/firmware/ast_dp501_fw.bin for module ast
/etc/kernel/postinst.d/kdump-tools:
kdump-tools: Generating /var/lib/kdump/initrd.img-4.15.0-22-generic
W: Possible missing firmware /lib/firmware/ast_dp501_fw.bin for module ast
/etc/kernel/postinst.d/zz-update-grub:
Generating grub configuration file ...
Found linux image: /boot/vmlinux-4.15.0-22-generic
Found initrd image: /boot/initrd.img-4.15.0-22-generic
Found linux image: /boot/vmlinux-4.15.0-20-generic
Found initrd image: /boot/initrd.img-4.15.0-20-generic
done

I then booted the test kernel, triggered the test scenario again, and verified the bug against it.

root@ltc-wspoon8:~# uname -a
Linux ltc-wspoon8 4.15.0-22-generic #25~lp1774964 SMP Mon Jun 4 19:51:41 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
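
A quick way to confirm that the machine really booted the test build, rather than the stock 4.15.0-22 kernel, is to look for the build tag in the kernel version string. The following is a minimal sketch; the helper name running_test_kernel is hypothetical, and the tag lp1774964 is simply the one visible in the uname output above.

```shell
# Hypothetical helper: succeed only if the kernel build string contains the
# given tag. The optional second argument lets a caller pass a captured
# version string; by default the string comes from `uname -v`.
running_test_kernel() {
    tag="$1"
    version="${2:-$(uname -v)}"
    case "$version" in
        *"$tag"*) return 0 ;;
        *)        return 1 ;;
    esac
}
```

For example, `running_test_kernel lp1774964 && echo "test kernel is running"` could be run before starting the workload.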

root@ltc-wspoon8:~# ./statedisable.sh
./statedisable.sh: line 8: /sys/devices/system/cpu/cpu*/cpuidle/state5/disable: No such file or directory
./statedisable.sh: line 9: /sys/devices/system/cpu/cpu*/cpuidle/state6/disable: No such file or directory
./statedisable.sh: line 10: /sys/devices/system/cpu/cpu*/cpuidle/state7/disable: No such file or directory
./statedisable.sh: line 11: /sys/devices/system/cpu/cpu*/cpuidle/state8/disable: No such file or directory
root@ltc-wspoon8:~# ./run_workload.sh
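
The "No such file or directory" messages from statedisable.sh above suggest that the script redirects into a glob path and that state5 through state8 are not exposed on this machine, since an unmatched glob in a redirection is left as a literal path. The script itself is not shown in this report, so the following is only a sketch of how such a step could skip missing states; the function name disable_idle_states and its arguments are assumptions.

```shell
# Sketch only: the real statedisable.sh is not part of this report.
# Writes 1 to each existing cpuidle "disable" file for the given state
# numbers and skips states that the kernel does not expose.
disable_idle_states() {
    cpu_root="$1"; shift            # e.g. /sys/devices/system/cpu
    for state in "$@"; do
        for f in "$cpu_root"/cpu[0-9]*/cpuidle/state"$state"/disable; do
            # An unmatched glob stays literal, so check before writing.
            if [ -w "$f" ]; then
                echo 1 > "$f"
            else
                echo "skipping state$state: $f not present" >&2
            fi
        done
    done
}
```

It could be called as `disable_idle_states /sys/devices/system/cpu 5 6 7 8`; looping over the matches also avoids redirecting into a pattern that matches more than one file.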

root@ltc-wspoon8:~# ./scom_addr_p9.sh 0x1001080c 7
EQ[ 1]: 0x1101080c
EX[ 3]: 0x11010c0c
C[ 7]: 0x3701080c

root@ltc-wspoon8:~# ./skiboot/external/xscom-utils/getscom -c 0x8 0x11010c0c
0000000000000000

root@ltc-wspoon8:~# ./skiboot/external/xscom-utils/putscom -c 0x8 0x11010c0c 0c00000000000000
0c00000000000000
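
For readers who want to see how the three addresses printed by scom_addr_p9.sh relate to the base address, here is a small arithmetic sketch. It is inferred only from the single sample above (base 0x1001080c, core 7) and is not taken from the script, which is not shown here; the function name scom_addrs and the bit positions are assumptions that happen to reproduce that sample.

```shell
# Illustrative arithmetic only, inferred from one sample in this comment.
# It is NOT the actual scom_addr_p9.sh and may not hold for other inputs.
scom_addrs() {
    base=$(( $1 ))
    core=$(( $2 ))
    eq=$(( core / 4 ))    # quad index, assuming 4 cores per quad
    ex=$(( core / 2 ))    # core-pair index, assuming 2 cores per pair
    eq_addr=$(( base | ((0x10 + eq) << 24) ))
    ex_addr=$(( eq_addr | ((ex & 1) << 10) ))
    c_addr=$(( base | ((0x20 + core) << 24) ))
    printf 'EQ[%2d]: 0x%08x\n' "$eq" "$eq_addr"
    printf 'EX[%2d]: 0x%08x\n' "$ex" "$ex_addr"
    printf 'C[%2d]: 0x%08x\n' "$core" "$c_addr"
}
```

The EX value is the one passed to getscom and putscom above.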

In the SOL console we see the following Machine Check Interrupt messages as expected.

Ubuntu 18.04 LTS ltc-wspoon8 hvc0

ltc-wspoon8 login:
[ 265.635189] Disabling lock debugging due to kernel taint
[ 265.635194] Severe Machine check interrupt [Not recovered]
[ 265.635205] NIP [c000000000ac4da8]: menu_select+0x98/0x600
[ 395.356139869,0] OPAL: Reboot requested due to Platform error.
[ 395.357920113,3] OPAL: Reboot requested due to Platform error.
[ 395.361558129,5] Software initiated checkstop disabled.
[ 395.362802920,5] OPAL: Reboot request...
[ 265.635206] Initiator: CPU
[ 265.635208] Error type: UE [Load/Store]
[ 265.635211] opal: Hardware platform error: Unrecoverable Machine Check exception
[ 265.635217] CPU: 99 PID: 0 Comm: swapper/99 Tainted: G M 4.15.0-22-generic #25~lp1774964
[ 265.635220] NIP: c000000000ac4da8 LR: c000000000ac4da4 CTR: c000000000ac4d10
[ 265.635223] REGS: c000000007af3d80 TRAP: 0200 Tainted: G M (4.15.0-22-generic)
[ 265.635225] MSR: 9000000002a0b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44004428 XER: 00000000
[ 265.635237] CFAR: c000000000184cfc DAR: 00002018fafe7240 DSISR: 00008000 SOFTE: 0
[ 265.635237] GPR00: c000000000ac4da4 c0002018ec3cbdd0 c0000000016eaf00 000000007fffffff
[ 265.635237] GPR04: c0002018fafe6fd8 0000000001359d04 c0002018fafd1014 c0000000011c1008
[ 265.635237] GPR08: c0000000017220e0 fffffffffffff000 00002018f9e10000 c0002018fafe727c
[ 265.635237] GPR12: c000000000ac4d10 c000000007a64100 c0002018ec3cbf90 0000000000000000
[ 265.635237] GPR16: 0000000000000000 c000000000047fa0 c000000000047f70 c0000000011b5380
[ 265.635237] GPR20: 0000000000000800 c000000001722494 0000000000000063 0000000000000008
[ 265.635237] GPR24: c0002018fafc0030 c0000000011d723c 0000000077359400 c0002018fafe723c
[ 265.635237] GPR28: 00002018f9e10000 c0002018fafe6fd8 c00000000161ca08 c0000000011d723c
[ 265.635275] NIP [c000000000ac4da8] menu_select+0x98/0x600
[ 265.635279] LR [c000000000ac4da4] menu_select+0x94/0x600
[ 265.635280] Call Trace:
[ 265.635287] [c0002018ec3cbdd0] [c00000000161ca20] powernv_idle_driver+0x18/0x3e8 (unreliable)
[ 265.635292] [c0002018ec3cbe50] [c000000000ac28e8] cpuidle_select+0x38/0x50
[ 265.635299] [c0002018ec3cbe70] [c000000000173f3c] do_idle+0x29c/0x330
[ 265.635303] [c0002018ec3cbec0] [c000000000174208] cpu_startup_entry+0x38/0x50
[ 265.635308] [c0002018ec3cbef0] [c00000000004a4f0] start_secondary+0x4f0/0x510
[ 265.635313] [c0002018ec3cbf90] [c00000000000ab6c] start_secondary_prolog+0x10/0x14
[ 265.635315] Instruction dump:
[ 265.635318] 38600001 4b6bf96d 60000000 7c7a1b78 e87801d8 2fa30000 419e03e8 3920f000
[ 265.635325] 7fa34840 419d03dc 4b6bff51 60000000 <813b0004> 2f890000 409e03dc 7f9a1800
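
When comparing runs, it can help to reduce a console capture like the one above to the few lines that identify the machine check. The following is a minimal sketch that filters a saved log on standard input; the function name mce_summary is hypothetical, and the patterns are based only on the message formats seen in this comment.

```shell
# Sketch: print the key machine-check lines from a console log read on stdin,
# with the leading "[ timestamp]" prefix removed.
mce_summary() {
    sed -n \
        -e 's/^.*\] \(Severe Machine check interrupt .*\)$/\1/p' \
        -e 's/^.*\] \(Initiator: .*\)$/\1/p' \
        -e 's/^.*\] \(Error type: .*\)$/\1/p' \
        -e 's/^.*\] \(NIP \[[0-9a-f]*\] .*\)$/\1/p'
}
```

For example, `mce_summary < sol-console.log`, where sol-console.log is an assumed file name for the saved SOL output.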

------- Comment From <email address hidden> 2018-06-05 01:27 EDT-------
With the test kernel, which includes all the patches, the bug is resolved.