Comment 13 for bug 1848127

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2019-11-06 01:01 EDT-------
I installed both the Bionic *and* Disco kernels available in the PPA: https://launchpad.net/~ubuntu-power-triage/+archive/ubuntu/lp1848127

I then re-ran the MCE UE tests on the machine with both kernels.

root@ltc-wspoon4:~# apt-get install linux-image-unsigned-5.0.0-33-generic/disco
Reading package lists... Done
Building dependency tree
Reading state information... Done
Selected version '5.0.0-33.35~lp1848127+build.1' (lp1848127:19.04/disco [ppc64el]) for 'linux-image-unsigned-5.0.0-33-generic'
The following additional packages will be installed:
linux-modules-5.0.0-33-generic
Suggested packages:
fdutils linux-doc-5.0.0 | linux-source-5.0.0 linux-headers-5.0.0-33-generic
The following NEW packages will be installed:
linux-image-unsigned-5.0.0-33-generic linux-modules-5.0.0-33-generic
0 upgraded, 2 newly installed, 0 to remove and 3 not upgraded.
Need to get 20.7 MB of archives.
After this operation, 106 MB of additional disk space will be used.
Do you want to continue? [Y/n] Y
Get:1 http://ppa.launchpad.net/ubuntu-power-triage/lp1848127/ubuntu disco/main ppc64el linux-modules-5.0.0-33-generic ppc64el 5.0.0-33.35~lp1848127+build.1 [14.0 MB]
Get:2 http://ppa.launchpad.net/ubuntu-power-triage/lp1848127/ubuntu disco/main ppc64el linux-image-unsigned-5.0.0-33-generic ppc64el 5.0.0-33.35~lp1848127+build.1 [6,748 kB]
Fetched 20.7 MB in 13s (1,546 kB/s)
Selecting previously unselected package linux-modules-5.0.0-33-generic.
(Reading database ... 71699 files and directories currently installed.)
Preparing to unpack .../linux-modules-5.0.0-33-generic_5.0.0-33.35~lp1848127+build.1_ppc64el.deb ...
Unpacking linux-modules-5.0.0-33-generic (5.0.0-33.35~lp1848127+build.1) ...
Selecting previously unselected package linux-image-unsigned-5.0.0-33-generic.
Preparing to unpack .../linux-image-unsigned-5.0.0-33-generic_5.0.0-33.35~lp1848127+build.1_ppc64el.deb ...
Unpacking linux-image-unsigned-5.0.0-33-generic (5.0.0-33.35~lp1848127+build.1) ...
Setting up linux-modules-5.0.0-33-generic (5.0.0-33.35~lp1848127+build.1) ...
Setting up linux-image-unsigned-5.0.0-33-generic (5.0.0-33.35~lp1848127+build.1) ...
I: /boot/vmlinux is now a symlink to vmlinux-5.0.0-33-generic
I: /boot/initrd.img is now a symlink to initrd.img-5.0.0-33-generic
Processing triggers for linux-image-unsigned-5.0.0-33-generic (5.0.0-33.35~lp1848127+build.1) ...
/etc/kernel/postinst.d/initramfs-tools:
update-initramfs: Generating /boot/initrd.img-5.0.0-33-generic
cryptsetup: WARNING: The initramfs image may not contain cryptsetup binaries
nor crypto modules. If that's on purpose, you may want to uninstall the
'cryptsetup-initramfs' package in order to disable the cryptsetup initramfs
integration and avoid this warning.
W: Possible missing firmware /lib/firmware/ast_dp501_fw.bin for module ast
/etc/kernel/postinst.d/zz-update-grub:
Sourcing file `/etc/default/grub'
Sourcing file `/etc/default/grub.d/init-select.cfg'
Generating grub configuration file ...
Found linux image: /boot/vmlinux-5.0.0-33-generic
Found initrd image: /boot/initrd.img-5.0.0-33-generic
Found linux image: /boot/vmlinux-5.0.0-32-generic
Found initrd image: /boot/initrd.img-5.0.0-32-generic
done

root@ltc-wspoon4:~# uname -a
Linux ltc-wspoon4 5.0.0-33-generic #35~lp1848127+build.1-Ubuntu SMP Mon Oct 28 20:12:03 UTC 2019 ppc64le ppc64le ppc64le GNU/Linux
root@ltc-wspoon4:~# cat /etc/os-release
NAME="Ubuntu"
VERSION="19.04 (Disco Dingo)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 19.04"
VERSION_ID="19.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=disco
UBUNTU_CODENAME=disco

root@ltc-wspoon4:~# ./statedisable.sh
./statedisable.sh: line 10: /sys/devices/system/cpu/cpu*/cpuidle/state7/disable: No such file or directory
./statedisable.sh: line 11: /sys/devices/system/cpu/cpu*/cpuidle/state8/disable: No such file or directory
root@ltc-wspoon4:~# ./run_workload.sh
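
The two statedisable.sh errors above have the form the shell prints when a redirection target contains a glob that matches nothing, which would suggest that no CPU exposes state7 or state8 under this kernel. The script itself is not shown in this comment, so the following is only a sketch of a variant that touches just the cpuidle states that exist. The function name disable_idle_states and its optional root argument are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not the statedisable.sh used in this test.
# Writes 1 to every cpuidle "disable" file found under the given root
# (default: the real sysfs CPU directory) and prints how many were written.
disable_idle_states() {
    root="${1:-/sys/devices/system/cpu}"
    count=0
    for f in "$root"/cpu*/cpuidle/state*/disable; do
        # An unmatched glob is left as a literal string, so skip
        # anything that is not an existing file.
        [ -f "$f" ] || continue
        echo 1 > "$f" && count=$((count + 1))
    done
    echo "$count"
}
```

Run as root, calling disable_idle_states with no argument would act on /sys/devices/system/cpu and skip states that are absent instead of reporting an error for them.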

root@ltc-wspoon4:~# ./scom_addr_p9.sh 0x1001080c 6
EQ[ 1]: 0x1101080c
EX[ 3]: 0x11010c0c
C[ 6]: 0x3601080c

root@ltc-wspoon4:~# getscom -c 0x8 0x11010c0c
0000000000000000
root@ltc-wspoon4:~# putscom -c 0x8 0x11010c0c 0c00000000000000
0c00000000000000

ltc-wspoon4 login: [ 442.228985] NIP [c00000000019ae5c]: osq_lock+0x15c/0x230
[ 442.228985] Initiator: CPU
[ 442.228986] Error type: UE [Load/Store]
[ 442.228987] Effective address: c000201cc76a9600
[ 442.228988] Physical address: 0000201cc76a0000
[ 442.228988] opal: Hardware platform error: Unrecoverable Machine Check exception
[ 442.228989] CPU: 109 PID: 9095 Comm: find Tainted: G M 5.0.0-33-generic #35~lp1848127+build.1-Ubuntu
[ 442.228990] NIP: c00000000019ae5c LR: c000000000e000a0 CTR: c000000000446e30
[ 442.228991] REGS: c000201fff24bd70 TRAP: 0200 Tainted: G M (5.0.0-33-generic)
[ 442.228992] MSR: 9000000000209033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 48002222 XER: 00000000
[ 442.228996] CFAR: c00000000019ae34 DAR: c000201cc76a9600 DSISR: 00008000 IRQMASK: 0
[ 442.228998] GPR00: c000000000e000a0 c000201c87babc30 c00000000184cb00 c000000001731abc
[ 442.229001] GPR04: 0000000000000000 0000000000000000 c000000001885c78 0000000000000000
[ 442.229003] GPR08: c000201cc76a9600 c000201cc7b69600 0000000000000004 ffffffffffffffea
[ 442.229005] GPR12: 0000000088002228 c000201fff686d80 00000ed7ab1e2b80 00000ed7ab1e2b80
[ 442.229008] GPR16: 00000ed7ab1f0e30 00000ed7ab1eec30 0000000000000101 00007fffc662d8b8
[ 442.229010] GPR20: 0000000000000000 0000000000030000 000000000001a9b7 0000000000000018
[ 442.229012] GPR24: c000001fc28a9dc8 c000201c7710c500 0000000000000000 c000000001731ab0
[ 442.229014] GPR28: 0000000000000002 c000000001731abc c000201c87babdb0 c000000001731ab0
[ 442.229017] NIP [c00000000019ae5c] osq_lock+0x15c/0x230
[ 442.229018] LR [c000000000e000a0] __mutex_lock.isra.1+0x90/0x710
[ 442.229018] Call Trace:
[ 442.229019] [c000201c87babc30] [c000000000e00054] __mutex_lock.isra.1+0x44/0x710 (unreliable)
[ 442.229020] [c000201c87babcd0] [c0000000004facd0] kernfs_fop_readdir+0x200/0x3b0
[ 577.498732581,0] OPAL: Reboot requested due to Platform error.
[ 577.498806187,3] OPAL: Reboot requested due to Platform error.
[ 442.229022] [c000201c87babd40] [c000000000446300] iterate_dir+0x200/0x280
[ 442.229023] [c000201c87babd90] [c0000000004472a0] ksys_getdents64+0xa0/0x1a0
[ 442.229024] [c000201c87babe00] [c0000000004473c8] sys_getdents64+0x28/0x110
[ 442.229025] [c000201c87babe20] [c00000000000b288] system_call+0x5c/0x70
[ 442.229026] Instruction dump:
[ 442.229027] 60000000 38e00000 48000028 60000000 60000000 81490010 7c2004ac 2faa0000
[ 442.229030] 409effd4 7c210b78 7c421378 e9090008 <e9480000> 7faa4800 409effdc 7c0004ac
[ 443.416541] Disabling lock debugging due to kernel taint
[ 443.416543] Severe Machine check interrupt [Not recovered]
[ 443.416544] NIP [c00000000019ad88]: osq_lock+0x88/0x230
[ 443.416544] Initiator: CPU
[ 443.416545] Error type: UE [Load/Store]
[ 443.416545] Effective address: c000201cc76a9610
[ 443.416546] Physical address: 0000201cc76a0000
[ 443.416547] opal: Hardware platform error: Unrecoverable Machine Check exception
[ 443.416548] CPU: 90 PID: 9020 Comm: find Tainted: G M 5.0.0-33-generic #35~lp1848127+build.1-Ubuntu
[ 443.416549] NIP: c00000000019ad88 LR: c000000000e000a0 CTR: c0000000004f8d60
[ 443.416550] REGS: c000201fff32fd70 TRAP: 0200 Tainted: G M (5.0.0-33-generic)
[ 443.416551] MSR: 9000000000209033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 24002224 XER: 00000000
[ 443.416555] CFAR: c000000000e0009c DAR: 0000201cc76a9610 DSISR: 00008000 IRQMASK: 0
[ 443.416557] GPR00: c000000000e000a0 c000201c8f81fbc0 c00000000184cb00 c000000001731abc
[ 443.416559] GPR04: 0000000000000000 0000000000000000 0000201cc6370000 c000000001339600
[ 443.416561] GPR08: c000001ffec29600 c000201cc76a9600 0000001ffd8f0000 c000001f96936300
[ 443.416564] GPR12: 0000000084002228 c000201fff69ba00 00000334527e2b80 0000000000000000
[ 443.416566] GPR16: 0000000000000000 000003345280d440 0000000000000101 00007fffcaffe858
[ 443.416568] GPR20: 0000000000000000 00007fffcaffe7c8 0000000000000000 0000000000000006
[ 443.416570] GPR24: 000077194c155308 00000000000007ff c000201c8f81fd80 000003345280d548
[ 443.416572] GPR28: 0000000000000002 c000000001731abc c000000001731ab0 c000000001731ab0
[ 443.416575] NIP [c00000000019ad88] osq_lock+0x88/0x230
[ 443.416576] LR [c000000000e000a0] __mutex_lock.isra.1+0x90/0x710
[ 443.416576] Call Trace:
[ 443.416577] [c000201c8f81fbc0] [c000000000e00054] __mutex_lock.isra.1+0x44/0x710 (unreliable)
[ 443.416578] [c000201c8f81fc60] [c0000000004f8dac] kernfs_iop_getattr+0x4c/0xa0
[ 443.416579] [c000201c8f81fca0] [c00000000042eac0] vfs_getattr_nosec+0x90/0xf0
[ 443.416581] [c000201c8f81fce0] [c00000000042ed68] vfs_statx+0xc8/0x190
[ 443.416582] [c000201c8f81fd60] [c00000000042f128] sys_newfstatat+0x48/0x90
[ 443.416583] [c000201c8f81fe20] [c00000000000b288] system_call+0x5c/0x70
[ 443.416584] Instruction dump:
[ 443.416584] 2faa0000 419e00c4 394affff 3d020003 39085170 7d4a07b4 794a1f24 7d48502a
[ 443.416587] 7d075214 f9090008 7c2004ac 7d27512a <81490010> 2faa0000 409e0090 782a0464
[ 577.500377001,3] ___________________________________________________________
[ 577.500429242,3] < Dangerous NVRAM option: opal-sw-xstop=enable
[ 577.500480635,3] -----------------------------------------------------------
[ 577.500520165,3] \
[ 577.500562271,3] \ WW
[ 577.500614905,3] <^ \___/|
[ 577.500657283,3] \ /
[ 577.500704560,3] \_ _/
[ 577.500743890,3] }{

The Linux host did not hang; it rebooted and came back up after the above injection.
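
To compare the machine check events in a capture like the one above, a small filter can pair each "Effective address" line with the "Physical address" line that follows it. This is a minimal sketch that assumes the console log keeps the line format shown in this comment. The function name mce_addresses is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read a console log on stdin and print one line per
# machine check event in the form "<effective address> <physical address>".
mce_addresses() {
    awk '
        /Effective address:/ { ea = $NF }        # remember the latest EA
        /Physical address:/  { print ea, $NF }   # emit the pair on the PA line
    '
}
```

Fed the log in this comment (for example, mce_addresses < console.log, where console.log is an assumed file name), it prints c000201cc76a9600 and c000201cc76a9610 as the effective addresses, both with physical address 0000201cc76a0000.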