Comment 27 for bug 1762844

bugproxy (bugproxy) wrote : Comment bridged from LTC Bugzilla

------- Comment From <email address hidden> 2018-04-21 01:53 EDT-------
This looks like an Oops similar to the earlier one in comment #39, and it starts the sequence of events:

root@boslcp3:~# [ 2837.030181] Unable to handle kernel paging request for data at address 0x00000008
[ 2837.030253] Faulting instruction address: 0xc0000000001336fc
[ 2837.030295] Oops: Kernel access of bad area, sig: 11 [#1]
[ 2837.030328] LE SMP NR_CPUS=2048 NUMA PowerNV
[ 2837.030364] Modules linked in: vhost_net vhost macvtap macvlan tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables devlink ip6table_filter ip6_tables iptable_filter rpcsec_gss_krb5 nfsv4 nfs fscache kvm_hv binfmt_misc kvm dm_service_time dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua joydev input_leds idt_89hpesx mac_hid vmx_crypto crct10dif_vpmsum at24 ofpart cmdlinepart uio_pdrv_genirq uio powernv_flash mtd ibmpowernv ipmi_powernv ipmi_devintf ipmi_msghandler opal_prd nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq ses enclosure hid_generic
[ 2837.030909] usbhid hid qla2xxx ast i2c_algo_bit ttm ixgbe drm_kms_helper mpt3sas nvme_fc syscopyarea sysfillrect nvme_fabrics sysimgblt fb_sys_fops nvme_core raid_class crc32c_vpmsum drm i40e scsi_transport_sas scsi_transport_fc mdio aacraid
[ 2837.031053] CPU: 145 PID: 1182 Comm: kworker/145:1 Not tainted 4.15.0-18-generic #19
[ 2837.031107] NIP: c0000000001336fc LR: c000000000133cf8 CTR: c000000000cfefa0
[ 2837.031156] REGS: c000200e44c77a10 TRAP: 0300 Not tainted (4.15.0-18-generic)
[ 2837.031204] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28000822 XER: 00000000
[ 2837.031257] CFAR: c000000000133cf4 DAR: 0000000000000008 DSISR: 40000000 SOFTE: 0
[ 2837.031257] GPR00: c000000000133cf8 c000200e44c77c90 c0000000016eae00 c000200e44bda5c0
[ 2837.031257] GPR04: c000000fdf6f7da0 c000200e618f7da0 c000200e618fa305 c000000fdf6f7cc8
[ 2837.031257] GPR08: c000200e6190c960 0000000000002440 0000000000000000 c00800000f04e0f8
[ 2837.031257] GPR12: 0000000000000000 c000000007a83b00 c00000000013c788 c000200e50ebf3c0
[ 2837.031257] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2837.031257] GPR20: c000200e618f7d80 0000000000000000 0000000000000000 fffffffffffffef7
[ 2837.031257] GPR24: 0000000000000402 0000000000000000 c000200e618f8100 c000000001713b00
[ 2837.031257] GPR28: c000200e618f7da0 0000000000000000 c000200e618f7d80 c000200e44bda5c0
[ 2837.031687] NIP [c0000000001336fc] process_one_work+0x3c/0x5a0
[ 2837.031727] LR [c000000000133cf8] worker_thread+0x98/0x630
[ 2837.031760] Call Trace:
[ 2837.031778] [c000200e44c77c90] [c000000000133974] process_one_work+0x2b4/0x5a0 (unreliable)
[ 2837.031828] [c000200e44c77d20] [c000000000133cf8] worker_thread+0x98/0x630
[ 2837.031885] [c000200e44c77dc0] [c00000000013c928] kthread+0x1a8/0x1b0
[ 2837.031928] [c000200e44c77e30] [c00000000000b528] ret_from_kernel_thread+0x5c/0xb4
[ 2837.031976] Instruction dump:
[ 2837.032001] 60000000 7d908026 fba1ffe8 fbc1fff0 91810008 f821ff71 e9240000 712a0004
[ 2837.032052] 793d05e4 40820008 3ba00000 ebc30048 <e93d0008> 815e0010 81290100 714a0004
[ 2837.032104] ---[ end trace ae121b1a8fbe89f8 ]---

A cascading series of events follows, ending in hard lockups. However, the lockups likely occur when IPIs fail, so they are secondary events.
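
As a cross-check on the Oops above, the word marked <e93d0008> in the instruction dump can be decoded by hand. The following is a minimal Python sketch, not part of the original report: the helper name decode_ds_form and the constants are illustrative, and the register values are copied from the GPR dump and the DAR field in the trace. It assumes the marked word is a DS-form load (primary opcode 58), which is what the bit pattern indicates.

```python
# Sketch: decode the faulting PowerPC instruction word from the Oops and
# compare the computed effective address with the reported DAR.
# All names here are illustrative; the values are copied from the trace.

def decode_ds_form(word):
    """Split a 32-bit DS-form instruction into (opcode, rt, ra, disp, xo)."""
    opcode = (word >> 26) & 0x3F   # primary opcode, top 6 bits
    rt = (word >> 21) & 0x1F       # target register
    ra = (word >> 16) & 0x1F       # base register
    ds = (word >> 2) & 0x3FFF      # 14-bit displacement field
    xo = word & 0x3                # extended opcode, low 2 bits
    if ds & 0x2000:                # sign-extend the 14-bit field
        ds -= 0x4000
    return opcode, rt, ra, ds << 2, xo  # displacement is DS * 4


FAULTING_WORD = 0xE93D0008         # the <...> word in "Instruction dump"
GPRS = {29: 0x0000000000000000}    # GPR29 as shown in the register dump
DAR = 0x0000000000000008           # DAR as shown in the trace

opcode, rt, ra, disp, xo = decode_ds_form(FAULTING_WORD)

# Primary opcode 58 covers ld / ldu / lwa, selected by the low two bits.
mnemonic = {0: "ld", 1: "ldu", 2: "lwa"}.get(xo, "unknown") if opcode == 58 else "unknown"

effective_address = (GPRS[ra] + disp) & 0xFFFFFFFFFFFFFFFF

print(f"{mnemonic} r{rt}, {disp}(r{ra})")
print(f"effective address = {effective_address:#x}, matches DAR: {effective_address == DAR}")
```

Under these assumptions the word decodes to ld r9, 8(r29). With r29 equal to zero in the register dump, the load targets address 0x8, which matches the DAR. That is consistent with a NULL base pointer being dereferenced at offset 8 in process_one_work, although the decode alone does not identify which structure the pointer was meant to reference.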