Ubuntu
linux package

Comment 27 for bug 1762844


bugproxy (bugproxy) wrote on 2018-04-21: Comment bridged from LTC Bugzilla

#27

------- Comment From <email address hidden> 2018-04-21 01:53 EDT-------
Looks like an Oops similar to the previous one in comment #39, starting a sequence of events:

root@boslcp3:~# [ 2837.030181] Unable to handle kernel paging request for data at address 0x00000008
[ 2837.030253] Faulting instruction address: 0xc0000000001336fc
[ 2837.030295] Oops: Kernel access of bad area, sig: 11 [#1]
[ 2837.030328] LE SMP NR_CPUS=2048 NUMA PowerNV
[ 2837.030364] Modules linked in: vhost_net vhost macvtap macvlan tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables devlink ip6table_filter ip6_tables iptable_filter rpcsec_gss_krb5 nfsv4 nfs fscache kvm_hv binfmt_misc kvm dm_service_time dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua joydev input_leds idt_89hpesx mac_hid vmx_crypto crct10dif_vpmsum at24 ofpart cmdlinepart uio_pdrv_genirq uio powernv_flash mtd ibmpowernv ipmi_powernv ipmi_devintf ipmi_msghandler opal_prd nfsd auth_rpcgss nfs_acl lockd grace sunrpc sch_fq_codel ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq ses enclosure hid_generic
[ 2837.030909] usbhid hid qla2xxx ast i2c_algo_bit ttm ixgbe drm_kms_helper mpt3sas nvme_fc syscopyarea sysfillrect nvme_fabrics sysimgblt fb_sys_fops nvme_core raid_class crc32c_vpmsum drm i40e scsi_transport_sas scsi_transport_fc mdio aacraid
[ 2837.031053] CPU: 145 PID: 1182 Comm: kworker/145:1 Not tainted 4.15.0-18-generic #19
[ 2837.031107] NIP: c0000000001336fc LR: c000000000133cf8 CTR: c000000000cfefa0
[ 2837.031156] REGS: c000200e44c77a10 TRAP: 0300 Not tainted (4.15.0-18-generic)
[ 2837.031204] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28000822 XER: 00000000
[ 2837.031257] CFAR: c000000000133cf4 DAR: 0000000000000008 DSISR: 40000000 SOFTE: 0
[ 2837.031257] GPR00: c000000000133cf8 c000200e44c77c90 c0000000016eae00 c000200e44bda5c0
[ 2837.031257] GPR04: c000000fdf6f7da0 c000200e618f7da0 c000200e618fa305 c000000fdf6f7cc8
[ 2837.031257] GPR08: c000200e6190c960 0000000000002440 0000000000000000 c00800000f04e0f8
[ 2837.031257] GPR12: 0000000000000000 c000000007a83b00 c00000000013c788 c000200e50ebf3c0
[ 2837.031257] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[ 2837.031257] GPR20: c000200e618f7d80 0000000000000000 0000000000000000 fffffffffffffef7
[ 2837.031257] GPR24: 0000000000000402 0000000000000000 c000200e618f8100 c000000001713b00
[ 2837.031257] GPR28: c000200e618f7da0 0000000000000000 c000200e618f7d80 c000200e44bda5c0
[ 2837.031687] NIP [c0000000001336fc] process_one_work+0x3c/0x5a0
[ 2837.031727] LR [c000000000133cf8] worker_thread+0x98/0x630
[ 2837.031760] Call Trace:
[ 2837.031778] [c000200e44c77c90] [c000000000133974] process_one_work+0x2b4/0x5a0 (unreliable)
[ 2837.031828] [c000200e44c77d20] [c000000000133cf8] worker_thread+0x98/0x630
[ 2837.031885] [c000200e44c77dc0] [c00000000013c928] kthread+0x1a8/0x1b0
[ 2837.031928] [c000200e44c77e30] [c00000000000b528] ret_from_kernel_thread+0x5c/0xb4
[ 2837.031976] Instruction dump:
[ 2837.032001] 60000000 7d908026 fba1ffe8 fbc1fff0 91810008 f821ff71 e9240000 712a0004
[ 2837.032052] 793d05e4 40820008 3ba00000 ebc30048 <e93d0008> 815e0010 81290100 714a0004
[ 2837.032104] ---[ end trace ae121b1a8fbe89f8 ]---

A cascading series of events follows, ending in hard lockups. However, those lockups likely occur when IPIs fail, so they are secondary events.
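
To cross-check the fault address against the trace, the marked word in the instruction dump (<e93d0008>) can be decoded by hand. The sketch below is only an illustration, assuming the standard Power ISA DS-form layout (primary opcode 58, which covers ld, ldu and lwa). The helper names decode_ds_form and effective_address are made up for this example and are not part of any kernel tooling. The GPR29 value is copied from the register dump above.

```python
# Illustrative sketch: decode a PowerPC DS-form load and recompute the
# effective address it would access. Helper names are hypothetical.

DS_FORM_OPCODE = 58  # primary opcode shared by ld / ldu / lwa


def decode_ds_form(word: int) -> dict:
    """Split a 32-bit DS-form instruction word into its fields."""
    opcode = (word >> 26) & 0x3F
    if opcode != DS_FORM_OPCODE:
        raise ValueError(f"not a DS-form load (primary opcode {opcode})")
    rt = (word >> 21) & 0x1F      # target register
    ra = (word >> 16) & 0x1F      # base register
    disp = word & 0xFFFC          # 14-bit DS field, already scaled by 4
    if disp & 0x8000:             # sign-extend the 16-bit displacement
        disp -= 0x10000
    xo = word & 0x3               # extended opcode selects the mnemonic
    mnemonic = {0: "ld", 1: "ldu", 2: "lwa"}.get(xo, "unknown")
    return {"mnemonic": mnemonic, "rt": rt, "ra": ra, "disp": disp}


def effective_address(insn: dict, gprs: dict) -> int:
    """Compute base + displacement, wrapped to 64 bits."""
    # For the non-update forms, RA == 0 means a base of zero, not GPR0.
    base = 0 if insn["ra"] == 0 and insn["mnemonic"] != "ldu" else gprs[insn["ra"]]
    return (base + insn["disp"]) & 0xFFFFFFFFFFFFFFFF


# Faulting word from the "Instruction dump" line, GPR29 from the register dump.
insn = decode_ds_form(0xE93D0008)
addr = effective_address(insn, {29: 0x0000000000000000})
print(f'{insn["mnemonic"]} r{insn["rt"]},{insn["disp"]}(r{insn["ra"]}) -> {addr:#x}')
```

Under these assumptions the word decodes as ld r9,8(r29). With GPR29 being 0 in the dump, the computed address 0x8 matches the DAR, which is consistent with a load through a NULL pointer at offset 8 in process_one_work+0x3c. This does not by itself say why the pointer was NULL.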

