Activity log for bug #1777194

Date | Who | What changed | Old value | New value | Message
2018-06-15 21:09:49 | bugproxy | bug |  |  | added bug
2018-06-15 21:09:51 | bugproxy | tags |  | architecture-ppc64le bugnameltc-166791 severity-high targetmilestone-inin1804 |
2018-06-15 21:10:13 | bugproxy | attachment added |  | syslog https://bugs.launchpad.net/bugs/1777194/+attachment/5153136/+files/hard_lockup_tuleta.txt |
2018-06-15 21:10:28 | bugproxy | attachment added |  | sosreport https://bugs.launchpad.net/bugs/1777194/+attachment/5153137/+files/sosreport-lep8d.stressng-20180417004816.tar.xz |
2018-06-15 21:10:33 | bugproxy | attachment added |  | syslog https://bugs.launchpad.net/bugs/1777194/+attachment/5153138/+files/syslog_p9.txt |
2018-06-15 21:10:35 | bugproxy | ubuntu: assignee |  | Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) |
2018-06-15 21:10:37 | bugproxy | affects | ubuntu | linux (Ubuntu) |
2018-06-18 08:23:10 | Frank Heimes | bug task added |  | ubuntu-power-systems |
2018-06-18 08:23:51 | Frank Heimes | ubuntu-power-systems: importance | Undecided | High |
2018-06-18 08:24:01 | Frank Heimes | ubuntu-power-systems: assignee |  | Canonical Kernel Team (canonical-kernel-team) |
2018-06-18 14:18:00 | Manoj Iyer | linux (Ubuntu): assignee | Ubuntu on IBM Power Systems Bug Triage (ubuntu-power-triage) | Canonical Kernel Team (canonical-kernel-team) |
2018-06-18 14:18:03 | Manoj Iyer | linux (Ubuntu): importance | Undecided | High |
2018-06-18 14:18:31 | Frank Heimes | ubuntu-power-systems: status | New | Triaged |
2018-06-18 14:19:20 | Manoj Iyer | tags | architecture-ppc64le bugnameltc-166791 severity-high targetmilestone-inin1804 | architecture-ppc64le bugnameltc-166791 severity-high targetmilestone-inin1804 triage-g |
2018-06-18 19:39:28 | Joseph Salisbury | linux (Ubuntu): status | New | In Progress |
2018-06-18 19:39:31 | Joseph Salisbury | linux (Ubuntu): assignee | Canonical Kernel Team (canonical-kernel-team) | Joseph Salisbury (jsalisbury) |
2018-06-18 19:39:35 | Joseph Salisbury | nominated for series |  | Ubuntu Bionic |
2018-06-18 19:39:35 | Joseph Salisbury | bug task added |  | linux (Ubuntu Bionic) |
2018-06-18 19:39:41 | Joseph Salisbury | linux (Ubuntu Bionic): assignee |  | Joseph Salisbury (jsalisbury) |
2018-06-18 19:39:45 | Joseph Salisbury | linux (Ubuntu Bionic): importance | Undecided | High |
2018-06-18 19:39:49 | Joseph Salisbury | linux (Ubuntu Bionic): status | New | In Progress |
2018-06-22 16:07:34 Joseph Salisbury description --Problem Description-- Hard LOCKUP on stressing Ubuntu 18 04 ---Issue observed--- Hard LOCKUP on stressing Ubuntu 18 04 using Ubuntu 18 04, sometimes leads to rcu_stalls. Apr 17 00:00:23 lep8d kernel: [ 4309.786755] Watchdog CPU:3 Hard LOCKUP Apr 17 00:00:23 lep8d kernel: [ 4309.786759] Modules linked in: algif_rng salsa20_generic userio camellia_generic cast6_generic cast_common snd_seq snd_seq_device snd_timer snd soundcore vhost_net serpent_generic tap twofish_generic twofish_common vhost_vsock vmw_vsock_virtio_transport_common vhost vsock lrw unix_diag algif_skcipher cuse sctp tgr192 wp512 rmd320 rmd256 rmd160 hci_vhci rmd128 bluetooth ecdh_generic dccp_ipv4 md4 uhid hid algif_hash dccp af_alg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables devlink ip6table_filter ip6_tables iptable_filter kvm_hv kvm binfmt_misc uio_pdrv_genirq uio vmx_crypto ibmpowernv powernv_op_panel ipmi_powernv Apr 17 00:00:23 lep8d kernel: [ 4309.786899] ipmi_devintf ipmi_msghandler powernv_rng leds_powernv crct10dif_vpmsum sch_fq_codel ip_tables x_tables autofs4 ses enclosure scsi_transport_sas btrfs xor zstd_compress raid6_pq uas usb_storage crc32c_vpmsum tg3 ipr Apr 17 00:00:23 lep8d kernel: [ 4309.786944] CPU: 3 PID: 28361 Comm: stress-ng-hrtim Not tainted 4.15.0-15-generic #16-Ubuntu Apr 17 00:00:23 lep8d kernel: [ 4309.786950] NIP: c000000000d0c8b8 LR: c000000000120dbc CTR: c000000000024480 Apr 17 00:00:23 lep8d kernel: [ 4309.786956] REGS: c000000007f7fd80 TRAP: 0900 Not tainted (4.15.0-15-generic) Apr 17 00:00:23 lep8d kernel: [ 4309.786957] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28000442 XER: 20000000 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] CFAR: c000000000120db8 SOFTE: 0 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR00: 
c000000000120dbc c000002d51377ba0 c0000000016eb400 c000002d512b0f88 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR04: 0000000000000000 0000000000000001 0000000001f40668 0000000000000001 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR08: 0000000000000001 0000000000000000 0000000080000003 0000000000000000 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR12: c000000000024480 c000000007a22100 0000000000000000 00000000000186a0 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR16: 00007fffed3687e0 00000abda98e5db8 0000000000008005 0000000000040100 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR20: 0000000000000000 00000000418004fc 00000000003c0000 0000000008430000 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR24: c000002d512b0f88 0000000000000000 c000002d51377d30 c000002d52d47c00 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR28: c000002d51377d50 c000002d51377d50 c000002d52f04500 c000002d512b0f88 Apr 17 00:00:23 lep8d kernel: [ 4309.787035] NIP [c000000000d0c8b8] _raw_spin_lock+0x38/0xe0 Apr 17 00:00:23 lep8d kernel: [ 4309.787045] LR [c000000000120dbc] dequeue_signal+0xcc/0x260 Apr 17 00:00:23 lep8d kernel: [ 4309.787046] Call Trace: Apr 17 00:00:23 lep8d kernel: [ 4309.787052] [c000002d51377bd0] [c000000000120dac] dequeue_signal+0xbc/0x260 Apr 17 00:00:23 lep8d kernel: [ 4309.787059] [c000002d51377c20] [c00000000012459c] get_signal+0x13c/0x7a0 Apr 17 00:00:23 lep8d kernel: [ 4309.787066] [c000002d51377d10] [c00000000001dacc] do_signal+0x7c/0x2c0 Apr 17 00:00:23 lep8d kernel: [ 4309.787072] [c000002d51377e00] [c00000000001deb0] do_notify_resume+0xd0/0x100 Apr 17 00:00:23 lep8d kernel: [ 4309.787083] [c000002d51377e30] [c00000000000b7c4] ret_from_except_lite+0x70/0x74 Apr 17 00:00:23 lep8d kernel: [ 4309.787085] Instruction dump: Apr 17 00:00:23 lep8d kernel: [ 4309.787090] 7c0802a6 60000000 fbe1fff8 f821ffd1 7c7f1b78 39400000 994d028c 814d0008 Apr 17 00:00:23 lep8d kernel: [ 4309.787102] 7d201829 2c090000 40c20010 7d40192d <40c2fff0> 7c2004ac 2fa90000 409e001c Apr 
17 00:00:23 lep8d kernel: [ 4313.015781] kauditd_printk_skb: 13 callbacks suppressed ---uname output--- # uname -a Linux lep8d 4.15.0-15-generic #16-Ubuntu SMP Wed Apr 4 13:57:51 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux Machine Type = Power 8 BML/Tuleta ----Additional Info----- Hard LOCKUP is also seen on garri BML. syslog is attached. Reproducible : 90% ---Steps to Reproduce--- 1. wget https://github.com/ColinIanKing/stress-ng/archive/master.zip 2. unzip master.zip; cd stress-ng-master; 3. make; make install; 4. Run the following command multiple times stress-ng --all <nr_cpus> --vm-bytes 80% --aggressive --maximize --oomable --timeout 300 --verify --syslog --metrics --times Issue is observed on Power 9 BML machines as well. [Tue Apr 17 04:13:42 2018] Watchdog CPU:37 Hard LOCKUP [Tue Apr 17 04:13:42 2018] Modules linked in: vsock lrw algif_skcipher tgr192 wp512 rmd320 rmd256 hci_vhci unix_diag bluetooth rmd160 sctp rmd128 ecdh_generic md4 dccp_ipv4 algif_hash cuse dccp af_alg vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink binfmt_misc kvm_hv kvm dm_crypt dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua idt_89hpesx joydev input_leds mac_hid ofpart cmdlinepart vmx_crypto ipmi_powernv ipmi_devintf at24 uio_pdrv_genirq ibmpowernv opal_prd crct10dif_vpmsum powernv_flash ipmi_msghandler mtd uio sch_fq_codel ib_iser rdma_cm iw_cm ib_cm [Tue Apr 17 04:13:42 2018] ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi jc42 ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq ses enclosure scsi_transport_sas hid_generic ast i2c_algo_bit ttm drm_kms_helper usbhid hid syscopyarea sysfillrect sysimgblt fb_sys_fops crc32c_vpmsum drm i40e aacraid [Tue Apr 17 04:13:42 2018] CPU: 37 PID: 11524 Comm: 
stress-ng-hrtim Not tainted 4.15.0-15-generic #16-Ubuntu [Tue Apr 17 04:13:42 2018] NIP: c00000000012058c LR: c000000000120554 CTR: c00000000002bd30 [Tue Apr 17 04:13:42 2018] REGS: c000000007debd80 TRAP: 0900 Not tainted (4.15.0-15-generic) [Tue Apr 17 04:13:42 2018] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 22000442 XER: 00000000 [Tue Apr 17 04:13:42 2018] CFAR: c00000000011f8c4 SOFTE: 0 GPR00: c000000000120554 c0000004cfc7bd10 c0000000016eb400 c0000004cfb54200 GPR04: c0000004cfc7be00 0000000042000442 0000000000007338 0000000000040000 GPR08: c0000004cfc78080 c0000004cfc78000 0000000000000002 c000000000d10f78 GPR12: c00000000002bd30 c000000007a39700 0000000000000000 00000000000186a0 GPR16: 00007fffd2cab1d0 00000edac67c5db8 00007fffd2cab1c8 00000edac67c5dc0 GPR20: 00007fffd2cab1cc 00007fffd2cab48b ffffffffffffffff 00007fffd2cab48a GPR24: 0000000000010000 000072164cd80000 00007fffd2cab0c4 00007fffd2cab1d8 GPR28: 0000000000000000 c0000004cfc7be00 c0000004cfc7be00 c0000004cfb54200 [Tue Apr 17 04:13:42 2018] NIP [c00000000012058c] recalc_sigpending+0x5c/0x90 [Tue Apr 17 04:13:42 2018] LR [c000000000120554] recalc_sigpending+0x24/0x90 [Tue Apr 17 04:13:42 2018] Call Trace: [Tue Apr 17 04:13:42 2018] [c0000004cfc7bd10] [c00000000001db60] do_signal+0x110/0x2c0 (unreliable) [Tue Apr 17 04:13:42 2018] [c0000004cfc7bd30] [c000000000121658] __set_task_blocked+0x48/0x90 [Tue Apr 17 04:13:42 2018] [c0000004cfc7bd70] [c000000000124ed8] __set_current_blocked+0x58/0xb0 [Tue Apr 17 04:13:42 2018] [c0000004cfc7bda0] [c00000000002be18] sys_rt_sigreturn+0xe8/0x270 [Tue Apr 17 04:13:42 2018] [c0000004cfc7be30] [c00000000000b184] system_call+0x58/0x6c [Tue Apr 17 04:13:42 2018] Instruction dump: [Tue Apr 17 04:13:42 2018] e86d0260 3d220020 3929def8 81290000 2f890000 409e0030 78290464 39400002 [Tue Apr 17 04:13:42 2018] 39090080 7ce040a8 7ce75078 7ce041ad <40c2fff4> 38210020 e8010010 7c0803a6 Continuous lockups are observed. 
- Harish == == Hardware: P9 Boston/ P8 Tuleta DD revision: P9 DD2.2 Operating Env.: BML PNOR: version-SUPERMICRO-P9DSU-V1.10-20180413-imp Host OS: Ubuntu 18.04 ==== (In reply to comment #5) > hi Harish, > > Does this bug happen if you set powersave=off? I am wondering if this > problem might be related to the stop state issue. Following Issue is seen with powersave=off. This is on Power 8 BML. [ 517.480153] Modules linked in: salsa20_generic(+) camellia_generic cast6_generic cast_common serpent_generic twofish_generic twofish_common lrw algif_skcipher cuse hci_vhci bluetooth snd_seq snd_seq_device tgr192 ecdh_generic snd_timer snd wp512 soundcore rmd320 rmd256 rmd160 uhid hid vhost_net tap sctp unix_diag userio rmd128 vhost_vsock vmw_vsock_virtio_transport_common md4 dccp_ipv4 vhost vsock algif_hash dccp af_alg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter kvm_hv kvm binfmt_misc ipmi_powernv leds_powernv ipmi_devintf vmx_crypto uio_pdrv_genirq ipmi_msghandler [ 517.480305] powernv_rng crct10dif_vpmsum uio ibmpowernv powernv_op_panel sch_fq_codel ip_tables x_tables autofs4 ses enclosure scsi_transport_sas btrfs xor zstd_compress raid6_pq tg3 uas crc32c_vpmsum ipr usb_storage [ 517.480343] CPU: 73 PID: 4388 Comm: stress-ng-dev Not tainted 4.15.0-15-generic #16-Ubuntu [ 517.480347] NIP: c00000000000a724 LR: c000000000016e74 CTR: 0000000030061154 [ 517.480352] REGS: c000001fef1db890 TRAP: 0901 Not tainted (4.15.0-15-generic) [ 517.480353] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 48002224 XER: 20000000 [ 517.480368] CFAR: c000000000d0c8fc SOFTE: 1 [ 517.480368] GPR00: c00000000018c5bc c000001fef1dbb10 c0000000016eb400 0000000000000500 [ 517.480368] GPR04: 0000000000000000 c0000000000a2608 9000000000001033 
0000000000000004 [ 517.480368] GPR08: c000000007a52300 0000000000000000 0000000080000049 9000000000001003 [ 517.480368] GPR12: 0000000000000040 c000000007a52300 [ 517.480406] NIP [c00000000000a724] replay_interrupt_return+0x0/0x4 [ 517.480411] LR [c000000000016e74] arch_local_irq_restore+0x74/0x90 [ 517.480412] Call Trace: [ 517.480420] [c000001fef1dbb10] [c0000000018bd940] log_first_seq+0x0/0x8 (unreliable) [ 517.480427] [c000001fef1dbb30] [c00000000018c5bc] console_unlock+0x2fc/0x6c0 [ 517.480432] [c000001fef1dbc20] [c00000000018ccec] vprintk_emit+0x36c/0x420 [ 517.480437] [c000001fef1dbc90] [c00000000018ec54] vprintk_func+0x64/0xf0 [ 517.480442] [c000001fef1dbcb0] [c00000000018e354] printk+0x40/0x54 [ 517.480455] [c000001fef1dbcd0] [d00000001efc2f20] vsock_dev_do_ioctl.isra.4+0xb8/0xe0 [vsock] [ 517.480462] [c000001fef1dbd40] [c0000000003efc34] do_vfs_ioctl+0xd4/0xa00 [ 517.480467] [c000001fef1dbde0] [c0000000003f0624] SyS_ioctl+0xc4/0x130 [ 517.480473] [c000001fef1dbe30] [c00000000000b184] system_call+0x58/0x6c [ 517.480475] Instruction dump: [ 517.480480] 7d8000a6 e9628008 7d200026 618c8000 2c030900 4182e7f8 2c030500 4182e310 [ 517.480491] 2c030a00 4182ffa4 2c030e60 4182f090 <4e800020> 7c781b78 48000359 48000371 which leads to the Hard LOCKUP and rcu_stalls. 
[ 629.383369] Watchdog CPU:73 Hard LOCKUP [ 629.383372] Modules linked in: salsa20_generic(+) camellia_generic cast6_generic cast_common serpent_generic twofish_generic twofish_common lrw algif_skcipher cuse hci_vhci bluetooth snd_seq snd_seq_device tgr192 ecdh_generic snd_timer snd wp512 soundcore rmd320 rmd256 rmd160 uhid hid vhost_net tap sctp unix_diag userio rmd128 vhost_vsock vmw_vsock_virtio_transport_common md4 dccp_ipv4 vhost vsock algif_hash dccp af_alg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter kvm_hv kvm binfmt_misc ipmi_powernv leds_powernv ipmi_devintf vmx_crypto uio_pdrv_genirq ipmi_msghandler [ 629.383451] powernv_rng crct10dif_vpmsum uio ibmpowernv powernv_op_panel sch_fq_codel ip_tables x_tables autofs4 ses enclosure scsi_transport_sas btrfs xor zstd_compress raid6_pq tg3 uas crc32c_vpmsum ipr usb_storage [ 629.383475] CPU: 73 PID: 4388 Comm: stress-ng-dev Tainted: G L 4.15.0-15-generic #16-Ubuntu [ 629.383478] NIP: c0000000000a259c LR: c00000000009df5c CTR: 0000000030036830 [ 629.383481] REGS: c000000007c37d80 TRAP: 0900 Tainted: G L (4.15.0-15-generic) [ 629.383482] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 48002222 XER: 20000000 [ 629.383494] CFAR: c00000000009df48 SOFTE: 0 [ 629.383497] GPR00: 0000000030005128 c000001fef1dba30 c0000000016eb400 0000000000000000 [ 629.383504] GPR04: 0000000048002222 c0000000000a259c 9000000000009033 00000000000000f1 [ 629.383510] GPR08: 0000000000000000 00000000300b0218 c00000000009df70 9000000000001003 [ 629.383517] GPR12: c00000000009df48 c000000007a52300 0000714a9655a560 0000000000000000 [ 629.383524] GPR16: 0000000000000027 0000000000000027 c000000001572a00 0000000000000000 [ 629.383530] GPR20: 20c49ba5e353f7cf 0000000000000000 0000000000000017 
000000000000000d [ 629.383537] GPR24: fffffffffffffff5 0000000000000000 0000000000000010 c0000000018a8d18 [ 629.383544] GPR28: 0000000000000000 0000000000000010 c000001fef1dbad0 0000000000000010 [ 629.383551] NIP [c0000000000a259c] opal_put_chars+0x19c/0x280 [ 629.383553] LR [c00000000009df5c] opal_return+0x14/0x48 [ 629.383554] Call Trace: [ 629.383556] [c000001fef1dba30] [c0000000000a259c] opal_put_chars+0x19c/0x280 (unreliable) [ 629.383561] [c000001fef1dbab0] [c0000000008089b0] hvc_console_print+0xd0/0x210 [ 629.383564] [c000001fef1dbb30] [c00000000018c59c] console_unlock+0x2dc/0x6c0 [ 629.383568] [c000001fef1dbc20] [c00000000018ccec] vprintk_emit+0x36c/0x420 [ 629.383571] [c000001fef1dbc90] [c00000000018ec54] vprintk_func+0x64/0xf0 [ 629.383574] [c000001fef1dbcb0] [c00000000018e354] printk+0x40/0x54 [ 629.383577] [c000001fef1dbcd0] [d00000001efc2f20] vsock_dev_do_ioctl.isra.4+0xb8/0xe0 [vsock] [ 629.383581] [c000001fef1dbd40] [c0000000003efc34] do_vfs_ioctl+0xd4/0xa00 [ 629.383584] [c000001fef1dbde0] [c0000000003f0624] SyS_ioctl+0xc4/0x130 [ 629.383587] [c000001fef1dbe30] [c00000000000b184] system_call+0x58/0x6c [ 629.383589] Instruction dump: [ 629.383591] 7f03c378 38210080 eb01ffc0 eb61ffd8 4e800020 7f63db78 7f24cb78 48c6a5e1 [ 629.383600] 60000000 38600000 3b00fff5 4bffc01d <60000000> e8010090 eb210048 eb810060 [ 638.478556] INFO: rcu_sched detected stalls on CPUs/tasks: [ 638.478581] 73-....: (149 ticks this GP) idle=0aa/140000000000000/0 softirq=14950/14950 fqs=15015 [ 638.478586] (detected by 83, t=37345 jiffies, g=2733, c=2732, q=9230463) [ 638.478645] Sending NMI from CPU 83 to CPUs 73: So, actual issue might be with powersave. These issue are due to huge data dumped to console. - Harish == Harish Sriram <hasriram@in.ibm.com> == With powersave=off on a P9 DD 2.2 Boston LC, the following is observed. 
[ 119.200587] Watchdog CPU:60 Hard LOCKUP [ 119.204932] Watchdog CPU:48 Hard LOCKUP [ 119.207911] Watchdog CPU:52 Hard LOCKUP [ 119.208454] Watchdog CPU:46 Hard LOCKUP .... .... syslog will be attached. - Harish == MAHESH J. SALGAONKAR <mahesh.salgaonkar@in.ibm.com> == > > hi Harish, > > > > Does this bug happen if you set powersave=off? I am wondering if this > > problem might be related to the stop state issue. > > Following Issue is seen with powersave=off. This is on Power 8 BML. > > [ 517.480153] Modules linked in: salsa20_generic(+) camellia_generic > cast6_generic cast_common serpent_generic twofish_generic twofish_common lrw > algif_skcipher cuse hci_vhci bluetooth snd_seq snd_seq_device tgr192 > ecdh_generic snd_timer snd wp512 soundcore rmd320 rmd256 rmd160 uhid hid > vhost_net tap sctp unix_diag userio rmd128 vhost_vsock > vmw_vsock_virtio_transport_common md4 dccp_ipv4 vhost vsock algif_hash dccp > af_alg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 > iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack > nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc > devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter > kvm_hv kvm binfmt_misc ipmi_powernv leds_powernv ipmi_devintf vmx_crypto > uio_pdrv_genirq ipmi_msghandler > [ 517.480305] powernv_rng crct10dif_vpmsum uio ibmpowernv powernv_op_panel > sch_fq_codel ip_tables x_tables autofs4 ses enclosure scsi_transport_sas > btrfs xor zstd_compress raid6_pq tg3 uas crc32c_vpmsum ipr usb_storage > [ 517.480343] CPU: 73 PID: 4388 Comm: stress-ng-dev Not tainted > 4.15.0-15-generic #16-Ubuntu > [ 517.480347] NIP: c00000000000a724 LR: c000000000016e74 CTR: > 0000000030061154 > [ 517.480352] REGS: c000001fef1db890 TRAP: 0901 Not tainted > (4.15.0-15-generic) > [ 517.480353] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: > 48002224 XER: 20000000 > [ 517.480368] CFAR: c000000000d0c8fc SOFTE: 1 > [ 517.480368] GPR00: 
c00000000018c5bc c000001fef1dbb10 c0000000016eb400 > 0000000000000500 > [ 517.480368] GPR04: 0000000000000000 c0000000000a2608 9000000000001033 > 0000000000000004 > [ 517.480368] GPR08: c000000007a52300 0000000000000000 0000000080000049 > 9000000000001003 > [ 517.480368] GPR12: 0000000000000040 c000000007a52300 > [ 517.480406] NIP [c00000000000a724] replay_interrupt_return+0x0/0x4 > [ 517.480411] LR [c000000000016e74] arch_local_irq_restore+0x74/0x90 > [ 517.480412] Call Trace: > [ 517.480420] [c000001fef1dbb10] [c0000000018bd940] log_first_seq+0x0/0x8 > (unreliable) > [ 517.480427] [c000001fef1dbb30] [c00000000018c5bc] > console_unlock+0x2fc/0x6c0 > [ 517.480432] [c000001fef1dbc20] [c00000000018ccec] vprintk_emit+0x36c/0x420 > [ 517.480437] [c000001fef1dbc90] [c00000000018ec54] vprintk_func+0x64/0xf0 > [ 517.480442] [c000001fef1dbcb0] [c00000000018e354] printk+0x40/0x54 > [ 517.480455] [c000001fef1dbcd0] [d00000001efc2f20] > vsock_dev_do_ioctl.isra.4+0xb8/0xe0 [vsock] > [ 517.480462] [c000001fef1dbd40] [c0000000003efc34] do_vfs_ioctl+0xd4/0xa00 > [ 517.480467] [c000001fef1dbde0] [c0000000003f0624] SyS_ioctl+0xc4/0x130 > [ 517.480473] [c000001fef1dbe30] [c00000000000b184] system_call+0x58/0x6c > [ 517.480475] Instruction dump: > [ 517.480480] 7d8000a6 e9628008 7d200026 618c8000 2c030900 4182e7f8 > 2c030500 4182e310 > [ 517.480491] 2c030a00 4182ffa4 2c030e60 4182f090 <4e800020> 7c781b78 > 48000359 48000371 > > which leads to the Hard LOCKUP and rcu_stalls. 
> > [ 629.383369] Watchdog CPU:73 Hard LOCKUP > [ 629.383372] Modules linked in: salsa20_generic(+) camellia_generic > cast6_generic cast_common serpent_generic twofish_generic twofish_common lrw > algif_skcipher cuse hci_vhci bluetooth snd_seq snd_seq_device tgr192 > ecdh_generic snd_timer snd wp512 soundcore rmd320 rmd256 rmd160 uhid hid > vhost_net tap sctp unix_diag userio rmd128 vhost_vsock > vmw_vsock_virtio_transport_common md4 dccp_ipv4 vhost vsock algif_hash dccp > af_alg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 > iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack > nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc > devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter > kvm_hv kvm binfmt_misc ipmi_powernv leds_powernv ipmi_devintf vmx_crypto > uio_pdrv_genirq ipmi_msghandler > [ 629.383451] powernv_rng crct10dif_vpmsum uio ibmpowernv powernv_op_panel > sch_fq_codel ip_tables x_tables autofs4 ses enclosure scsi_transport_sas > btrfs xor zstd_compress raid6_pq tg3 uas crc32c_vpmsum ipr usb_storage > [ 629.383475] CPU: 73 PID: 4388 Comm: stress-ng-dev Tainted: G > L 4.15.0-15-generic #16-Ubuntu > [ 629.383478] NIP: c0000000000a259c LR: c00000000009df5c CTR: > 0000000030036830 > [ 629.383481] REGS: c000000007c37d80 TRAP: 0900 Tainted: G L > (4.15.0-15-generic) > [ 629.383482] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: > 48002222 XER: 20000000 > [ 629.383494] CFAR: c00000000009df48 SOFTE: 0 > [ 629.383497] GPR00: 0000000030005128 c000001fef1dba30 c0000000016eb400 > 0000000000000000 > [ 629.383504] GPR04: 0000000048002222 c0000000000a259c 9000000000009033 > 00000000000000f1 > [ 629.383510] GPR08: 0000000000000000 00000000300b0218 c00000000009df70 > 9000000000001003 > [ 629.383517] GPR12: c00000000009df48 c000000007a52300 0000714a9655a560 > 0000000000000000 > [ 629.383524] GPR16: 0000000000000027 0000000000000027 c000000001572a00 > 0000000000000000 > [ 
629.383530] GPR20: 20c49ba5e353f7cf 0000000000000000 0000000000000017 > 000000000000000d > [ 629.383537] GPR24: fffffffffffffff5 0000000000000000 0000000000000010 > c0000000018a8d18 > [ 629.383544] GPR28: 0000000000000000 0000000000000010 c000001fef1dbad0 > 0000000000000010 > [ 629.383551] NIP [c0000000000a259c] opal_put_chars+0x19c/0x280 > [ 629.383553] LR [c00000000009df5c] opal_return+0x14/0x48 > [ 629.383554] Call Trace: > [ 629.383556] [c000001fef1dba30] [c0000000000a259c] > opal_put_chars+0x19c/0x280 (unreliable) > [ 629.383561] [c000001fef1dbab0] [c0000000008089b0] > hvc_console_print+0xd0/0x210 > [ 629.383564] [c000001fef1dbb30] [c00000000018c59c] > console_unlock+0x2dc/0x6c0 > [ 629.383568] [c000001fef1dbc20] [c00000000018ccec] vprintk_emit+0x36c/0x420 > [ 629.383571] [c000001fef1dbc90] [c00000000018ec54] vprintk_func+0x64/0xf0 > [ 629.383574] [c000001fef1dbcb0] [c00000000018e354] printk+0x40/0x54 > [ 629.383577] [c000001fef1dbcd0] [d00000001efc2f20] > vsock_dev_do_ioctl.isra.4+0xb8/0xe0 [vsock] > [ 629.383581] [c000001fef1dbd40] [c0000000003efc34] do_vfs_ioctl+0xd4/0xa00 > [ 629.383584] [c000001fef1dbde0] [c0000000003f0624] SyS_ioctl+0xc4/0x130 > [ 629.383587] [c000001fef1dbe30] [c00000000000b184] system_call+0x58/0x6c > [ 629.383589] Instruction dump: > [ 629.383591] 7f03c378 38210080 eb01ffc0 eb61ffd8 4e800020 7f63db78 > 7f24cb78 48c6a5e1 > [ 629.383600] 60000000 38600000 3b00fff5 4bffc01d <60000000> e8010090 > eb210048 eb810060 > [ 638.478556] INFO: rcu_sched detected stalls on CPUs/tasks: > [ 638.478581] 73-....: (149 ticks this GP) idle=0aa/140000000000000/0 > softirq=14950/14950 fqs=15015 > [ 638.478586] (detected by 83, t=37345 jiffies, g=2733, c=2732, > q=9230463) > [ 638.478645] Sending NMI from CPU 83 to CPUs 73: > > So, actual issue might be with powersave. These issue are due to huge data > dumped to console. 
> > - Harish == Breno Leitao <brenohl@br.ibm.com> == > I think we probably need below upstream commit: > > commit 15b4dd7981496f51c5f9262a5e0761e48de6655f > Author: Nicholas Piggin <npiggin@gmail.com> > Date: Tue Mar 27 01:01:03 2018 +1000 > > powerpc/64s: return more carefully from sreset NMI This patch applies cleanly on top of the bionic kernel, but we definitely need to test it before sending it to Canonical for inclusion. Holding the mirror until we have an ack that this is the patch we need for this issue. == Gustavo Luiz Ferreira Walbon <gwalbon@br.ibm.com> == Breno, Two patches are missing: 6bed3237624e3faad1592543952907cd01a42c83 powerpc: use NMI IPI for smp_send_stop ac61c1156623455c46701654abd8c99720bceea1 powerpc: Fix smp_send_stop NMI IPI handling I have rebased the tree with the latest proposed kernel and built the packages again: http://pokgsa.ibm.com/gsa/pokgsa/home/g/w/gwalbon/web/public/Bug166791/ Patches: https://github.com/walbon/ubuntu-bionic/commits/bug166791 == Harish Sriram <hasriram@in.ibm.com> == > > Hard Lockups are observed when stressing the 18.04.1 kernel. 
> > > > # uname -a > > Linux ltc-wspoon12 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 17:59:00 UTC > > 2018 ppc64le ppc64le ppc64le GNU/Linux > > > > [ 5295.047807] Watchdog CPU:64 Hard LOCKUP > > [ 5295.047817] Modules linked in: snd_seq snd_seq_device snd_timer snd > > soundcore twofish_generic twofish_common lrw uhid hid algif_skcipher > > vhost_net tgr192 vhost tap wp512 hci_vhci bluetooth rmd320 ecdh_generic > > rmd256 cuse rmd160 rmd128 btrfs zstd_compress md4 algif_hash xor unix_diag > > binfmt_misc raid6_pq sctp af_alg userio dccp_ipv4 dccp xt_CHECKSUM > > iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 > > nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c > > ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables > > ip6table_filter ip6_tables iptable_filter idt_89hpesx at24 ofpart > > uio_pdrv_genirq ipmi_powernv ipmi_devintf ipmi_msghandler opal_prd uio > > cmdlinepart powernv_flash mtd ibmpowernv vmx_crypto crct10dif_vpmsum kvm_hv > > kvm sch_fq_codel > > [ 5295.047928] ip_tables x_tables autofs4 mlx5_ib ib_core nouveau ast > > i2c_algo_bit ttm mlx5_core drm_kms_helper syscopyarea sysfillrect sysimgblt > > fb_sys_fops ahci mlxfw crc32c_vpmsum drm tg3 libahci devlink > > [ 5295.047956] CPU: 64 PID: 15874 Comm: stress-ng-hrtim Not tainted > > 4.15.0-23-generic #25-Ubuntu > > [ 5295.047958] NIP: c000000000cffc04 LR: c000000000126468 CTR: > > 0000000000000000 > > [ 5295.047962] REGS: c00000003fcffd80 TRAP: 0900 Not tainted > > (4.15.0-23-generic) > > [ 5295.047963] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: > > 28000448 XER: 00000000 > > [ 5295.047973] CFAR: c000000000126464 SOFTE: 0 > > GPR00: c000000000126468 c0002036131f7c70 c0000000016eaf00 > > c0002038f4c4b208 > > GPR04: c0002036131f7d30 0000000000000000 00000ebedb3a39a4 > > 00000ebedb3bc840 > > GPR08: 0000000000000021 0000000000000000 0000000080000040 > > 0000000052000442 > > GPR12: 0000000000000001 c00000000faac000 
0000000000000000 > > 00000000000186a0 > > GPR16: 00007ffff15aaec0 00000ebedb366800 00007ffff15aaeb8 > > 00000ebedb366808 > > GPR20: 00007ffff15aaebc 00007ffff15ab17b ffffffffffffffff > > 00007ffff15ab17a > > GPR24: 0000000000010000 00007ab613340000 00007ffff15aad68 > > 00007ffff15aaec8 > > GPR28: 0000000000000000 0000000000040002 c0002036131f7cf0 > > c0002038f4c4b208 > > [ 5295.048016] NIP [c000000000cffc04] _raw_spin_lock_irq+0x44/0xe0 > > [ 5295.048021] LR [c000000000126468] __set_current_blocked+0x48/0xb0 > > [ 5295.048022] Call Trace: > > [ 5295.048027] [c0002036131f7ca0] [00007ffff15aaebc] 0x7ffff15aaebc > > [ 5295.048031] [c0002036131f7cd0] [c000000000126584] > > signal_setup_done+0x84/0xe0 > > [ 5295.048036] [c0002036131f7d10] [c00000000001dc60] do_signal+0x110/0x2c0 > > [ 5295.048040] [c0002036131f7e00] [c00000000001dfb0] > > do_notify_resume+0xd0/0x100 > > [ 5295.048045] [c0002036131f7e30] [c00000000000b8c4] > > ret_from_except_lite+0x70/0x74 > > [ 5295.048046] Instruction dump: > > [ 5295.048050] f821ffd1 7c7f1b78 39400000 892d028b 994d028b 39400000 > > 994d028d 814d0008 > > [ 5295.048074] 7d201829 2c090000 40c20010 7d40192d <40c2fff0> 7c2004ac > > 2fa90000 409e0010 > > [ 5295.272204] Watchdog CPU:64 became unstuck > > > > This is on WSP DD 2.2. > > PNOR: > > Version of System Firmware : > > Product Name : OpenPOWER Firmware > > Product Extra : version-witherspoon-ibm-OP9-v2.0-2.14 > > Product Extra : occ-77bb5e6 > > Product Extra : skiboot-v6.0.1 > > Product Extra : buildroot-2018.02.1-6-ga8d1126 > > It would be interesting to run an irqsoff latency tracer with this stress > test. It might require kernel recompile to add the irqsoff tracer option. > > Where is this hrtimer stress test from? This test is from stress-ng. == SRU Justification == IBM has seen hard lockups when running stress tests against Bionic. These tests sometimes lead to rcu_stalls. IBM found that this bug is resolved with commits 6bed3237624e and ac61c1156623. 
== Fixes == 6bed3237624e ("powerpc: use NMI IPI for smp_send_stop") ac61c1156623 ("powerpc: Fix smp_send_stop NMI IPI handling") == Regression Potential == Low. Limited to powerpc. == Test Case == A test kernel was built with these patches and tested by the original bug reporter. The bug reporter states the test kernel resolved the bug. --Problem Description-- Hard LOCKUP on stressing Ubuntu 18 04 ---Issue observed--- Hard LOCKUP on stressing Ubuntu 18 04 using Ubuntu 18 04, sometimes leads to rcu_stalls. Apr 17 00:00:23 lep8d kernel: [ 4309.786755] Watchdog CPU:3 Hard LOCKUP Apr 17 00:00:23 lep8d kernel: [ 4309.786759] Modules linked in: algif_rng salsa20_generic userio camellia_generic cast6_generic cast_common snd_seq snd_seq_device snd_timer snd soundcore vhost_net serpent_generic tap twofish_generic twofish_common vhost_vsock vmw_vsock_virtio_transport_common vhost vsock lrw unix_diag algif_skcipher cuse sctp tgr192 wp512 rmd320 rmd256 rmd160 hci_vhci rmd128 bluetooth ecdh_generic dccp_ipv4 md4 uhid hid algif_hash dccp af_alg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables devlink ip6table_filter ip6_tables iptable_filter kvm_hv kvm binfmt_misc uio_pdrv_genirq uio vmx_crypto ibmpowernv powernv_op_panel ipmi_powernv Apr 17 00:00:23 lep8d kernel: [ 4309.786899] ipmi_devintf ipmi_msghandler powernv_rng leds_powernv crct10dif_vpmsum sch_fq_codel ip_tables x_tables autofs4 ses enclosure scsi_transport_sas btrfs xor zstd_compress raid6_pq uas usb_storage crc32c_vpmsum tg3 ipr Apr 17 00:00:23 lep8d kernel: [ 4309.786944] CPU: 3 PID: 28361 Comm: stress-ng-hrtim Not tainted 4.15.0-15-generic #16-Ubuntu Apr 17 00:00:23 lep8d kernel: [ 4309.786950] NIP: c000000000d0c8b8 LR: c000000000120dbc CTR: c000000000024480 Apr 17 00:00:23 lep8d kernel: [ 4309.786956] REGS: c000000007f7fd80 
TRAP: 0900 Not tainted (4.15.0-15-generic) Apr 17 00:00:23 lep8d kernel: [ 4309.786957] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 28000442 XER: 20000000 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] CFAR: c000000000120db8 SOFTE: 0 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR00: c000000000120dbc c000002d51377ba0 c0000000016eb400 c000002d512b0f88 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR04: 0000000000000000 0000000000000001 0000000001f40668 0000000000000001 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR08: 0000000000000001 0000000000000000 0000000080000003 0000000000000000 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR12: c000000000024480 c000000007a22100 0000000000000000 00000000000186a0 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR16: 00007fffed3687e0 00000abda98e5db8 0000000000008005 0000000000040100 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR20: 0000000000000000 00000000418004fc 00000000003c0000 0000000008430000 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR24: c000002d512b0f88 0000000000000000 c000002d51377d30 c000002d52d47c00 Apr 17 00:00:23 lep8d kernel: [ 4309.786972] GPR28: c000002d51377d50 c000002d51377d50 c000002d52f04500 c000002d512b0f88 Apr 17 00:00:23 lep8d kernel: [ 4309.787035] NIP [c000000000d0c8b8] _raw_spin_lock+0x38/0xe0 Apr 17 00:00:23 lep8d kernel: [ 4309.787045] LR [c000000000120dbc] dequeue_signal+0xcc/0x260 Apr 17 00:00:23 lep8d kernel: [ 4309.787046] Call Trace: Apr 17 00:00:23 lep8d kernel: [ 4309.787052] [c000002d51377bd0] [c000000000120dac] dequeue_signal+0xbc/0x260 Apr 17 00:00:23 lep8d kernel: [ 4309.787059] [c000002d51377c20] [c00000000012459c] get_signal+0x13c/0x7a0 Apr 17 00:00:23 lep8d kernel: [ 4309.787066] [c000002d51377d10] [c00000000001dacc] do_signal+0x7c/0x2c0 Apr 17 00:00:23 lep8d kernel: [ 4309.787072] [c000002d51377e00] [c00000000001deb0] do_notify_resume+0xd0/0x100 Apr 17 00:00:23 lep8d kernel: [ 4309.787083] [c000002d51377e30] [c00000000000b7c4] ret_from_except_lite+0x70/0x74 Apr 17 
00:00:23 lep8d kernel: [ 4309.787085] Instruction dump:
Apr 17 00:00:23 lep8d kernel: [ 4309.787090] 7c0802a6 60000000 fbe1fff8 f821ffd1 7c7f1b78 39400000 994d028c 814d0008
Apr 17 00:00:23 lep8d kernel: [ 4309.787102] 7d201829 2c090000 40c20010 7d40192d <40c2fff0> 7c2004ac 2fa90000 409e001c
Apr 17 00:00:23 lep8d kernel: [ 4313.015781] kauditd_printk_skb: 13 callbacks suppressed
---uname output---
# uname -a
Linux lep8d 4.15.0-15-generic #16-Ubuntu SMP Wed Apr 4 13:57:51 UTC 2018 ppc64le ppc64le ppc64le GNU/Linux
Machine Type = Power 8 BML/Tuleta
----Additional Info-----
Hard LOCKUP is also seen on garri BML. syslog is attached.
Reproducible : 90%
---Steps to Reproduce---
1. wget https://github.com/ColinIanKing/stress-ng/archive/master.zip
2. unzip master.zip; cd stress-ng-master;
3. make; make install;
4. Run the following command multiple times:
stress-ng --all <nr_cpus> --vm-bytes 80% --aggressive --maximize --oomable --timeout 300 --verify --syslog --metrics --times
The issue is observed on Power 9 BML machines as well.
[Tue Apr 17 04:13:42 2018] Watchdog CPU:37 Hard LOCKUP [Tue Apr 17 04:13:42 2018] Modules linked in: vsock lrw algif_skcipher tgr192 wp512 rmd320 rmd256 hci_vhci unix_diag bluetooth rmd160 sctp rmd128 ecdh_generic md4 dccp_ipv4 algif_hash cuse dccp af_alg vhost_net vhost tap xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter devlink binfmt_misc kvm_hv kvm dm_crypt dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua idt_89hpesx joydev input_leds mac_hid ofpart cmdlinepart vmx_crypto ipmi_powernv ipmi_devintf at24 uio_pdrv_genirq ibmpowernv opal_prd crct10dif_vpmsum powernv_flash ipmi_msghandler mtd uio sch_fq_codel ib_iser rdma_cm iw_cm ib_cm [Tue Apr 17 04:13:42 2018] ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi jc42 ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq ses enclosure scsi_transport_sas hid_generic ast i2c_algo_bit ttm drm_kms_helper usbhid hid syscopyarea sysfillrect sysimgblt fb_sys_fops crc32c_vpmsum drm i40e aacraid [Tue Apr 17 04:13:42 2018] CPU: 37 PID: 11524 Comm: stress-ng-hrtim Not tainted 4.15.0-15-generic #16-Ubuntu [Tue Apr 17 04:13:42 2018] NIP: c00000000012058c LR: c000000000120554 CTR: c00000000002bd30 [Tue Apr 17 04:13:42 2018] REGS: c000000007debd80 TRAP: 0900 Not tainted (4.15.0-15-generic) [Tue Apr 17 04:13:42 2018] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 22000442 XER: 00000000 [Tue Apr 17 04:13:42 2018] CFAR: c00000000011f8c4 SOFTE: 0                            GPR00: c000000000120554 c0000004cfc7bd10 c0000000016eb400 c0000004cfb54200                            GPR04: c0000004cfc7be00 0000000042000442 0000000000007338 0000000000040000                            GPR08: c0000004cfc78080 c0000004cfc78000 0000000000000002 c000000000d10f78                            GPR12: 
c00000000002bd30 c000000007a39700 0000000000000000 00000000000186a0                            GPR16: 00007fffd2cab1d0 00000edac67c5db8 00007fffd2cab1c8 00000edac67c5dc0                            GPR20: 00007fffd2cab1cc 00007fffd2cab48b ffffffffffffffff 00007fffd2cab48a                            GPR24: 0000000000010000 000072164cd80000 00007fffd2cab0c4 00007fffd2cab1d8                            GPR28: 0000000000000000 c0000004cfc7be00 c0000004cfc7be00 c0000004cfb54200 [Tue Apr 17 04:13:42 2018] NIP [c00000000012058c] recalc_sigpending+0x5c/0x90 [Tue Apr 17 04:13:42 2018] LR [c000000000120554] recalc_sigpending+0x24/0x90 [Tue Apr 17 04:13:42 2018] Call Trace: [Tue Apr 17 04:13:42 2018] [c0000004cfc7bd10] [c00000000001db60] do_signal+0x110/0x2c0 (unreliable) [Tue Apr 17 04:13:42 2018] [c0000004cfc7bd30] [c000000000121658] __set_task_blocked+0x48/0x90 [Tue Apr 17 04:13:42 2018] [c0000004cfc7bd70] [c000000000124ed8] __set_current_blocked+0x58/0xb0 [Tue Apr 17 04:13:42 2018] [c0000004cfc7bda0] [c00000000002be18] sys_rt_sigreturn+0xe8/0x270 [Tue Apr 17 04:13:42 2018] [c0000004cfc7be30] [c00000000000b184] system_call+0x58/0x6c [Tue Apr 17 04:13:42 2018] Instruction dump: [Tue Apr 17 04:13:42 2018] e86d0260 3d220020 3929def8 81290000 2f890000 409e0030 78290464 39400002 [Tue Apr 17 04:13:42 2018] 39090080 7ce040a8 7ce75078 7ce041ad <40c2fff4> 38210020 e8010010 7c0803a6 Continuous lockups are observed. - Harish == == Hardware: P9 Boston/ P8 Tuleta DD revision: P9 DD2.2 Operating Env.: BML PNOR: version-SUPERMICRO-P9DSU-V1.10-20180413-imp Host OS: Ubuntu 18.04 ==== (In reply to comment #5) > hi Harish, > > Does this bug happen if you set powersave=off? I am wondering if this > problem might be related to the stop state issue. Following Issue is seen with powersave=off. This is on Power 8 BML. 
[ 517.480153] Modules linked in: salsa20_generic(+) camellia_generic cast6_generic cast_common serpent_generic twofish_generic twofish_common lrw algif_skcipher cuse hci_vhci bluetooth snd_seq snd_seq_device tgr192 ecdh_generic snd_timer snd wp512 soundcore rmd320 rmd256 rmd160 uhid hid vhost_net tap sctp unix_diag userio rmd128 vhost_vsock vmw_vsock_virtio_transport_common md4 dccp_ipv4 vhost vsock algif_hash dccp af_alg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter kvm_hv kvm binfmt_misc ipmi_powernv leds_powernv ipmi_devintf vmx_crypto uio_pdrv_genirq ipmi_msghandler [ 517.480305] powernv_rng crct10dif_vpmsum uio ibmpowernv powernv_op_panel sch_fq_codel ip_tables x_tables autofs4 ses enclosure scsi_transport_sas btrfs xor zstd_compress raid6_pq tg3 uas crc32c_vpmsum ipr usb_storage [ 517.480343] CPU: 73 PID: 4388 Comm: stress-ng-dev Not tainted 4.15.0-15-generic #16-Ubuntu [ 517.480347] NIP: c00000000000a724 LR: c000000000016e74 CTR: 0000000030061154 [ 517.480352] REGS: c000001fef1db890 TRAP: 0901 Not tainted (4.15.0-15-generic) [ 517.480353] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 48002224 XER: 20000000 [ 517.480368] CFAR: c000000000d0c8fc SOFTE: 1 [ 517.480368] GPR00: c00000000018c5bc c000001fef1dbb10 c0000000016eb400 0000000000000500 [ 517.480368] GPR04: 0000000000000000 c0000000000a2608 9000000000001033 0000000000000004 [ 517.480368] GPR08: c000000007a52300 0000000000000000 0000000080000049 9000000000001003 [ 517.480368] GPR12: 0000000000000040 c000000007a52300 [ 517.480406] NIP [c00000000000a724] replay_interrupt_return+0x0/0x4 [ 517.480411] LR [c000000000016e74] arch_local_irq_restore+0x74/0x90 [ 517.480412] Call Trace: [ 517.480420] [c000001fef1dbb10] [c0000000018bd940] log_first_seq+0x0/0x8 (unreliable) 
[ 517.480427] [c000001fef1dbb30] [c00000000018c5bc] console_unlock+0x2fc/0x6c0 [ 517.480432] [c000001fef1dbc20] [c00000000018ccec] vprintk_emit+0x36c/0x420 [ 517.480437] [c000001fef1dbc90] [c00000000018ec54] vprintk_func+0x64/0xf0 [ 517.480442] [c000001fef1dbcb0] [c00000000018e354] printk+0x40/0x54 [ 517.480455] [c000001fef1dbcd0] [d00000001efc2f20] vsock_dev_do_ioctl.isra.4+0xb8/0xe0 [vsock] [ 517.480462] [c000001fef1dbd40] [c0000000003efc34] do_vfs_ioctl+0xd4/0xa00 [ 517.480467] [c000001fef1dbde0] [c0000000003f0624] SyS_ioctl+0xc4/0x130 [ 517.480473] [c000001fef1dbe30] [c00000000000b184] system_call+0x58/0x6c [ 517.480475] Instruction dump: [ 517.480480] 7d8000a6 e9628008 7d200026 618c8000 2c030900 4182e7f8 2c030500 4182e310 [ 517.480491] 2c030a00 4182ffa4 2c030e60 4182f090 <4e800020> 7c781b78 48000359 48000371 which leads to the Hard LOCKUP and rcu_stalls. [ 629.383369] Watchdog CPU:73 Hard LOCKUP [ 629.383372] Modules linked in: salsa20_generic(+) camellia_generic cast6_generic cast_common serpent_generic twofish_generic twofish_common lrw algif_skcipher cuse hci_vhci bluetooth snd_seq snd_seq_device tgr192 ecdh_generic snd_timer snd wp512 soundcore rmd320 rmd256 rmd160 uhid hid vhost_net tap sctp unix_diag userio rmd128 vhost_vsock vmw_vsock_virtio_transport_common md4 dccp_ipv4 vhost vsock algif_hash dccp af_alg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter kvm_hv kvm binfmt_misc ipmi_powernv leds_powernv ipmi_devintf vmx_crypto uio_pdrv_genirq ipmi_msghandler [ 629.383451] powernv_rng crct10dif_vpmsum uio ibmpowernv powernv_op_panel sch_fq_codel ip_tables x_tables autofs4 ses enclosure scsi_transport_sas btrfs xor zstd_compress raid6_pq tg3 uas crc32c_vpmsum ipr usb_storage [ 629.383475] CPU: 73 PID: 4388 Comm: 
stress-ng-dev Tainted: G L 4.15.0-15-generic #16-Ubuntu [ 629.383478] NIP: c0000000000a259c LR: c00000000009df5c CTR: 0000000030036830 [ 629.383481] REGS: c000000007c37d80 TRAP: 0900 Tainted: G L (4.15.0-15-generic) [ 629.383482] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 48002222 XER: 20000000 [ 629.383494] CFAR: c00000000009df48 SOFTE: 0 [ 629.383497] GPR00: 0000000030005128 c000001fef1dba30 c0000000016eb400 0000000000000000 [ 629.383504] GPR04: 0000000048002222 c0000000000a259c 9000000000009033 00000000000000f1 [ 629.383510] GPR08: 0000000000000000 00000000300b0218 c00000000009df70 9000000000001003 [ 629.383517] GPR12: c00000000009df48 c000000007a52300 0000714a9655a560 0000000000000000 [ 629.383524] GPR16: 0000000000000027 0000000000000027 c000000001572a00 0000000000000000 [ 629.383530] GPR20: 20c49ba5e353f7cf 0000000000000000 0000000000000017 000000000000000d [ 629.383537] GPR24: fffffffffffffff5 0000000000000000 0000000000000010 c0000000018a8d18 [ 629.383544] GPR28: 0000000000000000 0000000000000010 c000001fef1dbad0 0000000000000010 [ 629.383551] NIP [c0000000000a259c] opal_put_chars+0x19c/0x280 [ 629.383553] LR [c00000000009df5c] opal_return+0x14/0x48 [ 629.383554] Call Trace: [ 629.383556] [c000001fef1dba30] [c0000000000a259c] opal_put_chars+0x19c/0x280 (unreliable) [ 629.383561] [c000001fef1dbab0] [c0000000008089b0] hvc_console_print+0xd0/0x210 [ 629.383564] [c000001fef1dbb30] [c00000000018c59c] console_unlock+0x2dc/0x6c0 [ 629.383568] [c000001fef1dbc20] [c00000000018ccec] vprintk_emit+0x36c/0x420 [ 629.383571] [c000001fef1dbc90] [c00000000018ec54] vprintk_func+0x64/0xf0 [ 629.383574] [c000001fef1dbcb0] [c00000000018e354] printk+0x40/0x54 [ 629.383577] [c000001fef1dbcd0] [d00000001efc2f20] vsock_dev_do_ioctl.isra.4+0xb8/0xe0 [vsock] [ 629.383581] [c000001fef1dbd40] [c0000000003efc34] do_vfs_ioctl+0xd4/0xa00 [ 629.383584] [c000001fef1dbde0] [c0000000003f0624] SyS_ioctl+0xc4/0x130 [ 629.383587] [c000001fef1dbe30] [c00000000000b184] 
system_call+0x58/0x6c
[ 629.383589] Instruction dump:
[ 629.383591] 7f03c378 38210080 eb01ffc0 eb61ffd8 4e800020 7f63db78 7f24cb78 48c6a5e1
[ 629.383600] 60000000 38600000 3b00fff5 4bffc01d <60000000> e8010090 eb210048 eb810060
[ 638.478556] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 638.478581] 73-....: (149 ticks this GP) idle=0aa/140000000000000/0 softirq=14950/14950 fqs=15015
[ 638.478586] (detected by 83, t=37345 jiffies, g=2733, c=2732, q=9230463)
[ 638.478645] Sending NMI from CPU 83 to CPUs 73:
So, actual issue might be with powersave. These issues are due to huge data dumped to console.
- Harish
== Harish Sriram <hasriram@in.ibm.com> ==
With powersave=off on a P9 DD 2.2 Boston LC, the following is observed.
[ 119.200587] Watchdog CPU:60 Hard LOCKUP
[ 119.204932] Watchdog CPU:48 Hard LOCKUP
[ 119.207911] Watchdog CPU:52 Hard LOCKUP
[ 119.208454] Watchdog CPU:46 Hard LOCKUP
....
....
syslog will be attached.
- Harish
== MAHESH J. SALGAONKAR <mahesh.salgaonkar@in.ibm.com> ==
> > hi Harish,
> >
> > Does this bug happen if you set powersave=off? I am wondering if this
> > problem might be related to the stop state issue.
>
> Following Issue is seen with powersave=off. This is on Power 8 BML.
> > [ 517.480153] Modules linked in: salsa20_generic(+) camellia_generic > cast6_generic cast_common serpent_generic twofish_generic twofish_common lrw > algif_skcipher cuse hci_vhci bluetooth snd_seq snd_seq_device tgr192 > ecdh_generic snd_timer snd wp512 soundcore rmd320 rmd256 rmd160 uhid hid > vhost_net tap sctp unix_diag userio rmd128 vhost_vsock > vmw_vsock_virtio_transport_common md4 dccp_ipv4 vhost vsock algif_hash dccp > af_alg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 > iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack > nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc > devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter > kvm_hv kvm binfmt_misc ipmi_powernv leds_powernv ipmi_devintf vmx_crypto > uio_pdrv_genirq ipmi_msghandler > [ 517.480305] powernv_rng crct10dif_vpmsum uio ibmpowernv powernv_op_panel > sch_fq_codel ip_tables x_tables autofs4 ses enclosure scsi_transport_sas > btrfs xor zstd_compress raid6_pq tg3 uas crc32c_vpmsum ipr usb_storage > [ 517.480343] CPU: 73 PID: 4388 Comm: stress-ng-dev Not tainted > 4.15.0-15-generic #16-Ubuntu > [ 517.480347] NIP: c00000000000a724 LR: c000000000016e74 CTR: > 0000000030061154 > [ 517.480352] REGS: c000001fef1db890 TRAP: 0901 Not tainted > (4.15.0-15-generic) > [ 517.480353] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: > 48002224 XER: 20000000 > [ 517.480368] CFAR: c000000000d0c8fc SOFTE: 1 > [ 517.480368] GPR00: c00000000018c5bc c000001fef1dbb10 c0000000016eb400 > 0000000000000500 > [ 517.480368] GPR04: 0000000000000000 c0000000000a2608 9000000000001033 > 0000000000000004 > [ 517.480368] GPR08: c000000007a52300 0000000000000000 0000000080000049 > 9000000000001003 > [ 517.480368] GPR12: 0000000000000040 c000000007a52300 > [ 517.480406] NIP [c00000000000a724] replay_interrupt_return+0x0/0x4 > [ 517.480411] LR [c000000000016e74] arch_local_irq_restore+0x74/0x90 > [ 517.480412] Call Trace: > [ 517.480420] 
[c000001fef1dbb10] [c0000000018bd940] log_first_seq+0x0/0x8 > (unreliable) > [ 517.480427] [c000001fef1dbb30] [c00000000018c5bc] > console_unlock+0x2fc/0x6c0 > [ 517.480432] [c000001fef1dbc20] [c00000000018ccec] vprintk_emit+0x36c/0x420 > [ 517.480437] [c000001fef1dbc90] [c00000000018ec54] vprintk_func+0x64/0xf0 > [ 517.480442] [c000001fef1dbcb0] [c00000000018e354] printk+0x40/0x54 > [ 517.480455] [c000001fef1dbcd0] [d00000001efc2f20] > vsock_dev_do_ioctl.isra.4+0xb8/0xe0 [vsock] > [ 517.480462] [c000001fef1dbd40] [c0000000003efc34] do_vfs_ioctl+0xd4/0xa00 > [ 517.480467] [c000001fef1dbde0] [c0000000003f0624] SyS_ioctl+0xc4/0x130 > [ 517.480473] [c000001fef1dbe30] [c00000000000b184] system_call+0x58/0x6c > [ 517.480475] Instruction dump: > [ 517.480480] 7d8000a6 e9628008 7d200026 618c8000 2c030900 4182e7f8 > 2c030500 4182e310 > [ 517.480491] 2c030a00 4182ffa4 2c030e60 4182f090 <4e800020> 7c781b78 > 48000359 48000371 > > which leads to the Hard LOCKUP and rcu_stalls. > > [ 629.383369] Watchdog CPU:73 Hard LOCKUP > [ 629.383372] Modules linked in: salsa20_generic(+) camellia_generic > cast6_generic cast_common serpent_generic twofish_generic twofish_common lrw > algif_skcipher cuse hci_vhci bluetooth snd_seq snd_seq_device tgr192 > ecdh_generic snd_timer snd wp512 soundcore rmd320 rmd256 rmd160 uhid hid > vhost_net tap sctp unix_diag userio rmd128 vhost_vsock > vmw_vsock_virtio_transport_common md4 dccp_ipv4 vhost vsock algif_hash dccp > af_alg xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 > iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack > nf_conntrack libcrc32c ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc > devlink ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter > kvm_hv kvm binfmt_misc ipmi_powernv leds_powernv ipmi_devintf vmx_crypto > uio_pdrv_genirq ipmi_msghandler > [ 629.383451] powernv_rng crct10dif_vpmsum uio ibmpowernv powernv_op_panel > sch_fq_codel ip_tables x_tables autofs4 ses 
enclosure scsi_transport_sas > btrfs xor zstd_compress raid6_pq tg3 uas crc32c_vpmsum ipr usb_storage > [ 629.383475] CPU: 73 PID: 4388 Comm: stress-ng-dev Tainted: G > L 4.15.0-15-generic #16-Ubuntu > [ 629.383478] NIP: c0000000000a259c LR: c00000000009df5c CTR: > 0000000030036830 > [ 629.383481] REGS: c000000007c37d80 TRAP: 0900 Tainted: G L > (4.15.0-15-generic) > [ 629.383482] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: > 48002222 XER: 20000000 > [ 629.383494] CFAR: c00000000009df48 SOFTE: 0 > [ 629.383497] GPR00: 0000000030005128 c000001fef1dba30 c0000000016eb400 > 0000000000000000 > [ 629.383504] GPR04: 0000000048002222 c0000000000a259c 9000000000009033 > 00000000000000f1 > [ 629.383510] GPR08: 0000000000000000 00000000300b0218 c00000000009df70 > 9000000000001003 > [ 629.383517] GPR12: c00000000009df48 c000000007a52300 0000714a9655a560 > 0000000000000000 > [ 629.383524] GPR16: 0000000000000027 0000000000000027 c000000001572a00 > 0000000000000000 > [ 629.383530] GPR20: 20c49ba5e353f7cf 0000000000000000 0000000000000017 > 000000000000000d > [ 629.383537] GPR24: fffffffffffffff5 0000000000000000 0000000000000010 > c0000000018a8d18 > [ 629.383544] GPR28: 0000000000000000 0000000000000010 c000001fef1dbad0 > 0000000000000010 > [ 629.383551] NIP [c0000000000a259c] opal_put_chars+0x19c/0x280 > [ 629.383553] LR [c00000000009df5c] opal_return+0x14/0x48 > [ 629.383554] Call Trace: > [ 629.383556] [c000001fef1dba30] [c0000000000a259c] > opal_put_chars+0x19c/0x280 (unreliable) > [ 629.383561] [c000001fef1dbab0] [c0000000008089b0] > hvc_console_print+0xd0/0x210 > [ 629.383564] [c000001fef1dbb30] [c00000000018c59c] > console_unlock+0x2dc/0x6c0 > [ 629.383568] [c000001fef1dbc20] [c00000000018ccec] vprintk_emit+0x36c/0x420 > [ 629.383571] [c000001fef1dbc90] [c00000000018ec54] vprintk_func+0x64/0xf0 > [ 629.383574] [c000001fef1dbcb0] [c00000000018e354] printk+0x40/0x54 > [ 629.383577] [c000001fef1dbcd0] [d00000001efc2f20] > vsock_dev_do_ioctl.isra.4+0xb8/0xe0 [vsock] > 
[ 629.383581] [c000001fef1dbd40] [c0000000003efc34] do_vfs_ioctl+0xd4/0xa00
> [ 629.383584] [c000001fef1dbde0] [c0000000003f0624] SyS_ioctl+0xc4/0x130
> [ 629.383587] [c000001fef1dbe30] [c00000000000b184] system_call+0x58/0x6c
> [ 629.383589] Instruction dump:
> [ 629.383591] 7f03c378 38210080 eb01ffc0 eb61ffd8 4e800020 7f63db78
> 7f24cb78 48c6a5e1
> [ 629.383600] 60000000 38600000 3b00fff5 4bffc01d <60000000> e8010090
> eb210048 eb810060
> [ 638.478556] INFO: rcu_sched detected stalls on CPUs/tasks:
> [ 638.478581] 73-....: (149 ticks this GP) idle=0aa/140000000000000/0
> softirq=14950/14950 fqs=15015
> [ 638.478586] (detected by 83, t=37345 jiffies, g=2733, c=2732,
> q=9230463)
> [ 638.478645] Sending NMI from CPU 83 to CPUs 73:
>
> So, actual issue might be with powersave. These issues are due to huge data
> dumped to console.
>
> - Harish
== Breno Leitao <brenohl@br.ibm.com> ==
> I think we probably need below upstream commit:
>
> commit 15b4dd7981496f51c5f9262a5e0761e48de6655f
> Author: Nicholas Piggin <npiggin@gmail.com>
> Date: Tue Mar 27 01:01:03 2018 +1000
>
> powerpc/64s: return more carefully from sreset NMI
This patch applies cleanly on top of the bionic kernel, but we definitely need to test it before sending it to Canonical for inclusion. Holding the mirror until we have an ack that this is the patch we need for this issue.
== Gustavo Luiz Ferreira Walbon <gwalbon@br.ibm.com> ==
Breno,
Two patches are missing:
6bed3237624e3faad1592543952907cd01a42c83 powerpc: use NMI IPI for smp_send_stop
ac61c1156623455c46701654abd8c99720bceea1 powerpc: Fix smp_send_stop NMI IPI handling
I have rebased the tree on the latest proposed kernel and built the packages again: http://pokgsa.ibm.com/gsa/pokgsa/home/g/w/gwalbon/web/public/Bug166791/
Patches: https://github.com/walbon/ubuntu-bionic/commits/bug166791
== Harish Sriram <hasriram@in.ibm.com>==
> > Hard Lockups are observed with stressing the 18.04.1 kernel.
> > > > # uname -a > > Linux ltc-wspoon12 4.15.0-23-generic #25-Ubuntu SMP Wed May 23 17:59:00 UTC > > 2018 ppc64le ppc64le ppc64le GNU/Linux > > > > [ 5295.047807] Watchdog CPU:64 Hard LOCKUP > > [ 5295.047817] Modules linked in: snd_seq snd_seq_device snd_timer snd > > soundcore twofish_generic twofish_common lrw uhid hid algif_skcipher > > vhost_net tgr192 vhost tap wp512 hci_vhci bluetooth rmd320 ecdh_generic > > rmd256 cuse rmd160 rmd128 btrfs zstd_compress md4 algif_hash xor unix_diag > > binfmt_misc raid6_pq sctp af_alg userio dccp_ipv4 dccp xt_CHECKSUM > > iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 > > nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack libcrc32c > > ipt_REJECT nf_reject_ipv4 xt_tcpudp bridge stp llc ebtable_filter ebtables > > ip6table_filter ip6_tables iptable_filter idt_89hpesx at24 ofpart > > uio_pdrv_genirq ipmi_powernv ipmi_devintf ipmi_msghandler opal_prd uio > > cmdlinepart powernv_flash mtd ibmpowernv vmx_crypto crct10dif_vpmsum kvm_hv > > kvm sch_fq_codel > > [ 5295.047928] ip_tables x_tables autofs4 mlx5_ib ib_core nouveau ast > > i2c_algo_bit ttm mlx5_core drm_kms_helper syscopyarea sysfillrect sysimgblt > > fb_sys_fops ahci mlxfw crc32c_vpmsum drm tg3 libahci devlink > > [ 5295.047956] CPU: 64 PID: 15874 Comm: stress-ng-hrtim Not tainted > > 4.15.0-23-generic #25-Ubuntu > > [ 5295.047958] NIP: c000000000cffc04 LR: c000000000126468 CTR: > > 0000000000000000 > > [ 5295.047962] REGS: c00000003fcffd80 TRAP: 0900 Not tainted > > (4.15.0-23-generic) > > [ 5295.047963] MSR: 9000000000009033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: > > 28000448 XER: 00000000 > > [ 5295.047973] CFAR: c000000000126464 SOFTE: 0 > > GPR00: c000000000126468 c0002036131f7c70 c0000000016eaf00 > > c0002038f4c4b208 > > GPR04: c0002036131f7d30 0000000000000000 00000ebedb3a39a4 > > 00000ebedb3bc840 > > GPR08: 0000000000000021 0000000000000000 0000000080000040 > > 0000000052000442 > > GPR12: 0000000000000001 c00000000faac000 
0000000000000000
> > 00000000000186a0
> > GPR16: 00007ffff15aaec0 00000ebedb366800 00007ffff15aaeb8
> > 00000ebedb366808
> > GPR20: 00007ffff15aaebc 00007ffff15ab17b ffffffffffffffff
> > 00007ffff15ab17a
> > GPR24: 0000000000010000 00007ab613340000 00007ffff15aad68
> > 00007ffff15aaec8
> > GPR28: 0000000000000000 0000000000040002 c0002036131f7cf0
> > c0002038f4c4b208
> > [ 5295.048016] NIP [c000000000cffc04] _raw_spin_lock_irq+0x44/0xe0
> > [ 5295.048021] LR [c000000000126468] __set_current_blocked+0x48/0xb0
> > [ 5295.048022] Call Trace:
> > [ 5295.048027] [c0002036131f7ca0] [00007ffff15aaebc] 0x7ffff15aaebc
> > [ 5295.048031] [c0002036131f7cd0] [c000000000126584]
> > signal_setup_done+0x84/0xe0
> > [ 5295.048036] [c0002036131f7d10] [c00000000001dc60] do_signal+0x110/0x2c0
> > [ 5295.048040] [c0002036131f7e00] [c00000000001dfb0]
> > do_notify_resume+0xd0/0x100
> > [ 5295.048045] [c0002036131f7e30] [c00000000000b8c4]
> > ret_from_except_lite+0x70/0x74
> > [ 5295.048046] Instruction dump:
> > [ 5295.048050] f821ffd1 7c7f1b78 39400000 892d028b 994d028b 39400000
> > 994d028d 814d0008
> > [ 5295.048074] 7d201829 2c090000 40c20010 7d40192d <40c2fff0> 7c2004ac
> > 2fa90000 409e0010
> > [ 5295.272204] Watchdog CPU:64 became unstuck
> >
> > This is on WSP DD 2.2.
> > PNOR:
> > Version of System Firmware :
> > Product Name : OpenPOWER Firmware
> > Product Extra : version-witherspoon-ibm-OP9-v2.0-2.14
> > Product Extra : occ-77bb5e6
> > Product Extra : skiboot-v6.0.1
> > Product Extra : buildroot-2018.02.1-6-ga8d1126
>
> It would be interesting to run an irqsoff latency tracer with this stress
> test. It might require kernel recompile to add the irqsoff tracer option.
>
> Where is this hrtimer stress test from?
This test is from stress-ng.
2018-06-25 13:38:27 Frank Heimes ubuntu-power-systems: status Triaged In Progress
2018-07-17 15:00:57 Stefan Bader linux (Ubuntu Bionic): status In Progress Fix Committed
2018-07-18 10:01:36 Brad Figg tags architecture-ppc64le bugnameltc-166791 severity-high targetmilestone-inin1804 triage-g architecture-ppc64le bugnameltc-166791 severity-high targetmilestone-inin1804 triage-g verification-needed-bionic
2018-07-20 06:40:23 bugproxy attachment added syslog-29.txt https://bugs.launchpad.net/bugs/1777194/+attachment/5165491/+files/syslog.txt
2018-07-20 11:17:48 Kleber Sacilotto de Souza tags architecture-ppc64le bugnameltc-166791 severity-high targetmilestone-inin1804 triage-g verification-needed-bionic architecture-ppc64le bugnameltc-166791 severity-high targetmilestone-inin1804 triage-g verification-done-bionic
2018-07-20 15:39:45 Launchpad Janitor linux (Ubuntu Bionic): status Fix Committed Fix Released
2018-07-26 05:13:52 Launchpad Janitor linux (Ubuntu): status In Progress Fix Released
2018-07-26 07:45:31 Andrew Cloke ubuntu-power-systems: status In Progress Fix Released
2019-07-24 21:05:35 Brad Figg tags architecture-ppc64le bugnameltc-166791 severity-high targetmilestone-inin1804 triage-g verification-done-bionic architecture-ppc64le bugnameltc-166791 cscc severity-high targetmilestone-inin1804 triage-g verification-done-bionic
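When comparing the attached syslogs before and after the fix, it may help to count the watchdog reports per CPU. The following is a minimal sketch, not part of the original report: the function name `count_hard_lockups` is hypothetical, and it assumes the log was saved as a plain text file containing lines in the "Watchdog CPU:<n> Hard LOCKUP" form quoted in the description.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): count "Hard LOCKUP"
# watchdog reports per CPU in a saved kernel log or syslog file.
# Usage: count_hard_lockups /path/to/syslog.txt
count_hard_lockups() {
    # grep -o prints only the matched text, so timestamps and host names
    # in front of the message do not matter. "became unstuck" lines are
    # not matched and therefore not counted.
    grep -oE 'Watchdog CPU:[0-9]+ Hard LOCKUP' "$1" \
        | sed -E 's/^Watchdog CPU:([0-9]+) Hard LOCKUP$/\1/' \
        | sort -n \
        | uniq -c \
        | awk '{ print "cpu=" $2 " lockups=" $1 }'
}
```

For example, `count_hard_lockups syslog_p9.txt` would print one `cpu=<n> lockups=<count>` line per affected CPU, and nothing if the log contains no hard lockup reports.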