Ubuntu Bionic Beaver (development branch) boslcp3 hvc0
boslcp3 login: root
Password:
Last login: Thu Apr 12 00:28:27 CDT 2018 from 10.33.11.31 on pts/1
Welcome to Ubuntu Bionic Beaver (development branch) (GNU/Linux 4.15.0-15-generic ppc64le)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
********************************************************************************
IBM Business Use Statement
--------------------------
IBM's internal systems must only be used for conducting IBM's business or for purposes authorized by IBM management. Use is subject to audit at any time by IBM management.
Distribution: Ubuntu Bionic Beaver
Kernel Build: 4.15.0-15-generic
System Name : boslcp3
Model/Type : machine : PowerNV 9006-12C
Platform : powerpc64le
********************************************************************************
root@boslcp3:~# kdump-config status
current state : ready to kdump
root@boslcp3:~# [ 69.747689] Unable to handle kernel paging request for data at address 0x8882f6ed90e91374
[ 69.747760] Faulting instruction address: 0xc00000000038a110
cpu 0x50: Vector: 380 (Data Access Out of Range) at [c000000004013650]
pc: c00000000038a110: kmem_cache_alloc_node+0x2f0/0x350
lr: c00000000038a0fc: kmem_cache_alloc_node+0x2dc/0x350
sp: c0000000040138d0
msr: 9000000000009033
dar: 8882f6ed90e91374
current = 0xc000000005fb9800
paca = 0xc000000007a57000 softe: 0 irq_happened: 0x01
pid = 1771, comm = systemd-journal
Linux version 4.15.0-15-generic (buildd@bos02-ppc64el-002) (gcc version 7.3.0 (Ubuntu 7.3.0-14ubuntu1)) #16-Ubuntu SMP Wed Apr 4 13:57:51 UTC 2018 (Ubuntu 4.15.0-15.16-generic 4.15.15)
`/ecenter ?
for help
[c0000000040138d0] c000000000389fd4 kmem_cache_alloc_node+0x1b4/0x350 (unreliable)
[c000000004013940] c000000000b2ec6c __alloc_skb+0x6c/0x220
[c0000000040139a0] c000000000b30b6c alloc_skb_with_frags+0x7c/0x2e0
[c000000004013a30] c000000000b247cc sock_alloc_send_pskb+0x29c/0x2c0
[c000000004013ae0] c000000000c5705c unix_dgram_sendmsg+0x15c/0x8f0
[c000000004013bc0] c000000000b1ec64 sock_sendmsg+0x64/0x90
[c000000004013bf0] c000000000b20abc ___sys_sendmsg+0x31c/0x390
[c000000004013d90] c000000000b221ec __sys_sendmsg+0x5c/0xc0
[c000000004013e30] c00000000000b184 system_call+0x58/0x6c
--- Exception: c00 (System Call) at 00007179ce4ea9c4
SP (7ffffbe4ff60) is in userspace
50:mon> t
[c0000000040138d0] c000000000389fd4 kmem_cache_alloc_node+0x1b4/0x350 (unreliable)
[c000000004013940] c000000000b2ec6c __alloc_skb+0x6c/0x220
[c0000000040139a0] c000000000b30b6c alloc_skb_with_frags+0x7c/0x2e0
[c000000004013a30] c000000000b247cc sock_alloc_send_pskb+0x29c/0x2c0
[c000000004013ae0] c000000000c5705c unix_dgram_sendmsg+0x15c/0x8f0
[c000000004013bc0] c000000000b1ec64 sock_sendmsg+0x64/0x90
[c000000004013bf0] c000000000b20abc ___sys_sendmsg+0x31c/0x390
[c000000004013d90] c000000000b221ec __sys_sendmsg+0x5c/0xc0
[c000000004013e30] c00000000000b184 system_call+0x58/0x6c
--- Exception: c00 (System Call) at 00007179ce4ea9c4
SP (7ffffbe4ff60) is in userspace
50:mon> ?
Commands:
  b     show breakpoints
  bd    set data breakpoint
  bi    set instruction breakpoint
  bc    clear breakpoint
  c     print cpus stopped in xmon
  c#    try to switch to cpu number h (in hex)
  C     checksum
  d     dump bytes
  d1    dump 1 byte values
  d2    dump 2 byte values
  d4    dump 4 byte values
  d8    dump 8 byte values
  di    dump instructions
  df    dump float values
  dd    dump double values
  dl    dump the kernel log buffer
  do    dump the OPAL message log
  dp[#] dump paca for current cpu, or cpu #
  dpa   dump paca for all possible cpus
  dr    dump stream of raw bytes
  dv    dump virtual address translation
  dt    dump the tracing buffers (uses printk)
  dtc   dump the tracing buffers for current CPU (uses printk)
  dx#   dump xive on CPU #
  dxi#  dump xive irq state #
  dxa   dump xive on all CPUs
  e     print exception information
  f     flush cache
  la    lookup symbol+offset of specified address
  ls    lookup address of specified symbol
  m     examine/change memory
  mm    move a block of memory
  ms    set a block of memory
  md    compare two blocks of memory
  ml    locate a block of memory
  mz    zero a block of memory
  mi    show information about memory allocation
  p     call a procedure
  P     list processes/tasks
  r     print registers
  s     single step
  S     print special registers
  Sa    print all SPRs
  Sr #  read SPR #
  Sw #v write v to SPR #
  t     print backtrace
  x     exit monitor and recover
  X     exit monitor and don't recover
  u     dump segment table or SLB
  U     show uptime information
  ?     help
  # n   limit output to n lines per page (for dp, dpa, dl)
  zr    reboot
  zh    halt
50:mon> b
type address
50:mon>
50:mon>
50:mon> b
type address
50:mon> X
[ 2668.621134] Oops: Kernel access of bad area, sig: 11 [#1]
[ 2668.621166] INFO: rcu_sched self-detected stall on CPU
[ 2668.621174] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2668.621193] 0-...!: (1 GPs behind) idle=0ba/1/0 softirq=1461/1461 fqs=26
[ 2668.621224] 9-...!: (22 ticks this GP) idle=056/140000000000000/2 softirq=1792/1792 fqs=26
[ 2668.6LOCK ERROR: Unlocked non-owned lock @0x30314850 (state: 0x0000080000000001)
[ 2818.008943797,0] Aborting!
CPU 0800 Backtrace:
S: 0000000033c03920 R: 000000003001362c .backtrace+0x48
S: 0000000033c039c0 R: 000000003001a3c4 ._abort+0x4c
S: 0000000033c03a40 R: 0000000030017d58 .lock_error+0x64
S: 0000000033c03ac0 R: 0000000030017874 .unlock+0x60
S: 0000000033c03b30 R: 0000000030039474 .__uart_do_poll+0x8c
S: 0000000033c03c20 R: 000000003001b940 .opal_run_pollers+0x148
S: 0000000033c03ca0 R: 000000003001b9cc .opal_poll_events+0x74
S: 0000000033c03d20 R: 000000003000515c opal_entry+0xac
S: 0000000033c03f00 R: 0000000030002788 secondary_wait+0x8c
[ 2818.010974312,3] OPAL exiting with locks held, token=145 retval=0
[ 2818.010976742,3] core/lock.c:216
[ 2818.010977835,3] core/lock.c:216
21234] 0-...!: (1 GPs behind) idle=0ba/1/0 softirq=1461/1461 fqs=26
h[ 2668.621248] 80-...!: (1 GPs behind) idle=cf6/140000000000000/0 softirq=350/350 fqs=26
[ 2668.621248]
[ 2668.621267]
[ 2668.621292] (t=99933 jiffies g=1337 c=1336 q=1)
[ 2668.621294] (detected by 3, t=99933 jiffies, g=1337, c=1336, q=1)
[ 2668.621298] rcu_sched kthread starved for 99881 jiffies!
g1337 c1336 f0x0 RCU_GP_WAIT_FQS(3) ->state=0x402 ->cpu=21
[ 2668.621300] Sending NMI from CPU 3 to CPUs 0:
[ 2668.621302] rcu_sched I 0 9 2 0x00000800
[ 2668.621330] Call Trace:
[ 2668.621337] [c000000ff91ab8c0] [0000001500000000] 0x1500000000 (unreliable)
[ 2668.621363] [c000000ff91aba90] [c00000000001c220] __switch_to+0x2a0/0x4d0
[ 2668.621385] [c000000ff91abaf0] [c000000000d05d24] __schedule+0x2a4/0xaf0
[ 2668.621389] [c000000ff91abbc0] [c000000000d065b0] schedule+0x40/0xc0
[ 2668.621406] [c000000ff91abbe0] [c000000000d0b3d0] schedule_timeout+0xb0/0x4e0
[ 2668.621433] [c000000ff91abce0] [c0000000001a6ea4] rcu_gp_kthread+0x714/0xb30
[ 2668.621453] [c000000ff91abdc0] [c00000000013bba8] kthread+0x1a8/0x1b0
[ 2668.621470] [c000000ff91abe30] [[ 2818.094807155,0] Assert fail: core/mem_region.c:444:lock_held_by_me(&region->free_list_lock)
c00000000000b528] ret_from_kernel_thread+0x5c/0xb4
[ 2668.621534] INFO: rcu_sched detected stalls on CPUs/tasks:
[ 2668.621712] NMI backtrace for cpu 0
[ 2668.621717] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.15.0-15-generic #16-Ubuntu
[ 2668.621719] NIP: c000000000d0cad4 LR: c000000000d0cad0 CTR: c000000000acb8d0
[ 2668.621721] REGS: c000000007ffbd80 TRAP: 0100 Not tainted (4.15.0-15-generic)
[ 2668.621722] MSR: 9000000000089033 CR: 28002824 XER: 00000000
[ 2668.621733] CFAR: c000000000d0caf0 SOFTE: 0
[ 2668.621733] GPR00: c0000000001a90c0 c0000000016e7690 c0000000016eb400 0000000000000000
[ 2668.621733] GPR04: 0000000000000000 0000000000000004 c0000000016e7648 0000026d563c25d6
[ 2668.621733] GPR08: 00000000000f4240 0000000080000003 0000000080000000 c000000ff58a1230
[ 2668.621733] GPR12: c00000000171dd78 c000000007a20000 0000000000000001 c000000000f7a9f8
[ 2668.621733] GPR16: c000000000f7aa28 c000000000f7aa60 c000000000f7aa90 c000000001576780
[ 2668.621733] GPR20: c000000001576580 c000000000f44640 c000000000f7a9c8 c0000000011c1020
[ 2668.621733] GPR24: c000000001596580 c0000000017223a0 c000000001722394 0000000ff9a10000
[ 2668.621733]
GPR28: c0000000017223a0 0000000000000000 0000000000000000 c000000001576780
[ 2668.621767] NIP [c000000000d0cad4] _raw_spin_lock_irqsave+0x84/0x110
[ 2668.621771] LR [c000000000d0cad0] _raw_spin_lock_irqsave+0x80/0x110
[ 2668.621773] Call Trace:
[ 2668.621777] [c0000000016e7690] [c0000000016e7720] init_thread_union+0x3720/0x4000 (unreliable)
[ 2668.621781] [c0000000016e76d0] [c0000000001a90c0] rcu_dump_cpu_stacks+0x78/0x158
[ 2668.621785] [c0000000016e7720] [c0000000001a81e8] rcu_check_callbacks+0x8e8/0xb40
[ 2668.621788] [c0000000016e7850] [c0000000001b64a8] update_process_times+0x48/0x90
[ 2668.621792] [c0000000016e7880] [c0000000001ce268] tick_sched_handle.isra.5+0xa8/0xd0
[ 2668.621795] [c0000000016e78b0] [c0000000001ce2f0] tick_sched_timer+0x60/0xe0
[ 2668.621798] [c0000000016e78f0] [c0000000001b7054] __hrtimer_run_queues+0x144/0x370
[ 2668.621802] [c0000000016e7970] [c0000000001b7fac] hrtimer_interrupt+0xfc/0x350
[ 2668.621806] [c0000000016e7a40] [c0000000000248f0] __timer_interrupt+0x90/0x260
[ 2668.621809] [c0000000016e7a90] [c000000000024d08] timer_interrupt+0x98/0xe0
[ 2668.621815] [c0000000016e7ac0] [c000000000009014] decrementer_common+0x114/0x120
[ 2668.621821] --- interrupt: 901 at replay_interrupt_return+0x0/0x4
[ 2668.621821] LR = arch_local_irq_restore+0x74/0x90
[ 2668.621822] [c0000000016e7db0] [00000000100049dc] 0x100049dc (unreliable)
[ 2668.621827] [c0000000016e7dd0] [c000000000acf0b0] cpuidle_enter_state+0xf0/0x450
[ 2668.621832] [c0000000016e7e30] [c00000000017239c] call_cpuidle+0x4c/0x90
[ 2668.621835] [c0000000016e7e50] [c0000000001727b0] do_idle+0x2b0/0x330
[ 2668.621839] [c0000000016e7ea0] [c000000000172a68] cpu_startup_entry+0x38/0x50
[ 2939.402867169,0] Assert fail: core/mem_region.c:444:lock_held_by_me(&region->free_list_lock)
> > > > > > > > > > > ^C
root@boslcp3:~# ping kte ^C^C