[witherspoon] removing module nouveau causes cpu hard lockup

Bug #1811470 reported by Manoj Iyer on 2019-01-11
This bug affects 1 person
Affects: The Ubuntu-power-systems project
Importance: High
Assigned to: Unassigned

Bug Description

Installed Ubuntu 18.04 and upgraded the kernel to linux-image-generic-hwe-18.04 (4.18.0-13-generic #14~18.04.1-Ubuntu SMP Thu Dec 6 14:03:47). Copied the gv100 firmware to /lib/firmware/nvidia, then removed and reloaded nouveau (modprobe -r followed by modprobe). When I tried to remove nouveau again with modprobe -r, I saw the trace below. After a while the modprobe -r command completed and the module was removed successfully.
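
The unload/reload/unload sequence described above can be sketched as a small script, which may help anyone trying to reproduce this. It is only an illustration: the helper names (repro_commands, run) are hypothetical, the firmware copy is left out because the report does not say where the gv100 files came from, and the script defaults to a dry run that only prints the commands. Actually running it needs root and may lock up a CPU as shown in the trace.

```python
import shlex
import subprocess

MODULE = "nouveau"


def repro_commands(module=MODULE):
    """Return the modprobe sequence from the description, in order."""
    return [
        ["modprobe", "-r", module],  # first unload
        ["modprobe", module],        # reload with the gv100 firmware in place
        ["modprobe", "-r", module],  # second unload: the trace appears here
    ]


def run(dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for argv in repro_commands():
        print("+", shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


if __name__ == "__main__":
    run(dry_run=True)
```

Calling run(dry_run=False) as root would execute the three commands; watching dmesg during the last one should show whether the lockup recurs.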

[ 618.185258] nouveau 0035:04:00.0: DRM: failed to idle channel 1 [DRM]
[ 630.314599] watchdog: CPU 4 self-detected hard LOCKUP @ ioread32+0x2c/0x170
[ 630.314601] watchdog: CPU 4 TB:415266697100, last heartbeat TB:410146341428 (10000ms ago)
[ 630.314601] Modules linked in: nouveau(-) ofpart at24 cmdlinepart uio_pdrv_genirq ipmi_powernv ipmi_devintf powernv_flash uio mtd opal_prd ipmi_msghandler ibmpowernv vmx_crypto sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ast i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_vpmsum ahci crc32c_vpmsum tg3 libahci drm_panel_orientation_quirks [last unloaded: nouveau]
[ 630.314629] CPU: 4 PID: 6623 Comm: modprobe Not tainted 4.18.0-13-generic #14~18.04.1-Ubuntu
[ 630.314630] NIP: c000000000729afc LR: c00800000f766990 CTR: c000000000729ad0
[ 630.314630] REGS: c000003fffd87d80 TRAP: 0900 Not tainted (4.18.0-13-generic)
[ 630.314631] MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44002824 XER: 00000000
[ 630.314638] CFAR: c00800000f7b2ac4 IRQMASK: 1
[ 630.314639] GPR00: c00800000f7ad3f8 c000003f6c1bb850 c00000000178c200 c00c00008e7e0000
[ 630.314642] GPR04: 0000000000000000 0000000000000000 0000000000000000 0000000000000008
[ 630.314644] GPR08: c000003fffffa800 000000007fffffff c00c00008e7e0000 c00800000f7b2ab0
[ 630.314647] GPR12: c000000000729ad0 c000003fffffa800 0000000000000001 0000000000000000
[ 630.314649] GPR16: 0000000000000000 0000000000000000 0000039a7cc60074 0000039a7cc3c978
[ 630.314652] GPR20: 0000039a7cc60070 0000000000000000 00007ffff8448f88 0000000000000000
[ 630.314654] GPR24: 0000039a8bf40ee8 0000000000000000 c00800000f82dca0 c000003fed072098
[ 630.314657] GPR28: c000203968a2c290 c000203968840908 c000203968840900 0000000000000001
[ 630.314660] NIP [c000000000729afc] ioread32+0x2c/0x170
[ 630.314660] LR [c00800000f766990] nouveau_bo_rd32+0x48/0x70 [nouveau]
[ 630.314661] Call Trace:
[ 630.314662] [c000003f6c1bb850] [c000003f6c1bb890] 0xc000003f6c1bb890 (unreliable)
[ 630.314663] [c000003f6c1bb880] [c000003f6c1bb8b0] 0xc000003f6c1bb8b0
[ 630.314664] [c000003f6c1bb8a0] [c00800000f7ad3f8] nv84_fence_read+0x40/0x60 [nouveau]
[ 630.314666] [c000003f6c1bb8c0] [c00800000f7aab3c] nouveau_fence_update+0x44/0x100 [nouveau]
[ 630.314667] [c000003f6c1bb900] [c00800000f7ab5d8] nouveau_fence_done+0x100/0x180 [nouveau]
[ 630.314668] [c000003f6c1bb940] [c00800000f7ab8c8] nouveau_fence_wait+0x90/0x150 [nouveau]
[ 630.314669] [c000003f6c1bb970] [c00800000f7a8f90] nouveau_channel_idle+0xd8/0x140 [nouveau]
[ 630.314670] [c000003f6c1bba00] [c00800000f75f75c] nouveau_accel_fini+0x74/0xe0 [nouveau]
[ 630.314671] [c000003f6c1bba30] [c00800000f75f8e8] nouveau_drm_unload+0x60/0x130 [nouveau]
[ 630.314672] [c000003f6c1bba60] [c008000013b1b118] drm_dev_unregister+0x70/0x160 [drm]
[ 630.314673] [c000003f6c1bbaa0] [c008000013b1b3f0] drm_put_dev+0x48/0xa0 [drm]
[ 630.314675] [c000003f6c1bbb10] [c00800000f760f8c] nouveau_drm_device_remove+0x54/0x90 [nouveau]
[ 630.314676] [c000003f6c1bbb50] [c0000000007a746c] pci_device_remove+0x6c/0x120
[ 630.314677] [c000003f6c1bbb90] [c0000000008a7014] device_release_driver_internal+0x294/0x380
[ 630.314678] [c000003f6c1bbbe0] [c0000000008a719c] driver_detach+0x7c/0x140
[ 630.314679] [c000003f6c1bbc20] [c0000000008a5304] bus_remove_driver+0x84/0x170
[ 630.314680] [c000003f6c1bbc90] [c0000000008a7ef8] driver_unregister+0x48/0x90
[ 630.314681] [c000003f6c1bbd00] [c0000000007a52b8] pci_unregister_driver+0x38/0x150
[ 630.314682] [c000003f6c1bbd50] [c00800000f7af048] nouveau_drm_exit+0x30/0xfc08 [nouveau]
[ 630.314683] [c000003f6c1bbd70] [c0000000001e6b14] sys_delete_module+0x1d4/0x310
[ 630.314684] [c000003f6c1bbe30] [c00000000000b288] system_call+0x5c/0x70
[ 630.314685] Instruction dump:
[ 630.314686] 60000000 3c4c0106 38422730 fbe1fff8 f821ffd1 3920ffff 79290060 7c6a1b78
[ 630.314690] 7fa34840 409d0030 7c0004ac 83e30000 <0c1f0000> 4c00012c 2f9fffff 7bff0020
[ 633.222292] nouveau 0035:04:00.0: DRM: failed to idle channel 0 [DRM]
[ 633.222397] watchdog: CPU 4 became unstuck TB:416755750721
[ 633.222451] CPU: 4 PID: 37 Comm: ksoftirqd/4 Not tainted 4.18.0-13-generic #14~18.04.1-Ubuntu
[ 633.222452] Call Trace:
[ 633.222455] [c000003fe6af3910] [c000000000d4950c] dump_stack+0xb0/0xf4 (unreliable)
[ 633.222459] [c000003fe6af3950] [c00000000002f75c] wd_smp_clear_cpu_pending+0x41c/0x430
[ 633.222462] [c000003fe6af3a00] [c00000000002fec4] wd_timer_fn+0x64/0x3f0
[ 633.222465] [c000003fe6af3ac0] [c0000000001bfae0] call_timer_fn+0x50/0x1c0
[ 633.222467] [c000003fe6af3b40] [c0000000001bfd88] expire_timers+0x138/0x1f0
[ 633.222470] [c000003fe6af3bb0] [c0000000001bff48] run_timer_softirq+0x108/0x270
[ 633.222473] [c000003fe6af3c50] [c000000000d6bd58] __do_softirq+0x158/0x3d4
[ 633.222475] [c000003fe6af3d40] [c00000000011c1c4] run_ksoftirqd+0x64/0x90
[ 633.222478] [c000003fe6af3d60] [c000000000149130] smpboot_thread_fn+0x250/0x290
[ 633.222481] [c000003fe6af3dc0] [c000000000142d98] kthread+0x1a8/0x1b0
[ 633.222484] [c000003fe6af3e30] [c00000000000b65c] ret_from_kernel_thread+0x5c/0x80
[ 648.259266] nouveau 0035:03:00.0: DRM: failed to idle channel 1 [DRM]
[ 663.296153] nouveau 0035:03:00.0: DRM: failed to idle channel 0 [DRM]
[ 673.329216] watchdog: CPU 4 self-detected hard LOCKUP @ ioread32+0x2c/0x170
[ 673.329217] watchdog: CPU 4 TB:437290181288, last heartbeat TB:432168104943 (10004ms ago)
[ 673.329218] Modules linked in: nouveau(-) ofpart at24 cmdlinepart uio_pdrv_genirq ipmi_powernv ipmi_devintf powernv_flash uio mtd opal_prd ipmi_msghandler ibmpowernv vmx_crypto sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear ast i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm crct10dif_vpmsum ahci crc32c_vpmsum tg3 libahci drm_panel_orientation_quirks [last unloaded: nouveau]
[ 673.329245] CPU: 4 PID: 6623 Comm: modprobe Not tainted 4.18.0-13-generic #14~18.04.1-Ubuntu
[ 673.329246] NIP: c000000000729afc LR: c00800000f766990 CTR: c000000000729ad0
[ 673.329247] REGS: c000003fffd87d80 TRAP: 0900 Not tainted (4.18.0-13-generic)
[ 673.329247] MSR: 900000000280b033 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44002884 XER: 00000000
[ 673.329254] CFAR: c00800000f7b2ac4 IRQMASK: 1
[ 673.329255] GPR00: c00800000f7ad3f8 c000003f6c1bb850 c00000000178c200 c00c00008b240010
[ 673.329258] GPR04: 0000000000000010 0000000000000000 0000000000000001 0000000000000008
[ 673.329260] GPR08: c000003fffffa800 000000007fffffff c00c00008b240010 c00800000f7b2ab0
[ 673.329263] GPR12: c000000000729ad0 c000003fffffa800 0000000000000001 0000000000000000
[ 673.329265] GPR16: 0000000000000000 0000000000000000 0000039a7cc60074 0000039a7cc3c978
[ 673.329268] GPR20: 0000039a7cc60070 0000000000000000 00007ffff8448f88 0000000000000000
[ 673.329270] GPR24: 0000039a8bf40ee8 0000000000000000 c00800000f82dca0 c000003fed064098
[ 673.329273] GPR28: c000003fed6d2290 c000003fbd4ffe08 c000003fbd4ffe00 0000000000000000
[ 673.329275] NIP [c000000000729afc] ioread32+0x2c/0x170
[ 673.329276] LR [c00800000f766990] nouveau_bo_rd32+0x48/0x70 [nouveau]
[ 673.329277] Call Trace:
[ 673.329277] [c000003f6c1bb850] [c000003f6c1bb890] 0xc000003f6c1bb890 (unreliable)
[ 673.329279] [c000003f6c1bb880] [c000003f6c1bb8b0] 0xc000003f6c1bb8b0
[ 673.329280] [c000003f6c1bb8a0] [c00800000f7ad3f8] nv84_fence_read+0x40/0x60 [nouveau]
[ 673.329281] [c000003f6c1bb8c0] [c00800000f7aab3c] nouveau_fence_update+0x44/0x100 [nouveau]
[ 673.329282] [c000003f6c1bb900] [c00800000f7ab5d8] nouveau_fence_done+0x100/0x180 [nouveau]
[ 673.329284] [c000003f6c1bb940] [c00800000f7ab8c8] nouveau_fence_wait+0x90/0x150 [nouveau]
[ 673.329285] [c000003f6c1bb970] [c00800000f7a8f90] nouveau_channel_idle+0xd8/0x140 [nouveau]
[ 673.329286] [c000003f6c1bba00] [c00800000f75f714] nouveau_accel_fini+0x2c/0xe0 [nouveau]
[ 673.329287] [c000003f6c1bba30] [c00800000f75f8e8] nouveau_drm_unload+0x60/0x130 [nouveau]
[ 673.329288] [c000003f6c1bba60] [c008000013b1b118] drm_dev_unregister+0x70/0x160 [drm]
[ 673.329289] [c000003f6c1bbaa0] [c008000013b1b3f0] drm_put_dev+0x48/0xa0 [drm]
[ 673.329290] [c000003f6c1bbb10] [c00800000f760f8c] nouveau_drm_device_remove+0x54/0x90 [nouveau]
[ 673.329291] [c000003f6c1bbb50] [c0000000007a746c] pci_device_remove+0x6c/0x120
[ 673.329292] [c000003f6c1bbb90] [c0000000008a7014] device_release_driver_internal+0x294/0x380
[ 673.329293] [c000003f6c1bbbe0] [c0000000008a719c] driver_detach+0x7c/0x140
[ 673.329294] [c000003f6c1bbc20] [c0000000008a5304] bus_remove_driver+0x84/0x170
[ 673.329296] [c000003f6c1bbc90] [c0000000008a7ef8] driver_unregister+0x48/0x90
[ 673.329297] [c000003f6c1bbd00] [c0000000007a52b8] pci_unregister_driver+0x38/0x150
[ 673.329298] [c000003f6c1bbd50] [c00800000f7af048] nouveau_drm_exit+0x30/0xfc08 [nouveau]
[ 673.329299] [c000003f6c1bbd70] [c0000000001e6b14] sys_delete_module+0x1d4/0x310
[ 673.329300] [c000003f6c1bbe30] [c00000000000b288] system_call+0x5c/0x70
[ 673.329301] Instruction dump:
[ 673.329302] 60000000 3c4c0106 38422730 fbe1fff8 f821ffd1 3920ffff 79290060 7c6a1b78
[ 673.329306] 7fa34840 409d0030 7c0004ac 83e30000 <0c1f0000> 4c00012c 2f9fffff 7bff0020
[ 678.329001] nouveau 0004:05:00.0: DRM: failed to idle channel 1 [DRM]
[ 678.329115] watchdog: CPU 4 became unstuck TB:439850390082
[ 678.329165] CPU: 4 PID: 37 Comm: ksoftirqd/4 Not tainted 4.18.0-13-generic #14~18.04.1-Ubuntu
[ 678.329166] Call Trace:
[ 678.329169] [c000003fe6af3910] [c000000000d4950c] dump_stack+0xb0/0xf4 (unreliable)
[ 678.329173] [c000003fe6af3950] [c00000000002f75c] wd_smp_clear_cpu_pending+0x41c/0x430
[ 678.329176] [c000003fe6af3a00] [c00000000002fec4] wd_timer_fn+0x64/0x3f0
[ 678.329178] [c000003fe6af3ac0] [c0000000001bfae0] call_timer_fn+0x50/0x1c0
[ 678.329180] [c000003fe6af3b40] [c0000000001bfd88] expire_timers+0x138/0x1f0
[ 678.329182] [c000003fe6af3bb0] [c0000000001bff48] run_timer_softirq+0x108/0x270
[ 678.329184] [c000003fe6af3c50] [c000000000d6bd58] __do_softirq+0x158/0x3d4
[ 678.329186] [c000003fe6af3d40] [c00000000011c1c4] run_ksoftirqd+0x64/0x90
[ 678.329188] [c000003fe6af3d60] [c000000000149130] smpboot_thread_fn+0x250/0x290
[ 678.329191] [c000003fe6af3dc0] [c000000000142d98] kthread+0x1a8/0x1b0
[ 678.329193] [c000003fe6af3e30] [c00000000000b65c] ret_from_kernel_thread+0x5c/0x80
[ 693.373756] nouveau 0004:05:00.0: DRM: failed to idle channel 0 [DRM]
[ 708.414472] nouveau 0004:04:00.0: DRM: failed to idle channel 1 [DRM]
[ 723.455144] nouveau 0004:04:00.0: DRM: failed to idle channel 0 [DRM]
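
As a rough cross-check of the watchdog lines in the trace above, the timebase (TB) values can be turned back into a stall duration. The sketch below assumes the POWER9 timebase ticks at 512 MHz, which the log itself does not state; the function names and the embedded SAMPLE excerpt are illustrative only. It also groups the "failed to idle channel" messages by PCI device.

```python
import re

# Assumption: POWER9 timebase frequency of 512 MHz (not stated in the log).
TIMEBASE_HZ = 512_000_000

LOCKUP_RE = re.compile(
    r"watchdog: CPU (\d+) TB:(\d+), last heartbeat TB:(\d+) \((\d+)ms ago\)"
)
IDLE_RE = re.compile(
    r"nouveau ([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.\d): "
    r"DRM: failed to idle channel (\d+)"
)


def stall_ms(now_tb, last_tb, hz=TIMEBASE_HZ):
    """Convert a timebase delta to whole milliseconds (truncating)."""
    return (now_tb - last_tb) * 1000 // hz


def summarize(log_text):
    """Return (lockups, failures) extracted from dmesg-style text."""
    lockups = []
    for m in LOCKUP_RE.finditer(log_text):
        cpu, now_tb, last_tb, reported = map(int, m.groups())
        lockups.append((cpu, stall_ms(now_tb, last_tb), reported))
    failures = {}
    for m in IDLE_RE.finditer(log_text):
        failures.setdefault(m.group(1), []).append(int(m.group(2)))
    return lockups, failures


# Four lines copied from the trace in this report.
SAMPLE = """\
[ 618.185258] nouveau 0035:04:00.0: DRM: failed to idle channel 1 [DRM]
[ 630.314601] watchdog: CPU 4 TB:415266697100, last heartbeat TB:410146341428 (10000ms ago)
[ 633.222292] nouveau 0035:04:00.0: DRM: failed to idle channel 0 [DRM]
[ 673.329217] watchdog: CPU 4 TB:437290181288, last heartbeat TB:432168104943 (10004ms ago)
"""

if __name__ == "__main__":
    lockups, failures = summarize(SAMPLE)
    for cpu, computed, reported in lockups:
        print(f"CPU {cpu}: computed {computed} ms, kernel reported {reported} ms")
    for dev, channels in failures.items():
        print(f"{dev}: failed to idle channels {channels}")
```

Under the 512 MHz assumption, the computed stalls agree with the 10000ms and 10004ms figures the kernel printed, which suggests CPU 4 was stuck in ioread32 for about ten seconds each time before the watchdog fired.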

Manoj Iyer (manjo) on 2019-01-11
description: updated
bugproxy (bugproxy) on 2019-01-14
tags: added: architecture-ppc64le bugnameltc-174680 severity-high targetmilestone-inin18043