Comment 2 for bug 2069300

Seth Arnold (seth-arnold) wrote : Re: Xorg crash

These nvidia drivers look really unhappy. Here is the first trace:

[ 11.997286] ------------[ cut here ]------------
[ 11.997290] UBSAN: array-index-out-of-bounds in build/nvidia/535.171.04/build/nvidia-uvm/uvm_pmm_gpu.c:2364:28
[ 11.997293] index 0 is out of range for type 'uvm_gpu_chunk_t *[*]'
[ 11.997296] CPU: 15 PID: 2599 Comm: ollama Tainted: P O 6.8.0-35-generic #35-Ubuntu
[ 11.997298] Hardware name: HP Victus by HP Gaming Laptop 16-s0xxx/8BD5, BIOS F.20 03/20/2024
[ 11.997300] Call Trace:
[ 11.997301] <TASK>
[ 11.997305] dump_stack_lvl+0x48/0x70
[ 11.997313] dump_stack+0x10/0x20
[ 11.997315] __ubsan_handle_out_of_bounds+0xc6/0x110
[ 11.997320] split_gpu_chunk+0x13f/0x410 [nvidia_uvm]
[ 11.997346] uvm_pmm_gpu_alloc+0x2da/0x6d0 [nvidia_uvm]
[ 11.997366] phys_mem_allocate+0xac/0x230 [nvidia_uvm]
[ 11.997389] allocate_directory+0xb4/0x130 [nvidia_uvm]
[ 11.997405] ? allocate_directory+0xb4/0x130 [nvidia_uvm]
[ 11.997422] uvm_page_tree_init+0x133/0x450 [nvidia_uvm]
[ 11.997442] uvm_gpu_retain_by_uuid+0x19df/0x2b80 [nvidia_uvm]
[ 11.997461] ? __mod_memcg_lruvec_state+0xd6/0x1a0
[ 11.997468] uvm_va_space_register_gpu+0x47/0x740 [nvidia_uvm]
[ 11.997486] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997489] ? bdev_getblk+0x53/0x90
[ 11.997493] uvm_api_register_gpu+0x5a/0x90 [nvidia_uvm]
[ 11.997510] uvm_ioctl+0x1a26/0x1cd0 [nvidia_uvm]
[ 11.997526] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997528] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997530] ? xas_find+0x74/0x1e0
[ 11.997533] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997535] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997537] ? next_uptodate_folio+0xa9/0x320
[ 11.997541] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997542] ? _raw_spin_lock_irqsave+0xe/0x20
[ 11.997546] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997548] ? thread_context_non_interrupt_add+0x13a/0x250 [nvidia_uvm]
[ 11.997568] uvm_unlocked_ioctl_entry.part.0+0x7b/0xf0 [nvidia_uvm]
[ 11.997584] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997586] ? do_read_fault+0x112/0x1d0
[ 11.997589] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997591] ? do_fault+0x109/0x350
[ 11.997592] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997595] uvm_unlocked_ioctl_entry+0x6b/0x90 [nvidia_uvm]
[ 11.997610] __x64_sys_ioctl+0xa0/0xf0
[ 11.997613] x64_sys_call+0x143b/0x25c0
[ 11.997616] do_syscall_64+0x7f/0x180
[ 11.997619] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997621] ? __count_memcg_events+0x6b/0x120
[ 11.997624] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997625] ? count_memcg_events.constprop.0+0x2a/0x50
[ 11.997628] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997630] ? handle_mm_fault+0xad/0x380
[ 11.997633] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997634] ? do_user_addr_fault+0x338/0x6b0
[ 11.997637] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997639] ? irqentry_exit_to_user_mode+0x7b/0x260
[ 11.997641] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997643] ? irqentry_exit+0x43/0x50
[ 11.997644] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997646] ? exc_page_fault+0x94/0x1b0
[ 11.997648] entry_SYSCALL_64_after_hwframe+0x78/0x80
[ 11.997650] RIP: 0033:0x78462c924ded
[ 11.997672] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
[ 11.997674] RSP: 002b:00007845dcbff4a0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 11.997676] RAX: ffffffffffffffda RBX: 00007845bbd00860 RCX: 000078462c924ded
[ 11.997678] RDX: 00007845dcbff540 RSI: 0000000000000025 RDI: 0000000000000008
[ 11.997679] RBP: 00007845dcbff4f0 R08: 00007845bbd008f0 R09: 0000000000000000
[ 11.997680] R10: 000078462c80d630 R11: 0000000000000246 R12: 00007845b003b8b6
[ 11.997681] R13: 00007845bbd008f0 R14: 00007845dcbff540 R15: 0000000000000008
[ 11.997684] </TASK>
[ 11.997692] ---[ end trace ]---