These NVIDIA drivers look really unhappy. The first trace:
[ 11.997286] ------------[ cut here ]------------
[ 11.997290] UBSAN: array-index-out-of-bounds in build/nvidia/535.171.04/build/nvidia-uvm/uvm_pmm_gpu.c:2364:28
[ 11.997293] index 0 is out of range for type 'uvm_gpu_chunk_t *[*]'
[ 11.997296] CPU: 15 PID: 2599 Comm: ollama Tainted: P O 6.8.0-35-generic #35-Ubuntu
[ 11.997298] Hardware name: HP Victus by HP Gaming Laptop 16-s0xxx/8BD5, BIOS F.20 03/20/2024
[ 11.997300] Call Trace:
[ 11.997301] <TASK>
[ 11.997305] dump_stack_lvl+0x48/0x70
[ 11.997313] dump_stack+0x10/0x20
[ 11.997315] __ubsan_handle_out_of_bounds+0xc6/0x110
[ 11.997320] split_gpu_chunk+0x13f/0x410 [nvidia_uvm]
[ 11.997346] uvm_pmm_gpu_alloc+0x2da/0x6d0 [nvidia_uvm]
[ 11.997366] phys_mem_allocate+0xac/0x230 [nvidia_uvm]
[ 11.997389] allocate_directory+0xb4/0x130 [nvidia_uvm]
[ 11.997405] ? allocate_directory+0xb4/0x130 [nvidia_uvm]
[ 11.997422] uvm_page_tree_init+0x133/0x450 [nvidia_uvm]
[ 11.997442] uvm_gpu_retain_by_uuid+0x19df/0x2b80 [nvidia_uvm]
[ 11.997461] ? __mod_memcg_lruvec_state+0xd6/0x1a0
[ 11.997468] uvm_va_space_register_gpu+0x47/0x740 [nvidia_uvm]
[ 11.997486] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997489] ? bdev_getblk+0x53/0x90
[ 11.997493] uvm_api_register_gpu+0x5a/0x90 [nvidia_uvm]
[ 11.997510] uvm_ioctl+0x1a26/0x1cd0 [nvidia_uvm]
[ 11.997526] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997528] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997530] ? xas_find+0x74/0x1e0
[ 11.997533] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997535] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997537] ? next_uptodate_folio+0xa9/0x320
[ 11.997541] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997542] ? _raw_spin_lock_irqsave+0xe/0x20
[ 11.997546] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997548] ? thread_context_non_interrupt_add+0x13a/0x250 [nvidia_uvm]
[ 11.997568] uvm_unlocked_ioctl_entry.part.0+0x7b/0xf0 [nvidia_uvm]
[ 11.997584] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997586] ? do_read_fault+0x112/0x1d0
[ 11.997589] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997591] ? do_fault+0x109/0x350
[ 11.997592] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997595] uvm_unlocked_ioctl_entry+0x6b/0x90 [nvidia_uvm]
[ 11.997610] __x64_sys_ioctl+0xa0/0xf0
[ 11.997613] x64_sys_call+0x143b/0x25c0
[ 11.997616] do_syscall_64+0x7f/0x180
[ 11.997619] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997621] ? __count_memcg_events+0x6b/0x120
[ 11.997624] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997625] ? count_memcg_events.constprop.0+0x2a/0x50
[ 11.997628] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997630] ? handle_mm_fault+0xad/0x380
[ 11.997633] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997634] ? do_user_addr_fault+0x338/0x6b0
[ 11.997637] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997639] ? irqentry_exit_to_user_mode+0x7b/0x260
[ 11.997641] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997643] ? irqentry_exit+0x43/0x50
[ 11.997644] ? srso_alias_return_thunk+0x5/0xfbef5
[ 11.997646] ? exc_page_fault+0x94/0x1b0
[ 11.997648] entry_SYSCALL_64_after_hwframe+0x78/0x80
[ 11.997650] RIP: 0033:0x78462c924ded
[ 11.997672] Code: 04 25 28 00 00 00 48 89 45 c8 31 c0 48 8d 45 10 c7 45 b0 10 00 00 00 48 89 45 b8 48 8d 45 d0 48 89 45 c0 b8 10 00 00 00 0f 05 <89> c2 3d 00 f0 ff ff 77 1a 48 8b 45 c8 64 48 2b 04 25 28 00 00 00
[ 11.997674] RSP: 002b:00007845dcbff4a0 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 11.997676] RAX: ffffffffffffffda RBX: 00007845bbd00860 RCX: 000078462c924ded
[ 11.997678] RDX: 00007845dcbff540 RSI: 0000000000000025 RDI: 0000000000000008
[ 11.997679] RBP: 00007845dcbff4f0 R08: 00007845bbd008f0 R09: 0000000000000000
[ 11.997680] R10: 000078462c80d630 R11: 0000000000000246 R12: 00007845b003b8b6
[ 11.997681] R13: 00007845bbd008f0 R14: 00007845dcbff540 R15: 0000000000000008
[ 11.997684] </TASK>
[ 11.997692] ---[ end trace ]---