Summary:
A kernel panic occurs after repeatedly opening and closing TCP connections to localhost.
The bug was found using STAF RPC calls, but is easily reproducible with SSH.
The bug doesn't appear on an identical virtual machine booting from the disk.
The bug is not reproducible on a similarly-prepared Ubuntu 16.04 machine.
The bug is also reproducible with the older 4.13.0-16-generic kernel.
Reproducible on multiple hardware types.
Unable to create a kernel memory dump due to makedumpfile errors.
apport-bug save attached.
Reproduction steps:
1. Boot a system from an nfsroot
2. Configure password-less localhost ssh access
3. Run a loop: `while true; do ssh localhost 'uname -a'; done`
4. Wait for the system to crash
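
A bounded variant of the loop in step 3 can make it easier to report roughly how many connections a run completed. The sketch below is only an illustration: `repro_loop` is a hypothetical helper name, not part of the original report, and it stops on the first failed command instead of looping forever.

```shell
# Hypothetical helper (not from the original report): run a command a fixed
# number of times and report how many iterations completed.
repro_loop() {
  count="$1"; shift
  i=0
  while [ "$i" -lt "$count" ]; do
    # Discard the command's output; stop if the command itself fails.
    "$@" >/dev/null 2>&1 || { echo "failed at iteration $i"; return 1; }
    i=$((i + 1))
  done
  echo "completed $i iterations"
}
```

For example, `repro_loop 10000 ssh localhost 'uname -a'` runs the same SSH command as step 3 up to 10000 times. If the kernel panics first, nothing is printed, so the absence of the final message is itself the signal.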
NFSRoot boot options: vmlinuz initrd=initrd.img boot=nfs root=/dev/nfs nfsroot=190.0.0.254:/diskless/host/u1616/Ubuntu/17.10 intel_iommu=on net.ifnames=0 biosdevname=0 apparmor=0 ip=:::::eth0:dhcp blacklist=i40e,ixgbe,fm10k crashkernel=384M-:768M rw
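
Since the bug only shows up on nfsroot boots, it may help to confirm that a test machine really has its root filesystem on NFS before running the loop. The sketch below assumes the standard `/proc/mounts` layout (device, mount point, filesystem type, ...); `root_fstype` is a hypothetical helper name, not something from the original report.

```shell
# Hypothetical helper: print the filesystem type of the last entry mounted
# on "/" in a mounts table (defaults to /proc/mounts).
root_fstype() {
  awk '$2 == "/" { t = $3 } END { print t }' "${1:-/proc/mounts}"
}
```

On an nfsroot system `root_fstype` would be expected to print an NFS type such as `nfs` or `nfs4`, while a disk-booted machine would print its local filesystem type (for example `ext4`).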
Software:
OS: Ubuntu 17.10
Kernel: 4.13.0-17-generic x86_64
Trace:
4,1151,52372730,-;general protection fault: 0000 [#1] SMP
4,1152,52372771,-;Modules linked in: arc4 md4 rpcsec_gss_krb5 nls_utf8 auth_rpcgss cifs nfsv4 ccm ipmi_ssif intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp intel_cstate mei_me input_leds joydev intel_rapl_perf mei kvm_intel lpc_ich ioatdma kvm irqbypass ipmi_si ipmi_devintf ipmi_msghandler shpchp acpi_pad acpi_power_meter mac_hid ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 nfsv3 nfs_acl nfs lockd grace sunrpc fscache raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor hid_generic usbhid hid raid6_pq libcrc32c raid1 raid0 multipath linear uas usb_storage crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc ast ttm aesni_intel igb drm_kms_helper aes_x86_64 crypto_simd syscopyarea glue_helper
4,1153,52373251,c; sysfillrect dca cryptd sysimgblt i2c_algo_bit fb_sys_fops ahci ptp drm libahci pps_core wmi
4,1154,52373322,-;CPU: 11 PID: 1848 Comm: STAFProc Not tainted 4.13.0-17-generic #20-Ubuntu
4,1155,52373371,-;Hardware name: Supermicro Super Server/X10SRD-F, BIOS 2.0 12/17/2015
4,1156,52373418,-;task: ffff9d09267f5d00 task.stack: ffffafddc3a70000
4,1157,52373461,-;RIP: 0010:kfree+0x53/0x160
4,1158,52373486,-;RSP: 0018:ffff9d092ecc3bc8 EFLAGS: 00010207
4,1159,52373521,-;RAX: 0000000000000000 RBX: 241c894900000001 RCX: 0000000000000004
4,1160,52373566,-;RDX: 000032d49081cc08 RSI: 0000000000010080 RDI: 000062fac0000000
4,1161,52373611,-;RBP: ffff9d092ecc3be0 R08: 000000000001f4c0 R09: ffffffff943bb839
4,1162,52373656,-;R10: 00904c7891000000 R11: 0000000000000000 R12: ffff9d09267ef000
4,1163,52373701,-;R13: ffffffff93fa155e R14: ffff9d09267ef000 R15: ffff9d09267ef000
4,1164,52373746,-;FS: 00007f3a53313700(0000) GS:ffff9d092ecc0000(0000) knlGS:0000000000000000
4,1165,52373797,-;CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
4,1166,52373834,-;CR2: 00007fd5c9ffa780 CR3: 00000004666d7000 CR4: 00000000003406e0
4,1167,52373878,-;DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
4,1168,52373923,-;DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
4,1169,52373968,-;Call Trace:
4,1170,52373987,-; <IRQ>
4,1171,52374009,-; security_sk_free+0x3e/0x50
4,1172,52374042,-; __sk_destruct+0x108/0x190
4,1173,52374070,-; sk_destruct+0x20/0x30
4,1174,52374095,-; __sk_free+0x82/0xa0
4,1175,52374120,-; sk_free+0x19/0x20
4,1176,52374144,-; sock_put+0x14/0x20
4,1177,52374168,-; tcp_v4_rcv+0x94d/0x9d0
4,1178,52374195,-; ip_local_deliver_finish+0x5c/0x1f0
4,1179,52374226,-; ip_local_deliver+0x6f/0xe0
4,1180,52374254,-; ip_rcv_finish+0x120/0x410
4,1181,52374281,-; ip_rcv+0x28c/0x3a0
4,1182,52374305,-; ? update_load_avg+0x46d/0x590
4,1183,52374335,-; __netif_receive_skb_core+0x39a/0xaa0
4,1184,52374369,-; __netif_receive_skb+0x18/0x60
4,1185,52374398,-; ? __netif_receive_skb+0x18/0x60
4,1186,52374428,-; process_backlog+0x89/0x140
4,1187,52374457,-; net_rx_action+0x13b/0x380
4,1188,52374485,-; __do_softirq+0xde/0x2a5
4,1189,52375837,-; do_softirq_own_stack+0x1c/0x30
4,1190,52377188,-; </IRQ>
4,1191,52378538,-; do_softirq.part.17+0x4e/0x50
4,1192,52379869,-; __local_bh_enable_ip+0x5a/0x60
4,1193,52381174,-; ip_finish_output2+0x172/0x3a0
4,1194,52382442,-; ip_finish_output+0x190/0x250
4,1195,52383658,-; ? ip_finish_output+0x190/0x250
4,1196,52384833,-; ip_output+0x70/0xe0
4,1197,52385959,-; ? lock_timer_base+0x81/0xa0
4,1198,52387043,-; ip_local_out+0x35/0x40
4,1199,52388078,-; ip_queue_xmit+0x160/0x3e0
4,1200,52389068,-; ? __alloc_skb+0x87/0x1e0
4,1201,52390019,-; tcp_transmit_skb+0x538/0x9e0
4,1202,52390944,-; tcp_send_ack.part.35+0xbd/0x130
4,1203,52391849,-; tcp_send_ack+0x16/0x20
4,1204,52392729,-; tcp_cleanup_rbuf+0x67/0x100
4,1205,52393600,-; tcp_recvmsg+0x572/0xb60
4,1206,52394464,-; inet_recvmsg+0x4b/0xc0
4,1207,52395307,-; sock_recvmsg+0x3d/0x50
4,1208,52396135,-; sock_read_iter+0x90/0xe0
4,1209,52396955,-; new_sync_read+0xde/0x130
4,1210,52397766,-; __vfs_read+0x26/0x40
4,1211,52398568,-; vfs_read+0x8e/0x130
4,1212,52399347,-; SyS_read+0x55/0xc0
4,1213,52400099,-; entry_SYSCALL_64_fastpath+0x1e/0xa9
4,1214,52400845,-;RIP: 0033:0x7f3a58da0d5d
4,1215,52401583,-;RSP: 002b:00007f3a53310e70 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
4,1216,52402370,-;RAX: ffffffffffffffda RBX: 00007f3a44000078 RCX: 00007f3a58da0d5d
4,1217,52403153,-;RDX: 0000000000000020 RSI: 00007f3a4400b5f8 RDI: 000000000000000f
4,1218,52403952,-;RBP: 0000000000001010 R08: 0000000000000000 R09: 0000000000000000
4,1219,52404752,-;R10: 00007f3a593e4d20 R11: 0000000000000246 R12: 00007f3a44000020
4,1220,52405553,-;R13: 0000000000001000 R14: 00007f3a44000078 R15: 0000000000000001
4,1221,52406363,-;Code: 00 80 49 01 da 0f 82 1c 01 00 00 48 c7 c7 00 00 00 80 48 2b 3d 7f e6 c1 00 49 01 fa 49 c1 ea 0c 49 c1 e2 06 4c 03 15 5d e6 c1 00 <49> 8b 42 20 48 8d 50 ff a8 01 4c 0f 45 d2 49 8b 52 20 48 8d 42
1,1222,52408171,-;RIP: kfree+0x53/0x160 RSP: ffff9d092ecc3bc8
4,1223,52409129,-;---[ end trace f7b53e7f81a1cbda ]---
0,1224,52414686,-;Kernel panic - not syncing: Fatal exception in interrupt