Kernel Panic with linux kernel 4.15.0-60 possibly related to network subsystem

Bug #1843152 reported by David Burrow on 2019-09-08
This bug affects 3 people
Affects: linux (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

1. Releases: Ubuntu Server 16.04 and Ubuntu Server 18.04
2. Package: Linux Kernel 4.15.0-60 amd64
3. What I expected to happen: Not a recurrent kernel panic.
4. What happened instead: Recurrent kernel panic.

While running Ubuntu 16.04 server on an internet appliance that was serving as my router, after the system was updated from kernel 4.15.0-58 to 4.15.0-60, I began getting frequent kernel panics (the system seldom remained up for more than an hour after a reboot). I have included example stack traces below. After memtest and fsck both revealed no problems, I opted to rebuild from a fresh install, this time of Ubuntu 18.04 server. Upon completing the rebuild, I put it back in service. It crashed with the same kernel panic within 30 minutes.

I have since updated to kernel 5.0.0-27 and the kernel panics have completely stopped.
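
For anyone checking which machines might be exposed, the version observations above can be written down as a small lookup. This is only a sketch in Python: the names `REPORTED`, `parse_release` and `reported_status` are hypothetical, and the table records just what this description states (4.15.0-58 was the last kernel before the panics began, 4.15.0-60 panicked, 5.0.0-27 has been stable). It is not an authoritative list of affected kernels.

```python
import re

# Observations taken from this bug description only; not an authoritative list.
REPORTED = {
    ("4.15.0", 58): "last kernel in use before the panics began",
    ("4.15.0", 60): "recurrent panics reported",
    ("5.0.0", 27): "no panics reported",
}


def parse_release(release):
    """Split a release string such as '4.15.0-60-generic' into (base, abi)."""
    m = re.match(r"^(\d+\.\d+\.\d+)-(\d+)", release)
    if m is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return m.group(1), int(m.group(2))


def reported_status(release):
    """Look up what this report says about a kernel release, if anything."""
    return REPORTED.get(parse_release(release), "not mentioned in this report")


print(reported_status("4.15.0-60-generic"))  # recurrent panics reported
```

On a live system the string passed in would typically be the output of `uname -r`.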

[10170.296117] kernel BUG at /build/linux-5mCauq/linux-4.15.0/net/ipv4/ip_output.c:636!
[10170.304214] invalid opcode: 0000 [#1] SMP PTI
[10170.308692] Modules linked in: st lp parport_pc ppdev parport ipt_REJECT nf_reject_ipv4 xt_condition(OE) xt_time xt_comment xt_iface(OE) xt_multiport xt_conntrack xt_set xt_recent xt_hashlimit xt_addrtype xt_mark iptable_mangle xt_nat xt_REDIRECT nf_nat_redirect ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_iprange iptable_nat nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_CT iptable_raw ip_set_list_set ip_set_hash_ip ip_set_hash_net ip_set_hash_mac ip_set xt_NFLOG nfnetlink_log nf_log_ipv4 nf_log_common xt_LOG nf_conntrack_sane nf_conntrack_netlink nfnetlink nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netbios_ns
[10170.381810] nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ts_kmp nf_conntrack_amanda nf_conntrack iptable_filter bridge intel_rapl intel_soc_dts_thermal intel_soc_dts_iosf intel_powerclamp coretemp kvm_intel snd_hda_codec_hdmi kvm irqbypass punit_atom_debug intel_cstate snd_hda_intel snd_hda_codec snd_hda_core joydev snd_hwdep snd_pcm snd_timer serio_raw input_leds snd lpc_ich soundcore shpchp mac_hid mei_txe mei sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 8021q garp mrp stp llc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit drm_kms_helper
[10170.454089] ghash_clmulni_intel e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ptp ahci psmouse cryptd drm pps_core libahci video hid_generic usbhid hid [last unloaded: parport_pc]
[10170.471175] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G OE 4.15.0-60-generic #67-Ubuntu
[10170.480269] Hardware name: Protectli FW1/FW1, BIOS 5.6.5 05/14/2019
[10170.486704] RIP: 0010:ip_do_fragment+0x482/0x820
[10170.491462] RSP: 0018:ffff92533fc83a18 EFLAGS: 00010202
[10170.496824] RAX: 0000000000000001 RBX: ffff92532b43ed00 RCX: ffffffff8d64cdf0
[10170.504158] RDX: 0000000000000024 RSI: 00000000000005c8 RDI: ffff925329f06300
[10170.511485] RBP: ffff92533fc83a80 R08: ffff925330be9700 R09: 00000000000005dc
[10170.518777] R10: 0000000000000000 R11: ffff92533fc839d0 R12: 0000000000000014
[10170.526096] R13: ffff92532dea4300 R14: 0000000000002328 R15: ffff925330be974e
[10170.533430] FS: 0000000000000000(0000) GS:ffff92533fc80000(0000) knlGS:0000000000000000
[10170.541729] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10170.547631] CR2: 00007f3a9dd13b80 CR3: 000000009640a000 CR4: 00000000001006e0
[10170.554957] Call Trace:
[10170.557452] <IRQ>
[10170.559541] ? ip_copy_metadata+0x220/0x220
[10170.563849] ip_fragment.constprop.45+0x43/0x80
[10170.568499] ip_finish_output+0x182/0x270
[10170.572604] ? nf_hook_slow+0x48/0xc0
[10170.576354] ip_output+0x70/0xe0
[10170.579664] ? ip_fragment.constprop.45+0x80/0x80
[10170.584496] ip_forward_finish+0x51/0x80
[10170.588540] ip_forward+0x376/0x470
[10170.592108] ? ip4_key_hashfn+0xc0/0xc0
[10170.596056] ip_rcv_finish+0x129/0x430
[10170.599911] ip_rcv+0x296/0x360
[10170.603125] ? inet_del_offload+0x40/0x40
[10170.607242] __netif_receive_skb_core+0x432/0xb80
[10170.612070] ? __slab_free+0x14d/0x2c0
[10170.615937] ? __slab_free+0x14d/0x2c0
[10170.619794] ? __build_skb+0x2b/0xf0
[10170.623477] __netif_receive_skb+0x18/0x60
[10170.627687] ? __netif_receive_skb+0x18/0x60
[10170.632067] netif_receive_skb_internal+0x45/0xe0
[10170.637118] napi_gro_receive+0xc5/0xf0
[10170.641070] e1000_receive_skb+0x86/0xe0 [e1000e]
[10170.645913] e1000_clean_rx_irq+0x1fe/0x3e0 [e1000e]
[10170.651001] e1000e_poll+0x7e/0x2e0 [e1000e]
[10170.655368] net_rx_action+0x140/0x3a0
[10170.659220] __do_softirq+0xe4/0x2d4
[10170.662886] irq_exit+0xc5/0xd0
[10170.666098] do_IRQ+0x86/0xe0
[10170.669157] common_interrupt+0x8c/0x8c
[10170.673083] </IRQ>
[10170.675248] RIP: 0010:cpuidle_enter_state+0xa7/0x2f0
[10170.680331] RSP: 0018:ffffbc22c0cbbe68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda
[10170.688098] RAX: ffff92533fca2840 RBX: 0000093ff4e19b87 RCX: 000000000000001f
[10170.695402] RDX: 0000093ff4e19b87 RSI: fffffffa71561a78 RDI: 0000000000000000
[10170.702712] RBP: ffffbc22c0cbbea8 R08: 0000000000000002 R09: 0000000000022080
[10170.710036] R10: ffffbc22c0cbbe38 R11: 0000128930a6e3a0 R12: ffff92533fcac230
[10170.717346] R13: 0000000000000001 R14: ffffffff8e373298 R15: 0000000000000000
[10170.724679] ? cpuidle_enter_state+0x97/0x2f0
[10170.729186] cpuidle_enter+0x17/0x20
[10170.732864] call_cpuidle+0x23/0x40
[10170.740100] do_idle+0x18c/0x1f0
[10170.747042] cpu_startup_entry+0x73/0x80
[10170.754667] start_secondary+0x1ab/0x200
[10170.762254] secondary_startup_64+0xa5/0xb0
[10170.770047] Code: 8b 87 d8 00 00 00 48 2b 87 d0 00 00 00 39 c2 0f 87 f4 00 00 00 8b 87 e4 00 00 00 83 f8 01 0f 85 e5 00 00 00 48 83 7f 18 00 74 8e <0f> 0b 8b 0a 89 08 44 89 e1 8b 54 0a fc 89 54 08 fc e9 00 fd ff
[10170.792899] RIP: ip_do_fragment+0x482/0x820 RSP: ffff92533fc83a18
[10170.802654] ---[ end trace 1c2b11293ec6a0b9 ]---
[10170.810523] Kernel panic - not syncing: Fatal exception in interrupt
[10170.820170] Kernel Offset: 0xbe00000 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
[10170.834057] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
[10170.844204] ------------[ cut here ]------------

[10170.851662] sched: Unexpected reschedule of offline CPU#0!
[10170.859895] WARNING: CPU: 1 PID: 0 at /build/linux-5mCauq/linux-4.15.0/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x3a/0x40
[10170.874714] Modules linked in: st lp parport_pc ppdev parport ipt_REJECT nf_reject_ipv4 xt_condition(OE) xt_time xt_comment xt_iface(OE) xt_multiport xt_conntrack xt_set xt_recent xt_hashlimit xt_addrtype xt_mark iptable_mangle xt_nat xt_REDIRECT nf_nat_redirect ipt_MASQUERADE nf_nat_masquerade_ipv4 xt_iprange iptable_nat nf_nat_ipv4 nf_conntrack_ipv4 nf_defrag_ipv4 xt_tcpudp xt_CT iptable_raw ip_set_list_set ip_set_hash_ip ip_set_hash_net ip_set_hash_mac ip_set xt_NFLOG nfnetlink_log nf_log_ipv4 nf_log_common xt_LOG nf_conntrack_sane nf_conntrack_netlink nfnetlink nf_nat_tftp nf_nat_snmp_basic nf_conntrack_snmp nf_nat_sip nf_nat_pptp nf_nat_proto_gre nf_nat_irc nf_nat_h323 nf_nat_ftp nf_nat_amanda nf_nat nf_conntrack_tftp nf_conntrack_sip nf_conntrack_pptp nf_conntrack_proto_gre nf_conntrack_netbios_ns
[10170.958595] nf_conntrack_broadcast nf_conntrack_irc nf_conntrack_h323 nf_conntrack_ftp ts_kmp nf_conntrack_amanda nf_conntrack iptable_filter bridge intel_rapl intel_soc_dts_thermal intel_soc_dts_iosf intel_powerclamp coretemp kvm_intel snd_hda_codec_hdmi kvm irqbypass punit_atom_debug intel_cstate snd_hda_intel snd_hda_codec snd_hda_core joydev snd_hwdep snd_pcm snd_timer serio_raw input_leds snd lpc_ich soundcore shpchp mac_hid mei_txe mei sch_fq_codel ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 8021q garp mrp stp llc ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear i915 crct10dif_pclmul crc32_pclmul i2c_algo_bit drm_kms_helper
[10171.043760] ghash_clmulni_intel e1000e syscopyarea sysfillrect sysimgblt fb_sys_fops ptp ahci psmouse cryptd drm pps_core libahci video hid_generic usbhid hid [last unloaded: parport_pc]
[10171.064380] CPU: 1 PID: 0 Comm: swapper/1 Tainted: G D OE 4.15.0-60-generic #67-Ubuntu
[10171.077028] Hardware name: Protectli FW1/FW1, BIOS 5.6.5 05/14/2019
[10171.087035] RIP: 0010:native_smp_send_reschedule+0x3a/0x40
[10171.096240] RSP: 0018:ffff92533fc83450 EFLAGS: 00010086
[10171.105173] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000006
[10171.116090] RDX: 0000000000000007 RSI: 0000000000000082 RDI: ffff92533fc96490
[10171.126979] RBP: ffff92533fc83450 R08: 0000000000002417 R09: 0000000000cdcdcd
[10171.137898] R10: 000000000000038f R11: 00000000ffffffff R12: ffff92533fc22840
[10171.148836] R13: ffff92532f0bab00 R14: ffff92533fc83508 R15: ffff92533fc22840
[10171.159786] FS: 0000000000000000(0000) GS:ffff92533fc80000(0000) knlGS:0000000000000000
[10171.171742] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[10171.181321] CR2: 00007f3a9dd13b80 CR3: 000000009640a000 CR4: 00000000001006e0
[10171.192326] Call Trace:
[10171.198576] <IRQ>
[10171.204384] resched_curr+0x5d/0xc0
[10171.211703] check_preempt_curr+0x7a/0x90
[10171.219531] ttwu_do_wakeup+0x1e/0x140
[10171.227098] ttwu_do_activate+0x77/0x80
[10171.234751] try_to_wake_up+0x1d8/0x4a0
[10171.242409] default_wake_function+0x12/0x20
[10171.250511] autoremove_wake_function+0x12/0x40
[10171.258890] __wake_up_common+0x73/0x130
[10171.266657] __wake_up_common_lock+0x80/0xc0
[10171.274768] __wake_up+0x13/0x20
[10171.281801] wake_up_klogd_work_func+0x40/0x60
[10171.290084] irq_work_run_list+0x52/0x80
[10171.297802] irq_work_run+0x2c/0x40
[10171.305098] smp_irq_work_interrupt+0x3e/0xd0
[10171.313268] irq_work_interrupt+0x8c/0xa0
[10171.321077] RIP: 0010:panic+0x201/0x254
[10171.328710] RSP: 0018:ffff92533fc83770 EFLAGS: 00000282 ORIG_RAX: ffffffffffffff09
[10171.340193] RAX: 0000000000000041 RBX: 0000000000000200 RCX: 0000000000000006
[10171.351265] RDX: 0000000000000000 RSI: 0000000000000086 RDI: ffff92533fc96490
[10171.362350] RBP: ffff92533fc837e8 R08: 0000000000002415 R09: 0000000000cdcdcd
[10171.373452] R10: ffffffff8e45ff60 R11: 00000000ffffffff R12: 0000000000000000
[10171.384546] R13: 0000000000000000 R14: 0000000000000000 R15: ffff92533fc83968
[10171.395633] oops_end+0xb6/0xd0
[10171.402625] die+0x42/0x50
[10171.409160] do_trap+0xb1/0x140
[10171.416118] do_error_trap+0xa6/0x140
[10171.423606] ? ip_do_fragment+0x482/0x820
[10171.431444] ? hash_mac4_resize+0x460/0x460 [ip_set_hash_mac]
[10171.441033] do_invalid_op+0x20/0x30
[10171.448408] invalid_op+0x1b/0x40
[10171.455490] RIP: 0010:ip_do_fragment+0x482/0x820
[10171.463929] RSP: 0018:ffff92533fc83a18 EFLAGS: 00010202
[10171.472982] RAX: 0000000000000001 RBX: ffff92532b43ed00 RCX: ffffffff8d64cdf0
[10171.484035] RDX: 0000000000000024 RSI: 00000000000005c8 RDI: ffff925329f06300
[10171.495087] RBP: ffff92533fc83a80 R08: ffff925330be9700 R09: 00000000000005dc
[10171.506069] R10: 0000000000000000 R11: ffff92533fc839d0 R12: 0000000000000014
[10171.516923] R13: ffff92532dea4300 R14: 0000000000002328 R15: ffff925330be974e
[10171.527788] ? sk_common_release+0xd0/0xd0
[10171.535542] ? conntrack_mt_v3+0x20/0x30 [xt_conntrack]
[10171.544423] ? ip_copy_metadata+0x220/0x220
[10171.552190] ip_fragment.constprop.45+0x43/0x80
[10171.560286] ip_finish_output+0x182/0x270
[10171.567840] ? nf_hook_slow+0x48/0xc0
[10171.574980] ip_output+0x70/0xe0
[10171.581536] ? ip_fragment.constprop.45+0x80/0x80
[10171.589467] ip_forward_finish+0x51/0x80
[10171.596456] ip_forward+0x376/0x470
[10171.602861] ? ip4_key_hashfn+0xc0/0xc0
[10171.609525] ip_rcv_finish+0x129/0x430
[10171.615939] ip_rcv+0x296/0x360
[10171.621596] ? inet_del_offload+0x40/0x40
[10171.628010] __netif_receive_skb_core+0x432/0xb80
[10171.635033] ? __slab_free+0x14d/0x2c0
[10171.640992] ? __slab_free+0x14d/0x2c0
[10171.646867] ? __build_skb+0x2b/0xf0
[10171.652549] __netif_receive_skb+0x18/0x60
[10171.658770] ? __netif_receive_skb+0x18/0x60
[10171.665132] netif_receive_skb_internal+0x45/0xe0
[10171.671957] napi_gro_receive+0xc5/0xf0
[10171.677880] e1000_receive_skb+0x86/0xe0 [e1000e]
[10171.684682] e1000_clean_rx_irq+0x1fe/0x3e0 [e1000e]
[10171.691757] e1000e_poll+0x7e/0x2e0 [e1000e]
[10171.698095] net_rx_action+0x140/0x3a0
[10171.703871] __do_softirq+0xe4/0x2d4
[10171.709447] irq_exit+0xc5/0xd0
[10171.714536] do_IRQ+0x86/0xe0
[10171.719457] common_interrupt+0x8c/0x8c
[10171.725266] </IRQ>
[10171.729323] RIP: 0010:cpuidle_enter_state+0xa7/0x2f0
[10171.736294] RSP: 0018:ffffbc22c0cbbe68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda
[10171.745966] RAX: ffff92533fca2840 RBX: 0000093ff4e19b87 RCX: 000000000000001f
[10171.755236] RDX: 0000093ff4e19b87 RSI: fffffffa71561a78 RDI: 0000000000000000
[10171.764499] RBP: ffffbc22c0cbbea8 R08: 0000000000000002 R09: 0000000000022080
[10171.773763] R10: ffffbc22c0cbbe38 R11: 0000128930a6e3a0 R12: ffff92533fcac230
[10171.783052] R13: 0000000000000001 R14: ffffffff8e373298 R15: 0000000000000000
[10171.792370] ? cpuidle_enter_state+0x97/0x2f0
[10171.798864] cpuidle_enter+0x17/0x20
[10171.804551] call_cpuidle+0x23/0x40
[10171.810156] do_idle+0x18c/0x1f0
[10171.815483] cpu_startup_entry+0x73/0x80
[10171.821546] start_secondary+0x1ab/0x200
[10171.827600] secondary_startup_64+0xa5/0xb0
[10171.833894] Code: 8d 6c 60 01 73 17 48 8b 05 64 d8 14 01 be fd 00 00 00 48 8b 40 30 e8 16 9d ba 00 5d c3 89 fe 48 c7 c7 f0 b7 ea 8d e8 56 68 03 00 <0f> 0b 5d c3 66 90 0f 1f 44 00 00 55 48 89 e5 53 48 83 ec 20 65
[10171.855369] ---[ end trace 1c2b11293ec6a0ba ]---
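
When comparing traces from different machines, the two most useful fields are the `kernel BUG at` source location and the faulting `RIP` symbol. The following is a minimal Python sketch for pulling those out of a saved log. The function names are hypothetical, and stripping the build prefix assumes the `/build/<dir>/<tree>/...` path layout seen in the trace above.

```python
import re

# Matches e.g. "kernel BUG at /build/linux-5mCauq/linux-4.15.0/net/ipv4/ip_output.c:636!"
BUG_RE = re.compile(r"kernel BUG at (?P<path>\S+):(?P<line>\d+)!")
# Matches e.g. "RIP: 0010:ip_do_fragment+0x482/0x820"
RIP_RE = re.compile(r"RIP: (?:[0-9a-f]+:)?(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def summarize_oops(log_text):
    """Return the BUG source location and the first RIP symbol found, if any."""
    bug = BUG_RE.search(log_text)
    rip = RIP_RE.search(log_text)
    source = line = None
    if bug:
        # Assumption: path looks like /build/<dir>/<tree>/<path inside the tree>.
        source = bug.group("path").split("/", 4)[-1]
        line = int(bug.group("line"))
    return {"source": source, "line": line, "rip": rip.group("symbol") if rip else None}


def same_bug_site(log_a, log_b):
    """True when both logs hit a BUG at the same in-tree file and line."""
    a, b = summarize_oops(log_a), summarize_oops(log_b)
    return a["source"] is not None and (a["source"], a["line"]) == (b["source"], b["line"])


sample = (
    "[10170.296117] kernel BUG at /build/linux-5mCauq/linux-4.15.0/net/ipv4/ip_output.c:636!\n"
    "[10170.486704] RIP: 0010:ip_do_fragment+0x482/0x820\n"
)
print(summarize_oops(sample))
# {'source': 'net/ipv4/ip_output.c', 'line': 636, 'rip': 'ip_do_fragment'}
```

Comparing only the in-tree path and line number lets logs from differently named build directories be matched against each other.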

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1843152

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
tags: added: bionic
Andrey--k (andrey--k) wrote :

I have the same error (kernel panic message
[ 365.536168] kernel BUG at /build/linux-hwe-L8V6Q3/linux-hwe-4.15.0/net/ipv4/ip_output.c:636! )

How can I provide the `apport-collect` output?

tags: added: apport-collected xenial

AlsaDevices:
 total 0
 crw-rw---- 1 root audio 116, 1 Sep 8 11:07 seq
 crw-rw---- 1 root audio 116, 33 Sep 8 11:07 timer
AplayDevices: Error: [Errno 2] No such file or directory
ApportVersion: 2.20.1-0ubuntu2.19
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
AudioDevicesInUse: Error: command ['fuser', '-v', '/dev/snd/seq', '/dev/snd/timer'] failed with exit code 1:
DistroRelease: Ubuntu 16.04
HibernationDevice: RESUME=UUID=f0a1910d-2e27-4679-bf5c-e3b30f2fe8a1
InstallationDate: Installed on 2017-06-06 (823 days ago)
InstallationMedia: Ubuntu-Server 16.04.2 LTS "Xenial Xerus" - Release amd64 (20170215.8)
IwConfig: Error: [Errno 2] No such file or directory
MachineType: HP ProLiant DL380 G5
NonfreeKernelModules: zfs zunicode zavl icp zcommon znvpair
Package: linux (not installed)
PciMultimedia:

ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcFB: 0 radeondrmfb
ProcKernelCmdLine: BOOT_IMAGE=/vmlinuz-4.15.0-58-generic root=UUID=9bcdbf2d-a0e7-4d04-af89-8c452d46607a ro apparmor=0 security=
ProcVersionSignature: Ubuntu 4.15.0-58.64~16.04.1-generic 4.15.18
RelatedPackageVersions:
 linux-restricted-modules-4.15.0-58-generic N/A
 linux-backports-modules-4.15.0-58-generic N/A
 linux-firmware 1.157.21
RfKill: Error: [Errno 2] No such file or directory
Tags: xenial xenial
Uname: Linux 4.15.0-58-generic x86_64
UnreportableReason: The report belongs to a package that is not installed.
UpgradeStatus: No upgrade log present (probably fresh install)
UserGroups:

_MarkForUpload: False
dmi.bios.date: 05/02/2011
dmi.bios.vendor: HP
dmi.bios.version: P56
dmi.chassis.type: 23
dmi.chassis.vendor: HP
dmi.modalias: dmi:bvnHP:bvrP56:bd05/02/2011:svnHP:pnProLiantDL380G5:pvr:cvnHP:ct23:cvr:
dmi.product.family: ProLiant
dmi.product.name: ProLiant DL380 G5
dmi.sys.vendor: HP

apport information

Andrey--k (andrey--k) wrote :

kernel log

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Ewen McNeill (ewen) wrote :

These symptoms sound very much like https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1842447 (I found the bug I'm commenting on while searching for additional links about the issue in 1842447). There's a -62 kernel in proposed updates which hopefully contains the fix for this bug. See https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1842447 for hints on how to install the proposed update kernel.

So far the trigger sounds like NAT + the 4.15.0-60 kernel + enough uptime that the relevant uninitialised variable no longer holds the clean value it had at boot.
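
As a rough way to check a machine against that suspected trigger, here is a Python sketch that tests the two conditions that can be observed directly: the kernel release and whether NAT modules are loaded. Everything here is an assumption for illustration. The function names are hypothetical, the module set is taken from the "Modules linked in" list in the original report, and a loaded module does not prove that NAT rules are in use. The uptime condition is not checked.

```python
# NAT-related modules that appear in the "Modules linked in" list of this report.
NAT_MODULES = {"iptable_nat", "nf_nat", "nf_nat_ipv4", "ipt_MASQUERADE"}


def loaded_modules(proc_modules_text):
    """Return module names from /proc/modules-style text (name is the first field)."""
    return {line.split()[0] for line in proc_modules_text.splitlines() if line.strip()}


def matches_suspected_trigger(release, proc_modules_text):
    """True when the release is 4.15.0-60 and at least one NAT module is loaded."""
    on_suspect_kernel = release.startswith("4.15.0-60-")
    nat_loaded = bool(NAT_MODULES & loaded_modules(proc_modules_text))
    return on_suspect_kernel and nat_loaded


# Illustrative input only; the sizes and addresses below are made up.
sample_modules = (
    "nf_nat 32768 2 nf_nat_ipv4,xt_nat - Live 0x0000000000000000\n"
    "e1000e 249856 0 - Live 0x0000000000000000\n"
)
print(matches_suspected_trigger("4.15.0-60-generic", sample_modules))  # True
```

On a real system the two arguments would come from `uname -r` and the contents of `/proc/modules`.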

I think this is the fix in -62:

https://kernel.ubuntu.com/git/ubuntu/ubuntu-bionic.git/commit/?h=master-next&id=b502cfeffec81be8564189e5498fd3f252b27900

Ewen
