Reference counter issue in 4.15 (nf_xfrm_me_harder / dst_release)

Bug #1786752 reported by Juergen Kendzorra
This bug affects 3 people
Affects: linux (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned

Bug Description

Since upgrading from 14.04 to 18.04, we see very frequent warnings about negative refcnts in dst_release:

[ 3117.882227] WARNING: CPU: 6 PID: 0 at /build/linux-I4R9hO/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[ 3117.882229] Modules linked in: xt_policy cls_u32 sch_sfq ip_vti ip_tunnel authenc echainiv xfrm6_mode_tunnel xfrm4_mode_tunnel xfrm4_tunnel tunnel4 ipcomp xfrm_ipcomp esp4 ah4 af_key nfnetlink_queue nfnetlink_log sch_htb xt_TPROXY xt_multiport veth nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo br_netfilter bridge overlay macvlan 8021q garp mrp stp llc bonding algif_skcipher af_alg xt_hl ip6t_rt nf_conntrack_ipv6 nf_defrag_ipv6 ip6t_REJECT nf_reject_ipv6 xt_nat ipt_MASQUERADE nf_nat_masquerade_ipv4 dm_crypt iptable_nat nf_nat_ipv4 xt_DSCP xt_dscp xt_mark iptable_mangle xt_limit xt_tcpudp xt_addrtype intel_rapl sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp nf_conntrack_ipv4 nf_defrag_ipv4 kvm_intel xt_conntrack kvm joydev input_leds ipt_REJECT nf_reject_ipv4 ftdi_sio usbserial ipmi_si
[ 3117.882265] irqbypass intel_cstate mei_me mei ioatdma acpi_pad intel_rapl_perf shpchp ipmi_devintf ipmi_msghandler acpi_power_meter lpc_ich mac_hid ip6table_filter ip6_tables nf_conntrack_netbios_ns nf_conntrack_broadcast nf_nat_ftp nf_nat nf_conntrack_ftp sch_fq_codel nf_conntrack iptable_filter ib_iser rdma_cm iw_cm ib_cm ib_core iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi ip_tables x_tables autofs4 btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear mgag200 ttm crct10dif_pclmul igb ixgbe crc32_pclmul drm_kms_helper hid_generic ghash_clmulni_intel syscopyarea i2c_algo_bit dca sysfillrect usbhid pcbc sysimgblt fb_sys_fops aesni_intel aes_x86_64 crypto_simd ahci glue_helper ptp hid mxm_wmi cryptd
[ 3117.882309] drm libahci megaraid_sas pps_core mdio wmi
[ 3117.882315] CPU: 6 PID: 0 Comm: swapper/6 Tainted: G W 4.15.0-30-generic #32-Ubuntu
[ 3117.882316] Hardware name: Intel Corporation S2600WT2R/S2600WT2R, BIOS SE5C610.86B.01.01.0022.062820171903 06/28/2017
[ 3117.882319] RIP: 0010:nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[ 3117.882320] RSP: 0018:ffff88d77f3839f0 EFLAGS: 00010246
[ 3117.882322] RAX: 0000000000000000 RBX: ffffffff90de4000 RCX: 0000000000001924
[ 3117.882323] RDX: 0000000000000000 RSI: ffff88d749ae6400 RDI: ffff88d655e96600
[ 3117.882324] RBP: ffff88d77f383a68 R08: ffff88d7493dc000 R09: 0000000000000018
[ 3117.882324] R10: 0000000000000001 R11: ffff88e770dedc00 R12: ffff88d655e96600
[ 3117.882325] R13: ffff88d77f383ae8 R14: ffff88d7799ff200 R15: ffff88d7493dc000
[ 3117.882327] FS: 0000000000000000(0000) GS:ffff88d77f380000(0000) knlGS:0000000000000000
[ 3117.882327] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3117.882328] CR2: ffffffffff600400 CR3: 000000193de0a002 CR4: 00000000003606e0
[ 3117.882329] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3117.882330] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 3117.882331] Call Trace:
[ 3117.882332] <IRQ>
[ 3117.882337] ? nf_nat_ipv4_fn+0x15b/0x200 [nf_nat_ipv4]
[ 3117.882339] nf_nat_ipv4_out+0xc5/0xe0 [nf_nat_ipv4]
[ 3117.882342] iptable_nat_ipv4_out+0x15/0x20 [iptable_nat]
[ 3117.882347] nf_hook_slow+0x48/0xc0
[ 3117.882353] ip_output+0xd2/0xe0
[ 3117.882355] ? ip_fragment.constprop.44+0x80/0x80
[ 3117.882357] ip_forward_finish+0x49/0x70
[ 3117.882359] ip_forward+0x366/0x440
[ 3117.882361] ? ip_frag_mem+0x20/0x20
[ 3117.882362] ip_rcv_finish+0x129/0x430
[ 3117.882364] ip_rcv+0x28f/0x3a0
[ 3117.882366] ? inet_del_offload+0x40/0x40
[ 3117.882372] __netif_receive_skb_core+0x432/0xb40
[ 3117.882379] ? handle_edge_irq+0x7c/0x190
[ 3117.882384] ? irq_exit+0x67/0xc0
[ 3117.882391] ? do_IRQ+0x82/0xd0
[ 3117.882393] __netif_receive_skb+0x18/0x60
[ 3117.882395] ? __netif_receive_skb+0x18/0x60
[ 3117.882397] netif_receive_skb_internal+0x37/0xd0
[ 3117.882398] napi_gro_receive+0xc5/0xf0
[ 3117.882407] ixgbe_clean_rx_irq+0x446/0xe30 [ixgbe]
[ 3117.882411] ixgbe_poll+0x256/0x710 [ixgbe]
[ 3117.882413] ? do_IRQ+0x82/0xd0
[ 3117.882415] net_rx_action+0x140/0x3a0
[ 3117.882418] __do_softirq+0xdf/0x2b2
[ 3117.882419] irq_exit+0xb6/0xc0
[ 3117.882421] do_IRQ+0x82/0xd0
[ 3117.882423] common_interrupt+0x84/0x84
[ 3117.882424] </IRQ>
[ 3117.882427] RIP: 0010:cpuidle_enter_state+0xa7/0x2f0
[ 3117.882428] RSP: 0018:ffff9e1a46413e68 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffda
[ 3117.882430] RAX: ffff88d77f3a2880 RBX: 000002d5f0422d24 RCX: 000000000000001f
[ 3117.882430] RDX: 000002d5f0422d24 RSI: fffa51c90e2bb456 RDI: 0000000000000000
[ 3117.882431] RBP: ffff9e1a46413ea8 R08: 000000000000003c R09: 0000000000000007
[ 3117.882432] R10: ffff9e1a46413e38 R11: 0000000000000036 R12: ffffbe0a3fb82240
[ 3117.882432] R13: 0000000000000001 R14: ffffffff90d71c98 R15: 0000000000000000
[ 3117.882435] ? cpuidle_enter_state+0x97/0x2f0
[ 3117.882436] cpuidle_enter+0x17/0x20
[ 3117.882439] call_cpuidle+0x23/0x40
[ 3117.882441] do_idle+0x18c/0x1f0
[ 3117.882443] cpu_startup_entry+0x73/0x80
[ 3117.882446] start_secondary+0x1ab/0x200
[ 3117.882449] secondary_startup_64+0xa5/0xb0
[ 3117.882450] Code: ff ff ff eb cc 48 83 e7 fe 48 89 45 88 e8 b2 bb c7 cf 48 8b 45 88 eb 90 85 c0 74 0f 8d 50 01 f0 0f b1 11 0f 84 53 ff ff ff eb ed <0f> 0b e9 4a ff ff ff e8 7d 1d 4b cf 0f 1f 00 66 2e 0f 1f 84 00
[ 3117.882475] ---[ end trace 034946ae5013518a ]---
[ 3117.882481] dst_release: dst:00000000444c06c4 refcnt:-1

# uname -a
Linux server 4.15.0-30-generic #32-Ubuntu SMP Thu Jul 26 17:42:43 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

Juergen Kendzorra (juergen-kendzorra) wrote :

Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Status changed to Confirmed

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: bionic

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.18 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.18

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete

Juergen Kendzorra (juergen-kendzorra) wrote :

The issue is still visible with the latest mainline kernel; see attached.

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
tags: added: kernel-bug-exists-upstream

Martin Zaharinov (micron10) wrote :

Hi,
Have you found a fix for this bug?
I use the latest kernel, 4.19.8, and have the same problem; see below.
I run PPPoE with 1k+ users. If I activate the shaper with hfsc and imq, the machine crashes and reboots.
After stopping the shaper for a test, the machine only logs the warning below in dmesg.

This is a kernel bug and needs to be fixed.
If anyone needs more info or debugging output, I will send it.

[93408.910441] WARNING: CPU: 1 PID: 0 at include/net/dst.h:239 nf_xfrm_me_harder+0xe7/0x130 [nf_nat]
[93408.945500] Modules linked in: udp_diag unix_diag af_packet_diag sch_hfsc iptable_filter xt_IMQ iptable_mangle xt_addrtype xt_nat ipt_MASQUERADE iptable_nat nf_nat_ipv4 ip_tables bpfilter sch_fq_codel netconsole r8169 tg3 libphy igb i2c_algo_bit ixgb nf_nat_pptp nf_nat_proto_gre nf_conntrack_pptp nf_conntrack_proto_gre nf_nat_tftp nf_conntrack_tftp nf_nat_ftp nf_conntrack_ftp nf_nat nf_conntrack nf_defrag_ipv4 pppoe pptp gre pppox sha1_mb mcryptd sha1_ssse3 sha1_generic arc4 ppp_mppe ppp_generic slhc megaraid_sas [last unloaded: imq]
[93409.103093] CPU: 1 PID: 0 Comm: PDS/1 Tainted: G O 4.19.6 #1
[93409.124009] Hardware name: Supermicro Super Server/X11SSZ-F, BIOS 2.2b 02/12/2018
[93409.166624] RIP: 0010:nf_xfrm_me_harder+0xe7/0x130 [nf_nat]
[93409.188447] Code: 48 8b 5c 24 60 65 48 33 1c 25 28 00 00 00 75 53 48 83 c4 68 5b 5d 41 5c c3 85 c0 74 0d 8d 48 01 f0 0f b1 0a 74 86 85 c0 75 f3 <0f> 0b e9 7b ff ff ff 29 c6 31 d2 b9 20 00 48 00 4c 89 e7 e8 31 27
[93409.254096] RSP: 0018:ffff888454c83ce0 EFLAGS: 00010246
[93409.275791] RAX: 0000000000000000 RBX: ffffffff81e83f40 RCX: 00000000000098a2
[93409.297628] RDX: 0000000000000000 RSI: ffff88843c229480 RDI: 0000000000000038
[93409.319099] RBP: 0000000000000000 R08: 0000000000000038 R09: ffff8883e2811000
[93409.340547] R10: 0000000000000009 R11: 0000000000000000 R12: ffff8883e68d7a00
[93409.361901] R13: ffff888454c83db8 R14: ffff888454c9f64c R15: ffff888454c9f650
[93409.383419] FS: 0000000000000000(0000) GS:ffff888454c80000(0000) knlGS:0000000000000000
[93409.427294] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[93409.449919] CR2: 00007f864054e730 CR3: 0000000001e0a006 CR4: 00000000001606e0
[93409.472746] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[93409.495480] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[93409.518035] Call Trace:
[93409.539897] <IRQ>
[93409.561039] ? nf_nat_ipv4_manip_pkt+0x54/0x100 [nf_nat_ipv4]
[93409.582440] nf_nat_ipv4_out+0x78/0x90 [nf_nat_ipv4]
[93409.603476] nf_hook_slow+0x36/0xd0
[93409.623865] ip_output+0x9f/0xd0
[93409.643603] ? ip_fragment.constprop.5+0x70/0x70
[93409.663095] ip_forward+0x328/0x440
[93409.681987] ? ip_defrag.cold.3+0x22/0x22
[93409.700405] ip_rcv+0x8a/0xb0
[93409.718201] ? ip_rcv_finish_core.isra.0+0x340/0x340
[93409.735914] __netif_receive_skb_one_core+0x4b/0x70
[93409.753308] process_backlog+0x95/0x130
[93409.770199] net_rx_action+0x122/0x2d0
[93409.786744] __do_softirq+0xba/0x206
[93409.802880] irq_exit+0xae/0xf0
[93409.818424] call_function_single_interrupt+0xf/0x20
[93409.834147] </IRQ>
[93409.849337] RIP: 0010:mwait_idle+0x50/0x80
[93409.864491] Code: 0f ba e2 27 72 3a 31 d2 65 48 8b 04 25 40 4c 01 00 48 89 d1 0f 01 c8 48...

roobesh ganapathy mohandass (roobesh) wrote :

Just to follow up on behalf of Juergen: is there any progress on this bug?

roobesh ganapathy mohandass (roobesh) wrote :

We have upgraded to the latest kernel and still occasionally see these call traces, and the kernel is marked as tainted:

# uname -rn
 4.15.0-45-generic
# uname -a
Linux 4.15.0-45-generic #48-Ubuntu SMP Tue Jan 29 16:28:13 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

# dmesg -T | grep "Fri Mar 1.*CPU"
[Fri Mar 1 10:28:42 2019] WARNING: CPU: 33 PID: 0 at /build/linux-uQJ2um/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[Fri Mar 1 10:28:42 2019] CPU: 33 PID: 0 Comm: swapper/33 Tainted: G W OE 4.15.0-45-generic #48-Ubuntu
[Fri Mar 1 10:29:56 2019] WARNING: CPU: 12 PID: 0 at /build/linux-uQJ2um/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[Fri Mar 1 10:29:56 2019] CPU: 12 PID: 0 Comm: swapper/12 Tainted: G W OE 4.15.0-45-generic #48-Ubuntu
[Fri Mar 1 10:53:01 2019] WARNING: CPU: 13 PID: 0 at /build/linux-uQJ2um/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[Fri Mar 1 10:53:01 2019] WARNING: CPU: 1 PID: 18428 at /build/linux-uQJ2um/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[Fri Mar 1 10:53:01 2019] WARNING: CPU: 5 PID: 18424 at /build/linux-uQJ2um/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[Fri Mar 1 10:53:01 2019] CPU: 1 PID: 18428 Comm: haproxy Tainted: G W OE 4.15.0-45-generic #48-Ubuntu
[Fri Mar 1 10:53:01 2019] CPU: 5 PID: 18424 Comm: haproxy Tainted: G W OE 4.15.0-45-generic #48-Ubuntu
[Fri Mar 1 10:53:01 2019] CPU: 13 PID: 0 Comm: swapper/13 Tainted: G W OE 4.15.0-45-generic #48-Ubuntu
[Fri Mar 1 10:53:50 2019] WARNING: CPU: 5 PID: 0 at /build/linux-uQJ2um/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[Fri Mar 1 10:53:50 2019] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W OE 4.15.0-45-generic #48-Ubuntu
[Fri Mar 1 12:29:32 2019] WARNING: CPU: 22 PID: 0 at /build/linux-uQJ2um/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[Fri Mar 1 12:29:32 2019] CPU: 22 PID: 0 Comm: swapper/22 Tainted: G W OE 4.15.0-45-generic #48-Ubuntu
[Fri Mar 1 13:01:20 2019] WARNING: CPU: 31 PID: 20932 at /build/linux-uQJ2um/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[Fri Mar 1 13:01:20 2019] CPU: 31 PID: 20932 Comm: thread.rb:70 Tainted: G W OE 4.15.0-45-generic #48-Ubuntu
[Fri Mar 1 13:01:23 2019] WARNING: CPU: 14 PID: 18429 at /build/linux-uQJ2um/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[Fri Mar 1 13:01:23 2019] CPU: 14 PID: 18429 Comm: haproxy Tainted: G W OE 4.15.0-45-generic #48-Ubuntu
[Fri Mar 1 13:01:23 2019] WARNING: CPU: 31 PID: 7532 at /build/linux-uQJ2um/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[Fri Mar 1 13:01:23 2019] CPU: 31 PID: 7532 Comm: telegraf Tainted: G W OE 4.15.0-45-generic #48-Ubuntu
[Fri Mar 1 13:16:28 2019] WARNING: CPU: 32 PID: 7842 at /build/linux-uQJ2um/linux-4.15.0/include/net/dst.h:256 nf_xfrm_me_harder+0x127/0x140 [nf_nat]
[Fri Mar 1 13:16:28 2019] CPU: 32 PID: 7842 Comm: haproxy Tainted: G W OE 4.15.0-45-generic #48-Ubuntu
[...
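
The grep output above mixes warnings logged while the CPU was idle (swapper/N) with warnings logged while user processes such as haproxy or telegraf were running. A small sketch for tallying the `Comm:` field from such lines, assuming the `CPU: N PID: N Comm: NAME` layout shown above (the helper name `tally_by_comm` is hypothetical, not part of any existing tool):

```python
import re
from collections import Counter

# Matches the per-warning summary line, e.g.
# "CPU: 5 PID: 18424 Comm: haproxy Tainted: G W OE ..."
COMM_RE = re.compile(r"CPU: \d+ PID: \d+ Comm: (\S+)")


def tally_by_comm(lines):
    """Count warnings per task name found in dmesg lines."""
    counts = Counter()
    for line in lines:
        match = COMM_RE.search(line)
        if match:
            # Fold per-CPU idle tasks (swapper/5, swapper/13, ...) into one bucket.
            name = match.group(1).split("/")[0]
            counts[name] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage (assumed workflow): dmesg -T | python3 tally.py
    for name, count in tally_by_comm(sys.stdin).most_common():
        print(f"{count:5d}  {name}")
```

The `WARNING: CPU: ... PID: ... at ...` lines are deliberately not matched, so each warning is counted once via its `Comm:` line.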

roobesh ganapathy mohandass (roobesh) wrote :

# lspci
00:00.0 Host bridge: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D DMI2 (rev 01)
00:01.0 PCI bridge: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 1 (rev 01)
00:02.0 PCI bridge: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 2 (rev 01)
00:02.2 PCI bridge: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 2 (rev 01)
00:03.0 PCI bridge: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D PCI Express Root Port 3 (rev 01)
00:04.0 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D Crystal Beach DMA Channel 0 (rev 01)
00:04.1 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D Crystal Beach DMA Channel 1 (rev 01)
00:04.2 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D Crystal Beach DMA Channel 2 (rev 01)
00:04.3 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D Crystal Beach DMA Channel 3 (rev 01)
00:04.4 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D Crystal Beach DMA Channel 4 (rev 01)
00:04.5 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D Crystal Beach DMA Channel 5 (rev 01)
00:04.6 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D Crystal Beach DMA Channel 6 (rev 01)
00:04.7 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D Crystal Beach DMA Channel 7 (rev 01)
00:05.0 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D Map/VTd_Misc/System Management (rev 01)
00:05.1 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D IIO Hot Plug (rev 01)
00:05.2 System peripheral: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D IIO RAS/Control Status/Global Errors (rev 01)
00:05.4 PIC: Intel Corporation Xeon E7 v4/Xeon E5 v4/Xeon E3 v4/Xeon D I/O APIC (rev 01)
00:11.0 Unassigned class [ff00]: Intel Corporation C610/X99 series chipset SPSR (rev 05)
00:11.1 SMBus: Intel Corporation C610/X99 series chipset MS SMBus 0 (rev 05)
00:11.4 SATA controller: Intel Corporation C610/X99 series chipset sSATA Controller [AHCI mode] (rev 05)
00:14.0 USB controller: Intel Corporation C610/X99 series chipset USB xHCI Host Controller (rev 05)
00:16.0 Communication controller: Intel Corporation C610/X99 series chipset MEI Controller #1 (rev 05)
00:16.1 Communication controller: Intel Corporation C610/X99 series chipset MEI Controller #2 (rev 05)
00:1a.0 USB controller: Intel Corporation C610/X99 series chipset USB Enhanced Host Controller #2 (rev 05)
00:1c.0 PCI bridge: Intel Corporation C610/X99 series chipset PCI Express Root Port #1 (rev d5)
00:1c.3 PCI bridge: Intel Corporation C610/X99 series chipset PCI Express Root Port #4 (rev d5)
00:1d.0 USB controller: Intel Corporation C610/X99 series chipset USB Enhanced Host Controller #1 (rev 05)
00:1f.0 ISA bridge: Intel Corporation C610/X99 series chipset LPC Controller (rev 05)
00:1f.2 SATA controller: Intel Corporation C610/X99 series chipset 6-Port SATA Controller [AHCI mode] (rev 05)
00:1f.3 SMBus: Intel Corporation C610/X99 serie...

Kai-Heng Feng (kaihengfeng) wrote :

Seems like it's fixed by this commit:
commit 542fbda0f08f1cbbc250f9e59f7537649651d0c8
Author: Florian Westphal <email address hidden>
Date: Tue Dec 11 07:45:29 2018 +0100

    netfilter: nat: can't use dst_hold on noref dst

    The dst entry might already have a zero refcount, waiting on rcu list
    to be free'd. Using dst_hold() transitions its reference count to 1, and
    next dst release will try to free it again -- resulting in a double free:

      WARNING: CPU: 1 PID: 0 at include/net/dst.h:239 nf_xfrm_me_harder+0xe7/0x130 [nf_nat]
      RIP: 0010:nf_xfrm_me_harder+0xe7/0x130 [nf_nat]
      Code: 48 8b 5c 24 60 65 48 33 1c 25 28 00 00 00 75 53 48 83 c4 68 5b 5d 41 5c c3 85 c0 74 0d 8d 48 01 f0 0f b1 0a 74 86 85 c0 75 f3 <0f> 0b e9 7b ff ff ff 29 c6 31 d2 b9 20 00 48 00 4c 89 e7 e8 31 27
      Call Trace:
      nf_nat_ipv4_out+0x78/0x90 [nf_nat_ipv4]
      nf_hook_slow+0x36/0xd0
      ip_output+0x9f/0xd0
      ip_forward+0x328/0x440
      ip_rcv+0x8a/0xb0

    Use dst_hold_safe instead and bail out if we cannot take a reference.

    Fixes: a4c2fd7f7891 ("net: remove DST_NOCACHE flag")
    Reported-by: Martin Zaharinov <email address hidden>
    Signed-off-by: Florian Westphal <email address hidden>
    Signed-off-by: Pablo Neira Ayuso <email address hidden>

Try kernel >= v4.20 to verify.
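
The problem described in the commit message can be illustrated with a small single-threaded toy model. This is only a sketch in Python, not kernel code: the class `ToyDst` and the helper functions are hypothetical, and the unconditional increment in `hold()` follows the commit message's description of `dst_hold()` rather than the actual kernel source. `hold_safe()` stands in for `dst_hold_safe()` and refuses to take a reference once the count has already reached zero.

```python
class ToyDst:
    """Toy stand-in for a reference-counted route entry (not kernel code)."""

    def __init__(self):
        self.refcnt = 1       # one owner to start with
        self.times_freed = 0  # how often the entry was handed to "free"

    def release(self):
        # Drop one reference; reaching zero hands the entry to "free".
        self.refcnt -= 1
        if self.refcnt == 0:
            self.times_freed += 1

    def hold(self):
        # Unconditional increment, as the commit message describes it.
        self.refcnt += 1

    def hold_safe(self):
        # Take a reference only if the entry is still alive.
        if self.refcnt == 0:
            return False
        self.refcnt += 1
        return True


def use_with_hold(dst):
    dst.hold()
    dst.release()


def use_with_hold_safe(dst):
    if not dst.hold_safe():
        return False  # bail out: no reference could be taken
    dst.release()
    return True


def demo():
    buggy = ToyDst()
    buggy.release()       # last owner is gone: count 0, freed once
    use_with_hold(buggy)  # 0 -> 1 -> 0: freed a second time

    fixed = ToyDst()
    fixed.release()       # count 0, freed once
    took_ref = use_with_hold_safe(fixed)  # refuses, nothing else happens

    return buggy.times_freed, fixed.times_freed, took_ref
```

In this model `demo()` returns `(2, 1, False)`: the unconditional hold leads to the entry being freed twice, while the safe variant bails out and the entry is freed only once.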

roobesh ganapathy mohandass (roobesh) wrote :

We have upgraded to kernel v4.20.15 and, after more than a week, we no longer see any call traces in the kernel logs. We will keep watching for a few more days and I will report back here.

norman shen (jshen28) wrote :

Hi, sorry for the unrelated question, but could you please explain how you identified the negative refcnt from the dmesg output? Thanks.
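
For reference, the negative count is printed by the kernel itself: the trace in the bug description ends with the line `dst_release: dst:00000000444c06c4 refcnt:-1`. A minimal sketch for pulling such lines out of saved dmesg output, assuming the line format shown in this report (the function name and the regular expression are illustrative and not part of any existing tool):

```python
import re

# Matches e.g. "[ 3117.882481] dst_release: dst:00000000444c06c4 refcnt:-1"
DST_RELEASE_RE = re.compile(
    r"dst_release: dst:(?P<dst>[0-9a-fA-F]+) refcnt:(?P<refcnt>-?\d+)"
)


def negative_refcnts(lines):
    """Return (dst, refcnt) pairs for dst_release lines with a negative count."""
    hits = []
    for line in lines:
        match = DST_RELEASE_RE.search(line)
        if match and int(match.group("refcnt")) < 0:
            hits.append((match.group("dst"), int(match.group("refcnt"))))
    return hits


if __name__ == "__main__":
    import sys

    # Usage (assumed workflow): dmesg | python3 find_negative_refcnt.py
    for dst, refcnt in negative_refcnts(sys.stdin):
        print(f"dst {dst} dropped to refcnt {refcnt}")
```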
