Fix fragmentation support for TC connection tracking

Bug #1940872 reported by Bodong Wang
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux-bluefield (Ubuntu)
Undecided
Unassigned
Focal
Medium
Bodong Wang

Bug Description

* Explain the bug(s)
When using OVS with tc to offload connection tracking flows, sending udp/icmp fragmented traffic will cause call trace with NULL dereference.

[ 7229.433005] Modules linked in: act_tunnel_key act_csum act_pedit xt_nat netconsole rpcsec_gss_krb5 act_ct nf_flow_table xt_conntrack xt_MASQUERADE nf_conntrack_netlink xt_addrtype iptable_filter iptable_nat bpfilter br_netfilter bridge overlay sbsa_gwdt xfrm_user xfrm_algo target_core_mod ipmi_devintf ipmi_msghandler mst_pciconf(OE) 8021q garp stp mrp llc act_skbedit act_mirred ib_ipoib(OE) geneve ip6_udp_tunnel udp_tunnel nfnetlink_cttimeout nfnetlink act_gact cls_flower sch_ingress openvswitch nsh nf_conncount nf_nat ib_umad(OE) binfmt_misc dm_multipath mlx5_ib(OE) uio_pdrv_genirq uio mlxbf_pmc mlxbf_pka mlx_trio bluefield_edac mlx_bootctl(OE) sch_fq_codel rdma_ucm(OE) ib_uverbs(OE) rdma_cm(OE) iw_cm(OE) ib_cm(OE) ib_core(OE) ip_tables ipv6 crc_ccitt btrfs zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor xor_neon raid6_pq raid1 raid0 mlx5_core(OE) crct10dif_ce mlxfw(OE) psample mlxdevm(OE) auxiliary(OE) mlx_compat(OE) i2c_mlxbf(OE)
[ 7229.433074] gpio_mlxbf2(OE) mlxbf_gige(OE) aes_neon_bs aes_neon_blk [last unloaded: mst_pci]
[ 7229.433083] CPU: 4 PID: 1602 Comm: handler6 Tainted: G OE 5.4.0-1017-bluefield #20-Ubuntu
[ 7229.433085] Hardware name: https://www.mellanox.com BlueField SoC/BlueField SoC, BIOS BlueField:3.7.1-7-g9964f06 Aug 5 2021
[ 7229.433087] pstate: 60000005 (nZCv daif -PAN -UAO)
[ 7229.433101] pc : inet_frag_rbtree_purge+0x58/0x88
[ 7229.433103] lr : inet_frag_rbtree_purge+0x6c/0x88
[ 7229.433104] sp : ffff800013273500
[ 7229.433105] x29: ffff800013273500 x28: ffff00037b899e80
[ 7229.433107] x27: 0000000000000018 x26: ffff0003b6da2228
[ 7229.433109] x25: ffff0003b6da2200 x24: ffff80001191e140
[ 7229.433111] x23: ffff80001191e140 x22: ffff00037d6a56a8
[ 7229.433113] x21: 0000000000000000 x20: 0000000000000300
[ 7229.433114] x19: 0000000100000000 x18: 0000000000000000
[ 7229.433116] x17: 0000000000000000 x16: 0000000000000000
[ 7229.433118] x15: 0000000000000000 x14: ffff80000944e960
[ 7229.433119] x13: 0000000000000001 x12: ffff80000944e5e0
[ 7229.433121] x11: 0000000000000008 x10: 0000000000000000
[ 7229.433123] x9 : 0000000000000000 x8 : ffff0003b97ab3c0
[ 7229.433124] x7 : 0000000000000000 x6 : 000000005464ccee
[ 7229.433126] x5 : ffff800010be50a8 x4 : fffffe000dd9d820
[ 7229.433127] x3 : 0000000080200005 x2 : fffffe000dd9d820
[ 7229.433129] x1 : 0000000000000000 x0 : 0000000000000000
[ 7229.433131] Call trace:
[ 7229.433134] inet_frag_rbtree_purge+0x58/0x88
[ 7229.433138] ip_frag_queue+0x2d0/0x610
[ 7229.433139] ip_defrag+0xd0/0x170
[ 7229.433156] ovs_ct_execute+0x3f8/0x720 [openvswitch]
[ 7229.433160] Unable to handle kernel paging request at virtual address 00000001000000d0
[ 7229.433166] do_execute_actions+0x7b4/0xa80 [openvswitch]
[ 7229.433167] Mem abort info:
[ 7229.433172] ovs_execute_actions+0x74/0x188 [openvswitch]
[ 7229.433173] ESR = 0x96000004
[ 7229.433178] ovs_packet_cmd_execute+0x228/0x2a8 [openvswitch]
[ 7229.433180] EC = 0x25: DABT (current EL), IL = 32 bits
[ 7229.433183] genl_family_rcv_msg+0x1a4/0x3d8
[ 7229.433184] SET = 0, FnV = 0
[ 7229.433186] genl_rcv_msg+0x64/0xd8

 * brief explanation of fixes
The series contains 7 patches from upstream which fix act_ct handling of fragmented Packets.

* How to test
Create OVS bridge with 2 representors (uplink and BlueField representor for example).
Enable HW offload and configure connection tracking OpenFlow rules.
Send udp/icmp traffic from the VF with packet size larger then MTU.
Without the commits, call trace will appear in dmesg.

* What it could break.
Bug fix, doesn't break other functionality

CVE References

Stefan Bader (smb)
Changed in linux-bluefield (Ubuntu Focal):
assignee: nobody → Bodong Wang (bodong-wang)
importance: Undecided → Medium
status: New → In Progress
Changed in linux-bluefield (Ubuntu):
status: New → Invalid
Changed in linux-bluefield (Ubuntu Focal):
status: In Progress → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the linux-bluefield/5.4.0-1019.22 kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (95.1 KiB)

This bug was fixed in the package linux-bluefield - 5.4.0-1019.22

---------------
linux-bluefield (5.4.0-1019.22) focal; urgency=medium

  * focal/linux-bluefield: 5.4.0-1019.22 -proposed tracker (LP: #1942533)

  * Focal update: v5.4.134 upstream stable release (LP: #1939440)
    - [Config] bluefield: CONFIG_BATTERY_RT5033=m

  * Fix fragmentation support for TC connection tracking (LP: #1940872)
    - net/sched: act_ct: fix restore the qdisc_skb_cb after defrag
    - net/sched: act_ct: fix miss set mru for ovs after defrag in act_ct
    - net/sched: fix miss init the mru in qdisc_skb_cb
    - net/sched: act_ct: fix wild memory access when clearing fragments
    - Revert "net/sched: act_ct: Fix skb double-free in tcf_ct_handle_fragments()
      error flow"
    - net/sched: act_mirred: refactor the handle of xmit
    - net/sched: The error lable position is corrected in ct_init_module
    - net/sched: sch_frag: add generic packet fragment support.
    - ipv6: add ipv6_fragment hook in ipv6_stub

  * Add the upcoming BlueField-3 device ID (LP: #1941803)
    - net/mlx5: Update the list of the PCI supported devices

  * CT state not reset when packet redirected to different port (LP: #1940448)
    - Revert "UBUNTU: SAUCE: net/sched: act_mirred: Reset ct when reinserting skb
      back into queue"
    - net: sched: act_mirred: Reset ct info when mirror/redirect skb

  * Export xfrm_policy_lookup_bytype function (LP: #1934313)
    - SAUCE: xfrm: IPsec Export xfrm_policy_lookup_bytype function

  [ Ubuntu: 5.4.0-85.95 ]

  * focal/linux: 5.4.0-85.95 -proposed tracker (LP: #1942557)
  * please drop virtualbox-guest-dkms virtualbox-guest-source (LP: #1933248)
    - [Config] Disable virtualbox dkms build
  * Packaging resync (LP: #1786013)
    - debian/dkms-versions -- update from kernel-versions (main/2021.09.06)
  * LRMv5: switch primary version handling to kernel-versions data set
    (LP: #1928921)
    - [Packaging] switch to kernel-versions
  * disable “CONFIG_HISI_DMA” config for ubuntu version (LP: #1936771)
    - Disable CONFIG_HISI_DMA
    - [Config] Record hisi_dma no longer built for arm64
  * memory leaking when removing a profile (LP: #1939915)
    - apparmor: Fix memory leak of profile proxy
  * CryptoExpress EP11 cards are going offline (LP: #1939618)
    - s390/zcrypt: Support for CCA protected key block version 2
    - s390: Replace zero-length array with flexible-array member
    - s390/zcrypt: Use scnprintf() for avoiding potential buffer overflow
    - s390/zcrypt: replace snprintf/sprintf with scnprintf
    - s390/ap: Remove ap device suspend and resume callbacks
    - s390/zcrypt: use fallthrough;
    - s390/zcrypt: use kvmalloc instead of kmalloc for 256k alloc
    - s390/ap: remove power management code from ap bus and drivers
    - s390/ap: introduce new ap function ap_get_qdev()
    - s390/zcrypt: use kzalloc
    - s390/zcrypt: fix smatch warnings
    - s390/zcrypt: code beautification and struct field renames
    - s390/zcrypt: split ioctl function into smaller code units
    - s390/ap: rename and clarify ap state machine related stuff
    - s390/zcrypt: provide cex4 cca sysfs attributes for cex3
    - s390/ap: rework cry...

Changed in linux-bluefield (Ubuntu Focal):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers