memory is leaked when tasks are moved to net_prio

Bug #1886859 reported by Thadeu Lima de Souza Cascardo on 2020-07-08
This bug affects 2 people
Affects:
linux (Ubuntu): importance Undecided, unassigned
linux (Ubuntu Bionic): importance Medium, assigned to Thadeu Lima de Souza Cascardo
linux (Ubuntu Focal): importance Medium, assigned to Thadeu Lima de Souza Cascardo

Bug Description

[Impact]
In some container scenarios, there will be a memory leak, leading to OOM.

[Test case]
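Run the following from a directory where the net_prio and unified cgroup hierarchies are mounted as net_prio/ and unified/: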

while true ; do mkdir net_prio/a unified/a ; bash -c 'echo $$ > unified/a/cgroup.procs ; echo $$ > net_prio/a/tasks ; ping -c 1 localhost > /dev/null' ; rmdir net_prio/a unified/a ; done

Alternatively, run the attached program cgroup_leak.c, which is faster. Without the fix, memory use grows steadily; with the fix applied, there should be no sustained growth.
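
One way to observe the leak while the test case runs is to sample /proc/meminfo from another terminal. The following is a minimal sketch and not part of the original report: the helper names meminfo_kb and watch_meminfo are hypothetical, and using MemAvailable as the indicator is an assumption (another field may reflect the leak more directly).

```shell
# Print the value in kB of one field from a meminfo-style file.
# Usage: meminfo_kb FIELD [FILE]   (FILE defaults to /proc/meminfo)
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' "${2:-/proc/meminfo}"
}

# Print "sample_number value_kB" COUNT times, INTERVAL seconds apart.
# Usage: watch_meminfo FIELD COUNT INTERVAL [FILE]
watch_meminfo() {
    i=1
    while [ "$i" -le "$2" ]; do
        printf '%s %s\n' "$i" "$(meminfo_kb "$1" "$4")"
        i=$((i + 1))
        # Sleep only between samples, not after the last one.
        if [ "$i" -le "$2" ]; then
            sleep "$3"
        fi
    done
}
```

For example, `watch_meminfo MemAvailable 30 10` prints 30 samples ten seconds apart. On an affected kernel the value would be expected to keep falling while the reproducer runs, and to level off on a fixed kernel.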

[Potential regression]
This patch has caused breakage with BPF cgroup in the past: attaching a process to a netprio cgroup could race with cgroup BPF being disabled. Similar breakage could happen again.

---------------------------------

When net_prio is used without setting ifpriomap and BPF cgroup is used, memory may be leaked. This was fixed by upstream commit 090e28b229af92dc5b40786ca673999d59e73056, but it had to be reverted to fix LP #1886668.

When a real fix for this cgroup BPF crash lands, this patch should be reinstated.
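
To check whether a given net_prio cgroup matches the "ifpriomap not set" condition described above, its net_prio.ifpriomap file can be inspected. This is a rough sketch, assuming the file lists one "interface priority" pair per line with 0 as the default priority; the function name ifpriomap_unset is hypothetical.

```shell
# Succeed (exit status 0) if every interface listed in a
# net_prio.ifpriomap-style file still has priority 0,
# i.e. no priority map has been set for that cgroup.
# Usage: ifpriomap_unset FILE
ifpriomap_unset() {
    awk 'NF >= 2 && $2 != 0 { nonzero = 1 } END { exit nonzero + 0 }' "$1"
}
```

With the layout from the test case, `ifpriomap_unset net_prio/a/net_prio.ifpriomap && echo "ifpriomap not set"` would report whether cgroup "a" is in the affected configuration.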

Cascardo.


Changed in linux (Ubuntu Bionic):
status: New → Confirmed
importance: Undecided → Medium
Changed in linux (Ubuntu):
status: New → Invalid
Changed in linux (Ubuntu Focal):
status: New → Confirmed
assignee: nobody → Thadeu Lima de Souza Cascardo (cascardo)
Changed in linux (Ubuntu Bionic):
assignee: nobody → Thadeu Lima de Souza Cascardo (cascardo)
Changed in linux (Ubuntu Focal):
importance: Undecided → Medium
description: updated
Changed in linux (Ubuntu Bionic):
status: Confirmed → In Progress
Changed in linux (Ubuntu Focal):
status: Confirmed → In Progress
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Focal):
status: In Progress → Fix Committed

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic

Running the attached program on 4.15.0-128-generic, memory use builds up. This does not happen with 4.15.0-129-generic. Marking as verification-done-bionic.

tags: added: verification-done-bionic
removed: verification-needed-bionic

Same result on focal when running the attached test program: the memory leak is observed on 5.4.0-58 and is no longer seen on 5.4.0-59.

tags: added: verification-done-focal
removed: verification-needed-focal
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.4.0-59.65

---------------
linux (5.4.0-59.65) focal; urgency=medium

  * focal/linux: 5.4.0-59.65 -proposed tracker (LP: #1907604)

  * focal: selftests/bpf build broken: test_map_init.skel.h: No such file or
    directory (LP: #1906866)
    - SAUCE: Revert selftests/ "bpf: Zero-fill re-used per-cpu map element"

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * memory is leaked when tasks are moved to net_prio (LP: #1886859)
    - netprio_cgroup: Fix unlimited memory leak of v2 cgroups

  * Focal update: v5.4.78 upstream stable release (LP: #1905618)
    - drm/i915/gem: Flush coherency domains on first set-domain-ioctl
    - time: Prevent undefined behaviour in timespec64_to_ns()
    - nbd: don't update block size after device is started
    - KVM: arm64: Force PTE mapping on fault resulting in a device mapping
    - PCI: qcom: Make sure PCIe is reset before init for rev 2.1.0
    - usb: dwc3: gadget: Continue to process pending requests
    - usb: dwc3: gadget: Reclaim extra TRBs after request completion
    - btrfs: tracepoints: output proper root owner for trace_find_free_extent()
    - btrfs: sysfs: init devices outside of the chunk_mutex
    - btrfs: reschedule when cloning lots of extents
    - ASoC: Intel: kbl_rt5663_max98927: Fix kabylake_ssp_fixup function
    - genirq: Let GENERIC_IRQ_IPI select IRQ_DOMAIN_HIERARCHY
    - hv_balloon: disable warning when floor reached
    - net: xfrm: fix a race condition during allocing spi
    - ASoC: codecs: wcd9335: Set digital gain range correctly
    - xfs: set xefi_discard when creating a deferred agfl free log intent item
    - netfilter: use actual socket sk rather than skb sk when routing harder
    - netfilter: nf_tables: missing validation from the abort path
    - netfilter: ipset: Update byte and packet counters regardless of whether they
      match
    - powerpc/eeh_cache: Fix a possible debugfs deadlock
    - perf trace: Fix segfault when trying to trace events by cgroup
    - perf tools: Add missing swap for ino_generation
    - ALSA: hda: prevent undefined shift in snd_hdac_ext_bus_get_link()
    - iommu/vt-d: Fix a bug for PDP check in prq_event_thread
    - afs: Fix warning due to unadvanced marshalling pointer
    - can: rx-offload: don't call kfree_skb() from IRQ context
    - can: dev: can_get_echo_skb(): prevent call to kfree_skb() in hard IRQ
      context
    - can: dev: __can_get_echo_skb(): fix real payload length return value for RTR
      frames
    - can: can_create_echo_skb(): fix echo skb generation: always use skb_clone()
    - can: j1939: swap addr and pgn in the send example
    - can: j1939: j1939_sk_bind(): return failure if netdev is down
    - can: ti_hecc: ti_hecc_probe(): add missed clk_disable_unprepare() in error
      path
    - can: xilinx_can: handle failure cases of pm_runtime_get_sync
    - can: peak_usb: add range checking in decode operations
    - can: peak_usb: peak_usb_get_ts_time(): fix timestamp wrapping
    - can: peak_canfd: pucan_handle_can_rx(): fix echo management when loopback is
      on
    - can: flexcan: remove FLEXCAN_QUIRK_DISABLE_MECR quirk for LS1021A
    - c...

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.15.0-129.132

---------------
linux (4.15.0-129.132) bionic; urgency=medium

  * bionic/linux: 4.15.0-129.132 -proposed tracker (LP: #1907635)

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * Ubuntu 18.04- call trace in kernel buffer when unloading ib_ipoib module
    (LP: #1904848)
    - SAUCE: net/mlx5e: IPoIB, initialize update_stat_work for ipoib devices

  * memory is leaked when tasks are moved to net_prio (LP: #1886859)
    - netprio_cgroup: Fix unlimited memory leak of v2 cgroups

  * s390: dbginfo.sh triggers kernel panic, reading from
    /sys/kernel/mm/page_idle/bitmap (LP: #1904884)
    - mm/page_idle.c: skip offline pages

  * Bionic update: upstream stable patchset 2020-11-23 (LP: #1905333)
    - drm/i915: Break up error capture compression loops with cond_resched()
    - tipc: fix use-after-free in tipc_bcast_get_mode
    - gianfar: Replace skb_realloc_headroom with skb_cow_head for PTP
    - gianfar: Account for Tx PTP timestamp in the skb headroom
    - net: usb: qmi_wwan: add Telit LE910Cx 0x1230 composition
    - sctp: Fix COMM_LOST/CANT_STR_ASSOC err reporting on big-endian platforms
    - sfp: Fix error handing in sfp_probe()
    - Blktrace: bail out early if block debugfs is not configured
    - i40e: Fix of memory leak and integer truncation in i40e_virtchnl.c
    - Fonts: Replace discarded const qualifier
    - ALSA: usb-audio: Add implicit feedback quirk for Qu-16
    - lib/crc32test: remove extra local_irq_disable/enable
    - kthread_worker: prevent queuing delayed work from timer_fn when it is being
      canceled
    - mm: always have io_remap_pfn_range() set pgprot_decrypted()
    - gfs2: Wake up when sd_glock_disposal becomes zero
    - ftrace: Fix recursion check for NMI test
    - ftrace: Handle tracing when switching between context
    - tracing: Fix out of bounds write in get_trace_buf
    - futex: Handle transient "ownerless" rtmutex state correctly
    - ARM: dts: sun4i-a10: fix cpu_alert temperature
    - x86/kexec: Use up-to-dated screen_info copy to fill boot params
    - of: Fix reserved-memory overlap detection
    - blk-cgroup: Fix memleak on error path
    - blk-cgroup: Pre-allocate tree node on blkg_conf_prep
    - scsi: core: Don't start concurrent async scan on same host
    - vsock: use ns_capable_noaudit() on socket create
    - drm/vc4: drv: Add error handding for bind
    - ACPI: NFIT: Fix comparison to '-ENXIO'
    - vt: Disable KD_FONT_OP_COPY
    - fork: fix copy_process(CLONE_PARENT) race with the exiting ->real_parent
    - serial: 8250_mtk: Fix uart_get_baud_rate warning
    - serial: txx9: add missing platform_driver_unregister() on error in
      serial_txx9_init
    - USB: serial: cyberjack: fix write-URB completion race
    - USB: serial: option: add Quectel EC200T module support
    - USB: serial: option: add LE910Cx compositions 0x1203, 0x1230, 0x1231
    - USB: serial: option: add Telit FN980 composition 0x1055
    - USB: Add NO_LPM quirk for Kingston flash drive
    - usb: mtu3: fix panic in mtu3_gadget_stop()
    - ARC: stack unwinding: avoid indefinite looping
    - Revert "ARC: entry: fix potential EFA c...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released