improper memcg accounting causes NULL pointer derefs

Bug #1918668 reported by Thadeu Lima de Souza Cascardo
This bug affects 1 person
Affects          Status         Importance   Assigned to                      Milestone
linux (Ubuntu)   Invalid        Undecided    Unassigned
  Groovy         Fix Released   Critical     Thadeu Lima de Souza Cascardo

Bug Description

[Impact]
Kernel BUGs, panics, and memory corruption, leading to unbootable systems or to systems that hang when doing I/O.

[Test case]
Boot a groovy system, run update-grub, and install a new kernel.

[Fix]
Revert the commit that introduced improper memcg accounting ("mm: memcg/slab: optimize objcg stock draining"), which led to refcounts going past 0.

[Regression potential]
memcg accounting could be wrong, leaving containers either more or less restricted in memory than they are supposed to be.
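
For readers unfamiliar with the failure class named in [Fix], here is a toy Python model of a refcount "going past 0". It is only an illustration of how one unbalanced put can release an object while another user still holds it; the class and variable names are hypothetical, and nothing here reproduces the kernel's memcg or objcg code or the exact path that crashed.

```python
class Counted:
    """Toy reference-counted object, released when the count hits zero."""

    def __init__(self, name):
        self.name = name
        self.refs = 1          # the creator's reference
        self.released = False

    def get(self):
        self.refs += 1

    def put(self):
        self.refs -= 1
        if self.refs == 0:
            self.released = True   # stands in for freeing the object
        elif self.refs < 0:
            raise RuntimeError(f"{self.name}: refcount went past 0")


group = Counted("toy-cgroup")
group.get()                 # a second user takes a reference: refs == 2

group.put()                 # the creator drops its reference: refs == 1
group.put()                 # BUG: an extra, unbalanced put: refs == 0
released_early = group.released   # True, yet the second user never let go

try:
    group.put()             # the second user's legitimate put: refs == -1
    underflow = False
except RuntimeError:
    underflow = True
```

In this model the object is released one put too early, and the last legitimate put is the one that trips the underflow check.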

=============================================================

After booting a kernel built from the groovy:linux master-next branch as of 2021-03-10, NULL pointer dereferences are seen.

One example is below:

[ 10.012503] BUG: kernel NULL pointer dereference, address: 0000000000000518
[ 10.030761] #PF: supervisor read access in kernel mode
[ 10.042518] #PF: error_code(0x0000) - not-present page
[ 10.050165] PGD 0 P4D 0
[ 10.077050] Oops: 0000 [#1] SMP PTI
[ 10.081927] CPU: 0 PID: 516 Comm: kexec-load Tainted: G W 5.8.0-45-generic #51
[ 10.092486] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1 04/01/2014
[ 10.103510] RIP: 0010:__mod_memcg_state.part.0+0xc/0x90
[ 10.115100] Code: f0 56 d0 ba e8 f5 9e 2e 00 5b 41 5c 41 5d 5d c3 4c 8b 25 ff 52 99 01 e9 76 ff ff ff 0f 0b 0f 1f 44 00 00 48 63 d2 55 48 63 f6 <48> 8b 87 18 05 00 00 65 48 8b 0c f0 48 01 ca 48 c1 e6 03 49 89 d0
[ 10.145025] RSP: 0018:ffffab9780557ab0 EFLAGS: 00010096
[ 10.146841] RAX: ffffffffffffffe2 RBX: 0000000000000002 RCX: 0000000000032183
[ 10.149891] RDX: ffffffffffffffff RSI: 0000000000000002 RDI: 0000000000000000
[ 10.153006] RBP: ffffab9780557ae8 R08: ffffffffffffffff R09: 0000000000000004
[ 10.165999] R10: fffff30fc1cb2a88 R11: ffffffffffffffff R12: ffff88ec39f32400
[ 10.168142] R13: ffffffffffffffff R14: 0000000000000001 R15: ffff88ec3ffb2000
[ 10.170299] FS: 0000000000000000(0000) GS:ffff88ec3dc00000(0000) knlGS:0000000000000000
[ 10.172783] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 10.175285] CR2: 0000000000000518 CR3: 0000000078a7c000 CR4: 00000000000006f0
[ 10.178009] Call Trace:
[ 10.179133] ? __mod_lruvec_state+0x47/0xf0
[ 10.180897] __activate_page.part.0+0x125/0x290
[ 10.182665] __activate_page+0x3a/0x40
[ 10.184496] pagevec_lru_move_fn+0x9d/0xe0
[ 10.186124] ? __activate_page.part.0+0x290/0x290
[ 10.188030] lru_add_drain_cpu+0xeb/0x1b0
[ 10.190041] lru_add_drain+0x28/0x40
[ 10.194029] exit_mmap+0x82/0x1b0
[ 10.195400] ? get_file_caps.constprop.0+0xa2/0x150
[ 10.197578] ? _cond_resched+0x1a/0x50
[ 10.199834] ? mutex_lock+0x13/0x40
[ 10.201931] mmput+0x5f/0x140
[ 10.203772] exec_mmap+0x198/0x220
[ 10.205484] begin_new_exec+0x9e/0x2d0
[ 10.207132] load_elf_binary+0x7b2/0xe20
[ 10.209471] ? ima_bprm_check+0x89/0xb0
[ 10.211378] search_binary_handler+0xe1/0x270
[ 10.213590] exec_binprm+0x51/0x1a0
[ 10.215013] __do_execve_file+0x361/0x5b0
[ 10.216671] do_execve+0x27/0x30
[ 10.218596] __x64_sys_execve+0x2c/0x40
[ 10.220646] do_syscall_64+0x49/0xc0
[ 10.222729] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 10.226379] RIP: 0033:0x7f8881dafb7b
[ 10.228548] Code: Unable to access opcode bytes at RIP 0x7f8881dafb51.
[ 10.230985] RSP: 002b:00007fffa1572278 EFLAGS: 00000246 ORIG_RAX: 000000000000003b
[ 10.233907] RAX: ffffffffffffffda RBX: 00000000ffffffff RCX: 00007f8881dafb7b
[ 10.236543] RDX: 00005576aad6e7a8 RSI: 00005576aad6e788 RDI: 00005576aad6e7d8
[ 10.240265] RBP: 00005576aad6e788 R08: 00005576aad6e7d8 R09: feff5475a9d4ff72
[ 10.243031] R10: 00007f8881d76610 R11: 0000000000000246 R12: 00005576aa32447e
[ 10.245755] R13: 00005576aad6e7a8 R14: 00005576aad6e7a8 R15: 00005576aad6e7d8
[ 10.248772] Modules linked in: isofs binfmt_misc nls_iso8859_1 joydev input_leds serio_raw sch_fq_codel drm ip_tables x_tables autofs4 ahci psmouse libahci virtio_blk xhci_pci xhci_pci_renesas virtio_net net_failover failover
[ 10.258738] CR2: 0000000000000518
[ 10.260139] ---[ end trace f7c347003caf39b8 ]---
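
A note on reading the trace above: the bytes starting at the <48> marker in the Code: line, 48 8b 87 18 05 00 00, decode to mov 0x518(%rdi),%rax, and RDI is 0, so the access lands at 0 + 0x518, which matches CR2. The function was therefore handed a NULL pointer and read a field at offset 0x518 from it. That the pointer is a memcg pointer is an inference from the function name, not something the log states. The Python sketch below checks this arithmetic against lines copied from the oops; the helper names are illustrative, and the decoder handles only this one instruction form.

```python
import re

# Lines copied from the first oops (timestamps dropped).
OOPS = """\
RIP: 0010:__mod_memcg_state.part.0+0xc/0x90
Code: f0 56 d0 ba e8 f5 9e 2e 00 5b 41 5c 41 5d 5d c3 4c 8b 25 ff 52 99 01 e9 76 ff ff ff 0f 0b 0f 1f 44 00 00 48 63 d2 55 48 63 f6 <48> 8b 87 18 05 00 00 65 48 8b 0c f0 48 01 ca 48 c1 e6 03 49 89 d0
RDX: ffffffffffffffff RSI: 0000000000000002 RDI: 0000000000000000
CR2: 0000000000000518 CR3: 0000000078a7c000 CR4: 00000000000006f0
"""


def register(text, name):
    """Return the 64-bit value printed after 'NAME: ' in the oops text."""
    match = re.search(rf"\b{name}: ([0-9a-f]{{16}})", text)
    return int(match.group(1), 16)


def faulting_bytes(text):
    """Return the Code: bytes from the <xx> marker (the faulting RIP) onward."""
    code = re.search(r"Code: (.*)", text).group(1).split()
    start = next(i for i, tok in enumerate(code) if tok.startswith("<"))
    return [int(tok.strip("<>"), 16) for tok in code[start:]]


def decode_mov_disp32(insn):
    """Decode 'REX.W 8b /r' with mod=10 (base register + disp32, no SIB)."""
    rex, opcode, modrm = insn[0], insn[1], insn[2]
    assert rex == 0x48 and opcode == 0x8B, "not the instruction form handled here"
    mod, rm = modrm >> 6, modrm & 7
    assert mod == 0b10 and rm != 0b100, "expected disp32 addressing without SIB"
    disp = int.from_bytes(bytes(insn[3:7]), "little", signed=True)
    base = ["RAX", "RCX", "RDX", "RBX", "RSP", "RBP", "RSI", "RDI"][rm]
    return base, disp


base, disp = decode_mov_disp32(faulting_bytes(OOPS))
fault = register(OOPS, base) + disp
cr2 = register(OOPS, "CR2")
print(f"{base} + {disp:#x} = {fault:#x}, CR2 = {cr2:#x}")
```

The same bytes and the same CR2 value appear in the second oops further down, so the same reading applies there.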

Thadeu Lima de Souza Cascardo (cascardo) wrote :

One other example:

[ 41.499636] BUG: kernel NULL pointer dereference, address: 0000000000000518
[ 41.506015] #PF: supervisor read access in kernel mode
[ 41.508850] #PF: error_code(0x0000) - not-present page
[ 41.510728] PGD 0 P4D 0
[ 41.511714] Oops: 0000 [#1] SMP PTI
[ 41.513040] CPU: 1 PID: 198 Comm: kworker/u8:4 Tainted: G W 5.8.0-45-generic #51
[ 41.516172] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-1 04/01/2014
[ 41.519019] Workqueue: writeback wb_workfn (flush-252:0)
[ 41.520954] RIP: 0010:__mod_memcg_state.part.0+0xc/0x90
[ 41.522845] Code: f0 56 30 93 e8 15 9f 2e 00 5b 41 5c 41 5d 5d c3 4c 8b 25 ff 52 99 01 e9 76 ff ff ff 0f 0b 0f 1f 44 00 00 48 63 d2 55 48 63 f6 <48> 8b 87 18 05 00 00 65 48 8b 0c f0 48 01 ca 48 c1 e6 03 49 89 d0
[ 41.536800] RSP: 0018:ffffabad803ff7d8 EFLAGS: 00010097
[ 41.540726] RAX: ffffffffffffffe2 RBX: 0000000000000011 RCX: 0000000000032192
[ 41.543210] RDX: ffffffffffffffff RSI: 0000000000000011 RDI: 0000000000000000
[ 41.545567] RBP: ffffabad803ff810 R08: ffffffffffffffff R09: ffff96e43801ec00
[ 41.547992] R10: 0000000000000000 R11: 0000000000001000 R12: ffff96e43801ec00
[ 41.550528] R13: ffffffffffffffff R14: 0000000000000000 R15: ffff96e43ffb2000
[ 41.552904] FS: 0000000000000000(0000) GS:ffff96e43dc80000(0000) knlGS:0000000000000000
[ 41.557020] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 41.559173] CR2: 0000000000000518 CR3: 0000000035f4c000 CR4: 00000000000006e0
[ 41.561005] Call Trace:
[ 41.561769] ? __mod_lruvec_state+0x47/0xf0
[ 41.562897] clear_page_dirty_for_io+0x187/0x200
[ 41.564111] mpage_submit_page+0x24/0x90
[ 41.565181] mpage_map_and_submit_buffers+0xe3/0x190
[ 41.566477] mpage_map_and_submit_extent+0x5a/0x200
[ 41.567732] ext4_writepages+0x671/0x860
[ 41.568882] ? update_load_avg+0x82/0x630
[ 41.570181] do_writepages+0x38/0xc0
[ 41.571320] ? write_inode+0x5c/0x100
[ 41.572625] __writeback_single_inode+0x40/0x230
[ 41.574046] writeback_sb_inodes+0x22a/0x4e0
[ 41.575380] __writeback_inodes_wb+0x56/0xf0
[ 41.576798] wb_writeback+0x201/0x2e0
[ 41.578252] wb_check_old_data_flush+0xb7/0xc0
[ 41.580364] wb_do_writeback+0xbe/0x180
[ 41.581989] ? set_worker_desc+0xa6/0xb0
[ 41.583553] wb_workfn+0x74/0x290
[ 41.589094] ? __switch_to+0x7f/0x380
[ 41.590524] ? __switch_to_asm+0x42/0x70
[ 41.591753] ? __switch_to_asm+0x36/0x70
[ 41.593102] process_one_work+0x1e8/0x3b0
[ 41.594571] worker_thread+0x50/0x370
[ 41.595935] kthread+0x12f/0x150
[ 41.597224] ? process_one_work+0x3b0/0x3b0
[ 41.598772] ? __kthread_bind_mask+0x70/0x70
[ 41.600473] ret_from_fork+0x22/0x30
[ 41.601997] Modules linked in: isofs binfmt_misc nls_iso8859_1 input_leds joydev serio_raw sch_fq_codel drm ip_tables x_tables autofs4 ahci xhci_pci xhci_pci_renesas psmouse virtio_net libahci net_failover virtio_blk failover
[ 41.609197] CR2: 0000000000000518
[ 41.610567] ---[ end trace 63fecb49c24b6bde ]---
[ 41.612023] RIP: 0010:__mod_memcg_state.part.0+0xc/0x90
[ 41.613631] Code: f0 56 30 93 e8 15 9f 2e 00 5b 41 5c 41 5d 5d c3 4c 8b 25 ff 52 99 01 e9 76 ff ff ff 0f 0b 0f 1...

Changed in linux (Ubuntu):
status: New → Invalid
Changed in linux (Ubuntu Groovy):
status: New → In Progress
assignee: nobody → Thadeu Lima de Souza Cascardo (cascardo)
importance: Undecided → Critical
Thadeu Lima de Souza Cascardo (cascardo) wrote :

Running update-grub/grub-probe seems to trigger it every time.

summary: - vm changes cause NULL pointer derefs
+ improper memcg accounting causes NULL pointer derefs
description: updated
Changed in linux (Ubuntu Groovy):
status: In Progress → Fix Committed
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-groovy' to 'verification-done-groovy'. If the problem still exists, change the tag 'verification-needed-groovy' to 'verification-failed-groovy'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-groovy
Khaled El Mously (kmously) wrote :

To verify:
 - I installed groovy on 'durin' (in the maas pod), which came up with kernel -48.
 - Enabled -proposed and installed the new -49 kernel (this automatically runs update-grub).
 - Rebooted and confirmed that -49 is running.
 - Ran update-grub manually.

No NULL dereferences were observed.
Marking as verified.
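
For anyone repeating this check, the "no NULL dereferences" observation can be made less manual by scanning the kernel log for the signature line seen in the traces above. The following is a minimal Python sketch under that assumption; the function name and sample strings are illustrative, and it only matches the exact "BUG: kernel NULL pointer dereference" wording from this report.

```python
import re

# Matches the first line of the oopses quoted in this report.
OOPS_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BUG: kernel NULL pointer dereference, "
    r"address: (?P<addr>[0-9a-f]+)"
)


def find_null_derefs(log_text):
    """Return a (timestamp, address) pair for each NULL deref in the log text."""
    return [
        (float(m["ts"]), int(m["addr"], 16))
        for m in OOPS_RE.finditer(log_text)
    ]


# Sample input: the first two lines of the oops from the bug description.
SAMPLE_BAD = (
    "[ 10.012503] BUG: kernel NULL pointer dereference, address: 0000000000000518\n"
    "[ 10.030761] #PF: supervisor read access in kernel mode\n"
)
# Sample input with no oops signature in it.
SAMPLE_CLEAN = "[ 10.030761] #PF: supervisor read access in kernel mode\n"

hits = find_null_derefs(SAMPLE_BAD)
clean = find_null_derefs(SAMPLE_CLEAN)
```

On a live system the text could come from the output of dmesg captured after running update-grub; an empty result would correspond to the outcome reported here.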

tags: added: verification-done-groovy
removed: verification-needed-groovy
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 5.8.0-49.55

---------------
linux (5.8.0-49.55) groovy; urgency=medium

  * groovy/linux: 5.8.0-49.55 -proposed tracker (LP: #1921053)

  * selftests: bpf verifier fails after sanitize_ptr_alu fixes (LP: #1920995)
    - bpf: Simplify alu_limit masking for pointer arithmetic
    - bpf: Add sanity check for upper ptr_limit
    - bpf, selftests: Fix up some test_verifier cases for unprivileged

  * Packaging resync (LP: #1786013)
    - update dkms package versions

  * improper memcg accounting causes NULL pointer derefs (LP: #1918668)
    - SAUCE: Revert "mm: memcg/slab: optimize objcg stock draining"

  * kernel: Enable CONFIG_BPF_LSM on Ubuntu (LP: #1905975)
    - [Config] Enable CONFIG_BPF_LSM

  * Groovy update: upstream stable patchset 2021-03-10 (LP: #1918516)
    - gpio: mvebu: fix pwm .get_state period calculation
    - HID: wacom: Correct NULL dereference on AES pen proximity
    - media: v4l2-subdev.h: BIT() is not available in userspace
    - RDMA/vmw_pvrdma: Fix network_hdr_type reported in WC
    - kernel/io_uring: cancel io_uring before task works
    - io_uring: dont kill fasync under completion_lock
    - objtool: Don't fail on missing symbol table
    - mm/page_alloc: add a missing mm_page_alloc_zone_locked() tracepoint
    - mm: fix a race on nr_swap_pages
    - tools: Factor HOSTCC, HOSTLD, HOSTAR definitions
    - iwlwifi: provide gso_type to GSO packets
    - tty: avoid using vfs_iocb_iter_write() for redirected console writes
    - ACPI: sysfs: Prefer "compatible" modalias
    - kernel: kexec: remove the lock operation of system_transition_mutex
    - ALSA: hda/realtek: Enable headset of ASUS B1400CEPE with ALC256
    - ALSA: hda/via: Apply the workaround generically for Clevo machines
    - parisc: Enable -mlong-calls gcc option by default when !CONFIG_MODULES
    - media: cec: add stm32 driver
    - media: hantro: Fix reset_raw_fmt initialization
    - media: rc: fix timeout handling after switch to microsecond durations
    - media: rc: ite-cir: fix min_timeout calculation
    - media: rc: ensure that uevent can be read directly after rc device register
    - ARM: dts: tbs2910: rename MMC node aliases
    - ARM: dts: ux500: Reserve memory carveouts
    - ARM: dts: imx6qdl-gw52xx: fix duplicate regulator naming
    - wext: fix NULL-ptr-dereference with cfg80211's lack of commit()
    - ASoC: AMD Renoir - refine DMI entries for some Lenovo products
    - drm/i915: Always flush the active worker before returning from the wait
    - drm/i915/gt: Always try to reserve GGTT address 0x0
    - drivers/nouveau/kms/nv50-: Reject format modifiers for cursor planes
    - net: usb: qmi_wwan: added support for Thales Cinterion PLSx3 modem family
    - s390: uv: Fix sysfs max number of VCPUs reporting
    - s390/vfio-ap: No need to disable IRQ after queue reset
    - PM: hibernate: flush swap writer after marking
    - x86/entry: Emit a symbol for register restoring thunk
    - efi/apple-properties: Reinstate support for boolean properties
    - drivers: soc: atmel: Avoid calling at91_soc_init on non AT91 SoCs
    - drivers: soc: atmel: add null entry at the end of at91_soc_allowed_list[]
   ...

Changed in linux (Ubuntu Groovy):
status: Fix Committed → Fix Released