ubunut_kernel_selftests: memory-hotplug: avoid spamming logs with dump_page()

Bug #1941829 reported by Po-Hsu Lin
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
ubuntu-kernel-tests
Undecided
Po-Hsu Lin
linux (Ubuntu)
Undecided
Po-Hsu Lin
Bionic
Undecided
Po-Hsu Lin
Focal
Undecided
Po-Hsu Lin
Hirsute
Undecided
Po-Hsu Lin
Impish
Undecided
Po-Hsu Lin
linux-oem-5.10 (Ubuntu)
Undecided
Unassigned
Bionic
Undecided
Unassigned
Focal
Undecided
Po-Hsu Lin
Hirsute
Undecided
Unassigned
Impish
Undecided
Unassigned
linux-oem-5.13 (Ubuntu)
Undecided
Unassigned
Bionic
Undecided
Unassigned
Focal
Undecided
Po-Hsu Lin
Hirsute
Undecided
Unassigned
Impish
Undecided
Unassigned

Bug Description

[Impact]
The memory-hotplug test has been intermittently timing out (or trashing the test
VM, see below) on Impish/Hirsute ppc64el and x86-64 for quite some time now.

While the offline memory test obey ratio limit, the same test with
error injection does not and tries to offline all the hotpluggable
memory, spamming system logs with hundreds of thousands of dump_page()
entries, slowing system down (to the point the test itself timesout and
gets terminated) and excessive fs occupation:

...
[ 9784.393354] page:c00c0000007d1b40 refcount:3 mapcount:0 mapping:c0000001fc03e950 index:0xe7b
[ 9784.393355] def_blk_aops
[ 9784.393356] flags: 0x3ffff800002062(referenced|active|workingset|private)
[ 9784.393358] raw: 003ffff800002062 c0000001b9343a68 c0000001b9343a68 c0000001fc03e950
[ 9784.393359] raw: 0000000000000e7b c000000006607b18 00000003ffffffff c00000000490d000
[ 9784.393359] page dumped because: migration failure
[ 9784.393360] page->mem_cgroup:c00000000490d000
[ 9784.393416] migrating pfn 1f46d failed ret:1
...

$ grep "page dumped because: migration failure" /var/log/kern.log | wc -l
2405558

$ ls -la /var/log/kern.log
-rw-r----- 1 syslog adm 2256109539 Jun 30 14:19 /var/log/kern.log

[Fix]
* 0c0f6299ba71fa selftests: memory-hotplug: avoid spamming logs with dump_page(), ratio limit hot-remove error test

[Test Plan]
Run the memory-hotplug test, this log spamming issue should not happen again.

[Where problems could occur]
If this fix is incorrect we might be unable to catch some other issue.

CVE References

Po-Hsu Lin (cypressyew)
tags: added: ubuntu-kernel-selftests
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1941829

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Po-Hsu Lin (cypressyew)
Changed in linux-oem-5.10 (Ubuntu Bionic):
status: New → Invalid
Changed in ubuntu-kernel-tests:
assignee: nobody → Po-Hsu Lin (cypressyew)
status: New → In Progress
Changed in linux-oem-5.10 (Ubuntu Hirsute):
status: New → Invalid
Changed in linux-oem-5.10 (Ubuntu Impish):
status: New → Invalid
Changed in linux (Ubuntu Bionic):
assignee: nobody → Po-Hsu Lin (cypressyew)
status: New → In Progress
Changed in linux (Ubuntu Focal):
assignee: nobody → Po-Hsu Lin (cypressyew)
status: New → In Progress
Changed in linux (Ubuntu Hirsute):
assignee: nobody → Po-Hsu Lin (cypressyew)
status: New → In Progress
Changed in linux (Ubuntu Impish):
assignee: nobody → Po-Hsu Lin (cypressyew)
status: Incomplete → In Progress
Changed in linux-oem-5.13 (Ubuntu Bionic):
status: New → Invalid
Changed in linux-oem-5.13 (Ubuntu Hirsute):
status: New → Invalid
Changed in linux-oem-5.13 (Ubuntu Impish):
status: New → Invalid
Changed in linux-oem-5.10 (Ubuntu Focal):
assignee: nobody → Po-Hsu Lin (cypressyew)
status: New → In Progress
Changed in linux-oem-5.13 (Ubuntu Focal):
assignee: nobody → Po-Hsu Lin (cypressyew)
status: New → In Progress
Po-Hsu Lin (cypressyew)
tags: added: 4.15 5.10 5.11 5.13 5.4 bionic focal hirsute
tags: added: impish
Changed in linux (Ubuntu Bionic):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Focal):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Hirsute):
status: In Progress → Fix Committed
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-focal' to 'verification-done-focal'. If the problem still exists, change the tag 'verification-needed-focal' to 'verification-failed-focal'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-focal
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-hirsute' to 'verification-done-hirsute'. If the problem still exists, change the tag 'verification-needed-hirsute' to 'verification-failed-hirsute'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-hirsute
Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-bionic' to 'verification-done-bionic'. If the problem still exists, change the tag 'verification-needed-bionic' to 'verification-failed-bionic'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-bionic
AceLan Kao (acelankao)
Changed in linux-oem-5.10 (Ubuntu Focal):
status: In Progress → Fix Committed
Timo Aaltonen (tjaalton)
Changed in linux-oem-5.13 (Ubuntu Focal):
status: In Progress → Fix Committed
Revision history for this message
Po-Hsu Lin (cypressyew) wrote :

Didn't see this issue on B/F/H

tags: added: verification-done-bionic verification-done-focal verification-done-hirsute
removed: verification-needed-bionic verification-needed-focal verification-needed-hirsute
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (34.1 KiB)

This bug was fixed in the package linux - 5.4.0-88.99

---------------
linux (5.4.0-88.99) focal; urgency=medium

  * focal/linux: 5.4.0-88.99 -proposed tracker (LP: #1944747)

  * Packaging resync (LP: #1786013)
    - debian/dkms-versions -- update from kernel-versions (main/2021.09.06)

  * please drop virtualbox-guest-dkms virtualbox-guest-source (LP: #1933248)
    - Revert "UBUNTU: [Config] Disable virtualbox dkms build"

linux (5.4.0-87.98) focal; urgency=medium

  * please drop virtualbox-guest-dkms virtualbox-guest-source (LP: #1933248)
    - [Config] Disable virtualbox dkms build

  * Packaging resync (LP: #1786013)
    - debian/dkms-versions -- update from kernel-versions (main/2021.09.06)

  * LRMv5: switch primary version handling to kernel-versions data set
    (LP: #1928921)
    - [Packaging] switch to kernel-versions

  * disable “CONFIG_HISI_DMA” config for ubuntu version (LP: #1936771)
    - Disable CONFIG_HISI_DMA
    - [Config] Record hisi_dma no longer built for arm64

  * memory leaking when removing a profile (LP: #1939915)
    - apparmor: Fix memory leak of profile proxy

  * CryptoExpress EP11 cards are going offline (LP: #1939618)
    - s390/zcrypt: Support for CCA protected key block version 2
    - s390: Replace zero-length array with flexible-array member
    - s390/zcrypt: Use scnprintf() for avoiding potential buffer overflow
    - s390/zcrypt: replace snprintf/sprintf with scnprintf
    - s390/ap: Remove ap device suspend and resume callbacks
    - s390/zcrypt: use fallthrough;
    - s390/zcrypt: use kvmalloc instead of kmalloc for 256k alloc
    - s390/ap: remove power management code from ap bus and drivers
    - s390/ap: introduce new ap function ap_get_qdev()
    - s390/zcrypt: use kzalloc
    - s390/zcrypt: fix smatch warnings
    - s390/zcrypt: code beautification and struct field renames
    - s390/zcrypt: split ioctl function into smaller code units
    - s390/ap: rename and clarify ap state machine related stuff
    - s390/zcrypt: provide cex4 cca sysfs attributes for cex3
    - s390/ap: rework crypto config info and default domain code
    - s390/zcrypt: simplify cca_findcard2 loop code
    - s390/zcrypt: remove set_fs() invocation in zcrypt device driver
    - s390/ap: remove unnecessary spin_lock_init()
    - s390/zcrypt: Support for CCA APKA master keys
    - s390/zcrypt: introduce msg tracking in zcrypt functions
    - s390/ap: split ap queue state machine state from device state
    - s390/ap: add error response code field for ap queue devices
    - s390/ap: add card/queue deconfig state
    - s390/sclp: Add support for SCLP AP adapter config/deconfig
    - s390/ap: Support AP card SCLP config and deconfig operations
    - s390/ap/zcrypt: revisit ap and zcrypt error handling
    - s390/zcrypt: move ap_msg param one level up the call chain
    - s390/zcrypt: Introduce Failure Injection feature
    - s390/zcrypt: fix wrong format specifications
    - s390/ap: fix ap devices reference counting
    - s390/zcrypt: return EIO when msg retry limit reached
    - s390/zcrypt: fix zcard and zqueue hot-unplug memleak
    - s390/ap: Fix hanging ioctl caused by wrong msg counter

  * memfd from ubuntu_kernel_s...

Changed in linux (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (68.6 KiB)

This bug was fixed in the package linux - 5.11.0-37.41

---------------
linux (5.11.0-37.41) hirsute; urgency=medium

  * hirsute/linux: 5.11.0-37.41 -proposed tracker (LP: #1944180)

  * CVE-2021-41073
    - io_uring: ensure symmetry in handling iter types in loop_rw_iter()

  * Packaging resync (LP: #1786013)
    - debian/dkms-versions -- update from kernel-versions (main/2021.09.06)

  * LRMv5: switch primary version handling to kernel-versions data set
    (LP: #1928921)
    - [Packaging] switch to kernel-versions

  * disable “CONFIG_HISI_DMA” config for ubuntu version (LP: #1936771)
    - Disable CONFIG_HISI_DMA
    - [Config] Record hisi_dma no longer built for arm64

  * ubunut_kernel_selftests: memory-hotplug: avoid spamming logs with
    dump_page() (LP: #1941829)
    - selftests: memory-hotplug: avoid spamming logs with dump_page(), ratio limit
      hot-remove error test

  * alsa: the soundwire audio doesn't work on the Dell TGL-H machines
    (LP: #1941669)
    - ASoC: SOF: allow soundwire use desc->default_fw_filename
    - ASoC: Intel: tgl: remove sof_fw_filename set for tgl_3_in_1_default

  * e1000e blocks the boot process when it tried to write checksum to its NVM
    (LP: #1936998)
    - e1000e: Do not take care about recovery NVM checksum

  * Dell XPS 17 (9710) PCI/internal sound card not detected (LP: #1935850)
    - ASoC: Intel: sof_sdw: include rt711.h for RT711 JD mode
    - ASoC: Intel: sof_sdw: add quirk for Dell XPS 9710

  * mute/micmute LEDs no function on HP ProBook 650 G8 (LP: #1939473)
    - ALSA: hda/realtek: fix mute/micmute LEDs for HP ProBook 650 G8 Notebook PC

  * Fix mic noise on HP ProBook 445 G8 (LP: #1940610)
    - ALSA: hda/realtek: Limit mic boost on HP ProBook 445 G8

  * GPIO error logs in start and dmesg after update of kernel (LP: #1937897)
    - ODM: mfd: Check AAEON BFPI version before adding device

  * External displays not working on Thinkpad T490 with ThinkPad Thunderbolt 3
    Dock (LP: #1938999)
    - drm/i915/ilk-glk: Fix link training on links with LTTPRs

  * Fix kernel panic caused by legacy devices on AMD platforms (LP: #1936682)
    - SAUCE: iommu/amd: Keep swiotlb enabled to ensure devices with 32bit DMA
      still work

  * Hirsute update: upstream stable patchset 2021-08-30 (LP: #1942123)
    - drm/i915: Revert "drm/i915/gem: Asynchronous cmdparser"
    - Revert "drm/i915: Propagate errors on awaiting already signaled fences"
    - regulator: rtmv20: Fix wrong mask for strobe-polarity-high
    - regulator: rt5033: Fix n_voltages settings for BUCK and LDO
    - spi: stm32h7: fix full duplex irq handler handling
    - ASoC: tlv320aic31xx: fix reversed bclk/wclk master bits
    - r8152: Fix potential PM refcount imbalance
    - qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union()
    - ASoC: rt5682: Fix the issue of garbled recording after powerd_dbus_suspend
    - net: Fix zero-copy head len calculation.
    - ASoC: ti: j721e-evm: Fix unbalanced domain activity tracking during startup
    - ASoC: ti: j721e-evm: Check for not initialized parent_clk_id
    - efi/mokvar: Reserve the table only if it is in boot services data
    - nvme: fix nvme_setup_command ...

Changed in linux (Ubuntu Hirsute):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-oem-5.13 - 5.13.0-1014.18

---------------
linux-oem-5.13 (5.13.0-1014.18) focal; urgency=medium

  * focal/linux-oem-5.13: 5.13.0-1014.18 -proposed tracker (LP: #1944429)

  * CVE-2021-40490
    - ext4: fix race writing to an inline_data file while its xattrs are changing

  * CVE-2021-41073
    - io_uring: ensure symmetry in handling iter types in loop_rw_iter()

 -- Timo Aaltonen <email address hidden> Wed, 22 Sep 2021 18:34:24 +0300

Changed in linux-oem-5.13 (Ubuntu Focal):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :
Download full text (12.2 KiB)

This bug was fixed in the package linux - 4.15.0-159.167

---------------
linux (4.15.0-159.167) bionic; urgency=medium

  * Packaging resync (LP: #1786013)
    - debian/dkms-versions -- update from kernel-versions (main/2021.09.06)

  * dell300x: rsi wifi and bluetooth crash after suspend and resume
    (LP: #1940488)
    - Revert "rsi: Use resume_noirq for SDIO"

  * LRMv5: switch primary version handling to kernel-versions data set
    (LP: #1928921)
    - [Packaging] switch to kernel-versions

  * kvm_unit_tests: emulator test fails on 4.4 / 4.15 kernel, timeout
    (LP: #1932966)
    - kvm: Add emulation for movups/movupd

  * memory leaking when removing a profile (LP: #1939915)
    - security/apparmor/label.c: Clean code by removing redundant instructions
    - apparmor: Fix memory leak of profile proxy

  * ubunut_kernel_selftests: memory-hotplug: avoid spamming logs with
    dump_page() (LP: #1941829)
    - selftests: memory-hotplug: avoid spamming logs with dump_page(), ratio limit
      hot-remove error test

  * Bionic update: upstream stable patchset 2021-08-27 (LP: #1941916)
    - btrfs: mark compressed range uptodate only if all bio succeed
    - regulator: rt5033: Fix n_voltages settings for BUCK and LDO
    - r8152: Fix potential PM refcount imbalance
    - qed: fix possible unpaired spin_{un}lock_bh in _qed_mcp_cmd_and_union()
    - net: Fix zero-copy head len calculation.
    - Revert "Bluetooth: Shutdown controller after workqueues are flushed or
      cancelled"
    - KVM: do not allow mapping valid but non-reference-counted pages
    - Revert "watchdog: iTCO_wdt: Account for rebooting on second timeout"
    - spi: mediatek: Fix fifo transfer
    - padata: validate cpumask without removed CPU during offline
    - Revert "ACPICA: Fix memory leak caused by _CID repair function"
    - ALSA: seq: Fix racy deletion of subscriber
    - clk: stm32f4: fix post divisor setup for I2S/SAI PLLs
    - omap5-board-common: remove not physically existing vdds_1v8_main fixed-
      regulator
    - scsi: sr: Return correct event when media event code is 3
    - media: videobuf2-core: dequeue if start_streaming fails
    - net: natsemi: Fix missing pci_disable_device() in probe and remove
    - nfp: update ethtool reporting of pauseframe control
    - mips: Fix non-POSIX regexp
    - bnx2x: fix an error code in bnx2x_nic_load()
    - net: pegasus: fix uninit-value in get_interrupt_interval
    - net: fec: fix use-after-free in fec_drv_remove
    - net: vxge: fix use-after-free in vxge_device_unregister
    - Bluetooth: defer cleanup of resources in hci_unregister_dev()
    - USB: usbtmc: Fix RCU stall warning
    - USB: serial: option: add Telit FD980 composition 0x1056
    - USB: serial: ch341: fix character loss at high transfer rates
    - USB: serial: ftdi_sio: add device ID for Auto-M3 OP-COM v2
    - usb: gadget: f_hid: added GET_IDLE and SET_IDLE handlers
    - usb: gadget: f_hid: fixed NULL pointer dereference
    - usb: gadget: f_hid: idle uses the highest byte for duration
    - usb: otg-fsm: Fix hrtimer list corruption
    - scripts/tracing: fix the bug that can't parse raw_trace_func
    - staging: rtl8723bs: Fix a resource lea...

Changed in linux (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux-oem-5.10 - 5.10.0-1049.51

---------------
linux-oem-5.10 (5.10.0-1049.51) focal; urgency=medium

  * focal/linux-oem-5.10: 5.10.0-1049.50 -proposed tracker (LP: #1944209)

  * e1000e extremly slow (LP: #1930754)
    - SAUCE: e1000e: Separate TGP board type from SPT
    - SAUCE: e1000e: Fixing packet loss issues on new platforms

  * CVE-2021-41073
    - io_uring: ensure symmetry in handling iter types in loop_rw_iter()

 -- Chia-Lin Kao (AceLan) <email address hidden> Mon, 27 Sep 2021 18:33:36 +0800

Changed in linux-oem-5.10 (Ubuntu Focal):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  Edit
Everyone can see this information.

Other bug subscribers