Slab page exclusion issue on Linux 6.2-rc1

Bug #2038248 reported by Chengen Du
Affects                Status         Importance   Assigned to   Milestone
makedumpfile (Ubuntu)  Fix Released   Undecided    Unassigned
  Jammy                Fix Released   Medium       Chengen Du
  Lunar                Fix Released   Medium       Chengen Du

Bug Description

[Impact]

Kernel crash dumps generated by makedumpfile on kernel 6.2
(affects Lunar, and Jammy with the HWE kernel) might fail to open
in crash, due to kernel changes not reflected in makedumpfile.

Kernel commit 130d4df57390 ("mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head"), included in Linux 6.2-rc1 and later, made the offset of slab.slabs equal to that of page.mapping.
As a side effect, makedumpfile with the -d 8 option, which is meant to exclude user data, wrongly excludes some slab pages.
With dump files generated this way, the "crash" utility may fail to start a session with an error like:

crash: page excluded: kernel virtual address: ffff0000e269d428 type: "xa_node shift"
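
The collision can be modelled in a few lines of shell arithmetic. This is a simplified sketch, not makedumpfile's actual code: the function names are hypothetical, the value 0x1 for PAGE_MAPPING_ANON and the treatment of "-d 8" as a single bit are assumptions stated for illustration, and the "not a slab page" guard follows the upstream patch title rather than its exact implementation.

```shell
#!/bin/sh
# Simplified model of the "-d 8" user-data check (hypothetical names).

DL_EXCLUDE_USER_DATA=8   # assumed: the bit selected by "-d 8"
PAGE_MAPPING_ANON=1      # assumed: low bit of page.mapping marks an anonymous page

# Args: dump_level, word read at the page.mapping offset.
# Pre-fix model: only the low bit of that word is inspected.
excluded_as_user_old() {
    [ $(( $1 & $DL_EXCLUDE_USER_DATA )) -ne 0 ] &&
    [ $(( $2 & $PAGE_MAPPING_ANON )) -ne 0 ]
}

# Args: dump_level, word, is_slab (1 for a slab page, 0 otherwise).
# Post-fix model: a slab page is never classified as anonymous.
excluded_as_user_new() {
    [ "$3" -eq 0 ] && excluded_as_user_old "$1" "$2"
}

# On 6.2 the word at that offset can hold slab.slabs for a slab page.
# An odd count (here 3) has its low bit set and so looks "anonymous".
if excluded_as_user_old 8 3; then echo "old: slab page excluded"; fi
if ! excluded_as_user_new 8 3 1; then echo "new: slab page kept"; fi
```

In this model an even value at the same offset would not trigger the old check, which is consistent with only certain slab pages going missing from the dump.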

[Fix]

An upstream fix is available.
==========
commit 5f17bdd2128998a3eeeb4521d136a192222fadb6
Author: Kazuhito Hagio <email address hidden>
Date: Wed Dec 21 11:06:39 2022 +0900

    [PATCH] Fix wrong exclusion of slab pages on Linux 6.2-rc1
==========

[Test Plan]

1. Install the required packages and then proceed to reboot the machine.
# sudo apt install crash linux-crashdump -y
# reboot

2. To check the status of kdump, use the `kdump-config show` command.
# kdump-config show
DUMP_MODE: kdump
USE_KDUMP: 1
KDUMP_COREDIR: /var/crash
crashkernel addr: 0x64000000
   /var/lib/kdump/vmlinuz: symbolic link to /boot/vmlinuz-6.2.0-33-generic
kdump initrd:
   /var/lib/kdump/initrd.img: symbolic link to /var/lib/kdump/initrd.img-6.2.0-33-generic
current state: ready to kdump

kexec command:
  /sbin/kexec -p --command-line="BOOT_IMAGE=/boot/vmlinuz-6.2.0-33-generic root=UUID=3e72f5d5-870b-4b8e-9a0d-8ba920391379 ro console=tty1 console=ttyS0 reset_devices systemd.unit=kdump-tools-dump.service nr_cpus=1 irqpoll usbcore.nousb" --initrd=/var/lib/kdump/initrd.img /var/lib/kdump/vmlinuz

3. To trigger a crash dump forcefully, execute the `echo c | sudo tee /proc/sysrq-trigger` command.
4. Download the kernel .ddeb file, which will be used for analyzing the dump file.
# sudo -i
# cd /var/crash
# pull-lp-ddebs linux-image-unsigned-$(uname -r)
# dpkg-deb -x linux-image-unsigned-$(uname -r)-*.ddeb dbgsym-$(uname -r)
5. Open the dump file with the "crash" utility. With the unpatched makedumpfile, the session fails with the error shown below.
# crash dbgsym-$(uname -r)/usr/lib/debug/boot/vmlinux-$(uname -r) XXXX/dump.XXXX
...
please wait... (gathering task table data)
crash: page excluded: kernel virtual address: ffff0000e269d428 type: "xa_node shift"
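
Step 5 can be checked mechanically by saving the output of crash to a file and searching it for the error string. A minimal sketch, assuming the output has already been captured; the function name `dump_has_excluded_page_error` and the temporary log are hypothetical, and the log contents here are copied from the failure above rather than produced by a real run.

```shell
#!/bin/sh
# Succeeds if a saved crash log contains the "page excluded" error.
dump_has_excluded_page_error() {
    grep -q 'crash: page excluded:' "$1"
}

# Stand-in for a captured session log (contents copied from the failing run).
log=$(mktemp)
printf '%s\n' \
    'please wait... (gathering task table data)' \
    'crash: page excluded: kernel virtual address: ffff0000e269d428 type: "xa_node shift"' \
    > "$log"

if dump_has_excluded_page_error "$log"; then
    echo "FAIL: dump is missing pages that crash needs"
else
    echo "PASS: no page-exclusion error"
fi
rm -f "$log"
```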

[Where problems could occur]

The patch changes how makedumpfile decides which pages to exclude, to match the struct slab layout introduced in Linux 6.2-rc1.
The change is required for Linux kernel 6.2.
Because it changes which pages end up in the dump file, a regression could, in the worst case, produce a dump that the "crash" utility cannot parse.

Chengen Du (chengendu)
Changed in makedumpfile (Ubuntu Lunar):
assignee: nobody → Chengen Du (chengendu)
status: New → In Progress
Chengen Du (chengendu) wrote :

debdiff for Lunar

Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "lp2038248-makedumpfile-lunar.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Mauricio Faria de Oliveira (mfo) wrote :

Hi Chengen,

Thanks for the detailed SRU template and debdiff!

I have only 2 minor fixes, which I already performed:
- Version: s/ubuntu1/ubuntu0.1/ (see doc [1])
- Maintainer: this is the first 'ubuntu' version, so run `update-maintainer` (see `debian/control` hunk).

The updated debdiff is attached for reference,
and I'll continue the work on sponsoring this.

cheers

[1] https://wiki.ubuntu.com/SecurityTeam/UpdatePreparation#Update_the_packaging

tags: added: se-sponsor-mfo
Mauricio Faria de Oliveira (mfo) wrote :

Hi Chengen,

The fix is included in 1.7.3 in mantic, so only lunar needs the fix.

We probably would like jammy as well, for compatibility with the 6.2+ HWE kernel (without regression to the 5.15 GA kernel).

Could you please check jammy for that too? (I'll add a task as Incomplete.)

 $ git describe --contains 5f17bdd2128998a3eeeb4521d136a192222fadb6
 1.7.3~6

 $ rmadison -a source makedumpfile
  makedumpfile | 1.5.5-2ubuntu1 | trusty | source
  makedumpfile | 1.5.5-2ubuntu1.6 | trusty-updates | source
  makedumpfile | 1:1.5.9-5~ubuntu14.04.1 | trusty-backports | source
  makedumpfile | 1:1.5.9-5 | xenial | source
  makedumpfile | 1:1.6.3-2~16.04.3 | xenial-updates | source
  makedumpfile | 1:1.6.3-2 | bionic | source
  makedumpfile | 1:1.6.5-1ubuntu1~18.04.7 | bionic-updates | source
  makedumpfile | 1:1.6.7-1ubuntu2 | focal | source
  makedumpfile | 1:1.6.7-1ubuntu2.4 | focal-updates | source
  makedumpfile | 1:1.7.0-1build1 | jammy | source
  makedumpfile | 1:1.7.2-1 | lunar | source
  makedumpfile | 1:1.7.3-1 | mantic | source

Packages verified with LXD VM and upstream crash for now (before bug 2038248). All good!

Uploaded to Lunar.

Thanks!

Details:
---

Setup:

 $ lxc launch --vm --config limits.memory=2GiB ubuntu:lunar mdf-l
 $ lxc shell mdf-l

 # apt update && apt install -y linux-image-generic linux-crashdump crash
 # apt remove -y $(dpkg -l | awk '$2 ~ /linux-.*kvm/ { print $2 }')

 # sed 's/crashkernel=[^ "]\+/crashkernel=512M/' -i /etc/default/grub.d/kdump-tools.cfg
 # update-grub
 # reboot
 # kdump-config show | grep state:
 current state: ready to kdump
 # echo c >/proc/sysrq-trigger
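
The sed expression in the setup replaces whatever value follows `crashkernel=` (up to the next space or double quote) with a fixed 512M reservation. The sketch below applies the same expression to a sample line read from stdin; the sample is illustrative only and may not match the shipped contents of /etc/default/grub.d/kdump-tools.cfg, and `set_crashkernel_512m` is a hypothetical helper name. Note that `\+` is a GNU sed extension.

```shell
#!/bin/sh
# Illustrative input only; the real file may differ.
sample='GRUB_CMDLINE_LINUX_DEFAULT="$GRUB_CMDLINE_LINUX_DEFAULT crashkernel=2G-4G:320M,4G-:512M"'

set_crashkernel_512m() {
    # Same expression as in the setup above, filtering stdin
    # instead of editing the file in place.
    sed 's/crashkernel=[^ "]\+/crashkernel=512M/'
}

printf '%s\n' "$sample" | set_crashkernel_512m
```

The closing double quote is preserved because the character class stops at it, so the resulting line stays a valid shell assignment.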

Debug symbols:

 # wget https://launchpad.net/ubuntu/+archive/primary/+files/linux-image-unsigned-6.2.0-34-generic-dbgsym_6.2.0-34.34_amd64.ddeb
 # ar x linux-image-unsigned-6.2.0-34-generic-dbgsym_6.2.0-34.34_amd64.ddeb data.tar.xz
 # tar xvf data.tar.xz ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic
 ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic

Upstream crash (for now):

 # apt build-dep -y crash
 # git clone https://github.com/crash-utility/crash.git
 # cd crash
 # make

Original package:
---

 # ./crash /var/crash/202310072357/dump.202310072357 ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic
 ...
 please wait... (gathering task table data)
 crash: page excluded: kernel virtual address: ffff9b13c2b826c8 type: "xa_node shift"

Patched package:
---

 # wget https://launchpad.net/~mfo/+archive/ubuntu/test/+build/26759821/+files/makedumpfile_1.7.2-1ubuntu0.1_amd64.deb
 # apt install ./makedumpfile_1.7.2-1ubuntu0.1_amd64.deb

 # kdump-config reload
 # kdump-config show | grep state:
 current state: ready to kdump
 # echo c >/proc/sysrq-trigger

 # ./crash /var/crash/202310080054/dump.202310080054 ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic
 ...
       KERNEL: ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic
     DUMPFILE: /var/crash/202310080054/dump.202310080054 [PARTIAL DUMP]
 ...
 crash>

Changed in makedumpfile (Ubuntu Jammy):
status: New → Incomplete
assignee: nobody → Chengen Du (chengendu)
Changed in makedumpfile (Ubuntu Lunar):
importance: Undecided → Medium
Changed in makedumpfile (Ubuntu Jammy):
importance: Undecided → Medium
Changed in makedumpfile (Ubuntu):
status: New → Fix Released
Mauricio Faria de Oliveira (mfo) wrote :

(The package built correctly in a PPA on all architectures.)

Mauricio Faria de Oliveira (mfo) wrote :

Jammy is also affected for the 6.2 HWE kernel.

The patch for makedumpfile is the same, and applies cleanly.
It fixes the issue with the 6.2 HWE kernel, and causes no regression with the 5.15 GA kernel (ie, the dump file can still be opened in crash).

Details:
---

Setup:

 $ lxc launch --vm --config limits.memory=2GiB ubuntu:jammy mdf-j
 $ lxc shell mdf-j

 # apt update && apt install -y linux-image-generic-hwe-22.04 linux-crashdump crash
 # apt remove -y $(dpkg -l | awk '$2 ~ /linux-.*kvm/ { print $2 }')

 # sed 's/crashkernel=[^ "]\+/crashkernel=512M/' -i /etc/default/grub.d/kdump-tools.cfg
 # update-grub
 # reboot
 # kdump-config show | grep state:
 current state: ready to kdump
 # echo c >/proc/sysrq-trigger

Debug symbols:

 # wget https://launchpad.net/ubuntu/+archive/primary/+files/linux-image-unsigned-6.2.0-34-generic-dbgsym_6.2.0-34.34~22.04.1_amd64.ddeb
 # ar x linux-image-unsigned-6.2.0-34-generic-dbgsym_6.2.0-34.34~22.04.1_amd64.ddeb data.tar.xz
 # tar xvf data.tar.xz ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic
 ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic

Upstream crash (for now):

 # apt build-dep -y crash
 # git clone https://github.com/crash-utility/crash.git
 # cd crash
 # make

Original package:
---

 # ./crash ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic /var/crash/202310101134/dump.202310101134
 ...
 please wait... (gathering task table data)
 crash: page excluded: kernel virtual address: ffff9b13c2b826c8 type: "xa_node shift"

Patched package:
---

 $ ./crash ./usr/lib/debug/boot/vmlinux-6.2.0-34-generic /var/crash/202310101206/dump.202310101206
 ...
      RELEASE: 6.2.0-34-generic
      VERSION: #34~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Thu Sep 7 13:12:03 UTC 2
 ...
 crash>

Patched package & GA kernel:
---

 (upstream crash and Ubuntu crash, both work)

 $ ./crash ./usr/lib/debug/boot/vmlinux-5.15.0-86-generic /var/crash/202310101225/dump.202310101225
 ...
      RELEASE: 5.15.0-86-generic
      VERSION: #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023
 ...
 crash>

 $ crash ./usr/lib/debug/boot/vmlinux-5.15.0-86-generic /var/crash/202310101225/dump.202310101225
 ...
      RELEASE: 5.15.0-86-generic
      VERSION: #96-Ubuntu SMP Wed Sep 20 08:23:49 UTC 2023
 ...
 crash>

Changed in makedumpfile (Ubuntu Jammy):
status: Incomplete → Confirmed
Mauricio Faria de Oliveira (mfo) wrote :
description: updated
Mauricio Faria de Oliveira (mfo) wrote :

Uploaded to Jammy too.

The SRU template is now updated to reflect that.

Packages verified with LXD VM and upstream crash for the 6.2 kernel for now (before bug 2038248) and Ubuntu crash for the 5.15 GA kernel. All good!
(Details in comment #7.)

The package built correctly in a PPA on all architectures.

Changed in makedumpfile (Ubuntu Jammy):
status: Confirmed → In Progress
Timo Aaltonen (tjaalton) wrote : Please test proposed package

Hello Chengen, or anyone else affected,

Accepted makedumpfile into lunar-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/makedumpfile/1:1.7.2-1ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-lunar to verification-done-lunar. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-lunar. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in makedumpfile (Ubuntu Lunar):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-lunar
Changed in makedumpfile (Ubuntu Jammy):
status: In Progress → Fix Committed
tags: added: verification-needed-jammy
Timo Aaltonen (tjaalton) wrote :

Hello Chengen, or anyone else affected,

Accepted makedumpfile into jammy-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/makedumpfile/1:1.7.0-1ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-jammy to verification-done-jammy. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-jammy. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Chengen Du (chengendu) wrote :

The package in -proposed has been successfully tested:
1:1.7.0-1ubuntu0.1 in Jammy, 1:1.7.2-1ubuntu0.1 in Lunar

tags: added: verification-done verification-done-jammy verification-done-lunar
removed: verification-needed verification-needed-jammy verification-needed-lunar
Andreas Hasenack (ahasenack) wrote :

Hi Chengen, or someone else:

could you please clarify that you also performed the test with the 5.15 GA kernel for Jammy, not just 6.2, which is what I assume you did?

I see that @mfo tested with 5.15 in comment #7, but that was before the package landed in jammy-proposed.

Chengen Du (chengendu) wrote :

Hi Andreas,

I apologize for the lack of detail.
The 5.15 GA kernel for Jammy has already passed testing. Please see the test results below.
==========
# uname -a
Linux ubuntu 5.15.0-87-generic #97-Ubuntu SMP Tue Oct 3 09:52:42 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
# apt-cache policy makedumpfile
makedumpfile:
  Installed: 1:1.7.0-1ubuntu0.1
  Candidate: 1:1.7.0-1ubuntu0.1
  Version table:
 *** 1:1.7.0-1ubuntu0.1 500
        500 http://ports.ubuntu.com/ubuntu-ports jammy-proposed/main arm64 Packages
        100 /var/lib/dpkg/status
     1:1.7.0-1build1 500
        500 http://ports.ubuntu.com/ubuntu-ports jammy/main arm64 Packages
# crash dbgsym-$(uname -r)/usr/lib/debug/boot/vmlinux-$(uname -r) 202310270330/dump.202310270330
crash 8.0.0
Copyright (C) 2002-2021 Red Hat, Inc.
Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
Copyright (C) 1999-2006 Hewlett-Packard Co
Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
Copyright (C) 2005, 2011, 2020-2021 NEC Corporation
Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
Copyright (C) 2015, 2021 VMware, Inc.
This program is free software, covered by the GNU General Public License,
and you are welcome to change it and/or distribute copies of it under
certain conditions. Enter "help copying" to see the conditions.
This program has absolutely no warranty. Enter "help warranty" for details.

GNU gdb (GDB) 10.2
Copyright (C) 2021 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "aarch64-unknown-linux-gnu".
Type "show configuration" for configuration details.
Find the GDB manual and other documentation resources online at:
    <http://www.gnu.org/software/gdb/documentation/>.

For help, type "help".
Type "apropos word" to search for commands related to "word"...

      KERNEL: dbgsym-5.15.0-87-generic/usr/lib/debug/boot/vmlinux-5.15.0-87-generic
    DUMPFILE: 202310270330/dump.202310270330 [PARTIAL DUMP]
        CPUS: 8
        DATE: Fri Oct 27 03:30:26 UTC 2023
      UPTIME: 2135039823346 days, 00:14:59
LOAD AVERAGE: 0.91, 0.45, 0.17
       TASKS: 215
    NODENAME: ubuntu
     RELEASE: 5.15.0-87-generic
     VERSION: #97-Ubuntu SMP Tue Oct 3 09:52:42 UTC 2023
     MACHINE: aarch64 (unknown Mhz)
      MEMORY: 8 GB
       PANIC: "Kernel panic - not syncing: sysrq triggered crash"
         PID: 1199
     COMMAND: "bash"
        TASK: ffff0000c6459040 [THREAD_INFO: ffff0000c6459040]
         CPU: 4
       STATE: TASK_RUNNING (PANIC)

crash>
==========

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.7.2-1ubuntu0.1

---------------
makedumpfile (1:1.7.2-1ubuntu0.1) lunar; urgency=medium

  * d/p/lp2038248-fix-wrong-exclusion-of-slab-pages-on-Linux-6.2.patch:
    When using the "makedumpfile -d 8" command,
    exclude user data rather than other slab pages (LP: #2038248)

 -- Chengen Du <email address hidden> Tue, 03 Oct 2023 06:26:06 +0000

Changed in makedumpfile (Ubuntu Lunar):
status: Fix Committed → Fix Released
Timo Aaltonen (tjaalton) wrote : Update Released

The verification of the Stable Release Update for makedumpfile has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.7.0-1ubuntu0.1

---------------
makedumpfile (1:1.7.0-1ubuntu0.1) jammy; urgency=medium

  * d/p/lp2038248-fix-wrong-exclusion-of-slab-pages-on-Linux-6.2.patch:
    When using the "makedumpfile -d 8" command,
    exclude user data rather than other slab pages (LP: #2038248)

 -- Chengen Du <email address hidden> Tue, 10 Oct 2023 13:52:06 +0000

Changed in makedumpfile (Ubuntu Jammy):
status: Fix Committed → Fix Released