Kdump-Tools: Makedumpfile Failed, Falling Back To 'Cp'

Bug #1869465 reported by chenrongwen
This bug affects 1 person
Affects                Status        Importance  Assigned to        Milestone
makedumpfile (Debian)  Fix Released  Unknown
makedumpfile (Ubuntu)  Fix Released  Medium      Ioanna Alifieraki
  Xenial               Fix Released  Medium      Unassigned
  Bionic               Fix Released  Medium      Unassigned
  Eoan                 Fix Released  Medium      Unassigned
  Focal                Fix Released  Medium      Unassigned
  Groovy               Fix Released  Medium      Ioanna Alifieraki

Bug Description

[Impact]

On some arm64 systems makedumpfile fails to translate virtual addresses to physical addresses properly.
This may result in makedumpfile looping forever and exhausting
all memory, or in translating a virtual address to an invalid physical address,
then failing and falling back to cp.
Some addresses cannot be resolved because the PMD mask is wrong.
When the physical address mask allows up to 48 bits, the PMD mask should allow the
same; currently the PMD mask is set to 40 bits (see commit [1]).

Commit [1] fixes this bug.
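
To illustrate why a 40-bit mask breaks addresses above 1 TB, here is a minimal shell sketch of the arithmetic. The helper name mask_paddr and the sample address are made up for illustration; this is not makedumpfile code, it only shows what masking to 40 versus 48 bits does to a single address.

```shell
#!/bin/sh
# Illustrative only: mask_paddr is a hypothetical helper, not makedumpfile code.
# It keeps the low BITS bits of ADDR, like a ((1UL << BITS) - 1) mask would.
mask_paddr() {
    printf '0x%x\n' $(( $1 & ((1 << $2) - 1) ))
}

# Sample physical address just above 1 TB (2^40) plus a 2 MB offset.
paddr=0x10000200000

mask_paddr "$paddr" 40   # prints 0x200000: bit 40 is lost
mask_paddr "$paddr" 48   # prints 0x10000200000: address is preserved
```

With the 40-bit mask the 1 TB bit is dropped and the result is a different, much lower physical address; with 48 bits the address comes through unchanged.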

[Test Case]

To hit this bug you need a system that uses physical addresses above 1TB
(2^40 bytes, the most a 40-bit mask can cover).
This may be either because you have a lot
of memory or because the firmware mapped some memory above 1TB for some
reason [1].

A user hit this bug because firmware mapped memory above 1TB and provided a
dump so I could reproduce the bug when running makedumpfile on the dump.
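
One way to check whether a machine might be exposed is to look for System RAM ranges that end at or above 1TB in /proc/iomem. The sketch below assumes the usual "start-end : name" layout of that file, with hex addresses and no 0x prefix; ram_above_1tb is a hypothetical helper name. Reading real addresses from /proc/iomem needs root, otherwise the kernel shows zeros.

```shell
#!/bin/sh
# Hypothetical helper: print the "System RAM" lines of an iomem-style file
# whose end address is at or above 1 TB (2^40).
ram_above_1tb() {
    grep 'System RAM' "$1" | while IFS= read -r line; do
        range=${line%% :*}      # "start-end", possibly indented
        end=${range#*-}         # hex end address without 0x prefix
        if [ $(( 0x$end >= (1 << 40) )) -eq 1 ]; then
            echo "$line"
        fi
    done
}

# Example (as root): ram_above_1tb /proc/iomem
```

Empty output means no System RAM range reaches 1TB in that listing. It does not prove that the bug cannot be hit in some other way.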

[Regression Potential]

This commit changes PMD_SECTION_MASK for arm64, so any regression
would only affect arm64 systems. In addition, PMD_SECTION_MASK is used in translation
from virtual to physical addresses, and therefore any regression would show up during
this process.

[Other]

[1] https://github.com/makedumpfile/makedumpfile/commit/7242ae4cb5288df626f464ced0a8b60fd669100b

When testing kdump on Ubuntu 18.04.4 (arm64) with the GA kernel, makedumpfile fails. The test steps are as follows:
# echo 1 > /proc/sys/kernel/sysrq
# echo c > /proc/sysrq-trigger
The logs are as follows:

kdump-tools[646]: starting kdump-tools: * running makedumpfile -c -d 31 /proc/vmcore /var/crash/202003251128/dump-incomplete
kdump-tools[646]: readpage_elf: Attempt to read non-existent page at 0x0
kdump-tools[646]: readmem: type_addr: 1, addr:ff0, size:8
kdump-tools[646]: vaddr_to_paddr_arm64: Can't read pud
kdump-tools[646]: readmem: Can't convert a virtual address(ffff9e653690) to physical address.
kdump-tools[646]: readmem: type_addr: 0, addr:ffff9e653690, size:1032
kdump-tools[646]: validate_mem_section: Can't read mem_section array.
kdump-tools[646]: get_mem_section: Could not validate mem_section.
kdump-tools[646]: get_mm_sparsemem: Can't get the address of mem_section.
kdump-tools[646]: makedumpfile Failed.
kdump-tools[646]: * kdump-tools: makedumpfile failed, falling back to 'cp'

But when I use the HWE kernel, the problem does not occur.
The HWE kernel version: 5.3.0-42-generic

chenrongwen (were0415)
information type: Public → Public Security
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1869465

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
chenrongwen (were0415) wrote :

Here is the capture kernel's log:
[ 0.000000] Booting Linux on physical CPU 0x00001f0200 [0x481fd010]
[ 0.000000] Linux version 4.15.0-91-generic (buildd@bos02-arm64-019) (gcc version 7.4.0 (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1)) #92-Ubuntu SMP Fri Feb 28 11:10:16 UTC 2020 (Ubuntu 4.15.0-91.92-generic 4.15.18)
[ 0.000000] efi: Getting EFI parameters from FDT:
[ 0.000000] efi: EFI v2.70 by EDK II
[ 0.000000] efi: ACPI 2.0=0x20570000 SMBIOS 3.0=0x204e0000 MEMATTR=0x23e32018 ESRT=0x23e77798
[ 0.000000] esrt: Reserving ESRT space from 0x0000000023e77798 to 0x0000000023e777d0.
[ 0.000000] Reserving 5KB of memory at 0x7fffe000 for elfcorehdr
[ 0.000000] ACPI: Early table checksum verification disabled
[ 0.000000] ACPI: RSDP 0x0000000020570000 000024 (v02 HISI )
[ 0.000000] ACPI: XSDT 0x0000000020560000 0000AC (v01 HISI HIP08 00000000 01000013)
[ 0.000000] ACPI: FACP 0x0000000020040000 000114 (v06 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: DSDT 0x000000001FDC0000 00CFCC (v02 HISI HIP08 00000000 INTL 20181213)
[ 0.000000] ACPI: PCCT 0x0000000020550000 00008A (v01 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: SSDT 0x0000000020540000 00E56A (v02 HISI HIP07 00000000 INTL 20181213)
[ 0.000000] ACPI: BERT 0x0000000020490000 000030 (v01 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: HEST 0x0000000020470000 00058C (v01 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: ERST 0x0000000020430000 000230 (v01 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: EINJ 0x0000000020420000 000170 (v01 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: GTDT 0x0000000020020000 00007C (v02 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: SDEI 0x000000001FE20000 000030 (v01 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: MCFG 0x000000001FE10000 00003C (v01 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: SLIT 0x000000001FE00000 000030 (v01 HISI HIP08 00000001 HISI 20151124)
[ 0.000000] ACPI: SPCR 0x000000001FDF0000 000050 (v02 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: SRAT 0x000000001FDE0000 000540 (v03 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: APIC 0x000000001FDD0000 00146C (v04 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: IORT 0x000000001FDB0000 001060 (v00 HISI HIP08 00000000 INTL 20181213)
[ 0.000000] ACPI: PPTT 0x000000001F8E0000 002130 (v01 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: SPMI 0x000000001F8D0000 000041 (v05 HISI HIP08 00000000 HISI 20151124)
[ 0.000000] ACPI: SPCR: console: uart,mmio,0x3f00002f8,115200
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x180000 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x180100 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x180200 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x180300 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x190000 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x190100 -> Node 0
[ 0.000000] ACPI: NUMA: SRAT: PXM 0 -> MPIDR 0x190200 -> Node 0
[ 0.000000] ACPI: NUM...

Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Seth Arnold (seth-arnold) wrote : Bug is not a security issue

Thanks for taking the time to report this bug and helping to make Ubuntu better. We appreciate the difficulties you are facing, but this appears to be a "regular" (non-security) bug. I have unmarked it as a security issue since this bug does not show evidence of allowing attackers to cross privilege boundaries nor directly cause loss of data/privacy. Please feel free to report any other bugs you may find.

information type: Public Security → Public
Changed in linux (Ubuntu):
assignee: nobody → Ioanna Alifieraki (joalif)
Changed in linux (Ubuntu Bionic):
status: New → Confirmed
assignee: nobody → Ioanna Alifieraki (joalif)
Changed in linux (Ubuntu):
importance: Undecided → Medium
Changed in linux (Ubuntu Bionic):
importance: Undecided → Medium
description: updated
Ioanna Alifieraki (joalif) wrote :

Debdiff for Groovy

Changed in linux (Ubuntu Focal):
status: New → Confirmed
status: Confirmed → New
Ioanna Alifieraki (joalif) wrote :

Debdiff for Focal

Ioanna Alifieraki (joalif) wrote :

Debdiff for Eoan

Ioanna Alifieraki (joalif) wrote :

Debdiff for Bionic

Ioanna Alifieraki (joalif) wrote :

Debdiff for Xenial

tags: added: sts
tags: added: sts-sponsor-mfo
Changed in linux (Ubuntu Xenial):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Ioanna Alifieraki (joalif)
Changed in linux (Ubuntu Bionic):
status: Confirmed → In Progress
Changed in linux (Ubuntu Eoan):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Ioanna Alifieraki (joalif)
Changed in linux (Ubuntu Focal):
status: New → In Progress
importance: Undecided → Medium
assignee: nobody → Ioanna Alifieraki (joalif)
Changed in linux (Ubuntu Groovy):
status: Confirmed → In Progress
tags: added: patch
Changed in linux (Debian):
status: Unknown → New
Mauricio Faria de Oliveira (mfo) wrote :

Hi Jo,

Thanks for the debdiffs!

The contents look good to me (versioning, dep3 headers)
except that the patch sequence number is missing, and
the changelog lacks a brief description of it.

I just fixed those minor things up, and verified all
debdiffs apply cleanly, quilt push -a applies cleanly,
and quilt series show the sequence numbers correctly.

Attaching the new 'v2' debdiffs.

The current status of upload queues, pending SRUs/
-proposed pockets, block-proposed-<release> tags,
is all good -- no makedumpfile uploads in progress.

This should be good to go. I'll ask a coredev for
sponsorship to groovy, and once it's there, I can
sponsor for the stable releases.

cheers,
Mauricio

Mauricio Faria de Oliveira (mfo) wrote :
Mauricio Faria de Oliveira (mfo) wrote :
Mauricio Faria de Oliveira (mfo) wrote :
Mauricio Faria de Oliveira (mfo) wrote :
Mauricio Faria de Oliveira (mfo) wrote :
Eric Desrochers (slashd)
tags: added: sts-sponsor-mfo-and-slashd
removed: sts-sponsor-mfo
Eric Desrochers (slashd) wrote :

[Groovy sponsoring]

At first glance, it seems to be a good candidate for a 'sync' from Debian here.

https://wiki.ubuntu.com/UbuntuDevelopment/Merging#Check_if_its_a_merge_or_a_sync
https://wiki.ubuntu.com/SyncRequestProcess

The fix seems to be found in makedumpfile 'unstable'

makedumpfile (1:1.6.7-2)

# arch/arm64.c
84 #define PMD_SECTION_MASK ((1UL << 40) - 1)

- Eric

Eric Desrochers (slashd) wrote :

Disregard my last comment (#15) ...

I was confused for a second between :

#define PHYS_MASK ((1UL << PHYS_MASK_SHIFT) - 1)
#define PMD_SECTION_MASK ((1UL << 40) - 1)

So the PMD_SECTION_MASK fix is not in Debian yet.

I'll sponsor the debdiff over the weekend.

- Eric

Eric Desrochers (slashd) wrote :

[Groovy sponsor]

Nitpicking:

* Renamed "align_PMD_SECTION_MASK_with_PHYS_MASK.patch" to "0003-align_PMD_SECTION_MASK_with_PHYS_MASK.patch" to match its current little friend "0002-adapt-makefile-to-debian.patch"

* Added more detail in debian/changelog to ease future reference at a glance:
     * d/p/0003-align_PMD_SECTION_MASK_with_PHYS_MASK.patch (LP: #1869465)
+ - Fixing arm64 systems makedumpfile looping forever exhausting
+ all memory when filtering kernel core.

Eric Desrochers (slashd)
Changed in makedumpfile (Ubuntu Groovy):
assignee: nobody → Ioanna Alifieraki (joalif)
importance: Undecided → Medium
status: New → In Progress
tags: added: sts-sponsor-mfo
removed: sts-sponsor-mfo-and-slashd
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "lp1869465_groovy.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.6.7-1ubuntu3

---------------
makedumpfile (1:1.6.7-1ubuntu3) groovy; urgency=medium

  * d/p/0003-align_PMD_SECTION_MASK_with_PHYS_MASK.patch (LP: #1869465)
    - Fixing arm64 systems makedumpfile looping forever exhausting
      all memory when filtering kernel core.

 -- Ioanna Alifieraki <email address hidden> Thu, 04 Jun 2020 14:36:01 +0100

Changed in makedumpfile (Ubuntu Groovy):
status: In Progress → Fix Released
Mathew Hodson (mhodson)
affects: linux (Debian) → makedumpfile (Debian)
no longer affects: linux (Ubuntu)
no longer affects: linux (Ubuntu Xenial)
no longer affects: linux (Ubuntu Bionic)
no longer affects: linux (Ubuntu Eoan)
no longer affects: linux (Ubuntu Focal)
Mathew Hodson (mhodson)
no longer affects: linux (Ubuntu Groovy)
Changed in makedumpfile (Ubuntu Xenial):
importance: Undecided → Medium
Changed in makedumpfile (Ubuntu Bionic):
importance: Undecided → Medium
Changed in makedumpfile (Ubuntu Eoan):
importance: Undecided → Medium
Changed in makedumpfile (Ubuntu Focal):
importance: Undecided → Medium
Mauricio Faria de Oliveira (mfo) wrote :

Hi Jo, Eric,

Eric, thanks for uploading to Groovy!
(Sorry, I had made similar changes to groovy v2 debdiff -- no worries, but just wanted to let you know I checked those for groovy too, to not give you more trouble in helping here! :-)

The changes have landed in Groovy, so I just uploaded to Focal/Eoan/Bionic/Xenial.

cheers,
Mauricio

Changed in makedumpfile (Ubuntu Focal):
status: New → In Progress
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello chenrongwen, or anyone else affected,

Accepted makedumpfile into focal-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/makedumpfile/1:1.6.7-1ubuntu2.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-focal to verification-done-focal. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-focal. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in makedumpfile (Ubuntu Focal):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-focal
Changed in makedumpfile (Ubuntu Eoan):
status: New → Fix Committed
tags: added: verification-needed-eoan
Łukasz Zemczak (sil2100) wrote :

Hello chenrongwen, or anyone else affected,

Accepted makedumpfile into eoan-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/makedumpfile/1:1.6.6-2ubuntu2.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-eoan to verification-done-eoan. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-eoan. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in makedumpfile (Ubuntu Bionic):
status: New → Fix Committed
tags: added: verification-needed-bionic
Łukasz Zemczak (sil2100) wrote :

Hello chenrongwen, or anyone else affected,

Accepted makedumpfile into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/makedumpfile/1:1.6.5-1ubuntu1~18.04.5 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in makedumpfile (Ubuntu Xenial):
status: New → Fix Committed
tags: added: verification-needed-xenial
Łukasz Zemczak (sil2100) wrote :

Hello chenrongwen, or anyone else affected,

Accepted makedumpfile into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/makedumpfile/1:1.6.3-2~16.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (makedumpfile/1:1.6.6-2ubuntu2.1)

All autopkgtests for the newly accepted makedumpfile (1:1.6.6-2ubuntu2.1) for eoan have finished running.
The following regressions have been reported in tests triggered by the package:

makedumpfile/1:1.6.6-2ubuntu2.1 (ppc64el, i386)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/eoan/update_excuses.html#makedumpfile

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (makedumpfile/1:1.6.5-1ubuntu1~18.04.5)

All autopkgtests for the newly accepted makedumpfile (1:1.6.5-1ubuntu1~18.04.5) for bionic have finished running.
The following regressions have been reported in tests triggered by the package:

makedumpfile/1:1.6.5-1ubuntu1~18.04.5 (s390x)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/bionic/update_excuses.html#makedumpfile

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

chenrongwen (were0415) wrote :

makedumpfile 1.6.5-1ubuntu1~18.04.5
Tags: verification-done-bionic

Mauricio Faria de Oliveira (mfo) wrote :

The autopkgtests regressions are unrelated to this SRU.

Sure enough: the patch is for arm64 only, but the archs failing are i386, ppc64el, and s390x.

The "regressions" also happen with the version in -updates, in the same way as in -proposed.

makedumpfile [eoan/i386] [1]

 Version             Triggers  Date                     Duration    Requester  Result
 1:1.6.6-2ubuntu2    ...       2020-06-16 19:07:17 UTC  0h 05m 47s  mfo        fail
 1:1.6.6-2ubuntu2.1  ...       2020-06-16 11:51:34 UTC  0h 07m 23s  mfo        fail

makedumpfile [eoan/ppc64el] [2]

 Version             Triggers  Date                     Duration    Requester  Result
 1:1.6.6-2ubuntu2    ...       2020-06-16 18:56:11 UTC  0h 06m 31s  mfo        fail
 1:1.6.6-2ubuntu2.1  ...       2020-06-16 11:47:27 UTC  0h 05m 22s  mfo        fail

makedumpfile [bionic/s390x] [3]

 Version                   Triggers  Date                     Duration    Requester  Result
 1:1.6.5-1ubuntu1~18.04.4  ...       2020-06-16 18:52:05 UTC  0h 03m 08s  mfo        fail
 1:1.6.5-1ubuntu1~18.04.5  ...       2020-06-16 11:46:00 UTC  0h 03m 06s  mfo        fail

[1] https://autopkgtest.ubuntu.com/packages/m/makedumpfile/eoan/i386
[2] https://autopkgtest.ubuntu.com/packages/m/makedumpfile/eoan/ppc64el
[3] https://autopkgtest.ubuntu.com/packages/m/makedumpfile/bionic/s390x

Guilherme G. Piccoli (gpiccoli) wrote :

Also, ppc64/s390 are being tracked here: https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1851663
It's a common issue, and difficult to debug (if we run the tests locally, we can't reproduce it and they succeed).

Regarding i386, we should remove tests for i386 in Eoan and Focal.
Cheers,

Guilherme

Mauricio Faria de Oliveira (mfo) wrote :

Hey Guilherme,

Thanks for the pointers and confirming these are not regressions, but known issues. :)

Guilherme G. Piccoli (gpiccoli) wrote :

You're very welcome mfo, thanks for following this SRU in order to get that released soon =)

Mauricio Faria de Oliveira (mfo) wrote :

The verification for makedumpfile used the vmcore file provided
by another user instead of /proc/vmcore (which is identical, as
it's a simple 'cp' copy of /proc/vmcore, per makedumpfile error.)

 $ ls -lh /home/ubuntu/201909170743/vmcore.201909170743
 -r-------- 1 ubuntu ubuntu 32G Sep 17 2019 /home/ubuntu/201909170743/vmcore.201909170743

 $ file /home/ubuntu/201909170743/vmcore.201909170743
 /home/ubuntu/201909170743/vmcore.201909170743: ELF 64-bit LSB core file ARM aarch64, version 1 (SYSV), SVR4-style

The reproducer system is an arm64 guest in our internal openstack
cloud with the same kernel version as the user (4.15.0-76-generic.)

 $ grep -ao -m1 'Linux version .* ' /home/ubuntu/201909170743/vmcore.201909170743
 Linux version 4.15.0-76-generic (buildd@bos02-arm64-060) (gcc version 7.4.0 (Ubuntu/Linaro 7.4.0-1ubuntu1~18.04.1)) #86-Ubuntu SMP Fri Jan 17 17:25:58 UTC 2020 (Ubuntu 4.15.0-76.86-generic

 $ uname -mrv
 4.15.0-76-generic #86-Ubuntu SMP Fri Jan 17 17:25:58 UTC 2020 aarch64

When using the original package version, makedumpfile fails with
error messages about a particular address, then kdump-tools (the
caller of makedumpfile) falls back to 'cp', as reported.

This takes a long time since it's a 32 GB file.

The second invocation of makedumpfile, which stores the dmesg
output from the vmcore (makedumpfile --dump-dmesg), fails in the
same way and for the very same address; it is an equivalent
symptom of the same root cause. The verification therefore uses
only that step, which runs fast regardless of failure or success.
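
The dmesg-only check can be wrapped in a small helper, sketched below under the assumption that makedumpfile is on PATH and that the vmcore copy is readable. try_dump_dmesg and the MAKEDUMPFILE override are hypothetical names introduced here; the only makedumpfile option used is --dump-dmesg, as seen in the logs in this bug.

```shell
#!/bin/sh
# Hypothetical wrapper around the fast dmesg-only step.
# MAKEDUMPFILE may point at another binary (assumption, for side-by-side tests).
try_dump_dmesg() {
    vmcore=$1
    out=$2
    if "${MAKEDUMPFILE:-makedumpfile}" --dump-dmesg "$vmcore" "$out"; then
        echo "OK: dmesg extracted, address translation worked"
    else
        echo "FAIL: makedumpfile --dump-dmesg failed"
        return 1
    fi
}

# Example: try_dump_dmesg /home/ubuntu/201909170743/vmcore.201909170743 /tmp/dmesg.test
```

Running it once with the package from -updates and once with the package from -proposed should give a quick pass/fail comparison without waiting for a full 32 GB copy.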

These are the changes done to /usr/sbin/kdump-config (shell script
independent of makedumpfile binary/executable code.)

 # Constants
 #vmcore_file=/proc/vmcore
 vmcore_file=/home/ubuntu/201909170743/vmcore.201909170743
 ...
 function kdump_save_core()
 ...
  log_action_msg "running makedumpfile $MAKEDUMP_ARGS $vmcore_file $KDUMP_CORETEMP"
  #makedumpfile $MAKEDUMP_ARGS $vmcore_file $KDUMP_CORETEMP
  #ERROR=$?
  ERROR=0
  if [ $ERROR -ne 0 ] ; then
          log_failure_msg "$NAME: makedumpfile failed, falling back to 'cp'"
 ...

For documentation purposes,

With the user's vmcore, and still collecting the crashdump (i.e.,
first invocation of makedumpfile), the exact error reproduces:

 $ dpkg -s makedumpfile | grep -i version
 Version: 1:1.6.5-1ubuntu1~18.04.4

 $ echo 1 | sudo tee /proc/sys/kernel/sysrq && echo c | sudo tee /proc/sysrq-trigger
 ...
 [ 222.162389] sysrq: SysRq : Trigger a crash
 ...
 [ 222.185756] Call trace:
 [ 222.186091] sysrq_handle_crash+0x24/0x30
 [ 222.186628] __handle_sysrq+0xbc/0x1c0
 [ 222.187128] write_sysrq_trigger+0xb8/0x120
 [ 222.187690] proc_reg_write+0x80/0xc0
 [ 222.188182] __vfs_write+0x48/0x80
 [ 222.188639] vfs_write+0xac/0x1b0
 [ 222.189148] SyS_write+0x74/0xf0
 [ 222.189585] el0_svc_naked+0x30/0x34
 [ 222.190073] Code: 52800020 b90ca020 d5033e9f d2800001 (39000020)
 [ 222.190892] SMP: stopping secondary CPUs
 [ 222.193873] Starting crashdump kernel...
 [ 222.194414] Bye!
 ...
 [ 8.168635] kdump-tools[516]: Starting kdump-tools: * running makedumpfile -c -d 31 /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171229/dum...


Mauricio Faria de Oliveira (mfo) wrote :

Verification done for Bionic.

bionic-updates: failure.

$ dpkg -s makedumpfile | grep -i version
Version: 1:1.6.5-1ubuntu1~18.04.4

[ 8.369266] kdump-tools[513]: Starting kdump-tools: * running makedumpfile -c -d 31 /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171242/dump-incomplete
[ 8.382379] kdump-tools[513]: mv: cannot stat '/var/crash/202006171242/dump-incomplete': No such file or directory
[ 8.385529] kdump-tools[513]: * kdump-tools: saved vmcore in /var/crash/202006171242
[ 8.405479] kdump-tools[513]: * running makedumpfile --dump-dmesg /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171242/dmesg.202006171242
[ 8.422223] kdump-tools[513]: readmem: Can't convert a virtual address(6a2) to physical address.
[ 8.424872] kdump-tools[513]: readmem: type_addr: 0, addr:6a2, size:1032
[ 8.428291] kdump-tools[513]: validate_mem_section: Can't read mem_section array.
[ 8.429779] kdump-tools[513]: get_mem_section: Could not validate mem_section.
[ 8.432257] kdump-tools[513]: get_mm_sparsemem: Can't get the address of mem_section.
[ 8.436297] kdump-tools[513]: makedumpfile Failed.
[ 8.439351] kdump-tools[513]: * kdump-tools: makedumpfile --dump-dmesg failed. dmesg content will be unavailable
[ 8.442197] kdump-tools[513]: * kdump-tools: failed to save dmesg content in /var/crash/202006171242
[ 8.448570] kdump-tools[513]: Wed, 17 Jun 2020 12:42:16 +0000
[ 8.455630] kdump-tools[513]: Rebooting.
[ 8.514560] reboot: Restarting system

bionic-proposed: success.

$ dpkg -s makedumpfile | grep -i version
Version: 1:1.6.5-1ubuntu1~18.04.5

[ 7.628682] kdump-tools[517]: Starting kdump-tools: * running makedumpfile -c -d 31 /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171249/dump-incomplete
[ 7.642186] kdump-tools[517]: mv: cannot stat '/var/crash/202006171249/dump-incomplete': No such file or directory
[ 7.644850] kdump-tools[517]: * kdump-tools: saved vmcore in /var/crash/202006171249
[ 7.662102] kdump-tools[517]: * running makedumpfile --dump-dmesg /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171249/dmesg.202006171249
[ 7.695277] kdump-tools[517]: The dmesg log is saved to /var/crash/202006171249/dmesg.202006171249.
[ 7.697111] kdump-tools[517]: makedumpfile Completed.
[ 7.699346] kdump-tools[517]: * kdump-tools: saved dmesg content in /var/crash/202006171249
[ 7.711560] kdump-tools[517]: Wed, 17 Jun 2020 12:49:41 +0000
[ 7.720149] kdump-tools[517]: Rebooting.
[ 7.776421] reboot: Restarting system

Mauricio Faria de Oliveira (mfo) wrote :

Verification done for Eoan:

eoan-updates: failure.

$ dpkg -s makedumpfile | grep -i version
Version: 1:1.6.6-2ubuntu2

[ 8.717056] kdump-tools[514]: Starting kdump-tools: * running makedumpfile -c -d 31 /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171254/dump-incomplete
[ 8.725052] kdump-tools[514]: mv: cannot stat '/var/crash/202006171254/dump-incomplete': No such file or directory
[ 8.727673] kdump-tools[514]: * kdump-tools: saved vmcore in /var/crash/202006171254
[ 8.748962] kdump-tools[514]: * running makedumpfile --dump-dmesg /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171254/dmesg.202006171254
[ 8.765998] kdump-tools[514]: readmem: Can't convert a virtual address(6a2) to physical address.
[ 8.767925] kdump-tools[514]: readmem: type_addr: 0, addr:6a2, size:1032
[ 8.772332] kdump-tools[514]: validate_mem_section: Can't read mem_section array.
[ 8.773832] kdump-tools[514]: get_mem_section: Could not validate mem_section.
[ 8.776258] kdump-tools[514]: get_mm_sparsemem: Can't get the address of mem_section.
[ 8.780311] kdump-tools[514]: makedumpfile Failed.
[ 8.781415] kdump-tools[514]: * kdump-tools: makedumpfile --dump-dmesg failed. dmesg content will be unavailable
[ 8.786166] kdump-tools[514]: * kdump-tools: failed to save dmesg content in /var/crash/202006171254
[ 8.790350] kdump-tools[514]: Wed, 17 Jun 2020 12:54:47 +0000
[ 8.800156] kdump-tools[514]: Rebooting.
[ 8.849905] reboot: Restarting system

eoan-proposed: success.

$ dpkg -s makedumpfile | grep -i version
Version: 1:1.6.6-2ubuntu2.1

[ 8.595861] kdump-tools[517]: Starting kdump-tools: * running makedumpfile -c -d 31 /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171257/dump-incomplete
[ 8.602814] kdump-tools[517]: mv: cannot stat '/var/crash/202006171257/dump-incomplete': No such file or directory
[ 8.605489] kdump-tools[517]: * kdump-tools: saved vmcore in /var/crash/202006171257
[ 8.625608] kdump-tools[517]: * running makedumpfile --dump-dmesg /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171257/dmesg.202006171257
[ 8.662375] kdump-tools[517]: The dmesg log is saved to /var/crash/202006171257/dmesg.202006171257.
[ 8.664378] kdump-tools[517]: makedumpfile Completed.
[ 8.668322] kdump-tools[517]: * kdump-tools: saved dmesg content in /var/crash/202006171257
[ 8.679634] kdump-tools[517]: Wed, 17 Jun 2020 12:57:16 +0000
[ 8.689315] kdump-tools[517]: Rebooting.
[ 8.743280] reboot: Restarting system

Mauricio Faria de Oliveira (mfo) wrote :

Verification done for Focal.

focal (-release): failure:

$ dpkg -s makedumpfile | grep -i version:
Version: 1:1.6.7-1ubuntu2

[ 8.465657] kdump-tools[516]: Starting kdump-tools: * running makedumpfile -c -d 31 /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171302/dump-incomplete
[ 8.478292] kdump-tools[516]: mv: cannot stat '/var/crash/202006171302/dump-incomplete': No such file or directory
[ 8.481075] kdump-tools[516]: * kdump-tools: saved vmcore in /var/crash/202006171302
[ 8.510048] kdump-tools[516]: * running makedumpfile --dump-dmesg /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171302/dmesg.202006171302
[ 8.527350] kdump-tools[516]: readmem: Can't convert a virtual address(6a2) to physical address.
[ 8.529301] kdump-tools[516]: readmem: type_addr: 0, addr:6a2, size:1032
[ 8.535281] kdump-tools[516]: validate_mem_section: Can't read mem_section array.
[ 8.536957] kdump-tools[516]: get_mem_section: Could not validate mem_section.
[ 8.539035] kdump-tools[516]: get_mm_sparsemem: Can't get the address of mem_section.
[ 8.543546] kdump-tools[516]: makedumpfile Failed.
[ 8.544785] kdump-tools[516]: * kdump-tools: makedumpfile --dump-dmesg failed. dmesg content will be unavailable
[ 8.547530] kdump-tools[516]: * kdump-tools: failed to save dmesg content in /var/crash/202006171302
[ 8.552599] kdump-tools[516]: Wed, 17 Jun 2020 13:02:28 +0000
[ 8.562610] kdump-tools[516]: Rebooting.
[ 8.613739] reboot: Restarting system

focal-proposed: success.

$ dpkg -s makedumpfile | grep -i version:
Version: 1:1.6.7-1ubuntu2.1

[ 8.578735] kdump-tools[505]: Starting kdump-tools: * running makedumpfile -c -d 31 /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171304/dump-incomplete
[ 8.603615] kdump-tools[505]: mv: cannot stat '/var/crash/202006171304/dump-incomplete': No such file or directory
[ 8.609175] kdump-tools[505]: * kdump-tools: saved vmcore in /var/crash/202006171304
[ 8.625685] kdump-tools[505]: * running makedumpfile --dump-dmesg /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171304/dmesg.202006171304
[ 8.660292] kdump-tools[505]: The dmesg log is saved to /var/crash/202006171304/dmesg.202006171304.
[ 8.662197] kdump-tools[505]: makedumpfile Completed.
[ 8.664300] kdump-tools[505]: * kdump-tools: saved dmesg content in /var/crash/202006171304
[ 8.677828] kdump-tools[505]: Wed, 17 Jun 2020 13:04:30 +0000
[ 8.686921] kdump-tools[505]: Rebooting.
[ 8.730789] reboot: Restarting system

tags: added: verification-done-bionic verification-done-eoan verification-done-focal
removed: verification-needed-bionic verification-needed-eoan verification-needed-focal
Mauricio Faria de Oliveira (mfo) wrote :

Verification done for Xenial:

xenial-updates: failure.

$ dpkg -s makedumpfile | grep -i version
Version: 1:1.6.3-2~16.04.2

[ 8.647356] kdump-tools[507]: Starting kdump-tools: * running makedumpfile -c -d 31 /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171359/dump-incomplete
[ 8.664692] kdump-tools[507]: mv: cannot stat '/var/crash/202006171359/dump-incomplete': No such file or directory
[ 8.667714] kdump-tools[507]: * kdump-tools: saved vmcore in /var/crash/202006171359
[ 8.692230] kdump-tools[507]: * running makedumpfile --dump-dmesg /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171359/dmesg.202006171359
[ 8.708653] kdump-tools[507]: readmem: Can't convert a virtual address(6a2) to physical address.
[ 8.710544] kdump-tools[507]: readmem: type_addr: 0, addr:6a2, size:1032
[ 8.712395] kdump-tools[507]: validate_mem_section: Can't read mem_section array.
[ 8.716294] kdump-tools[507]: get_mem_section: Could not validate mem_section.
[ 8.720306] kdump-tools[507]: get_mm_sparsemem: Can't get the address of mem_section.
[ 8.724320] kdump-tools[507]: The kernel version is not supported.
[ 8.725633] kdump-tools[507]: The makedumpfile operation may be incomplete.
[ 8.728284] kdump-tools[507]: makedumpfile Failed.
[ 8.731764] kdump-tools[507]: * kdump-tools: makedumpfile --dump-dmesg failed. dmesg content will be unavailable
[ 8.736343] kdump-tools[507]: * kdump-tools: failed to save dmesg content in /var/crash/202006171359
[ 8.738431] kdump-tools[507]: Wed, 17 Jun 2020 13:59:30 +0000
[ 8.746373] kdump-tools[507]: Rebooting.
[ 8.798875] reboot: Restarting system

xenial-proposed: success.

$ dpkg -s makedumpfile | grep -i version
Version: 1:1.6.3-2~16.04.3

[ 8.945076] kdump-tools[506]: Starting kdump-tools: * running makedumpfile -c -d 31 /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171401/dump-incomplete
[ 8.957889] kdump-tools[506]: mv: cannot stat '/var/crash/202006171401/dump-incomplete': No such file or directory
[ 8.960648] kdump-tools[506]: * kdump-tools: saved vmcore in /var/crash/202006171401
[ 8.978986] kdump-tools[506]: * running makedumpfile --dump-dmesg /home/ubuntu/201909170743/vmcore.201909170743 /var/crash/202006171401/dmesg.202006171401
[ 9.014422] kdump-tools[506]: The kernel version is not supported.
[ 9.015867] kdump-tools[506]: The makedumpfile operation may be incomplete.
[ 9.020309] kdump-tools[506]: The dmesg log is saved to /var/crash/202006171401/dmesg.202006171401.
[ 9.022123] kdump-tools[506]: makedumpfile Completed.
[ 9.024312] kdump-tools[506]: * kdump-tools: saved dmesg content in /var/crash/202006171401
[ 9.035973] kdump-tools[506]: Wed, 17 Jun 2020 14:01:41 +0000
[ 9.045322] kdump-tools[506]: Rebooting.
[ 9.096136] reboot: Restarting system

tags: added: verification-done-xenial
removed: verification-needed-xenial
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.6.7-1ubuntu2.1

---------------
makedumpfile (1:1.6.7-1ubuntu2.1) focal; urgency=medium

  * d/p/0003-align_PMD_SECTION_MASK_with_PHYS_MASK.patch (LP: #1869465)
    Fix error on arm64 with 1TB+ of physical or firmware-mapped RAM.

 -- Ioanna Alifieraki <email address hidden> Thu, 04 Jun 2020 14:47:17 +0100
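The effect of the mask width described in this report can be illustrated with a small sketch. It only demonstrates the arithmetic: a mask limited to 40 bits drops every physical-address bit at or above 1 TiB (2^40), while a 48-bit mask keeps them. The names MASK_BITS_OLD, MASK_BITS_NEW, low_bits_mask and apply_mask are hypothetical and are not taken from the makedumpfile source; the 40 and 48 bit widths come from the bug description.

```c
#include <stdint.h>

/* Illustrative names only; these are assumptions, not makedumpfile identifiers. */
#define MASK_BITS_OLD 40u /* PMD mask width before the fix, per the bug description */
#define MASK_BITS_NEW 48u /* width the physical address mask allows */

/* Build a mask with the lowest `bits` bits set (valid for bits < 64). */
static uint64_t low_bits_mask(unsigned bits)
{
    return (UINT64_C(1) << bits) - 1;
}

/* Keep only the physical-address bits that a mask of the given width allows. */
static uint64_t apply_mask(uint64_t paddr, unsigned bits)
{
    return paddr & low_bits_mask(bits);
}
```

For a hypothetical physical address of 2^40 + 0x200000 (0x10000200000), apply_mask with MASK_BITS_OLD returns 0x200000, which points at unrelated memory, whereas MASK_BITS_NEW returns the address unchanged. Addresses below 1 TiB pass through both masks untouched, which is consistent with the bug appearing only on systems that map memory above 1TB.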

Changed in makedumpfile (Ubuntu Focal):
status: Fix Committed → Fix Released
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for makedumpfile has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. If you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.6.6-2ubuntu2.1

---------------
makedumpfile (1:1.6.6-2ubuntu2.1) eoan; urgency=medium

  * d/p/0004-align_PMD_SECTION_MASK_with_PHYS_MASK.patch (LP: #1869465)
    Fix error on arm64 with 1TB+ of physical or firmware-mapped RAM.

 -- Ioanna Alifieraki <email address hidden> Thu, 04 Jun 2020 15:00:16 +0100

Changed in makedumpfile (Ubuntu Eoan):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.6.5-1ubuntu1~18.04.5

---------------
makedumpfile (1:1.6.5-1ubuntu1~18.04.5) bionic; urgency=medium

  * d/p/0006-align_PMD_SECTION_MASK_with_PHYS_MASK.patch (LP: #1869465)
    Fix error on arm64 with 1TB+ of physical or firmware-mapped RAM.

 -- Ioanna Alifieraki <email address hidden> Thu, 04 Jun 2020 15:08:05 +0100

Changed in makedumpfile (Ubuntu Bionic):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.6.3-2~16.04.3

---------------
makedumpfile (1:1.6.3-2~16.04.3) xenial; urgency=medium

  * d/p/0004-align_PMD_SECTION_MASK_with_PHYS_MASK.patch (LP: #1869465)
    Fix error on arm64 with 1TB+ of physical or firmware-mapped RAM.

 -- Ioanna Alifieraki <email address hidden> Thu, 04 Jun 2020 15:28:39 +0100

Changed in makedumpfile (Ubuntu Xenial):
status: Fix Committed → Fix Released
tags: removed: sts-sponsor-mfo
Changed in makedumpfile (Debian):
status: New → Fix Released