crashkernel offset prevents kernel boot

Bug #1728115 reported by Thadeu Lima de Souza Cascardo
This bug affects 1 person
Affects                 Status        Importance  Assigned to                    Milestone
linux (Ubuntu)          Incomplete    Undecided   Unassigned
  Artful                New           Undecided   Unassigned
  Bionic                Incomplete    Undecided   Unassigned
makedumpfile (Ubuntu)   Fix Released  High        Thadeu Lima de Souza Cascardo
  Artful                Fix Released  High        Thadeu Lima de Souza Cascardo
  Bionic                Fix Released  High        Thadeu Lima de Souza Cascardo

Bug Description

[Impact]
On Power Systems, the kernel won't boot when given a crashkernel parameter with an offset/start address of 32M. No output is shown, so the user gets no clue as to why the system has not booted or what the problem is.

[Test Case]
Install kdump-tools on artful, then reboot the system. Without the fix, it won't boot. With the fix, it boots and the crash kernel memory is reserved.

[Regression Potential]
Some Power Systems might have problems loading the crash kernel at this address. LP#1567539 is not really clear on whether PowerNV systems fail to kdump when using an address other than 32M. However, an IBM person was asked to test it with 128M instead, and no particular problem was shown. It's possible that there was no reason to use 32M in the first place, and that no problems will be found on either PowerNV or other systems in the field. On the other hand, it's possible that the change might break kdump on some of them. But, right now, those systems won't even boot the first kernel without this change.

----

The Linux kernel won't boot on ppc64el on artful when the crashkernel parameter tells it to load a crash kernel at 32MiB.

This happens because the artful kernel is too big. In fact, multiple requirements on the architecture lead to that:

* Kernel memory at address 0 is reserved.
* The crash kernel must be in the first RMO, so the architecture puts it at 128MiB. However, kdump-tools currently puts it at @32MiB because of bug https://bugs.launchpad.net/ubuntu/+source/makedumpfile/+bug/1567539.
* PACA and LPPACA need to be in the first RMO as well, and with 2048 CPUs, they take more than 5MB and 2MiB, respectively.

With the kernel now taking around 25MB from stext to _end, there is not enough memory left below the crash kernel at 32MiB for PACA and LPPACA, so the kernel fails to reserve them right after boot and panics.
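
The arithmetic behind this can be sketched as follows. This is only a rough illustration, not how the kernel actually accounts for memory: it assumes the kernel image, PACA and LPPACA all have to fit below the crash kernel offset, it uses the approximate sizes quoted above as lower bounds, and it treats MB and MiB as the same unit. The function name and the constants are made up for this example.

```python
# Approximate sizes from the description above, in MiB (lower bounds).
KERNEL_IMAGE_MIB = 25  # stext to _end on the artful kernel
PACA_MIB = 5           # "more than 5MB" with 2048 CPUs
LPPACA_MIB = 2         # "more than 2MiB" with 2048 CPUs


def headroom_mib(crash_offset_mib):
    """Return the MiB left below the crash kernel offset after the
    kernel image, PACA and LPPACA have been accounted for."""
    used = KERNEL_IMAGE_MIB + PACA_MIB + LPPACA_MIB
    return crash_offset_mib - used


if __name__ == "__main__":
    for offset in (32, 128):
        print("offset %dMiB: headroom %dMiB" % (offset, headroom_mib(offset)))
```

With these lower bounds the headroom at 32MiB is already zero, and since PACA and LPPACA each take more than the figure used here, the real value is negative. At 128MiB there is plenty of room left.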

So, right after installing kdump-tools on artful and rebooting, the kernel won't boot, and it shows no sign of life because no console has even been started yet. Investigating this issue took an entire day.

The fix would be to set the loading address to 128MiB, and then to start reducing the size of PACA and maybe remove some of the requirements on the location of PACA and the crash kernel.

I would not even set the loading address of the crash kernel in the parameter itself. I would leave that decision to the kernel, which already makes it and would already put the crash kernel at 128MiB.
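
To make the difference between the two forms of the parameter concrete, here is a minimal sketch that splits a crashkernel value into its size and optional offset. It assumes only the simple size[KMG][@offset[KMG]] form; the range-based syntax the kernel also accepts is not handled. The function name and the sample values are hypothetical and are not taken from the kdump-tools configuration.

```python
import re

# Multipliers for the optional K/M/G suffixes.
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

# Simple form only: size[KMG][@offset[KMG]]
_SIMPLE = re.compile(r"^(\d+)([KMG]?)(?:@(\d+)([KMG]?))?$", re.IGNORECASE)


def parse_crashkernel(value):
    """Return (size_bytes, offset_bytes); offset_bytes is None when the
    value leaves the placement to the kernel."""
    m = _SIMPLE.match(value)
    if m is None:
        raise ValueError("unsupported crashkernel value: %r" % value)
    size = int(m.group(1)) * _UNITS[m.group(2).upper()]
    if m.group(3) is None:
        return size, None
    offset = int(m.group(3)) * _UNITS[m.group(4).upper()]
    return size, offset


if __name__ == "__main__":
    # Hypothetical values: one with an explicit offset, one without.
    for value in ("384M@32M", "384M@128M", "384M"):
        print(value, parse_crashkernel(value))
```

A value without the @offset part yields None for the offset, which corresponds to the suggestion above of letting the kernel choose the address itself.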

Cascardo.

Revision history for this message
Ubuntu Kernel Bot (ubuntu-kernel-bot) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. While running an Ubuntu kernel (not a mainline or third-party kernel) please enter the following command in a terminal window:

apport-collect 1728115

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Changed in linux (Ubuntu):
status: Incomplete → Triaged
Changed in makedumpfile (Ubuntu):
status: New → Triaged
assignee: nobody → Thadeu Lima de Souza Cascardo (cascardo)
Revision history for this message
Thadeu Lima de Souza Cascardo (cascardo) wrote :
description: updated
Changed in linux (Ubuntu):
status: Triaged → In Progress
Changed in makedumpfile (Ubuntu):
status: Triaged → In Progress
Changed in linux (Ubuntu):
status: In Progress → New
Changed in linux (Ubuntu):
status: New → Incomplete
Revision history for this message
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "makedumpfile_1.6.1-2ubuntu0.1.diff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Revision history for this message
Thadeu Lima de Souza Cascardo (cascardo) wrote :
Changed in makedumpfile (Ubuntu):
importance: Undecided → High
Changed in makedumpfile (Ubuntu Artful):
status: New → In Progress
importance: Undecided → High
assignee: nobody → Thadeu Lima de Souza Cascardo (cascardo)
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

I don't see this fix in bionic yet. Could anyone first release it there? Stable updates can only be backported if they're present in the devel series.

Revision history for this message
Thadeu Lima de Souza Cascardo (cascardo) wrote :
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.6.2-1ubuntu1

---------------
makedumpfile (1:1.6.2-1ubuntu1) bionic; urgency=medium

  * KDUMP_CMDLINE_APPEND: add noirqdistrib to default command line. As it's
    only used by ppc64el, it's not required to be conditionally added.
    (LP: #1658733)
  * Set crashkernel for ppc64el to load at 128M instead of 32M. That allows
    larger kernels to boot. (LP: #1728115)

 -- Thadeu Lima de Souza Cascardo <email address hidden> Tue, 07 Nov 2017 12:23:33 +0000

Changed in makedumpfile (Ubuntu Bionic):
status: In Progress → Fix Released
Revision history for this message
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Thadeu, or anyone else affected,

Accepted makedumpfile into artful-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/makedumpfile/1:1.6.1-2ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-artful to verification-done-artful. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-artful. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in makedumpfile (Ubuntu Artful):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-artful
Revision history for this message
Chris Halse Rogers (raof) wrote :

This package has been waiting for testing in Artful for 90 days and is blocking the release of other makedumpfile fixes for Xenial and Artful. Please test this SRU!

bugproxy (bugproxy)
tags: added: architecture-ppc64le bugnameltc-165886 severity-medium targetmilestone-inin---
removed: verification-needed verification-needed-artful
Revision history for this message
Thadeu Lima de Souza Cascardo (cascardo) wrote :

ubuntu@artful:~$ dpkg -s kdump-tools | grep Version
Version: 1:1.6.1-2ubuntu0.1

System boots fine after installing kdump-tools. crash kernel is loaded.

ubuntu@artful:~$ dmesg | grep Reserving
[ 0.000000] Reserving 320MB of memory at 128MB for crashkernel (System RAM: 2048MB)
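
The same check could be scripted. The following is a minimal sketch that assumes the kernel message has exactly the format shown in the dmesg output above; the function name is made up for this example, and other kernel versions or architectures may word the message differently.

```python
import re

# Matches the reservation message as it appears in the dmesg output above.
_RESERVE = re.compile(
    r"Reserving (\d+)MB of memory at (\d+)MB for crashkernel"
)


def crashkernel_reservation(dmesg_text):
    """Return (size_mb, offset_mb) from the first matching dmesg line,
    or None if no reservation message is found."""
    m = _RESERVE.search(dmesg_text)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


if __name__ == "__main__":
    line = ("[ 0.000000] Reserving 320MB of memory at 128MB "
            "for crashkernel (System RAM: 2048MB)")
    print(crashkernel_reservation(line))
```

For the line above it returns (320, 128), which confirms that the crash kernel memory is reserved at 128MB rather than 32MB.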

tags: added: verification-done verification-done-artful
bugproxy (bugproxy)
tags: removed: bugnameltc-165886 patch severity-medium verification-done verification-done-artful
tags: added: verification-done verification-done-artful
bugproxy (bugproxy)
tags: added: bugnameltc-165886 severity-medium
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package makedumpfile - 1:1.6.1-2ubuntu0.1

---------------
makedumpfile (1:1.6.1-2ubuntu0.1) artful; urgency=medium

  * KDUMP_CMDLINE_APPEND: add noirqdistrib to default command line. As it's
    only used by ppc64el, it's not required to be conditionally added.
    (LP: #1658733)
  * Set crashkernel for ppc64el to load at 128M instead of 32M. That allows
    larger kernels to boot. (LP: #1728115)

 -- Thadeu Lima de Souza Cascardo <email address hidden> Tue, 07 Nov 2017 12:23:33 +0000

Changed in makedumpfile (Ubuntu Artful):
status: Fix Committed → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for makedumpfile has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

bugproxy (bugproxy)
tags: added: targetmilestone-inin1710
removed: targetmilestone-inin---