Boot fails with "initrd extends beyond end of memory" after upgrade to Hardy

Bug #219868 reported by Chris Samuel
Affects                   Status     Importance  Assigned to  Milestone
initramfs-tools (Ubuntu)  Invalid    Undecided   Unassigned
linux (Ubuntu)            Won't Fix  Medium      Unassigned

Bug Description

I have an Olivetti Netstrada 7000 with four 200MHz Intel Pentium Pro processors and 256MB of RAM which was running the server version of Feisty. I upgraded it to Gutsy with "sudo do-release-upgrade" and it worked fine, but when I then upgraded to Hardy with "sudo do-release-upgrade -d" I found it wouldn't boot.

The boot error is:

initrd extends beyond end of memory (0x0ffef173 > 0x01000000)
ACPI: no DMI BIOS year, acpi=force is required to enable ACPI
Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block (0,0)

So it's complaining that 256MB > 16MB, which whilst being technically correct is not very useful.
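
As a quick check of that reading, the two addresses in the error message can be converted to MiB. This is only a sketch of the arithmetic in plain userspace C; the helper names `bytes_to_mib` and `print_initrd_error_sizes` are made up for illustration and are not kernel functions.

```c
#include <stdio.h>

/* Hypothetical helper (not from the kernel): convert a byte address to MiB. */
double bytes_to_mib(unsigned long addr)
{
    return (double)addr / (1024.0 * 1024.0);
}

/* Print both values from the boot error message in MiB. */
void print_initrd_error_sizes(void)
{
    unsigned long ramdisk_end   = 0x0ffef173UL; /* left-hand value in the error */
    unsigned long end_of_lowmem = 0x01000000UL; /* right-hand value in the error */

    printf("ramdisk_end   = %.3f MiB\n", bytes_to_mib(ramdisk_end));
    printf("end_of_lowmem = %.3f MiB\n", bytes_to_mib(end_of_lowmem));
}
```

Calling `print_initrd_error_sizes()` from a `main` shows the initrd ending just under the 256 MiB mark while the kernel believes low memory ends at exactly 16 MiB.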

The ACPI message is harmless, I get the same with the Gutsy kernel that still remains and (fortunately) still works!

I've attached the dmesg output from the Gutsy kernel and will attempt booting with mem=256M.

Chris Samuel (chris-csamuel) wrote :

Unfortunately booting with mem=256M doesn't change anything.

Anything else you need from me ?

BTW: I filed this against initramfs-tools as I wasn't sure whether it was due to the size of the generated initrd or whether it is a kernel bug, apologies if I've misfiled.

Chris Samuel (chris-csamuel) wrote :

I've now added the kernel as an "also affected" package because, looking at the kernel sources, it appears that the only way the "initrd extends beyond end of memory" error can occur is if the Hardy kernel now gets the amount of LOWMEM wrong on my hardware.

The working Gutsy kernel says:

$ dmesg | grep -i lowmem
[ 0.000000] 256MB LOWMEM available.
[ 336.956158] lowmem : 0xc0000000 - 0xd0000000 ( 256 MB)

and the test in the 2.6.24 kernel that is failing is pretty simple, being:

                unsigned long end_of_lowmem = max_low_pfn << PAGE_SHIFT;

                if (ramdisk_end <= end_of_lowmem) {
                        reserve_bootmem(ramdisk_image, ramdisk_size);
                        initrd_start = ramdisk_image + PAGE_OFFSET;
                        initrd_end = initrd_start+ramdisk_size;
                } else {
                        printk(KERN_ERR "initrd extends beyond end of memory "
                               "(0x%08lx > 0x%08lx)\ndisabling initrd\n",
                               ramdisk_end, end_of_lowmem);
                        initrd_start = 0;
                }

So for the test to fail the Hardy kernel must get end_of_lowmem wrong.

The test is not functionally different from the one in 2.6.22 (just cleaned up a bit), so it is unlikely that the test itself is the issue.
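
To make the failing comparison concrete, here is a small userspace model of it. This is a sketch, not kernel code: `initrd_fits_in_lowmem` is a hypothetical name, `PAGE_SHIFT` is assumed to be 12 (4 KiB pages on 32-bit x86), and the two page frame counts are derived from the 256MB the Gutsy kernel reports and the 0x01000000 limit in the error message rather than read from any log.

```c
#include <stdbool.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages, as on 32-bit x86 */

/* Assumed page frame counts, derived from the byte limits (not logged values). */
#define PFN_256MB 0x10000UL /* 0x10000000 >> PAGE_SHIFT */
#define PFN_16MB  0x01000UL /* 0x01000000 >> PAGE_SHIFT */

/*
 * Userspace model of the comparison in the kernel excerpt.
 * Returns true if the initrd would be kept, false if it would be disabled.
 */
bool initrd_fits_in_lowmem(unsigned long ramdisk_end, unsigned long max_low_pfn)
{
    unsigned long end_of_lowmem = max_low_pfn << PAGE_SHIFT;

    return ramdisk_end <= end_of_lowmem;
}
```

With `ramdisk_end` set to 0x0ffef173 from the error message, the check passes for `PFN_256MB` and fails for `PFN_16MB`, which is consistent with the kernel having computed too small a value for `max_low_pfn`.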

I have already purged and reinstalled the kernel (which recreated the initrd) and reinstalled grub, just in case. Also debsums doesn't show anything unusual.

Chris Samuel (chris-csamuel) wrote :

Having built myself a mainline 2.6.25 kernel with netconsole support and the necessary drivers for SCSI and XFS I can confirm that this happens upstream too.

This is an extract from dmesg from the working Gutsy kernel:

[ 0.000000] Linux version 2.6.22-14-server (buildd@terranova) (gcc version 4.1.3 20070929 (prerelease) (Ubuntu 4.1.2-16ubuntu2)) #1 SMP Tue Feb 12 08:27:05 UTC 2008 (Ubuntu 2.6.22-14.52-server)
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e801: 0000000000000000 - 000000000009f000 (usable)
[ 0.000000] BIOS-e801: 0000000000100000 - 0000000010000000 (usable)
[ 0.000000] 0MB HIGHMEM available.
[ 0.000000] 256MB LOWMEM available.

This is the same section of dmesg from the non-functional 2.6.25 kernel:

[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] Linux version 2.6.25-cs1 (chris@netstrada) (gcc version 4.2.3 (Ubuntu 4.2.3-2ubuntu7)) #1 SMP Sun Apr 27 01:31:12 EST 2008
[ 0.000000] BIOS-provided physical RAM map:
[ 0.000000] BIOS-e801: 0000000000000000 - 000000000009f000 (usable)
[ 0.000000] BIOS-e801: 0000000000100000 - 0000000001000000 (usable)
[ 0.000000] Initializing cgroup subsys cpuset
[ 0.000000] 16MB LOWMEM available.

So it is indeed getting the amount of LOWMEM wrong. :-(

Leann Ogasawara (leannogasawara) wrote :

Hi Chris,

Would you care to also create a bug report upstream at http://bugzilla.kernel.org? It is often the case that once a bug is escalated upstream there is a quick resolution through the help and support of the mainline kernel community. I'll go ahead and reassign this to our Ubuntu kernel team as well. Thanks.

Changed in linux:
assignee: nobody → ubuntu-kernel-team
importance: Undecided → Medium
status: New → Triaged
Chris Samuel (chris-csamuel) wrote :

Hi Leann,

I'm currently discussing this with H. Peter Anvin (the x86 boot code maintainer) by private email and have done some testing with him using the syslinux "meminfo.c32" plugin.

Latest comment from him is:

# Right... you have a system dependent on E801, and somehow E801 returns crap.
#
# I'm going to cook up a modified meminfo.c32 for you and see if we can't
# track this down.

So something that happened around 2.6.23/2.6.24 broke the way that the kernel was handling this corner case (maybe in the 32/64 bit merge) and stopped it working.

I'll report more when there is something definite from the new version that he's working on.

Leann Ogasawara (leannogasawara) wrote :

The Ubuntu Kernel Team is planning to move to the 2.6.27 kernel for the upcoming Intrepid Ibex 8.10 release. As a result, the kernel team would appreciate it if you could please test this newer 2.6.27 Ubuntu kernel. There are two ways you should be able to test:

1) If you are comfortable installing packages on your own, the linux-image-2.6.27-* package is currently available for you to install and test.

--or--

2) The upcoming Alpha5 for Intrepid Ibex 8.10 will contain this newer 2.6.27 Ubuntu kernel. Alpha5 is set to be released Thursday Sept 4. Please watch http://www.ubuntu.com/testing for Alpha5 to be announced. You should then be able to test via a LiveCD.

Please let us know immediately if this newer 2.6.27 kernel resolves the bug reported here or if the issue remains. More importantly, please open a new bug report for each new bug/regression introduced by the 2.6.27 kernel and tag the bug report with 'linux-2.6.27'. Also, please specifically note if the issue does or does not appear in the 2.6.26 kernel. Thanks again, we really appreciate your help and feedback.

Chris Samuel (chris-csamuel) wrote :

More than happy to test, will take a few days as the box has been moved and is currently uncabled. Not to mention being painfully slow. ;-)

Launchpad Janitor (janitor) wrote : Kernel team bugs

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.

Jeremy Foshee (jeremyfoshee) wrote :

Unfortunately it seems this bug is still an issue. Can you confirm whether this issue exists with the most recent Jaunty Jackalope 9.04 release (http://www.ubuntu.com/news/ubuntu-9.04-desktop)? If the issue remains in Jaunty, please run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux-image-2.6.28-11-generic 219868

If you could also test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine this issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

Changed in linux (Ubuntu):
status: Triaged → Incomplete
tags: added: needs-kernel-logs needs-upstream-testing
Jeremy Foshee (jeremyfoshee) wrote :

This bug report was marked as Incomplete and has not had any updated comments for quite some time. As a result this bug is being closed. Please reopen if this is still an issue in the current Ubuntu release http://www.ubuntu.com/getubuntu/download . Also, please be sure to provide any requested information that may have been missing. To reopen the bug, click on the current status under the Status column and change the status back to "New". Thanks.

[This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: kj-expired
Changed in linux (Ubuntu):
status: Incomplete → Invalid
maximilian attems (maks-debian) wrote :

Not an initramfs-tools bug, thus closing here too.

Changed in initramfs-tools (Ubuntu):
status: New → Invalid
Chris Samuel (chris-csamuel) wrote :

I can confirm that this is an upstream bug in the Linux kernel and still affects all kernels up to and including 2.6.38.*. A fix has now been merged into 2.6.39; it is a very trivial single-letter change.

The git commit is 39b68976ac653cfdc7f872a293e8b7928de2dcc6 in Linus's mainline tree and H. Peter Anvin describes it as:

# When we use BIOS function e801 to probe memory, we should use ax/bx
# (or cx/dx) as a pair, not mix and match. This was a typo during the
# translation from assembly code, and breaks at least one set of
# machines in the field (which return cx = dx = 0).

The commit in Linus's tree is here:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=39b68976ac653cfdc7f872a293e8b7928de2dcc6
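
The effect of mixing the register pairs can be sketched in userspace C. This is an illustration under stated assumptions, not the kernel's source: the struct and function names are hypothetical, the register meanings follow the usual description of INT 15h AX=E801h (ax/cx count KiB between 1 MiB and 16 MiB, bx/dx count 64 KiB blocks above 16 MiB), and the register values are chosen to match a 256 MiB machine whose BIOS returns cx = dx = 0, as the commit message describes. The mixed variant assumes the high part was taken from dx while the low part came from ax.

```c
#include <stdint.h>

/* Hypothetical stand-in for the registers returned by INT 15h, AX=E801h. */
struct e801_regs {
    uint16_t ax; /* KiB between 1 MiB and 16 MiB (extended) */
    uint16_t bx; /* 64 KiB blocks above 16 MiB (extended) */
    uint16_t cx; /* KiB between 1 MiB and 16 MiB (configured) */
    uint16_t dx; /* 64 KiB blocks above 16 MiB (configured) */
};

/* Assumed values for a 256 MiB machine whose BIOS leaves cx = dx = 0. */
struct e801_regs assumed_netstrada_regs(void)
{
    struct e801_regs r = { 15 * 1024, 3840, 0, 0 };
    return r;
}

/* Mixed pair (assumed form of the typo): high part from dx, low part from ax. */
uint32_t e801_kib_mixed(struct e801_regs r)
{
    return ((uint32_t)r.dx << 6) + r.ax;
}

/* Consistent pair: both parts from ax/bx. */
uint32_t e801_kib_paired(struct e801_regs r)
{
    return ((uint32_t)r.bx << 6) + r.ax;
}

/* End of usable memory in bytes; the E801 count starts at the 1 MiB mark. */
uint32_t e801_end_bytes(uint32_t kib_above_1mib)
{
    return (kib_above_1mib + 1024u) * 1024u;
}
```

Under these assumptions the mixed pair yields an end address of 0x01000000 (16 MiB) and the consistent pair yields 0x10000000 (256 MiB), which matches the two BIOS-e801 maps in the dmesg extracts above.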

Hope this helps!

Changed in linux (Ubuntu):
status: Invalid → Confirmed
Brad Figg (brad-figg) wrote : Unsupported series, setting status to "Won't Fix".

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: Confirmed → Won't Fix