natty kernel does not boot on ec2 t1.micro
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| linux (Ubuntu) | | High | Stefan Bader | |
| Natty | | High | Stefan Bader | |
Bug Description
This bug has been split off of bug 669496.
Instances of size t1.micro on EC2 do not boot with the natty kernel.
This is true of both i386 and amd64.
I've just tested with instances of:
us-east-1 ami-dece38b7 ebs/ubuntu-
us-east-1 ami-d4ce38bd ebs/ubuntu-
There is no console output past the grub messages.
ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: linux-image-
Regression: Yes
Reproducible: Yes
ProcVersionSign
Uname: Linux 2.6.37-8-virtual x86_64
AlsaDevices: Error: command ['ls', '-l', '/dev/snd/'] failed with exit code 2: ls: cannot access /dev/snd/: No such file or directory
AplayDevices: Error: [Errno 2] No such file or directory
Architecture: amd64
ArecordDevices: Error: [Errno 2] No such file or directory
CurrentDmesg:
Date: Tue Dec 7 18:27:29 2010
Ec2AMI: ami-d4ce38bd
Ec2AMIManifest: (unknown)
Ec2Availability
Ec2InstanceType: m1.large
Ec2Kernel: aki-427d952b
Ec2Ramdisk: unavailable
Lspci:
Lsusb: Error: command ['lsusb'] failed with exit code 1:
ProcEnviron:
PATH=(custom, user)
LANG=en_US.UTF-8
SHELL=/bin/bash
ProcKernelCmdLine: root=LABEL=
ProcModules: acpiphp 19089 0 - Live 0xffffffffa0000000
SourcePackage: linux
| Scott Moser (smoser) wrote : | #1 |
| Scott Moser (smoser) wrote : | #2 |
| Changed in linux (Ubuntu): | |
| assignee: | nobody → Stefan Bader (stefan-bader-canonical) |
| importance: | Undecided → High |
| milestone: | none → natty-alpha-2 |
| status: | New → Confirmed |
| tags: | added: kernel-series-unknown |
| Stefan Bader (smb) wrote : | #3 |
| tags: | added: regression-release removed: regression-update |
| Stefan Bader (smb) wrote : | #4 |
Some updates here: the good news is that I am able to reproduce this on a local CentOS-based installation. The bad news so far is that the DomU crashes so quickly that I get no output at all, even when directly attaching to the console on "xm create".
But at least I found a lead. The crashes happen if the guest memory is less than 1G and not divisible by 4. So 615M crashes, but 616M will boot (as will 612M and so on). There is also a visible change in the memory layout presented to Linux. While previously the max_pfn was directly used to create an e820 map, there is now an additional 8M added in the data returned by the memory hypercall. I cannot say right now whether that directly relates to the crash or not, but one can see that when starting a guest with mem=616, Linux will report 624M of memory. There is a lot of shifting around and recalculating going on which I have yet to understand.
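For anyone trying other guest sizes, the pattern described above can be written down as a small check. This is only a sketch of the empirical rule reported in this comment, not of the kernel logic; the function names are made up for illustration, and applying the extra 8M uniformly to every size is an assumption.

```python
# Encodes only the empirical observation from this comment.
EXTRA_MB = 8  # additional memory seen in the map returned by the hypercall


def expected_to_crash(mem_mb: int) -> bool:
    """True if a guest of this size matches the crashing pattern seen so far."""
    return mem_mb < 1024 and mem_mb % 4 != 0


def reported_mb(mem_mb: int) -> int:
    """Memory Linux reports for a guest started with mem_mb (assumed uniform)."""
    return mem_mb + EXTRA_MB


for mb in (612, 615, 616):
    print(mb, "crashes" if expected_to_crash(mb) else "boots")
```

With these definitions, 612 and 616 are classified as booting and 615 as crashing, and `reported_mb(616)` gives the 624M mentioned above.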
| Scott Moser (smoser) wrote : | #5 |
@Stefan,
Just for reference, could you attach your xen config for this instance? I'd like to recreate it.
| Stefan Bader (smb) wrote : | #6 |
name = "NattyServerMic
kernel = "/root/
memory = 616
vcpus = 1
disk = [ 'file:/
vif = [ '' ]
Not sure the vif really would work like this. I seem to have problems getting the boot completed (currently got the cloud-init stuff disabled as I have no magic meta server).
| Stefan Bader (smb) wrote : | #7 |
One further step finally. Using 'on_crash = "coredump-destroy"' and after creating /var/xen/dump, I was able to extract the following from the dump file:
<6>[ 0.000000] ACPI in unprivileged domain disabled
<3>[ 0.000000] max_pfn used = 26700(26700000)
<3>[ 0.000000] Xen: map base 0 + 26f00000
<3>[ 0.000000] Xen: map end = 26f00000
<3>[ 0.000000] map size reduzed to 26700000
<3>[ 0.000000] delta = 800000, extra_pages = 2048
<3>[ 0.000000] extra_mem_start = 26700000
<3>[ 0.000000] Xen: reserve c166f000c15d2000 - 800
<6>[ 0.000000] released 0 pages of unused memory
<3>[ 0.000000] Xen: extra_limit = 159488
<3>[ 0.000000] Xen: adding 2048 extra pages at 644874240
<6>[ 0.000000] BIOS-provided physical RAM map:
<6>[ 0.000000] Xen: 0000000000000000 - 00000000000a0000 (usable)
<6>[ 0.000000] Xen: 00000000000a0000 - 0000000000100000 (reserved)
<6>[ 0.000000] Xen: 0000000000100000 - 0000000026f00000 (usable)
<6>[ 0.000000] NX (Execute Disable) protection: active
<6>[ 0.000000] DMI not present or invalid.
<7>[ 0.000000] e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
<7>[ 0.000000] e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
<6>[ 0.000000] last_pfn = 0x26f00 max_arch_pfn = 0x1000000
<6>[ 0.000000] Scanning 0 areas for low memory corruption
<7>[ 0.000000] initial memory mapped : 0 - 01fff000
<6>[ 0.000000] init_memory_
<7>[ 0.000000] 0000000000 - 0026f00000 page 4k
<7>[ 0.000000] kernel direct mapping tables up to 26f00000 @ 1ec4000-1fff000
<1>[ 0.000000] BUG: unable to handle kernel NULL pointer dereference at (null)
<1>[ 0.000000] IP: [<c0107397>] xen_set_
<4>[ 0.000000] *pdpt = 0000000000000000 *pde = 0000000000000000
<0>[ 0.000000] Oops: 0003 [#1] SMP
<0>[ 0.000000] last sysfs file:
<4>[ 0.000000] Modules linked in:
<4>[ 0.000000]
<4>[ 0.000000] Pid: 0, comm: swapper Not tainted 2.6.37-12-virtual #26+lp686692v3 /
<4>[ 0.000000] EIP: e019:[<c0107397>] EFLAGS: 00010046 CPU: 0
<4>[ 0.000000] EIP is at xen_set_
<4>[ 0.000000] EAX: 00000000 EBX: c1fe7800 ECX: 00000000 EDX: c0848000
<4>[ 0.000000] ESI: 00000003 EDI: 00000000 EBP: c0849e14 ESP: c0849e04
<4>[ 0.000000] DS: e021 ES: e021 FS: 00d8 GS: 00e0 SS: e021
<0>[ 0.000000] Process swapper (pid: 0, ti=c0848000 task=c084f060 task.ti=c0848000)
<0>[ 0.000000] Stack:
<4>[ 0.000000] c1fe7800 c1fe7800 00000003 00000000 c0849e30 c08aa7ca 00000fff fffff003
<4>[ 0.000000] e6700000 00000000 00026700 c0849e38 c01362be c0849e8c c08b9961 c0849e64
<4>[ 0.000000] 46cf9ef8 00026701 c1fe7800 c0a3f998 00000133 00000100 00026f00 00000000
<0>[ 0.000000] Call Trace:
<4>[ 0.000000] [<c08aa7ca>] ? xen_set_
<4>[ 0.000000] [<c01362be>] ? set_pte+0xe/0x10
<4>[ 0.000000] [<c08b9961>] ? kernel_
<4>[ 0.000000] [<c06122b6>] ? init_memory_
<4>[ 0.000000] [<c08ac037>] ? setup_arch+
<4>[ 0.000000] [<c010798e>] ? __raw_callee_
<4>[ 0.000000] [...
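The hex values in the dump above are easier to relate to the guest size once converted. A short sketch, assuming 4k pages (as in the "page 4k" line) and using only numbers copied from the log; the variable names are illustrative:

```python
PAGE_SIZE = 4096   # 4k pages, per the "page 4k" line in the log
MIB = 1 << 20

max_pfn = 0x26700      # "max_pfn used = 26700"
map_end = 0x26F00000   # "Xen: map end = 26f00000"
delta = 0x800000       # "delta = 800000"

mem_bytes = max_pfn * PAGE_SIZE

print(mem_bytes // MIB)             # guest memory in MiB: 615
print(map_end - mem_bytes == delta) # map end lies exactly delta past max_pfn
print(delta // PAGE_SIZE)           # 2048, matching "extra_pages = 2048"
print(mem_bytes)                    # 644874240, matching "extra pages at 644874240"
```

So this dump is from a 615M guest, with the extra 8M placed directly behind it.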
| Stefan Bader (smb) wrote : | #8 |
I think I see the issue now. When xen sets up the p2m tree, it does a loop from 0 to max_pfn-1, incrementing by the number of p2m mappings in the leaf. If max_pfn is a multiple of 4M this works out. But if not, we need an additional leaf being initialized (which is only partially used).
I need to think about how to make this work best. Maybe the end_pfn needs to be rounded up to the next multiple of P2M_PER_PAGE. And the next question would be how many places need to be touched as there is at least another place which sets up the corresponding pfn to mfn mapping...
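To make the leaf arithmetic concrete, here is a rough sketch of the rounding idea. It assumes 4k pages and one machine-word-sized entry per pfn in a leaf (so 1024 entries on 32-bit, 512 on 64-bit); the helper names are hypothetical and this is not the kernel code.

```python
PAGE_SIZE = 4096


def p2m_per_page(word_bytes: int) -> int:
    """Entries in one leaf page, assuming one word-sized entry per pfn."""
    return PAGE_SIZE // word_bytes


def leaf_usage(max_pfn: int, per_page: int) -> tuple:
    """Return (number of full leaves, entries used in a trailing partial leaf)."""
    return divmod(max_pfn, per_page)


def round_up_pfn(max_pfn: int, per_page: int) -> int:
    """Round max_pfn up to the next multiple of per_page."""
    return -(-max_pfn // per_page) * per_page


per_page = p2m_per_page(4)                       # 32-bit guest
print(per_page * PAGE_SIZE // (1 << 20))         # MiB covered by one leaf
print(leaf_usage(0x26700, per_page))             # the 615M guest
print(hex(round_up_pfn(0x26700, per_page)))
```

For the 615M guest this gives 153 full leaves plus a last leaf with only 768 of 1024 entries in use, and rounding up lands on 0x26800 pages, which is the 616M size that boots.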
| Stefan Bader (smb) wrote : | #9 |
<3>[ 0.000000] smb: pfn=266ff calling set_pte(c1fe77f8, 6b3003)
<3>[ 0.000000] smb: pfn=26700 calling set_pte(c1fe7800, 3)
<1>[ 0.000000] BUG: unable to handle kernel NULL pointer dereference at (null)
This was seen with some annotation. Basically pfn_pte for the last pfn returns an invalid pte.
| Stefan Bader (smb) wrote : | #10 |
Ok, so it was the right place but a completely wrong explanation. The problem is not that the last part of the pointers is missed but that it is not. The kernel is given a flat array of address pointers by the domain constructor, along with the number of pointers in that array. With recent changes, the Xen kernel code tries to map this into a 3-level tree structure, where the leaves contain a part of that array. To conserve memory, the 2nd level points directly at parts of the flat array, which is ok as long as the whole 4k area contains valid pointers. But for memory assignments which are not a multiple of 4MB (or 2MB for 64bit), the last leaf would contain some undefined pointers instead of invalid markers.
The attached patch assumes that it is not good to meddle with the memory at the end of the external array, so if there is a final leaf that would only be partially filled, it allocates a new page, initializes it and then copies the valid pointers from the original array.
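The approach can be illustrated with a toy model. This is not the attached patch, just a sketch of the idea under simplifying assumptions: the flat array is a Python list, `INVALID` stands in for the kernel's invalid-entry marker, and the function names are made up.

```python
INVALID = None  # stand-in for the kernel's invalid-entry marker


def build_leaves(flat, per_page):
    """Split a flat p2m array into leaves of per_page entries.

    Full leaves refer to the original array in place (no copy). A partially
    filled final leaf is copied into a fresh page padded with INVALID, so
    nothing past the end of the original array is read or modified.
    """
    leaves = []
    full, rest = divmod(len(flat), per_page)
    for i in range(full):
        leaves.append(("shared", i * per_page))   # offset into flat
    if rest:
        page = [INVALID] * per_page               # freshly "allocated" page
        page[:rest] = flat[full * per_page:]      # copy only the valid entries
        leaves.append(("copied", page))
    return leaves


def lookup(leaves, flat, pfn, per_page):
    """Resolve a pfn through the leaves built above."""
    kind, payload = leaves[pfn // per_page]
    idx = pfn % per_page
    return flat[payload + idx] if kind == "shared" else payload[idx]


flat = list(range(100, 110))       # 10 valid entries, 4 entries per leaf
leaves = build_leaves(flat, 4)
print(leaves[-1])                  # the padded copy of the final leaf
print(lookup(leaves, flat, 9, 4))  # last valid entry
print(lookup(leaves, flat, 11, 4)) # slot past the end resolves to INVALID
```

The point of the copy is that lookups beyond the last valid pfn hit an explicit invalid marker rather than whatever happens to follow the array in memory.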
| tags: | added: patch |
| Stefan Bader (smb) wrote : | #11 |
With that patch applied I was able to successfully boot t1.micro instances with a 2.6.37 kernel:
ubuntu@
Linux ip-10-112-5-120 2.6.37-12-virtual #26+686692v2 SMP Thu Jan 20 11:30:38 UTC 2011 x86_64 GNU/Linux
ubuntu@
t1.micro
ubuntu@
x86_64
ubuntu@
Linux ip-10-117-61-4 2.6.37-12-virtual #26+686692v2 SMP Thu Jan 20 11:33:17 UTC 2011 i686 GNU/Linux
ubuntu@
t1.micro
ubuntu@
i686
Next step will be to send this upstream to see whether it is an acceptable approach or not.
| Changed in linux (Ubuntu Natty): | |
| status: | Confirmed → In Progress |
| Changed in linux (Ubuntu Natty): | |
| status: | In Progress → Fix Committed |
| Scott Moser (smoser) wrote : | #12 |
I'm still unable to boot i386 instances. I tested
us-east-1 ami-5c3fcf35 canonical ebs/ubuntu-
It resulted in no console output and unreachable instance in t1.micro.
So, i386 is still broken on t1.micro (the same ami does boot on m1.small).
However, x86_64 is functional. I just verified
us-east-1 ami-2e3fcf47 canonical ebs/ubuntu-
$ uname -r
2.6.38-1-virtual
$ uname -m
x86_64
$ ec2metadata --instance-type
t1.micro
$ dpkg -S /boot/vmlinuz-
linux-image-
| Changed in linux (Ubuntu Natty): | |
| milestone: | natty-alpha-2 → natty-alpha-3 |
| Scott Moser (smoser) wrote : | #13 |
This was fix-released by Stefan in 2.6.38-1.28. Alpha2 boots in amd64 in t1.micro. We've opened bug 710754 to address the i386 issue.
| Changed in linux (Ubuntu Natty): | |
| status: | Fix Committed → Fix Released |
| Andy Whitcroft (apw) wrote : | #14 |
This bug was fixed in the package linux - 2.6.38-1.27
---------------
linux (2.6.38-1.27) natty; urgency=low
[ Andy Whitcroft ]
* ubuntu: AUFS -- update aufs-update to track new locations of headers
* ubuntu: AUFS -- update to c5021514085a5d9
* SAUCE: ensure root is ready before running usermodehelpers in it
* correct the Vcs linkage to point to natty
* rebase to linux tip e78bf5e6cbe837d
* [Config] update configs following rebase
e78bf5e6cbe
* SAUCE: Yama: follow changes to generic_permission
* ubuntu: compcache -- follow changes to bd_claim/bd_release
* ubuntu: iscsitarget -- follow changes to open_bdev_exclusive
* ubuntu: ndiswrapper -- fix interaction between __packed and packed
* ubuntu: AUFS -- update to 806051bcbeec277
* update package version to match payload version
* rebase to e6f597a1425b5af
* rebase to v2.6.38-rc1
* [Config] updateconfigs following rebase to v2.6.38-rc1
* SAUCE: x86 fix up jiffies/jiffies_64 handling
* rebase to linus tip 2b1caf6ed7b888c
* [Config] updateconfigs following rebase to
2b1caf6ed7b
* [Config] disable CONFIG_
* ubuntu: AUFS -- suppress benign plink warning messages
- LP: #621195
* [Config] CONFIG_NR_CPUS=256 for amd64 -server flavour
* rebase to v2.6.38-rc2
* rebase to mainline d315777b32a4696
* rebase to c723fdab8aa728d
* [Config] update configs following rebase to
c723fdab8aa
* [Config] disable CONFIG_AD7152 to fix FTBS on armel versatile
* [Config] disable CONFIG_AD7150 to fix FTBS on armel versatile
* [Config] disable CONFIG_RTL8192CE to fix FTBS on armel omap
* [Config] disable CONFIG_MANTIS_CORE to fix FTBS on armel versatile
[ Kees Cook ]
* SAUCE: kernel: make /proc/kallsyms mode 400 to reduce ease of attacking
[ Stefan Bader ]
* Temporarily disable RODATA for virtual i386
- LP: #699828
[ Tim Gardner ]
* [Config] CONFIG_
- LP: #683690
* [Config] CONFIG_
* update bnx2 firmware files in d-i/firmware/
[ Upstream Kernel Changes ]
* Revert "drm/radeon/bo: add some fallback placements for VRAM only
objects."
* packaging: make System.map mode 0600
* thinkpad_acpi: Always report scancodes for hotkeys
- LP: #702407
* sched: tg->se->load should be initialised to tg->shares
* Input: sysrq -- ensure sysrq_enabled and __sysrq_enabled are consistent
* brcm80211: include linux/slab.h for kfree
* pch_dma: add include/slab.h for kfree
* i2c-eg20t: include linux/slab.h for kfree
* gpio/ml_ioh_gpio: include linux/slab.h for kfree
* tty: include linux/slab.h for kfree
* winbond: include linux/delay.h for mdelay et al
[ Upstream Kernel Changes ]
* mark the start of v2.6.38 versioning
* rebase v2.6.37 to v2.6.38-rc2 + c723fdab8aa728d
- LP: #689886
- LP: #702125
- LP: #608775
- LP: #215802
...
| Matt Wilson (msw-amazon) wrote : | #15 |
The permanent fix for this is likely in PV-GRUB. See: https:/


Not the solution yet, unfortunately, but looking at bug #667796, we found that XEN_MAX_DOMAIN_MEMORY limits the memory a domU is reporting. Looking at Natty, this has actually changed to a fixed config option of 128GB. But this went with a quite big change to the mmu code and only changing the value back to 70 is not enough to make it work again. But at least the following commit may be a start to look at:
commit 58e05027b530ff081ecea68e38de8d59db8f87e0
Author: Jeremy Fitzhardinge <email address hidden>
Date: Fri Aug 27 13:28:48 2010 -0700
xen: convert p2m to a 3 level tree
Make the p2m structure a 3 level tree which covers the full possible
physical space.
The p2m structure contains mappings from the domain's pfns to system-wide
mfns. The structure has 3 levels and two roots. The first root is for
the domain's own use, and is linked with virtual addresses. The second
is all mfn references, and is used by Xen on save/restore to allow it to
update the p2m mapping for the domain.
At boot, the domain builder provides a simple flat p2m array for all the
initially present pages. We construct the two levels above that using
the early_brk allocator. After early boot time, set_phys_to_machine()
will allocate any missing levels using the normal kernel allocator
(at GFP_KERNEL, so it must be called in a normal blocking context).
Because the early_brk() API requires us to pre-reserve the maximum amount
of memory we could allocate, there is still a CONFIG_XEN_MAX_DOMAIN_MEMORY
config option, but its only negative side-effect is to increase the
kernel's apparent bss size. However, since all unused brk memory is
returned to the heap, there's no real downside to making it large.
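The three-level lookup described in the commit message can be sketched roughly as follows. This assumes 4k pages and that every level holds one machine-word-sized entry per slot (1024 slots per page on 32-bit, 512 on 64-bit); the function names are illustrative and not taken from the kernel source.

```python
PAGE_SIZE = 4096


def p2m_indices(pfn: int, word_bytes: int = 8) -> tuple:
    """Split a pfn into (top, mid, leaf) indices of a 3-level p2m tree."""
    per_page = PAGE_SIZE // word_bytes       # slots per page at each level
    top = pfn // (per_page * per_page)
    mid = (pfn // per_page) % per_page
    idx = pfn % per_page
    return top, mid, idx


def max_p2m_pfn(word_bytes: int = 8) -> int:
    """Number of pfns a full 3-level tree can describe under these assumptions."""
    per_page = PAGE_SIZE // word_bytes
    return per_page ** 3


print(p2m_indices(0x26700, word_bytes=4))    # 32-bit view of the 615M boundary
print(p2m_indices(0x26700, word_bytes=8))    # 64-bit view of the same pfn
print(max_p2m_pfn(8) * PAGE_SIZE // (1 << 30), "GiB covered on 64-bit")
```

Under these assumptions the first pfn past a 615M guest falls in the middle of a leaf on both architectures (slot 768 of leaf 153 on 32-bit, slot 256 of leaf 307 on 64-bit), which is the partially filled leaf discussed in this bug.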