System crash due to 4M large page used in the kernel

Bug #253875 reported by james
Affects: Moblin Kernel
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

Problem description:
The system hangs once system memory is used up by continuously running the ttmtest application (provided in the attached file).

HW: Crownbeach D1 board
SW: Moblin hardy (2.6.24) and gutsy (2.6.22) + GFX driver (psb D1_Build_7, 2.1.0.32L.0019)

Reproduce steps:
1. Boot the system.
2. Open a terminal in the X session.
3. cd ttmtest/src
4. ./test-script 300
5. The system hangs completely after several minutes.

More detailed info:
1. The ttmtest application requests Buffer Objects from the drm kernel module, which in turn requests pages with the PAGE_KERNEL_NOCACHE attribute from the kernel. The application and test script simulate video decoding on the MID platform. (The system also hangs when many clips are decoded continuously on Moblin gutsy or hardy.)
2. With "mem=nopentium" in the boot options, or with the kernel recompiled with CONFIG_DEBUG_RODATA enabled, the system does not hang in the above test.
3. Both "mem=nopentium" and CONFIG_DEBUG_RODATA disable the 4M large pages and use small (4K) pages for kernel code/data. So the hang is tied to the kernel's use of 4M pages.
4. The hang doesn't happen on the Poulsbo VV D1 board. The only difference between the Poulsbo VV D1 board and the Crownbeach D1 board is the CPU: the Crownbeach D1 board uses the Silverthorne Atom processor. It seems that the 4M page mechanism in the Moblin kernel (both 2.6.22 and 2.6.24) doesn't work well with the Silverthorne Atom processor.
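
To check quickly whether a given boot is running with the workaround from point 2, the kernel command line and CPU flags can be inspected. The following is only a rough sketch: the function names are made up for illustration, it assumes the usual /proc/cmdline and /proc/cpuinfo text formats, and it cannot see whether CONFIG_DEBUG_RODATA was enabled at build time.

```python
def cpu_flags_from_cpuinfo(cpuinfo_text):
    """Return the list of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return value.split()
    return []


def large_kernel_pages_possible(cmdline, cpu_flags):
    """Heuristic: True if the kernel may still be using 4M pages.

    Assumption: "mem=nopentium" on the command line disables 4M pages,
    and a CPU without the "pse" flag cannot use them at all.
    """
    if "mem=nopentium" in cmdline.split():
        return False
    return "pse" in cpu_flags


# Illustrative input only; on a real system read /proc/cmdline and /proc/cpuinfo.
sample_cpuinfo = "processor\t: 0\nflags\t\t: fpu vme de pse tsc msr pae\n"
sample_flags = cpu_flags_from_cpuinfo(sample_cpuinfo)
print(large_kernel_pages_possible("root=/dev/sda1 ro quiet", sample_flags))
print(large_kernel_pages_possible("root=/dev/sda1 ro mem=nopentium", sample_flags))
```

A True result only means 4M pages are not ruled out by these two checks; it does not prove that the kernel is actually mapping itself with large pages.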

Next steps needed on Moblin kernel:
1. Disable the 4M large pages in the Moblin images. This may reduce performance, however.
Or
2. Find the root cause of why the 4M large page mechanism of the Moblin kernel doesn't work on the Crownbeach D1 board. There is one known 2M/4M issue in the Atom processor; see "Using 2M/4M pages when A20M# is asserted may result in incorrect address translations" in http://download.intel.com/design/processor/specupdt/319978.pdf. This may be why the 4M large pages don't work, and it is one hint for fixing this issue in the kernel.
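
The performance concern in option 1 comes from how many page mappings the kernel needs for the same memory region. A small arithmetic sketch, assuming a hypothetical 512 MiB directly mapped region (the size is chosen only for illustration and is not taken from this report):

```python
KIB = 1024
MIB = 1024 * KIB


def pages_needed(region_bytes, page_bytes):
    """Number of pages of the given size needed to cover a region (rounded up)."""
    return -(-region_bytes // page_bytes)  # ceiling division


region = 512 * MIB  # hypothetical size of the kernel's direct mapping

small_pages = pages_needed(region, 4 * KIB)
large_pages = pages_needed(region, 4 * MIB)

print("4K pages:", small_pages)
print("4M pages:", large_pages)
print("ratio:", small_pages // large_pages)
```

Each 4M mapping replaces 1024 4K mappings, so with small pages the same region competes for far more TLB entries. How much that costs in practice depends on the workload and would need to be measured.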

james (james-xu) wrote :
C. Scott Ananian (cscott-litl) wrote :

Intel errata AAE44 and AAE46 can cause problems with large pages on Atom processors. Upstream kernel patches http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=ad5ca55f6bdb47c957b681c7358bb3719ba4ee82 and http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=211b3d03c7400f48a781977a50104c9d12f4e229 work around these (to some extent), although Intel's errata doc (http://download.intel.com/design/processor/specupdt/319536.pdf) mentions that a BIOS fix may be possible.
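
To decide whether a machine is a candidate for these workarounds, one option is to look at the CPU identification in /proc/cpuinfo. The sketch below rests on an assumption that is not stated in this report: that the affected Atom parts identify as Intel family 6, model 28. The function name is hypothetical, and the values should be confirmed against Intel's specification update before relying on them.

```python
def is_atom_model_28(cpuinfo_text):
    """Heuristic check for an Intel family 6, model 28 CPU.

    Assumption (not from the bug report): Silverthorne-class Atom
    processors report vendor GenuineIntel, cpu family 6, model 28.
    """
    fields = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            # Keep the first occurrence, i.e. the first processor entry.
            fields.setdefault(key.strip(), value.strip())
    return (
        fields.get("vendor_id") == "GenuineIntel"
        and fields.get("cpu family") == "6"
        and fields.get("model") == "28"
    )


# Illustrative input only; on a real system pass the contents of /proc/cpuinfo.
sample = (
    "processor\t: 0\n"
    "vendor_id\t: GenuineIntel\n"
    "cpu family\t: 6\n"
    "model\t\t: 28\n"
)
print(is_atom_model_28(sample))
```

A match only flags the CPU as worth checking; whether a given stepping or BIOS is actually affected is described in the errata document itself.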
