Page allocation failure on Pandaboard and Beagle XM

Bug #746137 reported by Tobin Davis on 2011-03-31
This bug affects 5 people
Affects                      Importance   Assigned to
jasper-initramfs (Ubuntu)    Undecided    Unassigned
  Natty                      Undecided    Unassigned
  Precise                    Undecided    Unassigned
  Quantal                    Undecided    Unassigned
linux (Ubuntu)               High         Unassigned
  Natty                      Undecided    Unassigned
  Precise                    Undecided    Unassigned
  Quantal                    High         Unassigned
linux-ti-omap4 (Ubuntu)      High         Paolo Pisati
  Natty                      High         Unassigned
  Precise                    Undecided    Paolo Pisati
  Quantal                    High         Paolo Pisati

Bug Description

During the course of testing, I have been seeing a lot of page allocation failures on all omap4 images. Not sure what the cause is, but it doesn't appear to be related to any specific application. This also doesn't appear to affect the execution of the application.

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: linux-image-2.6.38-1207-omap4 2.6.38-1207.10
ProcVersionSignature: User Name 2.6.38-1207.10-omap4 2.6.38
Uname: Linux 2.6.38-1207-omap4 armv7l
Architecture: armel
Date: Wed Mar 30 16:59:55 2011
ProcEnviron:
 LANGUAGE=
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-ti-omap4
UpgradeStatus: No upgrade log present (probably fresh install)

Tobin Davis (gruemaster) wrote :
description: updated
Tobin Davis (gruemaster) wrote :
Tobin Davis (gruemaster) wrote :
tags: added: iso-testing
Ricardo Salveti (rsalveti) wrote :

Can you check if this still happens after adding "vm.min_free_kbytes = 8192" to your /etc/sysctl.conf?

From the logs it seems that the USB hub is trying to allocate big chunks of memory. That could be explained by having the rootfs on a USB disk or by transferring a lot of data over ethernet.
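
For anyone trying the suggestion above, a minimal sketch of persisting the setting follows. The helper name set_min_free_kbytes is hypothetical and not part of any package; it only rewrites the file it is given, so reloading the value is left to the usage lines in the comments.

```shell
# Sketch: persist vm.min_free_kbytes in a sysctl-style config file.
# set_min_free_kbytes FILE VALUE   (hypothetical helper name)
set_min_free_kbytes() {
    conf="$1"
    value="$2"
    touch "$conf"
    # Drop any previous setting so the file ends up with a single entry.
    grep -v '^vm\.min_free_kbytes' "$conf" > "$conf.tmp" || true
    echo "vm.min_free_kbytes = $value" >> "$conf.tmp"
    mv "$conf.tmp" "$conf"
}

# Usage on a real system (needs root):
#   set_min_free_kbytes /etc/sysctl.conf 8192
#   sysctl -p                          # reload /etc/sysctl.conf
#   cat /proc/sys/vm/min_free_kbytes   # confirm the running value
```

Removing the old entry before appending keeps the file from accumulating conflicting values when different numbers are tried in turn.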

Oliver Grawert (ogra) on 2011-03-31
Changed in linux-ti-omap4 (Ubuntu):
importance: Undecided → High
Paolo Pisati (p-pisati) wrote :

with this snippet i can reliably reproduce the problem:

-------------8<-------------8<-------------8<-------------8<-------------

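# Package set large enough to keep network and disk busy while downloading.
PKG="kubuntu-desktop openoffice.org fakeroot build-essential kexec-tools kernel-wedge libncurses5 libncurses5-dev libelf-dev asciidoc binutils-dev gimp chromium texlive-latex-recommended"
while true; do
    # Empty the apt cache so every pass downloads the packages again.
    sudo rm -f /var/cache/apt/archives/*.deb
    # -d: download only, -y: assume yes. $PKG is unquoted on purpose so it splits into package names.
    sudo aptitude install -dy $PKG
done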

-------------8<-------------8<-------------8<-------------8<-------------

however adding "vm.min_free_kbytes = 8192" to sysctl.conf seems to fix the issue.

Paolo Pisati (p-pisati) wrote :

it seems i spoke too early:

natty: http://pastebin.ubuntu.com/591172/
maverick: http://pastebin.ubuntu.com/591194/

both while doing network activity and with vm.min_free_kbytes = 8192

Marcin reported another trace that looks identical: https://bugs.launchpad.net/ubuntu/+source/linux-ti-omap4/+bug/690370/comments/28

He got his while doing a kernel compilation on a usb hdd.

IMO this and lp690370 share the same root problem.

The workaround I've been using is:

      Create /etc/modprobe.d/smscnonturbo.conf with the following contents

options smsc95xx turbo_mode=N

then reboot to get that option.

This probably does nasty things to performance but it gets rid of those errors reliably for me.

Dave
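
A minimal sketch of the turbo_mode workaround above, for reference. The helper names write_nonturbo_conf and show_turbo_mode are hypothetical, and the sysfs path used to read back the live setting is an assumption (it relies on the driver exporting turbo_mode as a readable module parameter).

```shell
# Sketch: write the modprobe option file and report the live setting.
# Both helper names are hypothetical.
write_nonturbo_conf() {
    dir="${1:-/etc/modprobe.d}"
    echo "options smsc95xx turbo_mode=N" > "$dir/smscnonturbo.conf"
}

show_turbo_mode() {
    # Assumed sysfs location of the module parameter.
    param="${1:-/sys/module/smsc95xx/parameters/turbo_mode}"
    if [ -r "$param" ]; then
        cat "$param"
    else
        echo "unknown"
    fi
}

# Usage on a real system (needs root for the first call):
#   write_nonturbo_conf
#   reboot, then: show_turbo_mode
```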

Ricardo Salveti (rsalveti) wrote :

Yes, this is directly related to turbo_mode and the amount of memory needed for an RX transaction.

The two workarounds we have are:
 * Disable turbo_mode in the driver, which directly hurts performance
 * Increase the minimum amount of free memory the kernel keeps in reserve by setting vm.min_free_kbytes in /etc/sysctl.conf

For the second option, 8192 seems to fix most of the cases: I've been building kernels the whole weekend and got only a few allocation errors. Setting it to 12288 seems to be more than enough, as I didn't get any allocation errors while using it.

As Beagle doesn't have a lot of memory to begin with, I'd prefer to go with 8192, which fixes most of the cases without decreasing ethernet performance.
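
To put the trade-off between 8192 and 12288 in perspective, the share of RAM each value sets aside can be estimated from /proc/meminfo. This is only a rough sketch; the helper name reserve_percent is hypothetical, and the figure it prints is simply the reserve divided by MemTotal.

```shell
# Sketch: what percentage of total RAM a given min_free_kbytes reserves.
# reserve_percent MIN_FREE_KB [MEMINFO_FILE]   (hypothetical helper name)
reserve_percent() {
    min_free_kb="$1"
    meminfo="${2:-/proc/meminfo}"
    # MemTotal is reported in kB, the same unit as min_free_kbytes.
    awk -v r="$min_free_kb" '/^MemTotal:/ { printf "%.2f\n", r * 100 / $2 }' "$meminfo"
}

# Usage on the board itself:
#   reserve_percent 8192
#   reserve_percent 12288
```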

This workaround could then be set by jasper.

Ricardo Salveti (rsalveti) wrote :

And remember that this workaround should be set for both OMAP 3 and OMAP 4, as both Beagle XM and Panda use the same SMSC device/driver.

summary: - Page allocation failure on omap4
+ Page allocation failure on Pandaboard and Beagle XM
Ricardo Salveti (rsalveti) wrote :

Merge proposal for jasper including the sysctl workaround with 8192 kbytes:
 * https://code.launchpad.net/~rsalveti/jasper-initramfs/746137/+merge/57104

Ricardo Salveti (rsalveti) wrote :

Also, this workaround is currently being used by our Beagle XM builders, and they seem to be running quite well now. Lamont can give more details if needed.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package jasper-initramfs - 0.54

---------------
jasper-initramfs (0.54) natty; urgency=low

  * Workaround to increase kernel min_free_kbytes to avoid page allocation
    failures (LP: #746137)
 -- Ricardo Salveti de Araujo <email address hidden> Mon, 11 Apr 2011 00:31:33 -0300

Changed in jasper-initramfs (Ubuntu Natty):
status: New → Fix Released
Bryan Wu (cooloney) wrote :

With the workaround in Jasper, I didn't see any page allocation failure errors on Panda with 1208.12 kernel.

-Bryan

Changed in linux-ti-omap4 (Ubuntu Natty):
status: New → Won't Fix
Changed in linux-ti-omap4 (Ubuntu):
status: New → Won't Fix
Matt Zimmerman (mdz) wrote :

With 2.6.38-1209-omap4 and the default min_free_kbytes=8192, I still see page allocation failures when there's a lot of ethernet traffic. I've increased it to 12288 to see if that makes a difference.

Matt Zimmerman (mdz) wrote :

With 12288, I still see both page allocation failures and "eth0: kevent 2 may have been dropped" errors.
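
When comparing settings like this, counting the two messages in the kernel log gives a rough measure of whether a value helps. A minimal sketch follows; the helper name count_kernel_msgs is hypothetical, and /var/log/kern.log as the default log location is an assumption.

```shell
# Sketch: count occurrences of a kernel message in a log file.
# count_kernel_msgs PATTERN [LOGFILE]   (hypothetical helper name)
count_kernel_msgs() {
    pattern="$1"
    log="${2:-/var/log/kern.log}"
    # grep -c exits non-zero when the count is 0, so mask that status.
    grep -c -- "$pattern" "$log" || true
}

# Usage:
#   count_kernel_msgs 'page allocation failure'
#   count_kernel_msgs 'kevent 2 may have been dropped'
#   dmesg | count_kernel_msgs 'page allocation failure' -   # "-" reads stdin
```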

Oliver Grawert (ogra) wrote :

since we dropped jasper-initramfs from quantal on, we don't have a place to put the sysctl.d hack anymore, which causes this bug to show up again in quantal images. along with that, we got a hint to replace GFP_ATOMIC with GFP_KERNEL in the relevant kmallocs, which worked for the rt2x00 wifi driver in the ac100 armhf kernel. re-opening against linux-ti-omap4 for quantal ...

Changed in linux-ti-omap4 (Ubuntu Quantal):
status: Won't Fix → New
Oliver Grawert (ogra) wrote :

the same issue is seen on omap3 beagleboard images, so also opening against the linux package.

Changed in linux (Ubuntu Natty):
status: New → Invalid
Changed in linux (Ubuntu Quantal):
importance: Undecided → High

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 746137

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Oliver Grawert (ogra) on 2012-07-24
Changed in linux-ti-omap4 (Ubuntu Quantal):
milestone: none → ubuntu-12.10-beta-1
Changed in linux (Ubuntu Quantal):
milestone: none → ubuntu-12.10-beta-1
Oliver Grawert (ogra) on 2012-07-24
tags: added: rls-q-incoming
Adam Conrad (adconrad) wrote :

bug #992786 implies that this needs to be higher for sane performance. Testing here with 32768 works well.

I'm not sure having this as an installer hack is sane, though I'm trying to sort out where it's better put. procps owns that directory, but having procps have subarch specific magic is a bit icky.

Oliver Grawert (ogra) wrote :

well, how about testing the possible kernel fix mentioned above instead of hacking around the issue in userspace ?

Adam Conrad (adconrad) wrote :

I'm all for kernel fixes, if we can sort out a decent testing strategy to make sure it (a) fixes the bug, and (b) doesn't cause a regression. Would make our buildds pretty happy if we backported it to precise too.

Changed in linux-ti-omap4 (Ubuntu Quantal):
status: New → Incomplete
Oliver Grawert (ogra) on 2012-09-03
Changed in linux-ti-omap4 (Ubuntu Quantal):
status: Incomplete → Confirmed
Changed in jasper-initramfs (Ubuntu Quantal):
status: Fix Released → Invalid
Changed in linux (Ubuntu Quantal):
status: Incomplete → Confirmed
milestone: ubuntu-12.10-beta-1 → ubuntu-12.10-beta-2
Changed in linux-ti-omap4 (Ubuntu Quantal):
milestone: ubuntu-12.10-beta-1 → ubuntu-12.10-beta-2
Oliver Grawert (ogra) wrote :

jasper carries the fix for this since natty

Changed in jasper-initramfs (Ubuntu Precise):
status: New → Fix Released
Changed in jasper-initramfs (Ubuntu Precise):
status: Fix Released → Invalid
status: Invalid → Fix Released

Removing the rls-q-incoming since this bug has been nominated and milestoned to stay on the radar.

tags: removed: rls-q-incoming
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in linux (Ubuntu Precise):
status: New → Confirmed
Changed in linux-ti-omap4 (Ubuntu Precise):
status: New → Confirmed
Tim Gardner (timg-tpi) on 2012-09-26
Changed in linux-ti-omap4 (Ubuntu Quantal):
assignee: nobody → Paolo Pisati (p-pisati)
Changed in linux-ti-omap4 (Ubuntu Precise):
assignee: nobody → Paolo Pisati (p-pisati)
Paolo Pisati (p-pisati) wrote :

in Q/omap4 we enforced vm.min_free_kbytes=32K at kernel level, while in P/omap4 we had the installer tune that value for us (and put a file in /etc/sysctl.d after the installation): i've been compiling the last two days with no problem on my panda so far, closing here.
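
To confirm that an installed image actually carries the raised reserve, the running value can be compared against the expected one. A minimal sketch, with the hypothetical helper name check_min_free; reading "32K" above as 32768 kB is an assumption, chosen because it matches the value tested earlier in this report.

```shell
# Sketch: check that the running min_free_kbytes is at least EXPECTED.
# check_min_free EXPECTED_KB [SOURCE_FILE]   (hypothetical helper name)
check_min_free() {
    expected="$1"
    src="${2:-/proc/sys/vm/min_free_kbytes}"
    actual=$(cat "$src")
    if [ "$actual" -ge "$expected" ]; then
        echo "ok: $actual kB reserved"
    else
        echo "low: $actual kB reserved, expected at least $expected"
        return 1
    fi
}

# Usage (32768 is the assumed meaning of "32K"):
#   check_min_free 32768
```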

Changed in linux-ti-omap4 (Ubuntu Precise):
status: Confirmed → Fix Released
Changed in linux-ti-omap4 (Ubuntu Quantal):
status: Confirmed → Fix Released

Closing the linux tasks as this was against linux-ti-omap4.

Changed in linux (Ubuntu Precise):
status: Confirmed → Invalid
Changed in linux (Ubuntu Quantal):
status: Confirmed → Invalid