Random freezes with kernel 2.6.38-*

Bug #732172 reported by pablomme on 2011-03-09
This bug affects 6 people
Affects: linux (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

I've been experiencing random freezes (with the caps lock LED blinking, which I think means "kernel panic") with both 2.6.38-5 and 2.6.38-6 - I installed natty at *-5, so I don't know when this might have started, but 2.6.35 in maverick didn't suffer from this.

The chunk of /var/log/kern.log corresponding to the last freeze is the following, which seems truncated halfway through the backtrace:

Mar 9 17:29:16 nanomme kernel: [ 3399.299373] kswapd0: page allocation failure. order:2, mode:0x4020
Mar 9 17:29:16 nanomme kernel: [ 3399.299389] Pid: 35, comm: kswapd0 Tainted: P 2.6.38-6-generic #34-Ubuntu
Mar 9 17:29:16 nanomme kernel: [ 3399.299398] Call Trace:
Mar 9 17:29:16 nanomme kernel: [ 3399.299406] <IRQ> [<ffffffff81113414>] ? __alloc_pages_nodemask+0x604/0x840
Mar 9 17:29:16 nanomme kernel: [ 3399.299449] [<ffffffff81151d0b>] ? kmalloc_large_node+0x6b/0xc0
Mar 9 17:29:16 nanomme kernel: [ 3399.299464] [<ffffffff811571f1>] ? __kmalloc_node_track_caller+0x151/0x1a0
Mar 9 17:29:16 nanomme kernel: [ 3399.299480] [<ffffffff814c436d>] ? dev_alloc_skb+0x1d/0x40
Mar 9 17:29:16 nanomme kernel: [ 3399.299495] [<ffffffff814c3b83>] ? __alloc_skb+0x83/0x170
Mar 9 17:29:16 nanomme kernel: [ 3399.299510] [<ffffffff814c436d>] ? dev_alloc_skb+0x1d/0x40
Mar 9 17:29:16 nanomme kernel: [ 3399.299567] [<ffffffffa0132cf6>] ? rtl8192_rx_normal+0x2c6/0x470 [r8192se_pci]
Mar 9 17:29:16 nanomme kernel: [ 3399.299993] [<ffffffffa0c7d73b>] ? _nv022219rm+
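
The first line of the log above ("order:2, mode:0x4020") can be decoded with a little arithmetic. The following is a minimal sketch, not an authoritative decoder: the flag values in `GFP_BITS` are assumptions taken from the 2.6.38-era include/linux/gfp.h and should be checked against the actual kernel source, and `describe_failure` is a hypothetical helper name.

```python
# Assumed GFP flag bits for 2.6.38-era kernels (include/linux/gfp.h).
# Only a subset is listed; verify against the kernel source you are running.
GFP_BITS = {
    0x10: "__GFP_WAIT",    # allocation may sleep
    0x20: "__GFP_HIGH",    # may dip into emergency reserves
    0x40: "__GFP_IO",
    0x80: "__GFP_FS",
    0x4000: "__GFP_COMP",  # compound (multi-page) allocation
}
PAGE_SIZE = 4096  # base page size on x86_64


def describe_failure(order, mode):
    """Translate the 'order' and 'mode' fields of a page allocation failure."""
    pages = 1 << order  # order N means 2**N physically contiguous pages
    names = [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]
    known = sum(bit for bit in GFP_BITS if mode & bit)
    return {
        "pages": pages,
        "bytes": pages * PAGE_SIZE,
        "flags": names,
        "unknown_bits": mode & ~known,
        "can_sleep": bool(mode & 0x10),
    }


info = describe_failure(order=2, mode=0x4020)
print(info)
```

Under these assumptions the request was for four contiguous pages (16 KiB) made without permission to sleep, which is consistent with the `<IRQ>` marker in the trace. An atomic order-2 request can fail simply because free memory is fragmented, so the message on its own is a warning rather than evidence of the panic.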

Freezes occur between once every couple of days and a couple every day.

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: linux-image-2.6.38-6-generic 2.6.38-6.34
Regression: Yes
Reproducible: No
ProcVersionSignature: Ubuntu 2.6.38-6.34-generic 2.6.38-rc7
Uname: Linux 2.6.38-6-generic x86_64
NonfreeKernelModules: nvidia
AlsaVersion: Advanced Linux Sound Architecture Driver Version 1.0.23.
Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: NVidia [HDA NVidia], device 0: ALC269 Analog [ALC269 Analog]
   Subdevices: 1/1
   Subdevice #0: subdevice #0
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: pablo 1244 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'NVidia'/'HDA NVidia at 0xf9f78000 irq 22'
   Mixer name : 'Nvidia MCP79/7A HDMI'
   Components : 'HDA:10ec0269,104383ce,00100004 HDA:10de0007,10de0101,00100100'
   Controls : 16
   Simple ctrls : 8
Date: Wed Mar 9 18:08:09 2011
Frequency: Once every few days.
HibernationDevice: RESUME=UUID=7d8c516b-37a7-4ed3-9742-8bec10965939
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Alpha amd64 (20110301.1)
Lsusb:
 Bus 004 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 003 Device 001: ID 1d6b:0001 Linux Foundation 1.1 root hub
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 001 Device 002: ID 13d3:5126 IMC Networks
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
MachineType: ASUSTeK Computer INC. 1201N
ProcEnviron:
 LANGUAGE=en_GB:en
 PATH=(custom, user)
 LANG=en_GB.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.38-6-generic root=UUID=56551fbd-03c7-48af-b7f4-c8f165bc66ed ro quiet splash vt.handoff=7
RelatedPackageVersions:
 linux-restricted-modules-2.6.38-6-generic N/A
 linux-backports-modules-2.6.38-6-generic N/A
 linux-firmware 1.48
SourcePackage: linux
UpgradeStatus: Upgraded to natty on 2011-03-03 (5 days ago)
dmi.bios.date: 04/29/2010
dmi.bios.vendor: American Megatrends Inc.
dmi.bios.version: 0326
dmi.board.asset.tag: To Be Filled By O.E.M.
dmi.board.name: 1201N
dmi.board.vendor: ASUSTeK Computer INC.
dmi.board.version: x.xx
dmi.chassis.asset.tag: 0x00000000
dmi.chassis.type: 10
dmi.chassis.vendor: ASUSTeK Computer INC.
dmi.chassis.version: x.x
dmi.modalias: dmi:bvnAmericanMegatrendsInc.:bvr0326:bd04/29/2010:svnASUSTeKComputerINC.:pn1201N:pvrx.x:rvnASUSTeKComputerINC.:rn1201N:rvrx.xx:cvnASUSTeKComputerINC.:ct10:cvrx.x:
dmi.product.name: 1201N
dmi.product.version: x.x
dmi.sys.vendor: ASUSTeK Computer INC.

Brad Figg (brad-figg) on 2011-04-07
Changed in linux (Ubuntu):
status: New → Confirmed
pablomme (pablomme) wrote :

This continues to happen at random on my laptop - I haven't seen it on my desktop yet, so it's likely hardware-specific. Any hints on gathering a backtrace for a kernel panic? I've seen instructions that require a serial port, but unless "serial" includes USB, that won't be possible on my laptop.

gamma62 (gamma62) wrote :

I have the same problem: random freezes. This time kern.log contains some evidence.

Linux version 2.6.38-8-generic (buildd@allspice)
(gcc version 4.5.2 (Ubuntu/Linaro 4.5.2-8ubuntu3) )
#42-Ubuntu SMP Mon Apr 11 03:31:24 UTC 2011
(Ubuntu 2.6.38-8.42-generic 2.6.38.2)

(prefix removed)

 __alloc_pages_slowpath: 12 callbacks suppressed
 swapper: page allocation failure. order:2, mode:0x4020
 Pid: 0, comm: swapper Tainted: P 2.6.38-8-generic #42-Ubuntu
 Call Trace:
  <IRQ> [<ffffffff811147c4>] ? __alloc_pages_nodemask+0x604/0x840
  [<ffffffff811155eb>] ? free_compound_page+0x1b/0x20
  [<ffffffff8115306b>] ? kmalloc_large_node+0x6b/0xc0
  [<ffffffff811585a1>] ? __kmalloc_node_track_caller+0x151/0x1a0
  [<ffffffff814c5c5d>] ? dev_alloc_skb+0x1d/0x40
  [<ffffffff814c5473>] ? __alloc_skb+0x83/0x170
  [<ffffffff814c5c5d>] ? dev_alloc_skb+0x1d/0x40
  [<ffffffffa0539cf6>] ? rtl8192_rx_normal+0x2c6/0x470 [r8192se_pci]
  [<ffffffffa053a2d1>] ? rtl8192_irq_rx_tasklet+0x21/0x70 [r8192se_pci]
  [<ffffffff8106cc43>] ? tasklet_action+0x73/0x120
  [<ffffffff8106d538>] ? __do_softirq+0xa8/0x1c0
  [<ffffffff81030518>] ? ack_apic_level+0x78/0x1a0
  [<ffffffff8100cf1c>] ? call_softirq+0x1c/0x30
  [<ffffffff8100ea45>] ? do_softirq+0x65/0xa0
  [<ffffffff8106d755>] ? irq_exit+0x85/0x90
  [<ffffffff815caec6>] ? do_IRQ+0x66/0xe0
  [<ffffffff815c3213>] ? ret_from_intr+0x0/0x15
  <EOI> [<ffffffff8136080f>] ? arch_local_irq_enable+0x8/0xd
  [<ffffffff8108ec35>] ? sched_clock_idle_wakeup_event+0x15/0x20
  [<ffffffff813615d4>] ? acpi_idle_enter_bm+0x219/0x251
  [<ffffffff814a3bda>] ? cpuidle_idle_call+0xaa/0x1b0
  [<ffffffff8100a266>] ? cpu_idle+0xa6/0xf0
  [<ffffffff815a9205>] ? rest_init+0x75/0x80
  [<ffffffff81acac8b>] ? start_kernel+0x3f5/0x400
  [<ffffffff81aca388>] ? x86_64_start_reservations+0x132/0x136
  [<ffffffff81aca253>] ? zap_identity_mappings+0x3e/0x41
  [<ffffffff81aca458>] ? x86_64_start_kernel+0xcc/0xdb

gamma62 (gamma62) wrote :

The crash is related to the r8192se_pci kernel module.

pablomme (pablomme) wrote :

That is my feeling too, but I don't see any proof that that is the case. How do you know it's to do with the wireless module?

gamma62 (gamma62) wrote :

Not an exact proof in the mathematical sense, but that is not the point.

Here is the relevant part of the backtrace:
  [<ffffffff814c5c5d>] ? dev_alloc_skb+0x1d/0x40
  [<ffffffff814c5473>] ? __alloc_skb+0x83/0x170
  [<ffffffff814c5c5d>] ? dev_alloc_skb+0x1d/0x40
  [<ffffffffa0539cf6>] ? rtl8192_rx_normal+0x2c6/0x470 [r8192se_pci]
  [<ffffffffa053a2d1>] ? rtl8192_irq_rx_tasklet+0x21/0x70 [r8192se_pci]

Maybe drivers/net/wireless/rtlwifi/pci.c is involved as well?

Until recently I used recompiled 2.6.32 kernels and never saw a freeze. The binary module could be different there, and the relevant kernel code may be different as well. Switching off wifi could be a workaround for now.
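
The reasoning in the comment above rests on the module name that the kernel prints in square brackets after a frame. As a rough sketch of how to pull that out of a trace automatically (the regular expression and the `frames_by_module` helper are illustrative assumptions, not part of any kernel tooling):

```python
import re
from collections import Counter

# One stack frame looks like:
#   [<ffffffffa0539cf6>] ? rtl8192_rx_normal+0x2c6/0x470 [r8192se_pci]
# The trailing [module] is only present for code that lives in a loadable module.
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\]\s+\??\s*"
    r"(?P<symbol>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?:[ \t]+\[(?P<module>\w+)\])?"
)

# Frames copied from the backtrace excerpt quoted in this thread.
TRACE = """\
  [<ffffffff814c5c5d>] ? dev_alloc_skb+0x1d/0x40
  [<ffffffff814c5473>] ? __alloc_skb+0x83/0x170
  [<ffffffff814c5c5d>] ? dev_alloc_skb+0x1d/0x40
  [<ffffffffa0539cf6>] ? rtl8192_rx_normal+0x2c6/0x470 [r8192se_pci]
  [<ffffffffa053a2d1>] ? rtl8192_irq_rx_tasklet+0x21/0x70 [r8192se_pci]
"""


def frames_by_module(trace):
    """Count stack frames per module; untagged frames belong to the core kernel."""
    counts = Counter()
    for match in FRAME_RE.finditer(trace):
        counts[match.group("module") or "(core kernel)"] += 1
    return counts


counts = frames_by_module(TRACE)
for module, n in counts.most_common():
    print(f"{module}: {n} frame(s)")
```

A count like this only shows whose code was on the stack when the allocation failed. It does not by itself establish that the module caused the later freeze.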

pablomme (pablomme) wrote :

Yes, I've seen the presence of "rtl8192" in the backtraces, but that does not mean much - it's just a kernel oops, which may or may not be related to the kernel panic that happens at some later point. The backtrace from the kernel panic itself would not appear in the dmesg, since all basic kernel functionality - including filesystem drivers - has stopped at that point.

That said, I'm under the impression that the rtl8192se driver is of rather bad quality (apart from being unnecessarily noisy in the kernel logs). There is a newer driver available at
http://218.210.127.131/downloads/downloadsView.aspx?Langid=1&PNid=21&PFid=48&Level=5&Conn=4&DownTypeID=3&GetDown=false&Downloads=true
which is version 0019, as opposed to version 0017 currently in Ubuntu. I gave this one a try in maverick in an attempt to fix connection stability issues, but it did not seem any better than the version in Ubuntu (which I think was 0015 in maverick). But perhaps it fixes the kernel panics, so I will give it another try.

More interestingly, in the downloads page above there are drivers for other cards which use a new architecture, which allegedly fixes problems with the earlier versions. The architecture seems to unify the drivers for different cards. You can get for example the rtl8188ce driver, which supposedly supports 8192se too. I tried compiling this one under natty, but it hangs my computer. However at some point in the near future the new driver might be made to work correctly with the 8192se, so there's hope...

pablomme (pablomme) wrote :

Just a quick comment in case this helps someone looking for workarounds. To build a dkms package of the 0019 driver:
- download the sources from the link in the previous post
- unpack the tarball somewhere
- place the attached dkms.conf in the main directory of the extracted tarball
- do "sudo apt-get install build-essential debhelper dkms"
- from a terminal, change into the extracted directory and run "dkms mkdeb --source-only"
- then install the .deb (which will have been placed in the parent directory) using 'dpkg -i', double-clicking, or whatever you wish
Reboot to start using the 0019 driver. The advantage of the dkms .deb package is, of course, that it can be easily uninstalled, and that it will be recompiled automatically with every kernel upgrade.

pablomme, thank you for reporting this bug and helping make Ubuntu better. This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? If so, could you please capture the oops following https://wiki.ubuntu.com/KernelTeam/KernelTeamBugPolicies#Capturing_OOPs ? As well, can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal)? It will automatically gather and attach updated debug information to this report.

apport-collect -p linux <replace-with-bug-number>

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

Changed in linux (Ubuntu):
status: Confirmed → Incomplete
Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired