kernel oops in usbserial on ppc with appletouch loaded (palm hotsync attempt)

Bug #39518 reported by Jason 'vanRijn' Kasper on 2006-04-14
Affects: linux-source-2.6.15 (Ubuntu)
Importance: Low
Assigned to: Ben Collins

Bug Description

It does not happen the first time I hotSync my Treo 650, but somewhere in subsequent pilot hotsync attempts, I am getting a kernel oops with the below kernel trace. This operation is not normally prone to error like this, so I suspect either the appletouch driver or some oddity regarding running on a PPC machine. Also, I believe that something might be going wrong in the suspend/resume cycle that is putting the kernel into a funky state, since I am able to hotSync reliably after a clean boot.

[ 6058.278540] Unable to handle kernel paging request for data at address 0x00000084
[ 6058.278561] Faulting instruction address: 0xf27578bc
[ 6058.278573] Oops: Kernel access of bad area, sig: 11 [#3]
[ 6058.278578] Modules linked in: appletouch visor usbserial radeon drm hci_usb rfcomm l2cap bluetooth cpufreq_powersave cpufreq_stats cpufreq_userspace cpufreq_ondemand cpufreq_conservative ipv6 af_packet arc4 ieee80211_crypt_wep ext3 jbd dm_mod md_mod sr_mod snd_powermac snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc sbp2 scsi_mod apm_emu joydev ohci1394 pcmcia ieee1394 bcm43xx ieee80211softmac ieee80211 ieee80211_crypt sungem sungem_phy usbhid yenta_socket rsrc_nonstatic pcmcia_core uninorth_agp agpgart tsdev evdev reiserfs ehci_hcd ohci_hcd usbcore ide_disk ide_cd cdrom i2c_keywest capability commoncap
[ 6058.278652] NIP: F27578BC LR: C01768E0 CTR: F2757870
[ 6058.278660] REGS: e74a1d20 TRAP: 0300 Not tainted (2.6.15-20-powerpc)
[ 6058.278666] MSR: 00009032 <EE,ME,IR,DR> CR: 80004442 XER: 00000000
[ 6058.278680] DAR: 00000084, DSISR: 40000000
[ 6058.278686] TASK = e983b350[8163] 'pilot-xfer' THREAD: e74a0000
[ 6058.278692] GPR00: 00000000 E74A1DD0 E983B350 F275C0D8 F275C018 F275BC70 7FCC2760 402C7413
[ 6058.278707] GPR08: 402C7413 EB1B0600 FFFFFFE7 EB1B0600 20004448 1001E29C 00000000 00000000
[ 6058.278723] GPR16: 00000000 00000000 10010000 10010000 20000000 00000004 7FCC36C4 7FCC386B
[ 6058.278737] GPR24: 00000000 1001660C EBA6C000 F2760000 7FCC2760 E9931F00 402C7413 00000000
[ 6058.278753] NIP [F27578BC] serial_ioctl+0x4c/0x110 [usbserial]
[ 6058.278778] LR [C01768E0] tty_ioctl+0x3c0/0xf50
[ 6058.278797] Call Trace:
[ 6058.278802] [E74A1DD0] [E9BD9B20] 0xe9bd9b20 (unreliable)
[ 6058.278813] [E74A1DF0] [C01768E0] tty_ioctl+0x3c0/0xf50
[ 6058.278824] [E74A1ED0] [C0097D14] do_ioctl+0x84/0x90
[ 6058.278834] [E74A1EE0] [C0097DAC] vfs_ioctl+0x8c/0x4b0
[ 6058.278844] [E74A1F10] [C0098264] sys_ioctl+0x94/0xb0
[ 6058.278853] [E74A1F40] [C00115DC] ret_from_syscall+0x0/0x4c
[ 6058.278865] --- Exception: c01 at 0xfe5459c
[ 6058.278875] LR = 0xfed8518
[ 6058.278880] Instruction dump:
[ 6058.278916] 3ca0f276 3c80f276 7fc7f378 38a5bc70 7cdc3378 3884c018 801bfb84 83e3096c
[ 6058.278931] 3c60f276 3863c0d8 2f800000 409e00a8 <801f0084> 3940ffed 2f800000 409e0048
[ 6058.278947]
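
For readers decoding the trace: the word in angle brackets in the instruction dump is the faulting instruction. The sketch below is only an illustration. It assumes the standard PowerPC D-form layout (6-bit opcode, two 5-bit register fields, 16-bit signed displacement) and splits 0x801f0084 into its fields. The helper name decode_dform is made up for this example; the register value is copied from the GPR24 row of the trace above.

```python
# Illustrative decoder for a PowerPC D-form instruction word.
# decode_dform is a hypothetical helper, not part of any kernel tool.

def decode_dform(word):
    """Split a 32-bit D-form word into opcode, rt, ra and signed displacement."""
    disp = word & 0xFFFF
    if disp & 0x8000:          # sign-extend the 16-bit displacement
        disp -= 0x10000
    return {
        "opcode": (word >> 26) & 0x3F,
        "rt": (word >> 21) & 0x1F,
        "ra": (word >> 16) & 0x1F,
        "disp": disp,
    }

LWZ = 32  # primary opcode of "load word and zero"

fault = decode_dform(0x801F0084)   # the <801f0084> word from the dump
gpr31 = 0x00000000                 # GPR31, last value in the GPR24 row
effective_address = (gpr31 + fault["disp"]) & 0xFFFFFFFF

print(fault)
print(hex(effective_address))
```

Under these assumptions the word decodes as lwz r0,0x84(r31). With r31 equal to zero the load targets 0x84, which matches the DAR line and is consistent with reading a field at offset 0x84 through a NULL pointer.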

Hm. Okay, that time, I couldn't even get an initial successful sync. This crash occurred on the first time I tried to sync.... Slightly different call stack this time...

[ 9952.003335] Unable to handle kernel paging request for data at address 0x00000084
[ 9952.003356] Faulting instruction address: 0xf27565ec
[ 9952.003368] Oops: Kernel access of bad area, sig: 11 [#1]
[ 9952.003374] Modules linked in: visor usbserial nls_utf8 hfsplus radeon drm hci_usb rfcomm l2cap bluetooth cpufreq_powersave cpufreq_stats cpufreq_userspace cpufreq_ondemand cpufreq_conservative ipv6 af_packet arc4 ieee80211_crypt_wep ext3 jbd dm_mod md_mod sr_mod snd_powermac snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc sbp2 scsi_mod apm_emu joydev appletouch pcmcia bcm43xx ohci1394 usbhid ieee80211softmac ieee80211 ieee80211_crypt ieee1394 sungem sungem_phy uninorth_agp agpgart yenta_socket rsrc_nonstatic pcmcia_core tsdev evdev reiserfs ehci_hcd ohci_hcd usbcore ide_disk ide_cd cdrom i2c_keywest capability commoncap
[ 9952.003449] NIP: F27565EC LR: C01799C8 CTR: F27565B0
[ 9952.003458] REGS: ea769d30 TRAP: 0300 Not tainted (2.6.15-20-powerpc)
[ 9952.003464] MSR: 00009032 <EE,ME,IR,DR> CR: 24242822 XER: 20000000
[ 9952.003478] DAR: 00000084, DSISR: 40000000
[ 9952.003484] TASK = edc9ad50[6584] 'kpilotDaemon' THREAD: ea768000
[ 9952.003490] GPR00: 00000000 EA769DE0 EDC9AD50 F275B0C0 F275B018 F275AC34 C0281B78 00000004
[ 9952.003505] GPR08: 00000004 F27565B0 00000001 00000001 44244828 100543C8 00000000 00001000
[ 9952.003519] GPR16: E51CA2B4 00000082 E51CA2AC E51CA2B0 0000000D 00000000 E51CA2A4 00000104
[ 9952.003534] GPR24: E51CA2A8 E51CA2AC 00000000 00000000 00000000 EE3EAC60 F2760000 00000000
[ 9952.003549] NIP [F27565EC] serial_chars_in_buffer+0x3c/0xc0 [usbserial]
[ 9952.003576] LR [C01799C8] normal_poll+0x198/0x1e0
[ 9952.003595] Call Trace:
[ 9952.003601] [EA769DE0] [0024AA25] 0x24aa25 (unreliable)
[ 9952.003612] [EA769DF0] [C01799C8] normal_poll+0x198/0x1e0
[ 9952.003623] [EA769E10] [C0175E48] tty_poll+0x88/0xb0
[ 9952.003633] [EA769E30] [C0098E70] do_select+0x220/0x420
[ 9952.003644] [EA769EC0] [C009973C] sys_select+0x2bc/0x4c0
[ 9952.003654] [EA769F20] [C0004EEC] ppc_select+0x3c/0xf0
[ 9952.003667] [EA769F40] [C00115DC] ret_from_syscall+0x0/0x4c
[ 9952.003680] --- Exception: c01 at 0xf933e24
[ 9952.003690] LR = 0xfc08144
[ 9952.003694] Instruction dump:
[ 9952.012170] 3ca0f276 3884b018 38a5ac34 90010014 bfc10008 3fc0f276 801eeb84 2f800000
[ 9952.012191] 83e3096c 3c60f276 3863b0c0 409e0048 <801f0084> 7fe3fb78 3920ffea 2f800000
[ 9952.012208]
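
The two traces fault in different functions but at the same data address. A small sketch can pull the module, function, offset and DAR out of each oops so they can be compared side by side. The regular expressions and the summarize helper are assumptions written for the log format quoted in this report, not an existing tool.

```python
import re

# Patterns written against the PowerPC oops lines quoted in this report.
NIP_RE = re.compile(
    r"NIP \[[0-9A-Fa-f]+\] (\w+)\+0x([0-9a-f]+)/0x[0-9a-f]+ \[(\w+)\]"
)
DAR_RE = re.compile(r"DAR: ([0-9A-Fa-f]+)")

def summarize(oops_text):
    """Return (module, function, offset, dar) for one oops, or None."""
    nip = NIP_RE.search(oops_text)
    dar = DAR_RE.search(oops_text)
    if nip is None or dar is None:
        return None
    function, offset, module = nip.groups()
    return (module, function, int(offset, 16), int(dar.group(1), 16))

first = summarize(
    "[ 6058.278680] DAR: 00000084, DSISR: 40000000\n"
    "[ 6058.278753] NIP [F27578BC] serial_ioctl+0x4c/0x110 [usbserial]\n"
)
second = summarize(
    "[ 9952.003478] DAR: 00000084, DSISR: 40000000\n"
    "[ 9952.003549] NIP [F27565EC] serial_chars_in_buffer+0x3c/0xc0 [usbserial]\n"
)

print(first)
print(second)
# True when both oopses hit the same module at the same data address.
print(first[0] == second[0] and first[3] == second[3])
```

Both summaries point at usbserial with a data address of 0x84, which suggests the same missing object is being dereferenced from two different entry points.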

I have just compiled 2.6.17-rc4 from kernel.org and I do not see this behavior there. So maybe the bug was in 2.6.15 itself, or in the patches that Ubuntu has applied to it?

Doug Winter (doug-isotoma) wrote :

I'm having this problem as well, but without appletouch loaded. There is also a Debian bug, #357193, filed for this. I suspect this is a problem in the stock kernel that has been fixed in later versions.

d selby (kbmaniac) wrote :

Hi all,

Upgraded kernel to 2.6.17.1 and Palm syncing now works perfectly. So it appears that the default kernel is a bit flaky on the USB front.

Dave

Ben Collins (ben-collins) wrote :

Unless you compile the stock 2.6.15, your tests prove nothing beyond the possibility that 2.6.17 fixed the bug (which may also be present in stock 2.6.15 kernels).

Gerv (gerv) wrote :

Ben: is there a plan for getting Palm USB syncing working in Dapper? (I assume it has to be made to work at some point, otherwise people with such devices are going to be out of luck for the next five years...) Is there a plan to rev. the kernel? Or to figure out the problem and backport a patch?

Is there anything any of us can do to speed up the fix?

Gerv

Simon Wong (wongy) wrote :

I am having similar trouble on an Intel chipset (Pentium M).

It happens after a few successful syncs:

Jul 23 13:28:13 localhost kernel: [17189280.432000] usb 2-2: new full speed USB device using uhci_hcd and address 15
Jul 23 13:28:13 localhost kernel: [17189280.576000] visor 2-2:1.0: Handspring Visor / Palm OS converter detected
Jul 23 13:28:13 localhost kernel: [17189280.576000] usb 2-2: Handspring Visor / Palm OS converter now attached to ttyUSB2
Jul 23 13:28:13 localhost kernel: [17189280.576000] usb 2-2: Handspring Visor / Palm OS converter now attached to ttyUSB3
Jul 23 13:28:14 localhost kernel: [17189281.600000] e0f02972
Jul 23 13:28:14 localhost kernel: [17189281.600000] PREEMPT SMP
Jul 23 13:28:14 localhost kernel: [17189281.600000] Modules linked in: acpi_sbs i2c_acpi_ec i2c_core battery ac ibm_acpi thermal fan button ipw2100 ieee80211 e1000 michael_mic ieee80211_crypt_tkip usbhid visor usbserial arc4 ieee80211_crypt_wep binfmt_misc rfcomm l2cap bluetooth radeon drm uinput ppdev ipv6 speedstep_centrino cpufreq_userspace cpufreq_stats freq_table cpufreq_powersave cpufreq_ondemand cpufreq_conservative video tc1100_wmi sony_acpi pcc_acpi hotkey dev_acpi container deflate zlib_deflate twofish serpent aes blowfish des sha256 sha1 crypto_null af_key dm_mod md_mod nvram lp ide_scsi scsi_mod af_packet irtty_sir sir_dev nsc_ircc joydev ieee80211_crypt irda pcspkr pcmcia crc_ccitt tsdev parport_pc parport yenta_socket rsrc_nonstatic pcmcia_core psmouse serio_raw snd_intel8x0 snd_ac97_codec snd_ac97_bus snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc intel_agp agpgart shpchp pci_hotplug evdev ext3 jbd ide_generic ehci_hcd uhci_hcd usbcore ide_cd cdrom ide_disk piix generic pro
Jul 23 13:28:14 localhost kernel: essor capability commoncap vga16fb vgastate fbcon tileblit font bitblit softcursor
Jul 23 13:28:14 localhost kernel: [17189281.600000] CPU: 0
Jul 23 13:28:14 localhost kernel: [17189281.600000] EIP: 0060:[pg0+548043122/1069167616] Not tainted VLI
Jul 23 13:28:14 localhost kernel: [17189281.600000] EFLAGS: 00210246 (2.6.15-26-686)
Jul 23 13:28:14 localhost kernel: [17189281.600000] EIP is at serial_ioctl+0x32/0x100 [usbserial]
Jul 23 13:28:14 localhost kernel: [17189281.600000] eax: 00000000 ebx: 00000000 ecx: c5d2b840 edx: bfbcf628
Jul 23 13:28:14 localhost kernel: [17189281.600000] esi: 00005401 edi: bfbcf628 ebp: c5d2b840 esp: d153fef8
Jul 23 13:28:14 localhost kernel: [17189281.600000] ds: 007b es: 007b ss: 0068
Jul 23 13:28:14 localhost kernel: [17189281.600000] Process gpilotd (pid: 16382, threadinfo=d153e000 task=ddd50a90)
Jul 23 13:28:14 localhost kernel: [17189281.600000] Stack: c0171bf9 c5d2b88c da18b560 c5d2b840 d153ff44 c5d2b840 fffffff8 da7b3000
Jul 23 13:28:14 localhost kernel: [17189281.600000] 00005401 c0242bfb da7b3000 c5d2b840 00005401 bfbcf628 bfbcf628 c5d2b840
Jul 23 13:28:14 localhost kernel: [17189281.600000] da7b3000 c5d2b840 bfbcf628 00005401 00000011 c01880b3 da18b4a8 c5d2b840
Jul 23 13:28:14 localhost kernel: [17189281.600000] Call Trace:
Jul 23 13:28:14 localhost kernel: [17189281.600000] [__dentry_open+265/640] __dentry_open+0x109/0x280
Ju...

Jeff Trull wrote :

I also observe this problem on the 386 architecture (2.6.15.7 on Athlon XP). I think it may explain a lot of problems people are having with Palm syncing on USB. I investigated, and the root cause appears to be a race condition during open/close of the USB device. A brief description of the problem, along with the official patch, is here:

http://www.kernel.org/pub/linux/kernel/people/gregkh/usb/2.6/2.6.15/usbserial-race-condition-fix.patch

I have a patch against Dapper 2.6.15.7 attached (it is a little different - seems gregkh's patch is against a later version?). Applying it causes my Treo 600 sync process to go from "almost never" working to "always" working.
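
As a rough illustration of this class of bug (not the kernel code, and not the patch linked above), the toy model below shows a handler that uses per-port state after close has cleared it, next to a variant that takes a lock and checks first. ToyTty, driver_data and both ioctl methods are hypothetical names chosen for the example.

```python
import errno
import threading

class ToyTty:
    """Toy stand-in for a tty whose per-port state can vanish on close."""

    def __init__(self):
        self.lock = threading.Lock()
        self.driver_data = None            # per-port state, None when closed

    def open(self):
        with self.lock:
            self.driver_data = {"open_count": 1}

    def close(self):
        with self.lock:
            self.driver_data = None        # state torn down

    def ioctl_unguarded(self):
        # Uses the state without checking it: fails if close ran first.
        return self.driver_data["open_count"]

    def ioctl_guarded(self):
        # Takes the same lock as open/close and checks before use.
        with self.lock:
            port = self.driver_data
            if port is None:
                return -errno.ENODEV
            return port["open_count"]

tty = ToyTty()
tty.open()
tty.close()                                # state is gone before the handler runs

guarded_result = tty.ioctl_guarded()
try:
    tty.ioctl_unguarded()
    unguarded_failed = False
except TypeError:                          # Python's analogue of the NULL dereference
    unguarded_failed = True

print(guarded_result, unguarded_failed)
```

The guarded variant reports an error instead of crashing. How the real fix serializes open and close is described in the linked patch; this sketch only models the general idea.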

I'd like to echo the previous reporter's comment about this being an important issue for a Dapper update... It was really frustrating for me, personally, and I think there are a lot of Palm users out there who would benefit.

Thanks,
Jeff Trull

Changed in linux-source-2.6.15:
importance: Medium → High
status: Unconfirmed → Confirmed

Chuck Short (zulcss) wrote :

Added to my git tree.

Changed in linux-source-2.6.15:
status: Confirmed → Fix Committed
Changed in linux-source-2.6.15:
assignee: nobody → ubuntu-kernel-team

Martin Pitt (pitti) wrote :

Patch breaks ABI, but otherwise looks appropriate for an SRU. We should consider this for the Dapper point release if the kernel for it needs to break the ABI anyway.

Changed in linux-source-2.6.15:
importance: High → Low
status: Fix Committed → In Progress
Changed in linux-source-2.6.15:
status: In Progress → Fix Committed
assignee: ubuntu-kernel-team → ben-collins

Martin Pitt (pitti) wrote :

Backing this out for dapper.2, since it is the only thing that breaks ABI. Given the pain ABI changes inflict, it's not worth it this round.

Changed in linux-source-2.6.15:
milestone: ubuntu-6.06.2 → gutsy-updates
status: Fix Committed → In Progress

Martin Pitt (pitti) wrote :

Putting back, since the kernel needs to bump the ABI anyway.

Changed in linux-source-2.6.15:
milestone: gutsy-updates → ubuntu-6.06.2

Martin Pitt (pitti) wrote :

linux-source-2.6.15 (2.6.15-51.63) dapper-proposed; urgency=low

  * Fix kernel-versions for ABI bump
  * Fix for kernel crash on lvremove
    - LP: #103729
  * e1000: Disable MSI by default. Allow it to be enabled with module param.
    Some chip implementations seem to not work well with MSI.
    - LP: #56885
  * tg3: Backport from 2.6.16.y
    - LP: #72696
  * Add r1000 to nic-modules
    - LP: #81782
  * Add bnx2 to nic-modules
    - LP: #73647
  * usb-serial: Fix oops with pilot-link
    - LP: #39518
  * megaraid: Move AMI/Megaraid3 IDs from megaraid_mbox.ko to megaraid.ko
    - LP: #57233

 -- Ben Collins <email address hidden> Tue, 23 Oct 2007 16:57:09 -0400

Please test and give feedback here.

Changed in linux-source-2.6.15:
status: In Progress → Fix Committed

Henrik Nilsen Omma (henrik) wrote :

Brian writes:
"I tried to recreate the crash a fair bit today and was unable to recreate the original bug report. I tried performing hot syncs throughout the day and even hibernated once and then hot synced."

Brian Murray (brian-murray) wrote :

I performed multiple hot syncs using 2.6.15-51.63 and never had a kernel oops.

Martin Pitt (pitti) wrote :

Still works fine with the updated kernel, so considering verified. Kernel is in -updates now.

Changed in linux-source-2.6.15:
status: Fix Committed → Fix Released