2.6.22-13 freezes during boot (related somehow to orinoco_cs)

Bug #149997 reported by Barry K. Nathan on 2007-10-06
Affects Status Importance Assigned to Milestone
linux-source-2.6.22 (Ubuntu)
Nominated for Gutsy by Barry K. Nathan

Bug Description

Binary package hint: linux-source-2.6.22

This is the freeze-at-boot that's been reported in the later posts in bug #125832. (In my case, it's the internal WiFi in my Toshiba Portege 3500). I've taken a picture of the screen with my digital camera. Let me see if I can attach it or something... (Give me a minute.)

Barry K. Nathan (barryn) wrote :

Well, at least, this includes as much of the panic as I could get. I started up with "vga=ask" and chose the 80x60 option, but something during the boot process undoes that (at least partially) before the panic happens...

kstockmann (klaustockmann) wrote :

Same bug with my laptop. I don't know what it means, but I have an Orinoco Wireless LAN PC card (3.3 Volt) connected to a Texas Instruments PCI-1410 CardBus controller (PCMCIA adapter). Anyhow, my laptop hasn't been able to boot since today's update.

Barry K. Nathan (barryn) wrote :

Oh, by the way, here's how I'm reproducing this bug, starting from scratch:

1. Plug the computer into a wired network connection.
2. Install Gutsy Tribe 5.
3. Reboot into the newly installed system.
4. Run Update Manager.
5. Click "Check."
6. Perform a "partial upgrade" when prompted.
7. Once the partial upgrade finishes, reboot.
8. The new kernel fails to boot all the way.

I haven't tried installing Gutsy beta then applying updates (yet). If that reproduces it, that would be quicker. However, I *know* the above procedure reproduces it for me.

David Tomaschik (matir) wrote :

A system going from feisty -> gutsy tribe 4 -> continuous updates to current exhibits the same behavior. The call trace is the same on mine as well. This is a fairly major bug.

Matthew Garrett (mjg59) wrote :

Just to clarify something - for those of you using external cards, this only happens if the card is inserted at boot, right?

Changed in linux-source-2.6.22:
importance: Undecided → Critical
status: New → Confirmed

No; it happens as soon as the card is inserted. Without the card inserted (Orinoco PCMCIA card), the boot sequence completes. As soon as the card is inserted, there is a hard lockup: no keyboard, no mouse, etc.

David Tomaschik (matir) wrote :

I have a SUSPICION (untested) as to the source of this bug. In commit 0fd77efbd8364d6a2625bbc4b84c1f3c26387655, the following code appears:
  static int __init
+ int status;
  printk(KERN_DEBUG "%s\n", version);
- return pcmcia_register_driver(&orinoco_driver);
+ status = pcmcia_register_driver(&orinoco_driver);
+ if (status >= 0)
+     status = pcmcia_register_driver(&orinoco_overlap_driver);
+ return status;

I think the (status >= 0) check should simply be (status != 0), since status == 0 SHOULD indicate that orinoco_driver was registered successfully. Any other value would be an error code from bus_add_driver, driver_register, or pcmcia_register_driver. Can anyone with more experience take a look at this?
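To make the difference between the two conditions concrete, here is a minimal userspace sketch (not kernel code). `struct stub_driver` and `stub_register` are hypothetical stand-ins for the real driver struct and pcmcia_register_driver, and the sketch assumes the convention described above: 0 on success, a negative value on error. It only shows which condition triggers the second registration call; it says nothing about why the second call locks up the machine.

```c
#include <assert.h>

/* Hypothetical stand-in for a PCMCIA driver: `result` is what the stub
 * registration call returns, `calls` counts how often it was invoked. */
struct stub_driver {
    int result;
    int calls;
};

/* Hypothetical stand-in for pcmcia_register_driver(): assumed to return
 * 0 on success and a negative errno-style value on failure. */
static int stub_register(struct stub_driver *drv)
{
    drv->calls++;
    return drv->result;
}

/* Condition from the quoted commit: the second driver is registered
 * whenever the first registration succeeds. */
static int init_with_ge(struct stub_driver *first, struct stub_driver *second)
{
    int status = stub_register(first);
    if (status >= 0)
        status = stub_register(second);
    return status;
}

/* Suggested condition: the second driver is registered only when the
 * first registration returns a nonzero (error) value. */
static int init_with_ne(struct stub_driver *first, struct stub_driver *second)
{
    int status = stub_register(first);
    if (status != 0)
        status = stub_register(second);
    return status;
}
```

With both stubs set to succeed, `init_with_ge` calls the second registration once, while `init_with_ne` never calls it. So the suggested change avoids the double registration in the success case, at the cost of never registering the second driver there.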

Barry K. Nathan (barryn) wrote :

My steps for reproducing the bug also work if you install Gutsy beta instead of Tribe 5 in step 2 -- I just tested this.

kstockmann (klaustockmann) wrote :

My problem is that the card is not removable, at least not without opening the whole notebook. So how can I fix the problem, since I can't boot and therefore can't get an update?

Barry K. Nathan (barryn) wrote :

A couple of seconds before the kernel actually starts booting, you should be able to press Esc. Then you should get a menu that lets you choose kernels.

kstockmann (klaustockmann) wrote :

Meanwhile, I solved the problem temporarily this way: I booted the older kernel from GRUB and, from there, removed the new kernel using Adept. So far it has worked. Let's see what happens when I get the next update.

Michael R. Head (burner) wrote :

orinoco_cs causes kernel backtraces the first time I insert the card and hangs the kernel on the second insert.

Michael R. Head (burner) wrote :

Here's the syslog I get when I insert an orinoco_cs-supported card.

Barry K. Nathan (barryn) wrote :

Linux does not have this bug. Linux with 2.6.22-13.40's orinoco_cs.c does have this bug. So, I'm pretty sure that the bug is in orinoco_cs.c.

At this point, I don't think the fix is as simple as what David suggests, but I'm still in the process of trying to understand the situation. I'll probably try testing David's suggestion anyway, if for no other reason than to just see what happens.

Barry K. Nathan (barryn) wrote :

David's suggestion happens to work, but I think it leaves the driver broken in a more subtle manner (ISTM that a pcmcia_register_driver error could change the set of supported devices for no rational reason -- and even if that's not the case, IMO there are other reasons why the code doesn't quite make sense).

The 2.6.22-13.40 changes to orinoco_cs.c are rather broken, but IMO it would be best to try to fix them rather than just revert them altogether. I'm not 100% sure what the best approach would be. My current plan is to create 2 or 3 patches (later today) and let the Ubuntu kernel maintainers decide which approach would be best.

David Tomaschik (matir) wrote :

My understanding is that the orinoco_driver set of PCMCIA IDs is exported to allow auto-detection of those devices as belonging to orinoco_cs. IOW, if a device with an ID in orinoco_driver is hotplugged, orinoco_cs will automatically be loaded to support this device. Devices in orinoco_overlap_driver are also (theoretically) supported by hostap, and thus not advertised as orinoco_cs supported. So for those devices, a manual modprobe must be done. For those devices, the first call to pcmcia_register_driver will fail, and then the driver retries with orinoco_overlap_driver. Are there any known instances where the change from >= to != would cause a failure? I'm rebuilding a kernel package now to test on a couple of systems with orinoco wireless cards. I don't have any hostap+orinoco cards to test with.

Barry K. Nathan (barryn) wrote :

"For those devices, the first call to pcmcia_register_driver will fail"

No, the first call is still going to succeed. Remember, you can load a module (and therefore register the driver) for a device which is not yet present!

Barry K. Nathan (barryn) wrote :

I'm using the orinoco_cs driver with this patch (applies against 2.6.22-13.40) to send this attachment, so it seems to work.

FWIW, if the first pcmcia_register_driver call succeeds but the second one fails, then removing the module will probably cause a kernel panic. I think I'll try to make another patch to hopefully avoid that, but this patch in its current form is worth looking at anyway.
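One common way to avoid leaving things half-registered is to roll back the first registration when the second one fails. The following is a userspace sketch of that pattern only, not the attached patch. `struct fake_driver`, `fake_register`, `fake_unregister`, `init_both`, and `exit_both` are all hypothetical names, and the stubs assume a 0-on-success, negative-on-error convention.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for a driver: `fail_with` is the error the stub
 * registration should return (0 means succeed), `registered` tracks state. */
struct fake_driver {
    int fail_with;
    int registered;
};

/* Hypothetical stand-in for the registration call. */
static int fake_register(struct fake_driver *drv)
{
    if (drv->fail_with)
        return drv->fail_with;
    drv->registered = 1;
    return 0;
}

/* Hypothetical stand-in for the unregistration call. Unregistering a
 * driver that was never registered is the situation to avoid, so the
 * stub asserts on it. */
static void fake_unregister(struct fake_driver *drv)
{
    assert(drv->registered);
    drv->registered = 0;
}

/* Register both drivers; if the second registration fails, undo the
 * first so that nothing stays registered after a failed init. */
static int init_both(struct fake_driver *first, struct fake_driver *second)
{
    int status = fake_register(first);
    if (status)
        return status;

    status = fake_register(second);
    if (status)
        fake_unregister(first); /* roll back the half-finished init */
    return status;
}

/* Tear down in reverse order; only valid after init_both() returned 0. */
static void exit_both(struct fake_driver *first, struct fake_driver *second)
{
    fake_unregister(second);
    fake_unregister(first);
}
```

With this shape, a failed `init_both` leaves neither driver registered, and `exit_both` only ever runs against a fully registered pair.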

David Tomaschik (matir) wrote :

A valid point. I think a fix that involves only one call to pcmcia_register_driver is the best option. I think the original bug/panic was caused by attempting to register twice when the first call succeeded. My != prevents the second call, but does disable the second set of devices. Is it possible to supply a different struct to MODULE_DEVICE_TABLE and pcmcia_register_driver, but have the table in pcmcia_register_driver be the union of the current orinoco_driver and orinoco_overlap_driver? I'm going to try that here in a minute.
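As a rough illustration of the "union table" idea, here is a userspace sketch with two ID tables: a smaller one that would be advertised for autoloading, and a superset that the driver would actually match against. Everything here is hypothetical: `struct card_id`, the table names, `table_matches`, and the ID values are made up, and the sketch does not use the kernel's real device ID structures or MODULE_DEVICE_TABLE. Whether the kernel accepts two different tables this way is exactly the open question above.

```c
#include <assert.h>

/* Hypothetical, simplified device ID: a manufacturer/card pair. */
struct card_id {
    unsigned int manf;
    unsigned int card;
};

/* IDs that would be advertised for automatic module loading.
 * Values are placeholders, not real PCMCIA IDs. */
static const struct card_id advertised_ids[] = {
    { 0x0001, 0x0001 },
    { 0x0001, 0x0002 },
    { 0, 0 } /* terminator */
};

/* IDs the driver would accept once loaded: the advertised set plus the
 * "overlap" devices that another driver may also claim. */
static const struct card_id driver_ids[] = {
    { 0x0001, 0x0001 },
    { 0x0001, 0x0002 },
    { 0x0002, 0x0001 }, /* overlap device, not advertised */
    { 0, 0 } /* terminator */
};

/* Return 1 if (manf, card) appears in a zero-terminated table, else 0. */
static int table_matches(const struct card_id *table,
                         unsigned int manf, unsigned int card)
{
    for (; table->manf || table->card; table++)
        if (table->manf == manf && table->card == card)
            return 1;
    return 0;
}
```

In this sketch an overlap device matches `driver_ids` but not `advertised_ids`, so it would be handled after a manual modprobe without being claimed automatically, and only one registration call would be needed.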

Kyle McMartin (kyle) wrote :

Hi, your patch looks (mostly) correct, and is similar to what I've put into gutsy. Thanks!

Kyle McMartin (kyle) wrote :

fwiw my patch is at people.ubuntu.com/~kyle/orinoco_cs_fix.diff

Brian Murray (brian-murray) wrote :

Assigning to the kernel team per their bug policy.

Changed in linux-source-2.6.22:
assignee: nobody → ubuntu-kernel-team

Kyle McMartin (kyle) wrote :

linux-source-2.6.22 (2.6.22-14.41) gutsy; urgency=low

  [Ben Collins]

  * ubuntu/d-i: Add niu to nic-modules

  [Kyle McMartin]

  * vesafb is not for ia64
  * remove CONFIG_NIU from places it shouldn't be
  * fix orinoco_cs oops
    - LP: #149997

  [Upstream Kernel Changes]

  * [SPARC64]: Allow userspace to get at the machine description.
  * [SPARC64]: Niagara-2 optimized copies.
  * [SPARC64]: Do not touch %tick_cmpr on sun4v cpus.
  * [SPARC64]: SMP trampoline needs to avoid %tick_cmpr on sun4v too.
  * [SPARC64]: Create a HWCAP_SPARC_N2 and report it to userspace on
  * [MATH-EMU]: Fix underflow exception reporting.
  * [SPARC64]: Need to clobber global reg vars in switch_to().
  * [MATH]: Fix typo in FP_TRAPPING_EXCEPTIONS default setting.
  * [SUNVDC]: Use slice 0xff on VD_DISK_TYPE_DISK.
  * [SPARC64]: Fix type and constant sizes wrt. sun4u IMAP/ICLR handling.
  * [SPARC64]: Enable MSI on sun4u Fire PCI-E controllers.
  * [SPARC64]: Fix several bugs in MSI handling.
  * [SPARC64]: Fix booting on V100 systems.
  * [SPARC64]: Fix lockdep, particularly on SMP.
  * [SPARC64]: Warn user if cpu is ignored.
  * [SUNSAB]: Fix several bugs.
  * [SUNSAB]: Fix broken SYSRQ.
  * [SPARC64]: Fix missing load-twin usage in Niagara-1 memcpy.
  * [SPARC64]: Don't use in/local regs for ldx/stx data in N1 memcpy.
  * [SPARC64]: Fix domain-services port probing.
  * [SPARC64]: VIO device addition log message level is too high.
  * [SPARC64]: check fork_idle() error
  * [SPARC64]: Fix 'niu' complex IRQ probing.
  * [NIU]: Add Sun Neptune ethernet driver.

 -- Kyle McMartin <email address hidden> Tue, 09 Oct 2007 00:38:16 +0000

Changed in linux-source-2.6.22:
status: Confirmed → Fix Released

ɞєᾐ ἂ. (talkingwires) wrote :

The fix worked for my system. Thanks!

Per a decision made by the Ubuntu Kernel Team, bugs will no longer be assigned to the ubuntu-kernel-team in Launchpad as part of the bug triage process. The ubuntu-kernel-team is being unassigned from this bug report. Refer to https://wiki.ubuntu.com/KernelTeamBugPolicies for more information. Thanks.
