Comment 39 for bug 61235

Andrew Pines (apines) wrote :

I believe we have found a real solution to this problem. Please try it out and post your results so we can verify how well it works. We found two issues which seem to contribute to this problem:

1) The EHCI controller has a setting for the number of retries (1, 2, 3, or endless). The kernel sets it to 3. Try setting it to 0 (endless retries). The setting is in drivers/usb/host/ehci-hcd.c, line 107:
     #define EHCI_TUNE_CERR 3 /* 0-3 qtd retries; 0 == don't stop */
Simply change the 3 to a 0. The controller will then keep retrying a failed transaction until it succeeds. I don't know whether this has an adverse effect on devices other than mass storage (the only USB devices we have in our system are flash drives).
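To make it clearer what this constant controls: as far as I can tell, the value ends up in the 2-bit error counter (CERR) field of each transfer descriptor's token word, bits 11:10. The following is only a minimal userspace sketch of that encoding, not kernel code. The names QTD_CERR_SHIFT, QTD_CERR_MASK, qtd_set_cerr() and qtd_get_cerr() are made up for illustration.

```c
#include <stdint.h>

/* Hypothetical names for illustration; these are not the kernel's macros. */
#define QTD_CERR_SHIFT 10
#define QTD_CERR_MASK  (0x3u << QTD_CERR_SHIFT)

/* Return the token with its CERR field (bits 11:10) replaced by cerr.
 * cerr is truncated to 2 bits; 0 means "don't stop retrying". */
static uint32_t qtd_set_cerr(uint32_t token, unsigned int cerr)
{
    return (token & ~QTD_CERR_MASK) |
           ((uint32_t)(cerr & 0x3u) << QTD_CERR_SHIFT);
}

/* Extract the CERR field from a token word. */
static unsigned int qtd_get_cerr(uint32_t token)
{
    return (token & QTD_CERR_MASK) >> QTD_CERR_SHIFT;
}
```

With the stock value of 3, qtd_set_cerr(0, 3) sets both bits of the field. With 0 the field stays clear, which the controller treats as no retry limit.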

2) The BIOS may not be setting the Frame Length Adjustment (FLADJ) register. This one is a little more obscure. See page 10 of the EHCI specification, available on Intel's web site. Our motherboard (ECS P4M800PRO-M478) is not setting a value for this register, and ECS did not seem to know what we were talking about when we asked about it. The kernel should not have to set this register, but it can if you insert the following code at the top of ehci_pci_setup() in drivers/usb/host/ehci-pci.c (about line 74):

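    {
        u8 fladj;

        /* FLADJ is at offset 0x61 in the controller's PCI config space */
        pci_read_config_byte(pdev, 0x61, &fladj);
        printk(KERN_INFO "FLADJ was=%02X\n", fladj);
        /* 0x00 worked on our board; the valid range is 0x00-0x3F */
        pci_write_config_byte(pdev, 0x61, 0x00);
        pci_read_config_byte(pdev, 0x61, &fladj);
        printk(KERN_INFO "FLADJ is now=%02X\n", fladj);
    }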

The FLADJ register's address is 0x61. The code above sets it to 0x00, which worked well for us but may be totally wrong for you. Since the right value depends on the specific motherboard, you'll probably have to find it empirically. The valid range is 0x00 through 0x3F. The default is 0x20.

You can check whether your motherboard writes its own value: boot once to set your value, then reboot (without powering off) and note whether the "was=" in the logs is what you previously set or something different. If it's different, the BIOS wrote to the register and probably knows the correct value, so you should leave this alone.
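To give a feel for what the numbers mean: per the EHCI specification, the frame length in high-speed bit times is 59488 + 16 * FLADJ, so the default 0x20 gives exactly 60000 and each step moves it by 16 bit times. Here is a small userspace sketch of that arithmetic, in case it helps when picking values to try. The helper name fladj_to_bit_times() and the constant names are mine, not from the kernel or the spec.

```c
#include <stdint.h>

/* Hypothetical names for illustration only. */
#define FLADJ_MASK      0x3Fu    /* FLADJ is a 6-bit field: 0x00-0x3F */
#define FLADJ_BASE_BITS 59488u   /* frame length at FLADJ = 0x00 */
#define FLADJ_STEP_BITS 16u      /* bit times added per FLADJ step */

/* Convert a FLADJ value to the frame length in high-speed bit times.
 * Returns 0 if the value is outside the valid 0x00-0x3F range. */
static uint32_t fladj_to_bit_times(uint8_t fladj)
{
    if (fladj & ~FLADJ_MASK)
        return 0;
    return FLADJ_BASE_BITS + FLADJ_STEP_BITS * fladj;
}
```

So the 0x00 we ended up with corresponds to 59488 bit times per frame, the shortest setting, and 0x3F to 60496, the longest.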

Build the kernel in your usual fashion, give it a spin, and let me know how it goes. I hope this helps.

     -Andrew