bnx2 firmware missing

Bug #842560 reported by Christoph B on 2011-09-06
This bug affects 23 people
Affects                 Importance   Assigned to
OEM Priority Project    High         Chris Van Hoof
  Oneiric               High         Unassigned
  Precise               High         Unassigned
linux (Ubuntu)          Medium       Andy Whitcroft
  Oneiric               Medium       Andy Whitcroft
  Precise               Medium       Andy Whitcroft
udev (Ubuntu)           Medium       Andy Whitcroft
  Oneiric               Medium       Andy Whitcroft
  Precise               Medium       Andy Whitcroft

Bug Description

SRU justification:
udev in Ubuntu 11.10 has regressed support for certain hardware configurations, resulting in firmware failing to be loaded at boot time due to a race condition when shutting down udev in the initramfs. This causes network interfaces to fail to come up in a usable state on systems previously supported, and also causes long boot-time delays on these same systems.

Regression potential:
The nature of the fix introduces the possibility that some systems that are not affected by this bug will have their boot slowed down as a result of udev being forced to process more events before it's able to exit. However, since only events with a timeout requirement will actually be dispatched for processing, the impact here should be minimal.

Test case:
1. boot oneiric on a system such as a Dell PowerEdge 2950 which has two NICs that use the bnx2 driver and require the bnx2-mips-06-6.2.1.fw firmware
2. observe that the boot has a 60-second delay and that there is an error in dmesg that the firmware fails to load for one of the NICs
3. install the udev package from oneiric-proposed
4. reboot 6 times to confirm that the system now boots up with no delay and that there is no error in dmesg even after multiple reboots
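
Steps 2 and 4 of the test case both come down to checking the kernel log for the firmware error. A minimal sketch of that check follows; the function name firmware_load_failures is hypothetical, and it assumes the message wording shown in this report ("Can't load firmware file").

```shell
#!/bin/sh
# Hypothetical helper: print each firmware file that the kernel log
# (read from stdin) reports as failing to load. Prints nothing if the
# log contains no such message.
firmware_load_failures() {
    # "Can.t" matches the apostrophe without needing shell quoting tricks.
    sed -n 's/.*Can.t load firmware file "\([^"]*\)".*/\1/p' | sort -u
}

# Typical use after each reboot:
#   dmesg | firmware_load_failures
```

An empty result after every reboot in step 4 would indicate that the proposed udev package avoids the failure on that machine.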

Installing the latest Beta 1 of Ubuntu 11.10 (downloaded iso on 09-06-2011) fails after the first reboot with this error:

bnx2: Can't load firmware file "bnx2/bnx2-mips-06-6.2.1.fw"
udevd[107]: '/sbin/modprobe -bv pci:v000014E4d0000164Csv00001028sd000001B2bc02sc00i00' [172] terminated by signal 9 (Killed)

Earlier stable versions of Ubuntu Server (11.04) work fine on this machine.

For further investigation I collected some additional information for you (see attachments).

Is there any other information I can supply?

Christoph B (christoph-bittig) wrote :
summary: - Ubuntu 10.11 Beta1 fails to install on Dell PowerEdge 2950
+ Ubuntu 11.1 Beta1 fails to install on Dell PowerEdge 2950
summary: - Ubuntu 11.1 Beta1 fails to install on Dell PowerEdge 2950
+ Ubuntu 11.10 Beta1 fails to install on Dell PowerEdge 2950
description: updated

Thank you for taking the time to report this bug and helping to make Ubuntu better. The bug lacks some additional information for the developers though. Please execute the following command, as it will automatically gather debugging information, in a terminal:
apport-collect 842560
When reporting bugs in the future please use apport by using 'ubuntu-bug' and the name of the package affected. You can learn more about this functionality at https://wiki.ubuntu.com/ReportingBugs.

Changed in ubuntu:
status: New → Incomplete

If you don't get a graphical session but you can login to a command line, check in the provided link how to create the files that apport creates and attach them to the bug report.

Christoph B (christoph-bittig) wrote :

I don't get a graphical session or command line access, so I can't use apport or ubuntu-bug to report this bug.

Is there any other way to provide you with more information?

I'm changing the package to update-manager since it seems at first to be a problem with the upgrade.

If you can start a Live session (CD or USB) in any version, try to attach the logs found in /var/log/dist-upgrade of the volume where you tried the install.

It could be also useful to know if you can run a 11.10 live session.

affects: ubuntu → update-manager (Ubuntu)
Kiall Mac Innes (kiall) wrote :

@Walter: "I'm changing the package to update-manager since it seems at first to be a problem with the upgrade."

I'm getting this error after a fresh install of 11.10 Beta 2. Strangely, another identical server had no issues. I'm wondering if they have different firmware revisions.

Kiall Mac Innes (kiall) wrote :

Also - This is on Dell PowerEdge 1950

Kiall Mac Innes (kiall) wrote :

This looks related to bug #818177 and bug #870324. This is likely a duplicate.

Kiall, I agree. I'm marking this as a duplicate of bug #818177 because I think this report will get more attention there, as that bug is already triaged. And the install failure messages are the same as in that bug.

John Miller (jmiller-1) wrote :

I just installed 11.10 today using do-release-upgrade, and I'm still encountering this issue. I'm not sure whether it should be reported in this bug or in 818177 (first time launchpad visitor here), but that bug is marked as fixed now.

Booting normally, I get a few messages flashing by ("udev: timeout: killing /sbin/modprobe", etc.), then messages from services indicating the filesystem is read-only, and then the system freezes.
Booting in single-user mode, the same messages go by, but I'm able to get a command prompt, remount the root as RW, etc. Strangely enough, eth0 (which is a Broadcom card) shows up and works correctly in single-user mode.

Attached the dmesg output. Hardware is a Dell PowerEdge 2950.

Any help would be appreciated, let me know if I can provide any further information, thanks!

Denis Yakimov (dnskmv) wrote :

Hi,
I hit this problem on an HP ProLiant DL360 G5 when installing Ubuntu 11.10.

When initializing eth0 I got a message (in the dmesg log) that the firmware bnx2/bnx2-mips-06-6.2.1.fw could not be loaded.
But this file is present.
Please let me know when this bug is resolved.

For now I will install the older 10.04.

Thanks!

Tom Ellis (tellis) wrote :

Removing this bug as a duplicate of the udev issue; this firmware issue is a separate problem which I originally reported within the udev bug (I thought it was related, but it appears not to be).

Another user on the udev bug has mentioned this too.

Tom Ellis (tellis) wrote :

Another user in the udev bug reported the same issue on vanilla 11.10:
--
I'm encountering this issue on a Dell PowerEdge 2950 (pretty common server model, we have a ton of them here). Clean install of 11.10 will not boot properly, same with installing 11.04 and then upgrading. It appears to be running the latest udev (173-0ubuntu4).

[ 62.944049] bnx2: Can't load firmware file "bnx2/bnx2-mips-06-6.2.1.fw"
[ 62.944081] bnx2 0000:05:00.0: PCI INT A disabled
[ 62.944095] bnx2: probe of 0000:05:00.0 failed with error -2

Apologies if I'm missing something, and let me know if I can provide any further info.

Tom Ellis (tellis) on 2011-10-20
summary: - Ubuntu 11.10 Beta1 fails to install on Dell PowerEdge 2950
+ bnx2 firmware missing
Tom Ellis (tellis) wrote :

On another Oneiric server I have running 3.0.0-12-server, I can see this firmware present on the system post-install:

$ ls -l /lib/firmware/bnx2/bnx2-mips-06-6.2.1.fw
-rw-r--r-- 1 root root 92792 2011-08-23 09:23 /lib/firmware/bnx2/bnx2-mips-06-6.2.1.fw
$ dpkg -S /lib/firmware/bnx2/bnx2-mips-06-6.2.1.fw
linux-firmware: /lib/firmware/bnx2/bnx2-mips-06-6.2.1.fw

The linux-firmware package is at version 1.60

It could also be a problem with a rename, perhaps inside the initrd?

$ mkdir /tmp/initrd && cd /tmp/initrd
$ gzip -cd /boot/initrd.img-3.0.0-12-server | cpio -idmv
$ find . -name bnx2-mips-06-6.2.1.fw
./lib/firmware/3.0.0-12-server/bnx2/bnx2-mips-06-6.2.1.fw

So, according to that it's also in the initrd....
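
The same check can be made without unpacking the image by filtering a listing of the initrd contents. This is only a sketch: the function name initrd_lists_firmware is hypothetical, and the usage line assumes the lsinitramfs tool from initramfs-tools is installed.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file listing on stdin contains a
# path with "/bnx2/<name>" in it, where <name> is the first argument.
initrd_lists_firmware() {
    grep -qF -- "/bnx2/$1"
}

# Assumed usage (lsinitramfs prints one path per line):
#   lsinitramfs /boot/initrd.img-3.0.0-12-server \
#       | initrd_lists_firmware bnx2-mips-06-6.2.1.fw && echo present
```

The listing from the find command above would pass this check, which matches the conclusion that the file is in the initrd.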

Changed in update-manager (Ubuntu):
status: Incomplete → Invalid
Tom Ellis (tellis) on 2011-10-20
Changed in linux (Ubuntu):
status: New → Confirmed
Joseph Salisbury (jsalisbury) wrote :

@Christopher

We would like to collect some additional information about your system. From a terminal, please run the following:

apport-collect 842560

tags: added: oneiric regression-release
Netmatters (netmatters) wrote :

Not sure if this will help much but this is the behavior that I get on an HP DL380 G5 running Oneiric with 3.0.0-12-server:

I've got 2 interfaces connected and configured yet only eth1 comes up.

I can confirm everything Tom Ellis sees above, as well as the following:

$ cat /etc/udev/rules.d/70-persistent-net.rules

# PCI device 0x14e4:0x164c (bnx2)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1a:4b:4c:ae:1e", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x14e4:0x164c (bnx2)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1a:4b:4c:ae:1c", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

$ dmesg | grep bnx
[ 1.085804] bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.1.6 (Mar 7, 2011)
[ 1.085851] bnx2 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 61.920053] bnx2: Can't load firmware file "bnx2/bnx2-mips-06-6.2.1.fw"
[ 61.920093] bnx2 0000:03:00.0: PCI INT A disabled
[ 61.920109] bnx2: probe of 0000:03:00.0 failed with error -2
[ 61.920149] bnx2 0000:05:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 64.329467] bnx2 0000:05:00.0: eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem fa000000, IRQ 17, node addr 00:1a:4b:4c:ae:1c
[ 64.375917] bnx2 0000:05:00.0: irq 46 for MSI/MSI-X
[ 64.496012] bnx2 0000:05:00.0: eth1: using MSI
[ 67.656482] bnx2 0000:05:00.0: eth1: NIC Copper Link is Up, 1000 Mbps full duplex
[ 461.728088] bnx2 0000:05:00.0: irq 46 for MSI/MSI-X
[ 461.912009] bnx2 0000:05:00.0: eth1: using MSI
[ 464.761214] bnx2 0000:05:00.0: eth1: NIC Copper Link is Up, 1000 Mbps full duplex
[ 471.404087] bnx2 0000:05:00.0: irq 46 for MSI/MSI-X
[ 471.604008] bnx2 0000:05:00.0: eth1: using MSI
[ 474.539930] bnx2 0000:05:00.0: eth1: NIC Copper Link is Up, 1000 Mbps full duplex

$ dmesg | grep udev
[ 1.045498] udevd[93]: starting version 173
[ 64.086957] udevd[312]: starting version 173
[ 64.360430] udevd[322]: renamed network interface eth0 to eth1
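
The rename in the last line follows from the rules file: the only NIC that probed successfully (0000:05:00.0, node addr 00:1a:4b:4c:ae:1c) came up as the kernel's eth0, and the persistent-net rule for that MAC address names it eth1. A small sketch for listing that mapping is below; the function name net_rule_names is hypothetical, and it assumes rules in the single-line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: for each udev rule line on stdin that carries both
# ATTR{address}=="..." and NAME="...", print "<name> <mac>".
net_rule_names() {
    sed -n 's/.*ATTR{address}=="\([^"]*\)".*NAME="\([^"]*\)".*/\2 \1/p'
}

# Assumed usage:
#   net_rule_names < /etc/udev/rules.d/70-persistent-net.rules
```

For the rules file quoted above this prints "eth0 00:1a:4b:4c:ae:1e" and "eth1 00:1a:4b:4c:ae:1c", which can be compared against the "node addr" values in dmesg.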

I was just about to upgrade a 10 node cloud all running G5's!! :(

Joseph Salisbury (jsalisbury) wrote :

@Christopher

There is a newer version of the kernel (3.0.0-12.20) currently in the release pocket than the one you tested when this issue was found. Please test again with the newer kernel and indicate in the bug whether this issue still exists.

tags: added: kernel-request-3.0.0-12.20
Joseph Salisbury (jsalisbury) wrote :

@John Miller

Can you see if the firmware exists on your system:
ls -l /lib/firmware/bnx2/*

It would also be great if you can attach your kern.log file.

Changed in linux (Ubuntu):
importance: Undecided → Medium
Netmatters (netmatters) wrote :

My host was just installed, and is running 3.0.0-12.20 and this issue still exists.

Joseph Salisbury (jsalisbury) wrote :

@Netmatters

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . If so, please test the release candidate kernel versus the daily build:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.1-rc10-oneiric/

tags: added: kernel-key
Changed in linux (Ubuntu):
importance: Medium → High
Nick (k-launchpadspamsme) wrote :

I just tried installing the latest ubuntu-11.10-server-amd64 on an HP DL380 G5 and after completing installation it fails to boot. When booting in recovery mode, input locks up at the Recovery Menu (requires a power cycle) and I get these errors.

[ 122.848027] bnx2: Can't load firmware file "bnx2/bnx2-mips-06-6.2.1.fw"
[ 122.848068] bnx2 0000:05:00.0: PCI INT A disabled
[ 122.848089] bnx2: probe of 0000:05:00.0 failed with error -2
[ 184.032024] bnx2: Can't load firmware file "bnx2/bnx2-mips-06-6.2.1.fw"
[ 184.032353] bnx2 0000:0f:00.0: PCI INT A disabled
[ 184.032369] bnx2: probe of 0000:0f:00.0 failed with error -2

Netmatters (netmatters) wrote :

Hi Joseph,

Ok, so I just installed the RC kernel as suggested above and can confirm that this bug still exists:-

$ uname -a
Linux proxyauth0 3.1.0-030100rc10-generic #201110200610 SMP Thu Oct 20 10:11:32 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

$ lspci | grep -i ethernet
03:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
05:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)

$ lsmod | grep bnx
bnx2 86844 0

$ dmesg | grep -i bnx
[ 1.434629] bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.1.11 (July 20, 2011)
[ 1.434661] bnx2 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 1.920634] bnx2 0000:03:00.0: eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem f8000000, IRQ 16, node addr 00:1a:4b:4c:ae:1e
[ 1.920680] bnx2 0000:05:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 62.944052] bnx2: Can't load firmware file "bnx2/bnx2-mips-06-6.2.1.fw"
[ 62.944092] bnx2 0000:05:00.0: PCI INT A disabled
[ 62.944107] bnx2: probe of 0000:05:00.0 failed with error -2
[ 66.259391] bnx2 0000:03:00.0: irq 46 for MSI/MSI-X
[ 66.361239] bnx2 0000:03:00.0: eth0: using MSI
[ 69.553993] bnx2 0000:03:00.0: eth0: NIC Copper Link is Up, 1000 Mbps full duplex

$ cat /etc/udev/rules.d/70-persistent-net.rules | grep eth0
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1a:4b:4c:ae:1e", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

$ cat /etc/udev/rules.d/70-persistent-net.rules | grep eth1
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1a:4b:4c:ae:1c", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

$ ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:1a:4b:4c:ae:1e
          inet addr:192.168.102.2 Bcast:0.0.0.0 Mask:255.255.255.0
          inet6 addr: fe80::21a:4bff:fe4c:ae1e/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:793 errors:0 dropped:0 overruns:0 frame:0
          TX packets:595 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:75109 (75.1 KB) TX bytes:77569 (77.5 KB)
          Interrupt:16 Memory:f8000000-f8012800

$ ifconfig eth1
eth1: error fetching interface information: Device not found

I'm going to try and install the daily build now and will report back the results.

Netmatters (netmatters) wrote :

Ok, I have just installed the 'daily build' kernel and can confirm that this bug still exists:-

$ uname -a
Linux proxyauth0 3.1.0-030100rc10-generic #201110200610 SMP Thu Oct 20 10:11:32 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

$ dmesg | grep -i bnx
[ 1.442551] bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.1.11 (July 20, 2011)
[ 1.442580] bnx2 0000:03:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 1.925067] bnx2 0000:03:00.0: eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem f8000000, IRQ 16, node addr 00:1a:4b:4c:ae:1e
[ 1.925129] bnx2 0000:05:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[ 62.944050] bnx2: Can't load firmware file "bnx2/bnx2-mips-06-6.2.1.fw"
[ 62.944089] bnx2 0000:05:00.0: PCI INT A disabled
[ 62.944104] bnx2: probe of 0000:05:00.0 failed with error -2
[ 65.971304] bnx2 0000:03:00.0: irq 46 for MSI/MSI-X
[ 66.096032] bnx2 0000:03:00.0: eth0: using MSI
[ 69.294254] bnx2 0000:03:00.0: eth0: NIC Copper Link is Up, 1000 Mbps full duplex

$ lsmod | grep bnx
bnx2 86844 0

Netmatters (netmatters) wrote :

Just correcting my last post since it was still booted into the RC kernel.

Booted into the daily build kernel correctly this time and the issue still exists:-

# uname -a
Linux proxyauth0 3.1.0-030100rc10-generic #201110200610 SMP Thu Oct 20 10:11:32 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

With all the different kernels I've tried the results are the same: only one interface gets enabled, and the kernel fails to load the driver for the other interface.

Joseph Salisbury (jsalisbury) wrote :

This issue appears to be an upstream bug, since you tested the latest upstream kernel. We will want to file an upstream bug, once bugzilla.kernel.org is available. An alternative would be to email the maintainer for this subsystem directly.

Changed in linux (Ubuntu):
status: Confirmed → Triaged
Tim Gardner (timg-tpi) on 2011-10-24
Changed in linux (Ubuntu Oneiric):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux (Ubuntu Precise):
assignee: nobody → Tim Gardner (timg-tpi)
status: Triaged → In Progress
Tim Gardner (timg-tpi) wrote :

This really seems like a driver bug. I've got an AMD server with 2 bnx2 NICS, so I'll see if I can replicate the problem.

Netmatters (netmatters) wrote :

There are probably more Dell 2950 and HP DL380's installed in the data-centres than any other 2U server.

Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in update-manager (Ubuntu Oneiric):
status: New → Confirmed
Tim Gardner (timg-tpi) wrote :

Would someone who has physical console access perform the following, then attach the output of dmesg:

sudo modprobe -r bnx2
sudo modprobe bnx2

Contrary to my post in #27, I now think the driver looks right. This is possibly a race with udev getting killed. IIRC upstart nukes some processes after initrd has run and just before the rootfs is mounted.

David Bierce (cppe-david) wrote :

This output kind of matches what you think would be happening. Note the 1 minute hang waiting for the network driver. The output of the screen is filled with udev messages but I can't find them logged anywhere.

David Bierce (cppe-david) wrote :

Here is the dmesg output when the driver is removed and then re-added.

Tim Gardner (timg-tpi) wrote :

David - your modprobe.txt results seem pretty conclusive that this bug is really a udev race with upstart and initramfs. I'm assigning it to the foundations team accordingly.

affects: update-manager (Ubuntu Oneiric) → udev (Ubuntu Oneiric)
Changed in udev (Ubuntu Oneiric):
milestone: none → oneiric-updates
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Tim Gardner (timg-tpi) on 2011-10-27
Changed in linux (Ubuntu Oneiric):
assignee: Tim Gardner (timg-tpi) → nobody
status: In Progress → Invalid
Changed in linux (Ubuntu Precise):
assignee: Tim Gardner (timg-tpi) → nobody
status: In Progress → Invalid
Steve Langasek (vorlon) wrote :

> IIRC upstart nukes some processes after initrd has run and just before the
> rootfs is mounted.

No, it does not. The /usr/share/initramfs-tools/scripts/init-bottom/udev script signals udev to quit with 'udevadm control --exit', which causes udev to signal its workers and wait up to 60 seconds for them to finish up.

The logs for this (apparently widely reproducible) bug show that a udev thread spends a full 60 seconds waiting for the bnx2 firmware to be loaded, and at the end it times out and kills the process, leading to these messages:

[ 2.218105] bnx2 0000:09:00.0: eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem f4000000, IRQ 16, node addr 00:1c:23:bd:ed:e3
[ 2.218178] bnx2 0000:05:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
<snip>
[ 62.944047] bnx2: Can't load firmware file "bnx2/bnx2-mips-06-6.2.1.fw"
[ 62.944087] bnx2 0000:05:00.0: PCI INT A disabled
[ 62.944107] bnx2: probe of 0000:05:00.0 failed with error -2
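
The 60-second wait can be read straight off the kernel timestamps. The sketch below measures the gap between a bnx2 device being claimed ("PCI INT A -> GSI") and the next firmware failure; the function name firmware_wait_seconds is hypothetical, and it assumes the message formats shown in this report.

```shell
#!/bin/sh
# Hypothetical helper: read kernel log text on stdin and, for each firmware
# load failure, print the seconds elapsed since the most recent
# "bnx2 <pci-address>: PCI INT A -> GSI" line.
firmware_wait_seconds() {
    awk '
        # Return the kernel timestamp found in "[   62.944047]", or -1.
        function stamp(line,    s) {
            if (!match(line, /\[ *[0-9]+\.[0-9]+\]/)) return -1
            s = substr(line, RSTART + 1, RLENGTH - 2)
            return s + 0
        }
        /bnx2 [0-9a-f:.]+: PCI INT A -> GSI/ { start = stamp($0); seen = 1 }
        /Can.t load firmware file/ && seen {
            printf "%.1f\n", stamp($0) - start
            seen = 0
        }
    '
}

# Assumed usage:
#   dmesg | firmware_wait_seconds
```

For the excerpt above this prints 60.7, consistent with the modprobe being killed when the 60-second wait expires rather than failing straight away.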

I don't know what this worker thread is doing while it's supposed to be loading this firmware. I also don't have any hardware to reproduce this on. Were you able to reproduce it with your AMD server?

Steve Langasek (vorlon) wrote :

Remarking 'confirmed' for the kernel, because AFAICS there's a kernel bug here somewhere if loading the firmware is hanging for 60 seconds.

Changed in linux (Ubuntu Precise):
status: Invalid → Confirmed
Changed in linux (Ubuntu Oneiric):
importance: Undecided → High
status: Invalid → Confirmed
Steve Langasek (vorlon) wrote :

I've uploaded a patched udev package, 173-0ubuntu4+ppa2, to my ppa that should help with debugging this. It should be available in an hour or so.

Once it is, can those who are able to reproduce this issue please install the udev package from <https://launchpad.net/~vorlon/+archive/ppa> by running 'sudo apt-add-repository ppa:vorlon && sudo apt-get update && sudo apt-get install udev', then reboot, and attach the files /dev/.udev.ps.log and /dev/.udev.initramfs.log to this bug report?

David Bierce (cppe-david) wrote :

From a Dell Poweredge 2950

Hi David,

On Fri, Oct 28, 2011 at 04:02:33AM -0000, David Bierce wrote:
> > From a Dell Poweredge 2950

> ** Attachment added: "udev.initramfs.log"
> https://bugs.launchpad.net/ubuntu/+source/udev/+bug/842560/+attachment/2576113/+files/udev.initramfs.log

Thanks for this. Was there also a /dev/.udev.ps.log? Can you attach that
file as well?


After two reboots, that file doesn't exist. Just the udev.initramfs.log file.

Steve Langasek (vorlon) wrote :

David, could you also include the dmesg output from this particular boot, so we can be absolutely certain that the firmware loading is failing the same way as before?

Steve Langasek (vorlon) wrote :

> After two reboots, that file doesn't exist. Just the udev.initramfs.log file.

Please verify the version of the udev package you have installed. It should be version 173-0ubuntu4+ppa2 - version 173-0ubuntu4+ppa1 didn't include the code to output the process list.

David Bierce (cppe-david) wrote :
  • udev.initramfs.log (232.0 KiB)
  • udev.ps.log (4.9 KiB)

Had the older package. Here are both files.

Steve Langasek (vorlon) wrote :

  216 0 4208 S /sbin/modprobe -bv pci:v000014E4d0000164Csv00001028s

Thanks. To me that looks pretty conclusive that this is a kernel issue - this modprobe should be returning immediately, but instead it hangs out for 60 seconds until being killed.

Not a udev bug, at least; either a kernel bug or a module-init-tools bug.

AlexC (swissalex90) wrote :

Hi,
I've just upgraded to 11.10 (from 11.04 via do-release-upgrade) and have this same error (only eth1 seems to get recognized, not eth0). Symptoms all as reported above and on an HP Proliant ML370 G5.

Have access to terminal and happy to offer any logs / output as needed to help resolve.

AlexC (swissalex90) wrote :

Hi,
Is there any update on this bug please ? Any idea when it will be looked at ? (it seems to be still unassigned).

Thanks
Alex

mturilli (mturilli) wrote :

Hi,

We experienced this issue on Dell PowerEdge 2950 servers. We have addressed it by modifying the /usr/share/initramfs-tools/scripts/init-bottom/udev file as described in https://bugs.launchpad.net/ubuntu/+source/udev/+bug/818177 .

In our case, at the end of the installation of Ubuntu 11.10, before rebooting, we have:
1. switched to console 2,
2. chrooted into target/,
3. edited the udev file adding:

. /scripts/functions
wait_for_udev

before the:

# Stop udevd

command,
4. rebuilt the initrd image with:

update-initramfs -u

5. finished the installation by rebooting the system.
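
The edit in step 3 can be sketched as a small filter. This is not the packaged fix, only an illustration of the manual workaround: the function name add_wait_for_udev is hypothetical, and it assumes the "# Stop udevd" comment starts at the beginning of a line in the init-bottom script.

```shell
#!/bin/sh
# Hypothetical helper: print a copy of the init-bottom udev script ($1) with
# the two workaround lines inserted before the first "# Stop udevd" comment.
# If the script already calls wait_for_udev, it is printed unchanged.
add_wait_for_udev() {
    if grep -q '^wait_for_udev' "$1"; then
        cat "$1"
        return
    fi
    awk '
        /^# Stop udevd/ && !done {
            print ". /scripts/functions"
            print "wait_for_udev"
            print ""
            done = 1
        }
        { print }
    ' "$1"
}

# Assumed usage, as root, reviewing the result before replacing the script:
#   f=/usr/share/initramfs-tools/scripts/init-bottom/udev
#   add_wait_for_udev "$f" > /tmp/udev.new
#   diff -u "$f" /tmp/udev.new
#   cat /tmp/udev.new > "$f" && update-initramfs -u
```

Writing to a separate file first keeps the original script intact until the change has been checked, and the early return makes the helper safe to run twice.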

Tim Gardner (timg-tpi) wrote :

AlexC - it's being worked on by the foundations team. As it is a fairly complex issue, it's taken a while to determine the root cause. That, and UDS kind of distracted us for a few days.

AlexC (swissalex90) wrote :

@Tim, ok thanks very much - I just couldn't tell from the statuses at the top of this page whether it was actually being looked at or was still to be assigned.

@matteo - thank you very very much, that works for me too. I have my system up and running again now (and this thread bookmarked :).

thanks to you both.

Phil Re (philr) wrote :

Same issue here on a DL380G5 after updating from 11.04 to 11.10.
Is there a workaround other than installing another NIC?
I'll be happy to provide any further information or logs.

Edgar (edgar-nl) wrote :

Same here with a DL360G5. AND:
After a reboot eth0 doesn't exist. When I "rmmod bnx2" and "modprobe bnx2" everything works fine...

Marc Kolly (makuser) wrote :

Have a DL360G5 too.
1. If I reboot the server with the "reboot" command and nic1 and nic2 are connected to the network, I have eth0 and eth1 on my 11.10.
2. But if I reboot with only nic1 connected, I only have eth1, which is not working because nothing is connected to it.
3. After a reboot with only nic2 connected I have eth1, but not eth0.
4. But whenever I turn off the server with the ACPI button and turn it on again, I have only eth1, even if I connect both NICs as in 1.

I will try rmmod bnx2 and modprobe bnx2 in a few hours.

Edgar (edgar-nl) wrote :

Exactly the same here! (@Marc-André)
And don't forget to script the rmmod and modprobe when you work from your workstation ;-)
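
A sketch of such a script follows. It is only an illustration of the rmmod/modprobe workaround, not a fix: both function names are hypothetical, the parsing assumes the "probe of ... failed" message format shown in this report, and the reload needs root.

```shell
#!/bin/sh
# Hypothetical helper: print the PCI address of every bnx2 device whose
# probe failed, reading kernel log text on stdin.
failed_bnx2_probes() {
    sed -n 's/.*bnx2: probe of \([0-9a-f:.]*\) failed.*/\1/p' | sort -u
}

# Hypothetical helper: reload the driver as one detached job, so the reload
# still completes if the remote session drops while the module is unloaded.
reload_bnx2() {
    nohup sh -c 'modprobe -r bnx2; sleep 1; modprobe bnx2' >/dev/null 2>&1 &
}

# Assumed usage:
#   dmesg | failed_bnx2_probes | grep -q . && reload_bnx2
```

Running both modprobe calls inside one detached shell matters over SSH, because removing the module takes down the working interface as well.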

Edgar (edgar-nl) wrote :

@matteo (#46) your solution/workaround also works for us, thanks!

Ameet Paranjape (ameetp) on 2011-11-30
Changed in linux (Ubuntu Oneiric):
assignee: nobody → Canonical Kernel Team (canonical-kernel-team)
Steve Langasek (vorlon) on 2011-12-01
Changed in udev (Ubuntu Precise):
assignee: nobody → Andy Whitcroft (apw)
Chris Van Hoof (vanhoof) on 2011-12-05
Changed in oem-priority:
status: New → Confirmed
Chris Van Hoof (vanhoof) on 2011-12-05
Changed in oem-priority:
assignee: nobody → Chris Van Hoof (vanhoof)
importance: Undecided → High
tags: added: kernel-da-key
Ralf Heiringhoff (frosty-geek) wrote :

I can confirm that applying the fix from comment #46 fixed our issue of only one of the two onboard NICs working after reboot.

We first ran into the problem after upgrading from Lucid (10.04) to Oneiric (11.10), as well as after a clean install of Oneiric.

Hardware: DL 385 G2 (same as HP DL385 G5)

----------------cut-------------
Dec 12 13:38:12 d01-spi-1 kernel: [ 1.651916] bnx2: Broadcom NetXtreme II Gigabit Ethernet Driver bnx2 v2.1.6 (Mar 7, 2011)
Dec 12 13:38:12 d01-spi-1 kernel: [ 1.651948] bnx2 0000:04:00.0: PCI INT A -> GSI 41 (level, low) -> IRQ 41
Dec 12 13:38:12 d01-spi-1 kernel: [ 2.135913] bnx2 0000:04:00.0: eth0: Broadcom NetXtreme II BCM5708 1000Base-T (B2) PCI-X 64-bit 133MHz found at mem f8000000, IRQ 41, node addr 00:19:bb:ca:6a:88
Dec 12 13:38:12 d01-spi-1 kernel: [ 2.136152] bnx2 0000:42:00.0: PCI INT A -> GSI 34 (level, low) -> IRQ 34
Dec 12 13:38:12 d01-spi-1 kernel: [ 62.944053] bnx2: Can't load firmware file "bnx2/bnx2-mips-06-6.2.1.fw"
Dec 12 13:38:12 d01-spi-1 kernel: [ 62.944085] bnx2 0000:42:00.0: PCI INT A disabled
Dec 12 13:38:12 d01-spi-1 kernel: [ 62.944126] bnx2: probe of 0000:42:00.0 failed with error -2
Dec 12 13:38:13 d01-spi-1 kernel: [ 66.367555] bnx2 0000:04:00.0: irq 68 for MSI/MSI-X
Dec 12 13:38:13 d01-spi-1 kernel: [ 66.477097] bnx2 0000:04:00.0: eth1: using MSI
Dec 12 13:38:16 d01-spi-1 kernel: [ 69.651513] bnx2 0000:04:00.0: eth1: NIC Copper Link is Up, 1000 Mbps full duplex, receive & transmit flow control ON
----------------cut-------------

----------------cut-------------
root@d01-spi-1:~# ethtool -i eth0
driver: bnx2
version: 2.1.6
firmware-version: bc 1.9.6
bus-info: 0000:04:00.0
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
----------------cut-------------

----------------cut-------------
root@d01-spi-1:~# lspci | grep -i Ethernet
04:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
42:00.0 Ethernet controller: Broadcom Corporation NetXtreme II BCM5708 Gigabit Ethernet (rev 12)
----------------cut-------------

----------------cut-------------
Linux d01-spi-1 3.0.0-14-server #23-Ubuntu SMP Mon Nov 21 20:49:05 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux
----------------cut-------------

----------------cut-------------
ii udev 173-0ubuntu4 rule-based device node and kernel event manager
ii linux-image-server 3.0.0.14.16 Linux kernel image on Server Equipment.
----------------cut-------------


@Netmatters I'm having the exact same problem on a HP 360 G5 with Ubuntu Server 64 11.10.

Everything was working fine until Ubuntu Desktop 64 11.04.

During the installation (not an upgrade) of Ubuntu Server 64 11.10, I manually set up a static IP address, gateway, netmask and DNS server for eth0 but in the end no network connection was available as expected.

These are two screenshots taken during boot with a camera:

http://imageshack.us/photo/my-images/443/img20111215115606a.jpg/

http://imageshack.us/photo/my-images/163/img20111215115633.jpg/

ifconfig:
lo Link encap:Local Loopback
          inet addr:127.0.0.1 Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING MTU:16436 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)

70-persistent-net.rules:
# This file was automatically generated by the /lib/udev/write_net_rules
# program, run by the persistent-net-generator.rules rules file.
#
# You can modify it, as long as you keep each rule on a single
# line, and change only the value of the NAME= key.

# PCI device 0x14e4:0x164c (bnx2)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1a:4b:ce:3e:16", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth0"

# PCI device 0x14e4:0x164c (bnx2)
SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="00:1a:4b:ce:3e:b4", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

sudo lshw -class network
  *-network UNCLAIMED
       description: Ethernet controller
       product: NetXtreme II BCM5708 Gigabit Ethernet
       vendor: Broadcom Corporation
       physical id: 0
       bus info: pci@0000:03:00.0
       version: 12
       width: 64 bits
       clock: 66MHz
       capabilities: pcix pm vpd msi cap_list
       configuration: latency=64 mingnt=64
       resources: memory:f8000000-f9ffffff memory:80100000-801007ff
  *-network DISABLED
       description: Ethernet interface
       product: NetXtreme II BCM5708 Gigabit Ethernet
       vendor: Broadcom Corporation
       physical id: 0
       bus info: pci@0000:05:00.0
       logical name: eth1
       version: 12
       serial: 00:1a:4b:ce:3e:b4
       capacity: 1Gbit/s
       width: 64 bits
       clock: 66MHz
       capabilities: pcix pm vpd msi bus_master cap_list rom ethernet physical tp 10bt 10bt-fd 100bt 100bt-fd 1000bt-fd autonegotiation
       configuration: autonegotiation=on broadcast=yes driver=bnx2 driverversion=2.1.6 duplex=half firmware=bc 1.9.6 latency=64 link=no mingnt=64 multicast=yes port=twisted pair
       resources: irq:19 memory:fa000000-fbffffff memory:80200000-802007ff

lsmod:
Module Size Used by
psmouse 73882 0
serio_raw 13166 0
hpilo 17399 0
usbhid 47198 0
ipmi_si 53548 0
hid 95463 1 usbhid
radeon 1015949 1
ipmi_msghandler 45838 1 ipmi_si
ttm 76805 1 radeon
drm_kms_helper 42558 1 radeon
i50...


I confirm that applying the fix from comment #46 fixed our issue too (see my previous comment).

Andy Whitcroft (apw) on 2011-12-16
Changed in udev (Ubuntu Precise):
status: Invalid → In Progress
importance: Undecided → Medium
Changed in udev (Ubuntu Oneiric):
assignee: Canonical Foundations Team (canonical-foundations) → Andy Whitcroft (apw)
importance: Undecided → Medium
status: Confirmed → In Progress
Changed in linux (Ubuntu Precise):
assignee: nobody → Andy Whitcroft (apw)
importance: High → Medium
status: Confirmed → In Progress
Changed in linux (Ubuntu Oneiric):
assignee: Canonical Kernel Team (canonical-kernel-team) → Andy Whitcroft (apw)
importance: High → Medium
status: Confirmed → In Progress
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package udev - 175-0ubuntu3

---------------
udev (175-0ubuntu3) precise; urgency=low

  [ Andy Whitcroft ]
  * debian/udev.initramfs-bottom: ignore timeout errors from udevadm we
    expect (and indeed requested) in certain failure modes. If we do not we
    will bail out early and not transfer /dev into /root which is always
    fatal leading to unbootable machines. (LP: #818177)
  * avoid-exit-deadlock-for-timely-events: avoid deadlock when exiting
    by continuing to handle events with timeliness requirements.
    The timeliness requirement will be violated if we ignore them which
    is highly undesirable. Also these events are typically dependant
    events and may well block the events we are waiting on leading to
    boot delays and uninitialised devices. (LP: #842560)
  * debian/udev.initramfs-bottom: increase the client-side timeout to
    better cope with potential timeout extension issues in udev. We very
    much would prefer udev to time itself out and guarantee to have
    completed than take action ourselves. Very worst case the timeout may
    be doubled from the default of 60s so increase ours accordingly. Note,
    we should only ever trip this timeout when we are already in severe
    trouble. (LP: #818177)
 -- Steve Langasek <email address hidden> Fri, 16 Dec 2011 11:15:39 -0800

Changed in udev (Ubuntu Precise):
status: In Progress → Fix Released
Steve Langasek (vorlon) on 2011-12-17
description: updated
Steve Langasek (vorlon) on 2011-12-17
description: updated
Mugur (mugurd) wrote :

I am in a similar situation. We have plenty of HP ProLiant 380 G5 servers and a few Dell PowerEdge 1950s. On the HPs we did a fresh install of 11.10 server, replacing Debian Etch. The Ethernet cards were not recognised, but the rmmod bnx2, modprobe bnx2 trick resolved the problem. On the first PowerEdge 1950, a do-release-upgrade cycle from 10.10 to 11.10 worked pretty well, except that one Ethernet NIC did not work (eth1 was working and eth0 was not); the same rmmod bnx2, modprobe bnx2 trick resolved the problem there. I had been putting off the upgrade of the second PowerEdge 1950 because our Jira/Confluence+svn was running on it. Its hardware is identical to the first. After upgrading from 10.10 to 11.04 everything was fine (incl. Ethernet). After the 11.04 to 11.10 upgrade, the system does not boot; it is in the same situation, and the message

udevd[107]: '/sbin/modprobe -bv pci:v000014E4d0000164Csv00001028sd000001B2bc02sc00i00' [172] terminated by signal 9 (Killed)

is on screen.
I am on kernel 3.0.0-14-server

That was yesterday. Today I booted the system from a live CD, mounted the file system, and added the "blacklist bnx2" line to /etc/modprobe.d/blacklist.conf, but had no luck on the first reboot.

No offense, and I understand that the developers work really hard, but the new version of udev was released only to "Precise". Unfortunately, ordinary users such as me are on "Oneiric". I will try to install udev by hand and report back.
Thanks for everything.
Happy new year.
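
Note on the udevd message quoted above: the argument passed to modprobe is a PCI modalias string, which encodes the device that the module was being loaded for. The following is a minimal Python sketch, not part of the original report, that splits such a string into its fields. The function name parse_pci_modalias and the field names are chosen here for illustration only.

```python
import re

# Layout of a PCI modalias string:
# pci:v<vendor>d<device>sv<subvendor>sd<subdevice>bc<class>sc<subclass>i<prog-if>
# The IDs are 8 hex digits each; class, subclass and prog-if are 2 hex digits.
MODALIAS_RE = re.compile(
    r"^pci:v(?P<vendor>[0-9A-F]{8})d(?P<device>[0-9A-F]{8})"
    r"sv(?P<subvendor>[0-9A-F]{8})sd(?P<subdevice>[0-9A-F]{8})"
    r"bc(?P<base_class>[0-9A-F]{2})sc(?P<sub_class>[0-9A-F]{2})"
    r"i(?P<prog_if>[0-9A-F]{2})$"
)


def parse_pci_modalias(alias):
    """Return the modalias fields as integers (hypothetical helper)."""
    match = MODALIAS_RE.match(alias)
    if match is None:
        raise ValueError("not a fully specified PCI modalias: %r" % alias)
    return {name: int(value, 16) for name, value in match.groupdict().items()}


# The string from the udevd message in this bug report.
fields = parse_pci_modalias(
    "pci:v000014E4d0000164Csv00001028sd000001B2bc02sc00i00"
)
print("vendor:device = {vendor:04x}:{device:04x}".format(**fields))
```

For the string quoted in this report this prints vendor:device = 14e4:164c. Vendor 14e4 is Broadcom, subsystem vendor 1028 is Dell, and class 02 with subclass 00 denotes an Ethernet controller, which matches the bnx2 NICs discussed in this thread.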

Mugur (mugurd) wrote :

Manually installing the packages from precise solved the problem on oneiric. I installed udev 175 (which depends on newer versions of libacl and libudev) with "dpkg -i PACKAGE". As you can tell, I am not a system admin and not a Linux pro. Furthermore, our small server farm sits on our intranet, and I try to maintain a mirror repo for this configuration. I apologise for the last paragraph of my previous comment. But being unable to see the newest package in the offline repo, and having to download the packages and their possible dependencies by hand, was not elegant.

Again happy new year.

tags: removed: kernel-da-key kernel-key

Hello Christoph, or anyone else affected,

Accepted udev into oneiric-proposed. The package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in udev (Ubuntu Oneiric):
status: In Progress → Fix Committed
tags: added: verification-needed
Max Lapshin (max-maxidoors) wrote :

Updating to udev 173 from oneiric-proposed really helped me. Thanks!

Martin Pitt (pitti) on 2012-01-19
tags: added: verification-done
removed: verification-needed
Kiall Mac Innes (kiall) wrote :

I can confirm udev 173 from oneiric-proposed has fixed the issue. Thanks :)

Chris Van Hoof (vanhoof) on 2012-01-22
Changed in oem-priority:
status: Confirmed → In Progress
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package udev - 173-0ubuntu4.1

---------------
udev (173-0ubuntu4.1) oneiric-proposed; urgency=low

  [ Andy Whitcroft ]
  * debian/udev.initramfs-bottom: ignore timeout errors from udevadm we
    expect (and indeed requested) in certain failure modes. If we do not we
    will bail out early and not transfer /dev into /root which is always
    fatal leading to unbootable machines. (LP: #818177)
  * avoid-exit-deadlock-for-timely-events: avoid deadlock when exiting
    by continuing to handle events with timeliness requirements.
    The timeliness requirement will be violated if we ignore them which
    is highly undesirable. Also these events are typically dependant
    events and may well block the events we are waiting on leading to
    boot delays and uninitialised devices. (LP: #842560)
  * debian/udev.initramfs-bottom: increase the client-side timeout to
    better cope with potential timeout extension issues in udev. We very
    much would prefer udev to time itself out and guarantee to have
    completed than take action ourselves. Very worst case the timeout may
    be doubled from the default of 60s so increase ours accordingly. Note,
    we should only ever trip this timeout when we are already in severe
    trouble. (LP: #818177)
 -- Steve Langasek <email address hidden> Fri, 16 Dec 2011 14:10:02 -0800

Changed in udev (Ubuntu Oneiric):
status: Fix Committed → Fix Released
Changed in oem-priority:
status: In Progress → Fix Released
dino99 (9d9) on 2013-05-18
Changed in linux (Ubuntu Oneiric):
status: In Progress → Invalid
Changed in linux (Ubuntu Precise):
status: In Progress → Fix Released
Changed in linux (Ubuntu):
status: In Progress → Fix Released
dino99 (9d9) wrote :

That issue seems fixed since lp:494052

Jaap Hoetmer (jaap.hoetmer) wrote :

Hi.

FYI This exact problem now appears to have returned in Ubuntu 14.04. I have a server that was upgraded from 12.04 to 14.04 and now refuses to start the network, indicating

Can't load firmware file bnx2/bnx2-mips-06-6.2.3.fw

The network interface is a Broadcom BCM5708.

I am going to swap it for an Intel NIC.

Gus Hoppes (g-style) wrote :

Hi. I just did a do-release-upgrade from Ubuntu Server 12.04 to 14.04, just as Jaap did, and I got the same issue. The network card will not start. It's on a Dell PowerEdge 2950 with a static address on eth0 only (dual NICs).

RTNETLINK answers: no such process
[61709.025401] bnx2: can't load firmware file bnx2/bnx2-mips-06-6.2.3.fw
no such file or directory

Is there a fix I can try?
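
Note on the 14.04 reports above: since the kernel reports "no such file or directory", one thing that can be checked is whether the named firmware file exists in the directories the kernel searches. The following is a hedged Python sketch, not part of the original thread. The search order is assumed to be the kernel's default for direct firmware loading, the helper names are hypothetical, and the check only looks at the mounted root filesystem, not at the contents of the initramfs.

```python
import os


def firmware_search_dirs(kernel_release, root="/"):
    """Directories searched for firmware, in order (assumed kernel default)."""
    base = os.path.join(root, "lib", "firmware")
    return [
        os.path.join(base, "updates", kernel_release),
        os.path.join(base, "updates"),
        os.path.join(base, kernel_release),
        base,
    ]


def find_firmware(name, kernel_release, root="/"):
    """Return the first existing path for the firmware file, or None."""
    for directory in firmware_search_dirs(kernel_release, root):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # File name taken from the error message in the comments above.
    wanted = "bnx2/bnx2-mips-06-6.2.3.fw"
    release = os.uname().release
    print(find_firmware(wanted, release) or "not found: " + wanted)
```

If the file is reported as not found, the firmware is missing from the installed system. If it is found, the copy inside the initramfs may still be missing or stale, which this sketch does not check.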

Kai-Heng Feng (kaihengfeng) wrote :

@Gus,

Please file a new bug report.
