netcfg/choose_interface=auto fails to find the right interface

Reported by Steve Atwell on 2011-02-04
This bug affects 7 people
Affects: netcfg (Ubuntu)
Importance: Medium
Assigned to: Unassigned

Bug Description

Binary package hint: netcfg

Under some circumstances, netcfg cannot find the right interface to run dhclient on when netcfg/choose_interface is set to auto. As far as I can tell, choose_interface=auto uses ethtool to find the lowest-numbered interface that reports a link, then runs dhclient on that interface. If no interface reports a link, it tries only eth0.
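The selection behaviour described above can be sketched as follows. This is only an illustration of the reporter's description, not netcfg's actual code (netcfg is written in C); the function name `choose_interface_auto` and the `link_state` mapping are hypothetical.

```python
import re
from typing import Mapping, Tuple


def _iface_key(name: str) -> Tuple[str, int]:
    # Sort "eth2" before "eth10" by comparing the trailing number numerically.
    m = re.fullmatch(r"(\D*)(\d+)", name)
    return (m.group(1), int(m.group(2))) if m else (name, -1)


def choose_interface_auto(link_state: Mapping[str, bool],
                          fallback: str = "eth0") -> str:
    """Return the lowest-numbered interface that reports link.

    link_state maps interface name -> whether link was detected
    (hypothetical input standing in for the ethtool check).
    If nothing reports link, fall back to eth0, as the report describes.
    """
    for name in sorted(link_state, key=_iface_key):
        if link_state[name]:
            return name
    return fallback
```

With every interface reporting no link, this always returns `eth0`, whichever NIC that name happens to be bound to.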

I'm hitting a problem on a number of servers that have one or two Broadcom BCM5708 interfaces *and* two Intel gigabit interfaces. If the network connection is plugged in to the BCM5708, the install will often fail to find a network with netcfg/choose_interface=auto.

The problem is that the BCM5708 doesn't report link up until you try to send traffic over it. So none of the interfaces on the server report having a link, and netcfg tries dhcp on just eth0. Depending on the order the network modules have been loaded, eth0 may be the BCM5708 or it may be the Intel. If eth0 is the Intel, d-i attempts to run dhclient on the wrong interface, and it fails.

I think a reasonable solution to this problem would be for netcfg to attempt dhclient on all interfaces until one succeeds. Or perhaps it should do this only when no interfaces report a link. Either way, I don't think we can rely entirely on link status, because not all NICs report it correctly.
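A minimal sketch of the proposed fallback, assuming two hypothetical callables: `has_link(name)` for the link check and `try_dhcp(name)` for a DHCP attempt that returns True on success. Neither is a real netcfg function. Interfaces that report link are tried first, but a negative link report does not exclude an interface.

```python
from typing import Callable, Iterable, Optional


def choose_by_dhcp(interfaces: Iterable[str],
                   has_link: Callable[[str], bool],
                   try_dhcp: Callable[[str], bool]) -> Optional[str]:
    """Try DHCP on each interface until one succeeds.

    Interfaces reporting link go first; the rest follow in their
    original order. Returns the first interface where DHCP succeeds,
    or None if every attempt fails.
    """
    names = list(interfaces)
    link = {name: has_link(name) for name in names}
    # sorted() is stable, so ties keep their original order;
    # False sorts before True, so linked interfaces come first.
    ordered = sorted(names, key=lambda n: not link[n])
    for name in ordered:
        if try_dhcp(name):
            return name
    return None
```

The cost of this approach is install time: each failed attempt waits for the DHCP timeout before moving on to the next interface.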

Would you be able to retry this with a current natty installation image
(netboot links at http://cdimage.ubuntu.com/netboot/natty/)? netcfg's
link-detection behaviour was substantially changed upstream in version
1.60, and I would be interested to know if this addresses your problem.

Steve Atwell (satwell) wrote :

Natty doesn't seem to work any better, unfortunately.

This is on a Dell PowerEdge 2950 with two BCM5708 onboard NICs and a dual-port Intel 82571EB expansion card. PCI IDs of the network controllers:

~ # lspci -n | grep ' 0200:'
05:00.0 0200: 14e4:164c (rev 11)
09:00.0 0200: 14e4:164c (rev 11)
0c:00.0 0200: 8086:105e (rev 06)
0c:00.1 0200: 8086:105e (rev 06)

~ # uname -rvm
2.6.38-3-generic #30-Ubuntu SMP Thu Feb 10 00:33:26 UTC 2011 x86_64

And the relevant bits from the installer syslog:

Feb 15 01:33:07 netcfg[1590]: INFO: Starting netcfg v.1.60ubuntu2 (built 20110208-1933)
Feb 15 01:33:07 kernel: [ 17.470248] e1000e 0000:0c:00.0: irq 105 for MSI/MSI-X
Feb 15 01:33:07 kernel: [ 17.530073] e1000e 0000:0c:00.0: irq 105 for MSI/MSI-X
Feb 15 01:33:07 kernel: [ 17.530634] ADDRCONF(NETDEV_UP): eth0: link is not ready
Feb 15 01:33:08 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:08 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:08 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:08 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:09 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:09 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:09 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:09 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:10 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:10 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:10 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:10 netcfg[1590]: INFO: ethtool-lite: eth0 is disconnected.
Feb 15 01:33:10 netcfg[1590]: INFO: found no link on interface eth0.
Feb 15 01:33:10 netcfg[1590]: INFO: eth0 is not a wireless interface. Continuing.
Feb 15 01:33:11 kernel: [ 21.060277] e1000e 0000:0c:00.1: irq 106 for MSI/MSI-X
Feb 15 01:33:11 kernel: [ 21.120075] e1000e 0000:0c:00.1: irq 106 for MSI/MSI-X
Feb 15 01:33:11 kernel: [ 21.120612] ADDRCONF(NETDEV_UP): eth1: link is not ready
Feb 15 01:33:11 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:11 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:12 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:12 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:12 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:12 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:13 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:13 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:13 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:13 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:14 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:14 netcfg[1590]: INFO: ethtool-lite: eth1 is disconnected.
Feb 15 01:33:14 netcfg[1590]: INFO: found no link on interface eth1.
Feb 15 01:33:14 netcfg[1590]: INFO: eth1 is not a wireless interface. Continuing.
Feb 15 01:33:14 kernel: [ 24.491652] bnx2 0000:09:00.0: irq 107 for MSI/MSI-X
Feb 15 ...

Changed in netcfg (Ubuntu):
assignee: nobody → Canonical Foundations Team (canonical-foundations)
Robbie Williamson (robbiew) wrote :

I'm thinking that if netcfg properly supported MAC address designation, we wouldn't need to hack the "=auto" approach to handle NICs that don't report link status. Would the workaround in https://bugs.launchpad.net/ubuntu/+source/netcfg/+bug/56679 help until we resolve this?
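For illustration, selecting an interface by MAC address could look roughly like this. It is a sketch, not the workaround from bug 56679: it assumes a Linux system where each interface exposes its hardware address in `/sys/class/net/<iface>/address`, and the function name `interface_for_mac` is hypothetical.

```python
import os
from typing import Optional


def interface_for_mac(mac: str,
                      sysfs_root: str = "/sys/class/net") -> Optional[str]:
    """Return the name of the interface whose MAC address matches.

    Reads <sysfs_root>/<iface>/address for each interface and compares
    case-insensitively. Returns None if no interface matches.
    """
    wanted = mac.strip().lower()
    for name in sorted(os.listdir(sysfs_root)):
        path = os.path.join(sysfs_root, name, "address")
        try:
            with open(path) as fh:
                if fh.read().strip().lower() == wanted:
                    return name
        except OSError:
            # Entry has no readable address file; skip it.
            continue
    return None
```

Because the MAC address is fixed per port, the result does not depend on the order in which the network modules were loaded.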

Robbie Williamson (robbiew) wrote :

BTW, the Debian bug to address the solution proposed in the bug above (with a patch provided) is http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=615600.

Colin Watson (cjwatson) wrote :

I'm told out of band that the fix for bug 56679 should be sufficient to resolve this in practice. The link detection bug hasn't gone away, so I'm going to leave this bug open, but I'm unassigning it since it doesn't sound as though it needs to be treated with any particular priority. Leave a comment if I'm wrong ...

Changed in netcfg (Ubuntu):
status: New → Confirmed
importance: Undecided → Medium
assignee: Canonical Foundations Team (canonical-foundations) → nobody
Adam Koczur (aerradon) wrote :

Colin, with all due respect, I think this should be treated with higher priority. I am currently trying to deploy a batch of new servers, and because of this issue the process cannot be fully automated: it keeps asking me to choose a network interface, no matter what value is assigned to 'd-i netcfg/choose_interface select'. It might not be a problem for someone who deploys one box a year. Not to mention that the Red Hat guy sitting next to me keeps laughing and saying how professional and enterprise-grade Ubuntu is. I think the Debian installer should be fixed / finished properly at some point, as the disk partitioner is broken, too. I know that is a separate issue, but try to preseed a more complex partition scheme. This is actually edging on depressing...
