10.04 beta1 iSCSI-on-root fails on boot

Bug #546964 reported by sub.mesa on 2010-03-25
This bug affects 3 people
Affects: initramfs-tools (Ubuntu)
Importance: High
Assigned to: Colin Watson

Bug Description

Binary package hint: open-iscsi

This is a follow-up to bug 546944, which focuses on issues with the installer. This bug covers boot problems after using the installer CD to install to the iSCSI volume. After installation, the vmlinuz and initrd.img from the iSCSI install are copied to the server, and pxelinux successfully boots that kernel and initrd image.

However, during boot the initramfs fails to detect my network interface eth0:

(..)
Begin: Loading essential drivers...
Done.
Begin: Running /scripts/init-premount...
Done.
Begin: Mounting root file system...
Begin: Running /scripts/local-top ...
ipconfig: eth0: SIOCGIFINDEX: No such device
ipconfig: no devices to configure
/scripts/local-top/iscsi: .: line 421: can't open /tmp/net-eth0.conf
[ 1.63xxxxx] r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
[ 1.63xxxxx] r8169 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[ 1.63xxxxx] eth0: RTL8168d/8111d a ..., <mac>, XID .. IRQ 27
(..)
Begin: Retrying network config
ipconfig: /sys/class/net/eth0: SIOCGIFINDEX: No such device
ipconfig: no devices to configure
/scripts/local-top/iscsi: .: line 421: Can't open /tmp/net-/sys/class/net/eth0.conf
(..)

The sequence between the last two (..) is repeated in a loop until it drops to Busybox. Please note the "/tmp/net-/" directory in the last error message.

When checking the /sys/class/net/eth0/ directory within Busybox, I can see that it has a symlink "device" pointing to "../../../0000:01:00.0", the same PCI address as in the log pasted above. Modprobe also appears to have the realtek modules. But ifconfig doesn't yield any output.

Since I'm doing this iSCSI install on a new system, I'm not sure whether this is a regression since 9.10. But the ethernet worked fine when installing over the network and fetching repositories from the internet during installation.

It appears to me that the modules for my network interfaces are loaded dynamically, but perhaps not in time? Shortly after the first error messages, the kernel detects my eth0 interface. At least it does in the logs, since ifconfig doesn't show it.

So it appears the iscsi scripts have some issue that prevents them from connecting to my iSCSI server, as I can see no trace of a connection attempt; perhaps a simple usleep() here and there would do wonders? ;-)

sub.mesa (sub-mesa) wrote :

I just tested again with a server nightly, but the same problem is still present as of the April 5th, 2010 lucid nightly.

This time I tested within VirtualBox, which uses the pcnet ethernet driver; I hope the screenshot provides a clue as to what's going wrong here.

jkao (jmk17) wrote :

I'm encountering the same issue with a Broadcom network card using the tg3 driver. Just as reported above, the installation onto iscsi worked without issue, but the iscsi boot process in the initrd seems to go into a loop doing network configuration.

In my case, I also have encrypted swap configured. After looping for some time, the system attempts to drop me into busybox and prompts me for the password, but then hangs after attempting mountall.

Another thing to note is that I have /boot and / on a local USB disk and am using iscsi for /usr, /var, and /opt only.

jkao (jmk17) wrote :

After playing with this some more, I managed to find a workaround.

The clue is the log line:

ipconfig: /sys/class/net/eth0: SIOCGIFINDEX: No such device

This indicates that the DEVICE variable is somehow getting changed from eth0 (as set in initramfs.conf) to /sys/class/net/eth0, which is garbage.

I haven't worked out exactly where this goes wrong, but if you set your grub kernel command line to explicitly pass in ip like:

ip=:::::eth0:dhcp

This seems to take a different branch in configure_networks() that correctly configures the interface.

Leaving ip unset or doing ip=dhcp results in the reported failure.
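
For readers unfamiliar with the syntax, here is a minimal Python sketch of how the workaround value breaks down. The field order follows the kernel's nfsroot documentation (ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>). The function name parse_ip_param and the dictionary keys are hypothetical, chosen for illustration only; the initramfs scripts do this parsing in shell, and their actual code is not shown in this report.

```python
from typing import Dict

# Field order from the kernel's nfsroot documentation:
# ip=<client-ip>:<server-ip>:<gw-ip>:<netmask>:<hostname>:<device>:<autoconf>
# The key names below are illustrative, not taken from initramfs-tools.
IP_FIELDS = (
    "client_ip", "server_ip", "gateway", "netmask",
    "hostname", "device", "autoconf",
)


def parse_ip_param(value: str) -> Dict[str, str]:
    """Split an ip= value into named fields; missing fields become ''."""
    if ":" not in value:
        # Short form such as ip=dhcp: only the autoconf method is given,
        # so no device name is supplied at all.
        return {**dict.fromkeys(IP_FIELDS, ""), "autoconf": value}
    parts = value.split(":")
    # Pad short values so every field has an entry.
    parts += [""] * (len(IP_FIELDS) - len(parts))
    return dict(zip(IP_FIELDS, parts))


if __name__ == "__main__":
    print(parse_ip_param(":::::eth0:dhcp"))
    print(parse_ip_param("dhcp"))
```

With ip=:::::eth0:dhcp the device field is explicitly eth0, whereas ip=dhcp leaves the device empty, so the scripts have to pick the interface some other way. That would be consistent with the two forms taking different code paths, though the thread does not confirm exactly where they diverge.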

Colin Watson (cjwatson) wrote :

Whoops! Thanks for the analysis - I'll get this fixed for final.

affects: open-iscsi (Ubuntu) → initramfs-tools (Ubuntu)
Changed in initramfs-tools (Ubuntu):
assignee: nobody → Colin Watson (cjwatson)
importance: Undecided → High
status: New → In Progress
status: In Progress → Fix Committed

Great news!
Thanks Colin!

Will try out a nightly as soon as possible.

Kind regards,
Jason Edwards


Launchpad Janitor (janitor) wrote :

This bug was fixed in the package initramfs-tools - 0.92bubuntu75

---------------
initramfs-tools (0.92bubuntu75) lucid; urgency=low

  * When specifying a network interface by MAC address, set DEVICE to (e.g.)
    eth0 rather than /sys/class/net/eth0 (LP: #546964).
 -- Colin Watson <email address hidden> Wed, 21 Apr 2010 12:40:42 +0100
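
The effect of this change can be illustrated with a small Python sketch. It is not the actual patch (the initramfs scripts are shell, and the diff is not shown in this report); the helper names normalize_device and net_conf_path are hypothetical. The sketch assumes only what the boot log shows: the config file path is built as /tmp/net-<DEVICE>.conf, so a sysfs path in DEVICE yields the bogus /tmp/net-/sys/class/net/eth0.conf.

```python
import posixpath


def normalize_device(device: str) -> str:
    """Return the bare interface name for a sysfs path or a plain name."""
    # "/sys/class/net/eth0" -> "eth0"; "eth0" is returned unchanged.
    return posixpath.basename(device.rstrip("/"))


def net_conf_path(device: str) -> str:
    """Build the per-interface config path in the form seen in the boot log."""
    return f"/tmp/net-{device}.conf"


if __name__ == "__main__":
    broken = "/sys/class/net/eth0"
    print(net_conf_path(broken))                    # path from the failing boot
    print(net_conf_path(normalize_device(broken)))  # path once DEVICE is a bare name
```

Reducing DEVICE to the bare interface name gives both ipconfig and the config path the value they expect, which matches the changelog's description of setting DEVICE to eth0 rather than /sys/class/net/eth0.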

Changed in initramfs-tools (Ubuntu):
status: Fix Committed → Fix Released

sub.mesa (sub-mesa) wrote :

Confirmed fixed. I just downloaded the new RC that was released today. It still had the same bug, but by setting ip=:::::eth0:dhcp I managed to boot. After I updated using apt-get update && apt-get dist-upgrade, the initramfs-tools package was updated and wrote a new initrd.img. That one also boots with the ip=bootp setting that I was using originally.

So this bug seems to be fixed, though the fix is not included in the RC yet (use a nightly or wait for the final release).
