boot order wrong for iscsi

Bug #227848 reported by Jesper Krogh on 2008-05-07
Affects                    Importance   Assigned to
Release Notes for Ubuntu   Undecided    Colin Watson
open-iscsi (Ubuntu)        High         Unassigned
Hardy                      High         Unassigned
Intrepid                   High         Unassigned

Bug Description

Binary package hint: open-iscsi

The boot order is still wrong in some cases for iscsi.

iscsi probably waits for networking to start (due to upstart and dependencies), but it does not wait long enough, so it starts before the link is available.

Trace from bootup:
root@hest:~# dmesg | perl -ane 'print if $_ =~ /(eth|bond|iscsi)/i; '
[ 145.733504] e1000: eth0: e1000_probe: Intel(R) PRO/1000 Network Connection
[ 146.052622] e1000: eth1: e1000_probe: Intel(R) PRO/1000 Network Connection
[ 146.372001] e1000: eth2: e1000_probe: Intel(R) PRO/1000 Network Connection
[ 146.701361] e1000: eth3: e1000_probe: Intel(R) PRO/1000 Network Connection
[ 154.506735] Driver 'sd' needs updating - please use bus_type methods
[ 154.506382] Driver 'sr' needs updating - please use bus_type methods
[ 163.088171] eth4: NIU Ethernet 00:14:4f:bb:23:a8
[ 163.088176] eth4: Port type[XMAC] mode[10G:FIBER] XCVR[XPCS] phy[xgf]
[ 164.305607] eth5: NIU Ethernet 00:14:4f:bb:23:a9
[ 164.305611] eth5: Port type[XMAC] mode[10G:FIBER] XCVR[XPCS] phy[xgf]
[ 165.450618] Loading iSCSI transport class v2.0-724.
[ 165.461844] iscsi: registered transport (tcp)
[ 165.505437] iscsi: registered transport (iser)
[ 169.129554] Ethernet Channel Bonding Driver: v3.2.3 (December 6, 2007)
[ 169.129563] bonding: MII link monitoring set to 100 ms
[ 169.297933] e1000: eth0: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 169.321894] bonding: bond0: enslaving eth0 as a backup interface with an up link.
[ 169.447624] e1000: eth1: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 169.461239] bonding: bond0: enslaving eth1 as a backup interface with an up link.
[ 169.597327] e1000: eth2: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 169.610971] bonding: bond0: enslaving eth2 as a backup interface with an up link.
[ 169.747033] e1000: eth3: e1000_watchdog: NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX/TX
[ 169.760631] bonding: bond0: enslaving eth3 as a backup interface with an up link.
[ 170.102554] ADDRCONF(NETDEV_UP): bond0: link is not ready
[ 177.340214] ADDRCONF(NETDEV_CHANGE): bond0: link becomes ready
[ 188.001939] bond0: no IPv6 routers present

So iscsi tries to start at 165 seconds, but the link is not ready until 177.
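To check for the same race on another machine, the gap can be read out of the kernel log. The following is only a sketch: the function name iscsi_link_gap is made up for illustration, and it assumes the log contains the same two messages as the trace above ("Loading iSCSI transport class" and "<interface>: link becomes ready").

```shell
# Hypothetical helper: reads dmesg-style lines on stdin and prints the time
# the iSCSI transport class was loaded, the time the given interface became
# ready, and the gap between the two. $1 is the interface name (e.g. bond0).
iscsi_link_gap() {
    awk -v ifname="$1" '
        {
            # Extract the number from the leading "[ 165.450618]" timestamp.
            ts = $0
            sub(/^\[ */, "", ts)
            sub(/\].*/, "", ts)
        }
        /Loading iSCSI transport class/ && iscsi == "" { iscsi = ts }
        /link becomes ready/ && index($0, ifname ":") && ready == "" { ready = ts }
        END {
            # Fail if either message was not found in the input.
            if (iscsi == "" || ready == "") exit 1
            printf "iscsi=%.1f link=%.1f gap=%.1f\n", iscsi, ready, ready - iscsi
        }'
}

# Example: dmesg | iscsi_link_gap bond0
```

For the trace in this report, the gap works out to roughly 11.9 seconds during which iscsi has no path to its target.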

Soren Hansen (soren) wrote :

I'm not too hot on the idea of changing the ordering.
If you change node.conn[0].timeo.login_timeout to a higher value (60 seconds should be more than enough) in /etc/iscsi/iscsid.conf, does that solve it for you?
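For anyone who wants to try this, here is a minimal sketch of making that change from a script. The helper name set_login_timeout is hypothetical, and it assumes the setting is written as "node.conn[0].timeo.login_timeout = <seconds>" on a line of its own. Try it on a copy of the file first.

```shell
# Hypothetical helper: set node.conn[0].timeo.login_timeout in an
# iscsid.conf-style file. $1 is the file, $2 the timeout in seconds.
set_login_timeout() {
    conf="$1"
    seconds="$2"
    key='node.conn[0].timeo.login_timeout'
    # Brackets and dots are escaped so the key is matched literally.
    pattern='^[ 	]*node\.conn\[0\]\.timeo\.login_timeout[ 	]*=.*$'
    if grep -q "$pattern" "$conf"; then
        # Rewrite the existing (uncommented) setting in place.
        sed "s/$pattern/$key = $seconds/" "$conf" > "$conf.new" &&
            mv "$conf.new" "$conf"
    else
        # No active setting found: append one.
        printf '%s = %s\n' "$key" "$seconds" >> "$conf"
    fi
}

# Example (as root): set_login_timeout /etc/iscsi/iscsid.conf 60
```

Note that a longer login timeout only helps if the network comes up while iscsid is still retrying.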

Changed in open-iscsi:
status: New → Incomplete
Soren Hansen (soren) wrote :

Oh, can I see your /etc/network/interfaces, by the way?

Jesper Krogh (jesper) wrote :

Quite simple.

# The loopback network interface
auto lo
iface lo inet loopback

# The primary network interface
auto bond0
iface bond0 inet static
        slaves eth0 eth1 eth2 eth3
        address 10.194.133.23
        netmask 255.255.254.0
        gateway 10.194.133.254

One suggestion was to add "udevadm settle" to open-iscsi start().

But that only solves the problem partially, since the subsequent checkfs and mount are not done either.

In sum: iscsi is still broken in Hardy.

Soren Hansen (soren) wrote :

open-iscsi start already has a call to "udevadm settle", so that's clearly insufficient.

The problem, I believe, comes from the fact that open-iscsi is brought up (S25 in /etc/rcS.d) before networking (S40 in /etc/rcS.d).

It is my understanding that udev brings interfaces up as they are discovered, which means that "bare" interfaces (eth0, eth1, etc.) happen to be brought up before /etc/rcS.d/S25open-iscsi is run, in which case all is fine. However, if you depend on an interface that is not brought up by udev to reach the iSCSI target (such as a bonded interface, as is the case above), then open-iscsi indeed comes up before there is a path to its target, and fails.

I tried setting node.conn[0].timeo.login_timeout to a large value, as Soren suggested earlier, but it did not change anything for me.

Dewi (dewi) wrote :

I am having similar major problems using iSCSI effectively with Hardy Heron. I have changed the boot script order in rcS.d so iSCSI is now S44 instead of S25 (which starts iscsi AFTER networking), and I now have iSCSI drives available to the system. However, it is proving difficult to get the boot order right so that initiation and mounting of iSCSI drives is automatic and effective as part of the boot process.
For example, if we want to hold service data on SAN iSCSI drives (e.g. SAMBA file shares, LDAP or MySQL databases), we need the iSCSI drives fully mounted and available when the service startup scripts kick in. With the default iSCSI setup, this does not happen.
I am continuing to investigate and experiment, but any further tips on making iSCSI usable as a system drive would be appreciated.

Dewi (dewi) wrote :

I have a fairly consistent boot process now that mounts iSCSI drives. Here are the changes I made to the default init scripts symlinks order:
rc0.d: changed K25open-iscsi to K44open-iscsi
rc6.d: changed K25open-iscsi to K44open-iscsi
rcS.d: changed S40networking to S23networking

In /etc/init.d, I added a "sleep 10" as the last statement of the "start" block of code in open-iscsi.

This delays the boot process a bit, but allows the iSCSI disks to initialize fully before the filesystem mounting kicks in.

I have also seen the wrong startup ordering, but I only moved open-iscsi from S25 to S47 in rcS.d. It must start _after_ networking, not before. Changing the networking order does not seem to be a good idea, IMHO.

Everything is working with this simple change.

I did the same as Christian Roessner, above: moved the open-iscsi link from S25 to S42 in rcS.d. This is the preferred method since, as he also pointed out, changing the network order is NOT a good idea.

Also, as Etienne Goyer stated, if you are using an "ordinary" interface (ethX), udev may load things in the right order. I'm using a bridged interface (br0), which udev doesn't bring up, making the change above necessary. Maybe that's the case with bonded interfaces as well.

But, in my opinion, it doesn't matter if one is using ordinary interfaces or not - open-iscsi MUST be started AFTER networking, since it depends on the network to do its job. It's a matter of logic....

So, the open-iscsi installation script should be changed to reflect the solutions found here - change its number from S25 to something above S40.
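For reference, the manual workaround described in the comments above amounts to renaming one symlink. This is a rough sketch only: move_start_link is a made-up helper name, and the sequence numbers are parameters rather than recommended values.

```shell
# Hypothetical helper: rename an rc start link so the script runs later.
# $1 = rc directory, $2 = script name, $3 = old sequence, $4 = new sequence.
move_start_link() {
    dir="$1"; name="$2"; old="$3"; new="$4"
    if [ -L "$dir/S$old$name" ]; then
        # mv renames the symlink itself; the init script it points to is untouched.
        mv "$dir/S$old$name" "$dir/S$new$name"
    else
        echo "no link $dir/S$old$name" >&2
        return 1
    fi
}

# Example (as root, on a test machine):
#   move_start_link /etc/rcS.d open-iscsi 25 45
```

A package upgrade may put the link back at its original position, so this is a local workaround and not a fix.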

Dustin Kirkland  (kirkland) wrote :

Patch attached.

:-Dustin

Changed in open-iscsi:
status: Incomplete → In Progress

The above solution is not entirely right, as the open-iscsi init script will be run after /etc/rcS.d/mountall.sh, which means that filesystems in /etc/fstab that reside on iSCSI will not be available when mountall.sh is run.

Debian have fixed two related bugs:

http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=423851
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=438542

The solution might be to sync from Debian. Package 2.0.869~rc4-1 is supposed to include modifications to the init script that would fix this problem (and the related problem of having the iSCSI initiator killed before the filesystems are unmounted). However, it requires filesystems on iSCSI to be marked with the _netdev option in fstab. I can test that early next week (unfortunately, I am not able to for the next few days).

Rick Clark (dendrobates) on 2008-10-09
Changed in open-iscsi:
importance: Undecided → Critical
milestone: none → ubuntu-8.10

The patch sent by Dustin already changes the open-iscsi script from S25 to S47, and (as we saw here) it seems that this is enough to mount any storage volume after the network is up.

However, the original open-iscsi script position (S25) is before S45 (waitnfs) and S46 (mountnfs-bootclean), and in a case where one uses ordinary interfaces (as said in two posts above), udev makes everything work and so any NFS filesystems get mounted. But now (with the patch), open-iscsi will load AFTER the NFS scripts.

Wouldn't it be better to move the open-iscsi script to a value between 40 and 45 (40 < open-iscsi < 45), BEFORE the NFS scripts, allowing NFS filesystems to be mounted over iSCSI volumes?

That's still too late for local iscsi filesystems.

:-Dustin

Dustin and I worked on the bug today. We were able to fix the bug with the following changes:

1. Add the _netdev option to filesystems on iSCSI targets in /etc/fstab. We cannot do that retroactively for people who already have fstab entries for file systems on iSCSI targets, so this will need to be documented in the release notes for open-iscsi (README.Debian). I filed bug #284107 to make sure the installer sets the option from now on.

2. Change the boot order of open-iscsi from 25 to anything above 40, as networking is started at 40. Upstream Debian went with 45.

3. Add the following bit of code to the open-iscsi init script, at the end of the start() function:

----------------------------------------------------------------------
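        log_daemon_msg "Mounting network filesystems"
        MOUNT_RESULT=1
        # Mount every fstab entry that carries the _netdev option.
        # (The stray "break" was removed: there is no enclosing loop here.)
        if mount -a -O _netdev >/dev/null 2>&1; then
                MOUNT_RESULT=0
        fi
        log_end_msg $MOUNT_RESULT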
----------------------------------------------------------------------

This will ensure that the filesystems on iSCSI targets, marked with the _netdev option in fstab, get mounted.

I have tested the above successfully. I have also tested without the _netdev option, with other file systems marked with _netdev, with the iSCSI target unreachable, etc., and it seems to behave correctly.
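Since the fix depends on fstab entries carrying _netdev, it can help to list which mount points would be picked up by "mount -a -O _netdev". The following is only a sketch: list_netdev_mounts is a hypothetical name, and it assumes the usual whitespace-separated fstab layout with options in the fourth field.

```shell
# Hypothetical helper: print the mount points in an fstab-style file whose
# options field (column 4) contains _netdev. $1 is the file to read.
list_netdev_mounts() {
    awk '
        /^[ \t]*#/ || NF < 4 { next }    # skip comments and incomplete lines
        {
            # Compare each comma-separated option exactly, so that an
            # option merely containing "_netdev" as a substring is ignored.
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++)
                if (opts[i] == "_netdev") { print $2; break }
        }' "$1"
}

# Example: list_netdev_mounts /etc/fstab
```

Any iSCSI-backed mount point missing from the output would not be mounted by the added init script code.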

Also, the networking init script has a bit of code in check_network_file_systems() to ensure that no network file systems or block devices are currently in use before bringing the network down. I think it should also look for iSCSI targets.

I forgot to mention that the bit of code to add to the open-iscsi init script is from the Debian open-iscsi 2.0.869 package in unstable. So, we will get it for jaunty in the next sync anyway.

Dustin Kirkland  (kirkland) wrote :

I'm lowering this bug from Critical to High, and un-milestoning it for Intrepid release.

We're not going to be able to fix this comprehensively, and in a way that does not introduce regressions, in time for Intrepid. Instead, I will prepare some text for the Intrepid release notes to raise awareness of the issue.

:-Dustin

Changed in open-iscsi:
importance: Critical → High
milestone: ubuntu-8.10 → none
Dustin Kirkland  (kirkland) wrote :

Suggestion of text for the Intrepid release note (from Etienne Goyer):

File systems hosted on iSCSI targets may not be mounted automatically at
boot time, even if they have an entry in /etc/fstab, if a bridged or
bonded Ethernet interface is required to reach the iSCSI target. As a
work-around, you would have to restart the open-iscsi service and
manually mount the file system in question after system boot, once the
required network interface has been brought up. Systems equipped with
a plain Ethernet interface are not affected.

:-Dustin

Colin Watson (cjwatson) wrote :

Thanks, Dustin; I've included this text in the release notes, along with a link to this bug.

Changed in ubuntu-release-notes:
assignee: nobody → kamion
status: New → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package open-iscsi - 2.0.870.1-0ubuntu1

---------------
open-iscsi (2.0.870.1-0ubuntu1) jaunty; urgency=low

  * New upstream release:
   - Support 2.6.26/27 kernels (LP: #289470).
   - Fix iscsid shutdown (LP: #181188).
  * Merge from Debian. Remaining Ubuntu changes:
   - Note: Debian version isn't 870~rc3 but 869.2 which leads to a big
     .diff.gz file. Only files in debian/ have been considered for this merge
     since debian hasn't patched the source.
   - debian/control, debian/rules, debian/open-iscsi-udeb*:
     + Add open-iscsi-udeb.
   - debian/open-iscsi.dirs:
     + rename dirs to open-iscsi.dirs because of open-iscsi-udeb addition.
     + drop network/if-up.d/ directory since we're using symlinks instead.
     + utilities installed in /bin,/sbin rather than /usr/bin,/usr/sbin.
   - debian/open-iscsi.init:
     + utilities installed in /bin,/sbin rather than /usr/bin,/usr/sbin.
     + lsb log messages.
     + Don't generate initiatorname name (moved to postinst).
     + Support for being called from ifup/ifdown scripts.
     + Refactor start functions:
       - move daemon start to startdaemon function.
       - call udevadm settle rather then udevsettle (which doesn't do anything
         useful).
       - don't exit if the daemon is already running during sanitychecks.
         Instead check in startdaemon if the daemon needs to be started.
       - only start automatic targets if necessary.
       - add waitfordevices function: called during rcS, waits for all
         automatic targets to be ready. Iscsi targets are considered as
         local block devices - they don't need to be marked with _netdev in
         fstab. (LP: #227848)
       - start targets if not run from rcS.
     + Check status of iscsid daemon in addition to listing the iscsi sessions
       when status action is called.
     + Add iscsid to the list of processes that should not be killed by
       sendsigs during shutdown (LP: #192080).
     + Add starttargets, stoptargets and restarttargets to usage message.
   - debian/rules:
     + Install utilities /bin,/sbin rather than /usr/bin,/usr/sbin.
     + Start open-iscsi at S25 (waiting for devices created by ifupdown
       calls and before local filesystems are checked and mounted)
     + stop at S41 (after local filesystems are unmounted). Don't use
       umountiscsi.sh script from debian. (LP: #192080).
   - debian/initiatorname.iscsi, debian/open-iscsi.postinst:
     + Generate the random initiatorname during postinstall rather than in the
     init script.
     + Don't ship a default initiatorname.iscsi file.
   - debian/open-iscsi.postinst:
     + fix init script ordering on upgrades.
   - debian/open-iscsi.links:
     + symlink open-iscsi init script in if-up.d and if-down.d directory so
       that the iscsi subsystem is started/stopped when network interfaces
       are configured.
   - debian/open-iscsi.postrm:
     + delete iscsi configuration when the package is purged.
   - utils/iscsi_discovery: De-bashify iscsi_discovery.
   - usr/idbm.c: Fix build failure with new glibc. LP: #299843.
  * Dropped:
   - Exit without error if /sys is not available. Otherwise, it's ...


Changed in open-iscsi:
status: In Progress → Fix Released
Steve Langasek (vorlon) on 2009-01-18
Changed in open-iscsi:
importance: Undecided → High
milestone: none → ubuntu-8.04.3
status: New → Confirmed
Steve Langasek (vorlon) on 2009-01-26
Changed in open-iscsi:
assignee: nobody → mathiaz
Mathias Gug (mathiaz) on 2009-01-26
Changed in open-iscsi:
assignee: mathiaz → nobody
Giuliastro (gyesspam) wrote :

Is this going to be fixed in Hardy as well? Thanks in advance.

Colin Watson (cjwatson) wrote :

I wouldn't recommend including this in Hardy, since it's known to break wpasupplicant (see bug 44194) and we haven't even sorted that out in Jaunty yet.

Chris Puttick (cputtick) wrote :

Note: Hardy 8.04.2 2.6.24-18-virtual #1 SMP (running on KVM virtual machine with dual guest CPU)

/etc/network/interfaces (relevant part)

auto eth1
iface eth1 inet static
        address 192.168.20.50
        netmask 255.255.255.224

iSCSI auto connect and mount works with no modifications using the following:

In /etc/iscsi/iscsid.conf set

node.startup = automatic (manual is the default setting)

In /etc/iscsi/nodes/<iscsi target name>/<ip address>/default set

node.conn[0].startup = automatic (manual is the default setting)

Note that the target name and IP address would normally be the only entries in their respective directories, assuming you only have the one SAN target and volume. I guess if you have more you'd want to make the change in all of them :)

In /etc/fstab add the following line(s).

UUID=<UUID of your iscsi volume> /opt/ktdms/var/Documents/ auto _netdev 0 0

Note: to get the UUID, run

blkid /dev/sd<your iscsi volume and partition>

after you have connected your iscsi volume, manually or otherwise.
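Put together, the two steps above could be sketched as below. The helper name fstab_line_for_uuid and the device /dev/sdb1 are made up for illustration; the filesystem type "auto" and the _netdev option are taken from the fstab line in this comment.

```shell
# Hypothetical helper: print an fstab line for an iSCSI-backed filesystem.
# $1 = filesystem UUID, $2 = mount point.
fstab_line_for_uuid() {
    # fstab expects the form UUID=<value>, including the equals sign.
    printf 'UUID=%s %s auto _netdev 0 0\n' "$1" "$2"
}

# Example, assuming /dev/sdb1 is the iSCSI partition (adjust to your system):
#   uuid=$(blkid -s UUID -o value /dev/sdb1)
#   fstab_line_for_uuid "$uuid" /opt/ktdms/var/Documents/
```

The function only prints the line, so the output can be reviewed before it is appended to /etc/fstab by hand.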

Steve Langasek (vorlon) on 2009-06-17
Changed in open-iscsi (Ubuntu Intrepid):
status: In Progress → Won't Fix
Steve Langasek (vorlon) wrote :

The biggest blocker for getting this fixed in hardy isn't bug #44194, but rather Debian bug #383073 and Debian bug #383123 that were fixed in a post-hardy merge of sysvinit. Without that fix, there's no way to trigger mounting iscsi devices after they've been made available, which is the problem to be solved: the network interfaces are only *guaranteed* to be up after S40networking, and nothing in hardy will trigger mounting of non-network-filesystem devices after that point.

Fixing 44194 would let us cover all our bases for things such as iSCSI + wpa_supplicant, but that's not actually an interesting use case to the typical iSCSI user and not the problem being discussed here; fixing this bug without 44194 would not be a regression since /usr on iSCSI doesn't work today for anyone in hardy.

Steve Langasek (vorlon) wrote :

I've uploaded a backported fix for the open-iscsi part of this to hardy-proposed, waiting now for review.

Changed in open-iscsi (Ubuntu Hardy):
status: Confirmed → In Progress
Chris Puttick (cputtick) wrote :

Consider me confused on the basis of my earlier comment showing how to
configure Hardy so that iscsi volume mounts works consistently. It
ain't broke, why fix it?


Martin Pitt (pitti) wrote :

Accepted open-iscsi into hardy-proposed; the package will build now and be available in a few hours. Please test and give feedback here. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you in advance!

Changed in open-iscsi (Ubuntu Hardy):
status: In Progress → Fix Committed
tags: added: verification-needed

Chris Puttick wrote:
> Consider me confused on the basis of my earlier comment showing how to
> configure Hardy so that iscsi volume mounts works consistently. It
> ain't broke, why fix it?

That is because you are not accessing the iSCSI target through a bonded
Ethernet interface. With a plain Ethernet interface, it always works just
fine.

A quick test leads me to believe that open-iscsi 2.0.865-1ubuntu3.1 from hardy-proposed works as explained by Steve. That is, the block device appears after the bonded Ethernet interface is brought up. It is too late for the file system on the iSCSI target to be mounted from /etc/fstab, but at least we do not have to restart the open-iscsi service for the iSCSI block device to show up.

Steve Langasek (vorlon) on 2009-06-26
tags: added: verification-done
removed: verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package open-iscsi - 2.0.865-1ubuntu3.1

---------------
open-iscsi (2.0.865-1ubuntu3.1) hardy-proposed; urgency=low

  * Backport changes from jaunty to allow iscsi targets to be activated
    as network interfaces come on-line:
   - debian/open-iscsi.init:
     + Support for being called from ifup/ifdown scripts.
     + Refactor start functions:
       - move daemon start to startdaemon function.
       - don't exit if the daemon is already running during sanitychecks.
         Instead check in startdaemon if the daemon needs to be started.
       - only start automatic targets if necessary.
    LP: #227848.

 -- Steve Langasek <email address hidden> Wed, 17 Jun 2009 23:56:00 +0000

Changed in open-iscsi (Ubuntu Hardy):
status: Fix Committed → Fix Released