Failure to boot ephemeral image for Utopic Fast Installer deployment: no ID_PATH for iSCSI device any more

Bug #1391354 reported by Larry Michel on 2014-11-11
This bug affects 1 person
Affects           Status  Importance  Assigned to  Milestone
maas-images               Undecided   Unassigned
systemd (Ubuntu)          Undecided   Unassigned
  Utopic                  Undecided   Martin Pitt

Bug Description

I am running into issues with the latest daily ephemeral images for Utopic:

 Release    Architecture  Size      Nodes deployed  Last update
 14.10      i386          391.2 MB  0               Tue Nov 11 00:36:04 2014
 14.10      amd64         395.6 MB  0               Tue Nov 11 00:36:04 2014
 14.10      ppc64el       425.8 MB  1               Tue Nov 11 00:36:03 2014
 14.04 LTS  i386          373.3 MB  0               Tue Nov 11 00:36:04 2014
 14.04 LTS  ppc64el       408.5 MB  1               Tue Nov 11 00:36:04 2014
 14.04 LTS  amd64         380.5 MB  52              Tue Nov 11 00:36:03 2014
 12.04 LTS  amd64         517.5 MB  245             Tue Nov 11 00:36:05 2014
 12.04 LTS  i386          487.3 MB  0               Tue Nov 11 00:36:05 2014

I have tried installing two physical servers, moline and premier, as well as a ppc64el VM, huffman-vm10, and I get the same failure on each.

============================================================================

iscsistart: Logging into iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-ppc64el-generic-utopic-daily 10.245.0.10:3260,1
iscsistart: can not connect to iSCSI daemon (111)!
iscsistart: version 2.0-873
[ 17.810223] sd 1:0:0:1: [sdb] Write Protect is on
[ 17.811504] sd 1:0:0:1: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 17.828817] sdb: unknown partition table
[ 17.839360] sd 1:0:0:1: [sdb] Attached SCSI disk
iscsistart: initiator reported error (15 - session exists)
done.
[ 35.868265] random: nonblocking pool is initialized
Begin: Loading essential drivers ... done.
Begin: Running /scripts/init-premount ... done.
Begin: Mounting root file system ... Begin: Running /scripts/local-top ... IP-Config: eth0 hardware address 52:54:00:95:6b:54 mtu 1500 DHCP RARP
IP-Config: no response after 2 secs - giving up
IP-Config: eth0 hardware address 52:54:00:95:6b:54 mtu 1500 DHCP RARP
hostname huffman-vm-10 hostname huffman-vm-10 IP-Config: eth0 complete (dhcp from 10.245.0.10):
 address: 10.245.0.173 broadcast: 10.245.63.255 netmask: 255.255.192.0
 gateway: 10.245.0.1 dns0 : 10.245.0.10 dns1 : 0.0.0.0
 domain : oil
 rootserver: 10.245.0.10 rootpath:
 filename : pxelinux.0
iscsistart: Logging into iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-ppc64el-generic-utopic-daily 10.245.0.10:3260,1
iscsistart: version 2.0-873
iscsistart: Connection1:0 to [target: iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-ppc64el-generic-utopic-daily, portal: 10.245.0.10,3260] through [iface: default] is operational now
iscsistart: Logging into iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-ppc64el-generic-utopic-daily 10.245.0.10:3260,1
iscsistart: can not connect to iSCSI daemon (111)!
iscsistart: version 2.0-873
iscsistart: initiator reported error (15 - session exists)
done.
Gave up waiting for root device. Common problems:
 - Boot args (cat /proc/cmdline)
   - Check rootdelay= (did the system wait long enough?)
   - Check root= (did the system wait for the right device?)
 - Missing modules (cat /proc/modules; ls /dev)
ALERT! /dev/disk/by-path/ip-10.245.0.10:3260-iscsi-iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-ppc64el-generic-utopic-daily-lun-1 does not exist. Dropping to a shell!
[ 78.856173] hidraw: raw HID events driver (C) Jiri Kosina
[ 78.860069] usbcore: registered new interface driver usbhid
[ 78.860272] usbhid: USB HID core driver

BusyBox v1.22.1 (Ubuntu 1:1.22.0-8ubuntu1) built-in shell (ash)
Enter 'help' for a list of built-in commands.

(initramfs)

============================================================================
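
The initramfs message above suggests checking `root=` in `/proc/cmdline`. As a minimal sketch of that check, assuming a POSIX shell, the following extracts the `root=` value from a command line string and reports whether the named device exists. The function names `root_from_cmdline` and `check_root_device` are hypothetical and are not part of initramfs-tools.

```shell
#!/bin/sh
# Print the value of the last root= argument in a kernel command line string.
root_from_cmdline() {
    root=""
    for arg in $1; do
        case "$arg" in
            root=*) root="${arg#root=}" ;;
        esac
    done
    printf '%s\n' "$root"
}

# Report whether the device node or symlink named by root= exists.
check_root_device() {
    dev=$(root_from_cmdline "$1")
    if [ -n "$dev" ] && [ -e "$dev" ]; then
        echo "present: $dev"
    else
        echo "missing: ${dev:-<no root= argument>}"
        return 1
    fi
}

# Example with a literal command line; on a live system,
# pass "$(cat /proc/cmdline)" instead.
check_root_device "ro root=/dev/null overlayroot=tmpfs"
```

In the failing boot above, the same check would report the `/dev/disk/by-path/...` symlink as missing even though the iSCSI disk itself was attached.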

Larry Michel (lmic) on 2014-11-11
affects: curtin → maas
Larry Michel (lmic) wrote :

I saved the ephemeral images for utopic that are failing in case they are requested.

I am attaching screen capture from java console of one of the physical systems.

Julian Edwards (julian-edwards) wrote :

As mentioned on IRC, this is not a maas problem but an image problem. I'm not sure where to file bugs about that though, but:

<larrymi> roaksoax: bigjools: I put a note on #hyperscale .. I have to step out now. I'll check later about moving it.

Changed in maas:
status: New → Incomplete
Scott Moser (smoser) wrote :

from a debug and recreate perspective, maas really needs to be able to provide the serial number of the image that you're using.
Is there any way to get that at all ?

Christian Reis (kiko) on 2014-11-11
Changed in maas:
milestone: none → 1.7.1
Scott Moser (smoser) wrote :

confirmed this on ppc64el with a variation on http://bazaar.launchpad.net/~smoser/maas/maas-ephemeral-sniff/view/head:/test-image.txt .
I'll dig some more.

Larry Michel (lmic) wrote :

Scott, I've recreated this with today's ephemeral images so it shouldn't be a problem to get that. What string should I be looking for and in which image file to get the serial number?

Scott Moser (smoser) wrote :

Larry,
 Don't worry, I've recreated it. It also reproduces on amd64.
 I'm not really sure why, but /dev/disk/by-path doesn't seem to be available in the initramfs, and we're relying on that to give us deterministic root filesystem selection (see bug 1075313).

Martin Pitt (pitti) wrote :

Scott,

In order to help with debugging this I need to understand what creates a device symlink like /dev/disk/by-path/ip-10.245.0.10:3260-iscsi-iq*. This isn't in the standard udev rules, so I suppose some package like open-iscsi (it's not that, I checked) ships a udev rule which calls a helper to determine that name, and then sets SYMLINK. If you aren't sure, do something like

   grep -r by-path.*iscsi /lib/udev/rules.d

As this apparently needs to be available in the initramfs, that same package then needs to install an initramfs-tools hook to put that rules file and the accompanying helper (something like "id_path_iscsi") into the initramfs.

Martin Pitt (pitti) wrote :

FTR, none of the packages which ship udev rules (http://paste.ubuntu.com/8958453/) sound related to iSCSI. So where does that rule come from? Or where are the images that you are trying to boot, so that I can inspect them? (Note: I have absolutely zero knowledge about iSCSI and MAAS).

Scott Moser (smoser) wrote :

The rule that was doing this on trusty is a standard rule.

It's just this one from /lib/udev/rules.d/60-persistent-storage.rules:
 ENV{DEVTYPE}=="disk", ENV{ID_PATH}=="?*", SYMLINK+="disk/by-path/$env{ID_PATH}"

From inside the initramfs on trusty I saw:
 udevadm info --query=property --name=/dev/sda | grep ID_PATH
   ID_PATH=ip-192.168.1.132:3260-iscsi-tgt-boot-test-zdgNnn-lun-1
   ID_PATH_TAG=ip-192_168_1_132_3260-iscsi-tgt-boot-test-zdgNnn-lun-1

but on utopic, I don't get anything with:
 udevadm info --query=property --name=/dev/sda | grep ID_PATH

What would / should be setting ID_PATH ?
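
To make the naming concrete, here is a small sketch that builds the `ID_PATH` string in the form shown in the trusty output above, and approximates `ID_PATH_TAG` by replacing every character other than letters, digits and `-` with `_`. The function names are hypothetical, and the tag logic is an assumption inferred from the two sample values in this report, not a copy of udev's implementation.

```shell
#!/bin/sh
# Build the ID_PATH string for an iSCSI LUN in the form seen in this report.
# $1 = portal IP, $2 = portal port, $3 = target name, $4 = LUN
iscsi_id_path() {
    printf 'ip-%s:%s-iscsi-%s-lun-%s\n' "$1" "$2" "$3" "$4"
}

# Approximate ID_PATH_TAG: keep letters, digits and '-', map the rest to '_'.
# Assumption: inferred from the sample output, not taken from udev's source.
id_path_tag() {
    printf '%s' "$1" | tr -c 'A-Za-z0-9-' '_'
    echo
}

p=$(iscsi_id_path 192.168.1.132 3260 tgt-boot-test-zdgNnn 1)
echo "ID_PATH=$p"
echo "ID_PATH_TAG=$(id_path_tag "$p")"
# The 60-persistent-storage.rules line quoted above would then create:
echo "symlink: /dev/disk/by-path/$p"
```

If `ID_PATH` is never set, the rule's `ENV{ID_PATH}=="?*"` match fails and no symlink is created, which is the behaviour seen on utopic.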

Scott Moser (smoser) wrote :

so it seems like udev (systemd) should be doing this, and was in trusty but is now not in utopic.
 src/udev/udev-builtin-path_id.c
has
 handle_scsi_iscsi
that would set that property.

Martin Pitt (pitti) wrote :

Ah thanks, so there are no magic extra rules. As a first thing (as long as nobody tells me which images we are talking about or how to reproduce this), can you please verify that

  zcat /initrd.img | cpio -t | grep 60-persist

actually has that rule in the initramfs? (Should be, but let's check). From the bug log above it seems to me that within the running system you actually do get the ID_PATH and the symlink, just not in initramfs. Can you please run

  udevadm test-builtin path_id /sys/block/sda

both in the running system as well as in the initramfs? Is there some error message there?
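
The first check can be wrapped so that it gives a yes/no answer. This is only a sketch: the function names are hypothetical, and `initrd_has_storage_rule` assumes the initramfs is a single gzip-compressed cpio archive, which may not hold for every image.

```shell
#!/bin/sh
# Succeeds if a file listing on stdin (one path per line, as printed by
# `cpio -t`) includes the persistent-storage rules file.
listing_has_storage_rule() {
    grep -q '60-persistent-storage\.rules$'
}

# Check an initramfs image directly.
# Assumption: a single gzip-compressed cpio archive.
initrd_has_storage_rule() {
    zcat "$1" | cpio -t 2>/dev/null | listing_has_storage_rule
}

# Demonstration on a literal listing, so the sketch runs without an image:
printf '%s\n' \
    'lib/udev/rules.d/50-udev-default.rules' \
    'lib/udev/rules.d/60-persistent-storage.rules' \
    | listing_has_storage_rule && echo "rule present in listing"
```

On a real system this would be called as `initrd_has_storage_rule /initrd.img`.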

Scott Moser (smoser) wrote :

Martin,
 the rule is there. It doesn't fire because ID_PATH is not set; that's what I tried to show above.
 udev is just not setting it, and thus the rule is not getting applied.

the root and kernel and initramfs that are used here are maas ephemeral images.
you can view/browse them at
 http://maas.ubuntu.com/images/ephemeral-v2/daily/

I hacked together some tools for downloading the images at http://bazaar.launchpad.net/~smoser/maas/maas-ephemeral-sniff/files
Some information on which command line parameters to give is in test-image.txt there.

I realize it's not well documented at the moment, but maybe you can make sense of it in terms of how to set up the tgt and what command line params to pass to kvm.

Martin Pitt (pitti) wrote :

> it doesn't fire because ID_PATH is not set.

Right, and I was wondering why, hence the "test-builtin" comparison/check.

Martin Pitt (pitti) wrote :

I couldn't find something like utopic-daily-maas-amd64, but I downloaded the kernel, initramfs, and rootfs from http://maas.ubuntu.com/images/ephemeral-v2/daily/utopic/amd64/20141110/ and adjusted the install instructions. But it looks like the iSCSI bits don't work:

[ 14.198669] Loading iSCSI transport class v2.0-870.
[ 14.204279] iscsi: registered transport (tcp)
iscsistart: Logging into utopic-daily-maas-amd64 192.168.2.106:3260,1
iscsistart: can not connect to iSCSI daemon (111)!
iscsistart: version 2.0-873
[ 15.459358] scsi2 : iSCSI Initiator over TCP/IP
iscsistart: Connection1:0 to [target: utopic-daily-maas-amd64, portal: 192.168.2.106,3260] through [iface: default] is operational now
[ 15.721252] scsi 2:0:0:0: RAID IET Controller 0001 PQ: 0 ANSI: 5
[ 15.728067] scsi 2:0:0:0: Attached scsi generic sg1 type 12
[ 15.732548] scsi 2:0:0:1: Direct-Access IET VIRTUAL-DISK 0001 PQ: 0 ANSI: 5
[ 15.738735] sd 2:0:0:1: [sda] 2883584 512-byte logical blocks: (1.47 GB/1.37 GiB)
[ 15.742666] sd 2:0:0:1: [sda] 4096-byte physical blocks
[ 15.745415] sd 2:0:0:1: Attached scsi generic sg2 type 0
iscsistart: Logging into utopic-daily-maas-amd64 192.168.2.106:3260,1
iscsistart: can not connect to iSCSI daemon (111)!

Your install instructions use eth0's IP; as I don't have an eth0, I took the one from my wlan0 instead:
2: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet 192.168.2.106/24 brd 192.168.2.255 scope global wlan0

But I got as far as confirming that booting with root=/dev/sda does work, so one can log in and check "udevadm test-builtin path_id /sys/block/sda" in both the initramfs and the running system.

I also tried with dropping all the custom network stuff and using qemu's default networking and thus using 10.0.2.2. I got a tad further, but still "cannot connect":

$ ipaddr=10.0.2.2
$ kvm -m 512 -serial stdio -kernel boot-kernel -initrd boot-initrd -append "nomodeset iscsi_target_name=utopic-daily-maas-amd64 iscsi_target_ip=$ipaddr iscsi_initiator=maas-enlist ip=::::maas-enlist:BOOTIF BOOTIF=01-52-54-00-12-34-56 ro root=/dev/disk/by-path/ip-$ipaddr:$iport-iscsi-${target_name}-lun-1 overlayroot=tmpfs console=tty1 console=ttyS0 ds=nocloud-net;seedfrom=http://$ipaddr:32600/"
[...]
[ 16.523666] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
IP-Config: eth0 hardware address 52:54:00:12:34:56 mtu 1500 DHCP RARP
IP-Config: no response after 2 secs - giving up
IP-Config: eth0 hardware address 52:54:00:12:34:56 mtu 1500 DHCP RARP
hostname maas-enlist hostname maas-enlist IP-Config: eth0 guessed broadcast address 10.0.2.255
IP-Config: eth0 complete (dhcp from 10.0.2.2):
 address: 10.0.2.15 broadcast: 10.0.2.255 netmask: 255.255.255.0
 gateway: 10.0.2.2 dns0 : 10.0.2.3 dns1 : 0.0.0.0
 rootserver: 10.0.2.2 rootpath:
 filename :
[ 16.645845] Loading iSCSI transport class v2.0-870.
[ 16.657169] iscsi: registered transport (tcp)
iscsistart: Logging into utopic-daily-maas-amd64 10.0.2.2:3260,1
iscsistart: can not connect to iSCSI daemon (111)!

The original reporter has the same problem/me...


Martin Pitt (pitti) wrote :

I'm entering serious monkeying-around area now, but I tried to play around with

$ sudo iscsi_discovery 10.0.2.2 -d -l
Please logout from all targets on 10.0.2.2:3260 before trying to run discovery on that portal

$ sudo iscsiadm --mode discoverydb --type sendtargets --portal 10.0.2 --discover
[... long hang ...]
iscsiadm: connect to 62.157.140.133 timed out
iscsiadm: connect to 62.157.140.133 timed out
iscsiadm: connect to 62.157.140.133 timed out
[..., Control-C eventually ]

I don't have the slightest idea where the 62.157.140.133 IP comes from.

Martin Pitt (pitti) wrote :

For comparison I downloaded/booted current trusty, and I get the "can not connect to iSCSI daemon" error there, too. But I do get an ID_PATH:

lrwxrwxrwx 1 root root 9 Nov 13 11:52 /dev/disk/by-path/ip-10.0.2.2:3260-iscsi-trusty-daily-maas-amd64-lun-1 -> ../../sda

ubuntu@ubuntu:~$ sudo udevadm -d test-builtin path_id /sys/block/sda
sudo: unable to resolve host ubuntu
calling: test-builtin
=== trie on-disk ===
tool version: 204
file size: 5660180 bytes
header size 80 bytes
strings 1265196 bytes
nodes 4394904 bytes
load module index
device 0x15fe840 has devpath '/devices/platform/host2/session1/target2:0:0/2:0:0:1/block/sda'
device 0x1601ea0 has devpath '/devices/platform/host2/session1/target2:0:0/2:0:0:1'
device 0x16024b0 has devpath '/devices/platform/host2/session1/target2:0:0'
device 0x1602ab0 has devpath '/devices/platform/host2/session1'
device 0x1602fc0 has devpath '/devices/platform/host2'
device 0x1603590 has devpath '/devices/platform'
device 0x1603f30 has devpath '/devices/platform/host2/session1/iscsi_session/session1'
device 0x16047c0 has devpath '/devices/platform/host2/session1/connection1:0/iscsi_connection/connection1:0'
ID_PATH=ip-10.0.2.2:3260-iscsi-trusty-daily-maas-amd64-lun-1
ID_PATH_TAG=ip-10_0_2_2_3260-iscsi-trusty-daily-maas-amd64-lun-1
unload module index
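
For comparing the initramfs against the running system, or trusty against utopic, the relevant line can be pulled out of output like the above. A minimal sketch, with the hypothetical helper name `extract_id_path`; it only filters text and does not call udev itself.

```shell
#!/bin/sh
# Print the ID_PATH value found in KEY=VALUE output on stdin.
# Exits non-zero when no ID_PATH line is present, as reported on utopic.
extract_id_path() {
    sed -n 's/^ID_PATH=//p' | grep .
}

# Demonstration on two lines taken from the trusty output above:
extract_id_path <<'EOF'
ID_PATH=ip-10.0.2.2:3260-iscsi-trusty-daily-maas-amd64-lun-1
ID_PATH_TAG=ip-10_0_2_2_3260-iscsi-trusty-daily-maas-amd64-lun-1
EOF
```

On a live system the input would come from something like `udevadm test-builtin path_id /sys/block/sda 2>&1 | extract_id_path || echo "no ID_PATH"`.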

summary: - Failure to boot ephemeral image for Utopic Fast Installer deployment
+ Failure to boot ephemeral image for Utopic Fast Installer deployment: no
+ ID_PATH for iSCSI device any more
Changed in systemd (Ubuntu Utopic):
status: New → Confirmed
Changed in systemd (Ubuntu):
status: New → Confirmed
Martin Pitt (pitti) wrote :

Notes for reproduction:

ipaddr=10.0.2.2
release=trusty # or utopic
kvm -m 512 -serial stdio -kernel ${release}-kernel -initrd ${release}-initrd -append "nomodeset iscsi_target_name=${release}-daily-maas-amd64 iscsi_target_ip=$ipaddr iscsi_initiator=maas-enlist ip=::::maas-enlist:BOOTIF BOOTIF=01-52-54-00-12-34-56 ro root=/dev/sda overlayroot=tmpfs console=tty1 console=ttyS0 ds=nocloud-net;seedfrom=http://$ipaddr:32600/"
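
The only difference between the workaround boot and the failing boot is the `root=` argument. The sketch below builds both variants of the kernel argument string from the notes above. The function name `kernel_args` is hypothetical, and port 3260 and LUN 1 are assumptions taken from the logs in this report.

```shell
#!/bin/sh
# Print kernel arguments for the test boot. $3 selects how root= is given:
#   by-path -> root=/dev/disk/by-path/... (the form that needs ID_PATH)
#   other   -> root=/dev/sda (workaround used in the notes above)
kernel_args() {
    release=$1 ipaddr=$2 mode=$3
    port=3260   # assumption: iSCSI port as seen in the logs
    target="${release}-daily-maas-amd64"
    case "$mode" in
        by-path) root="/dev/disk/by-path/ip-${ipaddr}:${port}-iscsi-${target}-lun-1" ;;
        *)       root="/dev/sda" ;;
    esac
    printf '%s\n' "nomodeset iscsi_target_name=${target} iscsi_target_ip=${ipaddr} iscsi_initiator=maas-enlist ip=::::maas-enlist:BOOTIF BOOTIF=01-52-54-00-12-34-56 ro root=${root} overlayroot=tmpfs console=tty1 console=ttyS0 ds=nocloud-net;seedfrom=http://${ipaddr}:32600/"
}

kernel_args utopic 10.0.2.2 by-path
```

The result would be passed to kvm as `-append "$(kernel_args utopic 10.0.2.2 by-path)"`, alongside the `-kernel` and `-initrd` options shown above.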

Martin Pitt (pitti) wrote :

I confirm that this is fixed in vivid's udev. "sudo ./udevadm -d test-builtin path_id /sys/block/sda" shows ID_PATH=ip-10.0.2.2:3260-iscsi-utopic-daily-maas-amd64-lun-1 again, and replacing /lib/systemd/systemd-udevd with 215-5's binary gets the by-path/ symlink again. So now off to finding the fix in the history..

Changed in systemd (Ubuntu):
status: Confirmed → Fix Released
Changed in systemd (Ubuntu Utopic):
status: Confirmed → Triaged
assignee: nobody → Martin Pitt (pitti)
Martin Pitt (pitti) wrote :
Changed in systemd (Ubuntu Utopic):
assignee: Martin Pitt (pitti) → nobody
status: Triaged → In Progress
Martin Pitt (pitti) wrote :

I uploaded 208-8ubuntu8.1 to the utopic SRU review queue. I tested it with my "fake" MAAS setup from above. For SRU verification it would be good if you could test it on a real MAAS setup, just to be sure.

Changed in systemd (Ubuntu Utopic):
assignee: nobody → Martin Pitt (pitti)

Hello Larry, or anyone else affected,

Accepted systemd into utopic-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/systemd/208-8ubuntu8.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in systemd (Ubuntu Utopic):
status: In Progress → Fix Committed
tags: added: verification-needed
Larry Michel (lmic) wrote :

Brian, will we be able to pull this fix from the daily images?

Martin Pitt (pitti) wrote :

I'm not involved with MAAS or image building, but I seriously doubt that we build images with -proposed enabled regularly. What you could do is one of the following:

 * Boot an existing installation with "root=/dev/sda", enable -proposed, dist-upgrade, and reboot (again with the default root=/dev/disk/by-path/*iscsi*). That's what I did with the approach from Scott with launching QEMU.

 * If you are dealing with a root fs tarball: take an existing image tarball, unpack it, sudo chroot into it, within the chroot enable -proposed, dist-upgrade it, re-pack it again

 * If you are dealing with a compressed root fs image like http://maas.ubuntu.com/images/ephemeral-v2/daily/utopic/amd64/20141110/root-image.gz: gunzip root-image.gz, "sudo mount -o loop root-image /mnt", then chroot/upgrade like above, and "sudo umount /mnt" again.
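
The third option could be scripted roughly as follows. This is a sketch under several assumptions: the helper names `run` and `upgrade_root_image` are hypothetical, the apt source line for utopic-proposed is the usual form for the main archive and may need adjusting for other architectures or mirrors, and the chroot is assumed to have working network and DNS. By default it only prints the commands (`DRY_RUN=1`).

```shell
#!/bin/sh
# Print each step (DRY_RUN=1, the default) or execute it (DRY_RUN=0).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# $1 = compressed root image, $2 = mount point (default /mnt)
upgrade_root_image() {
    img_gz=$1 mnt=${2:-/mnt}
    img=${img_gz%.gz}
    run gunzip "$img_gz"
    run sudo mount -o loop "$img" "$mnt"
    # Assumption: standard archive line for -proposed; adjust as needed.
    run sudo chroot "$mnt" sh -c 'echo "deb http://archive.ubuntu.com/ubuntu utopic-proposed main" > /etc/apt/sources.list.d/proposed.list && apt-get update && apt-get -y dist-upgrade'
    run sudo umount "$mnt"
}

upgrade_root_image root-image.gz
```

Because the root device is found from within the initramfs, the initramfs used for booting also has to be regenerated from the upgraded image and put in place of the old one.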

As a part of the Stable Release Updates quality process a search for Launchpad bug reports using the version of systemd from utopic-proposed was performed and bug 1395813 was found. Please investigate this bug report to ensure that a regression will not be created by this SRU. In the event that this is not a regression remove the "verification-failed" tag from this bug report and tag 1395813 "bot-stop-nagging". Thanks!

tags: added: verification-failed
Martin Pitt (pitti) wrote :

I investigated bug 1395813, it's not a regression. Tags updated.

tags: removed: verification-failed
Larry Michel (lmic) wrote :

I tested this by installing systemd from -proposed onto the image and then running update-initramfs -u. I still saw the issue.

tags: added: verification-failed
Scott Moser (smoser) wrote :

I've verified that this change fixes the issue with the initramfs mounting iSCSI targets. I did this inside of MAAS.
 * Basically, set up a functional MAAS with some nodes, and with daily images of utopic imported.

from there

### repro
# get some tools
$ apt-get install cloud-image-utils --no-install-recommends --assume-yes

# find the directory that has the image we're interested in
$ imgd=$(for d in /var/lib/maas/boot-r*/snapshot-*/*/amd64/*/utopic/*; do :; done; echo $d)
$ img=${imgd}/root-image
$ kernel=${imgd}/boot-kernel
$ initrd=${imgd}/boot-initrd

$ echo $imgd
/var/lib/maas/boot-resources/snapshot-20141210-025929/ubuntu/amd64/hwe-u/utopic/daily

## back some things up
$ for f in $img $kernel $initrd; do [ -f "$f.dist" ] || cp "$f" "$f.dist"; done

## run the 'fix' script attached inside the mounted root-image
$ mount-image-callback --system-resolvconf "$img" \
   chroot _MOUNTPOINT_ /bin/bash < fix > update-kernels.tar

## just for reference
$ tar tvf update-kernels.tar
-rw-r--r-- root/root 26262410 2014-12-10 02:54 initrd.img-3.16.0-25-generic
-rw-r--r-- root/root 6402112 2014-12-10 02:54 vmlinuz-3.16.0-25-generic

$ d=$(mktemp -d)
$ tar -C "$d" -xf - < update-kernels.tar
$ cp $d/initrd* $initrd

tags: added: verification-done
removed: verification-failed verification-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 208-8ubuntu8.1

---------------
systemd (208-8ubuntu8.1) utopic-proposed; urgency=medium

  * Fix path-id to correctly recognize supported devices. This brings back
    /dev/disks/by-path/ symlinks for iSCSI devices. (LP: #1391354)
 -- Martin Pitt <email address hidden> Thu, 13 Nov 2014 14:54:35 +0100

Changed in systemd (Ubuntu Utopic):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for systemd has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Changed in maas:
milestone: 1.7.1 → 1.7.2
Blake Rouse (blake-rouse) wrote :

Has the Utopic image from daily been promoted to releases to fix this issue?

no longer affects: maas
Changed in maas-images:
status: New → Confirmed
Scott Moser (smoser) on 2015-06-17
Changed in maas-images:
status: Confirmed → Fix Released