iBFT network configuration does not correctly populate PROTO=dhcp in /run/net-*.conf which breaks cloud-init

Bug #1684039 reported by Trent Lloyd on 2017-04-19
This bug affects 1 person
Affects               Status         Importance   Assigned to   Milestone
open-iscsi (Debian)   Fix Released   Unknown
open-iscsi (Ubuntu)                  Medium       Trent Lloyd
  Xenial                             Medium       Trent Lloyd
  Yakkety                            Medium       Trent Lloyd
  Zesty                              Medium       Trent Lloyd
  Artful                             Medium       Trent Lloyd

Bug Description

[Impact]

When booting with iBFT, the network configuration is performed by open-iscsi as part of initramfs.local-top instead of by klibc-ipconfig. This includes populating /run/net-*.conf, which is consumed by, among other things, cloud-init.
Currently no attempt is made to determine PROTO; PROTO=none is hard coded into the file, which cloud-init does not recognise, causing it to crash.
Further to this, open-iscsi in the current version (xenial through zesty) does not correctly parse the iBFT origin into the boot protocol in "iscsistart -f" and always returns "STATIC". This is fixed upstream.
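
As a rough diagnostic for the symptom described above, the sketch below lists any net-*.conf file in a directory whose PROTO value is "none". The function name find_proto_none and its directory argument are illustrative assumptions, not part of open-iscsi or cloud-init; only the file pattern and the problematic value come from this report. The files are sourced in a subshell on the assumption that they contain plain shell variable assignments.

```sh
#!/bin/sh
# Hypothetical helper: print net-*.conf files in a directory whose PROTO is "none".
find_proto_none() {
    dir="${1:-/run}"
    for conf in "$dir"/net-*.conf; do
        [ -f "$conf" ] || continue   # the glob matched nothing
        # Source the file inside a command substitution (a subshell) so that
        # its variables do not leak into the caller.
        proto=$(PROTO=""; . "$conf"; printf '%s' "$PROTO")
        if [ "$proto" = "none" ]; then
            printf '%s\n' "$conf"
        fi
    done
}
```

Calling find_proto_none with no argument inspects /run; an empty result suggests no file carries the hard coded value.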

[Test Case]

(1) Set up a MAAS 2 environment. Install 16.04 LTS images from the "Daily" Stream. Check settings and ensure the Commissioning kernel is 16.04 LTS and "xenial (ga-16.04)"

(2) Enroll a libvirt virtual machine as a machine within MAAS, including a SINGLE local disk

(3) Install the 'ipxe' package on your libvirt machine

(4) Commission and test and ensure it's otherwise working

(5) Update MAAS settings and add iscsi_auto to the "Global Kernel Parameters"

(6) You need to build a new open-iscsi package for xenial with the fix, and then rebuild an initrd for 4.4.0-87-generic with the fix integrated. An easy way to do that is to deploy the machine, install the updated (fixed) package and then copy the initrd from /boot/initrd.img-4.4.0-87-generic to the MAAS server. But you could also build it on any other xenial machine with that kernel installed.

(7) Edit the virtual machine config using virt-manager and under the "Boot Options" enable direct kernel boot, set kernel path to /usr/lib/ipxe/ipxe.lkrn and the kernel args to (all one line):
ifconf -c dhcp && sanhook --drive 0x81 iscsi:100.64.0.253::3260:1:iqn.2004-05.com.ubuntu:maas:ephemeral-ubuntu-amd64-ga-16.04-xenial-daily ; autoboot

    Change 100.64.0.253 to the IP of your MAAS instance on the network that it uses to PXE boot.

(8) Try to commission the machine again. You should see it fail if you watch the console (it will also show as failed in MAAS), and you'll briefly see a message about PROTO=none

(9) Update the initrd on the MAAS server:
    cd /var/lib/maas/boot-resources/current/ubuntu/amd64/ga-16.04/xenial/daily/
    mv boot-initrd boot-initrd.orig
    cp (PATHMAYBEDIFFERENT)/boot/initrd.img-4.4.0-87-generic boot-initrd

(10) Try to commission the machine again; it should now succeed with the patched initrd. A deployment will also work, but a commission is sufficient to test.

Notes:
 - MAAS may do an image sync and overwrite your updated initrd. So watch out for that. boot-resources/current is also a symlink so you may need to cd out and back in
 - Advantage of using MAAS to do this is that it also tests that iSCSI boot is otherwise working and not broken by this change, as the commission and deploy etc use iSCSI root. There is also a test for this in debian/tests (tgt-boot-test)
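
Step (9) and the first note above can be sketched as a small helper. This is only an illustration under assumptions: the function name swap_initrd is hypothetical, and the idea of keeping the first backup untouched on repeated runs is an addition for safety, not something the test case requires. It uses cd -P so that the copy lands in the real directory behind the boot-resources/current symlink.

```sh
#!/bin/sh
# Hypothetical helper: replace boot-initrd in a MAAS boot-resources directory,
# keeping the first original as boot-initrd.orig.
swap_initrd() {
    resource_dir="$1"
    new_initrd="$2"
    [ -f "$new_initrd" ] || { echo "missing initrd: $new_initrd" >&2; return 1; }
    # Make the source path absolute, because we change directory below.
    case "$new_initrd" in
        /*) ;;
        *) new_initrd="$PWD/$new_initrd" ;;
    esac
    (
        # cd -P resolves symlinks, so we operate on the real directory.
        cd -P "$resource_dir" || exit 1
        # Do not overwrite an existing backup with an already patched file.
        [ -e boot-initrd.orig ] || mv boot-initrd boot-initrd.orig || exit 1
        cp "$new_initrd" boot-initrd
    )
}
```

For example: swap_initrd /var/lib/maas/boot-resources/current/ubuntu/amd64/ga-16.04/xenial/daily /boot/initrd.img-4.4.0-87-generic (adjust the source path as in step 9).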

[Regression Potential]

We believe the regression risk is "low" and don't envision any.

The package (including the fixes) has been intensively tested pre-SRU.

If a regression is found, it will clearly be less critical than this bug, where cloud-init breaks because of the missing piece of code, and it will most likely only affect systems booting with iBFT.

Additionally, the patches have been proven to work in upstream and Debian for a couple of years now.

* autopkgtest failure

==
XENIAL
==

* Regression in autopkgtest for open-iscsi (i386): test log

This autopkgtest started to fail more than a year ago, more precisely on "2016-03-30"[1] with open-iscsi version : "2.0.873+git0.3b4b4500-14ubuntu2"
Meaning that this was already there prior to this current SRU.

[1] - http://autopkgtest.ubuntu.com/packages/o/open-iscsi/xenial/i386
2.0.873+git0.3b4b4500-14ubuntu2 open-iscsi/2.0.873+git0.3b4b4500-14ubuntu2 2016-03-30 03:46:43 UTC 0h 27m 12s fail log   artifacts

* Regression in autopkgtest for open-iscsi (amd64): test log

This autopkgtest started to fail more than a year ago, more precisely on "2016-04-13"[2] with open-iscsi version : "2.0.873+git0.3b4b4500-14ubuntu2"
Meaning that this was already there prior to this current SRU.

[2] - http://autopkgtest.ubuntu.com/packages/o/open-iscsi/xenial/amd64
2.0.873+git0.3b4b4500-14ubuntu2 open-iscsi/2.0.873+git0.3b4b4500-14ubuntu2 2016-04-13 07:13:11 UTC 0h 16m 17s fail log   artifacts
==

[Other Info]

* This SRU includes the following upstream/debian fixes :

# Debian:
0347300 initramfs: populate PROTO= entry in /run/net-*.conf from iBFT

# Upstream
- 08_Parse-origin-value-from-iBFT.patch --> https://github.com/open-iscsi/open-iscsi/commit/78e24f50ab754f35f4aa208ade7c9fd794d82036#diff-c53311d3f6725aa63577b7bf4b582c3d

- 09_Represent-DHCP-origin-as-an-enum-not-a-string.patch --> https://github.com/open-iscsi/open-iscsi/commit/4959a89f421fdebc521f48003a79c2161e59d192#diff-c53311d3f6725aa63577b7bf4b582c3d

- 10_iBFT-origin-is-an-enum-not-a-string.patch --> https://github.com/open-iscsi/open-iscsi/commit/3f15a2270a7efb1a6ee8ef555b01f3d8674818b9#diff-3ba89d9a64dda0ffc3664bbc27b0fa27

[Original Description]

When booting with iBFT, the network configuration is performed by open-iscsi as part of initramfs.local-top instead of by klibc-ipconfig. This includes populating /run/net-*.conf, which is consumed by, among other things, cloud-init.

Currently no attempt is made to determine PROTO; PROTO=none is hard coded into the file, which cloud-init does not recognise, causing it to crash.

Further to this, open-iscsi in the current version (xenial through zesty) does not correctly parse the iBFT origin into the boot protocol in "iscsistart -f" and always returns "STATIC". This is fixed upstream.

Trent Lloyd (lathiat) wrote :

Proposed patch for Review

First cherry-pick 3 upstream patches to fix parsing of the iBFT origin and return DHCP correctly in 'iscsistart -f' output. We then modify initramfs.local-top to use this information.

I do need to check this with IPv6. Both DHCP and STATIC from iscsistart convert directly to PROTO=, but I need to check what will happen for DHCPv6 and IPv6-RA.

I refrained from removing the default PROTO=none at this stage. cloud-init doesn't like this value but it's not clear to me currently who defines what values are acceptable here. klibc-ipconfig appears to never write none.

open-iscsi (2.0.873+git0.3b4b4500-14ubuntu3.4) xenial; urgency=medium

  * debian/patches/08_Parse-origin-value-from-iBFT.patch,
    09_Represent-DHCP-origin-as-an-enum-not-a-string.patch,
    10_iBFT-origin-is-an-enum-not-a-string.patch: Cherry pick upstream patches
    to parse IP origin passed in by IBFT. iscsistart -f will now display the
    correct iface.bootproto
  * debian/extra/initramfs.local-top: parse iface.bootproto and populate PROTO
    in /run/net-*.conf with this information after lowercasing the string.

 -- Trent Lloyd <email address hidden> Wed, 19 Apr 2017 05:53:25 +0800

The attachment "lp1684039-ibft-dhcp-origin.debdiff" seems to be a debdiff. The ubuntu-sponsors team has been subscribed to the bug report so that they can review and hopefully sponsor the debdiff. If the attachment isn't a patch, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are member of the ~ubuntu-sponsors, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issue please contact him.]

tags: added: patch
Changed in open-iscsi (Ubuntu):
importance: Undecided → Medium
Trent Lloyd (lathiat) wrote :

Attaching debdiff for artful. artful already has the upstream patches from the latest version, so the only change required is the initramfs portion in debian/extra/initramfs.local-top

tags: added: sts-sru-needed
Eric Desrochers (slashd) on 2017-05-15
Changed in open-iscsi (Ubuntu):
assignee: nobody → Trent Lloyd (lathiat)
Eric Desrochers (slashd) on 2017-06-23
Changed in open-iscsi (Ubuntu Zesty):
assignee: nobody → Trent Lloyd (lathiat)
Changed in open-iscsi (Ubuntu Yakkety):
assignee: nobody → Trent Lloyd (lathiat)
Changed in open-iscsi (Ubuntu Xenial):
assignee: nobody → Trent Lloyd (lathiat)
importance: Undecided → Medium
Changed in open-iscsi (Ubuntu Yakkety):
importance: Undecided → Medium
Changed in open-iscsi (Ubuntu Zesty):
importance: Undecided → Medium
Trent Lloyd (lathiat) wrote :

Submitted upstream to Debian; the maintainer promptly replied and plans to fix it upstream also.

Christian suggested a slightly different patch, so as not to rely on busybox for 'tr' (as it's a Recommends) and to use a case statement instead. He plans to push that update through to stretch (stable) also.

He said he plans to upload that fix rapidly so I'll wait for that so we can sync the same fix in.
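
To illustrate the difference between the two approaches discussed in this comment, here is a minimal sketch. The function names are hypothetical, and proto_with_tr is only an assumed shape of the first patch (lowercasing the string with tr), not its actual code; proto_with_case follows the mapping in the Debian commit quoted later in this report.

```sh
#!/bin/sh
# Approach 1 (assumed shape of the first patch): lowercase with tr,
# which requires the tr binary to be present in the initramfs.
proto_with_tr() {
    printf '%s' "$1" | tr '[:upper:]' '[:lower:]'
}

# Approach 2: a case statement that needs only the shell itself.
proto_with_case() {
    case "$1" in
        DHCP)   printf '%s' "dhcp" ;;
        STATIC) printf '%s' "static" ;;
        *)      printf '%s' "$1" ;;   # unknown values pass through unchanged
    esac
}
```

Both agree for DHCP and STATIC. They differ for any other value: the tr version lowercases it, while the case version passes it through as-is.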

Changed in open-iscsi (Debian):
status: Unknown → Confirmed
Changed in open-iscsi (Debian):
status: Confirmed → Fix Released
Eric Desrochers (slashd) wrote :

Debian maintainer has committed the patch ^. [1]

Next step is to make sure Artful has the fix now, and then the SRU for all affected supported stable releases can start.

Trent, feel free to contact me once Artful has the change, and I'll sponsor the SRU.

[1] - Debian
Commit :
0347300 initramfs: populate PROTO= entry in /run/net-*.conf from iBFT

# git show 0347300
commit 034730044a5f9f1557b7a5a26d3ef9fdc69fdf79
Author: Christian Seiler <email address hidden>
Date: Sun Jul 2 17:50:11 2017 +0200

    initramfs: populate PROTO= entry in /run/net-*.conf from iBFT

    When booting from iBFT, set the PROTO= entry in /run/net-*.conf
    accordingly, so that other tools, such as cloud-init, can use that
    information. (cloud-init fails if the current PROTO=none is used.)

    Closes: #866213

diff --git a/debian/extra/initramfs.local-top b/debian/extra/initramfs.local-top
index 93f4c6f..1045c50 100755
--- a/debian/extra/initramfs.local-top
+++ b/debian/extra/initramfs.local-top
@@ -159,6 +159,13 @@ do_iscsi_login ()
                                iface.primary_dns) IPV4DNS0="${v}" ;;
                                iface.secondary_dns) IPV4DNS1="${v}" ;;
                                iface.net_ifacename) DEVICE="${v}" ;;
+                               iface.bootproto)
+                                       case "${v}" in
+                                               DHCP) PROTO="dhcp" ;;
+                                               STATIC) PROTO="static" ;;
+                                               *) PROTO="${v}" ;;
+                                       esac
+                                       ;;
                        esac
                done
        fi
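
The parsing that this hunk extends can be tried outside the initramfs with the sketch below. It is a standalone approximation, not the actual initramfs.local-top code: it assumes 'iscsistart -f' prints one "key = value" pair per line, the sample input and its values are made up, and the function names and the quoted output format are hypothetical.

```sh
#!/bin/sh
# Hypothetical stand-in for `iscsistart -f` output; the values are made up.
sample_ibft_output() {
    cat <<'EOF'
iface.net_ifacename = eth0
iface.bootproto = DHCP
iface.primary_dns = 192.0.2.53
EOF
}

# Read "key = value" lines from stdin and print net-*.conf style settings.
ibft_to_conf() {
    PROTO="none"   # the default that the report says was previously hard coded
    DEVICE=""
    IPV4DNS0=""
    while read -r k eq v; do
        [ "$eq" = "=" ] || continue   # skip lines that are not key = value
        case "$k" in
            iface.primary_dns) IPV4DNS0="$v" ;;
            iface.net_ifacename) DEVICE="$v" ;;
            iface.bootproto)
                case "$v" in
                    DHCP) PROTO="dhcp" ;;
                    STATIC) PROTO="static" ;;
                    *) PROTO="$v" ;;
                esac
                ;;
        esac
    done
    printf "DEVICE='%s'\nPROTO='%s'\nIPV4DNS0='%s'\n" "$DEVICE" "$PROTO" "$IPV4DNS0"
}
```

Running sample_ibft_output | ibft_to_conf shows PROTO derived from iface.bootproto; input without an iface.bootproto line leaves the "none" default in place.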

Eric Desrochers (slashd) on 2017-07-04
Changed in open-iscsi (Ubuntu Artful):
status: New → In Progress
Eric Desrochers (slashd) wrote :

Artful debdiff (based on Debian upstream commit: 0347300)

Eric Desrochers (slashd) on 2017-07-05
Changed in open-iscsi (Ubuntu Artful):
assignee: Trent Lloyd (lathiat) → Eric Desrochers (slashd)
Eric Desrochers (slashd) wrote :

Zesty debdiff (based on Debian upstream commit: 0347300)

Eric Desrochers (slashd) wrote :

Xenial debdiff (based on Debian upstream commit: 0347300)

Changed in open-iscsi (Ubuntu Yakkety):
status: New → Won't Fix
Eric Desrochers (slashd) wrote :

Yakkety reaches end of life on July 20th, so I am setting the bug for the Yakkety release to "Won't Fix".

Changed in open-iscsi (Ubuntu Artful):
status: In Progress → Fix Committed
Eric Desrochers (slashd) on 2017-07-05
description: updated
Changed in open-iscsi (Ubuntu Zesty):
status: New → In Progress
Changed in open-iscsi (Ubuntu Xenial):
status: New → In Progress
Eric Desrochers (slashd) on 2017-07-06
Changed in open-iscsi (Ubuntu Artful):
assignee: Eric Desrochers (slashd) → Trent Lloyd (lathiat)
Eric Desrochers (slashd) on 2017-07-13
description: updated
Trent Lloyd (lathiat) wrote :

Test case updated

description: updated
description: updated
description: updated
Eric Desrochers (slashd) wrote :

Hi Trent,

The upload for Xenial and Zesty has been completed. It is now waiting for approval by the SRU verification team before the packages start to build in $RELEASE-proposed.

You should receive an automatic message here in the LP bug once the package is available there, and I'll also let you know just in case.

Once the package is in -proposed, Trent or anyone affected, please proceed with the verification for Xenial and Zesty, and describe in detail what you did, which package version you were using, the output (if relevant), ...

Upload queues :
https://launchpad.net/ubuntu/xenial/+queue?queue_state=1&queue_text=open-iscsi
https://launchpad.net/ubuntu/zesty/+queue?queue_state=1&queue_text=open-iscsi

Thank you for your contribution to Ubuntu.

tags: added: sts-sponsor-done
removed: patch

Hello Trent, or anyone else affected,

Accepted open-iscsi into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/open-iscsi/2.0.873+git0.3b4b4500-14ubuntu3.4 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in open-iscsi (Ubuntu Xenial):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-xenial
Eric Desrochers (slashd) on 2017-07-28
description: updated
Trent Lloyd (lathiat) on 2017-08-01
tags: added: verification-done verification-done-xenial
removed: verification-needed verification-needed-xenial
Trent Lloyd (lathiat) wrote :

Verified in both xenial and zesty, broken before the upgrade and solves the issue as described after the upgrade using the package in {xenial,zesty}-proposed.

Also by way of regression testing, "standard" iSCSI booting (without IBFT but using command line parameters instead) that MAAS does as part of a commission/deploy is verified working in both versions.

The verification of the Stable Release Update for open-iscsi has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package open-iscsi - 2.0.873+git0.3b4b4500-14ubuntu3.4

---------------
open-iscsi (2.0.873+git0.3b4b4500-14ubuntu3.4) xenial; urgency=medium

  * d/p/08_Parse-origin-value-from-iBFT.patch
  * d/p/09_Represent-DHCP-origin-as-an-enum-not-a-string.patch
  * d/p/10_iBFT-origin-is-an-enum-not-a-string.patch
      - Cherry pick upstream patches to parse IP origin passed in by IBFT.
        iscsistart -f will now display the correct iface.bootproto
        (e.g. static or dhcp) (LP: #1684039)

  * d/extra/initramfs.local-top: When booting from iBFT,
    set the PROTO= entry in /run/net-*.conf accordingly,
    so that other tools, such as cloud-init, can use that
    information. (cloud-init fails if the current PROTO=none
    is used.) (Closes: #866213)

 -- Trent Lloyd <email address hidden> Fri, 23 Jun 2017 18:03:36 +0800

Changed in open-iscsi (Ubuntu Xenial):
status: Fix Committed → Fix Released
Eric Desrochers (slashd) on 2017-08-01
tags: added: verification-done-zesty
Eric Desrochers (slashd) on 2017-08-01
Changed in open-iscsi (Ubuntu Zesty):
status: In Progress → Fix Committed
Brian Murray (brian-murray) wrote :

This is not yet fixed in Artful due to autopkgtest failures. I have, however, just retried them since it looked like an environment failure.

Eric Desrochers (slashd) wrote :

Hi Brian,

They failed again (after you and I restarted them)...

I first thought it was blocked in -proposed due to the MIR for "open-isns" :
https://bugs.launchpad.net/ubuntu/+source/open-isns/+bug/1689963

but it seems the MIR is "Fix Released" now so.... It seems like the testsuite is failing.

# Logs
======================================================================
FAIL: test_daemon (__main__.OpenIscsiTest)
Test iscsid
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/autopkgtest.HCyw1s/build.tIH/open-iscsi-2.0.874/debian/tests/test-open-iscsi.py", line 104, in test_daemon
    self.assertTrue(self.daemon.status()[0])
AssertionError: False is not true

----------------------------------------------------------------------
Ran 3 tests in 15.943s

# Autopkgtest logs
autopkgtest [16:15:36]: @@@@@@@@@@@@@@@@@@@@ summary
install PASS
testsuite FAIL non-zero exit status 1
nested PASS

Trent, since you were the one driving this SRU and I prepared the Artful debdiff on your behalf, can you investigate this for us?

Regards,
Eric

Eric Desrochers (slashd) wrote :

Me again ...

I had the time to do some verification this afternoon and decided to run the autopkgtest manually against an Artful QEMU VM using the version right before this particular SRU.

This SRU introduced : 2.0.874-2ubuntu2
Version before this SRU : 2.0.874-2ubuntu1

Autopkgtest run :
#adt-run open-iscsi_2.0.874-2ubuntu1.dsc -l version_before.log --- qemu adt-artful-amd64-cloud.img

During the test above ^, the same regression occurs, proving that the SRU doesn't introduce this regression and that it was there before.

test-open-iscsi.py cannot be found in Debian upstream, and thus seems to be an Ubuntu-only test script.
I suspect this script generates a false positive and would need to be adapted to recent open-iscsi modifications.

======================================================================
FAIL: test_daemon (__main__.OpenIscsiTest)
Test iscsid
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/tmp/autopkgtest.lA3fiS/build.8r5/open-iscsi-2.0.874/debian/tests/test-open-iscsi.py", line 104, in test_daemon
    self.assertTrue(self.daemon.status()[0])
AssertionError: False is not true

Regards,
Eric

Eric Desrochers (slashd) wrote :

I just talked to nacc...

Basically the test assumes iscsid will always start, but with the new systemd unit it won't unless there are iSCSI targets configured (condition failed).

It's already on nacc's todo this week to try and get the tests fixed.
A decision will be taken to either remove that test, or update it to run in the nested VMs.

Eric
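
A test that wanted to keep a daemon check could first ask whether iSCSI is configured at all. The sketch below is only an approximation of such a guard: the two default paths are taken from the artful changelog quoted later in this report (for the open-iscsi job), the assumption that iscsid is gated on a similar condition is not confirmed here, and the function names are hypothetical.

```sh
#!/bin/sh
# Hypothetical check: succeed if the directory exists and is non-empty.
dir_has_content() {
    [ -d "$1" ] && [ -n "$(ls -A "$1" 2>/dev/null)" ]
}

# Succeed if either location has content, i.e. iSCSI looks configured.
iscsi_configured() {
    nodes_dir="${1:-/etc/iscsi/nodes}"
    sessions_dir="${2:-/sys/class/iscsi_session}"
    dir_has_content "$nodes_dir" || dir_has_content "$sessions_dir"
}
```

A caller could then skip the daemon status assertion when iscsi_configured returns non-zero, instead of failing.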

Łukasz Zemczak (sil2100) wrote :

After talking with Eric and seeing the declaration of the autopkgtest failure being addressed ASAP, I will be *conditionally* releasing the zesty update. It is aged sufficiently, tested and artful will be migrating as soon as possible. So... no use in making this wait.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package open-iscsi - 2.0.873+git0.3b4b4500-14ubuntu17.1

---------------
open-iscsi (2.0.873+git0.3b4b4500-14ubuntu17.1) zesty; urgency=medium

  * d/p/08_Parse-origin-value-from-iBFT.patch
  * d/p/09_Represent-DHCP-origin-as-an-enum-not-a-string.patch
  * d/p/10_iBFT-origin-is-an-enum-not-a-string.patch
      - Cherry pick upstream patches to parse IP origin passed in by IBFT.
        iscsistart -f will now display the correct iface.bootproto
        (e.g. static or dhcp) (LP: #1684039)

  * d/extra/initramfs.local-top: When booting from iBFT,
    set the PROTO= entry in /run/net-*.conf accordingly,
    so that other tools, such as cloud-init, can use that
    information. (cloud-init fails if the current PROTO=none
    is used.) (Closes: #866213)

 -- Trent Lloyd <email address hidden> Fri, 23 Jun 2017 16:52:21 +0800

Changed in open-iscsi (Ubuntu Zesty):
status: Fix Committed → Fix Released
Eric Desrochers (slashd) on 2017-08-07
tags: added: sts
removed: sts-sru-needed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package open-iscsi - 2.0.874-4ubuntu1

---------------
open-iscsi (2.0.874-4ubuntu1) artful; urgency=medium

  * Merge with Debian unstable. Remaining changes:
    - debian/tests: Add Ubuntu autopkgtests.
    - debian/iscsi-network-interface.rules, debian/net-interface-handler,
      debian/open-iscsi.install:
      Prevent network interface that contains iscsi root from bouncing
      during boot or going down during shutdown.
      Integrates with resolvconf and initramfs code that writes
      /run/initramfs/open-iscsi.interface
    - debian/open-iscsi.maintscript: clean up the obsolete
      iscsi-network-interface upstart job, file on upgrade.
    - Let iscsid systemd job run in privileged containers but not in
      unprivileged ones
    - Start open-iscsi systemd job when either /etc/iscsi/nodes or
      /sys/class/iscsi_session have content
      Based on patch by Nish Aravamudan, thanks! (LP #1576341)
    - add IPv6 support
      + add support for IPV6{DOMAINSEARCH,DNS0,DNS1} to net-interface-handler
        LP #1621507
      + Source /run/net6-*.conf when needed.
      + debian/extra/initramfs.local-top: handle IPv6 configs being
        shipped in DEVICE6 or /run/net6-*.conf in the initramfs, so we
        can fill in /run/initramfs/open-iscsi.interface (LP #1621507)
  * Drop:
    - d/extra/initramfs.local-top: When booting from iBFT,
      set the PROTO= entry in /run/net-*.conf accordingly,
      so that other tools, such as cloud-init, can use that
      information. (cloud-init fails if the current PROTO=none
      is used.) (LP: #1684039) (Closes: #866213)
      [ Fixed in Debian 2.0.874-4 ]
  * d/t/test-open-iscsi.py: drop test_daemon test
    - With the updates to the systemd units, the services do not run
      unless iSCSI is configured.

 -- Nishanth Aravamudan <email address hidden> Tue, 08 Aug 2017 16:16:27 -0700

Changed in open-iscsi (Ubuntu Artful):
status: Fix Committed → Fix Released
Eric Desrochers (slashd) wrote :

The above ^ fixes the autopkgtest failure.

Thanks to Nishanth.

--
  * d/t/test-open-iscsi.py: drop test_daemon test
    - With the updates to the systemd units, the services do not run
      unless iSCSI is configured.
--

That should close the loop for that subject.

I now see open-iscsi version "2.0.874-4ubuntu1" as a "Valid candidate" in the excuse page, thus it will no longer be stuck in artful-proposed.

Regards,
Eric
