lost connectivity to a node when using fastpath-installer with precise+hwe-s

Bug #1310076 reported by Nobuto Murata
This bug affects 1 person
Affects         | Status       | Importance | Assigned to         | Milestone
MAAS            | Fix Released | Critical   | Jeroen T. Vermeulen |
  1.5           | Fix Released | Critical   | Jeroen T. Vermeulen |
curtin (Ubuntu) | Triaged      | Critical   | Unassigned          |
  Trusty        | New          | Undecided  | Unassigned          |
maas (Ubuntu)   | Fix Released | Undecided  | Unassigned          |
  Trusty        | Fix Released | Undecided  | Unassigned          |

Bug Description

[Test Case]
1. Install MAAS
2. Add a node.
3. Select a Hardware Enablement Kernel. (Distro Series: Precise, Architecture: amd64/hwe-s)
4. Deploy will fail.

With the fix:
4. Deploy will succeed

After setting up hwe-s for precise by following the doc below, `juju add-machine` fails with "Failed to bring up br0", and connectivity to the node is then lost.
http://maas.ubuntu.com/docs1.5/hardware-enablement-kernels.html

I'm using fastpath-installer; d-i has not been tested yet.

ProblemType: Bug
DistroRelease: Ubuntu 14.04
Package: maas 1.5+bzr2252-0ubuntu1
ProcVersionSignature: Ubuntu 3.13.0-19.40-generic 3.13.6
Uname: Linux 3.13.0-19-generic x86_64
ApportVersion: 2.13.3-0ubuntu1
Architecture: amd64
Date: Sun Apr 20 05:05:05 2014
PackageArchitecture: all
SourcePackage: maas
UpgradeStatus: No upgrade log present (probably fresh install)

Nobuto Murata (nobuto) wrote :

After changing "/etc/maas/preseeds/curtin_userdata" to use the default archive rather than ports.ubuntu.com for hwe-s, `juju add-machine` succeeded.

=== modified file 'maas/preseeds/curtin_userdata'
--- maas/preseeds/curtin_userdata 2014-04-19 18:37:53 +0000
+++ maas/preseeds/curtin_userdata 2014-04-19 20:12:27 +0000
@@ -28,7 +28,7 @@
 power_state:
   mode: reboot

-{{if node.architecture in {'i386/generic', 'amd64/generic'} }}
+{{if node.architecture in {'i386/generic', 'amd64/generic', 'amd64/hwe-s'} }}
 apt_mirrors:
   ubuntu_archive: http://{{main_archive_hostname}}/{{main_archive_directory}}
   ubuntu_security: http://{{main_archive_hostname}}/{{main_archive_directory}}

Revision history for this message
Nobuto Murata (nobuto) wrote :

The installed kernel was still 3.2, though, not the hwe-s kernel.

Revision history for this message
Nobuto Murata (nobuto) wrote :

The d-i case was filed as Bug #1310082.

Revision history for this message
Nobuto Murata (nobuto) wrote :

According to the log from running curtin manually inside the MAAS node, curtin tries to install lts-saucy but fails.
====
Tried to install kernel linux-generic-lts-saucy but package not found.
====

If I manually put the content below in /etc/maas/preseeds/curtin_userdata, curtin installs the lts-saucy kernel. However, curtin seems to implement auto-detection of the kernel package (the step that fails above), and for some reason it does not work.
====
kernel:
  package: linux-generic-lts-saucy
====
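
For illustration only, here is a minimal Python sketch of how such auto-detection could derive the package name from the kernel the installer is running. The format string matches the one in curtin's curthooks.py quoted later in this report; the `ILLUSTRATIVE_SUFFIXES` table and the `kernel_package` helper are hypothetical names invented for this sketch, and the table is not curtin's actual data.

```python
# Hypothetical lookup table, for illustration only: maps an Ubuntu release and
# a kernel major.minor version to the suffix of the kernel meta package.
ILLUSTRATIVE_SUFFIXES = {
    "precise": {
        "3.2": "",             # stock precise kernel
        "3.11": "-lts-saucy",  # hwe-s (saucy enablement) kernel
    },
}


def kernel_package(release, kernel_version, flavor="generic"):
    """Build a kernel meta package name such as linux-generic-lts-saucy."""
    # Keep only major.minor, e.g. "3.11.0-19-generic" -> "3.11".
    major_minor = ".".join(kernel_version.split(".")[:2])
    map_suffix = ILLUSTRATIVE_SUFFIXES[release][major_minor]
    return "linux-{flavor}{map_suffix}".format(
        flavor=flavor, map_suffix=map_suffix)


print(kernel_package("precise", "3.11.0-19-generic"))
print(kernel_package("precise", "3.2.0-60-generic"))
```

The explicit `kernel: package:` setting above bypasses this kind of lookup entirely, which is why it works even when detection does not.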

Revision history for this message
Nobuto Murata (nobuto) wrote :

The reason auto-installation of lts-saucy does not work in my environment is that the result of "in_chroot(['apt-cache', 'search', package], capture=True)" is empty, i.e. "apt-cache search linux-generic-lts-saucy" returns nothing.

Running `apt-get update` before that line works for me as a workaround (so `apt-get update` runs twice: once before and once after `apt-cache search`).

I'm not sure whether this is an issue in curtin or in the MAAS tarball image.

--- /usr/lib/python2.7/dist-packages/curtin/commands/curthooks.py.orig 2014-04-21 03:02:52.271682540 +0900
+++ /usr/lib/python2.7/dist-packages/curtin/commands/curthooks.py 2014-04-21 03:32:08.754944407 +0900
@@ -170,6 +170,8 @@

         package = "linux-{flavor}{map_suffix}".format(
             flavor=flavor, map_suffix=map_suffix)
+        # make sure package cache is updated
+        in_chroot(['apt-get', 'update', '--quiet'])
         out, _ = in_chroot(['apt-cache', 'search', package], capture=True)
         if (len(out.strip()) > 0 and
                 not util.has_pkg_installed(package, target)):
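
The idea behind this workaround can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not curtin's code: `package_is_available` and `FakeChroot` are hypothetical names, and `run` stands in for any callable that executes a command inside the target and returns its standard output as a string.

```python
def package_is_available(package, run, refresh=True):
    """Return True if `apt-cache search` finds anything for `package`.

    `run` is a hypothetical stand-in for a function that executes a command
    in the target chroot and returns its stdout as a string.
    """
    if refresh:
        # Refresh the package lists first so the search below does not
        # consult a stale or empty cache.
        run(["apt-get", "update", "--quiet"])
    out = run(["apt-cache", "search", package])
    return len(out.strip()) > 0


class FakeChroot:
    """Simulates a target whose apt cache is empty until it is updated."""

    def __init__(self):
        self.updated = False

    def __call__(self, cmd):
        if cmd[:2] == ["apt-get", "update"]:
            self.updated = True
            return ""
        if cmd[:2] == ["apt-cache", "search"]:
            return "linux-generic-lts-saucy - kernel\n" if self.updated else ""
        raise ValueError("unexpected command: %r" % (cmd,))


# Without a refresh the search comes back empty; with one it succeeds.
print(package_is_available("linux-generic-lts-saucy", FakeChroot(), refresh=False))
print(package_is_available("linux-generic-lts-saucy", FakeChroot()))
```

Whether the refresh belongs in curtin or the image should ship with a populated cache is the open question raised in this comment.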

Revision history for this message
Jeroen T. Vermeulen (jtv) wrote :

I am attaching a fix for the part of the problem that's in maas/preseeds/curtin_userdata, in the maas source tree. This won't be enough to fix the whole problem, so I am not marking the branch as “fixing” this bug.

Raphaël Badin (rvb)
Changed in maas:
status: New → Triaged
importance: Undecided → Critical
Revision history for this message
Scott Moser (smoser) wrote :

Can't we do this sanely instead of making the intended-to-be-user-editable template file have such madness as :
  {{if node.split_arch()[0] in {'i386', 'amd64'} }}

Why not make 'arch' actually mean 'arch' and 'subarch' mean 'subarch', as variables available to the template?
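
As a rough sketch of that suggestion, the combined "arch/subarch" string could be split once, with both parts exposed separately. The function names below are hypothetical, and defaulting a missing subarchitecture to "generic" is an assumption of this sketch rather than documented MAAS behaviour.

```python
def split_arch(architecture):
    """Split a string such as 'amd64/hwe-s' into (arch, subarch)."""
    arch, _, subarch = architecture.partition("/")
    # Assumption for this sketch: a bare architecture means "generic".
    return arch, subarch or "generic"


def uses_main_archive(architecture):
    """True for architectures served from the main archive, not ports."""
    arch, _subarch = split_arch(architecture)
    return arch in {"i386", "amd64"}


print(split_arch("amd64/hwe-s"))
print(uses_main_archive("amd64/hwe-s"))
print(uses_main_archive("armhf/highbank"))
```

With `arch` available on its own, the template condition would not need to list every subarchitecture such as 'amd64/hwe-s' explicitly.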

Revision history for this message
Scott Moser (smoser) wrote :

In response to comment 5, I'm not sure why the cache in the image wouldn't have either been updated or had sane values built in. That could be an image-build thing.

Revision history for this message
Julian Edwards (julian-edwards) wrote :

Scott, is this a Curtin bug as per comment 6? I just verified that it's broken - the installer comes up with the hwe-s kernel but installs the release series one.

tags: added: server-hwe
Scott Moser (smoser)
Changed in curtin (Ubuntu):
status: New → Triaged
importance: Undecided → Critical
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package maas - 1.5+bzr2267-0ubuntu1

---------------
maas (1.5+bzr2267-0ubuntu1) utopic; urgency=medium

  * New upstream bugfix release. Fixes:
    - Hardware Enablement for Cisco B-Series. (LP: #1300476)
    - Allow AMT power type to specify IP Address. (LP: #1308772)
    - Spurious failure when starting and creating lock files. (LP: 1308069)
    - Fix regression introduced by a security fix (LP: #1311433, LP: #1311433)
    - Fix usage of hardware enablement kernels by fixing the preseeds
      (LP: #1310082, LP: #1310076, LP: #1310082)
    - Fix parallel juju deployments. (LP: #1314409)
    - Clear distro_series when stopping node from WebUI (LP: #1316396)
    - Fix click hijacking (LP: #1298784)
    - Fix blocking API client when deleting a resource (LP: #1313556)
    - Do not import Trusty RC images by default (LP: #1311151)
 -- Andres Rodriguez <email address hidden> Tue, 15 Apr 2014 14:41:32 -0400

Changed in maas (Ubuntu):
status: New → Fix Released
description: updated
Changed in maas (Ubuntu):
status: Fix Released → Confirmed
Changed in maas:
status: Triaged → Fix Committed
assignee: nobody → Jeroen T. Vermeulen (jtv)
Revision history for this message
Julian Edwards (julian-edwards) wrote :

Note that only the MAAS part is fixed; Curtin still needs a fix.

Changed in maas:
status: Fix Committed → Fix Released
Chris J Arges (arges)
Changed in maas (Ubuntu):
status: Confirmed → Fix Released
Revision history for this message
Chris J Arges (arges) wrote : Please test proposed package

Hello Nobuto, or anyone else affected,

Accepted maas into trusty-proposed. The package will build now and be available at http://launchpad.net/ubuntu/+source/maas/1.5.1+bzr2269-0ubuntu0.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, and change the tag from verification-needed to verification-done. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in maas (Ubuntu Trusty):
status: New → Fix Committed
tags: added: verification-needed
tags: added: verification-done
removed: verification-needed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package maas - 1.5.1+bzr2269-0ubuntu0.1

---------------
maas (1.5.1+bzr2269-0ubuntu0.1) trusty; urgency=medium

  * Stable Release Update (LP: #1317601):
    - Hardware Enablement for Cisco B-Series. (LP: #1300476)
    - Allow AMT power type to specify IP Address. (LP: #1308772)
    - Spurious failure when starting and creating lock files. (LP: 1308069)
    - Fix usage of hardware enablement kernels by fixing the preseeds
      (LP: #1310082, LP: #1310076, LP: #1310082)
    - Fix parallel juju deployments. (LP: #1314409)
    - Clear distro_series when stopping node from WebUI (LP: #1316396)
    - Fix click hijacking (LP: #1298784)
    - Fix blocking API client when deleting a resource (LP: #1313556)
    - Do not import Trusty RC images by default (LP: #1311151)
    - debian/control: Add missing dep on python-crochet for
      python-maas-provisioningserver (LP: #1311765)
 -- Andres Rodriguez <email address hidden> Fri, 09 May 2014 22:35:43 -0500

Changed in maas (Ubuntu Trusty):
status: Fix Committed → Fix Released
Revision history for this message
Chris J Arges (arges) wrote : Update Released

The verification of the Stable Release Update for maas has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
