maas wily deployment to HP Proliant m400 arm64 server cartridge fails

Bug #1499869 reported by Craig Magina
This bug affects 1 person
Affects               Status        Importance  Assigned to  Milestone
cloud-init            Fix Released  High        Scott Moser
curtin                Invalid       Undecided   Unassigned
cloud-init (Ubuntu)   Fix Released  Undecided   Unassigned
  Wily                Fix Released  Undecided   Unassigned
linux (Ubuntu)        Fix Released  Undecided   Tim Gardner
  Vivid               Fix Released  Undecided   Tim Gardner
  Wily                Fix Released  Undecided   Tim Gardner

Bug Description

This is the error seen on the console:

[ 64.149080] cloud-init[834]: 2015-08-27 15:03:29,289 - util.py[WARNING]: Failed fetching metadata from url http://10.229.32.21/MAAS/metadata/curtin
[ 124.513212] cloud-init[834]: 2015-09-24 17:23:10,006 - url_helper.py[WARNING]: Calling 'http://169.254.169.254/2009-04-04/meta-data/instance-id' failed [2427570/120s]: request error [HTTPConnectionPool(host='169.254.169.254', port=80): Max retries exceeded with url: /2009-04-04/meta-data/instance-id (Caused by ConnectTimeoutError(<requests.packages.urllib3.connection.HTTPConnection object at 0x7f7faa4b70>, 'Connection to 169.254.169.254 timed out. (connect timeout=50.0)'))]
[ 124.515570] cloud-init[834]: 2015-09-24 17:23:10,007 - DataSourceEc2.py[CRITICAL]: Giving up on md from ['http://169.254.169.254/2009-04-04/meta-data/instance-id'] after 2427570 seconds
[ 124.531624] cloud-init[834]: 2015-09-24 17:23:10,024 - url_helper.py[WARNING]: Calling 'http://<internal ip>/latest/meta-data/instance-id' failed [0/120s]: bad status code [404]

This eventually times out, leaving the node at the login prompt. I can install wily via netboot without issue, and some time back wily was deployable to this node from MAAS.

Revision history for this message
Craig Magina (craig.magina) wrote :

The version of MAAS is 1.8.2.

Revision history for this message
Blake Rouse (blake-rouse) wrote :

This is either because curtin is not installing the cloud configuration for MAAS, cloud-init is not reading the correct config, or cloud-init cannot talk to MAAS.

I believe cloud-init switched from python-oauth to python-oauthlib, so that might be the issue. I am also targeting this bug to curtin and cloud-init to make sure the appropriate eyes see it.

Changed in maas:
milestone: none → 1.9.0
summary: - maas wily deployment to HP Proliant m400 fails
+ maas wily deployment to HP Proliant m400 and m800 fails
summary: - maas wily deployment to HP Proliant m400 and m800 fails
+ maas wily deployment to HP Proliant m400 fails
Revision history for this message
Craig Magina (craig.magina) wrote :

I can deploy wily to a virtual machine and an x86 based node without issue using the same MAAS that fails when deploying to the m400 node with the error above.

Revision history for this message
Craig Magina (craig.magina) wrote :

Cloud-init is getting the wrong time because of this error:

[ 14.726283] hctosys: unable to open rtc device (rtc0)

This means the RTC_DRV_XGENE kernel config option was changed to 'm'. It needs to be built in ('y') for the RTC device to be available to hctosys, which runs during kernel initialization before modules are loaded.
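
The built-in versus module distinction above can be checked against a kernel config file. The following is a minimal sketch, not part of the original report: the function names `kconfig_state` and `rtc_available_to_hctosys` are hypothetical, and the code only parses config text that the caller supplies.

```python
import re


def kconfig_state(config_text, option="CONFIG_RTC_DRV_XGENE"):
    """Return the option's value (e.g. 'y' or 'm'), or None if it is unset or absent."""
    # Match a whole line of the form OPTION=value; commented-out
    # "# OPTION is not set" lines do not match.
    pattern = re.compile(r"^%s=(\S+)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


def rtc_available_to_hctosys(config_text):
    """True only when the X-Gene RTC driver is built in ('y')."""
    # A modular driver ('m') is loaded too late for hctosys to open rtc0.
    return kconfig_state(config_text) == "y"
```

On an Ubuntu system the config text would typically come from the file under /boot matching the running kernel; passing text with `CONFIG_RTC_DRV_XGENE=m` makes `rtc_available_to_hctosys` return False, which corresponds to the failing case in this bug.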

affects: wily (Ubuntu) → linux (Ubuntu)
Revision history for this message
Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1499869

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Tim Gardner (timg-tpi)
summary: - maas wily deployment to HP Proliant m400 fails
+ maas wily deployment to HP Proliant m400 arm64 server cartridge fails
Changed in linux (Ubuntu):
assignee: nobody → Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Vivid):
assignee: nobody → Tim Gardner (timg-tpi)
status: New → In Progress
Changed in linux (Ubuntu Wily):
status: Incomplete → In Progress
Revision history for this message
Scott Moser (smoser) wrote :

Changes in cloud-init caused problems when "fixing" the OAuth timestamp on systems with a bad clock.
The change here fixes that. I'll upload later.
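
For readers unfamiliar with the problem: OAuth 1.0 requests carry a timestamp, and a server may reject requests whose timestamp is far from its own clock. One common way to cope with a bad local clock is to read the server's time from an HTTP `Date` response header and offset later timestamps by the difference. The sketch below illustrates that general idea only; it is an assumption for illustration, not the code from the cloud-init fix, and the names `clock_skew` and `oauth_timestamp` are hypothetical.

```python
from email.utils import mktime_tz, parsedate_tz


def clock_skew(date_header, local_now):
    """Return (server time - local time) in whole seconds.

    date_header is the value of an HTTP Date header; local_now is the
    local clock as seconds since the epoch. Returns 0 if the header
    cannot be parsed, so callers fall back to the local clock.
    """
    parsed = parsedate_tz(date_header)
    if parsed is None:
        return 0
    return int(mktime_tz(parsed) - local_now)


def oauth_timestamp(local_now, skew):
    """Return the timestamp to sign with, corrected by a known skew."""
    return int(local_now) + skew
```

A caller would compute the skew once from a rejected response, for example `skew = clock_skew(headers["Date"], time.time())`, and then sign retries with `oauth_timestamp(time.time(), skew)`.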

Changed in cloud-init:
assignee: nobody → Scott Moser (smoser)
importance: Undecided → High
status: New → Confirmed
no longer affects: cloud-init (Ubuntu Vivid)
Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Wily):
status: In Progress → Fix Committed
no longer affects: maas
tags: added: patch
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package cloud-init - 0.7.7~bzr1147-0ubuntu1

---------------
cloud-init (0.7.7~bzr1147-0ubuntu1) wily; urgency=medium

  * New upstream snapshot.
    * MAAS: fix oauth when system clock is bad (LP: #1499869)

 -- Scott Moser <email address hidden> Tue, 29 Sep 2015 20:16:57 -0400

Changed in cloud-init (Ubuntu Wily):
status: New → Fix Released
Brad Figg (brad-figg)
Changed in linux (Ubuntu Vivid):
status: In Progress → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.2.0-14.16

---------------
linux (4.2.0-14.16) wily; urgency=low

  [ Tim Gardner ]

  * Release Tracking Bug
    - LP: #1501818
  * rebase to v4.2.2
  * [Config] CONFIG_RTC_DRV_XGENE=y
    - LP: #1499869

  [ Upstream Kernel Changes ]

  * mei: do not access freed cb in blocking write
    - LP: #1494076
  * mei: bus: fix drivers and devices names confusion
    - LP: #1494076
  * mei: bus: rename nfc.c to bus-fixup.c
    - LP: #1494076
  * mei: bus: move driver api functions at the start of the file
    - LP: #1494076
  * mei: bus: rename uevent handler to mei_cl_device_uevent
    - LP: #1494076
  * mei: bus: don't enable events implicitly in device enable
    - LP: #1494076
  * mei: bus: report if event registration failed
    - LP: #1494076
  * mei: bus: revamp device matching
    - LP: #1494076
  * mei: bus: revamp probe and remove functions
    - LP: #1494076
  * mei: bus: add reference to bus device in struct mei_cl_client
    - LP: #1494076
  * mei: bus: add me client device list infrastructure
    - LP: #1494076
  * mei: bus: enable running fixup routines before device registration
    - LP: #1494076
  * mei: bus: blacklist the nfc info client
    - LP: #1494076
  * mei: bus: blacklist clients by number of connections
    - LP: #1494076
  * mei: bus: simplify how we build nfc bus name
    - LP: #1494076
  * mei: bus: link client devices instead of host clients
    - LP: #1494076
  * mei: support for dynamic clients
    - LP: #1494076
  * mei: disconnect on connection request timeout
    - LP: #1494076
  * mei: define async notification hbm commands
    - LP: #1494076
  * mei: implement async notification hbm messages
    - LP: #1494076
  * mei: enable async event notifications only from hbm version 2.0
    - LP: #1494076
  * mei: add mei_cl_notify_request command
    - LP: #1494076
  * mei: add a handler that waits for notification on event
    - LP: #1494076
  * mei: add async event notification ioctls
    - LP: #1494076
  * mei: support polling for event notification
    - LP: #1494076
  * mei: implement fasync for event notification
    - LP: #1494076
  * mei: bus: add and call callback on notify event
    - LP: #1494076
  * mei: hbm: add new error code MEI_CL_CONN_NOT_ALLOWED
    - LP: #1494076
  * mei: me: d0i3: add the control registers
    - LP: #1494076
  * mei: me: d0i3: add flag to indicate D0i3 support
    - LP: #1494076
  * mei: me: d0i3: enable d0i3 interrupts
    - LP: #1494076
  * mei: hbm: reorganize the power gating responses
    - LP: #1494076
  * mei: me: d0i3: add d0i3 enter/exit state machine
    - LP: #1494076
  * mei: me: d0i3: move mei_me_hw_reset down in the file
    - LP: #1494076
  * mei: me: d0i3: exit d0i3 on driver start and enter it on stop
    - LP: #1494076
  * mei: me: add sunrise point device ids
    - LP: #1494076
  * mei: hbm: bump supported HBM version to 2.0
    - LP: #1494076
  * mei: remove check on pm_runtime_active in __mei_cl_disconnect
    - LP: #1494076
  * mei: fix debugfs files leak on error path
    - LP: #1494076

  [ Upstream Kernel Changes ]

  * rebase to v4.2.2
    - LP: #1492132

 -- Tim Gardner <email address hidden> Tue, 29 Sep 20...


Changed in linux (Ubuntu Wily):
status: Fix Committed → Fix Released
Revision history for this message
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-vivid' to 'verification-done-vivid'.

If verification is not done by 5 working days from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-vivid
Revision history for this message
Craig Magina (craig.magina) wrote :

Vivid deployment succeeded with no issues.

tags: added: verification-done-vivid
removed: verification-needed-vivid
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.19.0-31.36

---------------
linux (3.19.0-31.36) vivid; urgency=low

  [ Luis Henriques ]

  * Release Tracking Bug
    - LP: #1503703

  [ Andy Whitcroft ]

  * Revert "SAUCE: aufs3: mmap: Fix races in madvise_remove() and
    sys_msync()"
    - LP: #1503655

  [ Ben Hutchings ]

  * SAUCE: aufs3: mmap: Fix races in madvise_remove() and sys_msync()
    - LP: #1503655
    - CVE-2015-7312

linux (3.19.0-31.35) vivid; urgency=low

  [ Brad Figg ]

  * Release Tracking Bug
    - LP: #1503005

  [ Ben Hutchings ]

  * SAUCE: aufs3: mmap: Fix races in madvise_remove() and sys_msync()
    - CVE-2015-7312

  [ Craig Magina ]

  * [Config] Add XGENE_EDAC, EDAC_SUPPORT and EDAC_ATOMIC_SCRUB
    - LP: #1494357

  [ John Johansen ]

  * SAUCE: (no-up) apparmor: fix mount not handling disconnected paths
    - LP: #1496430

  [ Laurent Dufour ]

  * SAUCE: powerpc/hvsi: Fix endianness issues in the HVSI driver
    - LP: #1499357

  [ Tim Gardner ]

  * [Config] CONFIG_RTC_DRV_XGENE=y for only arm64
    - LP: #1499869

  [ Upstream Kernel Changes ]

  * Revert "sit: Add gro callbacks to sit_offload"
    - LP: #1500493
  * ipmi/powernv: Fix minor locking bug
    - LP: #1493017
  * mmc: sdhci-pci: set the clear transfer mode register quirk for O2Micro
    - LP: #1472843
  * perf probe ppc: Fix symbol fixup issues due to ELF type
    - LP: #1485528
  * perf probe ppc: Use the right prefix when ignoring SyS symbols on ppc
    - LP: #1485528
  * perf probe ppc: Enable matching against dot symbols automatically
    - LP: #1485528
  * perf probe ppc64le: Fix ppc64 ABIv2 symbol decoding
    - LP: #1485528
  * perf probe ppc64le: Prefer symbol table lookup over DWARF
    - LP: #1485528
  * perf probe ppc64le: Fixup function entry if using kallsyms lookup
    - LP: #1485528
  * perf probe: Improve detection of file/function name in the probe
    pattern
    - LP: #1485528
  * perf probe: Ignore tail calls to probed functions
    - LP: #1485528
  * seccomp: cap SECCOMP_RET_ERRNO data to MAX_ERRNO
    - LP: #1496073
  * EDAC: Cleanup atomic_scrub mess
    - LP: #1494357
  * arm64: Enable EDAC on ARM64
    - LP: #1494357
  * MAINTAINERS: Add entry for APM X-Gene SoC EDAC driver
    - LP: #1494357
  * Documentation: Add documentation for the APM X-Gene SoC EDAC DTS
    binding
    - LP: #1494357
  * EDAC: Add APM X-Gene SoC EDAC driver
    - LP: #1494357
  * arm64: Add APM X-Gene SoC EDAC DTS entries
    - LP: #1494357
  * EDAC, edac_stub: Drop arch-specific include
    - LP: #1494357
  * NVMe: Fix blk-mq hot cpu notification
    - LP: #1498778
  * blk-mq: Shared tag enhancements
    - LP: #1498778
  * blk-mq: avoid access hctx->tags->cpumask before allocation
    - LP: #1498778
  * x86/ldt: Make modify_ldt synchronous
    - LP: #1500493
  * x86/ldt: Correct LDT access in single stepping logic
    - LP: #1500493
  * x86/ldt: Correct FPU emulation access to LDT
    - LP: #1500493
  * md: flush ->event_work before stopping array.
    - LP: #1500493
  * ipv6: addrconf: validate new MTU before applying it
    - LP: #1500493
  * virtio-net: drop NETIF_F_FRAGLIST
    - LP: #1500493
  * RDS: verify the underlying transport exists bef...

Changed in linux (Ubuntu Vivid):
status: Fix Committed → Fix Released
Ryan Harper (raharper)
Changed in curtin:
status: New → Invalid
Joshua Powers (powersj)
Changed in cloud-init:
status: Confirmed → Fix Released