Instance doesn't come up on UEFI localboot with agent ramdisk

Bug #1451310 reported by Ramakrishnan G (rameshg87)
Affects   Status         Importance   Assigned to                  Milestone
Ironic    Fix Released   Undecided    Ramakrishnan G (rameshg87)

Bug Description

I tried to deploy an HP ProLiant DL 580 machine with UEFI localboot and the agent ramdisk, using the iscsi_ilo driver. The deploy goes through fine and Ironic marks the instance active. However, the instance doesn't boot. The following is seen on the bare metal's console:

General Protection Exception
x64 Exception Type 0D
ImageBase = 0000000000000000000000000 CPU APIC ID = 00000000000000000000000
ImageName = (No PDB) EntryPoint = xxxxxxxxxxxxxxxxxxxxxxxxxxx
x
x
x
x
x86 Register information
x
x

Ramakrishnan G (rameshg87) wrote :

iscsi_ilo supports both agent ramdisk and dib ramdisk.

The same scenario works with the DIB ramdisk every time, but fails with the agent ramdisk most of the time.

Ramakrishnan G (rameshg87) wrote :

There is a difference between the agent ramdisk and the DIB ramdisk: with the agent ramdisk, Ironic controls the power of the node, whereas the DIB ramdisk reboots the bare metal from within.

I just figured out that it has something to do with the Linux kernel not completing some tasks before Ironic changes the node's power state. After some experimentation, I found that if I add a sleep of around 20 seconds after doing grub-install, it works fine. So the Linux kernel is doing something during this time frame. I checked that this is not a pending disk operation: I tried blockdev --flush and it didn't help. This *might* be related to firmware, as everything goes through fine in BIOS boot mode.
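The experiment described above can be sketched roughly as follows. This is only an illustration of the workaround, not the code that was actually run: the function name, the `run` and `sleep` parameters, and the bare `grub-install <device>` invocation are assumptions made for the example.

```python
import subprocess
import time


def install_bootloader_with_settle(device, settle_seconds=20,
                                   run=subprocess.check_call,
                                   sleep=time.sleep):
    """Hypothetical helper: install the bootloader, then wait.

    `run` and `sleep` are injectable so the sketch can be exercised
    without touching a real disk.
    """
    # Install the bootloader on the target device (arguments are assumed).
    run(["grub-install", device])
    # Workaround from the report: give the kernel roughly 20 seconds
    # before the node's power state is changed from outside.
    sleep(settle_seconds)
```

A fixed sleep only hides the timing problem, which is why the eventual fix lets the ramdisk shut itself down instead.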

Changed in ironic:
assignee: nobody → Ramakrishnan G (rameshg87)
Changed in ironic:
assignee: Ramakrishnan G (rameshg87) → Lucas Alvares Gomes (lucasagomes)
status: New → In Progress
Changed in ironic:
assignee: Lucas Alvares Gomes (lucasagomes) → Ramakrishnan G (rameshg87)
Changed in ironic:
assignee: Ramakrishnan G (rameshg87) → Jim Rollenhagen (jim-rollenhagen)
Changed in ironic:
assignee: Jim Rollenhagen (jim-rollenhagen) → Ramakrishnan G (rameshg87)
OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic-python-agent (master)

Reviewed: https://review.openstack.org/189241
Committed: https://git.openstack.org/cgit/openstack/ironic-python-agent/commit/?id=be36ed69039ffb4fb9620f9433a3c481ed31581e
Submitter: Jenkins
Branch: master

commit be36ed69039ffb4fb9620f9433a3c481ed31581e
Author: Ramakrishnan G <email address hidden>
Date: Mon Jun 8 02:55:17 2015 -0700

    Add power_off command in standby extension

    This commit adds a new command power_off to
    standby extension which runs shutdown -h now
    on the system. This commit also adds mappings
    for /proc and /sys in cloud-config.yml for the
    agent service spawned.

    Partial-Bug: #1451310
    Change-Id: I2a5f984af26bbbe03002bb8c367c8c6af8d91434
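As a rough illustration of what the commit message above describes, a power-off command that runs `shutdown -h now` could look like the sketch below. The function name and the injectable `run` parameter are assumptions for the example; this is not the actual ironic-python-agent implementation.

```python
import subprocess


def power_off(run=subprocess.check_call):
    """Hypothetical sketch: halt the machine from inside the ramdisk."""
    # The command named in the commit message.
    command = ["shutdown", "-h", "now"]
    # `run` is injectable so the sketch can be tested without halting a host.
    run(command)
    return command
```

Because the shutdown is requested from inside the running system, the kernel gets to go through its normal halt sequence instead of being cut off by the BMC.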

OpenStack Infra (hudson-openstack) wrote : Fix merged to ironic (master)

Reviewed: https://review.openstack.org/185667
Committed: https://git.openstack.org/cgit/openstack/ironic/commit/?id=7adbf0622c8c2c3c52cdb24b235fca4b3afef8d3
Submitter: Jenkins
Branch: master

commit 7adbf0622c8c2c3c52cdb24b235fca4b3afef8d3
Author: Ramakrishnan G <email address hidden>
Date: Tue May 26 14:32:02 2015 +0100

    IPA: Do a soft power off at the end of deployment

    For IPA we currently do a hard reboot at the end of the deployment,
    but that doesn't always play nice with some systems specially when they
    have to deal with UEFI. So this patch is changing the code in Ironic to
    actually tell the ramdisk to power off itself instead of hard powering
    off the machine from BMC. This will give the kernel the necessary
    time to finish sync'ing up whatever it has to sync before rebooting
    the node.

    Depends-On: I2a5f984af26bbbe03002bb8c367c8c6af8d91434
    Co-Authored-By: Lucas Alvares Gomes <email address hidden>
    Closes-Bug: #1451310
    Change-Id: I831d8dc1d15c82a0caab94173c2b9f147017500f
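The approach in the commit message above (ask the ramdisk to power itself off rather than cutting power from the BMC) might be sketched like this. All names, the polling loop, and the fallback to a hard power off are assumptions made for illustration; the commit message only states that the ramdisk is told to power itself off.

```python
import time

POWER_OFF = "power off"  # assumed label for the powered-off state


def soft_power_off(request_ramdisk_power_off, get_power_state,
                   hard_power_off, retries=6, interval=5,
                   sleep=time.sleep):
    """Hypothetical sketch: prefer a soft power off, fall back to hard.

    Returns "soft" if the node powered itself off, "hard" otherwise.
    """
    try:
        # Ask the ramdisk to shut the machine down from within.
        request_ramdisk_power_off()
    except Exception:
        # Assumed fallback: the ramdisk could not be reached.
        hard_power_off()
        return "hard"

    # Poll the power state so the kernel has time to finish its work.
    for _ in range(retries):
        if get_power_state() == POWER_OFF:
            return "soft"
        sleep(interval)

    # Assumed fallback: the node never reported being off.
    hard_power_off()
    return "hard"
```

Passing the three operations in as callables keeps the sketch independent of any particular driver or BMC interface.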

Changed in ironic:
status: In Progress → Fix Committed
Changed in ironic:
milestone: none → 4.0.0
status: Fix Committed → Fix Released