devstack VMs are not booting

Bug #1499054 reported by Jim Rollenhagen
This bug affects 1 person
Affects           Status        Importance  Assigned to   Milestone
Ironic            Invalid       Critical    Unassigned
Ironic Inspector  Invalid       Critical    Unassigned
neutron           Fix Released  Critical    Kevin Benton
Kilo              Fix Released  Undecided   Unassigned

Bug Description

In devstack, VMs are consistently failing to boot the deploy ramdisk. It appears iPXE is failing to configure the NIC, which is usually caused by a DHCP timeout but can also be caused by a bug in the PXE ROM that chainloads to iPXE. See also http://ipxe.org/err/040ee1

Console output:

SeaBIOS (version 1.7.4-20140219_122710-roseapple)
Machine UUID 37679b90-9a59-4a85-8665-df8267e09a3b

iPXE (http://ipxe.org) 00:04.0 CA00 PCI2.10 PnP PMM+3FFC2360+3FF22360 CA00

Booting from ROM...
iPXE (PCI 00:04.0) starting execution...ok
iPXE initialising devices...ok

iPXE 1.0.0+git-20131111.c3d1e78-2ubuntu1.1 -- Open Source Network Boot Firmware
-- http://ipxe.org
Features: HTTP HTTPS iSCSI DNS TFTP AoE bzImage ELF MBOOT PXE PXEXT Menu

net0: 52:54:00:7c:af:9e using 82540em on PCI00:04.0 (open)
  [Link:up, TX:0 TXE:0 RX:0 RXE:0]
Configuring (net0 52:54:00:7c:af:9e).................. Error 0x040ee119 (http://ipxe.org/040ee119)
No more network devices

No bootable device.
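
When triaging console logs like the one above, it can help to pull out the iPXE error code automatically. The following is a minimal sketch, not part of the original report: the function name `find_ipxe_errors` is hypothetical, and the rule that the error page is keyed by the first six hex digits of the code is an assumption based only on the two URLs that appear in this report.

```python
import re

# Matches the 8-digit hexadecimal code in lines such as
# "Configuring (net0 ...)....... Error 0x040ee119 (http://"
IPXE_ERROR_RE = re.compile(r"Error 0x([0-9a-fA-F]{8})")


def find_ipxe_errors(console_log):
    """Return a list of (error_code, error_page_url) pairs found in the log."""
    results = []
    for match in IPXE_ERROR_RE.finditer(console_log):
        code = match.group(1).lower()
        # Assumption: the error page uses the first six hex digits of the
        # code, as suggested by http://ipxe.org/err/040ee1 in this report.
        results.append((code, "http://ipxe.org/err/" + code[:6]))
    return results
```

Calling `find_ipxe_errors()` on the console output from this report would yield the code `040ee119` paired with the error page linked in the bug description.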

Jim Rollenhagen (jim-rollenhagen) wrote :
Changed in ironic:
status: New → Confirmed
importance: Undecided → Critical
Jim Rollenhagen (jim-rollenhagen) wrote :

This revert fixes the problem locally for me: https://review.openstack.org/#/c/226969/

tags: added: liberty-backport-potential
Dmitry Tantsur (divius)
Changed in ironic-inspector:
status: New → Confirmed
importance: Undecided → Critical
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/226995

Changed in neutron:
assignee: nobody → Kevin Benton (kevinbenton)
status: New → In Progress
Kyle Mestery (mestery) wrote :

We need to either land this fix or the revert [1] before we can cut either ironic-rc1 or neutron-rc1 for Liberty.

[1] https://review.openstack.org/#/c/226969/

Changed in neutron:
importance: Undecided → Critical
milestone: none → liberty-rc1
OpenStack Infra (hudson-openstack) wrote : Change abandoned on neutron (master)

Change abandoned by Jim Rollenhagen (<email address hidden>) on branch: master
Review: https://review.openstack.org/226969
Reason: Abandoned in favor of a real fix

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/226995
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=22f5fef5251cff9dbce35ca9a0ec8ea3b42f359c
Submitter: Jenkins
Branch: master

commit 22f5fef5251cff9dbce35ca9a0ec8ea3b42f359c
Author: Kevin Benton <email address hidden>
Date: Fri Sep 18 07:15:45 2015 -0700

    Don't write DHCP opts for SLAAC entries

    Change I81b4669eadaa9119e08c6a5e1d2a7b5959babdcc
    caused DHCP options to be written for SLAAC entries
    when they previously were not. This restores the previous
    behavior.

    Closes-Bug: #1499054
    Change-Id: I81400305f166d62aa4612aab54602abb8178b64c
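
As a rough illustration of the behavior the commit message describes, the sketch below filters out SLAAC subnets before any DHCP options would be written for them. It is not the neutron patch itself: the `Subnet` class and the `subnets_needing_dhcp_opts` function are hypothetical names, and only the `slaac` value of the subnet's `ipv6_address_mode` attribute is taken from the Neutron API.

```python
from dataclasses import dataclass
from typing import List, Optional

# Value of a subnet's ipv6_address_mode for stateless address
# autoconfiguration, as used by the Neutron API.
SLAAC = "slaac"


@dataclass
class Subnet:
    """Hypothetical stand-in for a subnet record; not neutron's model."""
    id: str
    ip_version: int
    ipv6_address_mode: Optional[str] = None


def subnets_needing_dhcp_opts(subnets: List[Subnet]) -> List[Subnet]:
    """Return only the subnets for which DHCP options should be written.

    SLAAC subnets are skipped because their addresses are configured from
    router advertisements rather than handed out by the DHCP server.
    """
    return [
        s for s in subnets
        if not (s.ip_version == 6 and s.ipv6_address_mode == SLAAC)
    ]
```

With an IPv4 subnet, a SLAAC subnet, and a `dhcpv6-stateful` subnet as input, only the first and last would be returned.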

Changed in neutron:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx) wrote :

This was actually merged after the branch cut and might need a backport now

Changed in neutron:
status: Fix Released → Fix Committed
Dmitry Tantsur (divius) wrote :

That fixed the gate for ironic and inspector, thanks.

Changed in ironic:
status: Confirmed → Invalid
Changed in ironic-inspector:
status: Confirmed → Invalid
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/227304

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/227425

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/227304
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=69662eb3cad925a3933f2ba3e5ef03dbde52ffc9
Submitter: Jenkins
Branch: stable/liberty

commit 69662eb3cad925a3933f2ba3e5ef03dbde52ffc9
Author: Kevin Benton <email address hidden>
Date: Fri Sep 18 07:15:45 2015 -0700

    Don't write DHCP opts for SLAAC entries

    Change I81b4669eadaa9119e08c6a5e1d2a7b5959babdcc
    caused DHCP options to be written for SLAAC entries
    when they previously were not. This restores the previous
    behavior.

    Closes-Bug: #1499054
    Change-Id: I81400305f166d62aa4612aab54602abb8178b64c
    (cherry picked from commit 22f5fef5251cff9dbce35ca9a0ec8ea3b42f359c)

tags: added: in-stable-liberty
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/kilo)

Reviewed: https://review.openstack.org/227425
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=d78899dc90f01ea39823900444c8c3c070ed091f
Submitter: Jenkins
Branch: stable/kilo

commit d78899dc90f01ea39823900444c8c3c070ed091f
Author: Kevin Benton <email address hidden>
Date: Fri Sep 18 07:15:45 2015 -0700

    Don't write DHCP opts for SLAAC entries

    Change I81b4669eadaa9119e08c6a5e1d2a7b5959babdcc
    caused DHCP options to be written for SLAAC entries
    when they previously were not. This restores the previous
    behavior.

    Conflicts:
     neutron/agent/linux/dhcp.py

    Closes-Bug: #1499054
    Change-Id: I81400305f166d62aa4612aab54602abb8178b64c
    (cherry picked from commit 22f5fef5251cff9dbce35ca9a0ec8ea3b42f359c)

tags: added: in-stable-kilo
tags: removed: liberty-backport-potential
Thierry Carrez (ttx)
Changed in neutron:
milestone: liberty-rc1 → 7.0.0
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/235300

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/235300
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=140ccc36d172bead2605968b3d61b36cca8a0040
Submitter: Jenkins
Branch: master

commit 6dcfe3a9362ae5fcf18e5cfb59663e43446cd59c
Author: Kevin Benton <email address hidden>
Date: Tue Oct 6 19:28:47 2015 -0700

    Mock oslo policy HTTPCheck instead of urllib

    We were mocking internal behavior of oslo policy by
    patching urllib. This will break with the upcoming oslo
    release that switches to requests.

    This patch changes the mock to the HTTPCheck level and we
    can leave implementation details testing up to oslo_policy.

    Change-Id: I07957f01307e25f1547197c720eea6e3e7f0ef5a
    Closes-Bug: #1503890
    (cherry picked from commit a0f1d9d6de1560be91d3001c8ac9f880a7a5a7e0)

    Add testresources used by oslo.db fixture

    If we use oslo.db fixtures, we'll need the package or
    the next version of oslo.db release will break us.

    Closes-Bug: #1503501
    Change-Id: I7dfbf240333095d91a414ba15a439bdc4804eb25
    (cherry picked from commit 86ad967e40c2c6752ec0fb46cfd3098ede0c7178)

    Fix functional test_server tests

    Now oslo.service 0.10.0 no longer sends SIGHUP to parent and
    children services.

    This was a change introduced by 286a6ea, and since it invalidated
    the very logic under test, this must be revised.

    (cherry picked from commit 090fe713592c2b6398d999bfa03b80cbb2054609)

    Change-Id: I18a11283925369bc918002477774f196010a1bc3
    Closes-bug: #1505438
    (cherry picked from commit 090fe713592c2b6398d999bfa03b80cbb2054609)

    Make test_server work with older versions of oslo.service

    Change I18a11283925369bc918002477774f196010a1bc3 fixed the test for
    oslo.service >= 0.10.0, but it also broke it for older versions of
    oslo.service. Since the library has a minimum version of >= 0.7.0 in
    requirements.txt, the test should pass for those versions too.

    Now, instead of validating that either reset() or restart() of workers
    are triggered on SIGHUP, just validate that .start() is triggered the
    expected number of times (either way, no matter how oslo.service decides
    to clean up the children, they exit and then are respawned).

    Change-Id: I41f9d3af780b3178b075bc1e7084f417a2bd1378
    Closes-Bug: #1505645
    (cherry picked from commit 7bb40921660cf29beb68e338e205499efd6ffa36)

    Fixed multiple py34 gate issues

    1. Scope mock of 'open' to module

    By mocking 'open' at the module level, we can avoid affecting
    'open' calls from other modules.

    2. Stop using LOG.exception in contexts with no sys.exc_info set

    Python 3.4 logger fills in record.exc_info with sys.exc_info() result
    [1], and then it uses it to determine the current exception [2] to
    append to the log message. Since there is no exception, exc_info[1] is
    None, and we get AttributeError inside traceback module.

    It's actually a bug in the Python interpreter that it attempts to access the
    attribute when there is no exception. It turns out that it's fixed in
    the latest master of CPython [3] (...

Thierry Carrez (ttx) wrote : Fix included in openstack/neutron 8.0.0.0b1

This issue was fixed in the openstack/neutron 8.0.0.0b1 development milestone.
