ubuntu bootstrap: node fails to boot (hardware dependent)

Bug #1485188 reported by Leontii Istomin
This bug affects 1 person
Affects: Fuel for OpenStack
Status: Fix Released
Importance: High
Assigned to: Alexei Sheplyakov
Milestone: 7.0

Bug Description

During node discovery, one node was not discovered. It hangs for around a minute (a screenshot from IPMI is attached), then reboots. This repeats in a cycle.

We use bare-metal nodes; Fuel runs in a KVM virtual machine.

api: '1.0'
astute_sha: e24ca066bf6160bc1e419aaa5d486cad1aaa937d
auth_required: true
build_id: 2015-08-14_21-21-22
build_number: '177'
feature_groups:
- mirantis
fuel-agent_sha: 57145b1d8804389304cd04322ba0fb3dc9d30327
fuel-library_sha: 9de2625d26c3b88d22082baecd789b6bd5ddf3fa
fuel-nailgun-agent_sha: e01693992d7a0304d926b922b43f3b747c35964c
fuel-ostf_sha: 17786b86b78e5b66d2b1c15500186648df10c63d
fuelmain_sha: d8c726645be087bc67e2eeca134f0f9747cfeacd
nailgun_sha: 4710801a2f4a6d61d652f8f1e64215d9dde37d2e
openstack_version: 2015.1.0-7.0
production: docker
python-fuelclient_sha: 4c74a60aa60c06c136d9197c7d09fa4f8c8e2863
release: '7.0'

Diagnostic Snapshot: http://mos-scale-share.mirantis.com/fuel-snapshot-2015-08-15_08-12-06.tar.xz

Tags: scale
Leontii Istomin (listomin) wrote :

From the screenshot I found that the network interface that bootstrap wants to use to connect to the admin network has the right MAC address but the wrong number (eth0). It should actually be eth2.

Alexei Sheplyakov (asheplyakov) wrote :

> the network interface that bootstrap wants to use to connect to the admin network has the right MAC address
> but the wrong number (eth0). It should actually be eth2.

The name of the interface is irrelevant - it's not guaranteed to be persistent.

Could you please collect the hardware information (lspci -vv; dmidecode) and capture the DHCP traffic
(from the master node)?

summary: - ubuntu bootstrap wrong determines network interfaces
+ ubuntu bootstrap: node fails to boot (hardware dependent)
Changed in fuel:
status: New → Incomplete
assignee: nobody → Leontiy Istomin (listomin)
Alexei Sheplyakov (asheplyakov) wrote :

live-boot picks the right NIC; however, on this particular machine the NIC takes quite a long time
to initialize (about a minute), so live-boot bails out. Increasing the timeout solves the problem.

Changed in fuel:
status: Incomplete → Confirmed
assignee: Leontiy Istomin (listomin) → Alexei Sheplyakov (asheplyakov)
importance: Undecided → High
milestone: none → 7.0
OpenStack Infra (hudson-openstack) wrote : Fix proposed to fuel-library (master)

Fix proposed to branch: master
Review: https://review.openstack.org/213519

Changed in fuel:
status: Confirmed → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix merged to fuel-library (master)

Reviewed: https://review.openstack.org/213519
Committed: https://git.openstack.org/cgit/stackforge/fuel-library/commit/?id=efe317702f911db5d04c9d2430c98ac4df9ff1c6
Submitter: Jenkins
Branch: master

commit efe317702f911db5d04c9d2430c98ac4df9ff1c6
Author: Alexey Sheplyakov <email address hidden>
Date: Sun Aug 16 19:02:48 2015 +0300

    nailgun::cobbler: bootstrap: adjust network configuration timeout

    live-boot assumes 15 sec is enough to setup the network interface
    However some Ethernet cards might take (much) longer to initialize,
    which causes a boot failure. To solve the issue set the timeout to
    2 minutes by default, and make it configurable via the
    BOOTSTRAP/ethdevice_timeout knob in astute.yaml

    DocImpact: bootstrap configuration item: ethdevice_timeout
    Closes-Bug: #1485188
    Change-Id: I4ac18aa229f13a9283b9bb0b19f1ce36e43954da
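
For reference, the knob described in the commit message might be set as in the fragment below. This is only a minimal sketch: it assumes that ethdevice_timeout is a key nested under a top-level BOOTSTRAP mapping in astute.yaml and that the value is given in seconds (so 120 would correspond to the 2-minute default). Neither the exact layout nor the value type is shown in this report, so check the merged change before relying on it.

```yaml
# Hypothetical astute.yaml fragment (layout and units are assumptions).
BOOTSTRAP:
  # How long the bootstrap should wait for the NIC to come up.
  # Raise this for Ethernet cards that initialize slowly.
  ethdevice_timeout: 120
```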

Changed in fuel:
status: In Progress → Fix Committed
Leontii Istomin (listomin) wrote :

This has not been reproduced since at least build 7.0-300.

Changed in fuel:
status: Fix Committed → Fix Released