mirantis 9.0 fuel PXE boot fails

Bug #1610967 reported by koren
This bug affects 2 people
Affects            | Status  | Importance | Assigned to | Milestone
Fuel for OpenStack | Invalid | High       | koren       |
Mitaka             | Invalid | High       | koren       |

Bug Description

Fuel 9.0, latest ISO, downloaded Aug 7 2016.
A slave node booting from PXE, with 16G RAM and 2 vCPUs, gets stuck at the "random: nonblocking pool is initialized" step during boot and then reboots constantly. Reproduced several times on a vSphere setup.
The issue is not seen with the same setup and the same resources using Fuel 8.0. As a result, we can't initialize an environment.

Tags: area-library
Revision history for this message
koren (korenlev) wrote :

from /var/log/cobbler/tasks - 2016-08-08_102920_sync.log :

Mon Aug 8 10:29:21 2016 - INFO | running pre-sync triggers
Mon Aug 8 10:29:21 2016 - INFO | cleaning trees
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/www/cobbler/images/ubuntu_bootstrap
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/www/cobbler/images/ubuntu_1404_x86_64
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/www/cobbler/images/centos-x86_64
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/tftpboot/pxelinux.cfg/default
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/tftpboot/grub/images
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/tftpboot/grub/efidefault
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/tftpboot/images/ubuntu_bootstrap
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/tftpboot/images/ubuntu_1404_x86_64
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/tftpboot/images/centos-x86_64
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/tftpboot/s390x/profile_list
Mon Aug 8 10:29:21 2016 - INFO | copying bootloaders
Mon Aug 8 10:29:21 2016 - INFO | copying: /usr/share/syslinux/pxelinux.0 -> /var/lib/tftpboot/pxelinux.0
Mon Aug 8 10:29:21 2016 - INFO | copying: /usr/share/syslinux/menu.c32 -> /var/lib/tftpboot/menu.c32
Mon Aug 8 10:29:21 2016 - INFO | copying: /usr/share/syslinux/memdisk -> /var/lib/tftpboot/memdisk
Mon Aug 8 10:29:21 2016 - INFO | copying distros to tftpboot
Mon Aug 8 10:29:21 2016 - INFO | copying files for distro: centos-x86_64
Mon Aug 8 10:29:21 2016 - INFO | trying hardlink /var/www/nailgun/centos/x86_64/isolinux/vmlinuz -> /var/lib/tftpboot/images/centos-x86_64/vmlinuz
Mon Aug 8 10:29:21 2016 - INFO | trying hardlink /var/www/nailgun/centos/x86_64/isolinux/initrd.img -> /var/lib/tftpboot/images/centos-x86_64/initrd.img
Mon Aug 8 10:29:21 2016 - INFO | copying files for distro: ubuntu_bootstrap
Mon Aug 8 10:29:21 2016 - INFO | trying hardlink /var/www/nailgun/bootstraps/active_bootstrap/vmlinuz -> /var/lib/tftpboot/images/ubuntu_bootstrap/vmlinuz
Mon Aug 8 10:29:21 2016 - INFO | trying hardlink /var/www/nailgun/bootstraps/active_bootstrap/initrd.img -> /var/lib/tftpboot/images/ubuntu_bootstrap/initrd.img
Mon Aug 8 10:29:21 2016 - INFO | copying files for distro: ubuntu_1404_x86_64
Mon Aug 8 10:29:21 2016 - INFO | running: /usr/bin/sha1sum /var/www/nailgun/ubuntu/x86_64/images/linux
Mon Aug 8 10:29:21 2016 - INFO | received on stdout: da39a3ee5e6b4b0d3255bfef95601890afd80709 /var/www/nailgun/ubuntu/x86_64/images/linux

Mon Aug 8 10:29:21 2016 - DEBUG | received on stderr:
Mon Aug 8 10:29:21 2016 - DEBUG | trying cachelink /var/www/nailgun/ubuntu/x86_64/images/linux -> /var/lib/tftpboot/images/.link_cache/da39a3ee5e6b4b0d3255bfef95601890afd80709 -> /var/lib/tftpboot/images/ubuntu_1404_x86_64/linux
Mon Aug 8 10:29:21 2016 - INFO | running: /usr/bin/sha1sum /var/www/nailgun/ubuntu/x86_64/images/initrd.gz
Mon Aug 8 10:29:21 2016 - INFO | received on stdout: da39a3ee5e6b4b0d3255bfef95601890afd80709 /var/www/nailgun/ubuntu/x86_64/images/initrd.gz

Mon Aug 8 10:29:21 2016 - DEBUG | received on stderr:
Mon Aug 8 10:29:21 2016 - DEBUG | trying cachelink /var/www/nailgun/ubuntu/x86_64/images/in...
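
A side note on the log above, offered as an observation rather than a diagnosis: the digest `da39a3ee5e6b4b0d3255bfef95601890afd80709` reported for both `linux` and `initrd.gz` is the SHA-1 of zero-length input, which suggests those two files were empty when cobbler hashed them. Whether that is expected for the `ubuntu_1404_x86_64` distro, or related to the boot failure at all, is not established in this report. A minimal Python check (the helper name is made up for illustration):

```python
import hashlib

# Digest reported by cobbler for both files in the sync log above.
LOGGED_DIGEST = "da39a3ee5e6b4b0d3255bfef95601890afd80709"


def is_empty_input_digest(digest):
    """Return True if `digest` equals the SHA-1 of zero bytes."""
    return digest.lower() == hashlib.sha1(b"").hexdigest()


print(is_empty_input_digest(LOGGED_DIGEST))
```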


no longer affects: mos
tags: added: area-library
Revision history for this message
Dmitry Pyzhov (dpyzhov) wrote :

Please attach diagnostic snapshot.

Dmitry Pyzhov (dpyzhov)
no longer affects: fuel/newton
Revision history for this message
Dmitry Klenov (dklenov) wrote :

Closing because the requested data was not provided within a month. @koren, feel free to attach the snapshot and reopen the bug.

Changed in fuel:
status: Incomplete → Invalid
Revision history for this message
koren (korenlev) wrote :

snapshot attached (from our broken mirantis 9.0 setup)

Revision history for this message
koren (korenlev) wrote :

re-open bug ?

Changed in fuel:
status: Invalid → In Progress
status: In Progress → New
Changed in fuel:
status: New → Confirmed
Revision history for this message
Alexander Gordeev (a-gordeev) wrote :

Hello @koren,

It might be related to the missing 'nomodeset' option: http://blog.jamesrhall.com/2014/04/ubuntu-server-1404-fun.html

Could you give it a try?

You need to append this option to the `extend_kopts` string in /etc/fuel-bootstrap-cli/fuel_bootstrap_cli.yaml on the master node and rebuild the image like this:

$ fuel-bootstrap build --activate
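
To make the "append this option" step concrete, here is a small Python sketch of adding a flag to a space-separated kernel option string without duplicating it. It only manipulates the string; it does not read or write fuel_bootstrap_cli.yaml. The helper name `append_kopt` is hypothetical, and the starting value is the `extend_kopts` string quoted elsewhere in this report.

```python
def append_kopt(kopts, option):
    """Append `option` to a space-separated option string unless already present."""
    tokens = kopts.split()
    if option not in tokens:
        tokens.append(option)
    return " ".join(tokens)


# extend_kopts value as quoted in this report (before adding nomodeset).
current = ("biosdevname=0 net.ifnames=1 debug ignore_loglevel "
           "log_buf_len=10M print_fatal_signals=1 LOGLEVEL=8")

updated = append_kopt(current, "nomodeset")
print(updated)
```

Running the helper a second time on its own output leaves the string unchanged, so it is safe to apply repeatedly.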

Revision history for this message
Alexander Gordeev (a-gordeev) wrote :

It looks like the cmdline options for the bootstrap image and the IBP images are heavily misaligned:

Every release passes the `nomodeset` option to the kernel:
1) https://github.com/openstack/fuel-web/blob/064cb5bda6db404b2c8fc8c60ee37bfe881b01ee/nailgun/nailgun/fixtures/openstack.yaml#L1316
2) https://github.com/openstack/fuel-web/blob/064cb5bda6db404b2c8fc8c60ee37bfe881b01ee/nailgun/nailgun/fixtures/openstack.yaml#L1948
3) https://github.com/openstack/fuel-web/blob/064cb5bda6db404b2c8fc8c60ee37bfe881b01ee/nailgun/nailgun/fixtures/openstack.yaml#L2021

but there's no `nomodeset` for the bootstrap image:

https://github.com/openstack/fuel-agent/blob/stable/mitaka/contrib/fuel_bootstrap/fuel_bootstrap_cli/fuel_bootstrap/settings.yaml.sample#L6

I'm not absolutely sure that `nomodeset` will resolve the issue. Either way, somebody should probably fix the bootstrap config file under 'Partial-Bug' or 'Related-Bug'.

Revision history for this message
koren (korenlev) wrote : Re: [Bug 1610967] Re: mirantis 9.0 fuel PXE boot fails

Hi,
I did what you asked:

1. In /etc/fuel-bootstrap-cli/fuel_bootstrap_cli.yaml I added:
extend_kopts: "biosdevname=0 net.ifnames=1 debug ignore_loglevel log_buf_len=10M print_fatal_signals=1 LOGLEVEL=8 nomodeset"

2. [root@fuel3 fuel-bootstrap-cli]# fuel-bootstrap build --activate
Try to build image with data:
bootstrap:
  certs: null
  container: {format: tar.gz, meta_file: metadata.yaml}
  extend_kopts: biosdevname=0 net.ifnames=1 debug ignore_loglevel log_buf_len=10M
    print_fatal_signals=1 LOGLEVEL=8 nomodeset
  extra_files: [/usr/share/fuel_bootstrap_cli/files/trusty]
  label: b23d1405-cb87-4b35-b19a-baaf7673e494
  modules:
  - {mask: kernel, name: kernel, uri: 'http://127.0.0.1:8080/bootstraps/b23d1405-cb87-4b35-b19a-baaf7673e494/vmlinuz'}
  - {compress_format: xz, mask: initrd, name: initrd, uri: 'http://127.0.0.1:8080/bootstraps/b23d1405-cb87-4b35-b19a-baaf7673e494/initrd.img'}
  - &id001 {compress_format: xz, container: raw, format: ext4, mask: rootfs, name: rootfs,
    uri: 'http://127.0.0.1:8080/bootstraps/b23d1405-cb87-4b35-b19a-baaf7673e494/root.squashfs'}
  post_script_file: null
  root_ssh_authorized_file: /root/.ssh/id_rsa.pub
  uuid: b23d1405-cb87-4b35-b19a-baaf7673e494
codename: trusty
hashed_root_password: $6$fEpl6cX2zq.bt49d$eCD64eGeN3Vja2PTqyebaEfLiSFnqKzWrRRWZzqELLnqb3b3zYBWGt9iWBMmiUCEoBiyRXaY2MgzCn045oh7s.
image_data:
  /: *id001
output: /tmp/b23d1405-cb87-4b35-b19a-baaf7673e494.tar.gz
packages: [squashfs-tools, nailgun-agent, nailgun-mcagents, live-boot, openssh-server,
  i40e-dkms, vim, fuel-agent, mcollective, xz-utils, msmtp-mta, wget, hpsa-dkms, multipath-tools-boot,
  multipath-tools, hwloc, openssh-client, mc, ntp, linux-firmware, linux-firmware-nonfree,
  network-checker, live-boot-initramfs-tools, linux-headers-generic, linux-image-generic-lts-trusty,
  ubuntu-minimal]
proxies: {}
repos:
- {name: ubuntu, priority: null, section: main universe multiverse, suite: trusty,
  type: deb, uri: 'http://archive.ubuntu.com/ubuntu'}
- {name: ubuntu-updates, priority: null, section: main universe multiverse, suite: trusty-updates,
  type: deb, uri: 'http://archive.ubuntu.com/ubuntu'}
- {name: ubuntu-security, priority: null, section: main universe multiverse, suite: trusty-security,
  type: deb, uri: 'http://archive.ubuntu.com/ubuntu'}
- {name: mos, priority: 1050, section: main restricted, suite: mos9.0, type: deb,
  uri: 'http://127.0.0.1:8080/ubuntu/x86_64'}
- {name: mos-updates, priority: 1050, section: main restricted, suite: mos9.0-updates,
  type: deb, uri: 'http://mirror.fuel-infra.org/mos-repos/ubuntu/9.0'}
- {name: mos-security, priority: 1050, section: main restricted, suite: mos9.0-security,
  type: deb, uri: 'http://mirror.fuel-infra.org/mos-repos/ubuntu/9.0'}
- {name: mos-holdback, priority: 1100, section: main restricted, suite: mos9.0-holdback,
  type: deb, uri: 'http://mirror.fuel-infra.org/mos-repos/ubuntu/9.0'}
root_password: null

Build process is in progress. Usually it takes 15-20 minutes. It depends on
your internet connection and hardware performance.
--- Building bootstrap image (do_mkbootstrap) ---
*** Preparing image space ***
Installing BASE operating system into image

Starting new HTTP conn...


Revision history for this message
Alexander Gordeev (a-gordeev) wrote :

Assigning to MOS-Linux team, since it's tricky to debug and `nomodeset` didn't help. Guys, please take a look at the issue.

Changed in fuel:
assignee: Fuel Sustaining (fuel-sustaining-team) → MOS Linux (mos-linux)
Revision history for this message
Dmitry Teselkin (teselkin-d) wrote :

Please confirm that 'nomodeset' was added to the cobbler profile. What does the command below show?

  cobbler profile report --name ubuntu_bootstrap | grep 'Kernel Options'

I saw similar issues with KVM and found that 'nomodeset' was absent. After adding it manually via 'cobbler profile edit ...', the issue went away. Here is the output of the command above on my system:

---
[root@nailgun ~]# cobbler profile report --name ubuntu_bootstrap | grep 'Kernel Options'
Kernel Options : {'nomodeset': '~', 'console': ['ttyS0,9600', 'tty0'], 'LOGLEVEL': '8', 'ip': 'frommedia', 'boot': 'live', 'mco_user': 'mcollective', 'toram': '~', 'ignore_loglevel': '~', 'biosdevname': '0', 'url': 'http://10.20.0.2:8000/api', 'components': '~', 'fetch': 'http://10.20.0.2:8080/bootstraps/active_bootstrap/root.squashfs', 'debug': '~', 'mco_pass': 'PrmMzurdVnk9ifKzbAhZg2QL', 'net.ifnames': '1', 'panic': '60', 'log_buf_len': '10M', 'ethdevice-timeout': '120', 'print_fatal_signals': '1'}
Kernel Options (Post Install) : {}
---

Anyway, I had to restart 6 VMs simultaneously to reproduce the failure; it never occurred when I restarted the nodes one by one.
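
For anyone who wants to check the profile output programmatically rather than by eye, here is a rough Python sketch. It assumes the report line has the `Kernel Options : {...}` shape shown above and that the braces contain a valid Python literal, which is how the output looks in this report but is not something I have verified against cobbler's documentation. The function name and the abbreviated sample text are made up for illustration.

```python
import ast


def find_kernel_options(report_text):
    """Return the dict from the 'Kernel Options : {...}' line, or None if absent."""
    for line in report_text.splitlines():
        key, sep, value = line.partition(":")
        # Match only the exact label, not 'Kernel Options (Post Install)'.
        if sep and key.strip() == "Kernel Options":
            return ast.literal_eval(value.strip())
    return None


# Abbreviated copy of the grep output quoted above (most keys omitted).
SAMPLE = """\
Kernel Options : {'nomodeset': '~', 'console': ['ttyS0,9600', 'tty0'], 'panic': '60', 'ethdevice-timeout': '120'}
Kernel Options (Post Install) : {}
"""

opts = find_kernel_options(SAMPLE)
print("nomodeset" in opts)
print(opts["ethdevice-timeout"])
```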

Revision history for this message
koren (korenlev) wrote :

output from our setup:

[root@fuel3 ~]# cobbler profile report --name ubuntu_bootstrap | grep 'Kernel Options'
Kernel Options : {'nomodeset': '~', 'console': ['ttyS0,9600', 'tty0'], 'LOGLEVEL': '8', 'ip': 'frommedia', 'boot': 'live', 'net.ifnames': '1', 'toram': '~', 'ignore_loglevel': '~', 'biosdevname': '0', 'url': 'http://192.168.100.254:8000/api', 'components': '~', 'debug': '~', 'mco_pass': 'UjQL1L9u8lke5dE9Gn4Cl9S4', 'log_buf_len': '10M', 'mco_user': 'mcollective', 'panic': '60', 'fetch': 'http://192.168.100.254:8080/bootstraps/active_bootstrap/root.squashfs', 'ethdevice-timeout': '120', 'print_fatal_signals': '1'}
Kernel Options (Post Install) : {}
[root@fuel3 ~]#

Looks good? If yes, still no node is able to boot...


Revision history for this message
Albert Syriy (asyriy) wrote :

Hello koren,

Could you attach your bootstrap files (initrd.img and root.squashfs) from the active bootstrap folder on the fuel-master node?
The folder is:
/var/www/nailgun/bootstraps/active_bootstrap/

Could you also check which kernel version is used in the MOS-8.0 bootstrap?

Revision history for this message
Albert Syriy (asyriy) wrote :

And another question,
How many nodes do you have? (Is the issue reproducible on one node?)

Revision history for this message
Albert Syriy (asyriy) wrote :

Hello Koren,

I am still looking forward to your initrd.img file. Please attach it to the bug.

I built the initrd-new.img, which outputs the debug messages to console during the boot.

Please copy the initrd-new.img into the folder
/var/www/nailgun/bootstraps/active_bootstrap/
rename it to initrd.img and run the following command (to update the initrd.img in the cache):

"fuel-bootstrap activate <id-of-your-active-bootstrap>"

In my case (based on the output of the "fuel-bootstrap list" command below):
$ fuel-bootstrap list
+--------------------------------------+--------------------------------------+--------+
| uuid | label | status |
+--------------------------------------+--------------------------------------+--------+
| 4314e0cf-4e7c-4986-8bf1-2b7e9058e4a2 | 4314e0cf-4e7c-4986-8bf1-2b7e9058e4a2 | active |
+--------------------------------------+--------------------------------------+--------+

the command should be:

fuel-bootstrap activate 4314e0cf-4e7c-4986-8bf1-2b7e9058e4a2

In your case the ID could be different.
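
If it helps, the active ID can also be picked out of the table with a few lines of Python. This is only a sketch that assumes the three-column ASCII table layout shown above; the function name is hypothetical, and it does not call `fuel-bootstrap` itself.

```python
def active_bootstrap_uuid(table_text):
    """Return the uuid of the row whose status column is 'active', or None."""
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +-----+ border lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 3 and cells[2] == "active":
            return cells[0]
    return None


# Output of "fuel-bootstrap list" as pasted above.
LISTING = """\
+--------------------------------------+--------------------------------------+--------+
| uuid | label | status |
+--------------------------------------+--------------------------------------+--------+
| 4314e0cf-4e7c-4986-8bf1-2b7e9058e4a2 | 4314e0cf-4e7c-4986-8bf1-2b7e9058e4a2 | active |
+--------------------------------------+--------------------------------------+--------+
"""

print(active_bootstrap_uuid(LISTING))
```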

After that, please add the "debug" key to the kernel parameters passed on boot and reboot one node.
You will see in the output what happens during the init script.

Please attach (to the bug) a screenshot showing where the node gets stuck before it reboots by timeout.

Looking forward to your update,
Albert

Revision history for this message
Albert Syriy (asyriy) wrote :
Changed in fuel:
status: Confirmed → Incomplete
assignee: MOS Linux (mos-linux) → koren (korenlev)
Revision history for this message
koren (korenlev) wrote :

Hi, here is the screenshot from a node which fails to boot from Fuel (using the initrd.img I sent).
I am now trying to activate your new image and will update soon.


Revision history for this message
koren (korenlev) wrote :

I used your new image file: copied it, renamed it, moved it to replace the previous one, then ran 'activate' as suggested.
Attached is the screenshot from another failed boot using this new image.


Revision history for this message
koren (korenlev) wrote :


Revision history for this message
Albert Syriy (asyriy) wrote :

Hello koren,

Thank you for providing screenshots.

According to the log, the boot process gets stuck while initializing the network interface ens160 (the ipconfig command), with a timeout of 120 seconds.

For details, see the code:

https://github.com/Webconverger/webc/blob/master/lib/live/boot/9990-networking.sh#L98L100

Is it possible that the timeout for rebooting the node is less than the network card timeout (120 sec)?

Could you decrease the network card timeout from 120 sec to, let's say, 60?
To do that, please pass to the kernel the parameter

ethdevice-timeout=60
instead of
ethdevice-timeout=120
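
As an illustration of the substitution, here is a short Python sketch that replaces the value of one `key=value` kernel parameter in a space-separated string, or appends it when it is missing. The helper name `set_kopt` is hypothetical, and the `before` string is my own rendering of the `panic` and `ethdevice-timeout` values reported by cobbler earlier in this bug, not a real command line copied from a node.

```python
def set_kopt(kopts, key, value):
    """Set `key=value` in a space-separated option string, replacing any old value."""
    new_token = f"{key}={value}"
    tokens = kopts.split()
    found = False
    for i, token in enumerate(tokens):
        if token.split("=", 1)[0] == key:
            tokens[i] = new_token
            found = True
    if not found:
        tokens.append(new_token)
    return " ".join(tokens)


# Illustrative excerpt only; values taken from the cobbler profile output in this report.
before = "panic=60 ethdevice-timeout=120"
after = set_kopt(before, "ethdevice-timeout", 60)
print(after)
```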

Revision history for this message
Albert Syriy (asyriy) wrote :

BTW,

Compared to MOS-8.0, do you have the same value for ethdevice-timeout?

Revision history for this message
Albert Syriy (asyriy) wrote :

Hello Koren,

Could you decrease the timeout for initializing the network card?

Please edit the /var/www/nailgun/bootstraps/active_bootstrap/metadata.yaml file
and add the parameter 'ethdevice-timeout=60' to the kernel parameters:
-----------------
extend_kopts: biosdevname=0 net.ifnames=1 debug ignore_loglevel log_buf_len=10M print_fatal_signals=1
  LOGLEVEL=8
----------------

So the file should then contain:
----------------
extend_kopts: biosdevname=0 net.ifnames=1 debug ignore_loglevel log_buf_len=10M print_fatal_signals=1
  LOGLEVEL=8 ethdevice-timeout=60
----------------

After that, apply the changes with the command
$ fuel-bootstrap activate <ID of the active bootstrap>

in my case it's
# fuel-bootstrap list
+--------------------------------------+--------------------------------------+--------+
| uuid | label | status |
+--------------------------------------+--------------------------------------+--------+
| 4314e0cf-4e7c-4986-8bf1-2b7e9058e4a2 | 4314e0cf-4e7c-4986-8bf1-2b7e9058e4a2 | active |
| 6a9e5d3d-dcf4-487d-b876-effdb5a83958 | 6a9e5d3d-dcf4-487d-b876-effdb5a83958 | |
+--------------------------------------+--------------------------------------+--------+

# fuel-bootstrap activate 4314e0cf-4e7c-4986-8bf1-2b7e9058e4a2

Please reboot the node, make sure that the new timeout ethdevice-timeout=60 is in effect, and check whether it helps the node boot.

Regards,

Albert

Revision history for this message
koren (korenlev) wrote :

This is from my updated /var/www/nailgun/bootstraps/active_bootstrap/metadata.yaml:
extend_kopts: biosdevname=0 net.ifnames=1 debug ignore_loglevel log_buf_len=10M print_fatal_signals=1
  LOGLEVEL=8 nomodeset ethdevice-timeout=60

Then I re-activated as suggested: "Bootstrap image b23d1405-cb87-4b35-b19a-baaf7673e494 has been activated."

During boot of the node I saw this message: "Using timeout of 60 seconds for network configuration ...."

then "rebooting automatically due to panic= argument" + sleep 60,
then it was stuck for 5 minutes, and then it reboots forever.

So no help there.

Can't you guys reproduce this with the simple steps I took:
1. Download the 9.0 ISO.
2. Install as usual on a VMware VM, with some basic net configs just like it worked for 8.0 (config_menu stuff).
3. Reboot a node from it.

?

Revision history for this message
koren (korenlev) wrote :

doing an upgrade now :
INFO [alembic.runtime.migration] Running upgrade 675105097a69 -> f2314e5d63c9, Fuel 9.1

Hopefully you'll have this resolved in this update.

regards
koren

Revision history for this message
Dmitry Teselkin (teselkin-d) wrote :

Hello Koren,
How is it going with 9.1? Did it solve the issue?

Revision history for this message
Oleksiy Molchanov (omolchanov) wrote :

Marking as Invalid because of no update for more than a month.

Changed in fuel:
status: Incomplete → Invalid