mirantis 9.0 fuel PXE boot fails
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
Fuel for OpenStack | Invalid | High | koren | |
Mitaka | Invalid | High | koren | |
Bug Description
Fuel 9.0, latest ISO, downloaded Aug 7 2016.
A slave node booting from PXE, with 16 GB RAM and 2 vCPUs, gets stuck at the "random: nonblocking pool is initialized" step during boot and then reboots constantly. Reproduced several times on a vSphere setup.
The issue is not seen with the same setup and the same resources using Fuel 8.0. As a result, we can't initialize an environment.
koren (korenlev) wrote : | #1 |
no longer affects: | mos |
tags: | added: area-library |
Dmitry Pyzhov (dpyzhov) wrote : | #2 |
Please attach diagnostic snapshot.
no longer affects: | fuel/newton |
Dmitry Klenov (dklenov) wrote : | #3 |
Closing as requested data were not provided within a month. @koren, feel free to attach snapshot and reopen the bug.
Changed in fuel: | |
status: | Incomplete → Invalid |
koren (korenlev) wrote : | #4 |
- fuel-snapshot-2016-09-21_11-23-00.tgz (48.7 MiB, application/x-tar)
snapshot attached (from our broken mirantis 9.0 setup)
koren (korenlev) wrote : | #5 |
Can the bug be re-opened?
Changed in fuel: | |
status: | Invalid → In Progress |
status: | In Progress → New |
Changed in fuel: | |
status: | New → Confirmed |
Alexander Gordeev (a-gordeev) wrote : | #6 |
Hello @koren,
it might be related to the missing 'nomodeset' option: http://
Could you give it a try?
You need to append this option to the `extend_kopts` string in /etc/fuel-
$ fuel-bootstrap build --activate
Alexander Gordeev (a-gordeev) wrote : | #7 |
It looks like the kernel cmdline options for the bootstrap image and the IBP images are badly misaligned:
every release passes the `nomodeset` option to the kernel:
1) https:/
2) https:/
3) https:/
but there is no `nomodeset` for the bootstrap image.
I'm not absolutely sure that `nomodeset` will resolve the issue. So somebody should probably fix the bootstrap config file under 'Partial-Bug' or 'Related-Bug'.
koren (korenlev) wrote : Re: [Bug 1610967] Re: mirantis 9.0 fuel PXE boot fails | #8 |
Hi.
Did what you asked:
1. /etc/fuel-
extend_kopts: "biosdevname=0 net.ifnames=1 debug ignore_loglevel
log_buf_len=10M print_fatal_
2. [root@fuel3 fuel-bootstrap-
Try to build image with data:
bootstrap:
certs: null
container: {format: tar.gz, meta_file: metadata.yaml}
extend_kopts: biosdevname=0 net.ifnames=1 debug ignore_loglevel
log_buf_len=10M
print_
extra_files: [/usr/share/
label: b23d1405-
modules:
- {mask: kernel, name: kernel, uri: '
http://
'}
- {compress_format: xz, mask: initrd, name: initrd, uri: '
http://
'}
- &id001 {compress_format: xz, container: raw, format: ext4, mask:
rootfs, name: rootfs,
uri: '
http://
'}
post_script_file: null
root_
uuid: b23d1405-
codename: trusty
hashed_
$6$fEpl6cX2zq.
image_data:
/: *id001
output: /tmp/b23d1405-
packages: [squashfs-tools, nailgun-agent, nailgun-mcagents, live-boot,
openssh-server,
i40e-dkms, vim, fuel-agent, mcollective, xz-utils, msmtp-mta, wget,
hpsa-dkms, multipath-
multipath-tools, hwloc, openssh-client, mc, ntp, linux-firmware,
linux-firmware-
network-checker, live-boot-
linux-image-
ubuntu-minimal]
proxies: {}
repos:
- {name: ubuntu, priority: null, section: main universe multiverse, suite:
trusty,
type: deb, uri: 'http://
- {name: ubuntu-updates, priority: null, section: main universe multiverse,
suite: trusty-updates,
type: deb, uri: 'http://
- {name: ubuntu-security, priority: null, section: main universe
multiverse, suite: trusty-security,
type: deb, uri: 'http://
- {name: mos, priority: 1050, section: main restricted, suite: mos9.0,
type: deb,
uri: 'http://
- {name: mos-updates, priority: 1050, section: main restricted, suite:
mos9.0-updates,
type: deb, uri: 'http://
- {name: mos-security, priority: 1050, section: main restricted, suite:
mos9.0-security,
type: deb, uri: 'http://
- {name: mos-holdback, priority: 1100, section: main restricted, suite:
mos9.0-holdback,
type: deb, uri: 'http://
root_password: null
Build process is in progress. Usually it takes 15-20 minutes. It depends on
your internet connection and hardware performance.
--- Building bootstrap image (do_mkbootstrap) ---
*** Preparing image space ***
Installing BASE operating system into image
Starting new HTTP conn...
Alexander Gordeev (a-gordeev) wrote : | #9 |
Assigning to MOS-Linux team, since it's tricky to debug and `nomodeset` didn't help. Guys, please take a look at the issue.
Changed in fuel: | |
assignee: | Fuel Sustaining (fuel-sustaining-team) → MOS Linux (mos-linux) |
Dmitry Teselkin (teselkin-d) wrote : | #10 |
Please confirm that 'nomodeset' was added to the cobbler profile. What does the command below show?
cobbler profile report --name ubuntu_bootstrap | grep 'Kernel Options'
I saw similar issues with KVM and found that 'nomodeset' was absent. After adding it manually via 'cobbler profile edit ...' the issue went away. Here is the output of the command above on my system:
---
[root@nailgun ~]# cobbler profile report --name ubuntu_bootstrap | grep 'Kernel Options'
Kernel Options : {'nomodeset': '~', 'console': ['ttyS0,9600', 'tty0'], 'LOGLEVEL': '8', 'ip': 'frommedia', 'boot': 'live', 'mco_user': 'mcollective', 'toram': '~', 'ignore_loglevel': '~', 'biosdevname': '0', 'url': 'http://
Kernel Options (Post Install) : {}
---
Anyway, I had to restart 6 VMs simultaneously to reproduce the failure; it never occurred when I restarted nodes one by one.
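The check requested in this comment can also be scripted. The following is a minimal sketch, assuming the value after "Kernel Options :" is printed as a Python dict literal, as it appears in the output above. The helper names and the sample line are hypothetical; the sample is shortened and is not real output from either system.

```python
import ast


def parse_kernel_options(report_line):
    """Parse one 'Kernel Options : {...}' line from `cobbler profile report`."""
    label, _, value = report_line.partition(":")
    if label.strip() != "Kernel Options":
        raise ValueError("not a 'Kernel Options' line: %r" % report_line)
    # Assumption: the value is a Python dict literal, so literal_eval can read it.
    return ast.literal_eval(value.strip())


def missing_options(options, required):
    """Return the names from `required` that are absent from the parsed options."""
    return [name for name in required if name not in options]


# Hypothetical, shortened sample line for illustration only.
sample = (
    "Kernel Options : {'nomodeset': '~', 'console': ['ttyS0,9600', 'tty0'], "
    "'biosdevname': '0', 'panic': '60'}"
)
opts = parse_kernel_options(sample)
print(missing_options(opts, ["nomodeset", "net.ifnames"]))
```

Only the line grepped by the command above is handled; the "Kernel Options (Post Install)" line is rejected on purpose so that it is not mistaken for the boot options.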
koren (korenlev) wrote : | #11 |
Output from our setup:
[root@fuel3 ~]# cobbler profile report --name ubuntu_bootstrap | grep 'Kernel Options'
Kernel Options : {'nomodeset': '~', 'console':
['ttyS0,9600', 'tty0'], 'LOGLEVEL': '8', 'ip': 'frommedia', 'boot': 'live',
'net.ifnames': '1', 'toram': '~', 'ignore_loglevel': '~', 'biosdevname':
'0', 'url': 'http://
'~', 'mco_pass': 'UjQL1L9u8lke5d
'mco_user': 'mcollective', 'panic': '60', 'fetch': '
http://
'ethdevice-
Kernel Options (Post Install) : {}
[root@fuel3 ~]#
Looks good? If yes, still no node is able to boot...
Albert Syriy (asyriy) wrote : | #12 |
Hello koren,
Could you attach your bootstrap files (initrd.img and root.squashfs) from the active bootstrap folder on the fuel-master node?
Here is the folder:
/var/www/
Could you also check which kernel version the MOS-8.0 bootstrap uses?
Albert Syriy (asyriy) wrote : | #13 |
And another question,
How many nodes do you have? (Is the issue reproducible on one node?)
Albert Syriy (asyriy) wrote : | #14 |
Hello Koren,
I am still looking forward to your initrd.img file. Please attach it to the bug.
I built the initrd-new.img, which outputs the debug messages to console during the boot.
Please copy the initrd-new.img into the folder
/var/www/
rename it to initrd.img and run the following command (to update the initrd.img in the cache):
"fuel-bootstrap activate <id-of-
In my case (based on the output of the "fuel-bootstrap list" command below):
$ fuel-bootstrap list
+------
| uuid | label | status |
+------
| 4314e0cf-
+------
the command should be:
fuel-bootstrap activate 4314e0cf-
In your case the ID could be different.
After that, please add the "debug" key to the kernel parameters passed on boot and reboot one node.
The output will show what is going on during the init script.
Please attach (to the bug) a screenshot showing where the node gets stuck before it reboots on timeout.
Looking forward to your update,
Albert
Albert Syriy (asyriy) wrote : | #15 |
Changed in fuel: | |
status: | Confirmed → Incomplete |
assignee: | MOS Linux (mos-linux) → koren (korenlev) |
koren (korenlev) wrote : | #16 |
- failed-boot-node.png (205.4 KiB, image/png; name="failed-boot-node.png")
Hi, here is the screenshot from a node which fails to boot from Fuel (using the initrd.img I sent).
I am now trying to activate your new image and will update soon.
koren (korenlev) wrote : | #17 |
- another-img-fails-boot.png (89.7 KiB, image/png; name="another-img-fails-boot.png")
I used your (new) img file: copied it, renamed it, moved it to replace the previous one, then ran 'activate' as suggested.
Attached is the screenshot from another failed boot, using this new image.
koren (korenlev) wrote : | #18 |
- another-img-fails-boot.png (89.7 KiB, image/png; name="another-img-fails-boot.png")
Albert Syriy (asyriy) wrote : | #19 |
Hello koren,
Thank you for providing the screenshots.
According to the log, the boot process gets stuck while initializing the network interface ens160 (the ipconfig command), with a timeout of 120 seconds.
See the code for details:
https:/
Is it possible that the node's reboot timeout is shorter than the network card timeout (120 sec)?
Could you decrease the network card timeout from 120 sec to, let's say, 60?
To do that, please pass the kernel parameter
ethdevice-
instead of
ethdevice-
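To see which values a node was actually booted with, the kernel command line can be inspected. The sketch below is only an illustration: the function names are hypothetical, the sample command line is made up, the parameter name `ethdevice-timeout` and the 120-second default are taken from the comments in this thread, and quoted values containing spaces are not handled.

```python
def parse_cmdline(cmdline):
    """Split a kernel command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def boot_timeouts(cmdline, default_net_timeout=120):
    """Return (network timeout, panic timeout) in seconds.

    The panic timeout is None when no panic= parameter is present.
    """
    params = parse_cmdline(cmdline)
    # Assumption: 120 s applies when ethdevice-timeout is not passed at all.
    net = int(params.get("ethdevice-timeout") or default_net_timeout)
    panic = params.get("panic")
    return net, (int(panic) if panic else None)


# Hypothetical sample; on a booted node the string would come from /proc/cmdline.
sample = "boot=live toram nomodeset debug panic=60 ethdevice-timeout=120"
print(boot_timeouts(sample))
```

Note that `panic=` only sets the delay before a reboot after a kernel panic, so the two numbers are reported side by side rather than compared automatically.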
Albert Syriy (asyriy) wrote : | #20 |
BTW,
compared to MOS-8.0, do you have the same value for ethdevice-timeout?
Albert Syriy (asyriy) wrote : | #21 |
Hello Koren,
Could you decrease the timeout for initializing the network card?
Please edit the /var/www/
and add to the kernel parameters:
-----------------
extend_kopts: biosdevname=0 net.ifnames=1 debug ignore_loglevel log_buf_len=10M print_fatal_
  LOGLEVEL=8
----------------
the parameter 'ethdevice-
So the file should contain:
----------------
extend_kopts: biosdevname=0 net.ifnames=1 debug ignore_loglevel log_buf_len=10M print_fatal_
  LOGLEVEL=8 ethdevice-
----------------
After that, to apply the changes, run the command
$ fuel-bootstrap activate <ID of the active bootstrap>
in my case it's
# fuel-bootstrap list
+------
| uuid | label | status |
+------
| 4314e0cf-
| 6a9e5d3d-
+------
# fuel-bootstrap activate 4314e0cf-
Please reboot the node and make sure that you have the new timeout ethdevice-
Regards,
Albert
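When editing the `extend_kopts` string by hand, it is easy to end up with the same option twice. Here is a minimal sketch of an idempotent update on the option string alone; `set_kopt` is a hypothetical helper and not part of fuel-bootstrap, and the base string is a shortened example rather than the full value from the file.

```python
def set_kopt(kopts, name, value=None):
    """Return kopts with name[=value] present exactly once, appended at the end."""
    new_token = name if value is None else "%s=%s" % (name, value)
    # Drop any earlier occurrence of the same option, whatever its value was.
    kept = [tok for tok in kopts.split() if tok.partition("=")[0] != name]
    return " ".join(kept + [new_token])


# Shortened example value; the real string in the file is longer.
base = "biosdevname=0 net.ifnames=1 debug ignore_loglevel log_buf_len=10M LOGLEVEL=8"
once = set_kopt(base, "ethdevice-timeout", 60)
twice = set_kopt(once, "ethdevice-timeout", 60)
print(once)
```

Applying the helper a second time leaves the string unchanged, and calling it with a different value replaces the old one instead of adding a conflicting duplicate.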
koren (korenlev) wrote : | #22 |
this is from my updated /var/www/
extend_kopts: biosdevname=0 net.ifnames=1 debug ignore_loglevel log_buf_len=10M print_fatal_
LOGLEVEL=8 nomodeset ethdevice-
Then re-activated as suggested: "Bootstrap image b23d1405-
During boot of the node I saw this message: echo "Using timeout of 60 seconds for network configuration ...."
then "rebooting automatically due to panic= argument" + sleep 60,
then it is stuck for 5 minutes, then it reboots forever.
So no help there.
Can't you reproduce this with the simple steps I took?
1. Download the 9.0 ISO.
2. Install as usual on a VMware VM, with some basic network configuration just like it worked for 8.0 (config_menu stuff).
3. Reboot a node from it.
koren (korenlev) wrote : | #23 |
Doing an upgrade now:
INFO [alembic.
Hopefully you'll have this resolved in this update.
Regards,
koren
Dmitry Teselkin (teselkin-d) wrote : | #24 |
Hello Koren,
How is it going with 9.1? Did it solve the issue?
Oleksiy Molchanov (omolchanov) wrote : | #25 |
Marking as Invalid because of no update for more than a month.
Changed in fuel: | |
status: | Incomplete → Invalid |
from /var/log/cobbler/tasks - 2016-08-08_102920_sync.log :
Mon Aug 8 10:29:21 2016 - INFO | running pre-sync triggers cobbler/ images/ ubuntu_ bootstrap cobbler/ images/ ubuntu_ 1404_x86_ 64 cobbler/ images/ centos- x86_64 tftpboot/ pxelinux. cfg/default tftpboot/ grub/images tftpboot/ grub/efidefault tftpboot/ images/ ubuntu_ bootstrap tftpboot/ images/ ubuntu_ 1404_x86_ 64 tftpboot/ images/ centos- x86_64 tftpboot/ s390x/profile_ list syslinux/ pxelinux. 0 -> /var/lib/ tftpboot/ pxelinux. 0 syslinux/ menu.c32 -> /var/lib/ tftpboot/ menu.c32 syslinux/ memdisk -> /var/lib/ tftpboot/ memdisk nailgun/ centos/ x86_64/ isolinux/ vmlinuz -> /var/lib/ tftpboot/ images/ centos- x86_64/ vmlinuz nailgun/ centos/ x86_64/ isolinux/ initrd. img -> /var/lib/ tftpboot/ images/ centos- x86_64/ initrd. img nailgun/ bootstraps/ active_ bootstrap/ vmlinuz -> /var/lib/ tftpboot/ images/ ubuntu_ bootstrap/ vmlinuz nailgun/ bootstraps/ active_ bootstrap/ initrd. img -> /var/lib/ tftpboot/ images/ ubuntu_ bootstrap/ initrd. img nailgun/ ubuntu/ x86_64/ images/ linux d3255bfef956018 90afd80709 /var/www/ nailgun/ ubuntu/ x86_64/ images/ linux
Mon Aug 8 10:29:21 2016 - INFO | cleaning trees
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/www/
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/www/
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/www/
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/
Mon Aug 8 10:29:21 2016 - INFO | removing: /var/lib/
Mon Aug 8 10:29:21 2016 - INFO | copying bootloaders
Mon Aug 8 10:29:21 2016 - INFO | copying: /usr/share/
Mon Aug 8 10:29:21 2016 - INFO | copying: /usr/share/
Mon Aug 8 10:29:21 2016 - INFO | copying: /usr/share/
Mon Aug 8 10:29:21 2016 - INFO | copying distros to tftpboot
Mon Aug 8 10:29:21 2016 - INFO | copying files for distro: centos-x86_64
Mon Aug 8 10:29:21 2016 - INFO | trying hardlink /var/www/
Mon Aug 8 10:29:21 2016 - INFO | trying hardlink /var/www/
Mon Aug 8 10:29:21 2016 - INFO | copying files for distro: ubuntu_bootstrap
Mon Aug 8 10:29:21 2016 - INFO | trying hardlink /var/www/
Mon Aug 8 10:29:21 2016 - INFO | trying hardlink /var/www/
Mon Aug 8 10:29:21 2016 - INFO | copying files for distro: ubuntu_1404_x86_64
Mon Aug 8 10:29:21 2016 - INFO | running: /usr/bin/sha1sum /var/www/
Mon Aug 8 10:29:21 2016 - INFO | received on stdout: da39a3ee5e6b4b0
Mon Aug 8 10:29:21 2016 - DEBUG | received on stderr: nailgun/ ubuntu/ x86_64/ images/ linux -> /var/lib/ tftpboot/ images/ .link_cache/ da39a3ee5e6b4b0 d3255bfef956018 90afd80709 -> /var/lib/ tftpboot/ images/ ubuntu_ 1404_x86_ 64/linux nailgun/ ubuntu/ x86_64/ images/ initrd. gz d3255bfef956018 90afd80709 /var/www/ nailgun/ ubuntu/ x86_64/ images/ initrd. gz
Mon Aug 8 10:29:21 2016 - DEBUG | trying cachelink /var/www/
Mon Aug 8 10:29:21 2016 - INFO | running: /usr/bin/sha1sum /var/www/
Mon Aug 8 10:29:21 2016 - INFO | received on stdout: da39a3ee5e6b4b0
Mon Aug 8 10:29:21 2016 - DEBUG | received on stderr: nailgun/ ubuntu/ x86_64/ images/ in...
Mon Aug 8 10:29:21 2016 - DEBUG | trying cachelink /var/www/