Juniper Openstack

SM: servers reimage in a loop with ESX ISO when a whole cluster is reimaged

Bug #1461791 reported by Bharat Kumar on 2015-06-04

This bug affects 1 person

	Status	Importance	Assigned to
Juniper Openstack	Status tracked in Trunk
R2.20	Won't Fix	Medium	prasad miriyala
Trunk	New	High	prasad miriyala

Bug Description

When the servers in a cluster are reimaged with an ESX ISO image, the servers in that cluster reimage in a loop. Only one server in the cluster is reimaged successfully; all the other servers in that cluster keep reimaging in a loop.

For the looping servers, netboot is always enabled in the cobbler profile, and the "reimage successful" server status mail is sent every time one of the unsuccessful servers is reimaged.

Tested with R2.20 Build #40

Logs:
=====
root@nodec32:~/images# cobbler system report --name=nodec50
Name : nodec50
TFTP Boot Files : {}
Comment :
Enable gPXE? : 0
Fetchable Files : {}
Gateway :
Hostname : nodec50
Image :
IPv6 Autoconfiguration : False
IPv6 Default Device :
Kernel Options : {'system_name': 'nodec50', 'system_domain': 'englab.juniper.net', 'ip_address': '10.204.221.3', 'server': '10.204.217.17'}
Kernel Options (Post Install) : {}
Kickstart : <<inherit>>
Kickstart Metadata : {'system_name': 'nodec50', 'esx_nicname': 'vmnic0', 'device_cfg': 'http://10.204.217.17/contrail/config_file/nodec50.sh', 'system_domain': 'englab.juniper.net', 'passwd': '$1$ueJTahJl$erdQZWKkNuli3Mks9rpRD.', 'partition': '/dev/sd?', 'server_license': '', 'ip_address': '10.204.221.3'}
LDAP Enabled : False
LDAP Management Type : authconfig
Management Classes : <<inherit>>
Management Parameters : <<inherit>>
Monit Enabled : False
Name Servers : []
Name Servers Search Path : []
Netboot Enabled : True <<<<<<<<<<<<<<<<< It's always true
Owners : ['admin']
Power Management Address : 10.207.25.144
Power Management ID :
Power Management Password : ADMIN
Power Management Type : ipmilan
Power Management Username : ADMIN
Profile : esx
Proxy : <<inherit>>
Red Hat Management Key : <<inherit>>
Red Hat Management Server : <<inherit>>
Repos Enabled : False
Server Override : <<inherit>>
Status : production
Template Files : {}
Virt Auto Boot : <<inherit>>
Virt CPUs : <<inherit>>
Virt Disk Driver Type : <<inherit>>
Virt File Size(GB) : <<inherit>>
Virt Path : <<inherit>>
Virt PXE Boot : 0
Virt RAM (MB) : <<inherit>>
Virt Type : <<inherit>>
Interface ===== : eth0
Bonding Opts :
Bridge Opts :
CNAMES : []
DHCP Tag :
DNS Name : nodec50.englab.juniper.net
Per-Interface Gateway :
Master Interface :
Interface Type :
IP Address : 10.204.221.3
IPv6 Address :
IPv6 Default Gateway :
IPv6 MTU :
IPv6 Prefix :
IPv6 Secondaries : []
IPv6 Static Routes : []
MAC Address : 00:25:90:C4:83:90
Management Interface : False
MTU :
Subnet Mask :
Static : False
Static Routes : []
Virt Bridge :

root@nodec32:~/images# vim /var/lib/tftpboot/pxelinux.cfg/*90
root@nodec32:~/images# cat /var/lib/tftpboot/pxelinux.cfg/*90
default linux
prompt 0
timeout 1
label linux
kernel /images/esx/mboot.c32
ipappend 2
append -c /images/esx/cobbler-boot.cfg
root@nodec32:~/images#
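
To check several systems at once, the `Netboot Enabled` field can be pulled out of the report text. The following is a minimal sketch, assuming the `key : value` layout shown in the report above. `parse_report` and `netboot_enabled` are hypothetical helper names, and the sample text is trimmed from the log.

```python
def parse_report(text):
    """Parse 'key : value' lines of a cobbler system report into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(" : ")
        if sep:  # skip lines without a ' : ' separator (e.g. empty values)
            fields[key.strip()] = value.strip()
    return fields


def netboot_enabled(text):
    """Return True if the report says netboot is still enabled."""
    return parse_report(text).get("Netboot Enabled") == "True"


# Sample trimmed from the report in this bug (without the reporter's annotation).
SAMPLE = """\
Name : nodec50
Hostname : nodec50
Netboot Enabled : True
Profile : esx
"""

print(netboot_enabled(SAMPLE))
```

A server whose installation has finished but which still reports netboot as enabled is a candidate for the reimage loop described in this bug.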

Ashish Ranjan (aranjan-n) on 2015-06-05

information type: Private → Public

Bharat Kumar (pbharat) wrote on 2015-06-05:

If the servers are reimaged one at a time, each one started only after the previous reimage has completed, the reimage succeeds. The issue is seen only when a whole cluster is reimaged at once.
The same issue is also seen with R2.1.

prasad miriyala (pmiriyala) wrote on 2015-06-09:

This is a cobbler issue that occurs when we issue multiple reimages for ESXi. Cobbler picks the hostname of the last issued reimage and sends the post-installation triggers for that host. As part of the post-installation trigger, cobbler turns off the Netboot flag.
Because of this, the Netboot flag is turned off only for the last target and stays on for all the others. Every target except the last one gets into a reimage loop.

Workaround:
Issue the ESXi reimages one after the other, in sequence.
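
The failure mode and the workaround can be illustrated with a toy model. This is a sketch of the behaviour as described in this comment, not cobbler's actual code: `CobblerModel` and its methods are hypothetical names, and the hostnames `nodec51` and `nodec52` are made up for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CobblerModel:
    """Toy model: one shared 'last issued hostname' slot for ESXi reimages."""
    netboot: Dict[str, bool] = field(default_factory=dict)  # hostname -> flag
    last_issued: Optional[str] = None  # overwritten by every new reimage

    def issue_reimage(self, hostname: str) -> None:
        self.netboot[hostname] = True
        self.last_issued = hostname

    def install_finished(self, hostname: str) -> None:
        # The post-install trigger acts on the last issued hostname,
        # not on the server that actually finished installing.
        if self.last_issued is not None:
            self.netboot[self.last_issued] = False

    def looping(self) -> List[str]:
        """Servers whose netboot flag is still on, so they PXE-boot again."""
        return sorted(h for h, on in self.netboot.items() if on)


def reimage_cluster(hosts: List[str]) -> List[str]:
    """Issue all reimages first, then let the installs finish."""
    model = CobblerModel()
    for host in hosts:
        model.issue_reimage(host)
    for host in hosts:
        model.install_finished(host)
    return model.looping()


def reimage_in_sequence(hosts: List[str]) -> List[str]:
    """Workaround: issue each reimage only after the previous one finished."""
    model = CobblerModel()
    for host in hosts:
        model.issue_reimage(host)
        model.install_finished(host)
    return model.looping()


hosts = ["nodec50", "nodec51", "nodec52"]
print(reimage_cluster(hosts))      # every host except the last one loops
print(reimage_in_sequence(hosts))  # no host loops
```

In the model, only the last issued host ever has its flag cleared in the cluster case, which matches the observation that exactly one server reimages successfully.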

OpenContrail Admin (ci-admin-f) on 2015-06-10

tags: added: quench

prasad miriyala (pmiriyala) wrote on 2015-06-11:

When we create an image, SM creates a distro and a profile in cobbler corresponding to the image id. The profile is associated with the distro.
A reimage creates a system and associates it with the profile. The profile corresponds to the distro.
Distro->Profile->System 1
Distro->Profile->System n
Kernel data, kickstart metadata, etc. are present in the distro, the profile and the system. Data is taken from the lowest level first and, if it is not found there, from the higher levels.
System configuration should take precedence for kernel metadata or any other kernel options.
Typically the system configuration contains specifics about that system, e.g. system name, IP address, etc. It looks like ESXi works only with profile data, not with system data.
We update the profile data on each reimage to satisfy the above hack. Because of that, only one reimage works at a time.
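
The lookup order and the effect of the profile hack can be sketched with plain dictionaries. This is an illustrative model only, not SM or cobbler code: the function names are hypothetical, the second system (`nodec51`, `10.204.221.4`) is invented for the example, and the dictionary contents are a small subset borrowed from the logs in this bug.

```python
from collections import ChainMap


def resolve(key, system, profile, distro):
    """Expected lookup: system first, then profile, then distro."""
    return ChainMap(system, profile, distro)[key]


def resolve_profile_only(key, profile, distro):
    """Lookup that ignores system data, as ESXi appears to do."""
    return ChainMap(profile, distro)[key]


distro = {"kernel": "/images/esx/mboot.c32"}
profile = {"esx_nicname": "vmnic0"}
systems = {
    "nodec50": {"system_name": "nodec50", "ip_address": "10.204.221.3"},
    # Hypothetical second server sharing the same profile.
    "nodec51": {"system_name": "nodec51", "ip_address": "10.204.221.4"},
}

# With system-level precedence, every server resolves its own values.
own = {name: resolve("ip_address", data, profile, distro)
       for name, data in systems.items()}

# The hack: copy each system's data into the shared profile on reimage.
for data in systems.values():
    profile.update(data)

# A profile-only lookup now sees the last written values for every server.
shared = resolve_profile_only("system_name", profile, distro)
print(own)
print(shared)
```

Because both systems point at one profile, the second `update` overwrites the first, which is why only one reimage per profile can carry the right values at a time.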

A workaround:
Create multiple images, say esx5.5-s1, esx5.5-s2, … esx5.5-sn; this is a one-time job.
To reimage servers s1 to sn,
reimage s1 with esx5.5-s1, s2 with esx5.5-s2, and so on.
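
The pairing of servers and images in this workaround can be generated rather than written by hand. A minimal sketch, assuming the `<image>-s<index>` naming used above; `per_server_images` is a hypothetical helper, and `nodec51` is a made-up hostname. How each image is then registered and each reimage issued is left to the existing SM commands.

```python
def per_server_images(base_image, servers):
    """Map each server to its own image id: <base_image>-s1, -s2, ..."""
    return {server: f"{base_image}-s{index}"
            for index, server in enumerate(servers, start=1)}


plan = per_server_images("esx5.5", ["nodec50", "nodec51"])
for server, image in plan.items():
    print(f"reimage {server} with {image}")
```

Since every server gets its own image id, and therefore its own distro and profile, concurrent reimages no longer write to a shared profile.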

Sudheendra Rao (sudheendra-k) on 2015-09-10

tags: added: releasenote

OpenContrail Admin (ci-admin-f) wrote on 2016-04-19: [Bug update]

bug update...

Nagabhushana R (bhushana) on 2016-08-03

no longer affects:	juniperopenstack/r3.0
no longer affects:	juniperopenstack/r3.1

Nagabhushana R (bhushana) on 2016-08-03

tags: removed: releasenote
