Many guest VMs across different computes do not receive their virtual-network IP from Contrail DHCP at power-on

Bug #1519124 reported by Sarath
This bug affects 1 person
Affects              Status      Importance   Assigned to     Milestone
Juniper Openstack    (status tracked in Trunk)
  R2.20              Won't Fix   Critical     Unassigned
  R2.21.x            Won't Fix   Critical     Unassigned
  Trunk              New         Medium       Andra Cismaru

Bug Description

For some requirements, such as VM cloning, powering off the guest VMs is the recommended procedure. When powering up ~900 VMs spread across 7 different computes, I see guest VMs not receiving their virtual-network IP from Contrail DHCP.

This bug is most likely not related to Bug #1518137, for the reasons below, so I am opening it separately for tracking:
  >> Bug #1518137 occurs on a single ESX host/compute; when the same tests were tried on a different ESX host/compute the issue was not seen, proving that it is related only to ESX "Maintenance Mode" recovery.
  >> This new bug does not put any ESX host into Maintenance Mode; it only powers hundreds of guest VMs off and on, as required by routines such as cloning. Most likely this new bug is a performance issue in handling DHCP for hundreds of guest VMs.

Topology
##########

3 controllers
8 Esx nodes

### Debugging

root@oblocknode03:/var/log/contrail# cat contrail-vcenter-plugin.log | grep 208-30
2015-11-23 13:00:28,466 - INFO [pool-1-thread-1:VncDB@782] - VifPlug : VN:8d111279-b824-31ce-9920-92a99e95c051, VM:VM-208-30 (5004e0ab-403d-cde8-f6c9-9d794128507f), vrouterIp:172.16.80.113, EsxiHost:10.87.26.208, vlan:1982/1983
2015-11-23 13:00:28,485 - INFO [pool-1-thread-1:VncDB@869] - VRouterAPi Add Port success - interface name: vmi-testvn1-10-VM-208-30(1182229b-02af-4d5e-a220-c0dca705bf18),, VM=VM-208-30, VN=testvn1-10, vmIpAddress=2.0.1.54, vlan=1982/1983
2015-11-23 13:00:28,485 - INFO [pool-1-thread-1:VncDB@891] - VifPlug for VM:VM-208-30 (5004e0ab-403d-cde8-f6c9-9d794128507f) Done
root@oblocknode03:/var/log/contrail#
root@oblocknode03:/var/log/contrail#
root@oblocknode03:/var/log/contrail#
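
To check how long the plugin takes to plug a port under load, the "VifPlug :" and "VifPlug for VM ... Done" lines can be paired per VM. The following is a minimal Python sketch that assumes the log line format shown in the excerpt above; the function name and regular expressions are illustrative and are not part of Contrail.

```python
import re
from datetime import datetime

# Timestamp prefix as it appears in contrail-vcenter-plugin.log above.
TS = r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})"
START = re.compile(TS + r" .*VifPlug : VN:\S+ VM:(\S+) ")
DONE = re.compile(TS + r" .*VifPlug for VM:(\S+) \(.*\) Done")

# Two lines copied verbatim from the excerpt above.
SAMPLE = [
    "2015-11-23 13:00:28,466 - INFO [pool-1-thread-1:VncDB@782] - VifPlug : VN:8d111279-b824-31ce-9920-92a99e95c051, VM:VM-208-30 (5004e0ab-403d-cde8-f6c9-9d794128507f), vrouterIp:172.16.80.113, EsxiHost:10.87.26.208, vlan:1982/1983",
    "2015-11-23 13:00:28,485 - INFO [pool-1-thread-1:VncDB@891] - VifPlug for VM:VM-208-30 (5004e0ab-403d-cde8-f6c9-9d794128507f) Done",
]


def parse_ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS,mmm' timestamp."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S,%f")


def vifplug_durations(lines):
    """Return {vm_name: seconds between VifPlug start and Done}."""
    started, durations = {}, {}
    for line in lines:
        m = START.search(line)
        if m:
            started[m.group(2)] = parse_ts(m.group(1))
            continue
        m = DONE.search(line)
        if m and m.group(2) in started:
            delta = parse_ts(m.group(1)) - started.pop(m.group(2))
            durations[m.group(2)] = delta.total_seconds()
    return durations


if __name__ == "__main__":
    for vm, secs in vifplug_durations(SAMPLE).items():
        print(vm, secs)
```

For VM-208-30 in the excerpt, the two timestamps are 19 ms apart. Running the same pairing over the full log would show whether that interval grows when hundreds of VMs are powered on together.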

### Version

root@oblocknode02:/var#
root@oblocknode02:/var#
root@oblocknode02:/var# contrail-version
Package Version Build-ID | Repo | Package Name
-------------------------------------- ------------------------------ ----------------------------------
contrail-analytics 2.21-103 103
contrail-config 2.21-103 103
contrail-control 2.21-103 103
contrail-dns 2.21-103 103
contrail-f5 2.21-103 103
contrail-fabric-utils 2.21-103 103
contrail-install-packages 2.21-103~vcenter 103
contrail-install-vcenter-plugin 2.21-103 103
contrail-lib 2.21-103 103
contrail-nodemgr 2.21-103 103
contrail-openstack-analytics 2.21-103 103
contrail-openstack-control 2.21-103 103
contrail-openstack-database 2.21-103 103
contrail-openstack-webui 2.21-103 103
contrail-setup 2.21-103 103
contrail-utils 2.21-103 103
contrail-vmware-config 2.21-103 103
contrail-vmware-utils 2.21-103 103
contrail-web-controller 2.21-103 103
contrail-web-core 2.21-103 103
ifmap-python-client 0.1-2 103
ifmap-server 0.3.2-1contrail1 103
python-contrail 2.21-103 103
root@oblocknode02:/var#
root@oblocknode02:/var#

Tags: vmware
Sarath (nsarath) wrote :

-bash-4.1$
-bash-4.1$ hostname
centos-build02
-bash-4.1$
-bash-4.1$ pwd
/users/nsarath/PR/Bug-1519124
-bash-4.1$
-bash-4.1$ ls -l
total 2661268
-rwxrwxrwx 1 nsarath test 1072435200 Nov 23 13:32 Ctrl-A-log.tar*
-rwxrwxrwx 1 nsarath test 982466560 Nov 23 13:32 Ctrl-B-log.tar*
-rwxrwxrwx 1 nsarath test 433827840 Nov 23 13:31 Ctrl-C-log.tar*
-rwxrwxrwx 1 nsarath test 28579840 Nov 23 13:41 Vrtr-0-log.tar*
-rwxrwxrwx 1 nsarath test 27893760 Nov 23 13:31 Vrtr-1-log.tar*
-rwxrwxrwx 1 nsarath test 28026880 Nov 23 13:31 Vrtr-2-log.tar*
-rwxrwxrwx 1 nsarath test 25937920 Nov 23 13:31 Vrtr-3-log.tar*
-rwxrwxrwx 1 nsarath test 28692480 Nov 23 13:31 Vrtr-4-log.tar*
-rwxrwxrwx 1 nsarath test 28856320 Nov 23 13:31 Vrtr-5-log.tar*
-rwxrwxrwx 1 nsarath test 29194240 Nov 23 13:31 Vrtr-7-log.tar*
-rwxrwxrwx 1 nsarath test 28477440 Nov 23 13:31 Vrtr-8-log.tar*
-bash-4.1$

Sarath (nsarath) wrote :

Debugged this problem today with Development (Sachin). The follow-up discussion points to note are:
   1) Both Bug #1518137 and Bug #1519124 possibly narrow down to the same root cause/theory: an inconsistent timing issue that depends on the sequence in which the DHCP request, the VM power-on event, and the plugin's VM events are processed. It could affect only guest VMs whose maximum DHCP timeout is less than 10 seconds.
   2) A possible Contrail fix is to slightly lower the scan timer. Sachin prefers to put this only into the 3.0 release so that it gets more testing, so for customer releases 2.21 and 2.22 we will release-note this bug for customer awareness.
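
The timing theory in the two points above can be illustrated with a toy model. This sketch assumes, purely for illustration, that the plugin picks up a power-on at its next periodic scan and that a DHCP request is answered only if it is sent once the port is ready. The retry schedule, the scan periods, and all function names are hypothetical and are not taken from Contrail or from any specific DHCP client.

```python
import math


def next_scan_tick(power_on_at, scan_period):
    """Time of the first periodic scan at or after power-on (hypothetical model)."""
    return math.ceil(power_on_at / scan_period) * scan_period


def gets_lease(port_ready_at, attempt_offsets, power_on_at=0.0):
    """True if any DHCP attempt is sent at or after the moment the port is ready."""
    return any(power_on_at + t >= port_ready_at for t in attempt_offsets)


# Hypothetical guest: tries DHCP at 0, 3 and 6 seconds after power-on,
# then gives up (a budget of under 10 seconds).
ATTEMPTS = [0.0, 3.0, 6.0]

if __name__ == "__main__":
    for scan_period in (10.0, 4.0):
        ready = next_scan_tick(power_on_at=1.0, scan_period=scan_period)
        print(scan_period, ready, gets_lease(ready, ATTEMPTS, power_on_at=1.0))
```

In this model, a VM powered on at t=1 s misses all three attempts when the scan period is 10 s, but its second attempt succeeds when the period is 4 s. This is the sense in which a shorter scan timer, or a longer guest DHCP timeout, would narrow the window.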

Release-Notes
#################
A cirros VM, or any flavor of VM with a default maximum DHCP timeout of 9 seconds from power-on, may not get its IP address from Contrail DHCP during power-on. The workaround is either to increase the DHCP maximum timeout in the guest configuration or to request a new DHCP lease after power-on.
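
The second workaround (requesting DHCP again after power-on) could be scripted inside the guest. The following is a rough sketch, not a tested procedure. It assumes a Linux guest with `dhclient` and `ip` available and an interface named `eth0`; a BusyBox-based guest such as cirros would need its own DHCP client command instead. The helper names are made up for this example.

```python
import subprocess
import time


def renew_until_addressed(renew, has_address, attempts=5, delay=2.0, sleep=time.sleep):
    """Call renew() until has_address() is true.

    Returns the number of renewals that were needed, or -1 if the
    interface still has no address after `attempts` renewals.
    """
    for made in range(attempts + 1):
        if has_address():
            return made
        if made < attempts:
            renew()
            sleep(delay)
    return -1


def dhclient_renew(iface="eth0"):
    # Assumption: the guest uses ISC dhclient and the NIC is eth0.
    subprocess.run(["dhclient", iface], check=False)


def iface_has_ipv4(iface="eth0"):
    # Assumption: iproute2 is installed; any output means an IPv4 address is set.
    out = subprocess.run(
        ["ip", "-4", "-o", "addr", "show", "dev", iface],
        capture_output=True, text=True, check=False,
    ).stdout
    return out.strip() != ""


if __name__ == "__main__":
    print(renew_until_addressed(dhclient_renew, iface_has_ipv4))
```

The retry loop takes the renew and check steps as parameters so that it can be exercised without touching a real interface, and so that a different DHCP client can be swapped in.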

no longer affects: juniperopenstack/r3.0
Changed in juniperopenstack:
milestone: r3.0-fcs → none
Nischal Sheth (nsheth)
Changed in juniperopenstack:
status: New → Won't Fix