SM:R3.2 Build 6: Provision stops at compute_started, with management,vhost0 having same IP address

Bug #1646452 reported by sundarkh
Affects: Juniper Openstack (status tracked in Trunk)

          Status    Importance   Assigned to   Milestone
  R3.2    Invalid   Medium       sundarkh
  Trunk   Invalid   Medium       sundarkh

Bug Description

SM:R3.2 Build 6: Provision stops at compute_started, since the node becomes unreachable to puppet because multiple interfaces have the same IP address

1) Install R3.2 Build 6 Mitaka SM on a machine. Installation is successful.
2) Reimage a target node (here, a single-node setup) with ubuntu-14-04. The reimage is successful and creates a single interface as specified in the JSON file.
3) Initiate provisioning. Observe that provisioning gets stuck at compute_started.
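
For context, the reimage and provision steps above are driven through the server-manager CLI; a rough illustration follows (the cluster and package image IDs are placeholders for this setup, and exact option names can differ between SM releases):

server-manager reimage --server_id nodea4 ubuntu-14-04
server-manager provision --cluster_id <cluster_id> <contrail_package_image>
server-manager status server --server_id nodea4    # watch the provisioning state (compute_started in this report)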

Observations
-------------
1) After reimage: p2p1 has the IP address; puppet is reachable.
2) After issuing provision: p2p1 and vhost0 both have the same IP address, so the routing table has two interfaces to the same destination and the node loses connectivity to puppet (a quick check is sketched after the interface output below).
3) Please refer to the attached snapshots.

4) The issue is seen with the multi-interface setup as well, with a slight difference:
--------------------------------------------------------------------------
after reimage
--------------
root@nodea4:~# ifconfig -a | grep 192.168.100 -B1
bond0 Link encap:Ethernet HWaddr 00:25:90:c4:98:a9
          inet addr:192.168.100.1 Bcast:192.168.100.255 Mask:255.255.255.0

After provision
----------------
root@nodea4:~# ifconfig -a | grep 192.168.100 -B1
bond0 Link encap:Ethernet HWaddr 00:25:90:c4:98:a9
          inet addr:192.168.100.1 Bcast:192.168.100.255 Mask:255.255.255.0
--
vhost0 Link encap:Ethernet HWaddr 00:25:90:c4:98:a9
          inet addr:192.168.100.1 Bcast:192.168.100.255 Mask:255.255.255.0
root@nodea4:~#
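
A quick way to confirm the duplicate address and the resulting route ambiguity from the node itself (a diagnostic sketch only; the address and interface names are taken from the outputs above, and iproute2 output formatting may vary):

ip -4 addr show | grep -B2 "192.168.100.1"
# expect both the physical/management interface (p2p1 or bond0) and vhost0 listed
ip route | grep 192.168.100
# two connected routes to the same subnet over different devices is what breaks
# reachability to the puppet master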

In this multi-interface case, connectivity to puppet is intact, but the node still remains in the compute_started state with the following error:

Dec 1 05:29:16 nodea4 puppet-agent[22677]: /bin/bash -c "python /opt/contrail/utils/provision_vrouter.py --host_name nodea4 --host_ip 192.168.100.1 --api_server_ip 192.168.100.4 --oper add --admin_user admin --admin_password contrail123 --admin_tenant_name admin --openstack_ip 192.168.100.4 && echo add-vnc-config >> /etc/contrail/contrail_compute_exec.out" returned 1 instead of one of [0]
Dec 1 05:29:16 nodea4 puppet-agent[22677]: (/Stage[compute]/Contrail::Compute::Add_vnc_config/Exec[add-vnc-config]/returns) change from notrun to 0 failed: /bin/bash -c "python /opt/contrail/utils/provision_vrouter.py --host_name nodea4 --host_ip 192.168.100.1 --api_server_ip 192.168.100.4 --oper add --admin_user admin --admin_password contrail123 --admin_tenant_name admin --openstack_ip 192.168.100.4 && echo add-vnc-config >> /etc/contrail/contrail_compute_exec.out" returned 1 instead of one of [0]
Dec 1 05:29:16 nodea4 pup
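
Since puppet only reports the non-zero exit status, one way to see the underlying failure is to re-run the command from the log by hand on the compute node (arguments copied verbatim from the log above; a debugging suggestion, not something tried in this report):

python /opt/contrail/utils/provision_vrouter.py --host_name nodea4 \
    --host_ip 192.168.100.1 --api_server_ip 192.168.100.4 --oper add \
    --admin_user admin --admin_password contrail123 \
    --admin_tenant_name admin --openstack_ip 192.168.100.4
echo $?   # a traceback or HTTP/auth error here usually points at API/keystone reachability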

WorkAround:
-----------
1) Tried disabling the p2p1 interface to avoid the duplicate route, using
       /sbin/ifconfig p2p1 down
       ip link set dev p2p1 down
   but this did not help (an alternative, not tried here, is sketched after this list).

2) This behavior is seen on Ubuntu. On CentOS, provisioning does not reach compute but gets stuck in the openstack_completed state; separate bugs raised: 1646491, 1646490.
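
An alternative to bringing the interface down, not tried here (a sketch only, assuming the address is meant to live on vhost0 rather than on the physical interface after vrouter provisioning): delete the duplicate address from the physical interface so that only one connected route remains.

ip addr del 192.168.100.1/24 dev p2p1    # assumes a /24 mask, as in the outputs above
ip route | grep 192.168.100              # verify that a single connected route is left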

sundarkh (sundar-kh)
description: updated
summary: - SM:R3.2 Build 6: Provision stops at compute_started, since node becomes
- un-reachable to puppet due to multiple interfaces having same IP address
+ SM:R3.2 Build 6: Provision stops at compute_started, with multiple
+ interfaces having same IP address
Revision history for this message
sundarkh (sundar-kh) wrote : Re: SM:R3.2 Build 6: Provision stops at compute_started, with multiple interfaces having same IP address

After provision, p2p1 and vhost0 have the same IP address.

summary: - SM:R3.2 Build 6: Provision stops at compute_started, with multiple
- interfaces having same IP address
+ SM:R3.2 Build 6: Provision stops at compute_started, with
+ management,vhost0 having same IP address
sundarkh (sundar-kh)
description: updated
description: updated
sundarkh (sundar-kh)
description: updated
description: updated
Revision history for this message
Siva Gurumurthy (sgurumurthy) wrote :

We looked into this issue.

1. This issue is not reproducible in our setups, both single-node and multi-node.
2. Sanity runs were passing and this issue was not seen; it was seen only in Sundar's setup.

In Sundar's setup, the issue occurred in both the single-node/single-interface setup and the multi-node/multi-interface setup.

While we were looking into the single-node setup, Sundar tried Build 7 on the multi-interface setup and the issue is not seen anymore. He is now trying Build 7 on the single-node setup as well.

We also observed that the network in Sundar's setup was very slow, and even after we configured the gateway manually, the gateway was not reachable from his single-node/single-interface node.

Revision history for this message
sundarkh (sundar-kh) wrote :

The issue is not seen in the single-node and multi-interface setups with Build 7.

Revision history for this message
Siva Gurumurthy (sgurumurthy) wrote :

Moving it to the Incomplete state, as this issue can happen again according to Dheeraj and Prasad.
If it is reproducible, we can take a look at it.

Revision history for this message
Jeba Paulaiyan (jebap) wrote :

As per Sundar, this is not seen in the latest builds and happened only in Build 6.

Revision history for this message
Sudheendra Rao (sudheendra-k) wrote :

The problem is not seen on R3.2 builds, but as per comment #4, keeping the bug in the Incomplete state and removing the R3.2.0.0 milestone.
