[4.1.0.0-48~mitaka] vcenter-only provisioning: Rabbitmq cluster not forming and controller and vcplugin containers keep restarting

Bug #1732860 reported by Pavana
This bug affects 1 person
Affects              Status          Importance   Assigned to   Milestone
Juniper Openstack    Status tracked in Trunk
  R4.1               Fix Committed   Critical     Abhay Joshi
  Trunk              Fix Committed   Critical     Abhay Joshi

Bug Description

Issue seen on a fresh install of build 4.1.0.0-48~mitaka (Ubuntu 14.04) on a multi-node vcenter-only setup.

root@nodec4(controller):~# rabbitmqctl cluster_status
Cluster status of node rabbit@nodec4 ...
[{nodes,[{disc,[rabbit@nodec4]}]},
 {running_nodes,[rabbit@nodec4]},
 {cluster_name,<<"<email address hidden>">>},
 {partitions,[]}]

root@nodec4:~# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
578d5d59a006 10.204.216.61:5100/ubuntu14vcenter48-contrail-vcenter-plugin:48 "/bin/sh -c /entry..." About an hour ago Up 46 seconds vcplugin
4e024e3f4ba7 10.204.216.61:5100/ubuntu14vcenter48-contrail-analytics:48 "/bin/sh -c /entry..." About an hour ago Up About an hour analytics
3a9bb330d392 10.204.216.61:5100/ubuntu14vcenter48-contrail-analyticsdb:48 "/bin/sh -c /entry..." About an hour ago Up About an hour analyticsdb
79c7de4756f4 10.204.216.61:5100/ubuntu14vcenter48-contrail-controller:48 "/bin/sh -c /entry..." 2 hours ago Up 19 seconds controller
89b4e6adff6d registry:2 "/entrypoint.sh /e..." 2 hours ago Up 2 hours registry

root@nodec4:~# docker ps -a
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
578d5d59a006 10.204.216.61:5100/ubuntu14vcenter48-contrail-vcenter-plugin:48 "/bin/sh -c /entry..." 2 hours ago Up 4 minutes vcplugin
4e024e3f4ba7 10.204.216.61:5100/ubuntu14vcenter48-contrail-analytics:48 "/bin/sh -c /entry..." 2 hours ago Up 2 hours analytics
3a9bb330d392 10.204.216.61:5100/ubuntu14vcenter48-contrail-analyticsdb:48 "/bin/sh -c /entry..." 2 hours ago Up 2 hours analyticsdb
79c7de4756f4 10.204.216.61:5100/ubuntu14vcenter48-contrail-controller:48 "/bin/sh -c /entry..." 2 hours ago Up About a minute controller
89b4e6adff6d registry:2 "/entrypoint.sh /e..." 2 hours ago Up 2 hours registry

root@nodec4(controller):~# service rabbitmq-server status
Status of node rabbit@nodec4 ...
[{pid,20427},
 {running_applications,[{rabbit,"RabbitMQ","3.5.0"},
                        {os_mon,"CPO CXC 138 46","2.2.14"},
                        {mnesia,"MNESIA CXC 138 12","4.11"},
                        {xmerl,"XML parser","1.3.5"},
                        {sasl,"SASL CXC 138 11","2.3.4"},
                        {stdlib,"ERTS CXC 138 10","1.19.4"},
                        {kernel,"ERTS CXC 138 10","2.16.4"}]},
 {os,{unix,linux}},
 {erlang_version,"Erlang R16B03 (erts-5.10.4) [source] [64-bit] [smp:4:4] [async-threads:30] [kernel-poll:true]\n"},
 {memory,[{total,58353960},
          {connection_readers,141208},
          {connection_writers,40776},
          {connection_channels,90848},
          {connection_other,278632},
          {queue_procs,206040},
          {queue_slave_procs,0},
          {plugins,0},
          {other_proc,13417016},
          {mnesia,91512},
          {mgmt_db,0},
          {msg_index,46496},
          {other_ets,785320},
          {binary,21778664},
          {code,16351158},
          {atom,561761},
          {other_system,4564529}]},
 {alarms,[]},
 {listeners,[{clustering,25672,"::"},{amqp,5672,"0.0.0.0"}]},
 {vm_memory_high_watermark,0.4},
 {vm_memory_limit,13483900928},
 {disk_free_limit,50000000},
 {disk_free,421375651840},
 {file_descriptors,[{total_limit,3996},
                    {total_used,18},
                    {sockets_limit,3594},
                    {sockets_used,16}]},
 {processes,[{limit,1048576},{used,302}]},
 {run_queue,0},
 {uptime,83}]

Ignatious Johnson Christopher (ijohnson-x) wrote :

+ Abhay

Hi Pavana,

All the RabbitMQ nodes started blank and formed independent single-node clusters.

=WARNING REPORT==== 17-Nov-2017::10:49:25 ===
Could not find any node for auto-clustering from: [rabbit@puppet,
                                                   rabbit@nodec5,
                                                   rabbit@nodec6]
Starting blank node...

=INFO REPORT==== 17-Nov-2017::10:49:26 ===

This happens because the hostname of the first node is populated incorrectly in /etc/rabbitmq/rabbitmq.config (rabbit@puppet instead of rabbit@nodec4), which in turn is caused by an incorrect entry in the hosts file.

root@nodec5:~# cat /etc/hosts

# The following lines are desirable for IPv6 capable hosts
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
127.0.0.1 localhost.englab.juniper.net localhost
10.204.216.62 nodec5.englab.juniper.net nodec5
10.204.216.61 puppet
127.0.0.1 nodec5
10.204.216.63 nodec6
10.204.216.62 nodec5
10.204.216.61 nodec4
root@nodec5:~#
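
The effect of the entry order can be illustrated with a small sketch. It assumes that resolving an address to a name returns the first name on the first matching line of /etc/hosts, which is how file-based lookups typically behave. The helper name `first_name_for_ip` is hypothetical and is not part of smlite or contrail-ansible.

```python
# Contents of /etc/hosts on nodec5, copied from the dump above.
HOSTS = """\
::1 localhost ip6-localhost ip6-loopback
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
127.0.0.1 localhost.englab.juniper.net localhost
10.204.216.62 nodec5.englab.juniper.net nodec5
10.204.216.61 puppet
127.0.0.1 nodec5
10.204.216.63 nodec6
10.204.216.62 nodec5
10.204.216.61 nodec4
"""


def first_name_for_ip(hosts_text, ip):
    """Return the first name listed for `ip`, scanning lines top to bottom.

    Hypothetical helper: it mimics a first-match lookup and ignores
    comments and blank lines.
    """
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if fields[0] == ip and len(fields) > 1:
            return fields[1]
    return None


print(first_name_for_ip(HOSTS, "10.204.216.61"))  # puppet
print(first_name_for_ip(HOSTS, "10.204.216.63"))  # nodec6
```

Because the "puppet" line precedes the "nodec4" line, a first-match lookup for 10.204.216.61 yields "puppet", which matches the rabbit@puppet name seen in the auto-clustering warning.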

Abhay,

I guess "10.204.216.61 puppet" was populated by smlite, and the following entries were then added by contrail-ansible:
10.204.216.63 nodec6
10.204.216.62 nodec5
10.204.216.61 nodec4

If "10.204.216.61 puppet" is not required, do not add it.
If it is required, contrail-ansible should add these entries before the line "10.204.216.61 puppet".

Thanks,
Ignatious

Changed in juniperopenstack:
importance: Undecided → Critical
milestone: none → r5.0.0

Pavana (pavanap)
information type: Proprietary → Public

OpenContrail Admin (ci-admin-f) wrote : [Review update] R4.1

Review in progress for https://review.opencontrail.org/37855
Submitter: kamlesh parmar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/37856
Submitter: kamlesh parmar (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/37855
Committed: http://github.com/Juniper/contrail-ansible/commit/751925ecbd9e641d0fcf2fd9c8cc00b3667c2bb9
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit 751925ecbd9e641d0fcf2fd9c8cc00b3667c2bb9
Author: Kamlesh Parmar <email address hidden>
Date: Fri Nov 24 23:16:16 2017 -0800

remove puppet entry from /etc/hosts before rabbitmq hosts and
add it after rabbitmq hosts are added.
Closes-Bug: #1732860

Change-Id: I40a9339af1f2d4f95663f3cb23f8e8d960e7f67c
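
The reordering described in the commit message can be sketched roughly as follows. This is not the code from the contrail-ansible change, only an illustration of the same sequence (remove the puppet entry, add the RabbitMQ host entries, then re-add the puppet entry). The function name `move_entry_after` and the sample input are hypothetical.

```python
def move_entry_after(hosts_lines, entry, cluster_entries):
    """Return a new list of hosts lines in which `entry` comes last.

    Steps: drop `entry`, append any missing cluster entries, then
    append `entry` again so the cluster names win a first-match lookup.
    Lines are compared field by field so spacing differences are ignored.
    """
    kept = [l for l in hosts_lines if l.split() != entry.split()]
    for c in cluster_entries:
        if not any(l.split() == c.split() for l in kept):
            kept.append(c)
    kept.append(entry)
    return kept


before = ["127.0.0.1 localhost", "10.204.216.61 puppet"]
cluster = [
    "10.204.216.63 nodec6",
    "10.204.216.62 nodec5",
    "10.204.216.61 nodec4",
]
for line in move_entry_after(before, "10.204.216.61 puppet", cluster):
    print(line)
```

With this ordering the "nodec4" line precedes the "puppet" line, so a lookup that takes the first match for 10.204.216.61 would return "nodec4".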

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/37856
Committed: http://github.com/Juniper/contrail-ansible/commit/d7d8de4f181f74a5a58ea0b05db4dd2e48864baf
Submitter: Zuul (<email address hidden>)
Branch: master

commit d7d8de4f181f74a5a58ea0b05db4dd2e48864baf
Author: Kamlesh Parmar <email address hidden>
Date: Fri Nov 24 23:16:16 2017 -0800

remove puppet entry from /etc/hosts before rabbitmq hosts and
add it after rabbitmq hosts are added.
Closes-Bug: #1732860

Change-Id: I40a9339af1f2d4f95663f3cb23f8e8d960e7f67c
