Contrail: upgrade from 2.20 build 16 to build 27 caused some services on the controller not to start

Bug #1458290 reported by Suni Eapen
This bug affects 1 person
Affects              Status         Importance  Assigned to  Milestone
Juniper Openstack    Status tracked in Trunk
  R2.20              Fix Committed  High        Shweta Naik
  Trunk              Fix Committed  High        Shweta Naik

Bug Description

I upgraded Contrail from build 16 to build 27.
After the upgrade, many services are stuck in the initializing state.
My servers are cd-st-lnxserver-07 and cd-st-lnxserver-08 (TSN).

root@cd-st-lnxserver-07:~# contrail-status
== Contrail vRouter ==
supervisor-vrouter: active
contrail-vrouter-agent initializing (Collector, Discovery:Collector, Discovery:dns-server, Discovery:xmpp-server connection down)
contrail-vrouter-nodemgr active

== Contrail Control ==
supervisor-control: active
contrail-control initializing (Number of connections:4, Expected:5)
contrail-control-nodemgr active
contrail-dns active
contrail-named active

== Contrail Analytics ==
supervisor-analytics: active
contrail-analytics-api initializing (Discovery:OpServer connection down)
contrail-analytics-nodemgr active
contrail-collector initializing (Database:cd-st-lnxserver-07:Global, Discovery:Collector connection down)
contrail-query-engine timeout
contrail-snmp-collector active
contrail-topology active

== Contrail Config ==
supervisor-config: active
contrail-api:0 initializing (Discovery:Collector, Collector, Database:Cassandra connection down)
contrail-config-nodemgr active
contrail-device-manager initializing (Collector, Discovery:Collector, Database:Cassandra connection down)
contrail-discovery:0 active
contrail-schema backup
contrail-svc-monitor initializing (Discovery:Collector, Database:Database, Collector connection down)
ifmap active

== Contrail Web UI ==
supervisor-webui: active
contrail-webui active
contrail-webui-middleware active

== Contrail Database ==
supervisor-database: active
contrail-database inactive
contrail-database-nodemgr initializing (Disk for analytics db is too low, cassandra stopped. Cassandra state detected DOWN.)
kafka active

== Contrail Support Services ==
supervisor-support-service: active
rabbitmq-server active

##############TSN###################

root@cd-st-lnxserver-08:~# contrail-status
== Contrail vRouter ==
supervisor-vrouter: active
contrail-tor-agent-1 initializing (Collector, Discovery:Collector, Discovery:dns-server, Discovery:xmpp-server, ToR:st-96s-p2-30 connection down Number of connections:5, Expected: 4)
contrail-tor-agent-2 initializing (Collector, Discovery:Collector, Discovery:dns-server, Discovery:xmpp-server, ToR:st-vc20-48s-p2-02 connection down Number of connections:5, Expected: 4)
contrail-vrouter-agent initializing (Collector, Discovery:Collector, Discovery:dns-server, Discovery:xmpp-server connection down)
contrail-vrouter-nodemgr active
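
For triage, output like the above can be reduced to the services that are not active. The following is a minimal sketch, not part of Contrail: the function name `non_active_services` and the regular expression are illustrative assumptions based only on the line shapes seen in this report. It flags every state other than "active", so a state such as "backup", which may be normal in some deployments, is listed as well.

```python
import re

# Matches lines such as
#   "supervisor-vrouter: active"
#   "contrail-control initializing (Number of connections:4, Expected:5)"
# Section headers ("== ... ==") and shell prompts do not match.
STATUS_LINE = re.compile(
    r"^(?P<name>[\w:.-]+?):?\s+(?P<state>[a-z]+)(?:\s+\((?P<detail>.*)\))?\s*$"
)


def non_active_services(status_text):
    """Return (service, state, detail) for each service not reported as active."""
    problems = []
    for line in status_text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if match and match.group("state") != "active":
            problems.append(
                (match.group("name"), match.group("state"), match.group("detail") or "")
            )
    return problems
```

Feeding it the controller output would list, among others, contrail-database (inactive) and contrail-database-nodemgr together with its "Disk for analytics db is too low" detail, which points at the disk threshold rather than at the individual connection failures.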

Tags: blocker bms qfx
Suni Eapen (seapen)
affects: canonical-identity-provider → juniperopenstack
Changed in juniperopenstack:
importance: Undecided → High
Suni Eapen (seapen)
tags: added: blocker bms qfx
Senthilnathan Murugappan (msenthil) wrote :

Hi Suni,

Your setup should be up now.
It does not meet the default minimum disk requirement of 256 GB.

I have set the minimum disk requirement to 128 GB in the database nodemgr conf (shown below). Going forward, please set minimum_diskGB in testbed.py before initial provisioning.

root@cd-st-lnxserver-07:~# cat /etc/contrail/contrail-database-nodemgr.conf
[DEFAULT]
minimum_diskgb=128

Thanks,
Senthil

-----Original Message-----
From: Raj Sahu
Sent: Monday, May 25, 2015 4:31 PM
To: Suni Eapen
Cc: Chhandak Mukherjee; Hari Prasad Killi; Manish Singh; Ashish Ranjan; Rahul Kasralikar; Senthilnathan Murugappan
Subject: Re: Contrail upgrade went through, but many services stuck in initializing - xmpp server connection down

Ashish, is there any known PR?
Can someone help Suni recover?

+folks

Raj
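
The manual workaround in the comment above (lowering minimum_diskgb in /etc/contrail/contrail-database-nodemgr.conf) could be scripted roughly as follows. This is a sketch under assumptions, not the provisioning code: the helper name `set_minimum_disk_gb` is hypothetical, and it assumes the file is a plain INI file with a [DEFAULT] section, as shown in the `cat` output. Note that `configparser` drops comments when it rewrites a file.

```python
import configparser


def set_minimum_disk_gb(conf_path, minimum_gb):
    """Set minimum_diskgb in the [DEFAULT] section of an INI-style conf file."""
    # interpolation=None so that any '%' in existing values is left untouched.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(conf_path)  # a missing file is treated as empty
    parser["DEFAULT"]["minimum_diskgb"] = str(minimum_gb)
    with open(conf_path, "w") as handle:
        # Write "key=value" without spaces, matching the file shown above.
        parser.write(handle, space_around_delimiters=False)
```

For example, `set_minimum_disk_gb("/etc/contrail/contrail-database-nodemgr.conf", 128)` would reproduce the setting shown in the comment. Whether the node manager needs a restart to pick up the new value is not stated in this report.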

Changed in juniperopenstack:
status: New → Invalid
Changed in juniperopenstack:
status: Invalid → New
status: New → Triaged
assignee: nobody → Shweta Naik (stnaik)
OpenContrail Admin (ci-admin-f) wrote : R2.20

Review in progress for https://review.opencontrail.org/10952
Submitter: Shweta Naik (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : master

Review in progress for https://review.opencontrail.org/10953
Submitter: Shweta Naik (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/10953
Committed: http://github.org/Juniper/contrail-provisioning/commit/b4b161f3d9c38623bafb2d1c68387760ee7f02d5
Submitter: Zuul
Branch: master

commit b4b161f3d9c38623bafb2d1c68387760ee7f02d5
Author: Shweta Naik <email address hidden>
Date: Wed May 27 16:58:36 2015 -0700

1. Stopping the supervisor-vrouter during upgrade only if the service is running
2. Setting the minimum_diskGB in contrail-database-nodemgr.conf during upgrade
Closes-Bug:1458290

Change-Id: Iae878dc45d7636982552915ff63ecd57a9e09201
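
The first item in the commit message, stopping supervisor-vrouter only when it is running, follows a common guard pattern. The sketch below illustrates that pattern only and is not the committed change: the helper names are hypothetical, and it assumes that `service <name> status` exits with 0 only when the service is running, which should be verified for the init scripts in use. The command runner is injectable so the logic can be exercised without touching real services.

```python
import subprocess


def is_running(service, run=subprocess.call):
    """Assumption: `service <name> status` returns exit code 0 only if running."""
    return run(["service", service, "status"]) == 0


def stop_if_running(service, run=subprocess.call):
    """Stop the service only if it is running; return True if a stop was issued."""
    if not is_running(service, run):
        return False
    run(["service", service, "stop"])
    return True
```

Calling `stop_if_running("supervisor-vrouter")` during an upgrade would then skip the stop step on nodes where the service is absent or already stopped, instead of failing there.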

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/10952
Committed: http://github.org/Juniper/contrail-provisioning/commit/a9d1f49609da6a12797d0ee397f4d5ecdfce9b84
Submitter: Zuul
Branch: R2.20

commit a9d1f49609da6a12797d0ee397f4d5ecdfce9b84
Author: Shweta Naik <email address hidden>
Date: Wed May 27 16:58:36 2015 -0700

1. Stopping the supervisor-vrouter during upgrade only if the service is running
2. Setting the minimum_diskGB in contrail-database-nodemgr.conf during upgrade
Closes-Bug:1458290

Change-Id: Iae878dc45d7636982552915ff63ecd57a9e09201
