Config backup not restored post upgrade to 3.1.3.0-85.

Bug #1729348 reported by vijaya kumar shankaran
This bug affects 1 person
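Affects: Juniper Openstack (status tracked in Trunk)
+--------+---------+------------+---------------+-----------+
| Series | Status  | Importance | Assigned to   | Milestone |
+--------+---------+------------+---------------+-----------+
| R3.1   | Invalid | High       | Sachin Bansal |           |
| R3.2   | Invalid | High       | Sachin Bansal |           |
| R4.0   | Invalid | High       | Sachin Bansal |           |
| R4.1   | Invalid | High       | Sachin Bansal |           |
| Trunk  | Invalid | High       | Sachin Bansal |           |
+--------+---------+------------+---------------+-----------+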

Bug Description

Following the post-3.0.3 changes to the Cassandra DB design for the analytics DB and config DB, the customer is encountering an error during config restore.
The customer is running 3.1.3.0-85 with a split DB setup: a 9-node cluster (3 control and config nodes, 3 collectors, 3 databases). The customer created a virtual network (VN) and backed up Cassandra and Zookeeper:
fab backup_cassandra_db
fab backup_zookeeper_data

They then deleted one of the VNs in the above setup and restored the config:
fab restore_zookeeper_data
fab restore_cassandra_db

The deleted VN is not restored.
On a single node this works fine. However, on a distributed setup with a separate DB, the DB services on the config nodes are not restarted; they are only restarted on the database nodes.

Is there a way to back up the DB only from the config nodes, instead of backing up from the database nodes running analytics?

Testbed.py is attached. The config nodes are:
cc21 = 'root@172.23.1.91'
cc22 = 'root@172.23.1.92'
cc23 = 'root@172.23.1.93'

From the logs below, DB instances are stopped/started on the config nodes upon restoring config.

2017-10-05 15:51:20:539357: [root@172.23.1.91] sudo: python -c 'from platform import linux_distribution; print linux_distribution()'
2017-10-05 15:51:20:540400: [root@172.23.1.91] out: ('Ubuntu', '14.04', 'trusty')
2017-10-05 15:51:21:167493: [root@172.23.1.91] out:
2017-10-05 15:51:21:168026:
2017-10-05 15:51:21:168341: [root@172.23.1.91] sudo: service supervisor-config stop
2017-10-05 15:51:21:168694: [root@172.23.1.91] out: supervisor-config stop/waiting
2017-10-05 15:51:30:479161: [root@172.23.1.91] out:
2017-10-05 15:51:30:479569:
2017-10-05 15:51:30:480165: [root@172.23.1.91] sudo: service neutron-server stop
2017-10-05 15:51:30:480423: [root@172.23.1.91] out: neutron-server stop/waiting
2017-10-05 15:51:33:416024: [root@172.23.1.91] out:
2017-10-05 15:51:33:416410:
2017-10-05 15:51:33:416656: [root@172.23.1.91] sudo: service supervisor-support-service stop
2017-10-05 15:51:33:416886: [root@172.23.1.91] out: supervisor-support-service stop/waiting
2017-10-05 15:51:36:920370: [root@172.23.1.91] out:
2017-10-05 15:51:36:920720:
2017-10-05 15:51:36:920939: [root@172.23.1.92] Executing task 'stop_cfgm'
2017-10-05 15:51:36:921210: [root@172.23.1.92] sudo: python -c 'from platform import linux_distribution; print linux_distribution()'
2017-10-05 15:51:36:921675: [root@172.23.1.92] out: ('Ubuntu', '14.04', 'trusty')
2017-10-05 15:51:38:931073: [root@172.23.1.92] out:
2017-10-05 15:51:38:932036:
2017-10-05 15:51:38:932738: [root@172.23.1.92] sudo: service supervisor-config stop
2017-10-05 15:51:38:933712: [root@172.23.1.92] out: supervisor-config stop/waiting
2017-10-05 15:51:48:676875: [root@172.23.1.92] out:
2017-10-05 15:51:48:678389:
2017-10-05 15:51:48:682834: [root@172.23.1.92] sudo: service neutron-server stop
2017-10-05 15:51:48:683399: [root@172.23.1.92] out: neutron-server stop/waiting
2017-10-05 15:51:51:457170: [root@172.23.1.92] out:
2017-10-05 15:51:51:458414:
2017-10-05 15:51:51:459132: [root@172.23.1.92] sudo: service supervisor-support-service stop
2017-10-05 15:51:51:459657: [root@172.23.1.92] out: supervisor-support-service stop/waiting
2017-10-05 15:51:54:802415: [root@172.23.1.92] out:
2017-10-05 15:51:54:802787:
2017-10-05 15:51:54:803021: [root@172.23.1.93] Executing task 'stop_cfgm'
2017-10-05 15:51:54:803288: [root@172.23.1.93] sudo: python -c 'from platform import linux_distribution; print linux_distribution()'
2017-10-05 15:51:54:803683: [root@172.23.1.93] out: ('Ubuntu', '14.04', 'trusty')
2017-10-05 15:51:57:021561: [root@172.23.1.93] out:
2017-10-05 15:51:57:021929:
2017-10-05 15:51:57:022243: [root@172.23.1.93] sudo: service supervisor-config stop
2017-10-05 15:51:57:022597: [root@172.23.1.93] out: supervisor-config stop/waiting
2017-10-05 15:52:05:936336: [root@172.23.1.93] out:
2017-10-05 15:52:05:936898:
2017-10-05 15:52:05:937212: [root@172.23.1.93] sudo: service neutron-server stop
2017-10-05 15:52:05:937485: [root@172.23.1.93] out: neutron-server stop/waiting
2017-10-05 15:52:07:073111: [root@172.23.1.93] out:
2017-10-05 15:52:07:073530:
2017-10-05 15:52:07:073768: [root@172.23.1.93] sudo: service supervisor-support-service stop
2017-10-05 15:52:07:073980: [root@172.23.1.93] out: supervisor-support-service stop/waiting
2017-10-05 15:52:09:762165: [root@172.23.1.93] out:
2017-10-05 15:52:09:762547:
2017-10-05 15:52:09:765437: [root@172.23.1.130] Executing task 'stop_database'
2017-10-05 15:52:09:765811: [root@172.23.1.130] sudo: service contrail-database stop
2017-10-05 15:52:09:766069: [root@172.23.1.130] sudo: service supervisor-database stop
2017-10-05 15:52:13:539517: [root@172.23.1.130] out: supervisor-database stop/waiting
2017-10-05 15:52:17:180344: [root@172.23.1.130] out:
2017-10-05 15:52:17:180812:
2017-10-05 15:52:17:181081: [root@172.23.1.131] Executing task 'stop_database'
2017-10-05 15:52:17:181325: [root@172.23.1.131] sudo: service contrail-database stop
2017-10-05 15:52:17:181598: [root@172.23.1.131] sudo: service supervisor-database stop
2017-10-05 15:52:19:696980: [root@172.23.1.131] out: supervisor-database stop/waiting
2017-10-05 15:52:31:606944: [root@172.23.1.131] out:
2017-10-05 15:52:31:607191:
2017-10-05 15:52:31:607500: [root@172.23.1.132] Executing task 'stop_database'
2017-10-05 15:52:31:607792: [root@172.23.1.132] sudo: service contrail-database stop
2017-10-05 15:52:31:607980: [root@172.23.1.132] sudo: service supervisor-database stop
2017-10-05 15:52:34:516530: [root@172.23.1.132] out: supervisor-database stop/waiting
2017-10-05 15:52:46:482278: [root@172.23.1.132] out:
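
To make it easier to see which hosts had which services stopped, a log like the one above can be summarised per host. The following is only a minimal sketch: the function names `services_by_host` and `hosts_missing_service` are hypothetical, and the regular expression assumes every relevant line has the `[user@host] sudo: service <name> <action>` shape seen in this log.

```python
import re
from collections import defaultdict

# Matches fab log lines such as:
#   2017-10-05 15:51:21:168341: [root@172.23.1.91] sudo: service supervisor-config stop
# (assumed shape, taken from the log excerpt in this report)
SERVICE_CMD = re.compile(
    r"\[(?P<user>[^@\]]+)@(?P<host>[^\]]+)\]\s+sudo:\s+service\s+"
    r"(?P<service>\S+)\s+(?P<action>\S+)"
)


def services_by_host(log_text):
    """Return {host: [(service, action), ...]} in log order, without duplicates."""
    result = defaultdict(list)
    for line in log_text.splitlines():
        match = SERVICE_CMD.search(line)
        if not match:
            continue  # not a "service <name> <action>" command line
        entry = (match.group("service"), match.group("action"))
        if entry not in result[match.group("host")]:
            result[match.group("host")].append(entry)
    return dict(result)


def hosts_missing_service(log_text, hosts, service):
    """Return the hosts from `hosts` on which `service` was never touched."""
    by_host = services_by_host(log_text)
    return [
        host for host in hosts
        if service not in {name for name, _ in by_host.get(host, [])}
    ]


# Four lines copied from the log above, used as sample input.
SAMPLE = """\
2017-10-05 15:51:21:168341: [root@172.23.1.91] sudo: service supervisor-config stop
2017-10-05 15:51:30:480165: [root@172.23.1.91] sudo: service neutron-server stop
2017-10-05 15:52:09:765811: [root@172.23.1.130] sudo: service contrail-database stop
2017-10-05 15:52:09:766069: [root@172.23.1.130] sudo: service supervisor-database stop
"""

if __name__ == "__main__":
    for host, entries in services_by_host(SAMPLE).items():
        print(host, entries)
```

Calling `hosts_missing_service(log_text, ["172.23.1.91", "172.23.1.92", "172.23.1.93"], "contrail-database")` on the full log would list the config nodes on which no `contrail-database` command was issued.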

Tags: config
vijaya kumar shankaran (vijayks) wrote :
information type: Proprietary → Public
Jeba Paulaiyan (jebap)
tags: added: config
vijaya kumar shankaran (vijayks) wrote :

Hi,

Can we have this assigned?

Best Regards,
Vijay Kumar

vijaya kumar shankaran (vijayks) wrote :

Hi Jeba,

Can we have this ticket assigned and worked on? The customer is escalating the support ticket.

Best Regards,
Vijay Kumar

vijaya kumar shankaran (vijayks) wrote :

Hi Team,

The customer is looking for best practices to back up and restore config data in an environment with separate config and analytics DBs. The customer confirmed that this works with JSON-based export/import.

Do we have to back up/restore with fab, or
do we recommend JSON-based export/import?

What is the recommended approach?

Best Regards,
Vijay Kumar

Ignatious Johnson Christopher (ijohnson-x) wrote :

Hi Vijay,

1) When cfgm and database are on different nodes:

Stopping supervisor-database does not seem to stop Cassandra on the cfgm nodes.
Stopping contrail-database does stop Cassandra on the db nodes.

We need to check which service stops and starts cfgm_cassandra, or else we need to stop/start the Cassandra service explicitly.

Restarting Cassandra on the cfgm nodes successfully retrieved the deleted VN "right":

root@ntt-os1:~# openstack network list
+--------------------------------------+-------------------------+--------------------------------------+
| ID | Name | Subnets |
+--------------------------------------+-------------------------+--------------------------------------+
| 1d63ff42-3684-47dd-90c0-5915f3e7c16f | left | 671b353e-e79d-4b6d-bbb4-41d43656ecea |
| 51060bc0-fa70-4b40-8ca0-70167fb6bd9d | default-virtual-network | |
| eeaa6440-6093-46ef-a4a1-36894820aeca | ip-fabric | |
| 50b6113b-bcd1-453d-a189-1421210648ac | right | 4053df36-f4d5-4cb2-942f-482906b6f61f |
| 8168c719-bb1b-4fab-b22e-72e3b80742dc | __link_local__ | |
+--------------------------------------+-------------------------+--------------------------------------+
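
A check like the one above can be scripted by reading the Name column out of the `openstack network list` output. This is only a rough sketch, assuming the table layout shown here; `network_names` and `missing_networks` are hypothetical helper names, not part of any Contrail or OpenStack tooling.

```python
def network_names(table_text):
    """Extract the Name column from `openstack network list` table output."""
    names = []
    name_idx = None
    for raw in table_text.splitlines():
        line = raw.strip()
        if not line.startswith("|"):
            continue  # skip +---+ borders, shell prompts, blank lines
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if name_idx is None:
            # The first row starting with "|" is assumed to be the header.
            name_idx = cells.index("Name")
            continue
        names.append(cells[name_idx])
    return names


def missing_networks(expected, table_text):
    """Return the expected network names that are absent from the table, sorted."""
    present = set(network_names(table_text))
    return sorted(set(expected) - present)


# Table copied from the output above.
SAMPLE_TABLE = """\
+--------------------------------------+-------------------------+--------------------------------------+
| ID | Name | Subnets |
+--------------------------------------+-------------------------+--------------------------------------+
| 1d63ff42-3684-47dd-90c0-5915f3e7c16f | left | 671b353e-e79d-4b6d-bbb4-41d43656ecea |
| 51060bc0-fa70-4b40-8ca0-70167fb6bd9d | default-virtual-network | |
| eeaa6440-6093-46ef-a4a1-36894820aeca | ip-fabric | |
| 50b6113b-bcd1-453d-a189-1421210648ac | right | 4053df36-f4d5-4cb2-942f-482906b6f61f |
| 8168c719-bb1b-4fab-b22e-72e3b80742dc | __link_local__ | |
+--------------------------------------+-------------------------+--------------------------------------+
"""

if __name__ == "__main__":
    print(network_names(SAMPLE_TABLE))
    print(missing_networks(["left", "right"], SAMPLE_TABLE))
```

Saving the list of names before the backup and passing it as `expected` after the restore would show any VN that did not come back.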

Thanks,
Aswani Kumar
From: Vijay Kumar Shankaran <email address hidden>
Date: Monday, 13 November 2017 at 1:47 PM
To: Aswani Kumar Gaddam <email address hidden>, Ignatious Johnson <email address hidden>
Cc: Madhava Rao Sudheendra Rao <email address hidden>, Sandeep Sridhar <email address hidden>
Subject: RE: need help on Launchpad Bug 1729348

Hi Aswani,

This issue is not fixed.

Setup details are as follows

Contrail UI

10.204.74.234:8080
admin/contrail

Openstack UI
10.204.74.240/horizon
admin/contrail

For shell access username\password is root\contrail.

Best Regards,
Vijay Kumar

From: Aswani Kumar Gaddam
Sent: Monday, November 13, 2017 12:53 PM
To: Vijay Kumar Shankaran <email address hidden>; Ignatious Johnson <email address hidden>
Cc: Sudheendra Rao <email address hidden>; Sandeep Sridhar <email address hidden>
Subject: Re: need help on Launchpad Bug 1729348

Hi Vijay,

1) For fab:
In the bug you mentioned ‘DB services on the config nodes are not restarted’.
But there is a code check-in by Ignatious that takes care of this:
https://github.com/Juniper/contrail-fabric-utils/commit/d6682ad757e35fa170738570f9a99d1b3ced9947

If possible, could you provide a setup?

2) I am not sure about the second question or the best practice, so I am adding Ignatious to this thread.

From: Vijay Kumar Shankaran <email address hidden>
Date: Monday, 13 November 2017 at 12:21 PM
To: Aswani Kumar Gaddam <email address hidden>
Cc: Madhava Rao Sudheendra Rao <email address hidden>, Sandeep Sridhar <email address hidden>
Subject: need help on Launchpad Bug 1729348

Hi Aswani,

This is a follow-up to an earlier issue in which Cassandra was part of supervisor-database but has since been moved to the separate contrail-database service; restarting contrail-database fixed that issue.

In contrai...


Sachin Bansal (sbansal) wrote :

There is no plan to support any other backup/restore at this point. We should use json based backup/restore as noted already. Closing the bug as invalid.

Randeep Jalli (rj2083) wrote :

Sachin, can you provide the procedure for json based backup/restore or a link to it?
