Intermittent sm provision failure on api down due to cassandra issue

Bug #1732005 reported by wenqing liang
This bug affects 1 person
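+-------------------+-------------------------+------------+---------------+-----------+
| Affects           | Status                  | Importance | Assigned to   | Milestone |
+-------------------+-------------------------+------------+---------------+-----------+
| Juniper Openstack | Status tracked in Trunk |            |               |           |
| R4.0              | Incomplete              | High       | wenqing liang |           |
| R4.1              | Incomplete              | High       | wenqing liang |           |
| Trunk             | Incomplete              | High       | wenqing liang |           |
+-------------------+-------------------------+------------+---------------+-----------+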

Bug Description

Seen in CB R16.04.2-107 newton, but not in the subsequent builds 109 and 111.

== Contrail Config ==
contrail-api: initializing (Collector, Database:Cassandra[] connection down)
contrail-schema: initializing (ApiServer:ApiServer[] connection down)
contrail-svc-monitor: initializing (ApiServer:ApiServer[] connection down)
contrail-device-manager: initializing (ApiServer:ApiServer[] connection down)
contrail-config-nodemgr: active
== Contrail Config ==
contrail-api: initializing
contrail-schema: initializing (ApiServer:ApiServer[] connection down)
contrail-svc-monitor: initializing (ApiServer:ApiServer[] connection down)
contrail-device-manager: initializing (ApiServer:ApiServer[] connection down)
contrail-config-nodemgr: active
== Contrail Config Database==
contrail-database: active
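
For triage, the status output above can be scanned for services that are not active. The following is a minimal sketch, assuming the output keeps the "== Section ==" and "service: state (detail)" layout shown in this report; the function name non_active_services is hypothetical and not part of any Contrail tool.

```python
import re

# Excerpt of the status output pasted in this report.
STATUS_TEXT = """\
== Contrail Config ==
contrail-api: initializing (Collector, Database:Cassandra[] connection down)
contrail-schema: initializing (ApiServer:ApiServer[] connection down)
contrail-config-nodemgr: active
== Contrail Config Database==
contrail-database: active
"""


def non_active_services(text):
    """Return (section, service, state, detail) for each service not 'active'."""
    results = []
    section = None
    for raw in text.splitlines():
        line = raw.strip()
        # Section headers look like "== Contrail Config ==".
        header = re.match(r"^==\s*(.*?)\s*==$", line)
        if header:
            section = header.group(1)
            continue
        # Service lines look like "name: state" with an optional "(detail)".
        m = re.match(r"^([\w-]+):\s+(\S+)(?:\s+\((.*)\))?$", line)
        if m and m.group(2) != "active":
            results.append((section, m.group(1), m.group(2), m.group(3)))
    return results


for section, service, state, detail in non_active_services(STATUS_TEXT):
    print(section, service, state, detail)
```

The detail field is None when a service reports a state without a parenthesised reason, as in the second "contrail-api: initializing" line.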

In the debug.log we see:
"2017-11-08 21:14:42,433-INFO-sm_ansible_callback.py:43-append(): fatal: [10.0.0.9]Traceback (most recent call last):
  File "/usr/share/contrail-utils/provision_analytics_node.py", line 176, in <module>
    main()
  File "/usr/share/contrail-utils/provision_analytics_node.py", line 172, in main
    AnalyticsNodeProvisioner(args_str)
  File "/usr/share/contrail-utils/provision_analytics_node.py", line 45, in __init__
    fq_name=['default-global-system-config'])
  File "/usr/lib/python2.7/dist-packages/vnc_api/vnc_api.py", line 42, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/vnc_api/vnc_api.py", line 532, in _object_read
    res_type, fq_name, fq_name_str, id, ifmap_id)
  File "/usr/lib/python2.7/dist-packages/vnc_api/vnc_api.py", line 859, in _read_args_to_id
    return (True, self.fq_name_to_id(res_type, fq_name))
  File "/usr/lib/python2.7/dist-packages/vnc_api/vnc_api.py", line 42, in wrapper
    return func(self, *args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/vnc_api/vnc_api.py", line 1114, in fq_name_to_id
    raise he

When I run contrail-api in the foreground, it crashes with this error:
Traceback (most recent call last):
  File "/usr/bin/contrail-api", line 9, in <module>
    load_entry_point('vnc-cfg-api-server==0.1.dev0', 'console_scripts', 'contrail-api')()
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/vnc_cfg_api_server.py", line 3551, in server_main
    main(args_str, VncApiServer(args_str))
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/vnc_cfg_api_server.py", line 1446, in __init__
    self._db_init_entries()
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/vnc_cfg_api_server.py", line 2693, in _db_init_entries
    config_version=CONFIG_VERSION))
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/vnc_cfg_api_server.py", line 2819, in _create_singleton_entry
    cass_uuid = self._db_conn._object_db.fq_name_to_uuid(obj_type, fq_name)
  File "/usr/lib/python2.7/dist-packages/cfgm_common/vnc_cassandra.py", line 484, in wrapper
    return func(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/cfgm_common/vnc_cassandra.py", line 1310, in fq_name_to_uuid
    raise VncError('Multi match %s for %s' % (fq_name_str, obj_type))
VncError: Multi match default-global-system-config for global_system_config
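
The "Multi match" error above is raised by fq_name_to_uuid when a single fq_name resolves to more than one UUID. The sketch below shows one way to spot such duplicates offline. It assumes the fq_name index entries have been exported as strings of the form "<fq_name_str>:<uuid>"; that layout, the function name find_multi_matches, and the sample UUIDs are all assumptions for illustration, not taken from the logs in this report.

```python
from collections import defaultdict


def find_multi_matches(index_entries):
    """Map each fq_name string to its UUIDs and keep those with more than one."""
    by_name = defaultdict(set)
    for entry in index_entries:
        # The fq_name itself may contain ':' separators, so split on the last one.
        fq_name_str, _, uuid = entry.rpartition(":")
        by_name[fq_name_str].add(uuid)
    return {name: sorted(uuids) for name, uuids in by_name.items() if len(uuids) > 1}


# Hypothetical sample data; the UUIDs are placeholders.
sample = [
    "default-global-system-config:11111111-1111-1111-1111-111111111111",
    "default-global-system-config:22222222-2222-2222-2222-222222222222",
    "default-domain:default-project:33333333-3333-3333-3333-333333333333",
]
duplicates = find_multi_matches(sample)
print(duplicates)
```

Any fq_name returned here would be a candidate for the condition that stops contrail-api during _db_init_entries.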

            "provision_role_sequence": "{'completed': [(u'server3', 'keepalived', '2017_11_08__20_30_33'), (u'server1', 'keepalived', '2017_11_08__200_36'), (u'server2', 'keepalived', '2017_11_08__20_30_38'), (u'server3', 'haproxy', '2017_11_08__20_31_05'), (u'server1', 'haproxy', '2017_11_08__20__09'), (u'server2', 'haproxy', '2017_11_08__20_31_12'), (u'server3', 'openstack', '2017_11_08__20_42_07'), (u'server1', 'openstack', '2017_11_08__20__14'), (u'server2', 'openstack', '2017_11_08__20_45_33'), (u'server2', 'pre_exec_vnc_galera', '2017_11_08__20_46_28'), (u'server1', 'pre_exec_vnc_gala', '2017_11_08__20_47_28'), (u'server3', 'pre_exec_vnc_galera', '2017_11_08__20_47_55'), (u'server2', 'post_exec_vnc_galera', '2017_11_08__20_48_33' (u'server1', 'post_exec_vnc_galera', '2017_11_08__20_49_06'), (u'server3', 'post_exec_vnc_galera', '2017_11_08__20_49_34'), (u'server1', 'post_provion', '2017_11_08__20_50_09'), (u'server2', 'post_provision', '2017_11_08__20_50_36'), (u'server3', 'post_provision', '2017_11_08__20_50_38')], 'steps []}",

root@servermanager:~# server-manager status server

+----------+---------------------+------------+-------------------+
| id       | status              | ip_address | mac_address       |
+----------+---------------------+------------+-------------------+
| server2  | provision_completed | 10.0.0.5   | 02:A1:6D:72:65:B6 |
| server4  | provision_failed    | 10.0.0.7   | 02:6D:CD:36:01:9A |
| server1  | provision_completed | 10.0.0.4   | 02:6F:55:2B:06:A4 |
| server6  | provision_failed    | 10.0.0.9   | 02:62:63:CD:E1:51 |
| server9  | provision_failed    | 10.0.0.12  | 02:B4:9F:B0:99:90 |
| server8  | provision_failed    | 10.0.0.11  | 02:22:CA:66:15:86 |
| server10 | provision_failed    | 10.0.0.13  | 02:B1:84:1B:0F:2F |
| server5  | provision_failed    | 10.0.0.8   | 02:71:B2:B6:09:1B |
| server7  | provision_failed    | 10.0.0.10  | 02:34:37:29:74:63 |
| server3  | provision_completed | 10.0.0.6   | 02:A9:9B:11:49:15 |
+----------+---------------------+------------+-------------------+
root@servermanager:~#
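
To pull the failed servers out of a table like the one above, a small parser is enough. This is a sketch, assuming the four-column pipe-delimited layout shown in this report; servers_with_status is a hypothetical helper, not a server-manager command.

```python
# Three rows copied from the table in this report.
TABLE_TEXT = """\
+----------+---------------------+------------+-------------------+
| id | status | ip_address | mac_address |
+----------+---------------------+------------+-------------------+
| server2 | provision_completed | 10.0.0.5 | 02:A1:6D:72:65:B6 |
| server4 | provision_failed | 10.0.0.7 | 02:6D:CD:36:01:9A |
| server6 | provision_failed | 10.0.0.9 | 02:62:63:CD:E1:51 |
+----------+---------------------+------------+-------------------+
"""


def servers_with_status(table_text, wanted):
    """Return the ids of all rows whose status column equals `wanted`."""
    ids = []
    for raw in table_text.splitlines():
        line = raw.strip()
        # Skip border lines and anything that is not a table row.
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 4 and cells[1] == wanted:
            ids.append(cells[0])
    return ids


print(servers_with_status(TABLE_TEXT, "provision_failed"))
```

The header row is skipped naturally because its status cell holds the literal word "status".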

/var/log/contrail/ and /var/log/cassandra/ from all three nodes are being uploaded to /cs-shared/bugs/.

Tags: config
Ignatious Johnson Christopher (ijohnson-x) wrote :

We need to collect /var/log/contrail/ and /var/log/cassandra/ from all three controller containers.

Please share the setup/above logs when we hit this issue again.
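
One possible way to gather the requested directories on each controller is sketched below, using only the Python standard library. The helper names existing_dirs and collect_logs and the archive file name in the usage note are hypothetical; the two log paths are the ones requested above.

```python
import os
import tarfile

# Log directories requested in this report.
LOG_DIRS = ["/var/log/contrail/", "/var/log/cassandra/"]


def existing_dirs(paths):
    """Return only the paths that exist as directories on this host."""
    return [p for p in paths if os.path.isdir(p)]


def collect_logs(log_dirs, archive_path):
    """Write the existing log directories into a gzip-compressed tar archive.

    Returns the list of directories that were actually archived.
    """
    found = existing_dirs(log_dirs)
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in found:
            # Store entries under a relative name such as "var/log/contrail".
            tar.add(path, arcname=path.strip("/"))
    return found
```

A call such as collect_logs(LOG_DIRS, "/tmp/controller1-logs.tar.gz") would be run once per controller container; the archive path is a placeholder.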

tags: added: config
removed: analytics