Comment 0 for bug 1665486

Piyush Srivastava (piyush0101) wrote :

After a reboot of the contrail services, it looks like contrail-schema tries to clean up stale routing instance objects. We were seeing a lot of errors from schema failing to delete routing instance objects.

02/10/2017 08:54:41 PM [contrail-schema]: Error while deleting routing instance default-domain:wd5-ttint.az2.eng.pdx.wd:e2:e2: HTTP Status: 500 Content: Internal Server Error
02/10/2017 08:54:43 PM [contrail-schema]: Error while deleting routing instance default-domain:wd5-ttprod.az2.eng.pdx.wd:e2:e2: HTTP Status: 500 Content: Internal Server Error
02/10/2017 08:59:34 PM [contrail-schema]: Starting Introspect on HTTP Port 8087
02/10/2017 08:59:34 PM [contrail-schema]: Cannot write http_port 8087 to /tmp/contrail-schema.2826.http_port
02/10/2017 08:59:39 PM [contrail-schema]: Error while deleting routing instance default-domain:wd5-ttint.az2.eng.pdx.wd:e2:e2: HTTP Status: 500 Content: Internal Server Error
02/10/2017 08:59:39 PM [contrail-schema]: Error while deleting routing instance default-domain:wd5-ttprod.az2.eng.pdx.wd:e2:e2: HTTP Status: 500 Content: Internal Server Error

On closer inspection, we found that the routing instance objects were missing the 'fq_name' attribute, which caused schema to throw exceptions and crash. As a side effect, tap interfaces for new VMs on OpenStack were not receiving a VRF and were showing in ERROR state. To work around this problem, we added the following patch to

/usr/lib/python2.6/site-packages/vnc_cfg_api_server/vnc_cfg_api_server.py

    # lines 1313-1320 of the patched file
    obj_dict = self._db_conn.uuid_to_obj_dict(uuid)
    if 'fq_name' not in obj_dict:  # patched line
        return (True, '')  # patched line
    parent_fq_name = json.loads(obj_dict['fq_name'])[:-1]
    try:
        parent_uuid = self._db_conn.fq_name_to_uuid(
            parent_type, parent_fq_name)
    except NoIdError:
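
For anyone who wants to see the guard outside the API server, here is a minimal standalone sketch of the same check. It is not the actual vnc_cfg_api_server code: the helper name parent_fq_name_or_none and the sample dicts are hypothetical, and it only assumes (as the excerpt above suggests) that 'fq_name' is stored as a JSON-encoded list whose last element is the object's own name.

```python
import json


def parent_fq_name_or_none(obj_dict):
    """Return the parent's fq_name list, or None if the record has no 'fq_name'.

    Hypothetical helper for illustration; not part of vnc_cfg_api_server.
    """
    raw = obj_dict.get('fq_name')
    if raw is None:
        # Corrupted record: there is nothing to resolve, so the caller
        # can skip the parent lookup instead of raising KeyError.
        return None
    # The parent's fq_name is everything except the last element.
    return json.loads(raw)[:-1]


# Made-up sample records: one intact, one missing 'fq_name'.
good = {'fq_name': json.dumps(['default-domain', 'proj', 'vn', 'vn'])}
bad = {'uuid': 'hypothetical-uuid'}

print(parent_fq_name_or_none(good))
print(parent_fq_name_or_none(bad))
```

Like the patch, this only hides the symptom so the delete can proceed. It does not explain how the records lost 'fq_name' in the first place.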

Why are the routing instances getting into a corrupted state, and what is the proper fix for this issue?