db_manage.py reports clean_subnet_addr_alloc exception when running cleaner

Bug #1768265 reported by Ning Zhong on 2018-05-01
This bug affects 2 people
Affects: Juniper Openstack (status tracked in Trunk)

Series   Status          Importance   Assigned to       Milestone
R3.2     Fix Committed   High         Édouard Thuleau
R4.0     Fix Committed   High         Édouard Thuleau
R4.1     Fix Committed   High         Édouard Thuleau
R5.0     Fix Committed   High         Édouard Thuleau
Trunk    Fix Committed   High         Édouard Thuleau

Bug Description

When running database cleanup on a Contrail 3.2.9.0 cluster, we always see the following “clean_subnet_addr_alloc” exception on the first round of clean/heal. However, running “db_manage –check” after the first clean confirms that the reported entry has already been removed, and a second run of clean/heal no longer raises any exception. I have attached the db_manage script and the clean and heal logs with the error for reference.

ERROR: Cleaner clean_subnet_addr_alloc: Exception, <class 'kazoo.exceptions.NoNodeError'>
Python 2.7.6: /usr/bin/python
Mon Apr 30 22:15:46 2018

A problem occurred in a Python script. Here is the sequence of function calls leading up to the error, in the order they occurred.

 /usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py in wrapper(*args=(<__main__.DatabaseCleaner object>,), **kwargs={})
 1380 self = args[0]
 1381 try:
 1382 errors = func(*args, **kwargs)
 1383 if not errors:
 1384 self._logger.info('Cleaner %s: Success' % func.__name__)
errors undefined
func = <function clean_subnet_addr_alloc>
args = (<__main__.DatabaseCleaner object>,)
kwargs = {}

 /usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py in clean_subnet_addr_alloc(self=<__main__.DatabaseCleaner object>)
 1824 self._zk_client.delete(path, recursive=True)
 1825 if path_no_mask != path:
 1826 self._zk_client.delete(path_no_mask, recursive=False)
 1827 if vn in zk_all_vns:
 1828 zk_all_vns[vn].pop(sn_key, None)
self = <__main__.DatabaseCleaner object>
self._zk_client = <kazoo.client.KazooClient object>
self._zk_client.delete = <bound method KazooClient.delete of <kazoo.client.KazooClient object>>
path_no_mask = u'/api-server/subnets/default-domain:Mobility_AT...-bd3d0306-12b0-4b57-a493-79be0b7eaad1:172.26.0.0'
recursive undefined
builtinFalse = False

 /usr/lib/python2.7/dist-packages/kazoo/client.py in delete(self=<kazoo.client.KazooClient object>, path=u'/api-server/subnets/default-domain:Mobility_AT...-bd3d0306-12b0-4b57-a493-79be0b7eaad1:172.26.0.0', version=-1, recursive=False)
 1157 return self._delete_recursive(path)
 1158 else:
 1159 return self.delete_async(path, version).get()
 1160
 1161 def delete_async(self, path, version=-1):
self = <kazoo.client.KazooClient object>
self.delete_async = <bound method KazooClient.delete_async of <kazoo.client.KazooClient object>>
path = u'/api-server/subnets/default-domain:Mobility_AT...-bd3d0306-12b0-4b57-a493-79be0b7eaad1:172.26.0.0'
version = -1
).get undefined

 /usr/lib/python2.7/dist-packages/kazoo/handlers/threading.py in get(self=<kazoo.handlers.threading.AsyncResult object>, block=True, timeout=None)
  105 if self._exception is None:
  106 return self.value
  107 raise self._exception
  108
  109 # if we get to this point we timeout
self = <kazoo.handlers.threading.AsyncResult object>
self._exception = NoNodeError((), {})
<class 'kazoo.exceptions.NoNodeError'>: ((), {})
    __class__ = <class 'kazoo.exceptions.NoNodeError'>
    __delattr__ = <method-wrapper '__delattr__' of NoNodeError object>
    __dict__ = {}
    __doc__ = None
    __format__ = <built-in method __format__ of NoNodeError object>
    __getattribute__ = <method-wrapper '__getattribute__' of NoNodeError object>
    __getitem__ = <method-wrapper '__getitem__' of NoNodeError object>
    __getslice__ = <method-wrapper '__getslice__' of NoNodeError object>
    __hash__ = <method-wrapper '__hash__' of NoNodeError object>
    __init__ = <method-wrapper '__init__' of NoNodeError object>
    __module__ = 'kazoo.exceptions'
    __new__ = <built-in method __new__ of type object>
    __reduce__ = <built-in method __reduce__ of NoNodeError object>
    __reduce_ex__ = <built-in method __reduce_ex__ of NoNodeError object>
    __repr__ = <method-wrapper '__repr__' of NoNodeError object>
    __setattr__ = <method-wrapper '__setattr__' of NoNodeError object>
    __setstate__ = <built-in method __setstate__ of NoNodeError object>
    __sizeof__ = <built-in method __sizeof__ of NoNodeError object>
    __str__ = <method-wrapper '__str__' of NoNodeError object>
    __subclasshook__ = <built-in method __subclasshook__ of type object>
    __unicode__ = <built-in method __unicode__ of NoNodeError object>
    __weakref__ = None
    args = ((), {})
    code = -101
    message = ''

The above is a description of an error in a Python program. Here is
the original traceback:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py", line 1382, in wrapper
    errors = func(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py", line 1826, in clean_subnet_addr_alloc
    self._zk_client.delete(path_no_mask, recursive=False)
  File "/usr/lib/python2.7/dist-packages/kazoo/client.py", line 1159, in delete
    return self.delete_async(path, version).get()
  File "/usr/lib/python2.7/dist-packages/kazoo/handlers/threading.py", line 107, in get
    raise self._exception
NoNodeError: ((), {})

<class 'kazoo.exceptions.NoNodeError'>
Python 2.7.6: /usr/bin/python
Mon Apr 30 22:15:46 2018

A problem occurred in a Python script. Here is the sequence of
function calls leading up to the error, in the order they occurred.

 /usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py in <module>()
 2397 # end main
 2398
 2399
 2400 if __name__ == '__main__':
 2401 main()
main = <function main>

 /usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py in main()
 2380 verb = args.operation
 2381 if 'db_%s' % (verb) in globals():
 2382 return globals()['db_%s' % (verb)](args, api_args)
 2383
 2384 if getattr(DatabaseChecker, verb, None):
builtinglobals = <built-in function globals>
verb = 'clean'
args = Namespace(api_conf='/etc/contrail/contrail-api.c...e, execute=True, operation='clean', verbose=True)
api_args = Namespace(aaa_mode='rbac', admin_password='***', ad...e, worker_id='0', zk_server_ip='10.174.8.4:2181')

 /usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py in db_clean(args=Namespace(api_conf='/etc/contrail/contrail-api.c...e, execute=True, operation='clean', verbose=True), api_args=Namespace(aaa_mode='rbac', admin_password='***', ad...e, worker_id='0', zk_server_ip='10.174.8.4:2181'))
 2332 db_cleaner.clean_stale_virtual_network_id()
 2333 db_cleaner.clean_stale_security_group_id()
 2334 db_cleaner.clean_subnet_addr_alloc()
 2335
 2336
db_cleaner = <__main__.DatabaseCleaner object>
db_cleaner.clean_subnet_addr_alloc = <bound method DatabaseCleaner.clean_subnet_addr_alloc of <__main__.DatabaseCleaner object>>

 /usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py in wrapper(*args=(<__main__.DatabaseCleaner object>,), **kwargs={})
 1380 self = args[0]
 1381 try:
 1382 errors = func(*args, **kwargs)
 1383 if not errors:
 1384 self._logger.info('Cleaner %s: Success' % func.__name__)
errors undefined
func = <function clean_subnet_addr_alloc>
args = (<__main__.DatabaseCleaner object>,)
kwargs = {}

 /usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py in clean_subnet_addr_alloc(self=<__main__.DatabaseCleaner object>)
 1824 self._zk_client.delete(path, recursive=True)
 1825 if path_no_mask != path:
 1826 self._zk_client.delete(path_no_mask, recursive=False)
 1827 if vn in zk_all_vns:
 1828 zk_all_vns[vn].pop(sn_key, None)
self = <__main__.DatabaseCleaner object>
self._zk_client = <kazoo.client.KazooClient object>
self._zk_client.delete = <bound method KazooClient.delete of <kazoo.client.KazooClient object>>
path_no_mask = u'/api-server/subnets/default-domain:Mobility_AT...-bd3d0306-12b0-4b57-a493-79be0b7eaad1:172.26.0.0'
recursive undefined
builtinFalse = False

 /usr/lib/python2.7/dist-packages/kazoo/client.py in delete(self=<kazoo.client.KazooClient object>, path=u'/api-server/subnets/default-domain:Mobility_AT...-bd3d0306-12b0-4b57-a493-79be0b7eaad1:172.26.0.0', version=-1, recursive=False)
 1157 return self._delete_recursive(path)
 1158 else:
 1159 return self.delete_async(path, version).get()
 1160
 1161 def delete_async(self, path, version=-1):
self = <kazoo.client.KazooClient object>
self.delete_async = <bound method KazooClient.delete_async of <kazoo.client.KazooClient object>>
path = u'/api-server/subnets/default-domain:Mobility_AT...-bd3d0306-12b0-4b57-a493-79be0b7eaad1:172.26.0.0'
version = -1
).get undefined

 /usr/lib/python2.7/dist-packages/kazoo/handlers/threading.py in get(self=<kazoo.handlers.threading.AsyncResult object>, block=True, timeout=None)
  105 if self._exception is None:
  106 return self.value
  107 raise self._exception
  108
  109 # if we get to this point we timeout
self = <kazoo.handlers.threading.AsyncResult object>
self._exception = NoNodeError((), {})
<class 'kazoo.exceptions.NoNodeError'>: ((), {})
    __class__ = <class 'kazoo.exceptions.NoNodeError'>
    __delattr__ = <method-wrapper '__delattr__' of NoNodeError object>
    __dict__ = {}
    __doc__ = None
    __format__ = <built-in method __format__ of NoNodeError object>
    __getattribute__ = <method-wrapper '__getattribute__' of NoNodeError object>
    __getitem__ = <method-wrapper '__getitem__' of NoNodeError object>
    __getslice__ = <method-wrapper '__getslice__' of NoNodeError object>
    __hash__ = <method-wrapper '__hash__' of NoNodeError object>
    __init__ = <method-wrapper '__init__' of NoNodeError object>
    __module__ = 'kazoo.exceptions'
    __new__ = <built-in method __new__ of type object>
    __reduce__ = <built-in method __reduce__ of NoNodeError object>
    __reduce_ex__ = <built-in method __reduce_ex__ of NoNodeError object>
    __repr__ = <method-wrapper '__repr__' of NoNodeError object>
    __setattr__ = <method-wrapper '__setattr__' of NoNodeError object>
    __setstate__ = <built-in method __setstate__ of NoNodeError object>
    __sizeof__ = <built-in method __sizeof__ of NoNodeError object>
    __str__ = <method-wrapper '__str__' of NoNodeError object>
    __subclasshook__ = <built-in method __subclasshook__ of type object>
    __unicode__ = <built-in method __unicode__ of NoNodeError object>
    __weakref__ = None
    args = ((), {})
    code = -101
    message = ''

The above is a description of an error in a Python program. Here is
the original traceback:

Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py", line 2401, in <module>
    main()
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py", line 2382, in main
    return globals()['db_%s' % (verb)](args, api_args)
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py", line 2334, in db_clean
    db_cleaner.clean_subnet_addr_alloc()
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py", line 1382, in wrapper
    errors = func(*args, **kwargs)
  File "/usr/lib/python2.7/dist-packages/vnc_cfg_api_server/db_manage_6.py", line 1826, in clean_subnet_addr_alloc
    self._zk_client.delete(path_no_mask, recursive=False)
  File "/usr/lib/python2.7/dist-packages/kazoo/client.py", line 1159, in delete
    return self.delete_async(path, version).get()
  File "/usr/lib/python2.7/dist-packages/kazoo/handlers/threading.py", line 107, in get
    raise self._exception
NoNodeError: ((), {})
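
The traceback shows a non-recursive `KazooClient.delete` call raising `NoNodeError` for a path that no longer exists. As a minimal sketch only (this is not the fix that was committed for this bug), a caller can treat an already-missing node as a successful cleanup. The helper name `delete_if_exists`, the `FakeZkClient` class, and the `/demo/...` path are hypothetical and exist only so the example runs without a ZooKeeper server.

```python
try:
    from kazoo.exceptions import NoNodeError
except ImportError:
    class NoNodeError(Exception):
        """Stand-in used only when kazoo is not installed."""


def delete_if_exists(zk_client, path, recursive=False):
    """Delete a znode, treating an already-missing node as success.

    Returns True if the node was deleted, False if it was already gone.
    """
    try:
        zk_client.delete(path, recursive=recursive)
        return True
    except NoNodeError:
        return False


class FakeZkClient(object):
    """Hypothetical, simplified in-memory stand-in for KazooClient."""

    def __init__(self, paths):
        self.paths = set(paths)

    def delete(self, path, version=-1, recursive=False):
        if recursive:
            # Remove the node and everything below it; no error if absent.
            self.paths = {p for p in self.paths
                          if p != path and not p.startswith(path + '/')}
            return
        if path not in self.paths:
            raise NoNodeError()
        self.paths.discard(path)


zk = FakeZkClient(['/demo/subnets/vn1:10.0.0.0'])
first = delete_if_exists(zk, '/demo/subnets/vn1:10.0.0.0')
second = delete_if_exists(zk, '/demo/subnets/vn1:10.0.0.0')
```

Here the first call removes the node and the second call finds nothing to remove, so neither raises. Swallowing the error hides the symptom but not the reason the cleaner tried to delete a node twice.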

Ning Zhong (nzhong) wrote :
Changed in juniperopenstack:
importance: Undecided → High
Ning Zhong (nzhong) wrote :

pre-check before clean/heal

Ning Zhong (nzhong) wrote :

clean exception

Ning Zhong (nzhong) wrote :

heal log

Ning Zhong (nzhong) on 2018-05-01
tags: added: 2018-0420-0563
Ning Zhong (nzhong) on 2018-05-01
Changed in juniperopenstack:
milestone: none → r3.2.10.0
Jim Reilly (jpreilly) on 2018-05-01
information type: Private → Public
Ning Zhong (nzhong) wrote :

It also affects 3.0.3.3-22

Review in progress for https://review.opencontrail.org/42704
Submitter: Édouard Thuleau (<email address hidden>)

Reviewed: https://review.opencontrail.org/42704
Committed: http://github.com/Juniper/contrail-controller/commit/a772013b1f1bd7c3acb0acd00193761fcdb59c4b
Submitter: Zuul v3 CI (<email address hidden>)
Branch: master

commit a772013b1f1bd7c3acb0acd00193761fcdb59c4b
Author: Édouard Thuleau <email address hidden>
Date: Wed May 2 18:06:06 2018 +0200

db_manage v1.3: Fix clean_subnet_addr_alloc method

Don't build flatten VN/subnet list versions before stale VN were clean

Change-Id: I9247654b344fade58f3026cb5b3185cd8b044b83
Closes-Bug: #1768265
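
One plausible reading of the commit message, sketched under assumptions: if a flattened list of VN/subnet pairs is built before stale VNs are cleaned, it still contains subnets whose nodes the VN cleanup has already removed, and a later pass over that list tries to delete them again. The function names and data below are hypothetical and are not taken from db_manage.py.

```python
def clean_stale_vns(zk_vns, known_vns):
    """Drop VN entries that are not in known_vns; return the removed names."""
    stale = [vn for vn in zk_vns if vn not in known_vns]
    for vn in stale:
        del zk_vns[vn]
    return stale


def flatten_subnets(zk_vns):
    """Build a flat set of (vn, subnet) pairs from the nested mapping."""
    return {(vn, sn) for vn, subnets in zk_vns.items() for sn in subnets}


# Hypothetical state: one live VN and one stale VN, each with one subnet.
zk_vns = {'vn-live': {'10.0.0.0/24'}, 'vn-stale': {'172.26.0.0/24'}}
known_vns = {'vn-live'}

flat_too_early = flatten_subnets(zk_vns)   # built before the stale VN is cleaned
removed = clean_stale_vns(zk_vns, known_vns)
flat_after = flatten_subnets(zk_vns)       # built after the stale VN is cleaned
```

The early snapshot still lists the stale VN's subnet, while the list built after cleanup contains only the live VN's subnet.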

Review in progress for https://review.opencontrail.org/43946
Submitter: Édouard Thuleau (<email address hidden>)

Review in progress for https://review.opencontrail.org/43947
Submitter: Édouard Thuleau (<email address hidden>)

Review in progress for https://review.opencontrail.org/43948
Submitter: Édouard Thuleau (<email address hidden>)

Review in progress for https://review.opencontrail.org/43949
Submitter: Édouard Thuleau (<email address hidden>)

Reviewed: https://review.opencontrail.org/43946
Committed: http://github.com/Juniper/contrail-controller/commit/6bf45a5c40942599542fd26b9582cd20cf34fa2d
Submitter: Zuul v3 CI (<email address hidden>)
Branch: R5.0

commit 6bf45a5c40942599542fd26b9582cd20cf34fa2d
Author: Édouard Thuleau <email address hidden>
Date: Wed May 2 18:06:06 2018 +0200

db_manage v1.3: Fix clean_subnet_addr_alloc method

Don't build flatten VN/subnet list versions before stale VN were clean

Change-Id: I9247654b344fade58f3026cb5b3185cd8b044b83
Closes-Bug: #1768265
(cherry picked from commit a772013b1f1bd7c3acb0acd00193761fcdb59c4b)

Reviewed: https://review.opencontrail.org/43947
Committed: http://github.com/Juniper/contrail-controller/commit/7ca0176e53a135b258743e25dbf0507437f6cf96
Submitter: Zuul (<email address hidden>)
Branch: R4.1

commit 7ca0176e53a135b258743e25dbf0507437f6cf96
Author: Édouard Thuleau <email address hidden>
Date: Wed May 2 18:06:06 2018 +0200

db_manage v1.3: Fix clean_subnet_addr_alloc method

Don't build flatten VN/subnet list versions before stale VN were clean

Change-Id: I9247654b344fade58f3026cb5b3185cd8b044b83
Closes-Bug: #1768265
(cherry picked from commit a772013b1f1bd7c3acb0acd00193761fcdb59c4b)

Reviewed: https://review.opencontrail.org/43949
Committed: http://github.com/Juniper/contrail-controller/commit/194005f50374c631bc2e9ad9f465262a677f503a
Submitter: Zuul (<email address hidden>)
Branch: R3.2

commit 194005f50374c631bc2e9ad9f465262a677f503a
Author: Édouard Thuleau <email address hidden>
Date: Wed May 2 18:06:06 2018 +0200

db_manage v1.3: Fix clean_subnet_addr_alloc method

Don't build flatten VN/subnet list versions before stale VN were clean

Change-Id: I9247654b344fade58f3026cb5b3185cd8b044b83
Closes-Bug: #1768265
(cherry picked from commit a772013b1f1bd7c3acb0acd00193761fcdb59c4b)

Reviewed: https://review.opencontrail.org/43948
Committed: http://github.com/Juniper/contrail-controller/commit/4d2d2f25520a1b222a3623f67c27740ee001c909
Submitter: Zuul (<email address hidden>)
Branch: R4.0

commit 4d2d2f25520a1b222a3623f67c27740ee001c909
Author: Édouard Thuleau <email address hidden>
Date: Wed May 2 18:06:06 2018 +0200

db_manage v1.3: Fix clean_subnet_addr_alloc method

Don't build flatten VN/subnet list versions before stale VN were clean

Change-Id: I9247654b344fade58f3026cb5b3185cd8b044b83
Closes-Bug: #1768265
(cherry picked from commit a772013b1f1bd7c3acb0acd00193761fcdb59c4b)
