2019-01-31 11:19:28 |
Achuth M |
description |
The Hostconfig periodic task may get stuck in the processing state (for example, during a neutron server reboot or when one of the neutron processes dies), and the n-odl driver has no way to unlock the task and set it back to pending in such scenarios.
This will eventually cause ODL agents to be marked as down.
mysql> select * from opendaylight_periodic_task;
+------------+-----------------------------+-------------+---------------------+
| state      | processing_operation        | task        | lock_updated        |
+------------+-----------------------------+-------------+---------------------+
| processing | _get_and_update_hostconfigs | hostconfig  | 2018-12-03 05:31:39 | ====> 05:31:39
| pending    | NULL                        | maintenance | 2018-12-03 08:16:18 | ====> 08:16:18
+------------+-----------------------------+-------------+---------------------+
2 rows in set (0.00 sec)
Neutron Log Snippet
WARNING neutron.db.agents_db [req-c872e719-1268-4aff-853f-9a954f05cecc - - - - -] Agent healthcheck: found 3 dead agents out of 9:
Type    Last heartbeat       host
ODL L2  2018-12-03 07:53:58  compute-0-3.domain.tld
ODL L2  2018-12-03 07:52:32  compute-0-1.domain.tld
ODL L2  2018-12-03 07:54:09  compute-0-2.domain.tld |
Issue reproduced with the OpenStack networking-odl stable/pike release (but it could apply to later releases as well, per the description below).
Frequency - Low
It is observed that the Hostconfig periodic task may get stuck in the processing state (for example, during a neutron server reboot or when one of the neutron processes dies), and the n-odl driver has no way to unlock the task and set it back to pending in such scenarios.
This will eventually cause ODL agents to be marked as down.
mysql> select * from opendaylight_periodic_task;
+------------+-----------------------------+-------------+---------------------+
| state      | processing_operation        | task        | lock_updated        |
+------------+-----------------------------+-------------+---------------------+
| processing | _get_and_update_hostconfigs | hostconfig  | 2018-12-03 05:31:39 | ====> 05:31:39
| pending    | NULL                        | maintenance | 2018-12-03 08:16:18 | ====> 08:16:18
+------------+-----------------------------+-------------+---------------------+
2 rows in set (0.00 sec)
Neutron Log Snippet
WARNING neutron.db.agents_db [req-c872e719-1268-4aff-853f-9a954f05cecc - - - - -] Agent healthcheck: found 3 dead agents out of 9:
Type    Last heartbeat       host
ODL L2  2018-12-03 07:53:58  compute-0-3.domain.tld
ODL L2  2018-12-03 07:52:32  compute-0-1.domain.tld
ODL L2  2018-12-03 07:54:09  compute-0-2.domain.tld
The exact reason the state remains in processing is not certain, but there needs to be a mechanism for the task to leave the processing state
after an interval, so that the ODL L2 agents are not
marked down permanently. |
|
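The mechanism requested above (releasing a lock that has stayed in processing beyond an interval) could look roughly like the following. This is only a sketch under stated assumptions, not the networking-odl implementation: it uses an in-memory SQLite database instead of MySQL, mirrors the four columns shown in the query output above, and the names `unlock_stale_tasks`, `demo`, and the 30-minute timeout are hypothetical choices made for illustration.

```python
import sqlite3
from datetime import datetime, timedelta

TS_FORMAT = "%Y-%m-%d %H:%M:%S"
TABLE = "opendaylight_periodic_task"


def unlock_stale_tasks(conn, now, timeout):
    """Set tasks stuck in 'processing' for longer than `timeout` back to 'pending'.

    Hypothetical helper: returns the number of rows that were unlocked.
    """
    cutoff = (now - timeout).strftime(TS_FORMAT)
    # Timestamps in this fixed format compare correctly as plain strings.
    cur = conn.execute(
        f"UPDATE {TABLE} "
        "SET state = 'pending', processing_operation = NULL, lock_updated = ? "
        "WHERE state = 'processing' AND lock_updated < ?",
        (now.strftime(TS_FORMAT), cutoff),
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Recreate the two rows from the report and run the unlock once."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        f"CREATE TABLE {TABLE} ("
        "state TEXT, processing_operation TEXT, "
        "task TEXT PRIMARY KEY, lock_updated TEXT)"
    )
    conn.executemany(
        f"INSERT INTO {TABLE} VALUES (?, ?, ?, ?)",
        [
            ("processing", "_get_and_update_hostconfigs", "hostconfig",
             "2018-12-03 05:31:39"),
            ("pending", None, "maintenance", "2018-12-03 08:16:18"),
        ],
    )
    # Assumed "current time" and timeout, chosen only for this illustration.
    now = datetime(2018, 12, 3, 8, 16, 18)
    unlocked = unlock_stale_tasks(conn, now, timedelta(minutes=30))
    rows = conn.execute(
        f"SELECT task, state FROM {TABLE} ORDER BY task"
    ).fetchall()
    return unlocked, rows


if __name__ == "__main__":
    print(demo())
```

With the rows from the report, the hostconfig lock is several hours older than the assumed timeout, so it would be reset to pending while the maintenance row is left alone. A real fix would also need to make sure the timeout is longer than the longest legitimate run of the task, otherwise a task that is still running could be unlocked by mistake.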