ML2 | Exception from mechanism driver: TypeError: <type 'NoneType'> can't be decoded

Bug #1628408 reported by Omer Anson
This bug affects 2 people
Affects | Status | Importance | Assigned to | Milestone
DragonFlow | Fix Released | High | Hong Hui Xiao |

Bug Description

The following exception appears in the Neutron service log and is raised through the Dragonflow mechanism driver.

See log file here: http://logs.openstack.org/59/376159/1/check/gate-dragonflow-dsvm-fullstack-ml2-nv/9cc9ef0/logs/screen-q-svc.txt.gz?level=ERROR

Exception:
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers [req-4adeb0aa-2826-4019-9ec7-653b675e4dd0 - -] Mechanism driver 'df' failed in update_port_postcommit
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers Traceback (most recent call last):
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers File "/opt/stack/new/neutron/neutron/plugins/ml2/managers.py", line 433, in _call_on_drivers
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers getattr(driver.obj, method_name)(context)
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers File "/opt/stack/new/dragonflow/dragonflow/db/neutron/lockedobjects_db.py", line 87, in wrap_db_lock
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers ctxt.reraise = True
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 220, in __exit__
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers self.force_reraise()
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers File "/usr/local/lib/python2.7/dist-packages/oslo_utils/excutils.py", line 196, in force_reraise
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers six.reraise(self.type_, self.value, self.tb)
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers File "/opt/stack/new/dragonflow/dragonflow/db/neutron/lockedobjects_db.py", line 84, in wrap_db_lock
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers result = f(*args, **kwargs)
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers File "/opt/stack/new/dragonflow/dragonflow/neutron/ml2/mech_driver.py", line 606, in update_port_postcommit
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers version=updated_port['db_version'])
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers File "/opt/stack/new/dragonflow/dragonflow/db/api_nb.py", line 489, in update_lport
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers lport = jsonutils.loads(lport_json)
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers File "/usr/local/lib/python2.7/dist-packages/oslo_serialization/jsonutils.py", line 241, in loads
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers return json.loads(encodeutils.safe_decode(s, encoding), **kwargs)
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers File "/usr/local/lib/python2.7/dist-packages/oslo_utils/encodeutils.py", line 39, in safe_decode
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers raise TypeError("%s can't be decoded" % type(text))
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers TypeError: <type 'NoneType'> can't be decoded
2016-09-26 05:02:58.598 26155 ERROR neutron.plugins.ml2.managers
2016-09-26 05:02:58.600 26155 ERROR neutron.plugins.ml2.plugin [req-4adeb0aa-2826-4019-9ec7-653b675e4dd0 - -] mechanism_manager.update_port_postcommit failed for port bcf8a342-8867-40bb-8ed9-384369206fda

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to dragonflow (master)

Fix proposed to branch: master
Review: https://review.openstack.org/379182

Revision history for this message
Hong Hui Xiao (xiaohhui) wrote :

This error is caused by concurrent operations. The steps are:

1) A router is created.
2) A router interface is added, and neutron schedules the router to an l3-agent, which is the df-l3-agent in dragonflow's case. See [1].
3) Once the l3-agent is chosen, it syncs the router from neutron-server. At the same time, it updates the router interface's bind-host. See [2].
4) In update_port, neutron checks that the port exists at the very beginning. See [3].
5) But the fullstack test deletes the router soon after creating it. So, let's assume that the router and its interfaces are deleted after [3] but before [4]. In the log, the DELETE request can be observed before the update_port_postcommit error. So the port is deleted from the NB DB before the code reaches [4].
6) When the code reaches [4], it reports an error because the port can't be found.

[1] https://github.com/openstack/neutron/blob/a80b89b6fe1611a68d34684d9e80ad606f115366/neutron/api/rpc/agentnotifiers/l3_rpc_agent_api.py#L101

[2] https://github.com/openstack/neutron/blob/a80b89b6fe1611a68d34684d9e80ad606f115366/neutron/api/rpc/handlers/l3_rpc.py#L152-L155

[3] https://github.com/openstack/neutron/blob/a80b89b6fe1611a68d34684d9e80ad606f115366/neutron/plugins/ml2/plugin.py#L1390-L1392

[4] https://github.com/openstack/neutron/blob/a80b89b6fe1611a68d34684d9e80ad606f115366/neutron/plugins/ml2/plugin.py#L1484
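
The sequence above can be reproduced in miniature. The following is only a sketch, not Dragonflow code: FakeNbDb, its methods and this update_lport are hypothetical stand-ins, and the standard json module is used in place of oslo_serialization.jsonutils. It assumes that the NB DB lookup returns None for a missing key, which is what the traceback suggests.

```python
import json


class FakeNbDb:
    """Hypothetical in-memory stand-in for the northbound DB."""

    def __init__(self):
        self._store = {}

    def set_key(self, key, value):
        self._store[key] = value

    def get_key(self, key):
        # Assumption: a missing key yields None instead of raising.
        return self._store.get(key)

    def delete_key(self, key):
        self._store.pop(key, None)


def update_lport(db, port_id, **columns):
    """Unguarded update: decodes whatever the lookup returned."""
    lport_json = db.get_key(port_id)
    lport = json.loads(lport_json)  # raises TypeError if lport_json is None
    lport.update(columns)
    db.set_key(port_id, json.dumps(lport))
    return lport


db = FakeNbDb()
db.set_key("port-1", json.dumps({"id": "port-1", "version": 1}))

# Step 4: the existence check at [3] passes.
assert db.get_key("port-1") is not None

# Step 5: a concurrent DELETE removes the port from the NB DB.
db.delete_key("port-1")

# Step 6: the postcommit update at [4] now finds nothing to decode.
try:
    update_lport(db, "port-1", version=2)
    race_error = None
except TypeError as exc:
    race_error = exc
```

Here race_error ends up holding a TypeError, the same class of failure as in the log above.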

Revision history for this message
Hong Hui Xiao (xiaohhui) wrote :

The fullstack test doesn't fail because update_port is not an explicit call from fullstack. It is just a background call from the l3-agent to neutron-server, and neutron ml2 gracefully ignores exceptions in the *_postcommit events.

https://review.openstack.org/#/c/379182/
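
The behaviour described in this comment can be illustrated with a small sketch. This is not Neutron's actual code: call_on_drivers, update_port, FailingDriver and this MechanismDriverError are hypothetical names, written on the assumption that a postcommit failure is logged after the DB change has already been committed and is not passed on to the caller.

```python
import logging

LOG = logging.getLogger("ml2_sketch")


class MechanismDriverError(Exception):
    """Hypothetical stand-in for the error raised on driver failure."""


def call_on_drivers(drivers, method_name, context):
    """Call method_name on every driver, logging any failure."""
    failed = False
    for name, driver in drivers:
        try:
            getattr(driver, method_name)(context)
        except Exception:
            LOG.exception("Mechanism driver '%s' failed in %s",
                          name, method_name)
            failed = True
    if failed:
        raise MechanismDriverError(method_name)


class FailingDriver:
    """Driver whose postcommit hook fails like the one in this bug."""

    def update_port_postcommit(self, context):
        raise TypeError("<type 'NoneType'> can't be decoded")


def update_port(drivers, port):
    # The DB transaction is assumed to be committed at this point, so a
    # postcommit failure is only logged and the caller still gets the port.
    try:
        call_on_drivers(drivers, "update_port_postcommit", port)
    except MechanismDriverError:
        LOG.error("update_port_postcommit failed for port %s", port["id"])
    return port


result = update_port([("df", FailingDriver())], {"id": "bcf8a342"})
```

The caller receives the port back even though the driver raised, which matches why the error shows up only in the log and not as a test failure.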

Changed in dragonflow:
status: New → In Progress
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to dragonflow (master)

Reviewed: https://review.openstack.org/379182
Committed: https://git.openstack.org/cgit/openstack/dragonflow/commit/?id=b06a885b7660e24b837595d5ff3ba14fcdba4ff3
Submitter: Jenkins
Branch: master

commit b06a885b7660e24b837595d5ff3ba14fcdba4ff3
Author: Hong Hui Xiao <email address hidden>
Date: Thu Sep 29 14:36:12 2016 +0800

    Check port existence before updating it in NB DB

    The logical port might already been deleted due to concurrent operation.
    DF mech_driver should check its existence before updating it.

    Change-Id: I110b4d3adcd5c169a65917422c1f2c8fc3bac993
    Close-Bug: #1628408
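
The idea in the commit message can be sketched as follows. This is an assumption about the shape of the fix, not the merged patch: update_lport_checked and the dict-backed store are hypothetical, and the standard json module stands in for jsonutils.

```python
import json
import logging

LOG = logging.getLogger(__name__)


def update_lport_checked(store, port_id, **columns):
    """Update a logical port only if it still exists in the store."""
    lport_json = store.get(port_id)
    if lport_json is None:
        # The port was deleted concurrently; there is nothing to update.
        LOG.debug("Logical port %s no longer exists, skipping update",
                  port_id)
        return None
    lport = json.loads(lport_json)
    lport.update(columns)
    store[port_id] = json.dumps(lport)
    return lport


nb_store = {"port-1": json.dumps({"id": "port-1", "version": 1})}

# Normal case: the port exists and is updated.
updated = update_lport_checked(nb_store, "port-1", version=2)

# Race case: the port is gone by the time the update arrives.
del nb_store["port-1"]
skipped = update_lport_checked(nb_store, "port-1", version=3)
```

Note that a check followed by an update is not atomic by itself, so a real implementation would still need to rely on whatever locking the driver already applies.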

Omer Anson (omer-anson)
Changed in dragonflow:
status: In Progress → Fix Released
Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to dragonflow (master)

Fix proposed to branch: master
Review: https://review.openstack.org/387868

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Change abandoned on dragonflow (master)

Change abandoned by Hong Hui Xiao (<email address hidden>) on branch: master
Review: https://review.openstack.org/387868
