RYU CI fails with PortNotFound

Bug #1343750 reported by YAMAMOTO Takashi
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
neutron | Fix Released | Undecided | YAMAMOTO Takashi |

Bug Description

The recently merged "RPC additions to support DVR" change makes the ofagent CI fail.
https://review.openstack.org/#/c/102332/

http://180.37.183.32/ryuci/32/102332/38/check/check-tempest-dsvm-ofagent/d88f439/logs/screen-q-svc.txt.gz

2014-07-18 03:53:27.765 16087 ERROR oslo.messaging.rpc.dispatcher [req-a13fe10b-aae5-458a-a44f-564563368b56 ] Exception during message handling: Port 7d526916-5c could not be found
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher incoming.message))
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/oslo.messaging/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args)
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/neutron/neutron/plugins/ml2/rpc.py", line 191, in update_device_up
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher l3plugin.dvr_vmarp_table_update(rpc_context, port_id, "add")
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/neutron/neutron/db/l3_dvr_db.py", line 439, in dvr_vmarp_table_update
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher port_dict = self._core_plugin._get_port(context, port_id)
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/neutron/neutron/db/db_base_plugin_v2.py", line 109, in _get_port
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher raise n_exc.PortNotFound(port_id=id)
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher PortNotFound: Port 7d526916-5c could not be found
2014-07-18 03:53:27.765 16087 TRACE oslo.messaging.rpc.dispatcher

OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/107884

Changed in neutron:
assignee: nobody → YAMAMOTO Takashi (yamamoto)
status: New → In Progress
tags: added: ml2 openflowagent
Armando Migliaccio (armando-migliaccio) wrote : Re: DVR regression found by ryu ci

This has been broken for a while, long before any of the DVR patches merged; see patch:

https://review.openstack.org/#/c/84223/

It looks like Neutron RYU/ofagent has been broken for at least a few days. Last successful run was recorded around July 11th.

Hard to pinpoint the culprit.

summary: - DVR regression found by ryu ci
+ RYU CI fails with PortNotFound
YAMAMOTO Takashi (yamamoto) wrote :

Armando,

I think you should look deeper than success/failure.
I cited the relevant part of the log in the bug description.

On the other hand, I have to admit that our CI occasionally fails for unknown reasons.
"recheck-ryu" works in that case.

Itsuro Oda (oda-g) wrote :

Yamamoto is right.

"/opt/stack/new/neutron/neutron/plugins/ml2/rpc.py", line 191, in update_device_up
l3plugin.dvr_vmarp_table_update(rpc_context, port_id, "add")

is a regression.

Armando Migliaccio (armando-migliaccio) wrote :

Itsuro, how so?

The method:

https://github.com/openstack/neutron/blob/master/neutron/db/l3_dvr_db.py#L433

does nothing to stop the ofagent from working. When the RYU CI succeeded last (PS55) the code looked exactly the same :(

https://review.openstack.org/#/c/84223/55/neutron/db/l3_dvr_db.py

Armando Migliaccio (armando-migliaccio) wrote :

Sorry, I meant PS63 on review https://review.openstack.org/#/c/84223/

OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/107884
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=8e31122d36ce5c9d367696921ed92c50cb062b5f
Submitter: Jenkins
Branch: master

commit 8e31122d36ce5c9d367696921ed92c50cb062b5f
Author: YAMAMOTO Takashi <email address hidden>
Date: Fri Jul 11 09:15:58 2014 +0900

    Fix DVR regression for ofagent

    Background:
        ML2 plugin sometimes uses truncated port uuids.
        For example, in the case of ofagent and linuxbridge,
        if port id is 804ceaa1-0e3e-11e4-b537-08606e7f74e7,
        an agent would send "tap804ceaa1-0e" to the plugin.
        ML2 plugin's _device_to_port_id() would restore it to
        "804ceaa1-0e". While it's still truncated, ML2 plugin's
        get_port() handles that by using "startswith".

    The recently merged DVR change (https://review.openstack.org/#/c/102332/)
    assumes that port_id is always a complete uuid (it's the case
    for openvswitch) and fails to handle the above mentioned case.
    This commit fixes the regression.

    Change-Id: I9c0845be606969068ab5d13c0165e76760378500
    Closes-Bug: #1343750
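
The mismatch described in the commit message can be illustrated with a minimal sketch. This is not the Neutron code: the in-memory PORTS table and the helpers device_to_port_id, get_port_exact and get_port_by_prefix are hypothetical stand-ins, assumed here only to show why an exact lookup fails on a truncated id while a "startswith" lookup succeeds.

```python
# Hypothetical, simplified stand-ins for the behaviour described in the
# commit message. None of these names are the real Neutron implementations.
TAP_PREFIX = "tap"

# Assumed in-memory "database" keyed by complete port uuid.
PORTS = {
    "804ceaa1-0e3e-11e4-b537-08606e7f74e7": {"name": "vm-port"},
}


class PortNotFound(Exception):
    """Raised when no port matches the given id."""


def device_to_port_id(device):
    """Strip the interface prefix; the result may still be a truncated uuid."""
    if device.startswith(TAP_PREFIX):
        return device[len(TAP_PREFIX):]
    return device


def get_port_exact(port_id):
    """Exact-match lookup: only works when port_id is a complete uuid."""
    try:
        return PORTS[port_id]
    except KeyError:
        raise PortNotFound("Port %s could not be found" % port_id)


def get_port_by_prefix(port_id):
    """Prefix lookup in the spirit of the "startswith" handling."""
    matches = [port for full_id, port in PORTS.items()
               if full_id.startswith(port_id)]
    if len(matches) != 1:
        raise PortNotFound("Port %s could not be found" % port_id)
    return matches[0]


if __name__ == "__main__":
    port_id = device_to_port_id("tap804ceaa1-0e")
    print(port_id)                      # truncated id as restored by the plugin
    print(get_port_by_prefix(port_id))  # prefix lookup tolerates truncation
    try:
        get_port_exact(port_id)         # exact lookup assumes a complete uuid
    except PortNotFound as exc:
        print(exc)
```

In this sketch the exact lookup fails in the same way as the traceback in the bug description, because it is handed a truncated id rather than a complete uuid.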

Changed in neutron:
status: In Progress → Fix Committed
Changed in neutron:
milestone: none → juno-2
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: juno-2 → 2014.2