fdb_chg_ip_tun throwing exception because fdb_entries not in correct format

Bug #1538387 reported by Brian Haley on 2016-01-27
This bug affects 1 person
Affects: neutron. Importance: High. Assigned to: Kevin Benton.
Affects: Kilo. Importance: Undecided. Assigned to: Unassigned.

Bug Description

I've been trying to track down failures in the DVR multinode job. I'm now tripping over this one.

For context I've been focusing on a single change, but if you see a failure in the gate-tempest-dsvm-neutron-dvr-multinode-full job you'll probably be able to find similar info. This is the change:

http://logs.openstack.org/77/177777/4/check/gate-tempest-dsvm-neutron-dvr-multinode-full/5abca7b/logs/

The screen-q-agt log shows a traceback here:

http://logs.openstack.org/77/177777/4/check/gate-tempest-dsvm-neutron-dvr-multinode-full/5abca7b/logs/screen-q-agt.txt.gz#_2016-01-18_10_11_29_715

<snip>
2016-01-18 10:11:29.724 12932 ERROR oslo_messaging.rpc.dispatcher File "/opt/stack/new/neutron/neutron/plugins/ml2/drivers/l2pop/rpc_manager/l2population_rpc.py", line 312, in fdb_chg_ip_tun
2016-01-18 10:11:29.724 12932 ERROR oslo_messaging.rpc.dispatcher mac_ip.mac_address,
2016-01-18 10:11:29.724 12932 ERROR oslo_messaging.rpc.dispatcher AttributeError: 'list' object has no attribute 'mac_address'
2016-01-18 10:11:29.724 12932 ERROR oslo_messaging.rpc.dispatcher

The info passed to fdb_chg_ip_tun() should have a "PortInfo" namedtuple as data, but from the line before we can see it doesn't:

DEBUG neutron.plugins.ml2.drivers.l2pop.rpc_manager.l2population_rpc [req-671e8634-c753-4002-acfd-68515dd44f29 None None] neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent.OVSNeutronAgent method fdb_chg_ip_tun called with arguments (<neutron.context.Context object at 0x7fa2ee6988d0>, <neutron.plugins.ml2.drivers.openvswitch.agent.openflow.ovs_ofctl.br_tun.DeferredOVSTunnelBridge object at 0x7fa2ee79abd0>, {u'c4ae0757-e3e5-419c-b2ba-4d7817964237': {u'10.209.192.28': {u'before': [[u'fa:16:3e:8d:2e:48', u'2003:0:0:1::1']]}}}, '10.208.193.94', {u'7ca0dcf2-fb63-4959-92ee-cc757da8d120':

So from this it's clear that _unmarshall_fdb_entries() either hasn't been called, or didn't do anything.
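
The symptom can be reproduced outside neutron. The following is a minimal sketch, not the actual RPC path: it assumes a PortInfo namedtuple with the two field names seen in the logs, and uses the standard json module as a stand-in for whatever serialization the RPC layer performs. Once a namedtuple has been flattened to a [<mac>, <ip>] pair, attribute access fails with the same error as in the traceback unless something rebuilds the namedtuple on the receiving side.

```python
import json
from collections import namedtuple

# Stand-in for neutron's PortInfo (assumption: only these two fields matter here).
PortInfo = namedtuple("PortInfo", ["mac_address", "ip_address"])

# What the server side holds before the message is sent.
sent = {"before": [PortInfo("fa:16:3e:8d:2e:48", "2003:0:0:1::1")]}

# JSON has no tuple type, so the namedtuple comes back as a plain list.
received = json.loads(json.dumps(sent))
mac_ip = received["before"][0]

try:
    mac_ip.mac_address
    error = None
except AttributeError as exc:
    # Same failure mode as the agent traceback.
    error = str(exc)

print(type(mac_ip).__name__, error)
```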

Looking over in screen-q-svc.log for the info before the RPC call finds:

DEBUG neutron.plugins.ml2.drivers.l2pop.rpc [req-f32790a5-0160-47b9-89b4-763b9c23bc08 tempest-TestGettingAddress-2071048693 tempest-TestGettingAddress-1817548879] Fanout notify l2population agents at q-agent-notifier the message update_fdb_entries with {'chg_ip': {u'c4ae0757-e3e5-419c-b2ba-4d7817964237': {u'10.208.193.94': {'before': [PortInfo(mac_address=u'fa:16:3e:8d:2e:48', ip_address=u'2003:0:0:1::1')]}}}} _notification_fanout /opt/stack/new/neutron/neutron/plugins/ml2/drivers/l2pop/rpc.py:47

This is the message logged right before _marshall_fdb_entries() was called to convert each PortInfo into a [<mac>, <ip>] pair, and the arguments received by the agent (above) show that the conversion did happen.

I'm just starting to look at this now, but maybe someone more familiar with l2pop has a guess at what's broken.

Changed in neutron:
assignee: nobody → Brian Haley (brian-haley)

Fix proposed to branch: master
Review: https://review.openstack.org/272986

Changed in neutron:
assignee: Brian Haley (brian-haley) → Kevin Benton (kevinbenton)
status: New → In Progress
Changed in neutron:
importance: Undecided → High
Changed in neutron:
assignee: Kevin Benton (kevinbenton) → Brian Haley (brian-haley)
Changed in neutron:
assignee: Brian Haley (brian-haley) → venkata anil (anil-venkata)
venkata anil (anil-venkata) wrote :

Kevin and Brian, sorry for submitting the patchset without informing you. I don't think the current change addresses the issue correctly, and I was afraid I could not explain that properly through review comments. I thought code changes would convey it more clearly, so I submitted the patchset. Please revert it if my approach is wrong.

venkata anil (anil-venkata) wrote :

Kevin, please assign the bug back to yourself.

Changed in neutron:
assignee: venkata anil (anil-venkata) → Kevin Benton (kevinbenton)

Reviewed: https://review.openstack.org/272986
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=8dcf39aae7a099e01bd322891526c134e87e6b1b
Submitter: Jenkins
Branch: master

commit 8dcf39aae7a099e01bd322891526c134e87e6b1b
Author: Kevin Benton <email address hidden>
Date: Wed Jan 27 02:17:01 2016 -0800

    Unmarshall portinfo on update_fdb_entries calls

    The unmarshalling function was not aware of the data
    structure used by update_fdb_entries, so it would not
    set up PortInfo named tuples in the 'before' and 'after'
    fields. This would break the fdb_chg_ip_tun function
    which expected to be able to use named attributes.

    This patch adjusts the unmarshalling function to be aware
    of this data structure.

    This has likely been broken since the change that added
    named tuples here: I7f8c93b0e12ee0179bb23dfbb3a3d814615b1c2e
    It probably went undetected for so long because the exception
    will only be observed when the updated entry does not have
    an agent IP that matches the local agent's (i.e. not single-node).
    Even in a multi-node environment, this would only trigger an
    error when the fixed_ips of a port changed so it wouldn't show
    up in a normal port wiring life-cycle.

    Closes-Bug: #1538387
    Change-Id: I0aacb3af9ebd160ebfb801f77b186075303c3df5
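
To illustrate what the commit message describes, here is a hedged sketch of unmarshalling the 'chg_ip' structure seen in the logs. It is not the code from the patch: the function name unmarshall_chg_ip is hypothetical, PortInfo is redefined locally as a stand-in, and the nesting (network ID, then agent IP, then 'before'/'after' lists) is inferred from the log lines in the bug description.

```python
from collections import namedtuple

# Stand-in for neutron's PortInfo (assumption: two fields, as shown in the logs).
PortInfo = namedtuple("PortInfo", ["mac_address", "ip_address"])


def unmarshall_chg_ip(fdb_entries):
    """Rebuild PortInfo tuples from [mac, ip] pairs under 'before'/'after'.

    Hypothetical helper; modifies fdb_entries in place and returns it.
    """
    for agent_map in fdb_entries.get("chg_ip", {}).values():
        for changes in agent_map.values():
            for key in ("before", "after"):
                if key in changes:
                    changes[key] = [PortInfo(*pair) for pair in changes[key]]
    return fdb_entries


# Payload shaped like the one in the agent log (pairs arrive as plain lists).
entries = {
    "chg_ip": {
        "c4ae0757-e3e5-419c-b2ba-4d7817964237": {
            "10.208.193.94": {
                "before": [["fa:16:3e:8d:2e:48", "2003:0:0:1::1"]],
            }
        }
    }
}

result = unmarshall_chg_ip(entries)
port = result["chg_ip"]["c4ae0757-e3e5-419c-b2ba-4d7817964237"]["10.208.193.94"]["before"][0]
print(port.mac_address, port.ip_address)
```

With the pairs converted back, named attribute access such as port.mac_address works again, which is what fdb_chg_ip_tun relies on.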

Changed in neutron:
status: In Progress → Fix Released

This issue was fixed in the openstack/neutron 8.0.0.0b3 development milestone.

Reviewed: https://review.openstack.org/285383
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=fe0a1cafd3427084362323093d78560c8594a2c4
Submitter: Jenkins
Branch: stable/liberty

commit fe0a1cafd3427084362323093d78560c8594a2c4
Author: Kevin Benton <email address hidden>
Date: Wed Jan 27 02:17:01 2016 -0800

    Unmarshall portinfo on update_fdb_entries calls

    The unmarshalling function was not aware of the data
    structure used by update_fdb_entries, so it would not
    set up PortInfo named tuples in the 'before' and 'after'
    fields. This would break the fdb_chg_ip_tun function
    which expected to be able to use named attributes.

    This patch adjusts the unmarshalling function to be aware
    of this data structure.

    This has likely been broken since the change that added
    named tuples here: I7f8c93b0e12ee0179bb23dfbb3a3d814615b1c2e
    It probably went undetected for so long because the exception
    will only be observed when the updated entry does not have
    an agent IP that matches the local agent's (i.e. not single-node).
    Even in a multi-node environment, this would only trigger an
    error when the fixed_ips of a port changed so it wouldn't show
    up in a normal port wiring life-cycle.

    Closes-Bug: #1538387
    Change-Id: I0aacb3af9ebd160ebfb801f77b186075303c3df5
    (cherry picked from commit 8dcf39aae7a099e01bd322891526c134e87e6b1b)

tags: added: in-stable-liberty

This issue was fixed in the openstack/neutron 7.0.4 release.

Reviewed: https://review.openstack.org/297023
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=fc3559ba96f6fc066fba1fc26386d6e063e946b6
Submitter: Jenkins
Branch: stable/kilo

commit fc3559ba96f6fc066fba1fc26386d6e063e946b6
Author: Kevin Benton <email address hidden>
Date: Wed Jan 27 02:17:01 2016 -0800

    Unmarshall portinfo on update_fdb_entries calls

    The unmarshalling function was not aware of the data
    structure used by update_fdb_entries, so it would not
    set up PortInfo named tuples in the 'before' and 'after'
    fields. This would break the fdb_chg_ip_tun function
    which expected to be able to use named attributes.

    This patch adjusts the unmarshalling function to be aware
    of this data structure.

    This has likely been broken since the change that added
    named tuples here: I7f8c93b0e12ee0179bb23dfbb3a3d814615b1c2e
    It probably went undetected for so long because the exception
    will only be observed when the updated entry does not have
    an agent IP that matches the local agent's (i.e. not single-node).
    Even in a multi-node environment, this would only trigger an
    error when the fixed_ips of a port changed so it wouldn't show
    up in a normal port wiring life-cycle.

    Closes-Bug: #1538387
    Change-Id: I0aacb3af9ebd160ebfb801f77b186075303c3df5
    (cherry picked from commit 8dcf39aae7a099e01bd322891526c134e87e6b1b)

tags: added: in-stable-kilo

This issue was fixed in the openstack/neutron 2015.1.4 release.

