Functional ARPSpoofTestCase tests occasionally fail

Bug #1550623 reported by Kevin Benton
Affects    Status         Importance   Assigned to     Milestone
neutron    Fix Released   High         Kevin Benton
Kilo       Fix Released   Undecided    Unassigned

Bug Description

We occasionally get failures in the gate like the one below. Unfortunately they are difficult to reproduce locally.

ft32.3: neutron.tests.functional.agent.test_ovs_flows.ARPSpoofOFCtlTestCase.test_arp_spoof_allowed_address_pairs(native)_StringException: Empty attachments:
  stderr
  stdout

pythonlogging:'': {{{
   DEBUG [oslo_policy._cache_handler] Reloading cached file /opt/stack/new/neutron/neutron/tests/etc/policy.json
   DEBUG [oslo_policy.policy] Reloaded policy file: /opt/stack/new/neutron/neutron/tests/etc/policy.json
}}}

Traceback (most recent call last):
  File "neutron/tests/functional/agent/test_ovs_flows.py", line 202, in test_arp_spoof_allowed_address_pairs
    net_helpers.assert_ping(self.src_namespace, self.dst_addr, count=2)
  File "neutron/tests/common/net_helpers.py", line 102, in assert_ping
    dst_ip])
  File "neutron/agent/linux/ip_lib.py", line 885, in execute
    log_fail_as_error=log_fail_as_error, **kwargs)
  File "neutron/agent/linux/utils.py", line 140, in execute
    raise RuntimeError(msg)
RuntimeError: Exit code: 1; Stdin: ; Stdout: PING 192.168.0.2 (192.168.0.2) 56(84) bytes of data.

--- 192.168.0.2 ping statistics ---
2 packets transmitted, 0 received, 100% packet loss, time 1006ms

; Stderr:
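
For context, the assertion that fails here boils down to running ping inside the source namespace and raising RuntimeError on any packet loss. A minimal sketch of that check, assuming plain 'ip netns exec' and Python 3's subprocess (this is not the actual net_helpers.assert_ping implementation):

{{{
import subprocess


def assert_ping(src_namespace, dst_ip, count=2):
    """Ping dst_ip from inside src_namespace; raise on a non-zero exit."""
    cmd = ['ip', 'netns', 'exec', src_namespace,
           'ping', '-c', str(count), '-W', '1', dst_ip]
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.returncode != 0:
        # Mirror the error format seen in the traceback above.
        raise RuntimeError('Exit code: %s; Stdin: ; Stdout: %s; Stderr: %s'
                           % (result.returncode, result.stdout, result.stderr))
}}}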

Changed in neutron:
assignee: nobody → Kevin Benton (kevinbenton)
Changed in neutron:
status: New → In Progress
Revision history for this message
Jakub Libosvar (libosvar) wrote :

I suspect this happens with the native interface only. In one of the failures I saw, there was this trace in the test's log:

2016-02-27 10:05:28.708 3274 ERROR ryu.lib.hub [-] hub: uncaught exception: Traceback (most recent call last):
  File "/opt/stack/new/neutron/.tox/dsvm-functional-constraints/local/lib/python2.7/site-packages/ryu/lib/hub.py", line 52, in _launch
    func(*args, **kwargs)
  File "/opt/stack/new/neutron/.tox/dsvm-functional-constraints/local/lib/python2.7/site-packages/ryu/base/app_manager.py", line 532, in close
    self.uninstantiate(app_name)
  File "/opt/stack/new/neutron/.tox/dsvm-functional-constraints/local/lib/python2.7/site-packages/ryu/base/app_manager.py", line 515, in uninstantiate
    app = self.applications.pop(name)
KeyError: 'OVSNeutronAgentRyuApp'

Any chance this is related to a specific ryu version?
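
For what it's worth, that KeyError just means uninstantiate() popped an app name that was no longer in the registry, e.g. because tear-down ran twice for the same app. A tiny hypothetical sketch of the failure mode and a defensive variant (this is not ryu's actual code):

{{{
# A registry with the app installed once.
applications = {'OVSNeutronAgentRyuApp': object()}

applications.pop('OVSNeutronAgentRyuApp')    # first tear-down succeeds
# applications.pop('OVSNeutronAgentRyuApp')  # a second tear-down raises KeyError

# A defensive variant that tolerates the app already being gone:
app = applications.pop('OVSNeutronAgentRyuApp', None)
if app is None:
    pass  # already uninstantiated, nothing left to clean up
}}}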

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/285181
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=1b5fbd9d0e25308d27eb3f9aa75878b5df6aa59f
Submitter: Jenkins
Branch: master

commit 1b5fbd9d0e25308d27eb3f9aa75878b5df6aa59f
Author: Kevin Benton <email address hidden>
Date: Wed Feb 24 01:27:35 2016 -0800

    Collect details on ARP spoof functional failures

    There seems to be some race condition or corner case in the
    ARP spoofing functional tests that causes them to randomly
    fail in the gate, but it is difficult to reproduce them
    locally. This patch collects additional details on failures so
    we can get some hints about the unexpected state of the
    bridge or interfaces that is causing the failure.

    Partial-Bug: #1550623
    Change-Id: I15b7ab3ce2a95d2b432239d535e3700f28ad21de
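
The gist of the patch is to capture bridge and interface state at the moment the ping assertion fails, so the gate logs carry enough context to debug. A hypothetical sketch of that pattern, reusing the assert_ping sketch from the description above (the helper names and the exact commands collected are assumptions, not the actual patch):

{{{
import subprocess


def _collect_state(bridge_name, namespaces):
    """Gather OVS flow dumps and interface state for a failure report."""
    chunks = [subprocess.run(['ovs-ofctl', 'dump-flows', bridge_name],
                             capture_output=True, text=True).stdout]
    for ns in namespaces:
        chunks.append(subprocess.run(
            ['ip', 'netns', 'exec', ns, 'ip', 'addr'],
            capture_output=True, text=True).stdout)
    return '\n'.join(chunks)


def assert_ping_with_details(src_ns, dst_ip, bridge_name, namespaces):
    """Run the ping assertion and append bridge/interface state on failure."""
    try:
        assert_ping(src_ns, dst_ip, count=2)
    except RuntimeError as exc:
        raise RuntimeError('%s\nbridge/interface state at failure:\n%s'
                           % (exc, _collect_state(bridge_name, namespaces)))
}}}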

Changed in neutron:
importance: Undecided → High
Revision history for this message
OpenStack Infra (hudson-openstack) wrote :

Reviewed: https://review.openstack.org/286428
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=a6c231cdd731dae900c372799a890407bbc24a2c
Submitter: Jenkins
Branch: master

commit a6c231cdd731dae900c372799a890407bbc24a2c
Author: Kevin Benton <email address hidden>
Date: Mon Feb 29 21:28:48 2016 -0800

    Make run_ofctl check for socket error

    When the OVS bridge is still being initialized we get
    a "failed to connect to socket" error when running ovs-ofctl.
    This shows up quite frequently in our functional tests and
    may be the source of their high failure rate.

    Ultimately we need to change the behavior of run_ofctl to not
    ignore errors by default, but this will require a lot of effort
    because there are many places that likely expect this behavior.

    As a workaround, this patch checks for the specific socket failure
    and attempts the command again up to 10 times, sleeping for 1
    second between each attempt to wait for the bridge to be ready.

    Closes-Bug: #1550623
    Closes-Bug: #1551593
    Change-Id: I663a54608ed96133014104fe033ecea0a867ac4c
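
The retry described above has a simple shape: run ovs-ofctl, and if it fails with the socket error, sleep one second and try again, for up to ten attempts. A minimal sketch of that workaround (not the actual ovs_lib.run_ofctl code; the signature and the exact error-string match are assumptions based on the commit message):

{{{
import subprocess
import time


def run_ofctl(cmd, bridge_name, args, process_input=None):
    """Run an ovs-ofctl command, retrying while the bridge is still coming up."""
    full_cmd = ['ovs-ofctl', cmd, bridge_name] + list(args)
    for attempt in range(10):
        result = subprocess.run(full_cmd, input=process_input,
                                capture_output=True, text=True)
        if result.returncode == 0:
            return result.stdout
        if 'failed to connect to socket' not in result.stderr:
            # A different failure: keep the old behaviour of not raising.
            break
        # The bridge socket is probably not created yet; wait and retry.
        time.sleep(1)
    return None
}}}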

Changed in neutron:
status: In Progress → Fix Released
Revision history for this message
Thierry Carrez (ttx) wrote : Fix included in openstack/neutron 8.0.0.0b3

This issue was fixed in the openstack/neutron 8.0.0.0b3 development milestone.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/liberty)

Fix proposed to branch: stable/liberty
Review: https://review.openstack.org/297034

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (stable/kilo)

Fix proposed to branch: stable/kilo
Review: https://review.openstack.org/297035

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/kilo)

Reviewed: https://review.openstack.org/297035
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=5390cc8af2dba69f3823fe8fdaffb579d528fe62
Submitter: Jenkins
Branch: stable/kilo

commit 5390cc8af2dba69f3823fe8fdaffb579d528fe62
Author: Kevin Benton <email address hidden>
Date: Mon Feb 29 21:28:48 2016 -0800

    Make run_ofctl check for socket error

    When the OVS bridge is still being initialized we get
    a "failed to connect to socket" error when running ovs-ofctl.
    This shows up quite frequently in our functional tests and
    may be the source of their high failure rate.

    Ultimately we need to change the behavior of run_ofctl to not
    ignore errors by default, but this will require a lot of effort
    because there are many places that likely expect this behavior.

    As a workaround, this patch checks for the specific socket failure
    and attempts the command again up to 10 times, sleeping for 1
    second between each attempt to wait for the bridge to be ready.

    Conflicts:
     neutron/agent/common/ovs_lib.py

    Closes-Bug: #1550623
    Closes-Bug: #1551593
    Change-Id: I663a54608ed96133014104fe033ecea0a867ac4c
    (cherry picked from commit a6c231cdd731dae900c372799a890407bbc24a2c)
    (cherry picked from commit eec85f361ecd8af96c73d0c8ff1c9fb1d347220a)

tags: added: in-stable-kilo
Revision history for this message
Doug Hellmann (doug-hellmann) wrote : Fix included in openstack/neutron 2015.1.4

This issue was fixed in the openstack/neutron 2015.1.4 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (stable/liberty)

Reviewed: https://review.openstack.org/297034
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=eec85f361ecd8af96c73d0c8ff1c9fb1d347220a
Submitter: Jenkins
Branch: stable/liberty

commit eec85f361ecd8af96c73d0c8ff1c9fb1d347220a
Author: Kevin Benton <email address hidden>
Date: Mon Feb 29 21:28:48 2016 -0800

    Make run_ofctl check for socket error

    When the OVS bridge is still being initialized we get
    a "failed to connect to socket" error when running ovs-ofctl.
    This shows up quite frequently in our functional tests and
    may be the source of their high failure rate.

    Ultimately we need to change the behavior of run_ofctl to not
    ignore errors by default, but this will require a lot of effort
    because there are many places that likely expect this behavior.

    As a workaround, this patch checks for the specific socket failure
    and attempts the command again up to 10 times, sleeping for 1
    second between each attempt to wait for the bridge to be ready.

    Conflicts:
     neutron/agent/common/ovs_lib.py

    Closes-Bug: #1550623
    Closes-Bug: #1551593
    Change-Id: I663a54608ed96133014104fe033ecea0a867ac4c
    (cherry picked from commit a6c231cdd731dae900c372799a890407bbc24a2c)

tags: added: in-stable-liberty
Revision history for this message
Thierry Carrez (ttx) wrote : Fix included in openstack/neutron 7.1.0

This issue was fixed in the openstack/neutron 7.1.0 release.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/neutron 2015.1.4

This issue was fixed in the openstack/neutron 2015.1.4 release.
