Tempest runs: intermittent single test failure

Bug #1655618 reported by Bernard Cafarelli
This bug affects 2 people
Affects: networking-sfc
Status: Fix Released
Importance: Undecided
Assigned to: Bernard Cafarelli
Milestone: (none listed)

Bug Description

Intermittently, the tempest gate test run fails on a single test:
http://logs.openstack.org/72/415972/2/check/gate-tempest-dsvm-networking-sfc-ubuntu-xenial/7152b27/console.html#_2017-01-10_19_52_55_813734
http://logs.openstack.org/00/409800/2/gate/gate-tempest-dsvm-networking-sfc-ubuntu-xenial/e7994eb/console.html#_2017-01-11_09_57_35_797051
http://logs.openstack.org/94/411194/2/check/gate-tempest-dsvm-networking-sfc-ubuntu-xenial/b2c772d/console.html#_2017-01-11_10_21_28_548435

The failing test is usually either test_show_port_chain or test_update_port_chain (in networking_sfc.tests.tempest_plugin.tests.api.test_sfc_extensions.SfcExtensionTestJSON), with the following error details:
Details: {u'type': u'PortChainFlowClassifierInConflict', u'detail': u'', u'message': u'Flow Classifier fdce19cc-bc4c-432d-b4d0-bc5e1fb795a5 conflicts with Flow Classifier a5db4038-e49f-4c7e-9e5b-13b2beee3251 in port chain 07bd1ba0-202e-4551-b4ee-971f4f3c4ec2.'}

Revision history for this message
Bernard Cafarelli (bcafarel) wrote :

Taking a look at it

Changed in networking-sfc:
assignee: nobody → Bernard Cafarelli (bcafarel)
status: New → In Progress
Bernard Cafarelli (bcafarel) wrote :

This does seem to be a race condition: the flow classifier created for the new test collides with the not-yet-deleted one from the previous test.

Fixing this bug may therefore make it possible to drop the tempest concurrency limit:
https://github.com/openstack/networking-sfc/blob/d92123ab929651920c99fb8375bf3b6663037286/devstack/post_test_hook.sh#L31
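The race described above can be replayed with a small toy model. This is only a sketch under stated assumptions: ToyService, overlapping_run and the match criteria are hypothetical names made up for illustration, and none of this is networking-sfc or tempest code. It only shows the ordering problem, where the next test creates its flow classifier before the previous test's cleanup has completed.

```python
class PortChainFlowClassifierInConflict(Exception):
    """Toy stand-in named after the error type in the failure details."""


class ToyService:
    """In-memory stand-in for the server side; not networking-sfc code."""

    def __init__(self):
        # Maps flow classifier name -> match criteria still in use.
        self.in_use = {}

    def create(self, name, match):
        # Reject a classifier whose criteria equal one that is still in use.
        for other_name, other_match in self.in_use.items():
            if other_match == match:
                raise PortChainFlowClassifierInConflict(
                    "%s conflicts with %s" % (name, other_name))
        self.in_use[name] = match

    def delete(self, name):
        del self.in_use[name]


def overlapping_run():
    """Replay the failing order: the next test creates before cleanup ran."""
    service = ToyService()
    match = ("IPv4", "tcp", "10.0.0.0/24")  # made-up criteria
    service.create("fc-previous-test", match)
    try:
        service.create("fc-new-test", match)  # cleanup has not happened yet
        first_attempt = "created"
    except PortChainFlowClassifierInConflict:
        first_attempt = "conflict"
    service.delete("fc-previous-test")    # the delayed cleanup finally runs
    service.create("fc-new-test", match)  # the same request now succeeds
    return first_attempt, sorted(service.in_use)
```

In this model the first attempt fails and the retry after cleanup succeeds, which matches the intermittent nature of the failure: whether the conflict shows up depends only on timing.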

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to networking-sfc (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/419525

Bernard Cafarelli (bcafarel) wrote :

Note that the conflict that appears in the race condition is the same as the one mentioned by Igor in:
http://lists.openstack.org/pipermail/openstack-dev/2017-January/109873.html

The fix is to use flowclassifier_conflict() instead of flowclassifier_basic_conflict() (which does not check logical port conflicts). Feedback is welcome on the review!
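To make the distinction between the two kinds of check concrete, here is a simplified sketch. It is an illustration under assumptions, not the networking-sfc implementation: n_tuple_conflict and full_conflict are hypothetical stand-ins, the field list is illustrative, and fields are compared by exact equality, whereas a real check would presumably also handle overlapping prefixes and port ranges.

```python
# Fields compared by the hypothetical N-tuple check (illustrative list only).
N_TUPLE_FIELDS = (
    "ethertype",
    "protocol",
    "source_ip_prefix",
    "destination_ip_prefix",
)
LOGICAL_PORT_FIELDS = ("logical_source_port", "logical_destination_port")


def n_tuple_conflict(first, second):
    """True when both classifiers carry the same N-tuple (exact match only)."""
    return all(first.get(f) == second.get(f) for f in N_TUPLE_FIELDS)


def full_conflict(first, second):
    """N-tuple conflict that additionally requires identical logical ports."""
    return n_tuple_conflict(first, second) and all(
        first.get(f) == second.get(f) for f in LOGICAL_PORT_FIELDS)


# Two made-up classifiers: same N-tuple, different logical source port.
fc_a = {
    "ethertype": "IPv4",
    "protocol": "tcp",
    "source_ip_prefix": "10.0.0.0/24",
    "destination_ip_prefix": "10.0.1.0/24",
    "logical_source_port": "port-a",
    "logical_destination_port": None,
}
fc_b = dict(fc_a, logical_source_port="port-b")
```

With these inputs the N-tuple check reports a conflict while the check that also compares logical ports does not, which is the behavioural difference under discussion.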

OpenStack Infra (hudson-openstack) wrote : Change abandoned on networking-sfc (master)

Change abandoned by Bernard Cafarelli (<email address hidden>) on branch: master
Review: https://review.openstack.org/419525
Reason: With the current implementation, the check must be on the N-tuple only:
http://eavesdrop.openstack.org/irclogs/%23openstack-meeting-4/%23openstack-meeting-4.2017-01-12.log.html#t2017-01-12T17:26:30

OpenStack Infra (hudson-openstack) wrote : Related fix proposed to networking-sfc (master)

Related fix proposed to branch: master
Review: https://review.openstack.org/419928

Bernard Cafarelli (bcafarel) wrote :

OK, so the current check is correct (see the ML thread for details).

After checking through some failure logs again, I found that the conflict actually happens between an API test and a scenario test, not between two scenario tests. So the "--concurrency=0" parameter is not enough for separate test suites.

The new review I sent separates these suites as a quick fix. A proper fix that lets us drop the parallel limitation would be nice, but this can be done in a new bug (as it would be a bigger rewrite).
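The idea of keeping the two suites apart can be sketched generically as follows. This is not the change made in the review, which is not reproduced here: run_suites_serially is a hypothetical helper, and the commands are placeholders standing in for the real API and scenario runs.

```python
import subprocess
import sys


def run_suites_serially(commands):
    """Run each command to completion before starting the next one.

    Returns the exit codes in the order the commands were run.
    """
    exit_codes = []
    for command in commands:
        # subprocess.run blocks until the child exits, so suites never overlap.
        completed = subprocess.run(command)
        exit_codes.append(completed.returncode)
    return exit_codes


if __name__ == "__main__":
    # Placeholder commands; the real test invocations are not shown here.
    api_suite = [sys.executable, "-c", "print('api suite done')"]
    scenario_suite = [sys.executable, "-c", "print('scenario suite done')"]
    print(run_suites_serially([api_suite, scenario_suite]))
```

Because each suite finishes, including its cleanup, before the next one starts, resources left over from one suite cannot collide with those created by the other.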

OpenStack Infra (hudson-openstack) wrote : Fix merged to networking-sfc (master)

Reviewed: https://review.openstack.org/419928
Committed: https://git.openstack.org/cgit/openstack/networking-sfc/commit/?id=d31008bc0c8a829995a310b64519d125f6c055c1
Submitter: Jenkins
Branch: master

commit d31008bc0c8a829995a310b64519d125f6c055c1
Author: Bernard Cafarelli <email address hidden>
Date: Fri Jan 13 13:36:38 2017 +0100

    Fix intermittent tempest test failures

    Ensure API and scenario tests do not run in parallel, as this can
    trigger a flow classifier conflict

    Change-Id: Ic9f71d1704c6850e79192e07c6072b33aa578c79
    Closes-Bug: 1655618

Changed in networking-sfc:
status: In Progress → Fix Released
OpenStack Infra (hudson-openstack) wrote : Related fix merged to networking-sfc (stable/newton)

Reviewed: https://review.openstack.org/418926
Committed: https://git.openstack.org/cgit/openstack/networking-sfc/commit/?id=405607f11e74ad62fc7d9d4238c4c5643c5e33fe
Submitter: Jenkins
Branch: stable/newton

commit 405607f11e74ad62fc7d9d4238c4c5643c5e33fe
Author: Bernard Cafarelli <email address hidden>
Date: Wed Jan 4 10:43:45 2017 +0100

    Fix tempest tests on stable/newton

    * Remove unused DB query call in OVS driver
    * Ensure API and scenario tests do not run in parallel, as this can
      trigger a flow classifier conflict

    Change-Id: I712f84f6d58f17142f11d333b6ac26ac3d6de6cf
    Related-Bug: 1630503
    Related-Bug: 1655618
    (cherry picked from commit 9b39c43139342c3de26186ee6e4168d441839f42)
    (cherry picked from commit d31008bc0c8a829995a310b64519d125f6c055c1)

tags: added: in-stable-newton
OpenStack Infra (hudson-openstack) wrote : Fix included in openstack/networking-sfc 4.0.0

This issue was fixed in the openstack/networking-sfc 4.0.0 release.
