Open vSwitch commands timeout on gate tests

Bug #1254520 reported by Salvatore Orlando
This bug affects 1 person
Affects: neutron
Status: Fix Released
Importance: High
Assigned to: Salvatore Orlando

Bug Description

Error 242 occurs fairly often in gate tests: http://logstash.openstack.org/#eyJzZWFyY2giOiJcIkV4aXQgY29kZTogMjQyXCIgIiwiZmllbGRzIjpbImZpbGVuYW1lIl0sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiNDMyMDAiLCJncmFwaG1vZGUiOiJjb3VudCIsInRpbWUiOnsidXNlcl9pbnRlcnZhbCI6MH0sInN0YW1wIjoxMzg1MzE3OTg4NTQxLCJtb2RlIjoiIiwiYW5hbHl6ZV9maWVsZCI6IiJ9

This is actually an ALARM_CLOCK error [142] (rootwrap adds 100 to the error code), and it means that Open vSwitch commands are timing out. As the default timeout is 2 seconds, these timeouts are likely to occur fairly often in a stressful scenario such as the one created by parallel tests with tenant isolation.
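
The arithmetic behind that reading can be sketched as follows. This is only an illustration: the helper name and constants are hypothetical, the 100 offset is taken from the description above, and the 128 + N convention for a process killed by signal N is the usual shell behaviour rather than anything specific to Neutron.

```python
import signal

# Assumption taken from the report: rootwrap adds 100 to the error code.
ROOTWRAP_OFFSET = 100
# Common shell convention: a process killed by signal N reports status 128 + N.
SHELL_SIGNAL_BASE = 128


def decode_exit_code(code):
    """Return the signal name hidden in a rootwrap exit code, or None.

    Hypothetical helper for illustration only; not part of Neutron.
    """
    signum = code - ROOTWRAP_OFFSET - SHELL_SIGNAL_BASE
    try:
        return signal.Signals(signum).name
    except ValueError:
        # The remainder does not correspond to a known signal number.
        return None


if __name__ == "__main__":
    # 242 - 100 = 142, and 142 - 128 = 14, which is SIGALRM on Linux.
    print(decode_exit_code(242))
```

On Linux this maps 242 back to SIGALRM, which is consistent with a command being killed when its timeout expires.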

It might therefore be advisable to make the timeout on OVS commands configurable, and to increase it for gate tests.

Kernel logs do not provide enough additional information.
It might also be worth adding the Open vSwitch logs to the logs collected by devstack-gate.

Salvatore Orlando (salvatore-orlando) wrote :

Adding devstack to affected projects as a change in the default timeout needs to be addressed there.

Changed in neutron:
status: New → Triaged
importance: Undecided → High
OpenStack Infra (hudson-openstack) wrote : Fix proposed to neutron (master)

Fix proposed to branch: master
Review: https://review.openstack.org/61105

Changed in neutron:
status: Triaged → In Progress
OpenStack Infra (hudson-openstack) wrote : Fix proposed to devstack (master)

Fix proposed to branch: master
Review: https://review.openstack.org/61136

Changed in devstack:
assignee: nobody → Salvatore Orlando (salvatore-orlando)
status: New → In Progress
Changed in neutron:
milestone: none → icehouse-2
no longer affects: devstack
OpenStack Infra (hudson-openstack) wrote : Fix merged to neutron (master)

Reviewed: https://review.openstack.org/61105
Committed: https://git.openstack.org/cgit/openstack/neutron/commit/?id=4af2163bd41648d64cf1c3c838990737955d2133
Submitter: Jenkins
Branch: master

commit 4af2163bd41648d64cf1c3c838990737955d2133
Author: Salvatore Orlando <email address hidden>
Date: Tue Dec 10 04:20:26 2013 -0800

    Make timeout for ovs-vsctl configurable

    This patch adds a new configuration variable for the timeout on
    ovs-vsctl commands, and sets the default timeout to 10 seconds.
    This is aimed at allowing users to tune the agents in order to avoid
    timeout errors on their deployments.

    Change-Id: I73ea0d0de49a4b4a118bc2d68ad9c093ea122717
    Closes-Bug: #1254520
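
For readers unfamiliar with the mechanism, the idea of the change can be sketched roughly as below. This is not the code from the commit: the function and constant names are hypothetical, and the only assumptions are that ovs-vsctl accepts its documented --timeout=SECS option and that the new default is the 10 seconds stated in the commit message.

```python
import subprocess

# Default taken from the commit message above; in a real agent this value
# would come from a configuration option (the option name is not shown here).
DEFAULT_VSCTL_TIMEOUT = 10  # seconds


def build_vsctl_cmd(args, timeout=DEFAULT_VSCTL_TIMEOUT):
    """Build an ovs-vsctl command line with an explicit timeout.

    Hypothetical helper for illustration only.
    """
    return ["ovs-vsctl", "--timeout=%d" % timeout] + list(args)


def run_vsctl(args, timeout=DEFAULT_VSCTL_TIMEOUT):
    """Run ovs-vsctl and return its output (requires Open vSwitch)."""
    return subprocess.check_output(build_vsctl_cmd(args, timeout))
```

A deployment under heavy load could then pass a larger value, for example build_vsctl_cmd(["list-br"], timeout=30), instead of relying on a hard-coded limit.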

Changed in neutron:
status: In Progress → Fix Committed
Thierry Carrez (ttx)
Changed in neutron:
status: Fix Committed → Fix Released
Thierry Carrez (ttx)
Changed in neutron:
milestone: icehouse-2 → 2014.1