openvswitch-agent unable to start because of timeout talking to ovsdb

Bug #1849732 reported by Ryan Farrell
This bug affects 2 people
Affects                               Status         Importance   Assigned to    Milestone
OpenStack Neutron Gateway Charm       Fix Released   Low          Felipe Reyes
OpenStack Neutron Open vSwitch Charm  Invalid        Undecided    Unassigned

Bug Description

On a scaled-out production deployment, we encountered an issue where the openvswitch-switch service was not able to start because the ovsdb query exceeded the default 10-second timeout.

2019-10-24 19:48:06.748 1144752 DEBUG ovsdbapp.backend.ovs_idl.transaction [-] Transaction caused no change do_commit /usr/lib/python2.7/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py:114
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command [req-465f9f5e-ff41-4af1-a5ea-e326b9629a85 - - - - -] Error executing command: TimeoutException: Commands [<ovsdbapp.backend.ovs_idl.command.DbListCommand object at 0x7f1ddb016610>] exceeded timeout 10 seconds
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command Traceback (most recent call last):
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command File "/usr/lib/python2.7/dist-packages/ovsdbapp/backend/ovs_idl/command.py", line 35, in execute
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command txn.add(self)
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command self.gen.next()
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command File "/usr/lib/python2.7/dist-packages/ovsdbapp/api.py", line 94, in transaction
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command self._nested_txn = None
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command File "/usr/lib/python2.7/dist-packages/ovsdbapp/api.py", line 54, in __exit__
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command self.result = self.commit()
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command File "/usr/lib/python2.7/dist-packages/ovsdbapp/backend/ovs_idl/transaction.py", line 57, in commit
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command timeout=self.timeout)
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command TimeoutException: Commands [<ovsdbapp.backend.ovs_idl.command.DbListCommand object at 0x7f1ddb016610>] exceeded timeout 10 seconds
2019-10-24 19:48:06.753 1144752 ERROR ovsdbapp.backend.ovs_idl.command
2019-10-24 19:48:06.754 1144752 ERROR neutron.plugins.ml2.drivers.openvswitch.agent.ovs_neutron_agent [req-465f9f5e-ff41-4af1-a5ea-e326b9629a85 - - - - -] Commands [<ovsdbapp.backend.ovs_idl.command.DbListCommand object at 0x7f1ddb016610>] exceeded timeout 10 seconds Agent terminated!: TimeoutException: Commands [<ovsdbapp.backend.ovs_idl.command.DbListCommand object at 0x7f1ddb016610>] exceeded timeout 10 seconds

The issue was resolved by editing /etc/neutron/plugins/ml2/openvswitch_agent.ini and setting ovsdb_timeout = 60.
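
For anyone who wants to script the same workaround, here is a minimal Python sketch. It is not the charm's mechanism. The function name set_ovsdb_timeout is hypothetical, and the [ovs] section is an assumption about where the option lives in openvswitch_agent.ini, so check your own file first. configparser does not preserve comments, and with strict=False repeated keys collapse to the last value, so treat the output as illustrative rather than a drop-in replacement for the real file.

```python
import configparser
import io


def set_ovsdb_timeout(ini_text, seconds, section="ovs"):
    """Return ini_text with ovsdb_timeout set to `seconds` in `section`.

    Hypothetical helper: the default section name "ovs" is an assumption,
    not something stated in this bug report.
    """
    if seconds <= 0:
        raise ValueError("timeout must be a positive number of seconds")

    # interpolation=None keeps '%' characters in existing values untouched.
    # strict=False tolerates repeated keys (the last occurrence wins).
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)

    if not parser.has_section(section):
        parser.add_section(section)
    parser.set(section, "ovsdb_timeout", str(seconds))

    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


if __name__ == "__main__":
    sample = "[ovs]\novsdb_timeout = 10\n"
    print(set_ovsdb_timeout(sample, 60))
```

On a charm-managed unit a hand edit like this is likely to be overwritten the next time the charm renders the file, which is why a charm config option is the longer-term fix requested here.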

If nothing else, the default should be raised, but we also expect this to be tunable via charm config.

Ryan Farrell (whereisrysmind) wrote :

I think this may actually belong under neutron-gateway.
The unit in question was running a charmed neutron-gateway 12.0.5, charm version 262.
We will need a backport for this fix.

Changed in charm-neutron-openvswitch:
status: New → Invalid
Changed in charm-neutron-gateway:
status: New → Triaged
importance: Undecided → Low
Felipe Reyes (freyes)
Changed in charm-neutron-gateway:
assignee: nobody → Felipe Reyes (freyes)
OpenStack Infra (hudson-openstack) wrote : Fix proposed to charm-neutron-gateway (master)

Fix proposed to branch: master
Review: https://review.opendev.org/693559

Changed in charm-neutron-gateway:
status: Triaged → In Progress
Changed in charm-neutron-gateway:
milestone: none → 20.01
OpenStack Infra (hudson-openstack) wrote : Fix merged to charm-neutron-gateway (master)

Reviewed: https://review.opendev.org/693559
Committed: https://git.openstack.org/cgit/openstack/charm-neutron-gateway/commit/?id=17cbdb50a27662bae4ace3def022d20d21f60c99
Submitter: Zuul
Branch: master

commit 17cbdb50a27662bae4ace3def022d20d21f60c99
Author: Felipe Reyes <email address hidden>
Date: Fri Nov 8 13:18:16 2019 -0300

    Add ovsdb-timeout configuration option

    ovsdb-timeout sets ovsdb_timeout in openvswitch_agent.ini, this option
    is used to determine when ovsdb commands should be marked as fail. This
    is helpful for large clouds or where the node is under pressure.

    Change-Id: I0b0e397691c49d3fcebdd30bbe9b160789acf3c3
    Closes-Bug: #1849732

Changed in charm-neutron-gateway:
status: In Progress → Fix Committed
James Page (james-page)
Changed in charm-neutron-gateway:
status: Fix Committed → Fix Released