Neutron 504 Gateway Timeout Openstack Kolla-Ansible : Ussuri

Bug #2025946 reported by Adelia Nurlina
This bug affects 1 person
Affects: neutron
Status: Invalid
Importance: High
Assigned to: Unassigned
Milestone: (none listed)

Bug Description

I have three OpenStack controllers, but network agent requests often return 504 Gateway Timeout. When I check neutron_server.log, the following messages show up on one of my controllers:

2023-05-24 10:00:23.314 687 ERROR neutron.api.v2.resource [req-a1f3e58a-00f7-4ed9-b8e5-6c538dc5d5a3 3fe50ccef00f49e3b1b0bbd58705a930 c7d2001e7a2c4c32b9f2a3657f29b6b0 - default default] index failed: No details.: ovsdbapp.exceptions.TimeoutException: Commands [<neutron.plugins.ml2.drivers.ovn.mech_driver.ovsdb.commands.CheckLivenessCommand object at 0x7f5ce81a98d0>] exceeded timeout 180 seconds
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource Traceback (most recent call last):
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource File "/var/lib/kolla/venv/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 153, in queue_txn
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource self.txns.put(txn, timeout=self.timeout)
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource File "/var/lib/kolla/venv/lib/python3.6/site-packages/ovsdbapp/backend/ovs_idl/connection.py", line 51, in put
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource super(TransactionQueue, self).put(*args, **kwargs)
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/queue.py", line 264, in put
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource result = waiter.wait()
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/queue.py", line 141, in wait
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource return get_hub().switch()
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource File "/var/lib/kolla/venv/lib/python3.6/site-packages/eventlet/hubs/hub.py", line 298, in switch
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource return self.greenlet.switch()
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource queue.Full
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource During handling of the above exception, another exception occurred:
2023-07-05 09:49:03.453 670 ERROR neutron.api.v2.resource

How do I solve this?

Tags: ovn
Adelia Nurlina (adelianurlinap) wrote :

ovn-sb-db.log

2023-07-05T03:45:48.842Z|07041|raft|INFO|current entry eid fda48665-de70-42ff-863e-10d7f312a0a4 does not match prerequisite f4651d24-a156-4ccd-97f2-db15e80338ba in execute_command_request
2023-07-05T03:45:48.848Z|07042|raft|INFO|current entry eid dc4a25c0-3320-4d54-8316-118291114bd7 does not match prerequisite b0d5453a-5ebc-45d7-8011-ced535eeb298 in execute_command_request
2023-07-05T03:45:48.848Z|07043|raft|INFO|current entry eid dc4a25c0-3320-4d54-8316-118291114bd7 does not match prerequisite b0d5453a-5ebc-45d7-8011-ced535eeb298 in execute_command_request
2023-07-05T03:46:07.760Z|07044|reconnect|ERR|tcp:xxxx:40020: no response to inactivity probe after 5 seconds, disconnecting
2023-07-05T03:46:09.761Z|07045|reconnect|ERR|tcp:xxxx:40012: no response to inactivity probe after 5 seconds, disconnecting
2023-07-05T03:46:11.429Z|07046|reconnect|ERR|tcp:xxx:40014: no response to inactivity probe after 5 seconds, disconnecting

ovn-nb-db.log

2023-07-05T03:47:22.119Z|03349|reconnect|ERR|tcp:xxxx:42132: no response to inactivity probe after 5 seconds, disconnecting
2023-07-05T03:47:26.750Z|03350|reconnect|ERR|tcp:xxxx:46362: no response to inactivity probe after 5.09 seconds, disconnecting
2023-07-05T03:47:28.740Z|03351|reconnect|ERR|tcp:xxxx:42160: no response to inactivity probe after 5 seconds, disconnecting
2023-07-05T03:47:31.121Z|03352|reconnect|ERR|tcp:xxxx:59026: no response to inactivity probe after 5 seconds, disconnecting
2023-07-05T03:47:33.120Z|03353|reconnect|ERR|tcp:xxxx:59038: no response to inactivity probe after 5 seconds, disconnecting
2023-07-05T03:47:36.792Z|03354|reconnect|ERR|tcp:xxxx:59048: no response to inactivity probe after 5.12 seconds, disconnecting
2023-07-05T03:47:40.126Z|03355|reconnect|ERR|tcp:xxxx:47302: no response to inactivity probe after 5.28 seconds, disconnecting

tags: added: ovn
Changed in neutron:
importance: Undecided → Critical
importance: Critical → High
Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Adelia:

Checking the logs provided, it seems that the inactivity probe timeout (5 seconds in these logs) is too short. Did you try increasing this value? For example, in the "Manual install & Configuration" guide [1], the time used is 60 seconds (60000 ms). Please try this first.

Regards.

[1] https://docs.openstack.org/neutron/yoga/install/ovn/manual_install.html
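
To confirm which probe interval the database servers are actually applying, before and after such a change, the disconnect messages can be tallied from the log files. The following is only a minimal sketch: the regular expression is derived solely from the log lines quoted in this report, and the names `PROBE_RE`, `probe_timeouts`, and `sample_lines` are hypothetical helpers, not part of OVN or Neutron.

```python
import re
from collections import Counter

# Pattern inferred from the ovn-sb-db.log / ovn-nb-db.log lines in this report
# (an assumption: other OVS/OVN versions may format the message differently).
PROBE_RE = re.compile(
    r"\|reconnect\|ERR\|(?P<peer>\S+): no response to inactivity probe "
    r"after (?P<seconds>[\d.]+) seconds, disconnecting"
)


def probe_timeouts(lines):
    """Yield (peer, seconds) for each inactivity-probe disconnect in lines."""
    for line in lines:
        match = PROBE_RE.search(line)
        if match:
            yield match.group("peer"), float(match.group("seconds"))


# Lines copied from the logs posted in this bug report.
sample_lines = [
    "2023-07-05T03:47:22.119Z|03349|reconnect|ERR|tcp:xxxx:42132: no response to inactivity probe after 5 seconds, disconnecting",
    "2023-07-05T03:47:26.750Z|03350|reconnect|ERR|tcp:xxxx:46362: no response to inactivity probe after 5.09 seconds, disconnecting",
    "2023-07-05T03:45:48.842Z|07041|raft|INFO|current entry eid fda48665-de70-42ff-863e-10d7f312a0a4 does not match prerequisite f4651d24-a156-4ccd-97f2-db15e80338ba in execute_command_request",
]

# Count disconnects per probe interval, rounded to whole seconds.
intervals = Counter(round(seconds) for _, seconds in probe_timeouts(sample_lines))
print(intervals)
```

If the change took effect, new disconnect messages (if any) should report an interval near 60 seconds rather than 5.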

Adelia Nurlina (adelianurlinap) wrote :

Hello, thank you for your help. I tried setting ptcp:6641:0.0.0.0 and ptcp:6642:0.0.0.0, but the log says `err|6641:0.0.0.0: bind: address already in use`.
If I change the address to the IP of one of my controllers (e.g. 10.0.0.1), connections are dropped whenever that controller isn't the leader of ovn_sb and ovn_nb. How can I use set-connection with 3 IPs?

Rodolfo Alonso (rodolfo-alonso-hernandez) wrote :

Hello Adelia:

The problem you have is that other processes on this host are already using this TCP port [1]. First list which processes are using the port, stop them, and then restart the OVN controller.

I'm closing this bug because it doesn't seem to be a Neutron problem but a system/backend issue.

Regards.

[1] https://mail.openvswitch.org/pipermail/ovs-discuss/2017-February/043597.html
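
As a quick way to reproduce the "address already in use" condition from a script, a bind attempt can be made on the same address and port. This is a minimal sketch using only the Python standard library; `port_in_use` is a hypothetical helper name, and the check only reports whether the port is taken. It does not identify the owning process, so a tool such as `ss` or `lsof` is still needed for that.

```python
import errno
import socket


def port_in_use(host, port):
    """Return True if a TCP bind to (host, port) fails with EADDRINUSE."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # any other error (e.g. permissions) is not what we test for
        return False


if __name__ == "__main__":
    # Demonstration on a throwaway local listener rather than a real OVN port.
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        listener.listen(1)
        demo_port = listener.getsockname()[1]
        print(demo_port, port_in_use("127.0.0.1", demo_port))
```

On a controller, the same function could be called with the addresses from this report, for example `port_in_use("0.0.0.0", 6641)` and `port_in_use("0.0.0.0", 6642)`.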

Changed in neutron:
status: New → Invalid