Cloned from https://bugzilla.redhat.com/show_bug.cgi?id=2311146
During the execution of a BGP downstream job, the following error logs have been found on a compute node:
2024-09-10T04:51:06.495170198+00:00 stdout F 2024-09-10 04:51:06.495 31304 DEBUG oslo_concurrency.lockutils [-] Lock "bgp" released by "ovn_bgp_agent.drivers.openstack.ovn_bgp_driver.OVNBGPDriver.sync" :: held 0.159s inner /usr/lib/python3.9/site-packages/oslo_concurrency/lockutils.py:367
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent [-] Unexpected exception while running the sync: Unexpected error while running command.
2024-09-10T04:51:06.500782975+00:00 stdout F Command: ovs-vsctl get Interface patch-provnet-ec5f2613-76de-4880-b4e4-9523e2c04f43-to-br-int ofport -O OpenFlow13
2024-09-10T04:51:06.500782975+00:00 stdout F Exit code: 1
2024-09-10T04:51:06.500782975+00:00 stdout F Stdout: ''
2024-09-10T04:51:06.500782975+00:00 stdout F Stderr: 'ovs-vsctl: Interface does not contain a column whose name matches "-O"\n': oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2024-09-10T04:51:06.500782975+00:00 stdout F Command: ovs-vsctl get Interface patch-provnet-ec5f2613-76de-4880-b4e4-9523e2c04f43-to-br-int ofport -O OpenFlow13
2024-09-10T04:51:06.500782975+00:00 stdout F Exit code: 1
2024-09-10T04:51:06.500782975+00:00 stdout F Stdout: ''
2024-09-10T04:51:06.500782975+00:00 stdout F Stderr: 'ovs-vsctl: Interface does not contain a column whose name matches "-O"\n'
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent Traceback (most recent call last):
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent File "/usr/lib/python3.9/site-packages/ovn_bgp_agent/agent.py", line 53, in sync
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent self.agent_driver.sync()
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent File "/usr/lib/python3.9/site-packages/oslo_concurrency/lockutils.py", line 360, in inner
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent return f(*args, **kwargs)
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent File "/usr/lib/python3.9/site-packages/ovn_bgp_agent/drivers/openstack/ovn_bgp_driver.py", line 203, in sync
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent ovs.get_ovs_patch_ports_info(bridge))
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent File "/usr/lib/python3.9/site-packages/ovn_bgp_agent/drivers/openstack/utils/ovs.py", line 61, in get_ovs_patch_ports_info
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent ovs_ofport = get_device_port_at_ovs(ovs_port)
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent File "/usr/lib/python3.9/site-packages/ovn_bgp_agent/drivers/openstack/utils/ovs.py", line 51, in get_device_port_at_ovs
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent return ovn_bgp_agent.privileged.ovs_vsctl.ovs_cmd(
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent File "/usr/lib/python3.9/site-packages/oslo_privsep/priv_context.py", line 253, in _wrap
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent return self.channel.remote_call(name, args, kwargs)
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent File "/usr/lib/python3.9/site-packages/oslo_privsep/daemon.py", line 226, in remote_call
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent raise exc_type(*result[2])
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent oslo_concurrency.processutils.ProcessExecutionError: Unexpected error while running command.
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent Command: ovs-vsctl get Interface patch-provnet-ec5f2613-76de-4880-b4e4-9523e2c04f43-to-br-int ofport -O OpenFlow13
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent Exit code: 1
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent Stdout: ''
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent Stderr: 'ovs-vsctl: Interface does not contain a column whose name matches "-O"\n'
2024-09-10T04:51:06.500782975+00:00 stdout F 2024-09-10 04:51:06.495 31304 ERROR ovn_bgp_agent.agent
This issue is not critical, because the next sync did not fail.
Perhaps we should add retries to `get_device_port_at_ovs` as we do with `get_ovs_patch_port_ofport`?
https://github.com/openstack/ovn-bgp-agent/blob/master/ovn_bgp_agent/drivers/openstack/utils/ovs.py#L53
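To make the retry suggestion concrete, here is a minimal sketch of what it could look like. It is not the agent's actual code: the `retry` decorator, its `attempts` and `delay` parameters, and the `subprocess`-based body of `get_device_port_at_ovs` are all assumptions for illustration (the real helper goes through the privileged `ovs_cmd` wrapper and raises `ProcessExecutionError`). Only the `ovs-vsctl get Interface <name> ofport` command is taken from the log above.

```python
import functools
import subprocess
import time


def retry(exceptions, attempts=3, delay=0.5):
    """Hypothetical helper: re-run the wrapped callable on `exceptions`.

    The last failure is re-raised once `attempts` calls have been made.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == attempts:
                        raise
                    # Give OVS a moment before asking again.
                    time.sleep(delay)
        return wrapper
    return decorator


@retry(subprocess.CalledProcessError, attempts=3, delay=0.5)
def get_device_port_at_ovs(device):
    # Stand-in for the agent helper (assumption): plain subprocess is used
    # here instead of the privileged ovs_cmd wrapper so the sketch stays
    # self-contained.
    result = subprocess.run(
        ["ovs-vsctl", "get", "Interface", device, "ofport"],
        check=True, capture_output=True, text=True)
    return result.stdout.strip()
```

With something along these lines, a single transient `ovs-vsctl` failure would be absorbed inside the helper instead of aborting the whole sync, while a persistent failure would still surface after the last attempt.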
How reproducible:
Only once, so far.
Steps to Reproduce:
Unfortunately, we don't have a simple reproducer. The error was hit while running a job that executes tempest and neutron-tempest-plugin tests.
Fix proposed to branch: master
Review: https://review.opendev.org/c/openstack/ovn-bgp-agent/+/928822