[N-D-R] The dynamic routing service is not resilient to infrastructure outage

Bug #2039812 reported by Roberto Bartzen Acosta
This bug affects 2 people
Affects   Status   Importance   Assigned to   Milestone
neutron   New      Undecided    Unassigned

Bug Description

Currently, the n-d-r service architecture depends on messaging between the DRAgent service and the Neutron server. However, this communication depends strongly on the availability of the messaging service (RabbitMQ by default), and any transient or permanent failure in the OpenStack infrastructure nodes may affect prefix advertising via BGP.

The issue here is not the dependency on the messaging service itself, as this is the common design of OpenStack modules. I'm talking about how the control plane service (n-d-r) can actively affect the data plane.

I understand that, by design, the application drops the BGP peer connections after a certain time without RMQ communication (in my tests it took 1 hour), but as a result all the prefixes/FIPs stop being advertised, dropping external connectivity. To be clear, a lack of messages between the Neutron server and the DRAgent via RMQ will cause general unavailability of the whole North/South data plane.
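
For reference, the "1 hour" figure can be checked against the log excerpt quoted in this report. The sketch below is only arithmetic on two timestamps copied from those logs (the first AMQP error and the `core.stop` call); the helper name `elapsed_seconds` is made up for illustration.

```python
from datetime import datetime

# Timestamps copied from the log excerpt in this report.
FIRST_AMQP_ERROR = "2023-08-22 04:08:24.099"  # "AMQP server ... is unreachable"
SPEAKER_STOPPED = "2023-08-22 05:09:00.547"   # "API method core.stop called"


def elapsed_seconds(start: str, end: str) -> float:
    """Return the number of seconds between two log timestamps."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds()


if __name__ == "__main__":
    secs = elapsed_seconds(FIRST_AMQP_ERROR, SPEAKER_STOPPED)
    print(f"{secs:.3f} s (about {secs / 60:.1f} min)")
```

This gives roughly one hour between the first messaging error and the BGP speaker being stopped in this particular test.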

IMO: it would be helpful for the DRAgent service to implement a resilience solution for the data plane, keeping the sessions with BGP peers up and waiting for RMQ communication to come back (a large timeout can help here). Additionally, n-d-r on the Neutron side needs to keep the BGP speaker alive in infrastructure failure cases, because if the speaker is removed the DRAgent will no longer work.
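
To make the idea concrete, here is a minimal sketch of the behaviour I have in mind. It is not the n-d-r code: `ResilientSpeakerSync`, `fetch_speakers` and `apply_speakers` are hypothetical names, and `MessagingTimeout` is a local stand-in for `oslo_messaging.exceptions.MessagingTimeout` so that the example runs on its own. On an RPC timeout the helper keeps the last known speaker state and only marks it as stale, instead of tearing anything down.

```python
class MessagingTimeout(Exception):
    """Local stand-in for oslo_messaging.exceptions.MessagingTimeout."""


class ResilientSpeakerSync:
    """Hypothetical helper: keep last-known speakers when the RPC call times out."""

    def __init__(self, fetch_speakers, apply_speakers):
        self._fetch = fetch_speakers   # callable returning {speaker_id: config}
        self._apply = apply_speakers   # callable pushing config to the BGP driver
        self.cached = {}               # last state successfully fetched
        self.stale = False             # True while the control plane is unreachable

    def sync_once(self):
        """Try one sync; return True on success, False if state was kept as-is."""
        try:
            speakers = self._fetch()
        except MessagingTimeout:
            # Control plane unreachable: leave sessions and routes untouched.
            self.stale = True
            return False
        self._apply(speakers)
        self.cached = dict(speakers)
        self.stale = False
        return True


if __name__ == "__main__":
    applied = []
    replies = [{"speaker-1": {"local_as": 64512}}, MessagingTimeout()]

    def fake_fetch():
        reply = replies.pop(0)
        if isinstance(reply, Exception):
            raise reply
        return reply

    sync = ResilientSpeakerSync(fake_fetch, applied.append)
    print(sync.sync_once(), sync.stale)  # first call succeeds
    print(sync.sync_once(), sync.stale)  # second call times out, cache is kept
    print(sync.cached)
```

The point of the design is that a failed sync is treated as "no new information" rather than "no speakers", so the data plane keeps advertising what it already had until the messaging service is back.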

I know that RMQ being out of service for long periods is critical for many parts of OpenStack, but even with HA in the DRAgent deployment we will have a single point of failure, as the n-d-r agent needs to communicate with Neutron via the messaging service.

Has anyone else had this problem? Does it make sense to you?

---------------------------------------------------------------------------------------------

Logs of the n-d-r service being stopped and closing the BGP peer connections:

Aug 22 04:08:24 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:08:24.099 913828 ERROR oslo.messaging._drivers.impl_rabbit [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] [354770b6-dc97-439a-b059-91eb3be6b2f4] AMQP server on 10.36.16.246:5671 is unreachable: . Trying again in 1 seconds.: TimeoutError
Aug 22 04:08:25 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:08:25.231 913828 INFO oslo.messaging._drivers.impl_rabbit [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] [354770b6-dc97-439a-b059-91eb3be6b2f4] Reconnected to AMQP server on 10.36.16.246:5671 via [amqp] client with port 53326.
Aug 22 04:08:25 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:08:25.271 913828 WARNING oslo.service.loopingcall [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Function 'neutron.service.Service.periodic_tasks' run outlasted interval by 27.11 sec

Aug 22 04:09:25 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:09:25.275 913828 ERROR neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Timeout in RPC method get_bgp_speakers. Waiting for 58 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID a2732db52e964c87aebf283a0d8a9f62
Aug 22 04:09:25 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:09:25.279 913828 WARNING neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Increasing timeout for get_bgp_speakers calls to 120 seconds. Restart the agent to restore it to the default value.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID a2732db52e964c87aebf283a0d8a9f62

Aug 22 04:10:23 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:10:23.314 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Unable to sync BGP speaker state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID a2732db52e964c87aebf283a0d8a9f62
Aug 22 04:10:23 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:10:23.317 913828 WARNING oslo.service.loopingcall [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Function 'neutron.service.Service.periodic_tasks' run outlasted interval by 78.05 sec

Aug 22 04:12:23 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:12:23.324 913828 ERROR neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Timeout in RPC method get_bgp_speakers. Waiting for 33 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID e4938c710cd9469eb10518be008658ad
Aug 22 04:12:23 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:12:23.325 913828 WARNING neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Increasing timeout for get_bgp_speakers calls to 240 seconds. Restart the agent to restore it to the default value.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID e4938c710cd9469eb10518be008658ad
Aug 22 04:12:55 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:12:55.942 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Unable to sync BGP speaker state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID e4938c710cd9469eb10518be008658ad
Aug 22 04:12:55 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:12:55.944 913828 WARNING oslo.service.loopingcall [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Function 'neutron.service.Service.periodic_tasks' run outlasted interval by 112.63 sec

Aug 22 04:16:55 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:16:55.952 913828 ERROR neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Timeout in RPC method get_bgp_speakers. Waiting for 43 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 5c9b8b987c7a4c39a010f951fdb4d76c
Aug 22 04:16:55 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:16:55.955 913828 WARNING neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Increasing timeout for get_bgp_speakers calls to 480 seconds. Restart the agent to restore it to the default value.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 5c9b8b987c7a4c39a010f951fdb4d76c

Aug 22 04:17:39 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:17:39.438 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Unable to sync BGP speaker state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 5c9b8b987c7a4c39a010f951fdb4d76c
Aug 22 04:17:39 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:17:39.441 913828 WARNING oslo.service.loopingcall [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Function 'neutron.service.Service.periodic_tasks' run outlasted interval by 243.50 sec

Aug 22 04:22:47 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent [req-dbc95626-cc0a-4b2e-b236-899237e093f1 - - - - -] Failed reporting state!: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 03ed494d17d84502871b667005e9c2d5
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent Traceback (most recent call last):
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 441, in get
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent return self._queues[msg_id].get(block=True, timeout=timeout)
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/eventlet/queue.py", line 322, in get
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent return waiter.wait()
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/eventlet/queue.py", line 141, in wait
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent return get_hub().switch()
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/eventlet/hubs/hub.py", line 313, in switch
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent return self.greenlet.switch()
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent _queue.Empty
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent During handling of the above exception, another exception occurred:
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent Traceback (most recent call last):
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/neutron_dynamic_routing/services/bgp/agent/bgp_dragent.py", line 697, in _report_state
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent agent_status = self.state_rpc.report_state(ctx, self.agent_state,
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/neutron/agent/rpc.py", line 104, in report_state
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent return method(context, 'report_state', **kwargs)
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/oslo_messaging/rpc/client.py", line 189, in call
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent result = self.transport._send(
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/oslo_messaging/transport.py", line 123, in _send
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent return self._driver.send(target, ctxt, message,
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 689, in send
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent return self._send(target, ctxt, message, wait_for_reply, timeout,
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 678, in _send
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent result = self._waiter.wait(msg_id, timeout,
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 567, in wait
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent message = self.waiters.get(msg_id, timeout=timeout)
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent File "/usr/lib/python3/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 443, in get
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent raise oslo_messaging.MessagingTimeout(
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 03ed494d17d84502871b667005e9c2d5
                                                                     2023-08-22 04:22:47.146 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent
Aug 22 04:22:47 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:22:47.158 913828 WARNING oslo.service.loopingcall [req-dbc95626-cc0a-4b2e-b236-899237e093f1 - - - - -] Function 'neutron_dynamic_routing.services.bgp.agent.bgp_dragent.BgpDrAgentWithStateReport._report_state' run outlasted interval by 0.02 sec

Aug 22 04:25:39 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:25:39.451 913828 ERROR neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Timeout in RPC method get_bgp_speakers. Waiting for 27 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID e3c6a36356dd47afb08fca31760b3514
Aug 22 04:25:39 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:25:39.456 913828 WARNING neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Increasing timeout for get_bgp_speakers calls to 600 seconds. Restart the agent to restore it to the default value.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID e3c6a36356dd47afb08fca31760b3514

Aug 22 04:26:06 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:26:06.322 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Unable to sync BGP speaker state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID e3c6a36356dd47afb08fca31760b3514
Aug 22 04:26:06 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:26:06.325 913828 WARNING oslo.service.loopingcall [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Function 'neutron.service.Service.periodic_tasks' run outlasted interval by 466.88 sec

Aug 22 04:32:47 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:32:47.175 913828 WARNING oslo.service.loopingcall [req-bc2c34db-c2bd-44f6-8298-2ac7efe0456a - - - - -] Function 'neutron_dynamic_routing.services.bgp.agent.bgp_dragent.BgpDrAgentWithStateReport._report_state' run outlasted interval by 0.02 sec

Aug 22 04:36:06 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:36:06.334 913828 ERROR neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Timeout in RPC method get_bgp_speakers. Waiting for 58 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 2d225072f0be42caa1425f7728e5d612
Aug 22 04:37:04 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:37:04.018 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Unable to sync BGP speaker state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 2d225072f0be42caa1425f7728e5d612
Aug 22 04:37:04 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:37:04.022 913828 WARNING oslo.service.loopingcall [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Function 'neutron.service.Service.periodic_tasks' run outlasted interval by 617.70 sec

Aug 22 04:42:47 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:42:47.188 913828 WARNING oslo.service.loopingcall [req-a14ad202-4c67-46d2-8f1c-d85e80766201 - - - - -] Function 'neutron_dynamic_routing.services.bgp.agent.bgp_dragent.BgpDrAgentWithStateReport._report_state' run outlasted interval by 0.01 sec

Aug 22 04:47:04 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:47:04.033 913828 ERROR neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Timeout in RPC method get_bgp_speakers. Waiting for 32 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID d33e3beb754742a39756e440694c52d0
Aug 22 04:47:36 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:47:36.491 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Unable to sync BGP speaker state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID d33e3beb754742a39756e440694c52d0
Aug 22 04:47:36 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:47:36.495 913828 WARNING oslo.service.loopingcall [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Function 'neutron.service.Service.periodic_tasks' run outlasted interval by 592.47 sec

Aug 22 04:52:47 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:52:47.202 913828 WARNING oslo.service.loopingcall [req-7fea5ac1-e7f5-4276-804f-3991b22b286b - - - - -] Function 'neutron_dynamic_routing.services.bgp.agent.bgp_dragent.BgpDrAgentWithStateReport._report_state' run outlasted interval by 0.01 sec

Aug 22 04:57:36 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:57:36.505 913828 ERROR neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Timeout in RPC method get_bgp_speakers. Waiting for 30 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 6a1de38f26074c5e95c959c2c52ce71e
Aug 22 04:58:06 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:58:06.207 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Unable to sync BGP speaker state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 6a1de38f26074c5e95c959c2c52ce71e
Aug 22 04:58:06 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 04:58:06.211 913828 WARNING oslo.service.loopingcall [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Function 'neutron.service.Service.periodic_tasks' run outlasted interval by 589.71 sec

Aug 22 05:02:47 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 05:02:47.218 913828 WARNING oslo.service.loopingcall [req-40496068-0ca9-49a9-86c1-9b976b7a848c - - - - -] Function 'neutron_dynamic_routing.services.bgp.agent.bgp_dragent.BgpDrAgentWithStateReport._report_state' run outlasted interval by 0.01 sec

Aug 22 05:08:06 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 05:08:06.227 913828 ERROR neutron_lib.rpc [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Timeout in RPC method get_bgp_speakers. Waiting for 54 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 0be3189224cc4d55bbf97a9afb9efd3d
Aug 22 05:09:00 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 05:09:00.501 913828 ERROR neutron_dynamic_routing.services.bgp.agent.bgp_dragent [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Unable to sync BGP speaker state.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 0be3189224cc4d55bbf97a9afb9efd3d
Aug 22 05:09:00 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 05:09:00.505 913828 WARNING oslo.service.loopingcall [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] Function 'neutron.service.Service.periodic_tasks' run outlasted interval by 614.29 sec
Aug 22 05:09:00 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 05:09:00.547 913828 INFO bgpspeaker.api.base [req-e307c164-023d-4a74-94cb-40ccea100eb5 - - - - -] API method core.stop called with args: {}
Aug 22 05:09:00 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 05:09:00.552 913828 INFO bgpspeaker.peer [-] Connection to peer fc00:ca5a:ca5a:1004::1a lost, reason: Connection lost as protocol is no longer active Resetting retry connect loop: False
Aug 22 05:09:00 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 05:09:00.553 913828 INFO neutron_dynamic_routing.services.bgp.agent.driver.os_ken.driver [-] BGP Peer fc00:ca5a:ca5a:1004::1a for remote_as=64664 went DOWN.
Aug 22 05:09:00 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 05:09:00.557 913828 INFO bgpspeaker.speaker [-] Connection lost as protocol is no longer active
Aug 22 05:09:00 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 05:09:00.557 913828 INFO bgpspeaker.peer [-] Connection to peer fc00:ca5a:ca5a:1004::1b lost, reason: Connection lost as protocol is no longer active Resetting retry connect loop: False
Aug 22 05:09:00 dragent-prod-1001 neutron-bgp-dragent[913828]: 2023-08-22 05:09:00.557 913828 INFO neutron_dynamic_routing.services.bgp.agent.driver.os_ken.driver [-] BGP Peer fc00:ca5a:ca5a:1004::1b for remote_as=64664 went DOWN.
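
The "Increasing timeout for get_bgp_speakers calls to ..." warnings above show the per-call timeout going 120, 240, 480 and then 600 seconds. The sketch below just reproduces that sequence as doubling with a cap. The starting value of 60 seconds and the cap of 600 are assumptions inferred from the log values, not taken from the neutron_lib source, and `timeout_schedule` is a made-up name.

```python
def timeout_schedule(initial=60, ceiling=600, steps=5):
    """Return successive per-call timeouts: doubled each time, capped at ceiling.

    initial and ceiling are assumptions chosen to match the values in the log.
    """
    schedule = []
    current = initial
    for _ in range(steps):
        schedule.append(current)
        current = min(current * 2, ceiling)
    return schedule


if __name__ == "__main__":
    print(timeout_schedule())
```

Note that the log also shows a separate wait before each new attempt ("Waiting for 58 seconds", "Waiting for 33 seconds", and so on), which is not modelled here.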

Brian Haley (brian-haley) wrote :

Should this be closed as a duplicate of https://bugs.launchpad.net/neutron/+bug/2006145 ?

Roberto Bartzen Acosta (rbartzen) wrote :

Yes, the simulation process is slightly different (RMQ offline vs rpc timeouts/intermittent) but the behavior is essentially the same.
