Activity log for bug #2054844

Date Who What changed Old value New value Message
2024-02-23 17:38:21 Shrishail Kariyappanavar bug added bug
2024-02-25 18:33:49 Shrishail Kariyappanavar description

Old value:

While attempting an in-series upgrade (using kolla deploy) with a new set of images, we noticed fairly consistently that RabbitMQ would get into an unstable state after the deploy. The direct impact is generally on communication between neutron-server and the neutron agents: all or some of the agents are unable to reach neutron-server and are therefore declared dead by it.

I am trying to go from 2023.1-cad045b26-20231101 to 2023.1-95b7c30cf-20240222. I have also hit the issue when going from 2023.1-cad045b26-20231101 to 2023.1-cad045b26-<different-date> with some local changes for glance-api.

As a workaround, I stopped all rabbitmq containers first and then started them one by one. I was able to edit the deploy steps to use this logic and have not seen the issue since.

Adding some details from neutron-server, neutron-l3-agent and rabbitmq.

root@5ebbf78d5d16:/muon# openstack network agent list
+--------------------------------------+---------------------------+----------------------------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type                | Host                       | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+---------------------------+----------------------------+-------------------+-------+-------+---------------------------+
| 09f2626a-7793-4329-a453-bb6338247a92 | Metadata agent            | sandbox-skariyap1-muon1003 | None              | XXX   | UP    | neutron-metadata-agent    |
| 21650b7b-f4e5-4fc9-a7c3-f6fe082c4962 | BGP dynamic routing agent | sandbox-skariyap1-muon1003 | None              | XXX   | UP    | neutron-bgp-dragent       |
| 2a69a575-9035-4db2-bd29-641c792825d5 | Open vSwitch agent        | sandbox-skariyap1-muon1002 | None              | XXX   | UP    | neutron-openvswitch-agent |
| 41519097-2a99-480b-92d5-35aca78a0bc7 | L3 agent                  | sandbox-skariyap1-muon1003 | nova              | XXX   | UP    | neutron-l3-agent          |
| 49dd98fd-d123-4b60-b6f8-fa689368cf19 | NIC Switch agent          | sandbox-skariyap1-muon1006 | None              | XXX   | UP    | neutron-sriov-nic-agent   |
| 6138cdd4-0972-41fa-baf7-c23442c1fff3 | Open vSwitch agent        | sandbox-skariyap1-muon1006 | None              | XXX   | UP    | neutron-openvswitch-agent |
| 638fe504-a5f1-49e0-be64-d311d7cb9749 | Metadata agent            | sandbox-skariyap1-muon1002 | None              | XXX   | UP    | neutron-metadata-agent    |
| 64d0283e-aa50-4f97-8313-4987443c3d67 | L3 agent                  | sandbox-1001               | nova              | XXX   | UP    | neutron-l3-agent          |
| 700513c7-21dd-4ca2-8b2c-4d69195377bd | NIC Switch agent          | sandbox-1004               | None              | XXX   | UP    | neutron-sriov-nic-agent   |
| 785b4321-e685-4204-9ccd-46e0d61809a6 | BGP dynamic routing agent | sandbox-1001               | None              | XXX   | UP    | neutron-bgp-dragent       |
| 84d49924-7da4-4675-861d-2ac7e5ad7a28 | NIC Switch agent          | sandbox-1005               | None              | XXX   | UP    | neutron-sriov-nic-agent   |
| 8cc54104-9ef8-4b2e-be43-b4c1bf6d9d9d | Open vSwitch agent        | sandbox-1001               | None              | XXX   | UP    | neutron-openvswitch-agent |
| a8ba3583-19f9-487f-a9a3-504f3ad3aea5 | Metadata agent            | sandbox-1001               | None              | XXX   | UP    | neutron-metadata-agent    |
| aa1b4184-d6e5-4913-8083-e53455f19abc | BGP dynamic routing agent | sandbox-1002               | None              | XXX   | UP    | neutron-bgp-dragent       |
| c85e6021-5bcb-496a-a4cc-4944955687c0 | DHCP agent                | sandbox-1003               | nova              | XXX   | UP    | neutron-dhcp-agent        |
| d61bec18-bfe0-44a6-bd86-a154d4450c97 | DHCP agent                | sandbox-1001               | nova              | XXX   | UP    | neutron-dhcp-agent        |
| d7bb5ec9-85b9-4897-a1be-d8572f9128f3 | DHCP agent                | sandbox-1002               | nova              | XXX   | UP    | neutron-dhcp-agent        |
| d9dadeda-0f44-4c43-be83-8408bf75e9b4 | L3 agent                  | sandbox-1002               | nova              | XXX   | UP    | neutron-l3-agent          |
| dae82255-73a2-4dc7-8045-3f04047f953e | Open vSwitch agent        | sandbox-1003               | None              | XXX   | UP    | neutron-openvswitch-agent |
| f446695a-9533-4b89-97a2-6a0f367a5fbd | Open vSwitch agent        | sandbox-1004               | None              | XXX   | UP    | neutron-openvswitch-agent |
| f8765d4b-a2e3-438d-a915-3bc59f5ed3f6 | Open vSwitch agent        | sandbox-1005               | None              | XXX   | UP    | neutron-openvswitch-agent |
+--------------------------------------+---------------------------+----------------------------+-------------------+-------+-------+---------------------------+

Neutron-server logs:

2024-02-23 17:27:51.838 1025 WARNING neutron.db.agents_db [None req-7722a349-d54f-4b84-bfda-eeef570c0c63 - - - - - -] Agent healthcheck: found 21 dead agents out of 21:
                     Type Last heartbeat      host
           Metadata agent 2024-02-23 03:58:00 sandbox-1003
BGP dynamic routing agent 2024-02-23 03:58:03 sandbox-1003
       Open vSwitch agent 2024-02-23 03:57:17 sandbox-1002
                 L3 agent 2024-02-23 03:57:24 sandbox-1003
         NIC Switch agent 2024-02-23 03:57:54 sandbox-1006
       Open vSwitch agent 2024-02-23 03:57:46 sandbox-1006
           Metadata agent 2024-02-23 03:58:00 sandbox-1002
                 L3 agent 2024-02-23 03:57:55 sandbox-1001
         NIC Switch agent 2024-02-23 03:57:24 sandbox-1004
BGP dynamic routing agent 2024-02-23 03:58:02 sandbox-1001
         NIC Switch agent 2024-02-23 03:57:56 sandbox-1005
       Open vSwitch agent 2024-02-23 03:57:47 sandbox-1001
           Metadata agent 2024-02-23 03:58:01 sandbox-1001
BGP dynamic routing agent 2024-02-23 03:58:02 sandbox-1002
               DHCP agent 2024-02-23 03:57:45 sandbox-1003
               DHCP agent 2024-02-23 03:57:45 sandbox-1001
               DHCP agent 2024-02-23 03:57:45 sandbox-1002
                 L3 agent 2024-02-23 03:57:23 sandbox-1002
       Open vSwitch agent 2024-02-23 03:57:17 sandbox-1003
       Open vSwitch agent 2024-02-23 03:57:46 sandbox-1004
       Open vSwitch agent 2024-02-23 03:57:46 sandbox-1005
2024-02-23 17:27:51.885 1025 WARNING neutron.db.agentschedulers_db [None req-53fb4a11-1c42-49b2-bd33-a15168410fff - - - - - -] Agent d61bec18-bfe0-44a6-bd86-a154d4450c97 is down. Type: DHCP agent, host: sandbox-1001, heartbeat: 2024-02-23 03:57:45
2024-02-23 17:27:51.889 1025 WARNING neutron.db.agentschedulers_db [None req-53fb4a11-1c42-49b2-bd33-a15168410fff - - - - - -] Agent d7bb5ec9-85b9-4897-a1be-d8572f9128f3 is down. Type: DHCP agent, host: sandbox-1002, heartbeat: 2024-02-23 03:57:45
2024-02-23 17:27:51.892 1025 WARNING neutron.db.agentschedulers_db [None req-53fb4a11-1c42-49b2-bd33-a15168410fff - - - - - -] Agent c85e6021-5bcb-496a-a4cc-4944955687c0 is down. Type: DHCP agent, host: sandbox-1003, heartbeat: 2024-02-23 03:57:45
2024-02-23 17:27:51.895 1025 WARNING neutron.db.agentschedulers_db [None req-53fb4a11-1c42-49b2-bd33-a15168410fff - - - - - -] No DHCP agents available, skipping rescheduling
2024-02-23 17:27:52.524 1015 INFO neutron.wsgi [None req-8c7530f6-6297-4aa9-919d-6c178a59684f 3c6d7d854110451eaafcac1a84d61ba4 bf32d504d5c7413b9bb89007389d3f1d - - default default] 169.254.101.12,127.0.0.1 "GET /v2.0/routers HTTP/1.1" status: 200 len: 1668 time: 0.1564512
2024-02-23 17:27:52.788 1015 INFO neutron.wsgi [None req-dacfaf45-d63f-40e1-8dd4-48bfe93d75bf 3c6d7d854110451eaafcac1a84d61ba4 bf32d504d5c7413b9bb89007389d3f1d - - default default] 169.254.101.12,127.0.0.1 "GET /v2.0/network-ip-availabilities HTTP/1.1" status: 200 len: 3060 time: 0.0114617
2024-02-23 17:27:52.876 1015 INFO neutron.wsgi [None req-88212439-8200-43d8-ba5f-b0be15862a38 3c6d7d854110451eaafcac1a84d61ba4 bf32d504d5c7413b9bb89007389d3f1d - - default default] 169.254.101.12,127.0.0.1 "GET /v2.0/floatingips HTTP/1.1" status: 200 len: 193 time: 0.0225327
2024-02-23 17:27:53.008 1015 INFO neutron

neutron-l3-agent logs:

2024-02-23 16:30:08.306 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID 1d634d394abf468b8475b741bd9b27a8.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 1d634d394abf468b8475b741bd9b27a8
2024-02-23 16:40:08.309 1055 ERROR neutron_lib.rpc [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] Timeout in RPC method get_host_ha_router_count. Waiting for 15 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID fad31618c5294f6a9e2f1c37d442c959
2024-02-23 16:40:22.925 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID fad31618c5294f6a9e2f1c37d442c959.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID fad31618c5294f6a9e2f1c37d442c959
2024-02-23 16:50:22.933 1055 ERROR neutron_lib.rpc [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] Timeout in RPC method get_host_ha_router_count. Waiting for 9 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 9fbb1e107e56464883a7e32dae311967
2024-02-23 16:50:32.349 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID 9fbb1e107e56464883a7e32dae311967.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 9fbb1e107e56464883a7e32dae311967
2024-02-23 17:00:32.353 1055 ERROR neutron_lib.rpc [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] Timeout in RPC method get_host_ha_router_count. Waiting for 28 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 38d94fd2657b479b8980aafca6813de7
2024-02-23 17:01:00.725 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID 38d94fd2657b479b8980aafca6813de7.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 38d94fd2657b479b8980aafca6813de7
2024-02-23 17:11:00.732 1055 ERROR neutron_lib.rpc [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] Timeout in RPC method get_host_ha_router_count. Waiting for 60 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 38deafbd33254872bc2d955a01cce2c9
2024-02-23 17:12:00.690 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID 38deafbd33254872bc2d955a01cce2c9.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 38deafbd33254872bc2d955a01cce2c9
2024-02-23 17:22:00.694 1055 ERROR neutron_lib.rpc [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] Timeout in RPC method get_host_ha_router_count. Waiting for 58 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 8d206bc20222443c81525fdd93d276eb
2024-02-23 17:22:58.324 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID 8d206bc20222443c81525fdd93d276eb.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 8d206bc20222443c81525fdd93d276eb

rabbitmq cluster status:

(rabbitmq)[rabbitmq@sandbox-1001 /]$ rabbitmqctl cluster_status
Cluster status of node rabbit@sandbox-1001 ...
Basics

Cluster name: rabbit@sandbox-1001
Total CPU cores available cluster-wide: 30

Disk Nodes

rabbit@sandbox-1001
rabbit@sandbox-1002
rabbit@sandbox-1003

Running Nodes

rabbit@sandbox-1001
rabbit@sandbox-1002
rabbit@sandbox-1003

Versions

rabbit@sandbox-1001: RabbitMQ 3.11.28 on Erlang 25.3.2.9
rabbit@sandbox-1002: RabbitMQ 3.11.28 on Erlang 25.3.2.9
rabbit@sandbox-1003: RabbitMQ 3.11.28 on Erlang 25.3.2.9

CPU Cores

Node: rabbit@sandbox-1001, available CPU cores: 10
Node: rabbit@sandbox-1002, available CPU cores: 10
Node: rabbit@sandbox-1003, available CPU cores: 10

Maintenance status

Node: rabbit@sandbox-1001, status: not under maintenance
Node: rabbit@sandbox-1002, status: not under maintenance
Node: rabbit@sandbox-1003, status: not under maintenance

Alarms

(none)

Network Partitions

(none)

Listeners

Node: rabbit@sandbox-1001, interface: 169.254.101.11, port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@sandbox-1001, interface: 169.254.101.11, port: 15692, protocol: http/prometheus, purpose: Prometheus exporter API over HTTP
Node: rabbit@sandbox-1001, interface: 169.254.101.11, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@sandbox-1001, interface: 169.254.101.11, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@sandbox-1002, interface: 169.254.101.12, port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@sandbox-1002, interface: 169.254.101.12, port: 15692, protocol: http/prometheus, purpose: Prometheus exporter API over HTTP
Node: rabbit@sandbox-1002, interface: 169.254.101.12, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@sandbox-1002, interface: 169.254.101.12, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@sandbox-1003, interface: 169.254.101.13, port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@sandbox-1003, interface: 169.254.101.13, port: 15692, protocol: http/prometheus, purpose: Prometheus exporter API over HTTP
Node: rabbit@sandbox-1003, interface: 169.254.101.13, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@sandbox-1003, interface: 169.254.101.13, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0

Feature flags

Flag: classic_mirrored_queue_version, state: enabled
Flag: classic_queue_type_delivery_support, state: enabled
Flag: direct_exchange_routing_v2, state: enabled
Flag: drop_unroutable_metric, state: enabled
Flag: empty_basic_get_metric, state: enabled
Flag: feature_flags_v2, state: enabled
Flag: implicit_default_bindings, state: enabled
Flag: listener_records_in_ets, state: enabled
Flag: maintenance_mode_status, state: enabled
Flag: quorum_queue, state: enabled
Flag: stream_queue, state: enabled
Flag: stream_sac_coordinator_unblock_group, state: enabled
Flag: stream_single_active_consumer, state: enabled
Flag: tracking_records_in_ets, state: enabled
Flag: user_limits, state: enabled
Flag: virtual_host_metadata, state: enabled

New value:

While attempting an in-series upgrade (using kolla deploy) with a new set of images, we noticed fairly consistently that RabbitMQ would get into an unstable state after the deploy. The direct impact is generally on communication between neutron-server and the neutron agents: all or some of the agents are unable to reach neutron-server and are therefore declared dead by it.

I am trying to go from 2023.1-cad045b26-20231101 to 2023.1-95b7c30cf-20240222. I have also hit the issue when going from 2023.1-cad045b26-20231101 to 2023.1-cad045b26-<different-date> with some local changes for glance-api.

As a workaround, I stopped all rabbitmq containers first and then started them one by one. I was able to edit the deploy steps to use this logic and have not seen the issue since.

Adding some details from neutron-server, neutron-l3-agent and rabbitmq.
root@5ebbf78d5d16:/# openstack network agent list
+--------------------------------------+---------------------------+----------------------------+-------------------+-------+-------+---------------------------+
| ID                                   | Agent Type                | Host                       | Availability Zone | Alive | State | Binary                    |
+--------------------------------------+---------------------------+----------------------------+-------------------+-------+-------+---------------------------+
| 09f2626a-7793-4329-a453-bb6338247a92 | Metadata agent            | sandbox-1003               | None              | XXX   | UP    | neutron-metadata-agent    |
| 21650b7b-f4e5-4fc9-a7c3-f6fe082c4962 | BGP dynamic routing agent | sandbox-1003               | None              | XXX   | UP    | neutron-bgp-dragent       |
| 2a69a575-9035-4db2-bd29-641c792825d5 | Open vSwitch agent        | sandbox-1002               | None              | XXX   | UP    | neutron-openvswitch-agent |
| 41519097-2a99-480b-92d5-35aca78a0bc7 | L3 agent                  | sandbox-1003               | nova              | XXX   | UP    | neutron-l3-agent          |
| 49dd98fd-d123-4b60-b6f8-fa689368cf19 | NIC Switch agent          | sandbox-1006               | None              | XXX   | UP    | neutron-sriov-nic-agent   |
| 6138cdd4-0972-41fa-baf7-c23442c1fff3 | Open vSwitch agent        | sandbox-1006               | None              | XXX   | UP    | neutron-openvswitch-agent |
| 638fe504-a5f1-49e0-be64-d311d7cb9749 | Metadata agent            | sandbox-1002               | None              | XXX   | UP    | neutron-metadata-agent    |
| 64d0283e-aa50-4f97-8313-4987443c3d67 | L3 agent                  | sandbox-1001               | nova              | XXX   | UP    | neutron-l3-agent          |
| 700513c7-21dd-4ca2-8b2c-4d69195377bd | NIC Switch agent          | sandbox-1004               | None              | XXX   | UP    | neutron-sriov-nic-agent   |
| 785b4321-e685-4204-9ccd-46e0d61809a6 | BGP dynamic routing agent | sandbox-1001               | None              | XXX   | UP    | neutron-bgp-dragent       |
| 84d49924-7da4-4675-861d-2ac7e5ad7a28 | NIC Switch agent          | sandbox-1005               | None              | XXX   | UP    | neutron-sriov-nic-agent   |
| 8cc54104-9ef8-4b2e-be43-b4c1bf6d9d9d | Open vSwitch agent        | sandbox-1001               | None              | XXX   | UP    | neutron-openvswitch-agent |
| a8ba3583-19f9-487f-a9a3-504f3ad3aea5 | Metadata agent            | sandbox-1001               | None              | XXX   | UP    | neutron-metadata-agent    |
| aa1b4184-d6e5-4913-8083-e53455f19abc | BGP dynamic routing agent | sandbox-1002               | None              | XXX   | UP    | neutron-bgp-dragent       |
| c85e6021-5bcb-496a-a4cc-4944955687c0 | DHCP agent                | sandbox-1003               | nova              | XXX   | UP    | neutron-dhcp-agent        |
| d61bec18-bfe0-44a6-bd86-a154d4450c97 | DHCP agent                | sandbox-1001               | nova              | XXX   | UP    | neutron-dhcp-agent        |
| d7bb5ec9-85b9-4897-a1be-d8572f9128f3 | DHCP agent                | sandbox-1002               | nova              | XXX   | UP    | neutron-dhcp-agent        |
| d9dadeda-0f44-4c43-be83-8408bf75e9b4 | L3 agent                  | sandbox-1002               | nova              | XXX   | UP    | neutron-l3-agent          |
| dae82255-73a2-4dc7-8045-3f04047f953e | Open vSwitch agent        | sandbox-1003               | None              | XXX   | UP    | neutron-openvswitch-agent |
| f446695a-9533-4b89-97a2-6a0f367a5fbd | Open vSwitch agent        | sandbox-1004               | None              | XXX   | UP    | neutron-openvswitch-agent |
| f8765d4b-a2e3-438d-a915-3bc59f5ed3f6 | Open vSwitch agent        | sandbox-1005               | None              | XXX   | UP    | neutron-openvswitch-agent |
+--------------------------------------+---------------------------+----------------------------+-------------------+-------+-------+---------------------------+

Neutron-server logs:

2024-02-23 17:27:51.838 1025 WARNING neutron.db.agents_db [None req-7722a349-d54f-4b84-bfda-eeef570c0c63 - - - - - -] Agent healthcheck: found 21 dead agents out of 21:
                     Type Last heartbeat      host
           Metadata agent 2024-02-23 03:58:00 sandbox-1003
BGP dynamic routing agent 2024-02-23 03:58:03 sandbox-1003
       Open vSwitch agent 2024-02-23 03:57:17 sandbox-1002
                 L3 agent 2024-02-23 03:57:24 sandbox-1003
         NIC Switch agent 2024-02-23 03:57:54 sandbox-1006
       Open vSwitch agent 2024-02-23 03:57:46 sandbox-1006
           Metadata agent 2024-02-23 03:58:00 sandbox-1002
                 L3 agent 2024-02-23 03:57:55 sandbox-1001
         NIC Switch agent 2024-02-23 03:57:24 sandbox-1004
BGP dynamic routing agent 2024-02-23 03:58:02 sandbox-1001
         NIC Switch agent 2024-02-23 03:57:56 sandbox-1005
       Open vSwitch agent 2024-02-23 03:57:47 sandbox-1001
           Metadata agent 2024-02-23 03:58:01 sandbox-1001
BGP dynamic routing agent 2024-02-23 03:58:02 sandbox-1002
               DHCP agent 2024-02-23 03:57:45 sandbox-1003
               DHCP agent 2024-02-23 03:57:45 sandbox-1001
               DHCP agent 2024-02-23 03:57:45 sandbox-1002
                 L3 agent 2024-02-23 03:57:23 sandbox-1002
       Open vSwitch agent 2024-02-23 03:57:17 sandbox-1003
       Open vSwitch agent 2024-02-23 03:57:46 sandbox-1004
       Open vSwitch agent 2024-02-23 03:57:46 sandbox-1005
2024-02-23 17:27:51.885 1025 WARNING neutron.db.agentschedulers_db [None req-53fb4a11-1c42-49b2-bd33-a15168410fff - - - - - -] Agent d61bec18-bfe0-44a6-bd86-a154d4450c97 is down. Type: DHCP agent, host: sandbox-1001, heartbeat: 2024-02-23 03:57:45
2024-02-23 17:27:51.889 1025 WARNING neutron.db.agentschedulers_db [None req-53fb4a11-1c42-49b2-bd33-a15168410fff - - - - - -] Agent d7bb5ec9-85b9-4897-a1be-d8572f9128f3 is down. Type: DHCP agent, host: sandbox-1002, heartbeat: 2024-02-23 03:57:45
2024-02-23 17:27:51.892 1025 WARNING neutron.db.agentschedulers_db [None req-53fb4a11-1c42-49b2-bd33-a15168410fff - - - - - -] Agent c85e6021-5bcb-496a-a4cc-4944955687c0 is down. Type: DHCP agent, host: sandbox-1003, heartbeat: 2024-02-23 03:57:45
2024-02-23 17:27:51.895 1025 WARNING neutron.db.agentschedulers_db [None req-53fb4a11-1c42-49b2-bd33-a15168410fff - - - - - -] No DHCP agents available, skipping rescheduling
2024-02-23 17:27:52.524 1015 INFO neutron.wsgi [None req-8c7530f6-6297-4aa9-919d-6c178a59684f 3c6d7d854110451eaafcac1a84d61ba4 bf32d504d5c7413b9bb89007389d3f1d - - default default] 169.254.101.12,127.0.0.1 "GET /v2.0/routers HTTP/1.1" status: 200 len: 1668 time: 0.1564512
2024-02-23 17:27:52.788 1015 INFO neutron.wsgi [None req-dacfaf45-d63f-40e1-8dd4-48bfe93d75bf 3c6d7d854110451eaafcac1a84d61ba4 bf32d504d5c7413b9bb89007389d3f1d - - default default] 169.254.101.12,127.0.0.1 "GET /v2.0/network-ip-availabilities HTTP/1.1" status: 200 len: 3060 time: 0.0114617
2024-02-23 17:27:52.876 1015 INFO neutron.wsgi [None req-88212439-8200-43d8-ba5f-b0be15862a38 3c6d7d854110451eaafcac1a84d61ba4 bf32d504d5c7413b9bb89007389d3f1d - - default default] 169.254.101.12,127.0.0.1 "GET /v2.0/floatingips HTTP/1.1" status: 200 len: 193 time: 0.0225327
2024-02-23 17:27:53.008 1015 INFO neutron

neutron-l3-agent logs:

2024-02-23 16:30:08.306 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID 1d634d394abf468b8475b741bd9b27a8.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 1d634d394abf468b8475b741bd9b27a8
2024-02-23 16:40:08.309 1055 ERROR neutron_lib.rpc [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] Timeout in RPC method get_host_ha_router_count. Waiting for 15 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID fad31618c5294f6a9e2f1c37d442c959
2024-02-23 16:40:22.925 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID fad31618c5294f6a9e2f1c37d442c959.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID fad31618c5294f6a9e2f1c37d442c959
2024-02-23 16:50:22.933 1055 ERROR neutron_lib.rpc [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] Timeout in RPC method get_host_ha_router_count. Waiting for 9 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 9fbb1e107e56464883a7e32dae311967
2024-02-23 16:50:32.349 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID 9fbb1e107e56464883a7e32dae311967.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 9fbb1e107e56464883a7e32dae311967
2024-02-23 17:00:32.353 1055 ERROR neutron_lib.rpc [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] Timeout in RPC method get_host_ha_router_count. Waiting for 28 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 38d94fd2657b479b8980aafca6813de7
2024-02-23 17:01:00.725 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID 38d94fd2657b479b8980aafca6813de7.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 38d94fd2657b479b8980aafca6813de7
2024-02-23 17:11:00.732 1055 ERROR neutron_lib.rpc [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] Timeout in RPC method get_host_ha_router_count. Waiting for 60 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 38deafbd33254872bc2d955a01cce2c9
2024-02-23 17:12:00.690 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID 38deafbd33254872bc2d955a01cce2c9.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 38deafbd33254872bc2d955a01cce2c9
2024-02-23 17:22:00.694 1055 ERROR neutron_lib.rpc [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] Timeout in RPC method get_host_ha_router_count. Waiting for 58 seconds before next attempt. If the server is not down, consider increasing the rpc_response_timeout option as Neutron server(s) may be overloaded and unable to respond quickly enough.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 8d206bc20222443c81525fdd93d276eb
2024-02-23 17:22:58.324 1055 WARNING neutron.agent.l3.agent [None req-812757eb-1f4a-46c7-9a78-9939dd5901e5 - - - - - -] l3-agent cannot contact neutron server to retrieve HA router count. Check connectivity to neutron server. Retrying... Detailed message: Timed out waiting for a reply to message ID 8d206bc20222443c81525fdd93d276eb.: oslo_messaging.exceptions.MessagingTimeout: Timed out waiting for a reply to message ID 8d206bc20222443c81525fdd93d276eb

rabbitmq cluster status:

(rabbitmq)[rabbitmq@sandbox-1001 /]$ rabbitmqctl cluster_status
Cluster status of node rabbit@sandbox-1001 ...
Basics

Cluster name: rabbit@sandbox-1001
Total CPU cores available cluster-wide: 30

Disk Nodes

rabbit@sandbox-1001
rabbit@sandbox-1002
rabbit@sandbox-1003

Running Nodes

rabbit@sandbox-1001
rabbit@sandbox-1002
rabbit@sandbox-1003

Versions

rabbit@sandbox-1001: RabbitMQ 3.11.28 on Erlang 25.3.2.9
rabbit@sandbox-1002: RabbitMQ 3.11.28 on Erlang 25.3.2.9
rabbit@sandbox-1003: RabbitMQ 3.11.28 on Erlang 25.3.2.9

CPU Cores

Node: rabbit@sandbox-1001, available CPU cores: 10
Node: rabbit@sandbox-1002, available CPU cores: 10
Node: rabbit@sandbox-1003, available CPU cores: 10

Maintenance status

Node: rabbit@sandbox-1001, status: not under maintenance
Node: rabbit@sandbox-1002, status: not under maintenance
Node: rabbit@sandbox-1003, status: not under maintenance

Alarms

(none)

Network Partitions

(none)

Listeners

Node: rabbit@sandbox-1001, interface: 169.254.101.11, port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@sandbox-1001, interface: 169.254.101.11, port: 15692, protocol: http/prometheus, purpose: Prometheus exporter API over HTTP
Node: rabbit@sandbox-1001, interface: 169.254.101.11, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@sandbox-1001, interface: 169.254.101.11, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@sandbox-1002, interface: 169.254.101.12, port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@sandbox-1002, interface: 169.254.101.12, port: 15692, protocol: http/prometheus, purpose: Prometheus exporter API over HTTP
Node: rabbit@sandbox-1002, interface: 169.254.101.12, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@sandbox-1002, interface: 169.254.101.12, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0
Node: rabbit@sandbox-1003, interface: 169.254.101.13, port: 15672, protocol: http, purpose: HTTP API
Node: rabbit@sandbox-1003, interface: 169.254.101.13, port: 15692, protocol: http/prometheus, purpose: Prometheus exporter API over HTTP
Node: rabbit@sandbox-1003, interface: 169.254.101.13, port: 25672, protocol: clustering, purpose: inter-node and CLI tool communication
Node: rabbit@sandbox-1003, interface: 169.254.101.13, port: 5672, protocol: amqp, purpose: AMQP 0-9-1 and AMQP 1.0

Feature flags

Flag: classic_mirrored_queue_version, state: enabled
Flag: classic_queue_type_delivery_support, state: enabled
Flag: direct_exchange_routing_v2, state: enabled
Flag: drop_unroutable_metric, state: enabled
Flag: empty_basic_get_metric, state: enabled
Flag: feature_flags_v2, state: enabled
Flag: implicit_default_bindings, state: enabled
Flag: listener_records_in_ets, state: enabled
Flag: maintenance_mode_status, state: enabled
Flag: quorum_queue, state: enabled
Flag: stream_queue, state: enabled
Flag: stream_sac_coordinator_unblock_group, state: enabled
Flag: stream_single_active_consumer, state: enabled
Flag: tracking_records_in_ets, state: enabled
Flag: user_limits, state: enabled
Flag: virtual_host_metadata, state: enabled
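
The workaround described in the report (stop every rabbitmq container, then start them one at a time) could be scripted roughly as below. This is only a sketch under several assumptions that the report does not confirm: the controllers are reachable over ssh, the container engine is docker, the container is named "rabbitmq", and the three sandbox hosts from the cluster status output are the RabbitMQ nodes. Starting the nodes in reverse stop order and waiting with `rabbitmqctl await_startup` between nodes are choices made for this sketch, not steps taken from the reporter's edited deploy logic.

```python
import subprocess
from typing import Callable, List, Sequence

Command = List[str]


def build_restart_plan(hosts: Sequence[str], container: str = "rabbitmq") -> List[Command]:
    """Return the commands for a full stop followed by a one-by-one start.

    The container name and the ssh/docker transport are assumptions.
    """
    plan: List[Command] = []
    # Phase 1: stop the broker on every host before starting any of them.
    for host in hosts:
        plan.append(["ssh", host, "docker", "stop", container])
    # Phase 2: start one host at a time (last stopped, first started) and
    # wait for the broker to finish booting before moving to the next host.
    for host in reversed(hosts):
        plan.append(["ssh", host, "docker", "start", container])
        plan.append(
            ["ssh", host, "docker", "exec", container, "rabbitmqctl", "await_startup"]
        )
    return plan


def run_plan(plan: Sequence[Command], runner: Callable[[Command], None]) -> None:
    """Run each command in order; the runner decides how to execute it."""
    for command in plan:
        runner(command)


def run_checked(command: Command) -> None:
    """Execute a command and raise if it exits with a non-zero status."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Dry run: print the plan instead of executing it.
    hosts = ["sandbox-1001", "sandbox-1002", "sandbox-1003"]
    for command in build_restart_plan(hosts):
        print(" ".join(command))
```

To execute the plan rather than print it, pass `run_checked` as the runner: `run_plan(build_restart_plan(hosts), run_checked)`. Keeping the plan as plain data makes it easy to review the exact order of operations before anything is stopped.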