Activity log for bug #1666765

Date Who What changed Old value New value Message
2017-02-22 05:19:28 Bjoern bug added bug
2017-02-22 05:19:34 Bjoern openstack-ansible: assignee Bjoern Teipel (bjoern-teipel)
2017-02-22 05:30:00 Bjoern description
Old value:
After debugging the "MessagingTimeout: Timed out waiting for a reply to message ID" issue in Kilo, I realized that we do not configure RPC settings such as rpc_response_timeout for the neutron agents, which do use a few RPC settings like rpc_workers, rpc_response_timeout and possibly others. After I used the same rpc_response_timeout as the neutron server, the L3 agent became operation.

Error:
2017-02-21 06:26:49.503 13484 ERROR neutron.agent.l3.agent [req-d37cf492-e1cc-49ef-b729-d0f7055e238c ] Failed synchronizing routers due to RPC error
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent Traceback (most recent call last):
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/l3/agent.py", line 523, in fetch_and_sync_all_routers
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent routers = self.plugin_rpc.get_routers(context)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/l3/agent.py", line 92, in get_routers
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent router_ids=router_ids)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 156, in call
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent retry=self.retry)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent timeout=timeout, retry=retry)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 350, in send
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent retry=retry)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 339, in _send
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent result = self._waiter.wait(msg_id, timeout)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 243, in wait
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent message = self.waiters.get(msg_id, timeout=timeout)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 149, in get
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent 'to message ID %s' % msg_id)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent MessagingTimeout: Timed out waiting for a reply to message ID 86808c8fcfc9443c84a2b0fd6e6f1710
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent

It is not clear why we fixed this in master but did not backport it, even partially, into the active branches, considering the amount of time it took to troubleshoot this issue. I will go ahead and submit a fix for Mitaka since Newton and newer are already corrected.

New value:
After debugging the "MessagingTimeout: Timed out waiting for a reply to message ID" issue in Kilo, I realized that we do not configure RPC settings such as rpc_response_timeout for the neutron agents, which do use a few RPC settings like rpc_workers, rpc_response_timeout and possibly others. After I used the same rpc_response_timeout as the neutron server, the L3 agent became operational again.

Error:
2017-02-21 06:26:49.503 13484 ERROR neutron.agent.l3.agent [req-d37cf492-e1cc-49ef-b729-d0f7055e238c ] Failed synchronizing routers due to RPC error
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent Traceback (most recent call last):
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/l3/agent.py", line 523, in fetch_and_sync_all_routers
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent routers = self.plugin_rpc.get_routers(context)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/l3/agent.py", line 92, in get_routers
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent router_ids=router_ids)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/rpc/client.py", line 156, in call
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent retry=self.retry)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/transport.py", line 90, in _send
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent timeout=timeout, retry=retry)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 350, in send
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent retry=retry)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 339, in _send
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent result = self._waiter.wait(msg_id, timeout)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 243, in wait
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent message = self.waiters.get(msg_id, timeout=timeout)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent File "/usr/local/lib/python2.7/dist-packages/oslo_messaging/_drivers/amqpdriver.py", line 149, in get
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent 'to message ID %s' % msg_id)
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent MessagingTimeout: Timed out waiting for a reply to message ID 86808c8fcfc9443c84a2b0fd6e6f1710
2017-02-21 06:26:49.503 13484 TRACE neutron.agent.l3.agent

It is not clear why we fixed this in master but did not backport it, even partially, into the active branches, considering the amount of time it took to troubleshoot this issue. I will go ahead and submit a fix for Mitaka since Newton and newer are already corrected.
2017-02-22 19:44:15 OpenStack Infra tags in-stable-mitaka
2017-02-28 16:53:02 Jean-Philippe Evrard openstack-ansible: status New Fix Committed
2018-02-12 16:33:12 Bjoern openstack-ansible: status Fix Committed Fix Released
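
The mismatch described in this bug (the neutron server has a raised rpc_response_timeout while the agents do not) could be checked with a short script. The following is only a rough sketch, not part of the reported fix: it assumes each service's effective configuration is available as a single ini-style string, the service names and the value 180 are hypothetical, and it falls back to 60 seconds, which is the oslo.messaging default for rpc_response_timeout when the option is unset.

```python
import configparser

# oslo.messaging falls back to 60 seconds when rpc_response_timeout is unset.
DEFAULT_RPC_RESPONSE_TIMEOUT = 60


def rpc_response_timeout(ini_text):
    """Return rpc_response_timeout from the [DEFAULT] section of an ini string."""
    # strict=False tolerates repeated options; interpolation=None keeps any
    # literal '%' characters in other option values from raising errors.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(ini_text)
    return parser.getint(
        "DEFAULT", "rpc_response_timeout", fallback=DEFAULT_RPC_RESPONSE_TIMEOUT
    )


def mismatched_agents(server_ini, agent_inis):
    """Map each agent name to its timeout when it differs from the server's."""
    server_timeout = rpc_response_timeout(server_ini)
    mismatches = {}
    for name, ini_text in agent_inis.items():
        agent_timeout = rpc_response_timeout(ini_text)
        if agent_timeout != server_timeout:
            mismatches[name] = agent_timeout
    return mismatches


# Hypothetical configuration contents, for illustration only.
server_conf = "[DEFAULT]\nrpc_response_timeout = 180\n"
agents = {
    "l3-agent": "[DEFAULT]\ndebug = False\n",  # option missing, default applies
    "dhcp-agent": "[DEFAULT]\nrpc_response_timeout = 180\n",
}

print(mismatched_agents(server_conf, agents))
```

An agent that shows up in the result is waiting a different amount of time for RPC replies than the server is configured for, which matches the situation the description attributes the L3 agent timeouts to.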