neutron-l3-agent crashes on boot without a running neutron-server

Bug #1368795 reported by Derek Higgins
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
tripleo   Fix Released   High         Unassigned

Bug Description

Restarting the neutron-l3-agent without a running neutron-server (or presumably before it's ready for connections) results in the traceback below, and the l3 agent terminates after 1 minute.

This in turn causes VMs on the overcloud to fail the boot process. All we see in the CI output is:

2014-09-11 19:08:49.993 | + wait_for 30 10 ping -c 1 192.0.2.46
2014-09-11 19:15:20.422 | Timing out after 300 seconds:
2014-09-11 19:15:20.422 | COMMAND=ping -c 1 192.0.2.46
2014-09-11 19:15:20.422 | OUTPUT=PING 192.0.2.46 (192.0.2.46) 56(84) bytes of data.
2014-09-11 19:15:20.422 | From 192.168.1.110 icmp_seq=1 Destination Host Unreachable
2014-09-11 19:15:20.422 |
2014-09-11 19:15:20.422 | --- 192.0.2.46 ping statistics ---
2014-09-11 19:15:20.422 | 1 packets transmitted, 0 received, +1 errors, 100% packet loss, time 0ms
2014-09-11 19:15:20.463 | + get_state_from_hosts

This output on its own can mean any number of things.
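
For readers unfamiliar with the CI helper in the log above: the `wait_for 30 10 ping ...` line followed by "Timing out after 300 seconds" suggests the two numbers are an attempt count and a sleep interval (30 x 10 s = 300 s). That reading is an inference from the log, not taken from the helper's source. A minimal Python sketch of that behaviour, with `run` and `sleep` as hypothetical injectable parameters so the demo does not actually ping anything:

```python
import subprocess
import time


def wait_for(loops, sleep_seconds, command, run=subprocess.call, sleep=time.sleep):
    """Run command up to `loops` times, sleeping after each failed attempt.

    Returns True as soon as the command exits 0, False once all attempts fail.
    """
    for _ in range(loops):
        if run(command) == 0:
            return True
        sleep(sleep_seconds)
    return False


# Demo with a fake runner that always reports "host unreachable" (exit code 1).
attempts = []


def always_unreachable(command):
    attempts.append(command)
    return 1


reachable = wait_for(
    30, 10, ["ping", "-c", "1", "192.0.2.46"],
    run=always_unreachable,
    sleep=lambda seconds: None,  # skip real sleeping in the demo
)
print(reachable, len(attempts))
```

A helper like this only reports that the address never answered; it cannot tell a dead l3 agent apart from any other cause, which is why the ping timeout alone is ambiguous.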

== l3 agent traceback

CRITICAL neutron [req-1c4f9f4a-3c57-462c-96a0-4ff88dd3ba49 None] MessagingTimeout: Timed out waiting for a reply to message ID e42af92d590b43c8b227630de7624206
 TRACE neutron Traceback (most recent call last):
 TRACE neutron File "/opt/stack/venvs/openstack/bin/neutron-l3-agent", line 10, in <module>
 TRACE neutron sys.exit(main())
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/neutron/agent/l3_agent.py", line 1936, in main
 TRACE neutron manager=manager)
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/neutron/service.py", line 264, in create
 TRACE neutron periodic_fuzzy_delay=periodic_fuzzy_delay)
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/neutron/service.py", line 197, in __init__
 TRACE neutron self.manager = manager_class(host=host, *args, **kwargs)
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/neutron/agent/l3_agent.py", line 1852, in __init__
 TRACE neutron super(L3NATAgentWithStateReport, self).__init__(host=host, conf=conf)
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/neutron/agent/l3_agent.py", line 526, in __init__
 TRACE neutron self.plugin_rpc.get_service_plugin_list(self.context))
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/neutron/agent/l3_agent.py", line 144, in get_service_plugin_list
 TRACE neutron version='1.3')
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/neutron/common/log.py", line 36, in wrapper
 TRACE neutron return method(*args, **kwargs)
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/neutron/common/rpc.py", line 161, in call
 TRACE neutron context, msg, rpc_method='call', **kwargs)
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/neutron/common/rpc.py", line 187, in __call_rpc_method
 TRACE neutron return func(context, msg['method'], **msg['args'])
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/oslo/messaging/rpc/client.py", line 152, in call
 TRACE neutron retry=self.retry)
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/oslo/messaging/transport.py", line 90, in _send
 TRACE neutron timeout=timeout, retry=retry)
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 408, in send
 TRACE neutron retry=retry)
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 397, in _send
 TRACE neutron result = self._waiter.wait(msg_id, timeout)
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 285, in wait
 TRACE neutron reply, ending = self._poll_connection(msg_id, timeout)
 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 235, in _poll_connection
 TRACE neutron % msg_id)
 TRACE neutron MessagingTimeout: Timed out waiting for a reply to message ID e42af92d590b43c8b227630de7624206
 TRACE neutron
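
The traceback shows the agent's `__init__` making a blocking `get_service_plugin_list` RPC call and exiting when that single call times out. One general way to tolerate a server that is not up yet is to retry the call a bounded number of times. The sketch below illustrates that idea only; it is not a description of the change that merged into neutron. `MessagingTimeout` here is a local stand-in for the oslo.messaging exception, and `call_with_retry` and `fake_get_service_plugin_list` are hypothetical names.

```python
import time


class MessagingTimeout(Exception):
    """Local stand-in for oslo.messaging's MessagingTimeout."""


def call_with_retry(rpc_call, attempts=5, delay=1.0, sleep=time.sleep):
    """Call rpc_call() until it succeeds or `attempts` calls have timed out.

    Re-raises the last MessagingTimeout if every attempt times out.
    """
    for attempt in range(1, attempts + 1):
        try:
            return rpc_call()
        except MessagingTimeout:
            if attempt == attempts:
                raise
            sleep(delay)


# Hypothetical server that only starts answering on the third call.
calls = {"n": 0}


def fake_get_service_plugin_list():
    calls["n"] += 1
    if calls["n"] < 3:
        raise MessagingTimeout("no reply")
    return ["router"]


plugins = call_with_retry(
    fake_get_service_plugin_list,
    delay=0,
    sleep=lambda seconds: None,  # skip real sleeping in the demo
)
print(plugins, calls["n"])
```

With a real RPC client each failed attempt already blocks for the full reply timeout, so the attempt count effectively bounds how long the agent waits for neutron-server before giving up.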

Tags: ci
Derek Higgins (derekh) wrote :

Example from
http://logs.openstack.org/99/120799/3/check-tripleo/check-tripleo-novabm-overcloud-f20-nonha/66d3c9e/logs/overcloud-controller0_logs/

2014-09-11 19:08:49.993 | + wait_for 30 10 ping -c 1 192.0.2.46
2014-09-11 19:15:20.422 | Timing out after 300 seconds:
2014-09-11 19:15:20.422 | COMMAND=ping -c 1 192.0.2.46
2014-09-11 19:15:20.422 | OUTPUT=PING 192.0.2.46 (192.0.2.46) 56(84) bytes of data.
2014-09-11 19:15:20.422 | From 192.168.1.110 icmp_seq=1 Destination Host Unreachable
2014-09-11 19:15:20.422 |
2014-09-11 19:15:20.422 | --- 192.0.2.46 ping statistics ---

-- Logs begin at Thu 2014-09-11 18:53:55 UTC, end at Thu 2014-09-11 19:10:07 UTC. --
Sep 11 18:58:06 overcloud-controller0-vktbhu7b7qww systemd[1]: Starting neutron-l3-agent Service...
Sep 11 18:58:06 overcloud-controller0-vktbhu7b7qww systemd[1]: Started neutron-l3-agent Service.
Sep 11 18:58:18 overcloud-controller0-vktbhu7b7qww neutron-l3-agent[30996]: 2014-09-11 18:58:18.723 30996 INFO oslo.messaging._drivers.impl_rabbit [req-25335a75-0bc8-4224-935b-4b29b349a6d7 ] Connecting to AMQP server on 192.0.2.3:5672
Sep 11 18:58:18 overcloud-controller0-vktbhu7b7qww neutron-l3-agent[30996]: 2014-09-11 18:58:18.840 30996 INFO oslo.messaging._drivers.impl_rabbit [req-25335a75-0bc8-4224-935b-4b29b349a6d7 ] Connected to AMQP server on 192.0.2.3:5672
Sep 11 18:58:18 overcloud-controller0-vktbhu7b7qww neutron-l3-agent[30996]: 2014-09-11 18:58:18.864 30996 INFO oslo.messaging._drivers.impl_rabbit [req-25335a75-0bc8-4224-935b-4b29b349a6d7 ] Connecting to AMQP server on 192.0.2.3:5672
Sep 11 18:58:18 overcloud-controller0-vktbhu7b7qww neutron-l3-agent[30996]: 2014-09-11 18:58:18.923 30996 INFO oslo.messaging._drivers.impl_rabbit [req-25335a75-0bc8-4224-935b-4b29b349a6d7 ] Connected to AMQP server on 192.0.2.3:5672
Sep 11 18:59:19 overcloud-controller0-vktbhu7b7qww neutron-l3-agent[30996]: 2014-09-11 18:59:18.989 30996 CRITICAL neutron [req-25335a75-0bc8-4224-935b-4b29b349a6d7 None] MessagingTimeout: Timed out waiting for a reply to message ID 3ca27749373d4520b6bcb197aa928bb4
Sep 11 18:59:19 overcloud-controller0-vktbhu7b7qww neutron-l3-agent[30996]: 2014-09-11 18:59:18.989 30996 TRACE neutron Traceback (most recent call last):
Sep 11 18:59:19 overcloud-controller0-vktbhu7b7qww neutron-l3-agent[30996]: 2014-09-11 18:59:18.989 30996 TRACE neutron File "/opt/stack/venvs/openstack/bin/neutron-l3-agent", line 10, in <module>
Sep 11 18:59:19 overcloud-controller0-vktbhu7b7qww neutron-l3-agent[30996]: 2014-09-11 18:59:18.989 30996 TRACE neutron sys.exit(main())
Sep 11 18:59:19 overcloud-controller0-vktbhu7b7qww neutron-l3-agent[30996]: 2014-09-11 18:59:18.989 30996 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/neutron/agent/l3_agent.py", line 1936, in main
Sep 11 18:59:19 overcloud-controller0-vktbhu7b7qww neutron-l3-agent[30996]: 2014-09-11 18:59:18.989 30996 TRACE neutron manager=manager)
Sep 11 18:59:19 overcloud-controller0-vktbhu7b7qww neutron-l3-agent[30996]: 2014-09-11 18:59:18.989 30996 TRACE neutron File "/opt/stack/venvs/openstack/lib/python2.7/site-packages/neutron/service.py", line 264, in create
Sep 11 18:59:19 overcloud-controller0-vkt...
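
As a quick check on the "1 minute" figure in the description, the gap between the last "Connected to AMQP server" line and the CRITICAL line in the journal above can be computed directly from the two timestamps. This is only arithmetic on the log; that the interval corresponds to the RPC reply timeout is an assumption.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S.%f"

# Timestamps copied from the journal excerpt above.
connected = datetime.strptime("2014-09-11 18:58:18.923", FMT)
timed_out = datetime.strptime("2014-09-11 18:59:18.989", FMT)

elapsed = (timed_out - connected).total_seconds()
print(round(elapsed, 3))
```

The result is just over 60 seconds, which matches the agent terminating about a minute after it connects to the message broker.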

Derek Higgins (derekh) wrote :

A commit addressing this has merged into neutron:
  https://review.openstack.org/#/c/121492/

Changed in tripleo:
status: Triaged → Fix Released