neutron metadata service running but not responding properly

Bug #1494227 reported by Darren Birkett
This bug affects 1 person
Affects            Status   Importance  Assigned to  Milestone
OpenStack-Ansible  Invalid  Undecided   Unassigned
Trunk              Invalid  Undecided   Unassigned

Bug Description

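The neutron-ns-metadata-proxy service was running but not listening. On infra-node1, nothing is bound to port 80 inside the qdhcp namespace, unlike on infra-node2 and infra-node3: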

root@infra-node1:/opt/os-ansible-deployment/rpc_deployment# ansible neutron_agent -m shell -a "ip netns exec qdhcp-a5ad7a1d-d3a6-4180-8d61-07a23f6fb449 netstat -ntlp"
infra-node2_neutron_agents_container-544efdd7 | success | rc=0 >>
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 20706/python
tcp 0 0 172.22.89.19:53 0.0.0.0:* LISTEN 31527/dnsmasq
tcp 0 0 169.254.169.254:53 0.0.0.0:* LISTEN 31527/dnsmasq
tcp6 0 0 fe80::f816:3eff:fe16:53 :::* LISTEN 31527/dnsmasq

infra-node1_neutron_agents_container-1c74674d | success | rc=0 >>
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 172.22.110.222:53 0.0.0.0:* LISTEN 2186/dnsmasq
tcp 0 0 169.254.169.254:53 0.0.0.0:* LISTEN 2186/dnsmasq
tcp6 0 0 fe80::f816:3eff:fec3:53 :::* LISTEN 2186/dnsmasq

infra-node3_neutron_agents_container-69cf97bb | success | rc=0 >>
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 1138/python
tcp 0 0 172.22.64.4:53 0.0.0.0:* LISTEN 17718/dnsmasq
tcp 0 0 169.254.169.254:53 0.0.0.0:* LISTEN 17718/dnsmasq
tcp6 0 0 fe80::f816:3eff:fea1:53 :::* LISTEN 17718/dnsmasq

I had to kill it and then restart the neutron-dhcp-agent, which resolved the problem.
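
The manual comparison of the netstat output above could be automated. The following is a minimal sketch, assuming the `netstat -ntlp` column layout shown in this report; the function name `listeners_on_port` and the sample constants are hypothetical and only illustrate the idea.

```python
# Sample rows copied from the netstat output in this report.
SAMPLE_HEALTHY = """\
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 20706/python
tcp 0 0 172.22.89.19:53 0.0.0.0:* LISTEN 31527/dnsmasq
"""

SAMPLE_BROKEN = """\
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 172.22.110.222:53 0.0.0.0:* LISTEN 2186/dnsmasq
tcp 0 0 169.254.169.254:53 0.0.0.0:* LISTEN 2186/dnsmasq
"""


def listeners_on_port(netstat_output, port=80):
    """Return the PID/program entries listening on `port` in `netstat -ntlp` output."""
    found = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Listener rows look like:
        #   tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN 20706/python
        if len(fields) < 7 or not fields[0].startswith("tcp"):
            continue
        if fields[5] != "LISTEN":
            continue
        # The port is whatever follows the last colon (this also handles tcp6 rows).
        local_port = fields[3].rsplit(":", 1)[-1]
        if local_port == str(port):
            found.append(fields[6])
    return found


if __name__ == "__main__":
    print("healthy:", listeners_on_port(SAMPLE_HEALTHY))
    print("broken:", listeners_on_port(SAMPLE_BROKEN))
```

An empty result for a namespace that should serve metadata corresponds to the infra-node1 case above.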

Found this in the neutron-metadata-agent logs:

2015-05-29 03:25:06.270 648 ERROR oslo.messaging._drivers.impl_rabbit [-] Failed to publish message to topic 'q-plugin': [Errno 104] Connection reset by peer
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit Traceback (most recent call last):
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 655, in ensure
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit return method()
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 752, in _publish
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit publisher = cls(self.conf, self.channel, topic=topic, **kwargs)
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 378, in __init__
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit **options)
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 326, in __init__
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit self.reconnect(channel)
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/impl_rabbit.py", line 334, in reconnect
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit routing_key=self.routing_key)
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/kombu/messaging.py", line 82, in __init__
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit self.revive(self._channel)
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/kombu/messaging.py", line 216, in revive
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit self.declare()
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/kombu/messaging.py", line 102, in declare
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit self.exchange.declare()
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/kombu/entity.py", line 166, in declare
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit nowait=nowait, passive=passive,
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/amqp/channel.py", line 613, in exchange_declare
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit self._send_method((40, 10), args)
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 56, in _send_method
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit self.channel_id, method_sig, args, content,
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/amqp/method_framing.py", line 221, in write_method
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit write_frame(1, channel, payload)
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/amqp/transport.py", line 182, in write_frame
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit frame_type, channel, size, payload, 0xce,
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/eventlet/greenio.py", line 359, in sendall
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit tail = self.send(data, flags)
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit File "/usr/local/lib/python2.7/dist-packages/eventlet/greenio.py", line 342, in send
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit total_sent += fd.send(data[total_sent:], flags)
2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit error: [Errno 104] Connection reset by peer

And this in the neutron-dhcp-agent log:

2015-07-25 08:06:27.452 1863 ERROR neutron.agent.dhcp_agent [-] Unable to sync network state.
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent Traceback (most recent call last):
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/dhcp_agent.py", line 164, in sync_state
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent active_networks = self.plugin_rpc.get_active_networks_info()
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent File "/usr/local/lib/python2.7/dist-packages/neutron/agent/dhcp_agent.py", line 421, in get_active_networks_info
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent host=self.host))
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent File "/usr/local/lib/python2.7/dist-packages/neutron/common/log.py", line 34, in wrapper
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent return method(*args, **kwargs)
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent File "/usr/local/lib/python2.7/dist-packages/neutron/common/rpc.py", line 161, in call
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent context, msg, rpc_method='call', **kwargs)
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent File "/usr/local/lib/python2.7/dist-packages/neutron/common/rpc.py", line 187, in __call_rpc_method
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent return func(context, msg['method'], **msg['args'])
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/client.py", line 389, in call
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent return self.prepare().call(ctxt, method, **kwargs)
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/client.py", line 152, in call
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent retry=self.retry)
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/transport.py", line 90, in _send
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent timeout=timeout, retry=retry)
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 416, in send
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent retry=retry)
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/_drivers/amqpdriver.py", line 407, in _send
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent raise result
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent RemoteError: Remote error: OperationalError (OperationalError) (1040, 'Too many connections') None None
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent [u'Traceback (most recent call last):\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply\n incoming.message))\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch\n return self._do_dispatch(endpoint, method, ctxt, args)\n', u' File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch\n result = getattr(endpoint, method)(ctxt, **new_args)\n', u' File "/usr/local/lib/python2.7/dist-packages/neutron/api/rpc/handlers/dhcp_rpc.py", line 100, in get_active_networks_info\n networks = self.get_active_networks(context, **kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/neutron/api/rpc/handlers/dhcp_rpc.py", line 49, in get_active_networks\n plugin.auto_schedule_networks(context, host)\n', u' File "/usr/local/lib/python2.7/dist-packages/neutron/db/agentschedulers_db.py", line 226, in auto_schedule_networks\n self.network_scheduler.auto_schedule_networks(self, context, host)\n', u' File "/usr/local/lib/python2.7/dist-packages/neutron/scheduler/dhcp_agent_scheduler.py", line 104, in auto_schedule_networks\n subnets = plugin.get_subnets(context, fields=fields)\n', u' File "/usr/local/lib/python2.7/dist-packages/neutron/db/db_base_plugin_v2.py", line 1302, in get_subnets\n self._make_subnet_dict,\n', u' File "/usr/local/lib/python2.7/dist-packages/neutron/db/common_db_mixin.py", line 176, in _get_collection\n items = [dict_func(c, fields) for c in query]\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2441, in __iter__\n return self._execute_and_instances(context)\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2454, in _execute_and_instances\n close_with_result=True)\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/query.py", line 2445, in _connection_from_session\n **kw)\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 880, in connection\n execution_options=execution_options)\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 885, in _connection_for_bind\n engine, execution_options)\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/orm/session.py", line 326, in _connection_for_bind\n conn = bind.contextual_connect()\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/base.py", line 1910, in contextual_connect\n self.pool.connect(),\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 338, in connect\n return _ConnectionFairy._checkout(self)\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 645, in _checkout\n fairy = _ConnectionRecord.checkout(pool)\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 442, in checkout\n dbapi_connection = rec.get_connection()\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 526, in get_connection\n self.connection = self.__connect()\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/pool.py", line 539, in __connect\n connection = self.__pool._creator()\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/strategies.py", line 96, in connect\n connection_invalidated=invalidated\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause\n reraise(type(exception), exception, tb=exc_tb)\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/strategies.py", line 90, in connect\n return dialect.connect(*cargs, **cparams)\n', u' File "/usr/local/lib/python2.7/dist-packages/sqlalchemy/engine/default.py", line 377, in connect\n return self.dbapi.connect(*cargs, **cparams)\n', u' File "/usr/local/lib/python2.7/dist-packages/MySQLdb/__init__.py", line 81, in Connect\n return Connection(*args, **kwargs)\n', u' File "/usr/local/lib/python2.7/dist-packages/MySQLdb/connections.py", line 187, in __init__\n super(Connection, self).__init__(*args, **kwargs2)\n', u"OperationalError: (OperationalError) (1040, 'Too many connections') None None\n"].
2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent
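
Both tracebacks end in a recognisable error string, so the agent logs could be scanned for them. This is a rough sketch only: the two patterns are taken from the log excerpts in this report, while the label names and the `classify_log_lines` helper are hypothetical.

```python
import re

# Patterns copied from the error text in the two log excerpts; labels are made up.
SIGNATURES = {
    "amqp_connection_reset": re.compile(r"\[Errno 104\] Connection reset by peer"),
    "mysql_too_many_connections": re.compile(r"\(1040, 'Too many connections'\)"),
}


def classify_log_lines(lines):
    """Count how many log lines match each known error signature."""
    counts = {name: 0 for name in SIGNATURES}
    for line in lines:
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "2015-05-29 03:25:06.270 648 TRACE oslo.messaging._drivers.impl_rabbit "
        "error: [Errno 104] Connection reset by peer",
        "2015-07-25 08:06:27.452 1863 TRACE neutron.agent.dhcp_agent RemoteError: "
        "Remote error: OperationalError (OperationalError) "
        "(1040, 'Too many connections') None None",
    ]
    print(classify_log_lines(sample))
```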

We probably want to add a maas check for this service. The generic service plugin would do, since we only need to check that something is listening on the IP and port 80.
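
As an illustration of what such a check amounts to, here is a minimal TCP connect test in Python. It is not the maas plugin itself; `port_is_open` is a hypothetical helper, and because the proxy listens inside a qdhcp namespace, the check would presumably have to run inside that namespace (for example via `ip netns exec`, as in the command above).

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; substitute the address the proxy should listen on.
    print(port_is_open("127.0.0.1", 80))
```

A connect test only proves that something accepts connections on the port; it does not confirm that the metadata proxy returns valid responses.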

no longer affects: openstack-ansible/juno