While testing Sahara on the 25-node scale lab, the following error reproduced constantly.
This error prevents Sahara from starting clusters normally.
Env: 25 nodes lab
MOS: 6.0.1_130
Network: Neutron GRE
Trace:
<167>Mar 10 10:41:01 node-7 neutron-l3-agent 2015-03-10 10:41:01.376 16088 DEBUG neutron.agent.linux.utils [-] Running command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-78d2babf-690f-45cc-8883-391ef7716505', 'ip', 'addr', 'show', 'qg-8ede2a76-a6'] create_process /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:46
...skipping...
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent Traceback (most recent call last):
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent File "/usr/lib/python2.6/site-packages/neutron/agent/l3_agent.py", line 1849, in _process_router_update
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent [update.id])
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent File "/usr/lib/python2.6/site-packages/neutron/agent/l3_agent.py", line 105, in get_routers
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent router_ids=router_ids))
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent File "/usr/lib/python2.6/site-packages/neutron/common/log.py", line 34, in wrapper
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent return method(*args, **kwargs)
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent File "/usr/lib/python2.6/site-packages/neutron/common/rpc.py", line 161, in call
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent context, msg, rpc_method='call', **kwargs)
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent File "/usr/lib/python2.6/site-packages/neutron/common/rpc.py", line 187, in __call_rpc_method
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent return func(context, msg['method'], **msg['args'])
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/client.py", line 389, in call
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent return self.prepare().call(ctxt, method, **kwargs)
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/client.py", line 152, in call
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent retry=self.retry)
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent File "/usr/lib/python2.6/site-packages/oslo/messaging/transport.py", line 90, in _send
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent timeout=timeout, retry=retry)
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent File "/usr/lib/python2.6/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 434, in send
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent retry=retry)
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent File "/usr/lib/python2.6/site-packages/oslo/messaging/_drivers/amqpdriver.py", line 425, in _send
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent raise result
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent RemoteError: Remote error: PortNotFound Port 3204f1e2-ed48-47b5-a1e2-7d6a0edeaa41 could not be found
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent [u'Traceback (most recent call last):\n', u' File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 137, in _dispatch_and_reply\n incoming.message))\n', u' File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 180, in _dispatch\n return self._do_dispatch(endpoint, method, ctxt, args)\n', u' File "/usr/lib/python2.6/site-packages/oslo/messaging/rpc/dispatcher.py", line 126, in _do_dispatch\n result = getattr(endpoint, method)(ctxt, **new_args)\n', u' File "/usr/lib/python2.6/site-packages/neutron/api/rpc/handlers/l3_rpc.py", line 78, in sync_routers\n context, host, router_ids))\n', u' File "/usr/lib/python2.6/site-packages/neutron/db/l3_agentschedulers_db.py", line 313, in list_active_sync_routers_on_active_l3_agent\n active=True)\n', u' File "/usr/lib/python2.6/site-packages/neutron/db/l3_hamode_db.py", line 460, in get_ha_sync_data_for_host\n active)\n', u' File "/usr/lib/python2.6/site-packages/neutron/db/l3_dvr_db.py", line 367, in get_sync_data\n fip[\'host\'] = self.get_vm_port_hostid(context, fip[\'port_id\'])\n', u' File "/usr/lib/python2.6/site-packages/neutron/db/l3_dvr_db.py", line 375, in get_vm_port_hostid\n vm_port_db = port or self._core_plugin.get_port(context, port_id)\n', u' File "/usr/lib/python2.6/site-packages/neutron/db/db_base_plugin_v2.py", line 1445, in get_port\n port = self._get_port(context, id)\n', u' File "/usr/lib/python2.6/site-packages/neutron/db/db_base_plugin_v2.py", line 108, in _get_port\n raise n_exc.PortNotFound(port_id=id)\n', u'PortNotFound: Port 3204f1e2-ed48-47b5-a1e2-7d6a0edeaa41 could not be found\n'].
2015-03-10 10:41:01.847 16088 TRACE neutron.agent.l3_agent
<167>Mar 10 10:41:01 node-7 neutron-l3-agent 2015-03-10 10:41:01.855 16088 DEBUG neutron.agent.l3_agent [-] Starting router update for b2a92740-3015-4512-a9e7-e3dbd1d3773e _process_router_update /usr/lib/python2.6/site-packages/neutron/agent/l3_agent.py:1843
<167>Mar 10 10:41:01 node-7 neutron-l3-agent 2015-03-10 10:41:01.856 16088 DEBUG neutron.common.rpc [-] neutron.agent.l3_agent.L3PluginApi method call called with arguments (<neutron.context.ContextBase object at 0x230fcd0>, {'args': {'host': 'node-7.domain.tld', 'router_ids': [u'b2a92740-3015-4512-a9e7-e3dbd1d3773e']}, 'namespace': None, 'method': 'sync_routers'}) {} wrapper /usr/lib/python2.6/site-packages/neutron/common/log.py:33
<167>Mar 10 10:41:01 node-7 neutron-l3-agent 2015-03-10 10:41:01.887 16088 DEBUG neutron.agent.linux.utils [-]
Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qrouter-78d2babf-690f-45cc-8883-391ef7716505', 'conntrack', '-D', '-q', '172.16.46.237']
Exit code: 1
Stdout: ''
Stderr: 'conntrack v0.9.13 (conntrack-tools): Operation failed: Connection refused\n' execute /usr/lib/python2.6/site-packages/neutron/agent/linux/utils.py:81
<167>Mar 10 10:41:01 node-7 neutron-l3-agent 2015-03-10 10:41:01.888 16088 DEBUG neutron.common.rpc [-] neutron.agent.l3_agent.L3PluginApi method call called with arguments (<neutron.context.ContextBase object at 0x230fcd0>, {'args': {'router_id': u'78d2babf-690f-45cc-8883-391ef7716505', 'fip_statuses': {u'46178b56-569f-420e-a229-8c98441157bb': 'ACTIVE', u'4126078e-1267-40c3-a462-c4ee5fef3c0c': 'ACTIVE', u'b795eeaa-fd2a-4174-b8e2-db669f1e4951': 'ACTIVE', u'f046ee52-2d59-4942-a885-306648bd4b6c': 'DOWN', u'f230112b-800f-4dd2-b7f9-c23496384e4e': 'ACTIVE', u'e2eb1004-6c70-43be-8ed1-acb78231b2c4': 'ACTIVE'}}, 'namespace': None, 'method': 'update_floatingip_statuses'}) {'version': '1.1'} wrapper /usr/lib/python2.6/site-packages/neutron/common/log.py:33
<167>Mar 10 10:41:01 node-7 neutron-server 2015-03-10 10:41:01.895 11730 DEBUG neutron.context [req-7db59277-9fcd-4f0a-8981-7cc411813dbf None] Arguments dropped when creating context: {u'project_name': None, u'tenant': None} __init__ /usr/lib/python2.6/site-packages/neutron/context.py:83
Moved to the scale team, since we need a live environment, or at least a snapshot, with a repro.
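Until a live environment is available, it may help to check whether the failure always names the same port or a different one each time. The following is only a rough sketch for scanning a saved l3-agent log: the function name `count_missing_ports` is made up for illustration, and the pattern assumes the `RemoteError: Remote error: PortNotFound Port <uuid> could not be found` wording seen in the trace above.

```python
import re
from collections import Counter

# Assumption: the l3-agent logs the failure with exactly this wording,
# as in the trace above. The nested remote traceback uses
# "PortNotFound: Port ..." (with a colon) and is deliberately not matched,
# so each failure is counted once.
PORT_NOT_FOUND = re.compile(
    r"RemoteError: Remote error: PortNotFound Port "
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}) "
    r"could not be found"
)


def count_missing_ports(lines):
    """Return a Counter mapping port UUID to its number of PortNotFound errors."""
    counts = Counter()
    for line in lines:
        match = PORT_NOT_FOUND.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Passing an open log file (or any iterable of lines) to `count_missing_ports` gives a per-port tally. A single UUID repeated many times would point at one stale floating IP association, while many different UUIDs would suggest ports being removed while routers are still syncing.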