R2.1 Build 42 Ubuntu 14.04 multi-node icehouse setup
This is a tor scale setup with about 180 tors, and 40K VMIs.
1K VMIs in each VN. 10 physical ports on each tor, 100 VMIs on each logical port
While adding vmis incrementally, it was seen that api-server connection to zookeeper was lost
After about 10-15 mins, api-server returned to normalcy
Logs will be in http://10.204.216.50/Docs/bugs/#
root@nodei34:/var/log/contrail# neutron net-list
500-{u'NeutronError': {u'message': u'An unknown exception occurred.', u'type': u'NeutronException', u'detail': u''}}
root@nodei34:/var/log/contrail#
env.roledefs = {
'all': [host2, host3, host4, host5, host6],
'cfgm': [host2, host3],
'openstack': [host2],
'webui': [host3],
'control': [host3, host4],
'compute': [host5, host6],
'collector': [host2, host3],
'database': [host2, host3, host4],
'toragent': [host6],
'tsn': [host6],
'build': [host_build],
}
env.hostnames = {
'all': ['nodei34', 'nodei35', 'nodei36', 'nodei37', 'nodei38']
}
supervisor-config manually stopped on nodei35.
contrail-api-0-stdout.log :
======================
192.168.1.2 - - [2015-03-19 09:52:43] "POST /fqname-to-id HTTP/1.1" 200 156 0.003323
192.168.1.2 - - [2015-03-19 09:52:43] "GET /virtual-machine-interface/639ee13c-9df4-49d5-b042-2859776dce10?exclude_back_refs=True&exclude_children=True HTTP/1.1" 200 1998 0.006464
192.168.1.2 - - [2015-03-19 09:52:43] "POST /fqname-to-id HTTP/1.1" 200 156 0.003882
WARNING:api-0:Connection dropped: socket connection broken
WARNING:api-0:Transition to CONNECTING
INFO:api-0:Zookeeper connection lost
INFO:api-0:Connecting to 192.168.1.2:2181
192.168.1.2 - - [2015-03-19 09:53:04] "GET /virtual-machine-interface/857931f8-a335-4f6a-a357-fd0b18fcae98?exclude_back_refs=True&exclude_children=True HTTP/1.1" 200 1627 21.055019
INFO:api-0:Sending request(xid=None): Connect(protocol_version=0, last_zxid_seen=4295044169, time_out=20000, session_id=237604193645428757, passwd='S\xef(\xeao\xfe\xeaL\x82\x86T\x1a/)\x18A', read_only=None)
127.0.0.1 - - [2015-03-19 09:53:04] "POST /virtual-machine-interfaces HTTP/1.1" 200 552 21.316497
WARNING:api-0:Session has expired
INFO:api-0:Zookeeper session lost, state: EXPIRED_SESSION
03/19/2015 09:53:06 AM [nodei34:contrail-api:Config:0]: SANDESH: CONNECT TO COLLECTOR: True
INFO:nodei34:contrail-api:Config:0:SANDESH: CONNECT TO COLLECTOR: True
03/19/2015 09:53:17 AM [nodei34:contrail-api:Config:0]: SANDESH: CONNECT TO COLLECTOR: True
INFO:nodei34:contrail-api:Config:0:SANDESH: CONNECT TO COLLECTOR: True
03/19/2015 09:53:17 AM [nodei34:contrail-api:Config:0]: SANDESH: Logging: False -> 1
INFO:nodei34:contrail-api:Config:0:SANDESH: Logging: False -> 1
03/19/2015 09:53:17 AM [nodei34:contrail-api:Config:0]: SANDESH: Logging: LEVEL: [SYS_INFO] -> [SYS_NOTICE]
INFO:nodei34:contrail-api:Config:0:SANDESH: Logging: LEVEL: [SYS_INFO] -> [SYS_NOTICE]
WARNING:keystoneclient.middleware.auth_token:Configuring auth_uri to point to the public identity endpoint is required; clients may not be able to authenticate against an admin endpoint
Bottle v0.12.0 server starting up (using GeventServer())...
Listening on http://0.0.0.0:8084/
Hit Ctrl-C to quit.
WARNING:nodei34:contrail-api:Config:0:No Connection: __default__ [SYS_NOTICE]: VncApiError: Auth token fetched from keystone.
WARNING:nodei34:contrail-api:Config:0:No Connection: __default__ [SYS_NOTICE]: VncApiError: Connecting to ifmap on 192.168.1.2:8443 as api-server
Bottle v0.12.0 server starting up (using GeventServer())...
Listening on http://localhost:8095/
Hit Ctrl-C to quit.
ERROR:cfgm_common.ifmap.client:Uknown error sending IF-MAP message to server [Errno 111] Connection refused
WARNING:nodei34:contrail-api:Config:0:No Connection: __default__ [SYS_NOTICE]: VncApiError: IFMAP connection ESTABLISHED
WARNING:nodei34:contrail-api:Config:0:No Connection: __default__ [SYS_NOTICE]: VncApiError: Connecting to cassandra on ['192.168.1.2:9160', '192.168.1.3:9160', '192.168.1.4:9160']
WARNING:nodei34:contrail-api:Config:0:No Connection: __default__ [SYS_WARN]: VncApiError: Warning! InvalidRequestException(why='Keyspace names must be case-insensitively unique ("useragent" conflicts with "useragent")')
WARNING:nodei34:contrail-api:Config:0:No Connection: __default__ [SYS_WARN]: VncApiError: Warning! InvalidRequestException(why='Cannot add already existing column family "useragent_keyval_table" to keyspace "useragent"')
WARNING:nodei34:contrail-api:Config:0:No Connection: __default__ [SYS_WARN]: VncApiError: Warning! InvalidRequestException(why='Keyspace names must be case-insensitively unique ("config_db_uuid" conflicts with "config_db_uuid")')
WARNING:nodei34:contrail-api:Config:0:No Connection: __default__ [SYS_WARN]: VncApiError: Warning! InvalidRequestException(why='Cannot add already existing column family "obj_uuid_table" to keyspace "config_db_uuid"')
WARNING:nodei34:contrail-api:Config:0:No Connection: __default__ [SYS_WARN]: VncApiError: Warning! InvalidRequestException(why='Cannot add already existing column family "obj_fq_name_table" to keyspace "config_db_uuid"')
WARNING:nodei34:contrail-api:Config:0:__default__ [SYS_NOTICE]: VncApiError: Cassandra connection ESTABLISHED
WARNING:nodei34:contrail-api:Config:0:__default__ [SYS_NOTICE]: VncApiError: Connecting to zookeeper on 192.168.1.2:2181,192.168.1.3:2181,192.168.1.4:2181
INFO:api-0:Connecting to 192.168.1.4:2181
INFO:api-0:Sending request(xid=None): Connect(protocol_version=0, last_zxid_seen=0, time_out=20000, session_id=0, passwd='\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00', read_only=None)
INFO:api-0:Zookeeper connection established, state: CONNECTED
WARNING:nodei34:contrail-api:Config:0:__default__ [SYS_NOTICE]: VncApiError: Connecting to rabbitmq on 192.168.1.2:5672 user guest
WARNING:nodei34:contrail-api:Config:0:__default__ [SYS_NOTICE]: VncApiError: RabbitMQ connection ESTABLISHED
WARNING:nodei34:contrail-api:Config:0:__default__ [SYS_NOTICE]: VncApiError: Running greenlet _dbe_oper_subscribe
127.0.0.1 - - [2015-03-19 09:58:22] "GET /Snh_SandeshUVECacheReq?x=NodeStatus HTTP/1.1" 200 3374 0.002072
127.0.0.1 - - [2015-03-19 09:59:14] "GET /Snh_SandeshUVECacheReq?x=NodeStatus HTTP/1.1" 200 3374 0.002404
127.0.0.1 - - [2015-03-19 10:05:49] "GET /Snh_SandeshUVECacheReq?x=NodeStatus HTTP/1.1" 200 3374 0.003003
Bottle v0.12.0 server starting up (using GeventServer())...
Listening on http://0.0.0.0:9100/
Hit Ctrl-C to quit.
WARNING:keystoneclient.middleware.auth_token:Unable to find authentication token in headers
127.0.0.1 - - [2015-03-19 10:12:09] "GET / HTTP/1.1" 401 194 0.000826
WARNING:keystoneclient.middleware.auth_token:Unable to find authentication token in headers
127.0.0.1 - - [2015-03-19 10:12:09] "GET / HTTP/1.1" 401 194 0.000633
127.0.0.1 - - [2015-03-19 10:12:10] "GET / HTTP/1.1" 200 9743 1.000477
127.0.0.1 - - [2015-03-19 10:12:10] "GET / HTTP/1.1" 200 9743 0.652401
127.0.0.1 - - [2015-03-19 10:12:10] "POST /fqname-to-id HTTP/1.1" 200 156 0.011116
127.0.0.1 - - [2015-03-19 10:12:10] "GET /domains?count=False&detail=False HTTP/1.1" 200 280 0.005756
127.0.0.1 - - [2015-03-19 10:12:10] "POST /fqname-to-id HTTP/1.1" 200 156 0.005885
127.0.0.1 - - [2015-03-19 10:12:10] "GET /projects?count=False&detail=False HTTP/1.1" 200 469 0.003296
WARNING:keystoneclient.middleware.auth_token:Unable to find authentication token in headers
192.168.1.2 - - [2015-03-19 10:12:11] "GET / HTTP/1.1" 401 194 0.000655
WARNING:keystoneclient.middleware.auth_token:Unable to find authentication token in headers
192.168.1.2 - - [2015-03-19 10:12:12] "GET / HTTP/1.1" 401 194 0.001126
192.168.1.2 - - [2015-03-19 10:12:12] "GET / HTTP/1.1" 200 9919 0.276946
192.168.1.2 - - [2015-03-19 10:12:12] "GET /virtual-networks?count=False&detail=False HTTP/1.1" 200 12148 0.008781
192.168.1.2 - - [2015-03-19 10:12:12] "GET / HTTP/1.1" 200 9919 0.245284
192.168.1.2 - - [2015-03-19 10:12:12] "GET /routing-instances?count=False&detail=True HTTP/1.1" 200 69059 0.071517
192.168.1.2 - - [2015-03-19 10:12:12] "POST /fqname-to-id HTTP/1.1" 200 156 0.061192
-------------------------------
zookeeper logs on nodei35 (others too have similar logs )
=======================
2015-03-19 09:02:49,290 - WARN [SyncThread:2:FileTxnLog@321] - fsync-ing the write ahead log in SyncThread:2 took 1068ms which will adversely effect operation latency. See the ZooKeeper troubleshooting guide
2015-03-19 09:53:04,001 - INFO [SessionTracker:ZooKeeperServer@325] - Expiring session 0x34c23c157610015, timeout of 20000ms exceeded
2015-03-19 09:53:04,001 - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@476] - Processed session termination for sessionid: 0x34c23c157610015
2015-03-19 09:53:08,559 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x24c23c147630017, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2015-03-19 09:53:08,560 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /192.168.1.2:38941 which had sessionid 0x24c23c147630017
2015-03-19 09:53:08,573 - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@349] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x24c23c147630015, likely client has closed socket
at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:220)
at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:208)
at java.lang.Thread.run(Thread.java:745)
2015-03-19 09:53:08,574 - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1001] - Closed socket connection for client /192.168.1.2:58058 which had sessionid 0x24c23c147630015
2015-03-19 09:53:26,000 - INFO [SessionTracker:ZooKeeperServer@325] - Expiring session 0x34c23c157610014, timeout of 20000ms exceeded
2015-03-19 09:53:26,001 - INFO [SessionTracker:ZooKeeperServer@325] - Expiring session 0x24c23c147630015, timeout of 20000ms exceeded
2015-03-19 09:53:26,001 - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@476] - Processed session termination for sessionid: 0x34c23c157610014
2015-03-19 09:53:26,002 - INFO [ProcessThread(sid:2 cport:-1)::PrepRequestProcessor@476] - Processed session termination for sessionid: 0x24c23c147630015
2015-03-19 09:53:28,000 - INFO [SessionTracker:ZooKeeperServer@325] - Expiring session 0x24c23c147630017, timeout of 20000m
The same issue is seen even when booting a VM on a VN with 10K VMis. (with pre-allocated port)