RHOSP8-kafka connection not up, analytics api service went down

Bug #1678288 reported by shajuvk
This bug affects 1 person
Affects: Juniper Openstack (status tracked in Trunk)
           Status    Importance   Assigned to    Milestone
  R3.2     Invalid   High         Anish Mehta
  Trunk    Invalid   High         Anish Mehta

Bug Description

[2017-03-30 18:08:08,647] WARN [Controller-0-to-broker-0-send-thread], Controller 0's connection to broker Node(0, 10.87.67.181, 9092) was unsuccessful (kafka.controller.RequestSendThread)
java.io.IOException: Connection to Node(0, 10.87.67.181, 9092) failed
        at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingReady$extension$1.apply(NetworkClientBlockingOps.scala:62)
        at kafka.utils.NetworkClientBlockingOps$$anonfun$blockingReady$extension$1.apply(NetworkClientBlockingOps.scala:58)
        at kafka.utils.NetworkClientBlockingOps$$anonfun$kafka$utils$NetworkClientBlockingOps$$pollUntil$extension$2.apply(NetworkClientBlockingOps.scala:106)
        at kafka.utils.NetworkClientBlockingOps$$anonfun$kafka$utils$NetworkClientBlockingOps$$pollUntil$extension$2.apply(NetworkClientBlockingOps.scala:105)
        at kafka.utils.NetworkClientBlockingOps$.recurse$1(NetworkClientBlockingOps.scala:129)
        at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollUntilFound$extension(NetworkClientBlockingOps.scala:139)
        at kafka.utils.NetworkClientBlockingOps$.kafka$utils$NetworkClientBlockingOps$$pollUntil$extension(NetworkClientBlockingOps.scala:105)
        at kafka.utils.NetworkClientBlockingOps$.blockingReady$extension(NetworkClientBlockingOps.scala:58)
        at kafka.controller.RequestSendThread.brokerReady(ControllerChannelManager.scala:225)
        at kafka.controller.RequestSendThread.liftedTree1$1(ControllerChannelManager.scala:172)
        at kafka.controller.RequestSendThread.doWork(ControllerChannelManager.scala:171)
        at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:63)
====
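The controller log above reports broker 0 (10.87.67.181:9092) as unreachable. A quick TCP probe can confirm whether anything is listening on that port at all; `port_open` is a hypothetical helper sketched here, not part of Kafka or Contrail:

```python
import socket

def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe the broker endpoint the controller reported as unreachable.
print(port_open("10.87.67.181", 9092))
```

If the probe fails, the next step is to check whether the broker process itself is still running on that node.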

03/31/2017 02:11:05 PM [contrail-alarm-gen]: An exception of type NotLeaderForPartitionError occured. Arguments:
(OffsetResponsePayload(topic=u'-uve-13', partition=0, error=6, offsets=()),) : traceback Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/opserver/partition_handler.py", line 597, in _run
    consumer.seek(-1,2)
  File "/usr/lib/python2.7/site-packages/kafka/consumer/simple.py", line 245, in seek
    resps = self.client.send_offset_request(reqs)
  File "/usr/lib/python2.7/site-packages/kafka/client.py", line 641, in send_offset_request
    if not fail_on_error or not self._raise_on_response_error(resp)]
  File "/usr/lib/python2.7/site-packages/kafka/client.py", line 384, in _raise_on_response_error
    kafka.common.check_error(resp)
  File "/usr/lib/python2.7/site-packages/kafka/common.py", line 473, in check_error
    raise error_class(response)
NotLeaderForPartitionError: NotLeaderForPartitionError - 6 - This error is thrown if the client attempts to send messages to a replica that is not the leader for some partition. It indicates that the clients metadata is out of date.
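NotLeaderForPartitionError is a retriable condition: the usual recovery is to refresh the client's cluster metadata (so it learns the new partition leader) and retry the request. A minimal self-contained sketch of that pattern follows; `StaleMetadataError` is a local stand-in for kafka's exception class, and `with_metadata_retry` is hypothetical, not a contrail or kafka-python API:

```python
class StaleMetadataError(Exception):
    """Stand-in for kafka's NotLeaderForPartitionError."""

def with_metadata_retry(request, refresh_metadata, retries=3):
    """Run `request`; on a stale-metadata error, refresh and retry."""
    for attempt in range(retries):
        try:
            return request()
        except StaleMetadataError:
            if attempt == retries - 1:
                raise
            # Re-fetch partition leadership before retrying.
            refresh_metadata()
```

contrail-alarm-gen appears to take the heavier-handed variant of the same idea, tearing down and recreating its client, which would explain the "New KafkaClient" line below.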

03/31/2017 02:11:05 PM [contrail-alarm-gen]: Stopping part 13 collectors []
03/31/2017 02:11:05 PM [contrail-alarm-gen]: Stopping part 13 UVEs []
03/31/2017 02:11:06 PM [contrail-alarm-gen]: New KafkaClient -uve-18
^C
[root@contrail-controller2 bin]# pwd
/usr/share/kafka/bin
[root@contrail-controller2 bin]# ./kafka-topics.sh --zookeeper 127.0.0.1 --describe
Topic:-uve-0 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-0 Partition: 0 Leader: 1 Replicas: 0,1 Isr: 1
Topic:-uve-1 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-1 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic:-uve-10 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-10 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic:-uve-11 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-11 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic:-uve-12 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-12 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic:-uve-13 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-13 Partition: 0 Leader: 1 Replicas: 0,1 Isr: 1
Topic:-uve-14 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-14 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic:-uve-15 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-15 Partition: 0 Leader: 1 Replicas: 0,1 Isr: 1
Topic:-uve-16 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-16 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
Topic:-uve-17 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-17 Partition: 0 Leader: 1 Replicas: 0,1 Isr: 1
Topic:-uve-18 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-18 Partition: 0 Leader: 1 Replicas: 0,1 Isr: 1
Topic:-uve-19 PartitionCount:1 ReplicationFactor:2 Configs:
        Topic: -uve-19 Partition: 0 Leader: 2 Replicas: 2,0 Isr: 2
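Note that several partitions above (e.g. -uve-0, -uve-13, -uve-15, -uve-17, -uve-18, -uve-19) list fewer in-sync replicas than assigned replicas, i.e. they are under-replicated, and broker 0 is absent from every ISR set, consistent with that broker being down. kafka-topics.sh can filter these directly with --under-replicated-partitions; alternatively, a small parser over the --describe output (hypothetical helper, not part of Contrail) makes the check explicit:

```python
import re

def under_replicated(describe_output):
    """Return (topic, isr_count, replica_count) for partitions whose ISR set
    is smaller than their replica set, parsed from --describe output."""
    flagged = []
    for line in describe_output.splitlines():
        m = re.search(r"Topic:\s*(\S+)\s+Partition:.*Replicas:\s*(\S+)\s+Isr:\s*(\S+)", line)
        if not m:
            continue
        topic, replicas, isr = m.groups()
        if len(isr.split(",")) < len(replicas.split(",")):
            flagged.append((topic, len(isr.split(",")), len(replicas.split(","))))
    return flagged

sample = """\
        Topic: -uve-0 Partition: 0 Leader: 1 Replicas: 0,1 Isr: 1
        Topic: -uve-1 Partition: 0 Leader: 1 Replicas: 1,2 Isr: 1,2
"""
print(under_replicated(sample))  # → [('-uve-0', 1, 2)]
```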

Tags: analytics
Revision history for this message
shajuvk (shajuvk) wrote :

Logs are available at /cs-shared/bugs/1678288. This is a 3-node analytics setup; cfgm and analytics run on the same nodes.

shajuvk (shajuvk)
information type: Proprietary → Public
Raj Reddy (rajreddy)
Changed in juniperopenstack:
assignee: nobody → Anish Mehta (amehta00)
Jeba Paulaiyan (jebap)
summary: - RHOSP8-kafka not connectiong not up, analytics api service went down
+ RHOSP8-kafka connection not up, analytics api service went down
Raj Reddy (rajreddy) wrote :

The systems have too little memory:

[root@contrail-controller1 ~]# free -m
              total        used        free      shared  buff/cache   available
Mem:          11854        6697         396         395        4760        4132
Swap:             0           0           0
[root@contrail-controller1 ~]# free -mh
              total        used        free      shared  buff/cache   available
Mem:            11G        6.5G        395M        395M        4.6G        4.0G
Swap:            0B          0B          0B
[root@contrail-controller1 ~]#
