Comment 2 for bug 1612545

Raj Reddy (rajreddy) wrote :

+Ignatious

This may be true in other places as well. Most often, whatever we do for setup we also have to do for upgrade.

-Raj

On Aug 12, 2016, at 8:30 AM, Raj Reddy <email address hidden> wrote:

Hi Nikhil,

Thanks for the analysis.

I guess the statement should have been:

    if (parent_cmd == "setup-vnc-database" or parent_cmd == "upgrade-vnc-database") and get_kafka_enabled() is not None:
        cmd += " --kafka_broker_id %d" % broker_id
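
A minimal, self-contained sketch of the proposed condition, for illustration only. The function name build_database_cmd, its parameters, and the KAFKA_PARENT_CMDS tuple are hypothetical stand-ins; the real logic lives in fabfile/utils/commandline.py and calls get_kafka_enabled() instead of taking a flag:

```python
# Hypothetical stand-in for the logic in fabfile/utils/commandline.py.
# Parent commands that should receive a per-node Kafka broker id.
KAFKA_PARENT_CMDS = ("setup-vnc-database", "upgrade-vnc-database")


def build_database_cmd(parent_cmd, broker_id, kafka_enabled):
    """Return the command line, appending --kafka_broker_id when needed.

    kafka_enabled mimics the return value of get_kafka_enabled():
    None means Kafka is not configured.
    """
    cmd = parent_cmd
    # Same test for setup and upgrade, so neither path drops the broker id.
    if parent_cmd in KAFKA_PARENT_CMDS and kafka_enabled is not None:
        cmd += " --kafka_broker_id %d" % broker_id
    return cmd
```

Keeping the parent commands in one tuple means a future command that needs the broker id is added in a single place.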

Ranjeet, can you please fix it?

thanks,
-Raj

On Aug 12, 2016, at 4:05 AM, Nikhil Bansal <email address hidden> wrote:

I looked into the code and it seems that broker_id is not being passed as an argument due to a change in fabfile/utils/commandline.py:
https://review.opencontrail.org/#/c/18040/

Now we pass broker id only in this case:
    if parent_cmd == "setup-vnc-database" and get_kafka_enabled() is not None:
        cmd += " --kafka_broker_id %d" % broker_id

For the upgrade case, we are not passing the Kafka broker id. I am not sure about the above-mentioned change, so maybe Raj can take it forward now. I am also including the committer of this code to get a better understanding.

Thanks,
Nikhil

From: Nikhil Bansal <email address hidden>
Date: Friday, August 12, 2016 at 2:50 PM
To: Sarathbabu Narasimhan <email address hidden>, Raj Reddy <email address hidden>
Subject: Re: Bug #1612545 : Vcenter-only: upgrade from 2.21.2 to 3.1.0.0 (build#14) failed Analytics/collector

It seems that the Kafka config is out of sync. Every node has a brokerid of 0, which is causing Kafka to go down:

java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/0. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.

The corresponding config files have a brokerid of 0 on all the nodes. I am not very familiar with the details of the Kafka upgrade, so I will look into the code to figure out a possible reason.

Thanks,
Nikhil
PS: We can recover from it by deleting the ZooKeeper ephemeral node.
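
A rough sketch of how the duplicate brokerid condition described above could be detected from the config files of each node. It assumes the id is stored as a broker.id=<n> line in Kafka's server.properties; the helper names and the node names in the usage example are hypothetical:

```python
import re
from collections import defaultdict


def parse_broker_id(properties_text):
    """Return the integer broker.id found in the text, or None if absent."""
    for line in properties_text.splitlines():
        line = line.strip()
        if line.startswith("#"):
            continue  # skip commented-out settings
        match = re.match(r"broker\.id\s*=\s*(-?\d+)\s*$", line)
        if match:
            return int(match.group(1))
    return None


def find_duplicate_broker_ids(configs):
    """Map each broker id used by more than one node to those nodes.

    configs maps a node name to the text of its server.properties.
    """
    nodes_by_id = defaultdict(list)
    for node, text in configs.items():
        broker_id = parse_broker_id(text)
        if broker_id is not None:
            nodes_by_id[broker_id].append(node)
    return {bid: sorted(nodes)
            for bid, nodes in nodes_by_id.items() if len(nodes) > 1}
```

For example, find_duplicate_broker_ids({"node-a": "broker.id=0", "node-b": "broker.id=0"}) reports that id 0 is shared by both nodes, which is the state that triggers the "A broker is already registered" error.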

From: Sarathbabu Narasimhan <email address hidden>
Date: Friday, August 12, 2016 at 2:16 PM
To: Raj Reddy <email address hidden>, Nikhil Bansal <email address hidden>
Cc: Sarathbabu Narasimhan <email address hidden>
Subject: Bug #1612545 : Vcenter-only: upgrade from 2.21.2 to 3.1.0.0 (build#14) failed Analytics/collector

Hi Raj/Nikhil,

Bug #1612545 : Vcenter-only: upgrade from 2.21.2 to 3.1.0.0 (build#14) failed Analytics/collector

This issue is not recovering in spite of a service restart, and Ashish mentioned that we should support the 2.21 upgrade.
I have kept the setup in the problem state for triaging; please find the nodes below:
10.87.26.197 / .208 / .199

Thanks
*Sarath