SM mitaka HA R3.1 build 25 provision gets stuck at config_started due to keystone conflict among Openstack nodes

Bug #1613159 reported by sundarkh
This bug affects 1 person
Affects: Juniper Openstack (status tracked in Trunk)

Series   Status          Importance   Assigned to      Milestone
R3.1     Fix Committed   Critical     Dheeraj Gautam
Trunk    Fix Committed   Critical     Dheeraj Gautam

Bug Description

SM mitaka HA R3.1 build 25 provision gets stuck at config_started due to keystone conflict among Openstack nodes

setup
SM : nodej8

Targets : nodeg21,nodeg33,nodec58,nodeg17,nodec38

root@nodej8:~# server-manager show server --select id,roles,cluster_id,ip_address
{
    "server": [
        {
            "cluster_id": "cluster5sanity",
            "id": "nodeg21",
            "ip_address": "10.204.217.61",
            "roles": [
                "control",
                "webui",
                "openstack",
                "database",
                "collector",
                "config"
            ]
        },
        {
            "cluster_id": "cluster5sanity",
            "id": "nodec38",
            "ip_address": "10.204.217.23",
            "roles": [
                "compute"
            ]
        },
        {
            "cluster_id": "cluster5sanity",
            "id": "nodeg17",
            "ip_address": "10.204.217.57",
            "roles": [
                "compute"
            ]
        },
        {
            "cluster_id": "cluster5sanity",
            "id": "nodec58",
            "ip_address": "10.204.217.98",
            "roles": [
                "control",
                "webui",
                "openstack",
                "database",
                "collector",
                "config"
            ]
        },
        {
            "cluster_id": "cluster5sanity",
            "id": "nodeg33",
            "ip_address": "10.204.217.73",
            "roles": [
                "control",
                "webui",
                "openstack",
                "database",
                "collector",
                "config"
            ]
        }
    ]
}
root@nodej8:~#

cfgm0 node nodeg21 syslog

Aug 14 11:20:29 nodeg21 puppet-agent[28239]: contrail contrail_exec_provision_control is python exec_provision_control.py --api_server_ip "10.204.217.176" --api_server_port 8082 --host_name_list "nodeg21,nodec58,nodeg33" --host_ip_list "10.204.217.61,10.204.217.98,10.204.217.73" --router_asn "64512" --mt_options "admin,contrail123,admin" && echo exec-provision-control >> /etc/contrail/contrail_config_exec.out
Aug 14 11:20:29 nodeg21 puppet-agent[28239]: (/Stage[config]/Contrail::Exec_provision_control/Notify[contrail contrail_exec_provision_control is python exec_provision_control.py --api_server_ip "10.204.217.176" --api_server_port 8082 --host_name_list "nodeg21,nodec58,nodeg33" --host_ip_list "10.204.217.61,10.204.217.98,10.204.217.73" --router_asn "64512" --mt_options "admin,contrail123,admin" && echo exec-provision-control >> /etc/contrail/contrail_config_exec.out]/message) defined 'message' as 'contrail contrail_exec_provision_control is python exec_provision_control.py --api_server_ip "10.204.217.176" --api_server_port 8082 --host_name_list "nodeg21,nodec58,nodeg33" --host_ip_list "10.204.217.61,10.204.217.98,10.204.217.73" --router_asn "64512" --mt_options "admin,contrail123,admin" && echo exec-provision-control >> /etc/contrail/contrail_config_exec.out'
Aug 14 11:22:16 nodeg21 kernel: [ 1835.798551] init: supervisor-config main process (6341) killed by TERM signal
Aug 14 11:22:28 nodeg21 puppet-agent[28239]: python exec_provision_control.py --api_server_ip "10.204.217.176" --api_server_port 8082 --host_name_list "nodeg21,nodec58,nodeg33" --host_ip_list "10.204.217.61,10.204.217.98,10.204.217.73" --router_asn "64512" --mt_options "admin,contrail123,admin" && echo exec-provision-control >> /etc/contrail/contrail_config_exec.out returned 1 instead of one of [0]
Aug 14 11:22:28 nodeg21 puppet-agent[28239]: (/Stage[config]/Contrail::Exec_provision_control/Exec[exec-provision-control]/returns) change from notrun to 0 failed: python exec_provision_control.py --api_server_ip "10.204.217.176" --api_server_port 8082 --host_name_list "nodeg21,nodec58,nodeg33" --host_ip_list "10.204.217.61,10.204.217.98,10.204.217.73" --router_asn "64512" --mt_options "admin,contrail123,admin" && echo exec-provision-control >> /etc/contrail/contrail_config_exec.out returned 1 instead of one of [0]

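The issue occurs because the default_domain_id parameter in /etc/keystone/keystone.conf is not the same across the openstack nodes, so exec_provision_control.py fails (returns 1) and provisioning stays at config_started.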

Workaround:

1) Run grep default_domain_id /etc/keystone/keystone.conf on each of the openstack nodes.
2) Make sure all the openstack nodes have the same default_domain_id.
3) Restart the keystone service on all the openstack nodes.
4) Observe that the provision completes successfully.
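
The comparison in steps 1 and 2 can be sketched in Python. This is only an illustration, not part of Server Manager or contrail-puppet: the function names are hypothetical, the keystone.conf contents are assumed to have been collected from each node beforehand (for example over ssh), and default_domain_id is assumed to live in the [identity] section.

```python
import configparser


def read_default_domain_id(conf_text):
    """Return default_domain_id from keystone.conf text, or None if unset.

    Assumption: the option lives in the [identity] section.
    """
    # interpolation=None avoids choking on '%' characters in option values;
    # strict=False tolerates duplicated sections or options.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(conf_text)
    return parser.get("identity", "default_domain_id", fallback=None)


def find_mismatches(conf_by_node):
    """Map node -> default_domain_id if the nodes disagree, else return {}.

    conf_by_node is a hypothetical dict of node name -> keystone.conf text.
    """
    values = {node: read_default_domain_id(text)
              for node, text in conf_by_node.items()}
    return values if len(set(values.values())) > 1 else {}


if __name__ == "__main__":
    # Made-up sample contents; real values would come from the nodes.
    sample = {
        "nodeg21": "[identity]\ndefault_domain_id = aaa\n",
        "nodec58": "[identity]\ndefault_domain_id = bbb\n",
        "nodeg33": "[identity]\ndefault_domain_id = aaa\n",
    }
    for node, value in sorted(find_mismatches(sample).items()):
        print(node, value)
```

An empty result means the nodes already agree; otherwise the returned mapping shows which node carries which value, which is the information needed for step 2.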

sundarkh (sundar-kh)
description: updated
OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.1

Review in progress for https://review.opencontrail.org/23294
Submitter: Dheeraj Gautam (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/23294
Committed: http://github.org/Juniper/contrail-puppet/commit/847c421cfd4dddbe4458c53c9e9605412985bbe6
Submitter: Zuul
Branch: R3.1

commit 847c421cfd4dddbe4458c53c9e9605412985bbe6
Author: Dheeraj Gautam <email address hidden>
Date: Mon Aug 15 12:20:56 2016 -0700

SM-Mitaka: Fix for mitaka config_started issue

Closes-Bug: #1613159

During HA-Config, pull default_domain_id from nodes and configure all nodes
with it and restart keystone as well.

PATCH2: Fixed Sync_keystone.py for liberty case.

Change-Id: I27fe3a092e674a5d9cfa12580b002a9e01fc0745

OpenContrail Admin (ci-admin-f) wrote : [Review update] R3.1

Review in progress for https://review.opencontrail.org/23558
Submitter: Dheeraj Gautam (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : [Review update] master

Review in progress for https://review.opencontrail.org/23584
Submitter: Dheeraj Gautam (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote :

Review in progress for https://review.opencontrail.org/23585
Submitter: Dheeraj Gautam (<email address hidden>)

OpenContrail Admin (ci-admin-f) wrote : A change has been merged

Reviewed: https://review.opencontrail.org/23558
Committed: http://github.org/Juniper/contrail-puppet/commit/f1611a24399bc956a982f1fb569659cdbe0f804a
Submitter: Zuul
Branch: R3.1

commit f1611a24399bc956a982f1fb569659cdbe0f804a
Author: Dheeraj Gautam <email address hidden>
Date: Tue Aug 23 22:40:09 2016 -0700

SM-Mitaka: sync_keystone.py may fail

Partial-Bug: #1613159

openstack-get-config might not be available on all nodes. This might cause
a failure when default_domain_id is created on openstack[1] or openstack[2]
instead of openstack[0].

PATCH 2: Fixed conflict issue
PATCH 3: Fixed merging issue introduced by PATCH2

Fix merging conflict
Conflicts:
 contrail/environment/modules/contrail/manifests/ha_config.pp

Change-Id: I1c168eeb1814866fd10a3e78e66bfaaf72663f0d
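
The case described in this commit message, where default_domain_id exists on openstack[1] or openstack[2] but not on openstack[0], can be illustrated with a small selection helper. This is a hypothetical sketch and not the code in sync_keystone.py: the function name is invented, and it assumes the values have already been read from the nodes in openstack[0], openstack[1], openstack[2] order, with None or an empty string for nodes where the option is unset.

```python
def pick_domain_id(ids_in_node_order):
    """Return the first non-empty default_domain_id, or None if none is set.

    ids_in_node_order is a hypothetical list of values read from the
    openstack nodes, ordered openstack[0], openstack[1], ...
    """
    for value in ids_in_node_order:
        if value:  # skips None and empty strings
            return value
    return None


if __name__ == "__main__":
    # Made-up values: only the third node has the option set.
    print(pick_domain_id([None, "", "abc123"]))
```

Scanning every node instead of reading only openstack[0] is the design point here: the chosen value can then be written to all nodes before keystone is restarted.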

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/23584
Committed: http://github.org/Juniper/contrail-puppet/commit/61231fffb20ae9b45ae4c36951d791e3a3e06cec
Submitter: Zuul
Branch: master

commit 61231fffb20ae9b45ae4c36951d791e3a3e06cec
Author: Dheeraj Gautam <email address hidden>
Date: Mon Aug 15 12:20:56 2016 -0700

SM-Mitaka: Fix for mitaka config_started issue

Closes-Bug: #1613159

During HA-Config, pull default_domain_id from nodes and configure all nodes
with it and restart keystone as well.

PATCH2: Fixed Sync_keystone.py for liberty case.

Change-Id: I27fe3a092e674a5d9cfa12580b002a9e01fc0745

OpenContrail Admin (ci-admin-f) wrote :

Reviewed: https://review.opencontrail.org/23585
Committed: http://github.org/Juniper/contrail-puppet/commit/55529b79f8e6f5d0da9a9c41a66bf12ef38e9284
Submitter: Zuul
Branch: master

commit 55529b79f8e6f5d0da9a9c41a66bf12ef38e9284
Author: Dheeraj Gautam <email address hidden>
Date: Tue Aug 23 22:40:09 2016 -0700

SM-Mitaka: sync_keystone.py may fail

Partial-Bug: #1613159

openstack-get-config might not be available on all nodes. This might cause
a failure when default_domain_id is created on openstack[1] or openstack[2]
instead of openstack[0].

PATCH 2: Fixed conflict issue
PATCH 3: Fixed merging issue introduced by PATCH2

Fix merging conflict
Conflicts:
 contrail/environment/modules/contrail/manifests/ha_config.pp

Change-Id: I1c168eeb1814866fd10a3e78e66bfaaf72663f0d

Jeba Paulaiyan (jebap)
tags: removed: releasenote