AIO subcloud offline as admin endpoints remain http

Bug #1890834 reported by Bin Qian
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Bin Qian

Bug Description

This issue happens occasionally on an AIO subcloud.
Brief Description
-----------------
Attempted to deploy a distributed cloud system with an IPv4 configuration. The System Controller deployed successfully, but the subcloud remained offline to the System Controller after deployment. The System Controller and the subcloud can ping each other through the mgmt interface, so this does not appear to be a connectivity issue.
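+----+-----------+------------+--------------+---------------+---------+
| id | name      | management | availability | deploy status | sync    |
+----+-----------+------------+--------------+---------------+---------+
| 2  | subcloud1 | unmanaged  | offline      | complete      | unknown |
+----+-----------+------------+--------------+---------------+---------+
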
The following logs are seen in /var/log/dcmanager/dcmanager.log:
2020-07-21 17:14:04.389 127418 WARNING keystoneauth.identity.generic.base [-] Failed to discover available identity versions when contacting https://192.168.57.2:5001/v3. Attempting to parse version from URL.: ConnectTimeout: Request to https://192.168.57.2:5001/v3 timed out
2020-07-21 17:14:04.395 127418 WARNING keystoneauth.identity.generic.base [-] Failed to discover available identity versions when contacting https://192.168.62.2:5001/v3. Attempting to parse version from URL.: ConnectTimeout: Request to https://192.168.62.2:5001/v3 timed out
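
When triaging a log like the one above, it can help to pull out the endpoints that timed out. The following is a minimal sketch, assuming the warning format stays as shown in these two lines; the function name `timed_out_urls` and the regular expression are illustrative and not part of dcmanager or keystoneauth.

```python
import re

# Assumption: matches the keystoneauth warning text quoted in this report
# and captures the URL that timed out.
TIMEOUT_RE = re.compile(r"ConnectTimeout: Request to (?P<url>\S+) timed out")


def timed_out_urls(log_lines):
    """Return the unique URLs that hit ConnectTimeout, in first-seen order."""
    seen = []
    for line in log_lines:
        match = TIMEOUT_RE.search(line)
        if match and match.group("url") not in seen:
            seen.append(match.group("url"))
    return seen


# The two warnings from /var/log/dcmanager/dcmanager.log quoted in this report.
sample = [
    "2020-07-21 17:14:04.389 127418 WARNING keystoneauth.identity.generic.base [-] "
    "Failed to discover available identity versions when contacting "
    "https://192.168.57.2:5001/v3. Attempting to parse version from URL.: "
    "ConnectTimeout: Request to https://192.168.57.2:5001/v3 timed out",
    "2020-07-21 17:14:04.395 127418 WARNING keystoneauth.identity.generic.base [-] "
    "Failed to discover available identity versions when contacting "
    "https://192.168.62.2:5001/v3. Attempting to parse version from URL.: "
    "ConnectTimeout: Request to https://192.168.62.2:5001/v3 timed out",
]

for url in timed_out_urls(sample):
    print(url)
```

Both URLs use https, which is consistent with the System Controller expecting https admin endpoints that the subcloud has not yet switched to.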

Initial assessment from Bin Qian:
The config was not applied... and I doubt it would be even after rebooting the controller. John is probably the best person to ask for help... _controller_config_active_apply, which converts the admin endpoints to https, only runs for the initial configuration. In this case, the runtime manifest was not applied, but the config-out-of-date alarm was cleared. This now looks like a bug to me.

sysinv.log shows that the manifests ('openstack::keystone::endpoint::runtime', 'platform::firewall::runtime') requested by controller_config_active_apply were not applied. These manifests were meant to reconfigure the admin endpoints to https.

sysinv 2020-07-21 15:20:17.128 105412 INFO sysinv.conductor.manager [-] Setting config target of host 'controller-0' to '64fb5d52-1fac-4e76-8093-1e7d53950b2f'.
sysinv 2020-07-21 15:20:17.140 105412 WARNING sysinv.conductor.manager [-] controller-0: iconfig out of date: target 64fb5d52-1fac-4e76-8093-1e7d53950b2f, applied 8f822d79-a27b-4d02-b8ad-567042962eed
sysinv 2020-07-21 15:20:17.141 105412 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 8f822d79-a27b-4d02-b8ad-567042962eed vs. target: 64fb5d52-1fac-4e76-8093-1e7d53950b2f.
sysinv 2020-07-21 15:20:17.157 105412 INFO sysinv.conductor.manager [-] _config_update_hosts config_uuid=64fb5d52-1fac-4e76-8093-1e7d53950b2f
sysinv 2020-07-21 15:20:17.158 105412 INFO sysinv.conductor.manager [-] applying runtime manifest config_uuid=64fb5d52-1fac-4e76-8093-1e7d53950b2f, classes: ['openstack::keystone::endpoint::runtime', 'platform::firewall::runtime']
sysinv 2020-07-21 15:20:17.171 105412 INFO sysinv.puppet.puppet [-] Updating hiera for host: controller-0 with config_uuid: 64fb5d52-1fac-4e76-8093-1e7d53950b2f
sysinv 2020-07-21 15:20:18.934 105412 WARNING sysinv.conductor.manager [-] controller-0: iconfig out of date: target 64fb5d52-1fac-4e76-8093-1e7d53950b2f, applied 8f822d79-a27b-4d02-b8ad-567042962eed
sysinv 2020-07-21 15:20:18.934 105412 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 8f822d79-a27b-4d02-b8ad-567042962eed vs. target: 64fb5d52-1fac-4e76-8093-1e7d53950b2f.
sysinv 2020-07-21 15:20:22.046 105412 INFO sysinv.agent.rpcapi [-] config_apply_runtime_manifest: fanout_cast: sending config 64fb5d52-1fac-4e76-8093-1e7d53950b2f {'classes': ['openstack::keystone::endpoint::runtime', 'platform::firewall::runtime'], 'force': False, 'personalities': ['controller'], 'host_uuids': [u'135252d1-4f0b-4cce-8ef5-e046b569fc75']} to agent
sysinv 2020-07-21 15:20:22.049 99746 INFO sysinv.agent.manager [-] config_apply_runtime_manifest: 64fb5d52-1fac-4e76-8093-1e7d53950b2f {u'classes': [u'openstack::keystone::endpoint::runtime', u'platform::firewall::runtime'], u'force': False, u'personalities': [u'controller'], u'host_uuids': [u'135252d1-4f0b-4cce-8ef5-e046b569fc75']} controller
sysinv 2020-07-21 15:20:48.520 105412 WARNING sysinv.conductor.manager [-] controller-0: iconfig out of date: target 64fb5d52-1fac-4e76-8093-1e7d53950b2f, applied 8f822d79-a27b-4d02-b8ad-567042962eed
sysinv 2020-07-21 15:20:48.520 105412 WARNING sysinv.conductor.manager [-] SYS_I Raise system config alarm: host controller-0 config applied: 8f822d79-a27b-4d02-b8ad-567042962eed vs. target: 64fb5d52-1fac-4e76-8093-1e7d53950b2f.
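
The key signal in the excerpt above is that the target config UUID never becomes the applied one. A minimal sketch for spotting this in a sysinv.log excerpt follows, assuming the "iconfig out of date" warning format stays as quoted here; `stale_hosts` and the regular expression are hypothetical helpers, not part of sysinv.

```python
import re

# Assumption: matches the conductor warning quoted in this report, e.g.
# "controller-0: iconfig out of date: target <uuid>, applied <uuid>".
OUT_OF_DATE_RE = re.compile(
    r"(?P<host>\S+): iconfig out of date: "
    r"target (?P<target>[0-9a-f-]+), applied (?P<applied>[0-9a-f-]+)"
)


def stale_hosts(log_lines):
    """Map each host to the (target, applied) pair from its last warning.

    Only hosts whose last reported target differs from the applied UUID
    are returned.
    """
    latest = {}
    for line in log_lines:
        match = OUT_OF_DATE_RE.search(line)
        if match:
            latest[match.group("host")] = (
                match.group("target"),
                match.group("applied"),
            )
    return {host: pair for host, pair in latest.items() if pair[0] != pair[1]}


# First and last out-of-date warnings from the excerpt in this report.
sample = [
    "sysinv 2020-07-21 15:20:17.140 105412 WARNING sysinv.conductor.manager [-] "
    "controller-0: iconfig out of date: target 64fb5d52-1fac-4e76-8093-1e7d53950b2f, "
    "applied 8f822d79-a27b-4d02-b8ad-567042962eed",
    "sysinv 2020-07-21 15:20:48.520 105412 WARNING sysinv.conductor.manager [-] "
    "controller-0: iconfig out of date: target 64fb5d52-1fac-4e76-8093-1e7d53950b2f, "
    "applied 8f822d79-a27b-4d02-b8ad-567042962eed",
]

for host, (target, applied) in stale_hosts(sample).items():
    print(host, "target", target, "applied", applied)
```

This only reports the last warning seen per host. It cannot tell whether the alarm was later cleared without the manifest being applied, which is the behavior described in the initial assessment.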

Severity
--------
Major

Steps to Reproduce
------------------
Deploy a DC system with an IPv4 configuration.

Expected Behavior
-------------------
The DC system is deployed successfully and all subclouds are in a managed, in-sync state with the System Controller.

Actual Behavior
----------------
The subclouds remain offline to the System Controller after deployment.

Reproducibility
---------------
Intermittent

System Configuration
--------------------
Distributed Cloud System

Ghada Khalil (gkhalil)
Changed in starlingx:
importance: Undecided → Medium
assignee: nobody → Bin Qian (bqian20)
tags: added: stx.5.0 stx.distcloud
Ghada Khalil (gkhalil)
description: updated
Bin Qian (bqian20) wrote :
Changed in starlingx:
status: In Progress → Fix Committed
Ghada Khalil (gkhalil)
Changed in starlingx:
status: Fix Committed → Fix Released