StarlingX

Containers: configuration cannot get openstack service list

Bug #1812519 reported by Erich Cordoba on 2019-01-19

This bug affects 1 person

Affects    Status   Importance  Assigned to    Milestone
StarlingX  Invalid  High        Erich Cordoba

Bug Description

Brief Description
-----------------

When running config_controller --kubernetes, step 06 fails. According to /var/log/puppet/latest/puppet.log, the error is related to the openstack service list command run against the internal interface.

Severity
--------
Major: configuration cannot be completed.

Steps to Reproduce
------------------

Perform a fresh install and then run config_controller --kubernetes.

Expected Behavior
------------------
The configuration should be completed.

Actual Behavior
----------------
The config_controller command fails, and the log shows this error:

2019-01-18T23:04:57.517 Error: 2019-01-18 23:04:57 +0000 Failed to apply catalog: Command: 'openstack ["--os-interface", "internal", "service", "list", "--quiet", "--format", "csv", "--long"]' has been running for more than 40 seconds (tried 4, for a total of 170 seconds)

However, running the same command manually works:

[wrsroot@controller-0 ~(keystone_admin)]$ /usr/bin/openstack --os-interface internal service list --quiet --format csv --long
"ID","Name","Type","Description","Enabled"
"5579bbc4905448c0b65716614ac4fd68","barbican","key-manager","Key management Service",True
"f7099fb4e97a4924aff70ed595d3baa7","keystone","identity","OpenStack Identity Service",True
"c003dbc15db0479882e1b57ae643c009","sysinv","platform","SysInvService",True
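One way to compare the manual run with what Puppet reports is to time the same command under the 40 second limit and 4 attempts quoted in the error. The sketch below is only an illustration: `run_with_retries` is a hypothetical helper (not part of StarlingX or Puppet), the 10 second pause between attempts is an assumption, and the command needs the keystone_admin credentials loaded in the environment, as in the session above.

```python
import subprocess
import time

# Command copied from the puppet.log error above.
SERVICE_LIST_CMD = [
    "openstack", "--os-interface", "internal",
    "service", "list", "--quiet", "--format", "csv", "--long",
]


def run_with_retries(cmd, timeout=40, tries=4, pause=10):
    """Run cmd up to `tries` times, killing each attempt after `timeout` seconds.

    Returns (stdout, list of seconds spent per attempt). Raises RuntimeError
    if every attempt times out or exits with a non-zero status.
    """
    elapsed = []
    for attempt in range(1, tries + 1):
        start = time.monotonic()
        try:
            result = subprocess.run(
                cmd, capture_output=True, text=True, timeout=timeout
            )
        except subprocess.TimeoutExpired:
            # The attempt exceeded the limit; record the time and retry.
            elapsed.append(time.monotonic() - start)
        else:
            elapsed.append(time.monotonic() - start)
            if result.returncode == 0:
                return result.stdout, elapsed
        if attempt < tries:
            time.sleep(pause)  # assumed pause; the real provider may differ
    raise RuntimeError(f"{cmd[0]} failed {tries} times, timings: {elapsed}")
```

Calling `run_with_retries(SERVICE_LIST_CMD)` on the controller returns the CSV output together with the time each attempt took, which shows whether the command is merely slow or actually hangs.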

Reproducibility
---------------
100%, although I have only tried twice.

System Configuration
--------------------
All In One with containers.

Branch/Pull Time/Commit
-----------------------
Use this ISO: http://mirror.starlingx.cengn.ca/mirror/starlingx/master/centos/20190117T060000Z/outputs/iso/

Tags:

Erich Cordoba (ericho) wrote on 2019-01-19:

puppet.log (854.9 KiB, application/octet-stream)

Adding log file.

Ghada Khalil (gkhalil) wrote on 2019-01-21:

Marking as release gating until further investigation.

tags:	added: stx.containers
summary:	- configuration cannot get openstack service list
	+ Containers: configuration cannot get openstack service list
Changed in starlingx:
assignee:	nobody → Al Bailey (albailey1974)
importance:	Undecided → High
status:	New → Triaged
tags:	added: stx.2019.05

Al Bailey (albailey1974) wrote on 2019-01-21:

I have not been able to reproduce this issue in AIO SX or DX using a load from Jan 21.

I suspect the issue is that keystone is slow to restart.

Info about keystone service can be found in /var/log/daemon.log
(Starting openstack-keystone)

If there are errors executing keystone, they should be logged in
/var/log/keystone/keystone-api.log

Erich Cordoba (ericho) wrote on 2019-01-22:

Here are some findings.

- First of all, I'm using a libvirt/qemu environment behind a proxy (not the friendliest setup, but good for finding corner cases).

- I tried VirtualBox, as the wiki describes, in an environment with direct internet access, and everything worked.

- I tested two scenarios with the following results:

Using a custom no_proxy
=======================

- In my libvirt/qemu environment I did the following:

Edit the /etc/resolv.conf
Edit /etc/environment

http_proxy=<my-proxy-ip>
https_proxy=<my-proxy-ip>
no_proxy=localhost,127.0.0.1,192.168.206.2,172.16.0.0/16,10.96.0.0/12
# Here 192.168.206.2 was reported as unreachable in puppet.log, so I added it to no_proxy.

Apply the same proxy settings to the docker proxy configuration.

Under these conditions the openstack service list command fails, as originally reported in this bug.

Using a simple no_proxy=127.0.0.1
=================================

Same steps as above, the only difference being no_proxy=127.0.0.1.

In this case config_controller --kubernetes finished with no errors. However, I could not continue because of this error:

controller-0:~$ . /etc/platform/openrc
Openstack Admin credentials can only be loaded from the active controller.

Summary
=======

- My libvirt/qemu setup has working internet connectivity: I can curl internet hosts and run docker pull manually. So I don't think this is caused by not using VirtualBox.

- Changes to no_proxy caused different behavior. In the past I've seen some Golang REST APIs behave oddly behind a proxy, and I had to set a specific no_proxy to get them to work. I'm wondering if this configuration needs a specific no_proxy.
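Since the two scenarios differ only in no_proxy, it can help to check which addresses each value exempts from the proxy. The following is a minimal sketch, not how any particular client does it: `bypasses_proxy` is a hypothetical helper, and the matching rules (exact host, domain suffix, CIDR for IP hosts) are assumptions, because real tools interpret no_proxy differently and some ignore CIDR entries altogether. The address 10.96.0.1 is only an illustrative member of the 10.96.0.0/12 range.

```python
import ipaddress


def bypasses_proxy(host, no_proxy):
    """Return True if `host` matches an entry of a comma-separated no_proxy value.

    Assumed rules (real clients differ): exact host match, domain-suffix
    match for host names, and CIDR match for IP hosts.
    """
    entries = [e.strip() for e in no_proxy.split(",") if e.strip()]
    try:
        host_ip = ipaddress.ip_address(host)
    except ValueError:
        host_ip = None  # host is a name, not an IP address

    for entry in entries:
        if host_ip is not None and "/" in entry:
            try:
                if host_ip in ipaddress.ip_network(entry, strict=False):
                    return True
            except ValueError:
                pass  # not a valid CIDR; fall through to the plain checks
        if host == entry:
            return True
        if host_ip is None and host.endswith("." + entry.lstrip(".")):
            return True
    return False


# Value from the "custom no_proxy" scenario above.
custom = "localhost,127.0.0.1,192.168.206.2,172.16.0.0/16,10.96.0.0/12"
print(bypasses_proxy("192.168.206.2", custom))       # True
print(bypasses_proxy("192.168.206.2", "127.0.0.1"))  # False
print(bypasses_proxy("10.96.0.1", custom))           # True
```

Under these assumed rules, 192.168.206.2 goes direct in the first scenario and through the proxy in the second, which is the one difference between the two runs.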

Erich Cordoba (ericho) wrote on 2019-01-22:

I think I found the root cause of this problem.

I was able to define a NAT network for libvirt/qemu with this xml:

<network>
  <name>default</name>
  <bridge name="stxbr1" stp="off"/>
  <forward mode="nat"/>
  <ip address="10.10.10.1" netmask="255.255.255.0">
  </ip>
</network>

After confirming that the network works (by pinging an external IP), I followed the steps in the wiki and the config_controller command succeeded.

Since this was a setup issue, I think we can lower the importance. We will also need to update the provisioning scripts to set up this NAT network on libvirt.
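As a starting point for that script update, here is a rough sketch of how the NAT network definition could be generated. `nat_network_xml` is a hypothetical helper, not something in the existing provisioning scripts; its default values are the ones from the XML above.

```python
import xml.etree.ElementTree as ET


def nat_network_xml(name="default", bridge="stxbr1",
                    address="10.10.10.1", netmask="255.255.255.0"):
    """Build a libvirt NAT network definition like the one in this comment."""
    network = ET.Element("network")
    ET.SubElement(network, "name").text = name
    ET.SubElement(network, "bridge", {"name": bridge, "stp": "off"})
    ET.SubElement(network, "forward", {"mode": "nat"})
    ET.SubElement(network, "ip", {"address": address, "netmask": netmask})
    return ET.tostring(network, encoding="unicode")


print(nat_network_xml())
```

The output can be saved to a file and loaded with `virsh net-define <file>` followed by `virsh net-start <name>`.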

Al Bailey (albailey1974) on 2019-01-23