Containers: configuration cannot get openstack service list

Bug #1812519 reported by Erich Cordoba
This bug affects 1 person
Affects     Status    Importance   Assigned to     Milestone
StarlingX   Invalid   High         Erich Cordoba

Bug Description

Brief Description
-----------------

When running config_controller --kubernetes, step 06 fails. According to /var/log/puppet/latest/puppet.log, the error is related to the openstack internal service list command.

Severity
--------
Major: configuration cannot be completed.

Steps to Reproduce
------------------

Perform a fresh install and then run config_controller --kubernetes.

Expected Behavior
------------------
The configuration should be completed.

Actual Behavior
----------------
The config_controller command fails, and the log shows this error:

2019-01-18T23:04:57.517 Error: 2019-01-18 23:04:57 +0000 Failed to apply catalog: Command: 'openstack ["--os-interface", "internal", "service", "list", "--quiet", "--format", "csv", "--long"]' has been running for more than 40 seconds (tried 4, for a total of 170 seconds)

However, running that command manually succeeds:

[wrsroot@controller-0 ~(keystone_admin)]$ /usr/bin/openstack --os-interface internal service list --quiet --format csv --long
"ID","Name","Type","Description","Enabled"
"5579bbc4905448c0b65716614ac4fd68","barbican","key-manager","Key management Service",True
"f7099fb4e97a4924aff70ed595d3baa7","keystone","identity","OpenStack Identity Service",True
"c003dbc15db0479882e1b57ae643c009","sysinv","platform","SysInvService",True
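
To compare a manual run against the 40-second limit mentioned in the puppet error, the command can be wrapped with coreutils `timeout`. This is only a sketch: `run_with_limit` is a hypothetical helper name, and the 40-second value is taken from the log message above, not from the puppet source.

```shell
# Hypothetical helper: run a command under a time limit and report how long it took.
# Relies on coreutils `timeout`, which exits with status 124 when the limit is hit.
run_with_limit() {
    rwl_limit=$1
    shift
    rwl_start=$(date +%s)
    rwl_status=0
    timeout "$rwl_limit" "$@" || rwl_status=$?
    rwl_end=$(date +%s)
    # Report on stderr so the command's own stdout stays clean.
    echo "exit=$rwl_status elapsed=$((rwl_end - rwl_start))s" >&2
    return "$rwl_status"
}

# Example, to be run on the controller after loading the admin credentials:
# run_with_limit 40 openstack --os-interface internal service list --quiet --format csv --long
```

An exit status of 124 would suggest the command itself is slow in that shell; a fast, successful run would point at a difference between the interactive environment and the one puppet uses.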

Reproducibility
---------------
100%, although I have only tried twice.

System Configuration
--------------------
All In One with containers.

Branch/Pull Time/Commit
-----------------------
Use this ISO: http://mirror.starlingx.cengn.ca/mirror/starlingx/master/centos/20190117T060000Z/outputs/iso/

Revision history for this message
Erich Cordoba (ericho) wrote :

Adding log file.

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Marking as release gating until further investigation.

tags: added: stx.containers
summary: - configuration cannot get openstack service list
+ Containers: configuration cannot get openstack service list
Changed in starlingx:
assignee: nobody → Al Bailey (albailey1974)
importance: Undecided → High
status: New → Triaged
tags: added: stx.2019.05
Revision history for this message
Al Bailey (albailey1974) wrote :

I have not been able to reproduce this issue in AIO SX or DX using a load from Jan 21.

I suspect the issue is that keystone is slow to restart.

Info about the keystone service can be found in /var/log/daemon.log
(Starting openstack-keystone)

If there are errors executing keystone, they should be logged in
 /var/log/keystone/keystone-api.log

Revision history for this message
Erich Cordoba (ericho) wrote :

Here are some findings.

- First of all, I'm using a libvirt/qemu environment behind a proxy (not the most friendly setup, but good for finding corner cases).

- I tried with VirtualBox, as the wiki says, in an environment with direct access to the internet, and everything worked.

- I tested two scenarios with the following results:

Using a custom no_proxy
=======================

- In my libvirt/qemu environment I did the following:

Edit the /etc/resolv.conf
Edit /etc/environment

http_proxy=<my-proxy-ip>
https_proxy=<my-proxy-ip>
no_proxy=no_proxy=localhost,127.0.0.1,192.168.206.2,172.16.0.0/16,10.96.0.0/12
# Here 192.168.206.2 was reported as unreachable in the puppet.log, so I added it to no_proxy.

Apply the same proxy settings to the docker proxy configuration.

Under these conditions, openstack service list fails, as originally reported in this bug.
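
One way to sanity-check a no_proxy value is to split it the way a simple client would and see which entries match a given host. The sketch below is an illustration under stated assumptions: `in_no_proxy` is a hypothetical helper that does exact string matching only, whereas real clients differ in how they treat domain suffixes and CIDR ranges. It also shows that, if the duplicated `no_proxy=` prefix in the line quoted above was literally present in /etc/environment (rather than being a typo in this comment), the first entry would be read as `no_proxy=localhost` instead of `localhost`.

```shell
# Hypothetical helper: succeed if HOST exactly equals one comma-separated entry of LIST.
# Exact matching only; CIDR entries such as 10.96.0.0/12 are NOT expanded here.
in_no_proxy() {
    np_host=$1
    np_list=$2
    np_old_ifs=$IFS
    IFS=,
    for np_entry in $np_list; do
        if [ "$np_entry" = "$np_host" ]; then
            IFS=$np_old_ifs
            return 0
        fi
    done
    IFS=$np_old_ifs
    return 1
}

# Value as it would be read if the quoted line were taken literally.
value='no_proxy=localhost,127.0.0.1,192.168.206.2,172.16.0.0/16,10.96.0.0/12'

for host in localhost 127.0.0.1 192.168.206.2; do
    if in_no_proxy "$host" "$value"; then
        echo "$host: matches a no_proxy entry"
    else
        echo "$host: no exact match"
    fi
done
```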

Using a simple no_proxy=127.0.0.1
=================================

Same steps as above, the only difference being no_proxy=127.0.0.1

In this case config_controller --kubernetes finished with no errors. However, I could not continue due to this error:

controller-0:~$ . /etc/platform/openrc
Openstack Admin credentials can only be loaded from the active controller.

Summary
=======

- My libvirt/qemu setup works in terms of internet connectivity: I can curl internet hosts and run docker pull manually. So I don't think this is caused by not using VirtualBox.

- Changes to no_proxy caused different behavior. In the past I have seen some Golang REST APIs behave oddly behind a proxy, and I had to set a specific no_proxy to get them to work. I'm wondering whether this configuration needs a specific no_proxy.

Revision history for this message
Erich Cordoba (ericho) wrote :

I think I found the root cause of this problem.

I was able to define a NAT network for libvirt/qemu with this xml:

<network>
 <name>default</name>
 <bridge name="stxbr1" stp="off"/>
 <forward mode="nat"/>
 <ip address="10.10.10.1" netmask="255.255.255.0">
 </ip>
</network>
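
For reference, a definition like the one above is typically loaded with `virsh net-define` and activated with `virsh net-start`. The sketch below assumes the XML was saved as `stx-nat.xml` (a hypothetical filename) and that no network named `default` is already defined on the host. By default it only prints the commands; setting `DRY_RUN=0` would execute them.

```shell
# Sketch only: stx-nat.xml is a hypothetical file holding the <network> XML above.
NET_XML=${NET_XML:-stx-nat.xml}
NET_NAME=${NET_NAME:-default}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_nat_network() {
    run virsh net-define "$NET_XML"      # register the persistent network definition
    run virsh net-start "$NET_NAME"      # bring up the network and its bridge
    run virsh net-autostart "$NET_NAME"  # start it automatically from now on
}

setup_nat_network
```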

After testing that the network works (by pinging an external IP), I followed the steps in the wiki and the config_controller command succeeded.

Since this was a setup issue, I think we can lower the importance. We will also need to update the provisioning scripts to set up this NAT network on libvirt.

Al Bailey (albailey1974)
Changed in starlingx:
assignee: Al Bailey (albailey1974) → nobody
Frank Miller (sensfan22)
Changed in starlingx:
status: Triaged → Invalid
assignee: nobody → Erich Cordoba (ericho)
Ken Young (kenyis)
tags: added: stx.2.0
removed: stx.2019.05