[ansible] cannot start armada service when deploy with private registry

Bug #1830319 reported by Yan Chen
Affects: StarlingX
Status: Fix Released
Importance: Medium
Assigned to: Tee Ngo

Bug Description

We are trying to use Ansible to deploy StarlingX with a private registry.
The registry config in the localhost.yml file is as below:

docker_registries:
  - 10.239.40.221:5000

But when running "system application-upload", we found the following error log in sysinv.log:
2019-05-24 03:28:10.735 98356 ERROR sysinv.conductor.kube_app [-] Docker error while launching Armada container: 500 Server Error: Internal Server Error ("Get https://quay.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)")
2019-05-24 03:28:10.747 98356 ERROR sysinv.conductor.kube_app [-] Application upload aborted!.

And then application status shows "upload-failed".

We checked the sysinv.conf file; the registry location was not successfully modified:
controller-0:/$ sudo cat /etc/sysinv/sysinv.conf|grep armada
armada_image_tag=quay.io/airshipit/armada:af8a9ffd0873c2fbc915794e235dbd357f2adab1

But if we use "config_controller --force", the registry is modified as we expected:
controller-0:~$ sudo cat /etc/sysinv/sysinv.conf |grep armada
Password:
armada_image_tag=10.239.40.221:5000/airshipit/armada:af8a9ffd0873c2fbc915794e235dbd357f2adab1

Revision history for this message
Ghada Khalil (gkhalil) wrote :

Marking as release gating until further investigation; this is an Ansible issue with a private registry, which is a supported stx config.

summary: - [ansible] cannot start armada service when deploy with private registry.
+ [ansible] cannot start armada service when deploy with private registry
tags: added: stx.config
tags: added: stx.2.0
Changed in starlingx:
importance: Undecided → Medium
status: New → Triaged
assignee: nobody → Tee Ngo (teewrs)
Revision history for this message
Yan Chen (ychen2u) wrote :

My localhost.yml content is as below; we are running with a local DNS and a local registry:

---
system_mode: duplex

management_subnet: 192.178.204.0/28

external_oam_subnet: 10.10.10.0/24
external_oam_gateway_address: 10.10.10.1
external_oam_floating_address: 10.10.10.2
external_oam_node_0_address: 10.10.10.3
external_oam_node_1_address: 10.10.10.4

admin_username: admin
admin_password: Local.123

dns_servers:
  - 10.248.2.1

docker_registries:
  - 10.239.40.221:5000

is_secure_registry: False

Revision history for this message
Yan Chen (ychen2u) wrote :

Accordingly, here's the config file for "config_controller --force":

[SYSTEM]
SYSTEM_MODE = duplex
SYSTEM_TYPE = All-in-one

[VERSION]
RELEASE = 19.01

[AUTHENTICATION]
ADMIN_PASSWORD = Local.123

[MGMT_NETWORK]
CIDR = 192.178.204.0/24
MULTICAST_CIDR = 239.1.1.0/28
DYNAMIC_ALLOCATION = Y
LOGICAL_INTERFACE = LOGICAL_INTERFACE_2

[LOGICAL_INTERFACE_2]
LAG_INTERFACE = N
INTERFACE_MTU = 1500
INTERFACE_LINK_CAPACITY = 10000
#virbr2
INTERFACE_PORTS = ens7

[OAM_NETWORK]
CIDR = 10.10.10.0/24
GATEWAY = 10.10.10.1
IP_FLOATING_ADDRESS = 10.10.10.2
IP_UNIT_0_ADDRESS = 10.10.10.3
IP_UNIT_1_ADDRESS = 10.10.10.4
LOGICAL_INTERFACE = LOGICAL_INTERFACE_1

[LOGICAL_INTERFACE_1]
LAG_INTERFACE = N
INTERFACE_MTU = 1500
#virbr1
INTERFACE_PORTS = ens6

[DNS]
# DNS Configuration
NAMESERVER_1=10.248.2.1
NAMESERVER_2=10.248.2.1
NAMESERVER_3=10.248.2.1

[DOCKER_REGISTRY]
# Docker Registry Configuration
DOCKER_K8S_REGISTRY=10.239.40.221:5000
DOCKER_GCR_REGISTRY=10.239.40.221:5000
DOCKER_QUAY_REGISTRY=10.239.40.221:5000
DOCKER_DOCKER_REGISTRY=10.239.40.221:5000
IS_SECURE_REGISTRY=False

Revision history for this message
Fernando Hernandez Gonzalez (fhernan2) wrote :

Hi guys, I faced similar issues when using Ansible last Friday (5/24). My comments:

• Build info: 20190521T132734Z

First try, steps I followed:
- Installed Stx and changed the wrsroot password.
- Followed "Ansible bootstrap install steps"
   ip address add 192.168.90.3/24 dev eno1
   ip link set up dev eno1
   route add default gw 192.168.90.1 eno1
- Pinged the docker registry 192.168.90.60 successfully
- Created /home/wrsroot/localhost.yml

-------------localhost.yml--------------------------
# Mandatory
system_mode: duplex

# Optional
external_oam_subnet: 192.168.90.0/24
external_oam_gateway_address: 192.168.90.1
external_oam_floating_address: 192.168.90.240
external_oam_node_0_address: 192.168.90.106
external_oam_node_1_address: 192.168.90.105
management_subnet: 10.10.62.0/24
docker_registries:
  - 192.168.90.60
dns_servers:
  - 192.168.90.60
admin_password: St4rlingX*
ansible_become_pass: St4rlingX*

-------------localhost.yml--------------------------

- Ran "ansible-playbook /usr/share/ansible/stx-ansible/playbooks/bootstrap/bootstrap.yml". Please check attached "Ansible playbook command output 1" file where is displaying following warning.
   [WARNING]: Module remote_tmp /tmp/.ansible-root/tmp did not exist and was
   created with a mode of 0700, this may cause issues when running as another
   user. To avoid this, create the remote_tmp dir with the correct permissions
   manually

- I created the /tmp/.ansible-root/tmp folders and did 'chmod 777' on all of them.

- I ran it a second time, and for some reason it seems the "localhost.yml" placed in HOME was not picked up, since it pings 8.8.8.8 instead of my "192.168.90.60" docker registry IP. Please check the attached "Ansible bootstrap install pings 8.8.8.8" file showing the error below:

  TASK [validate-config : Fail if DNS Server is unreachable]
  **********************************************************************************************************
   fatal: [localhost]: FAILED! => {"changed": false, "msg": "The provided DNS Server 8.8.8.8 is unreachable."}

- Erich helped me out with some troubleshooting; we added the "is_secure_registry" flag to the "localhost.yml" and ran it again [it was about the fifth time, I do not remember the exact number after several tries].

Please look at the "Ansible bootstrap install output 5" for details.
Please see final "localhost.yml" used.

-------------localhost.yml--------------------------
# Mandatory
system_mode: duplex

# Optional
external_oam_subnet: 192.168.90.0/24
external_oam_gateway_address: 192.168.90.1
external_oam_floating_address: 192.168.90.240
external_oam_node_0_address: 192.168.90.106
external_oam_node_1_address: 192.168.90.105
management_subnet: 10.10.62.0/24
docker_registries:
  - 192.168.90.60
is_secure_registry: False
dns_servers:
  - 192.168.90.60
admin_password: St4rlingX*
ansible_become_pass: St4rlingX*
-------------localhost.yml--------------------------

- The wiki does not have instructions for a local registry; it would be beneficial for the community to update it --> https://wiki.openstack.org/wiki/StarlingX/Containers/InstallationOnAIODX#Bootstrap_the_controller

- After all of this my assumption is that after yo...


Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix proposed to config (master)

Fix proposed to branch: master
Review: https://review.opendev.org/661657

Changed in starlingx:
status: Triaged → In Progress
Revision history for this message
Tee Ngo (teewrs) wrote :

Fernando, if your provided config values (dns_servers in your example) are not reflected in ansible.log, then there are two possibilities:

a) You have a format error (could be as subtle as an extra space) in the override file, in which case the playbook will skip the user input and use the values in default.yml.
b) Ansible cannot locate your override file. For instance, if you run the bootstrap playbook as root, Ansible will look for localhost.yml under /root (the root user's $HOME) instead of /home/wrsroot.

Based on what you described, a) is likely the cause of some of your failed attempts. I will make a change so that the playbook catches an override file with a format error.
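
As a hypothetical illustration of a) (the IP below is a placeholder, not Fernando's actual file), a single stray leading space is enough to make the YAML unparseable, after which the playbook falls back to the values in default.yml:

-------------localhost.yml (broken example)--------------------------
dns_servers:
  - 192.168.90.60
 docker_registries:
  - 192.168.90.60
-------------localhost.yml--------------------------

The stray space before docker_registries makes the whole file fail to parse, so none of the overrides are applied.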

Regarding the upload failure described in this LP, the code review is up. Please try again once the fix is in the build. Alternatively, you can download that review (git review -d <Change Id>), make a custom build, and test it.

Revision history for this message
Fernando Hernandez Gonzalez (fhernan2) wrote :

Hi Tee, thanks for your feedback. I did check the localhost.yml format and it was reported as correct. I will wait for the patch you mentioned in Tuesday's validation meeting; so far I have used "sudo config_controller --force" to continue with the install.

Revision history for this message
OpenStack Infra (hudson-openstack) wrote : Fix merged to config (master)

Reviewed: https://review.opendev.org/661657
Committed: https://git.openstack.org/cgit/starlingx/config/commit/?id=fc3eb127fc706bc3c6aa91424f966247bb29a7fc
Submitter: Zuul
Branch: master

commit fc3eb127fc706bc3c6aa91424f966247bb29a7fc
Author: Tee Ngo <email address hidden>
Date: Mon May 27 16:59:26 2019 -0400

    Set value for individual registry service parameter

    This commit implements a near/mid-term solution to help accelerate
    the switch in automation to using Ansible bootstrap.

    The long term solution is to allow dynamic addition of new
    registries for custom applications. This will require change to
    sysinv for a complete solution.

    Closes-Bug: 1830319
    Change-Id: I9b92fd0c5ddc226cb47e6586f8dfbdff44eed96b
    Signed-off-by: Tee Ngo <email address hidden>

Changed in starlingx:
status: In Progress → Fix Released
Revision history for this message
Tee Ngo (teewrs) wrote :

To use a unified private registry, the configuration format is:

docker_registries:
  unified: <custom-ip-or-domain>

By default, the is_secure_registry flag is True. If your private registry is insecure, add the following line to your localhost.yml:

is_secure_registry: False

Refer to default.yml under /usr/share/ansible/stx-ansible/playbooks/bootstrap/host_vars for details.
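
For reference, a minimal post-fix localhost.yml using the unified format might look like the sketch below; the registry address and passwords are placeholders, and the exact supported keys are those documented in default.yml:

-------------localhost.yml (sketch)--------------------------
---
system_mode: duplex

# Redirect the k8s, gcr, quay and docker registries to one private registry.
docker_registries:
  unified: 10.239.40.221:5000

# Only needed when the private registry is insecure (no valid TLS certificate).
is_secure_registry: False

admin_password: <admin-password>
ansible_become_pass: <sysadmin-password>
-------------localhost.yml--------------------------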
