Rocky setup hosts fails to recreate destroyed container after upgrading from old versions

Bug #1794613 reported by Ionuț Bîru
This bug affects 2 people
Affects            Status   Importance  Assigned to      Milestone
OpenStack-Ansible  Invalid  Undecided   Jesse Pretorius

Bug Description

I had an issue with the repo install and tried to resolve it by recreating the container.

1) openstack-ansible lxc-containers-destroy.yml -l ctrl1_repo_container-fd1b49dc
2) removed the facts from /etc/openstack_deploy/ansible_facts
3) openstack-ansible setup-hosts.yml

The recreated container doesn't have any networking configured, and the run fails at this task:
TASK [lxc_container_create : Execute first script] *******************************************************************************************************************************************
ok: [ctrl1_horizon_container-2a14570b -> myip]
ok: [ctrl1_utility_container-abc87ff8 -> myip]
ok: [ctrl1_keystone_container-1f323e9c -> myip]
ok: [ctrl1_neutron_server_container-aaf0592d -> myip]
ok: [ctrl1_glance_container-fc86c0d8 -> myip]
ok: [ctrl1_neutron_agents_container-8acff814 -> myip]
ok: [ctrl1_ceilometer_api_container-0d39d51c -> myip]
ok: [ctrl1_cinder_api_container-2b8f260e -> myip]
ok: [ctrl1_galera_container-2e70e1ef -> myip]
ok: [ctrl1_memcached_container-58ddea48 -> myip]
ok: [ctrl1_fleio_container-d248d25d -> myip]
ok: [ctrl1_gnocchi_container-ffe9e06c -> myip]
ok: [ctrl1_rsyslog_container-c355e62f -> myip]
ok: [ctrl1_rabbit_mq_container-442bbf9a -> myip]
ok: [ctrl1_designate_container-f1d63268 -> myip]
ok: [ctrl1_ceilometer_central_container-e7b9a1c6 -> myip]
ok: [ctrl1_nova_api_container-8e2aeccf -> myip]
ok: [ctrl1_heat_api_container-bbeabdf5 -> myip]
FAILED - RETRYING: Execute first script (5 retries left).
FAILED - RETRYING: Execute first script (4 retries left).
FAILED - RETRYING: Execute first script (3 retries left).
FAILED - RETRYING: Execute first script (2 retries left).
FAILED - RETRYING: Execute first script (1 retries left).
fatal: [ctrl1_repo_container-fd1b49dc -> myip]: FAILED! => {"attempts": 5, "changed": true, "cmd": ["/var/lib/lxc/ctrl1_repo_container-fd1b49dc/container-first-run.sh"], "delta": "0:00:01.543488", "end": "2018-09-26 23:23:29.123600", "msg": "non-zer

root@winterfell:/var/lib/lxc/ctrl1_repo_container-fd1b49dc# ls -lh
total 44K
-rwxr-xr-x 1 root root 99 Sep 26 23:19 autodev
-rw-r--r-- 1 root root 1.2K Sep 26 23:21 config
-rw-r--r-- 1 root root 929 Sep 26 23:20 config.13135.2018-09-26@23:21:14~
-rw-r--r-- 1 root root 945 Sep 26 23:21 config.13192.2018-09-26@23:21:14~
-rw-r--r-- 1 root root 960 Sep 26 23:21 config.13248.2018-09-26@23:21:15~
-rw-r--r-- 1 root root 973 Sep 26 23:21 config.13276.2018-09-26@23:21:15~
-rw-r--r-- 1 root root 1.1K Sep 26 23:21 config.13304.2018-09-26@23:21:15~
-rw-r--r-- 1 root root 1.1K Sep 26 23:21 config.14024.2018-09-26@23:21:30~
-rw-r--r-- 1 root root 1.1K Sep 26 23:21 config.15221.2018-09-26@23:21:59~
-rwxr-xr-x 1 root root 1.1K Sep 26 23:22 container-first-run.sh
drwxr-xr-x 21 root root 4.0K Sep 26 23:28 rootfs

root@winterfell:/var/lib/lxc/ctrl1_repo_container-fd1b49dc# lxc-attach --name ctrl1_repo_container-fd1b49dc
root@ctrl1_repo_container-fd1b49dc:/# ip a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
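
The same check can be done without an interactive shell. The following is a minimal sketch, not part of the original report: it assumes the output of `ip -o link show` is piped in, and the helper name `has_non_loopback` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if the `ip -o link show` output on
# stdin lists at least one interface other than lo, and fails (exit 1) otherwise.
has_non_loopback() {
    # With -o, each interface is one line: "<index>: <name>: <flags> ..."
    awk -F': ' 'NF >= 2 && $2 != "lo" { found = 1 } END { exit(found ? 0 : 1) }'
}

# Example, run on the LXC host (container name taken from the report):
# lxc-attach --name ctrl1_repo_container-fd1b49dc -- ip -o link show \
#     | has_non_loopback || echo "only the loopback interface is present"
```

For the container shown above, the helper would fail because `lo` is the only interface listed.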

Ionuț Bîru (ionut-3) wrote :

So I managed to create the container correctly, but I had to add the legacy configs manually for config and eth1.

It seems that the tasks only work when the container configs already exist, in which case they try to convert them to the new syntax.
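
To see which syntax a given container config uses, something like the following could help. This is a sketch under an assumption the report does not confirm: that "legacy" refers to the `lxc.network.*` keys and "new syntax" to the `lxc.net.<n>.*` keys introduced in LXC 2.1. The helper name `lxc_config_style` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print which network key style an LXC config file uses.
# Assumption: "legacy" = lxc.network.* keys, "new" = lxc.net.<n>.* keys.
# If both styles are present, "legacy" is reported.
lxc_config_style() {
    config_file="$1"
    if grep -q '^[[:space:]]*lxc\.network\.' "$config_file"; then
        echo legacy
    elif grep -q '^[[:space:]]*lxc\.net\.[0-9]' "$config_file"; then
        echo new
    else
        echo none
    fi
}

# Example, run on the LXC host (path taken from the report):
# lxc_config_style /var/lib/lxc/ctrl1_repo_container-fd1b49dc/config
```

A result of `none` would match the symptom described here: a config with no network section at all.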

Changed in openstack-ansible:
assignee: nobody → Jesse Pretorius (jesse-pretorius)
Jesse Pretorius (jesse-pretorius) wrote :

On a fresh Ubuntu Xenial host, I tried the following and was unable to replicate the issue:

cd /opt
git clone https://github.com/openstack/openstack-ansible
cd openstack-ansible
git checkout origin/stable/rocky # current SHA: c14e149d04a7a97c36ff4193ff59fdf6318a6d6d
./scripts/bootstrap-ansible.sh
./scripts/bootstrap-aio.sh
cd playbooks/
openstack-ansible setup-hosts.yml
openstack-ansible lxc-containers-destroy.yml --limit aio1_repo_container-b2f0a852 # answer yes,yes
rm -rf /etc/openstack_deploy/ansible_facts
openstack-ansible lxc-containers-create.yml --limit lxc_hosts,aio1_repo_container-b2f0a852

It seems that either the issue is resolved or more information is required to replicate the issue.

Changed in openstack-ansible:
status: New → Incomplete
Ionuț Bîru (ionut-3) wrote :

I found out why this was happening. The container cache was too old and, I guess, wasn't prepared for the new configuration layout.
I fixed it by deleting /var/cache/lxc/download.
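
The workaround can be sketched as a small script. This is an illustration rather than an official procedure: the helper name `clear_lxc_download_cache` is hypothetical, and the follow-up create command is adapted from the reproduction steps earlier in this thread, with the container name from the original report substituted in.

```shell
#!/bin/sh
# Hypothetical helper: remove a stale LXC download cache directory.
# The default path is the one named in this bug report.
clear_lxc_download_cache() {
    cache_dir="${1:-/var/cache/lxc/download}"
    if [ -d "$cache_dir" ]; then
        rm -rf -- "$cache_dir"
        echo "removed $cache_dir"
    else
        echo "nothing to remove at $cache_dir"
    fi
}

# Example, run as root on the LXC host, then from the playbooks directory:
# clear_lxc_download_cache
# openstack-ansible lxc-containers-create.yml --limit lxc_hosts,ctrl1_repo_container-fd1b49dc
```

Passing the directory as an argument makes it possible to try the helper against a scratch directory before touching the real cache.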

summary: - Rocky setup hosts fails to recreate destroyed container
+ Rocky setup hosts fails to recreate destroyed container after upgrading
+ from old versions
Jonathan Rosser (jrosser) wrote :

Local troubleshooting has resolved this.

Changed in openstack-ansible:
status: Incomplete → Invalid