os-nova-install stalls for hours and re-run requires deleting cell
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack-Ansible | Fix Released | Undecided | Unassigned |
Bug Description
tl;dr: The install was going fine until it got stuck trying to create the nova service/endpoints. It did not fail, and the nova service/endpoints were never created. I killed the playbook run. When I re-ran os-nova-install.yml after that, a cell had already been created, and I had to delete the cell before the nova install would complete.
####
While trying to install nova, the playbook was stuck here for hours
####
TASK [os_nova : Ensure nova service] *******
Friday 10 February 2017 08:03:14 +0000 (0:00:03.728) 0:16:14.138 *******
skipping: [infra01_
skipping: [infra03_
skipping: [infra02_
skipping: [infra01_
skipping: [infra03_
skipping: [infra02_
skipping: [infra03_
skipping: [infra01_
skipping: [infra03_
skipping: [infra02_
skipping: [compute02]
skipping: [compute03]
skipping: [compute01]
skipping: [infra01_
skipping: [infra03_
skipping: [infra02_
skipping: [infra01_
skipping: [infra03_
skipping: [infra02_
skipping: [infra02_
####
nova endpoint/service never got created
####
(rally) root@melv7301-
+------
| ID | Region | Service Name | Service Type | Enabled | Interface | URL |
+------
| 185eaaad25d54c7
| 4f5a30a3ff0d452
| 560ca37e821c45b
| 7b2d985d7ef74a0
| 84741cadc2df45b
| 9e873b34b9a344c
| bc24011f9732428
| be50be66aa1e406
| dbe077dfab504c3
| dc85a1c1b85f4cc
| e4890d4332e94c4
| ed86117fcca848c
+------
####
`ps faux` on the deployment host
####
root 25409 0.0 0.0 55564 32996 ? Ss 06:36 0:13 tmux
root 25410 0.0 0.0 28396 9484 pts/18 Ss 06:36 0:00 \_ -bash
root 32868 0.0 0.0 12688 1596 pts/18 S+ 07:46 0:00 \_ bash /usr/local/
root 32877 10.1 0.4 795292 566236 pts/18 Sl+ 07:46 28:40 \_ /opt/ansible-
root 10544 0.0 0.4 769692 537808 pts/18 S+ 08:03 0:00 \_ /opt/ansible-
root 10549 0.0 0.0 44024 2524 pts/18 S+ 08:03 0:00 \_ ssh -C -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyCh
root 32905 0.0 0.0 44760 2008 ? Ss 07:47 0:04 ssh: /root/.
####
`ps faux` on the container it is attached to
####
root@infra01:~# lxc-attach -n infra01_
root@infra01-
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 3055 0.0 0.0 21192 3556 ? Ss 06:30 0:00 /bin/bash
root 3065 0.0 0.0 37364 3368 ? R+ 06:30 0:00 \_ ps faux
root 2817 0.0 0.0 4508 708 ? S 02:03 0:00 /bin/sh -c /usr/bin/python && sleep 0
root 2818 0.0 0.1 32412 10188 ? S 02:03 0:00 \_ /usr/bin/python
root 2819 0.0 0.6 103196 53048 ? S 02:03 0:00 \_ /usr/bin/python /tmp/ansible_
root 1 0.0 0.0 37088 4592 ? Ss 01:56 0:00 /sbin/init
root 41 0.0 0.0 43844 6948 ? Ss 01:56 0:00 /lib/systemd/
root 76 0.0 0.0 28980 2796 ? Ss 01:56 0:00 /usr/sbin/cron -f
syslog 78 0.0 0.0 256400 2672 ? Ssl 01:56 0:00 /usr/sbin/rsyslogd -n
root 145 0.0 0.0 16128 2628 ? Ss 01:56 0:00 /sbin/dhclient -1 -v -pf /run/dhclient.
root 194 0.0 0.0 65520 3916 ? Ss 01:56 0:00 /usr/sbin/sshd -D
root 196 0.0 0.0 15756 2128 pts/2 Ss+ 01:56 0:00 /sbin/agetty --noclear --keep-baud pts/2 115200 38400 9600 vt220
root 197 0.0 0.0 15756 2124 pts/3 Ss+ 01:56 0:00 /sbin/agetty --noclear --keep-baud pts/3 115200 38400 9600 vt220
root 198 0.0 0.0 15756 2172 pts/1 Ss+ 01:56 0:00 /sbin/agetty --noclear --keep-baud pts/1 115200 38400 9600 vt220
root 199 0.0 0.0 15756 2132 pts/0 Ss+ 01:56 0:00 /sbin/agetty --noclear --keep-baud pts/0 115200 38400 9600 vt220
root 200 0.0 0.0 15756 2132 lxc/console Ss+ 01:56 0:00 /sbin/agetty --noclear --keep-baud console 115200 38400 9600 vt220
nova 2808 1.3 1.4 207312 122420 ? Ss 02:03 2:24 /openstack/
nova 2824 0.0 1.5 213796 124860 ? S 02:03 0:08 \_ /openstack/
nova 2825 0.0 1.5 214052 125004 ? S 02:03 0:08 \_ /openstack/
nova 2826 0.0 1.5 214052 125068 ? S 02:03 0:08 \_ /openstack/
nova 2827 0.0 1.5 213796 124860 ? S 02:03 0:08 \_ /openstack/
####
after killing the run and restarting
####
TASK [os_nova : Perform cell_v2 initial cell setup] *******
Friday 10 February 2017 13:27:50 +0000 (0:00:04.843) 0:02:28.021 *******
fatal: [infra01_
####
resolution
####
I resolved this by attaching to the nova_api container mentioned above and running:
root@infra02-
(nova-master) root@infra02-
+------
| Name | UUID |
+------
| cell0 | 00000000-
| cell1 | 4ca40cd9-
+------
(nova-master) root@infra02-
Then I re-ran os-nova-install.yml.
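The exact commands used above are truncated in this report, so the following is only a rough sketch of the same cleanup step, not the commands that were actually run. It assumes the cell table was printed by `nova-manage cell_v2 list_cells` and that the leftover cell is the one not named cell0; the report confirms neither. The function names and the cell1 UUID in the sample are hypothetical. The script only builds the delete command as a string and does not execute anything.

```python
import shlex

# Hypothetical sample in the same shape as the table shown in this report.
# The cell1 UUID is made up for illustration only.
SAMPLE = """\
+-------+--------------------------------------+
| Name  | UUID                                 |
+-------+--------------------------------------+
| cell0 | 00000000-0000-0000-0000-000000000000 |
| cell1 | 11111111-2222-3333-4444-555555555555 |
+-------+--------------------------------------+
"""


def parse_cells(table_text):
    """Return {name: uuid} read from the first two columns of the table."""
    cells = {}
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blanks
        columns = [c.strip() for c in line.strip("|").split("|")]
        if len(columns) < 2 or columns[0] == "Name":
            continue  # skip the header row
        cells[columns[0]] = columns[1]
    return cells


def delete_commands(cells, keep=("cell0",)):
    """Build (but do not run) one delete command per cell not in `keep`.

    Assumption: keeping cell0 and deleting everything else matches the
    cleanup described in this report.
    """
    return [
        "nova-manage cell_v2 delete_cell --cell_uuid " + shlex.quote(uuid)
        for name, uuid in sorted(cells.items())
        if name not in keep
    ]


if __name__ == "__main__":
    for command in delete_commands(parse_cells(SAMPLE)):
        print(command)
```

Printing the command instead of running it leaves room to check the UUID by hand before deleting anything inside the nova_api container.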
Resolved by I078caf682aa01db6d5a1472946b25159f3473586