n-c-c/next fails cloud-compute-relation-changed when migration-auth-type set for Precise-Icehouse

Bug #1500589 reported by Ryan Beisner on 2015-09-28
This bug affects 1 person
Affects                                          Status        Importance  Assigned to  Milestone
nova-cloud-controller (Juju Charms Collection)   Fix Released  Medium      Liam Young   15.10
nova-compute (Juju Charms Collection)            Invalid       Undecided   Liam Young

Bug Description

When nova-compute/next has migration-auth-type set, nova-cloud-controller/next (rev188 or later) deploys in the following error state:

hook failed: "cloud-compute-relation-changed" for nova-compute:cloud-compute

** NOTE: This occurs even when forward and reverse DNS are sane. **

Proof Scenarios
===============

1.
# PASS
# all stable charms
# with migration-auth-type set
http://10.245.162.77:8080/view/Dashboards/view/OpenStack%20Deploy/job/deploy_with_deployer/12145/
http://10.245.162.77:8080/view/Dashboards/view/OpenStack%20Deploy/job/deploy_with_deployer/12057/
http://10.245.162.77:8080/job/mojo_runner_baremetal/470/

2.
# FAIL
# all next charms
# with migration-auth-type set
http://10.245.162.77:8080/view/Dashboards/view/OpenStack%20Deploy/job/deploy_with_deployer/12111/
http://10.245.162.77:8080/view/Dashboards/view/OpenStack%20Deploy/job/deploy_with_deployer/12095/
http://10.245.162.77:8080/job/mojo_runner_baremetal/469/

3.
# PASS
# all next charms, except n-c-c --> substitute stable version
# with migration-auth-type set
# n-c/next
# n-c-c stable
http://10.245.162.77:8080/view/Dashboards/view/OpenStack%20Deploy/job/deploy_with_deployer/12170/

4.
# FAIL
# all next charms, except n-c --> substitute stable version
# with migration-auth-type set
# n-c stable
# n-c-c/next
http://10.245.162.77:8080/view/Dashboards/view/OpenStack%20Deploy/job/deploy_with_deployer/12169/

5.
# PASS
# all next charms
# *without* migration-auth-type set
http://10.245.162.77:8080/view/Dashboards/view/OpenStack%20Deploy/job/deploy_with_deployer/12168/

6.
# LAST KNOWN GOOD
# n-c-c/next: rev187
http://10.245.162.77:8080/view/Dashboards/view/OpenStack%20Deploy/job/deploy_with_deployer/11649/

...

Logic
=====

Given that:

* Precise-Icehouse does not deploy cleanly, on metal or on OpenStack, with the next n-c-c charm in the next.yaml bundle; and

* Precise-Icehouse begins to deploy cleanly, if the stable n-c-c charm is substituted into the next.yaml bundle; and

* Precise-Icehouse (next) last-known good next.yaml deploy was n-c-c/next rev 187.

We can assert that:

nova-cloud-controller/next, rev 188 or later, is where this breakage was introduced.

Because the amulet tests do not set migration-auth-type, this was not detected during the merge proposal testing process. Recommend adjusting n-c-c and n-c tests to exercise migration-auth-type.

Ryan Beisner (1chb1n) on 2015-09-28
summary: - cloud-compute-relation-changed -> nova_cc_utils.py -> ssh_known_host_key
- -> IndexError: list index out of range
+ precise-icehouse deploy fails (cloud-compute-relation-changed ->
+ nova_cc_utils.py -> ssh_known_host_key -> IndexError: list index out of
+ range)
Ryan Beisner (1chb1n) on 2015-10-05
Changed in nova-cloud-controller (Juju Charms Collection):
status: New → Invalid
Ryan Beisner (1chb1n) wrote :

Actually ... reopening, as this has also been observed and confirmed in our bare metal tests, as well as OpenStack-on-OpenStack -- both with sane A/PTR records verified.

Changed in nova-cloud-controller (Juju Charms Collection):
status: Invalid → New
description: updated
Ryan Beisner (1chb1n) wrote :

# n-c-c/next unit trace...

http://paste.ubuntu.com/12694636/

 -or-

2015-10-05 16:02:50 INFO cloud-compute-relation-changed Traceback (most recent call last):
2015-10-05 16:02:50 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/cloud-compute-relation-changed", line 1045, in <module>
2015-10-05 16:02:50 INFO cloud-compute-relation-changed main()
2015-10-05 16:02:50 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/cloud-compute-relation-changed", line 1039, in main
2015-10-05 16:02:50 INFO cloud-compute-relation-changed hooks.execute(sys.argv)
2015-10-05 16:02:50 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/charmhelpers/core/hookenv.py", line 704, in execute
2015-10-05 16:02:50 INFO cloud-compute-relation-changed self._hooks[hook_name]()
2015-10-05 16:02:50 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/cloud-compute-relation-changed", line 585, in compute_changed
2015-10-05 16:02:50 INFO cloud-compute-relation-changed ssh_compute_add(key, rid=rid, unit=unit)
2015-10-05 16:02:50 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/nova_cc_utils.py", line 804, in ssh_compute_add
2015-10-05 16:02:50 INFO cloud-compute-relation-changed add_known_host(host, unit, user)
2015-10-05 16:02:50 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/nova_cc_utils.py", line 761, in add_known_host
2015-10-05 16:02:50 INFO cloud-compute-relation-changed current_key = ssh_known_host_key(host, unit, user)
2015-10-05 16:02:50 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/nova_cc_utils.py", line 731, in ssh_known_host_key
2015-10-05 16:02:50 INFO cloud-compute-relation-changed return output.split('\n')[1]
2015-10-05 16:02:50 INFO cloud-compute-relation-changed IndexError: list index out of range
2015-10-05 16:02:50 INFO juju.worker.uniter.context context.go:543 handling reboot
2015-10-05 16:02:50 ERROR juju.worker.uniter.operation runhook.go:103 hook "cloud-compute-relation-changed" failed: exit status 1
2015-10-05 16:02:50 DEBUG juju.worker.uniter modes.go:31 [AGENT-STATUS] failed: run relation-changed (18; nova-compute/1) hook
2015-10-05 16:02:50 INFO juju.worker.uniter modes.go:543 ModeAbide exiting
2015-10-05 16:02:50 INFO juju.worker.uniter modes.go:541 ModeHookError starting
2015-10-05 16:02:50 DEBUG juju.worker.uniter.filter filter.go:597 want resolved event
2015-10-05 16:02:50 DEBUG juju.worker.uniter.filter filter.go:591 want forced upgrade true
2015-10-05 16:02:50 DEBUG juju.worker.uniter.filter filter.go:727 no new charm event
2015-10-05 16:02:50 DEBUG juju.worker.uniter modes.go:31 [AGENT-STATUS] error: hook failed: "cloud-compute-relation-changed"

Ryan Beisner (1chb1n) on 2015-10-06
description: updated
summary: - precise-icehouse deploy fails (cloud-compute-relation-changed ->
- nova_cc_utils.py -> ssh_known_host_key -> IndexError: list index out of
- range)
+ n-c-c/next fails cloud-compute-relation-changed when migration-auth-type
+ set for Precise-Icehouse
description: updated
Liam Young (gnuoy) on 2015-10-06
Changed in nova-compute (Juju Charms Collection):
assignee: nobody → Liam Young (gnuoy)
Changed in nova-cloud-controller (Juju Charms Collection):
assignee: nobody → Liam Young (gnuoy)
Liam Young (gnuoy) wrote :

It seems that the bug was introduced in r189 of nova-cloud-controller next. The problem is that on trusty running:

ssh-keygen -f /etc/nova/compute_ssh/nova-compute/known_hosts -H -F <IP>

for an IP not present in the file returns a non-zero return code; the charm catches subprocess.CalledProcessError and all is good. However, on precise the command prints nothing and exits with a 0 return code. The charm then tries to do

output.split('\n')[1]

when output is empty, so the lookup raises IndexError and the hook fails.
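
To illustrate the failure mode described above, here is a minimal sketch of a lookup that tolerates both behaviours (non-zero exit on trusty, empty output with exit 0 on precise). This is not the committed charm fix; the function names and the known_hosts_path parameter are hypothetical and used only for illustration. The only assumption about ssh-keygen is the one stated in this report: on a match, the key is on the second line of output.

```python
import subprocess


def parse_known_host_output(output):
    """Return the key line from `ssh-keygen -H -F` output, or None.

    Assumes (as the charm does) that a match prints a header line
    followed by the key line. Empty output, as seen on precise,
    yields None instead of raising IndexError.
    """
    lines = output.split('\n')
    if len(lines) > 1 and lines[1].strip():
        return lines[1]
    return None


def ssh_known_host_key_sketch(host, known_hosts_path):
    """Hypothetical stand-in for the charm's ssh_known_host_key()."""
    cmd = ['ssh-keygen', '-f', known_hosts_path, '-H', '-F', host]
    try:
        output = subprocess.check_output(cmd, universal_newlines=True)
    except subprocess.CalledProcessError:
        # trusty behaviour: non-zero exit when the host is not found
        return None
    # precise behaviour: exit 0 with empty output when not found
    return parse_known_host_output(output)
```

With this shape, a missing host gives None on both releases, so a caller such as add_known_host can treat "no current key" the same way regardless of which ssh-keygen version is installed.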

Liam Young (gnuoy) on 2015-10-06
Changed in nova-cloud-controller (Juju Charms Collection):
status: New → Confirmed
importance: Undecided → Medium
Changed in nova-compute (Juju Charms Collection):
status: New → Invalid
Changed in nova-cloud-controller (Juju Charms Collection):
milestone: none → 15.10
status: Confirmed → In Progress
Ryan Beisner (1chb1n) on 2015-10-13
Changed in nova-cloud-controller (Juju Charms Collection):
status: In Progress → Fix Committed
James Page (james-page) on 2015-10-22
Changed in nova-cloud-controller (Juju Charms Collection):
status: Fix Committed → Fix Released