juju2, maas2, cloud deployment failure when two domains are used.

Bug #1610397 reported by David Britton on 2016-08-05
This bug affects 1 person
Affects                                          Status        Importance  Assigned to
MAAS                                             Fix Released  Undecided   Unassigned
juju                                             Incomplete    Medium      Unassigned
nova-cloud-controller (Juju Charms Collection)   Won't Fix     Undecided   Unassigned

Bug Description

ii maas 2.0.0~rc2+bzr5156-0ubuntu1~16.04.2
$ juju --version # 2.0-beta12-xenial-amd64
nova-cloud-controller r287

== What happened ==

I have an 8-node MAAS server; all 8 nodes are in a domain called 'massive'. I upgraded this system from trusty and MAAS 1.9 to xenial and MAAS 2 with do-release-upgrade, and it kept all these settings. On upgrade, it created a default 'maas' domain, which was empty.

I then used the autopilot to deploy a cloud. It broke while trying to relate nova-cloud-controller to nova-compute, with a host resolution error:

------------------
2016-08-05 18:29:08 INFO cloud-compute-relation-changed getaddrinfo grays: Name or service not known
2016-08-05 18:29:08 ERROR juju-log cloud-compute:54: Could not obtain SSH host key from grays
------------------

== What I think should have happened ==

1) nova-cloud-controller. It should not assume that the bare 'hostname' field is resolvable. If juju had an 'fqdn' parameter, that would be a different story. The charm makes this assumption in the ssh_compute_add method, which I assume sets up SSH key auth between systems, and the step where it failed is a critical one. It might also avoid treating a failed SSH key add for a host as a fatal error, though I'm not sure exactly why that step is there.

2) juju.

a) Juju should allocate IPs for LXD containers from the domain *of the physical machine* on which it creates them, *not* from the default domain.

b) Juju also may want to expose an FQDN parameter in relation settings. I would not suggest changing the semantics of 'hostname'.

3) MAAS. MAAS should ideally expose a way in the UI to edit the default domain. I don't particularly want two domains on this system; I just don't want my domain to be called 'maas'.
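
To illustrate point 1, here is a minimal sketch of checking resolvability before a name is handed to ssh-keyscan. It is not the charm's code: is_resolvable, keyscan_targets, and the 'domain' argument are hypothetical names, and it assumes the domain would be made available somehow (for example via the FQDN relation setting suggested in 2b).

```python
import socket


def is_resolvable(name, resolver=socket.getaddrinfo):
    """Return True if `name` resolves; a failed lookup is not an error."""
    try:
        resolver(name, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def keyscan_targets(hostname, domain=None, resolver=socket.getaddrinfo):
    """Return the names for `hostname` that are safe to scan.

    Hypothetical helper: tries the bare hostname and, when a domain is
    known, the fully qualified name, keeping only those that resolve.
    """
    candidates = [hostname]
    if domain and "." not in hostname:
        candidates.append("%s.%s" % (hostname, domain))
    return [name for name in candidates if is_resolvable(name, resolver)]
```

With the setup in this report, keyscan_targets('grays', 'massive') would be expected to skip the bare 'grays' and return only 'grays.massive', provided the MAAS DNS serves that name.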

== Full stacktrace ==

2016-08-05 18:29:08 INFO cloud-compute-relation-changed # 10.5.200.38 SSH-2.0-OpenSSH_6.6.1p1 Ubuntu-2ubuntu2.7
2016-08-05 18:29:08 INFO juju-log cloud-compute:54: Known host key for compute host 10.5.200.38 up to date.
2016-08-05 18:29:08 INFO cloud-compute-relation-changed getaddrinfo grays: Name or service not known
2016-08-05 18:29:08 ERROR juju-log cloud-compute:54: Could not obtain SSH host key from grays
2016-08-05 18:29:08 INFO cloud-compute-relation-changed Traceback (most recent call last):
2016-08-05 18:29:08 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/cloud-compute-relation-changed", line 1102, in <module>
2016-08-05 18:29:08 INFO cloud-compute-relation-changed main()
2016-08-05 18:29:08 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/cloud-compute-relation-changed", line 1096, in main
2016-08-05 18:29:08 INFO cloud-compute-relation-changed hooks.execute(sys.argv)
2016-08-05 18:29:08 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/charmhelpers/core/hookenv.py", line 715, in execute
2016-08-05 18:29:08 INFO cloud-compute-relation-changed self._hooks[hook_name]()
2016-08-05 18:29:08 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/cloud-compute-relation-changed", line 618, in compute_changed
2016-08-05 18:29:08 INFO cloud-compute-relation-changed ssh_compute_add(key, rid=rid, unit=unit)
2016-08-05 18:29:08 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/nova_cc_utils.py", line 751, in ssh_compute_add
2016-08-05 18:29:08 INFO cloud-compute-relation-changed add_known_host(host, unit, user)
2016-08-05 18:29:08 INFO cloud-compute-relation-changed File "/var/lib/juju/agents/unit-nova-cloud-controller-0/charm/hooks/nova_cc_utils.py", line 706, in add_known_host
2016-08-05 18:29:08 INFO cloud-compute-relation-changed raise e
2016-08-05 18:29:08 INFO cloud-compute-relation-changed subprocess.CalledProcessError: Command '['ssh-keyscan', '-H', '-t', 'rsa', u'grays']' returned non-zero exit status 255
2016-08-05 18:29:08 ERROR juju.worker.uniter.operation runhook.go:107 hook "cloud-compute-relation-changed" failed: exit status 1
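
The traceback shows the hook dying because the CalledProcessError from ssh-keyscan is re-raised. As a rough sketch of the non-fatal alternative suggested above, the scan could report failures instead of raising. scan_host_key and collect_host_keys are hypothetical names, not functions from nova_cc_utils.py; only the ssh-keyscan arguments are taken from the traceback.

```python
import subprocess


def scan_host_key(host, runner=subprocess.check_output):
    """Return ssh-keyscan output for `host`, or None if the scan fails."""
    cmd = ["ssh-keyscan", "-H", "-t", "rsa", host]
    try:
        return runner(cmd)
    except (subprocess.CalledProcessError, OSError):
        # Unresolvable or unreachable host: report it, do not abort the hook.
        return None


def collect_host_keys(hosts, runner=subprocess.check_output):
    """Scan every host; return (keys_by_host, hosts_that_failed)."""
    keys, failed = {}, []
    for host in hosts:
        output = scan_host_key(host, runner)
        if output:
            keys[host] = output
        else:
            failed.append(host)
    return keys, failed
```

The caller could then log the failed names as warnings and continue. Whether skipping a name is acceptable for the charm's purposes is a separate question that this sketch does not answer.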


David Britton (dpb) on 2016-08-05
tags: added: kanban-cross-team landscape
description: updated
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Richard Harding (rharding)
tags: removed: kanban-cross-team
Changed in maas:
status: New → Fix Committed
Anastasia (anastasia-macmood) wrote :

The Landscape team has a workaround, provided by the MAAS fix.

Changed in juju-core:
milestone: none → 2.1.0
affects: juju-core → juju
Changed in juju:
milestone: 2.1.0 → none
milestone: none → 2.1.0
Changed in maas:
status: Fix Committed → Fix Released
Changed in juju:
importance: High → Critical
Changed in juju:
importance: Critical → High
milestone: 2.1.0 → 2.2.0-alpha1
Changed in juju:
assignee: Richard Harding (rharding) → nobody
James Page (james-page) wrote :

This codepath is the part of the charm that configures SSH host identities for live migration; nova and libvirt make a lot of assumptions about the resolvability of a server's hostname when performing live-migration tasks. We will review this piece as part of the network-space support in the nova-compute charm; that will be covered under a different bug.

Changed in nova-cloud-controller (Juju Charms Collection):
status: New → Won't Fix
Changed in juju:
milestone: 2.2-alpha1 → 2.2.0
milestone: 2.2.0 → 2.2-beta1
Curtis Hovey (sinzui) on 2017-03-24
Changed in juju:
milestone: 2.2-beta1 → 2.2-beta2
Curtis Hovey (sinzui) on 2017-03-30
Changed in juju:
milestone: 2.2-beta2 → 2.2-beta3
Changed in juju:
milestone: 2.2-beta3 → 2.2-beta4
Changed in juju:
milestone: 2.2-beta4 → 2.2-rc1
Tim Penhey (thumper) wrote :

David, not sure if this was actually a Juju issue at all. Any way to test Juju to see if this is still an issue?

Changed in juju:
milestone: 2.2-rc1 → none
status: Triaged → Incomplete
importance: High → Medium