juju contacts wrong LXD API endpoint (default gateway)

Bug #1640531 reported by Andreas Hasenack
This bug affects 5 people
Affects         Status   Importance  Assigned to      Milestone
Canonical Juju  Triaged  High        Richard Harding

Bug Description

juju 2.0.1, xenial, MAAS 2.1

I have a case (picture of the network layout attached) where bootstrapping the lxd provider with juju 2.0.1 works up to a point. Packages are installed and the network is fine, but at some point juju wants to talk to an LXD remote and for some reason decides that the network's default gateway is the LXD API endpoint:
$ grep "LXD remote" bootstrap-debug.txt
20:24:28 DEBUG juju.tools.lxdclient client.go:199 connecting to LXD remote "local": "unix:///var/lib/lxd/unix.socket"
20:24:29 DEBUG juju.tools.lxdclient client.go:199 connecting to LXD remote "default cloud images": "https://streams.canonical.com/juju/images/releases/"
20:24:30 DEBUG juju.tools.lxdclient client.go:199 connecting to LXD remote "default ubuntu cloud images": "https://cloud-images.ubuntu.com/releases/"
2016-11-08 20:25:55 DEBUG juju.tools.lxdclient client.go:199 connecting to LXD remote "remote": "10.0.5.1:8443"

The host where this LXD is being created is at 10.0.5.2, and its DNS name is 22-96.maas. The attempt above fails:

2016-11-08 20:25:55 ERROR cmd supercommand.go:458 new environ: creating LXD client: Get https://10.0.5.1:8443/1.0: x509: certificate is valid for nsn7, 10.0.10.68/24, fdde:a571:33af::e73/128, fdde:a571:33af:0:a95f:ffc5:e512:e899/64, fdde:a571:33af:0:1fa3:b861:cc18:7247/64, fe80::477:9745:bb61:dafe/64, 192.168.122.1/24, 10.0.100.1/24, fe80::9852:73ff:fec8:d69/64, not 22-96

The failure here is because 10.0.5.1 also has an LXD API endpoint, for another LXD server, and the certificate check correctly fails. Juju should be talking to 10.0.5.2:8443 instead.
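
To spot this in a bootstrap log without reading it line by line, the "connecting to LXD remote" lines can be parsed and compared against the address the host is expected to use. This is only a rough sketch: the helper names (parse_lxd_remotes, endpoint_host) and the EXPECTED_HOST constant are made up for illustration and are not part of juju, and the two log lines are copied from the output quoted in this report.

```python
import re

# Debug lines copied from the bootstrap log quoted in this report.
LOG_LINES = [
    '20:24:28 DEBUG juju.tools.lxdclient client.go:199 connecting to LXD remote "local": "unix:///var/lib/lxd/unix.socket"',
    '2016-11-08 20:25:55 DEBUG juju.tools.lxdclient client.go:199 connecting to LXD remote "remote": "10.0.5.1:8443"',
]

# Matches the remote name and endpoint in a juju.tools.lxdclient debug line.
REMOTE_RE = re.compile(r'connecting to LXD remote "([^"]+)": "([^"]+)"')


def parse_lxd_remotes(lines):
    """Return {remote name: endpoint} for every matching debug line."""
    remotes = {}
    for line in lines:
        match = REMOTE_RE.search(line)
        if match:
            remotes[match.group(1)] = match.group(2)
    return remotes


def endpoint_host(endpoint):
    """Return the host part of a bare "host:port" endpoint (IPv4 or name only)."""
    return endpoint.rsplit(":", 1)[0]


# Assumption: the address of the VM running LXD, taken from this report.
EXPECTED_HOST = "10.0.5.2"

remotes = parse_lxd_remotes(LOG_LINES)
actual = endpoint_host(remotes["remote"])
if actual != EXPECTED_HOST:
    print(f'remote "remote" points at {actual}, expected {EXPECTED_HOST}')
```

With the lines from this report, the check flags 10.0.5.1 as the endpoint juju chose instead of 10.0.5.2.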

DNS is set up correctly:
ubuntu@22-96:~$ hostname
22-96
ubuntu@22-96:~$ hostname -f
22-96.maas
ubuntu@22-96:~$ host 22-96.maas
22-96.maas has address 10.0.5.2
ubuntu@22-96:~$ host 10.0.5.2
2.5.0.10.in-addr.arpa domain name pointer 22-96.maas.
ubuntu@22-96:~$
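
The same forward/reverse consistency check can be written down as a small sketch. It does not query DNS; it only uses the two records shown in the `host` output above, and the function name forward_reverse_consistent is hypothetical.

```python
import ipaddress

# Records copied from the `host` output above; no live DNS query is made.
FORWARD = {"22-96.maas": "10.0.5.2"}
REVERSE = {"2.5.0.10.in-addr.arpa": "22-96.maas."}


def forward_reverse_consistent(name, forward, reverse):
    """True if the PTR record for the name's address points back at the name."""
    address = forward[name]
    # reverse_pointer builds the in-addr.arpa name for the address.
    ptr_name = ipaddress.ip_address(address).reverse_pointer
    target = reverse.get(ptr_name)
    # PTR targets are fully qualified, so drop the trailing dot before comparing.
    return target is not None and target.rstrip(".") == name


print(forward_reverse_consistent("22-96.maas", FORWARD, REVERSE))
```

Since both lookups agree on 10.0.5.2, the wrong endpoint does not seem to come from a DNS misconfiguration on this host.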

I can launch a container manually and it will get a 10.0.5.0/24 IP from MAAS' DHCP just fine.

Network details:
- 10.0.5.0/24
- this is a libvirt network, no DHCP
- 10.0.5.1 is the default gateway
- 10.0.5.5 is the MAAS server, with DHCP, running as a container. The LXD server hosting that container is on 10.0.5.1, which is my laptop
- 10.0.5.2 is a xenial VM, with the NIC set up as a bridge so that containers running there can get an address from the MAAS server
- the LXD on 10.0.5.2 has a profile that attaches its eth0 to the VM br0 bridge
- the /etc/default/lxd-bridge file on 10.0.5.2 is telling LXD to use an already existing bridge, br0 in this case
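
For reference, the addresses in the list above can be checked against the subnet with a short sketch. The role labels and the on_network helper are illustrative only; the addresses are the ones given in this report.

```python
import ipaddress

# Subnet and addresses as described in the network details above.
NETWORK = ipaddress.ip_network("10.0.5.0/24")
HOSTS = {
    "default gateway / laptop LXD": "10.0.5.1",
    "xenial VM (22-96.maas)": "10.0.5.2",
    "MAAS server": "10.0.5.5",
}


def on_network(address, network=NETWORK):
    """True if the address belongs to the given network."""
    return ipaddress.ip_address(address) in network


for role, address in HOSTS.items():
    print(f"{address:<10} {role:<30} on {NETWORK}: {on_network(address)}")
```

All three addresses sit on the same /24, so the VM can reach 10.0.5.1:8443 directly. That would be consistent with the connection getting as far as the certificate check before failing.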

Attached are the output of a bootstrap run with --debug and a photo of a quick drawing I did showing the scenario.

Andreas Hasenack (ahasenack) wrote :
description: updated
Chris Gregan (cgregan)
tags: added: cdo-qa-blocker
Changed in juju:
status: New → Triaged
importance: Undecided → High
assignee: nobody → Richard Harding (rharding)
milestone: none → 2.2.0
Ryan Beisner (1chb1n)
tags: added: uosci