juju 1.24.0: wget cert issues causing failure to create containers on 14.04.2 with lxc 1.0.7

Bug #1472014 reported by Mike McCracken
This bug affects 2 people
Affects      Status         Importance   Assigned to   Milestone
juju-core    Fix Released   High         Ian Booth
1.24         Fix Released   Critical     Ian Booth

Bug Description

There appears to be an issue with downloading the root image from the state server:

Here's a 'juju status' report: http://paste.ubuntu.com/11833262/

There are two errors from the ubuntu-cloud lxc template script that are shown there, but the first one is the important one AIUI:

wget https://10.0.3.105:17070/environment/e9579db4-134a-4a4e-88db-976c48c696f5/images/lxc/trusty/amd64/ubuntu-14.04-server-cloudimg-amd64-root.tar.gz
--2015-07-06 20:54:01-- https://10.0.3.105:17070/environment/e9579db4-134a-4a4e-88db-976c48c696f5/images/lxc/trusty/amd64/ubuntu-14.04-server-cloudimg-amd64-root.tar.gz
Connecting to 10.0.3.105:17070... connected.
ERROR: no certificate subject alternative name matches
        requested host name '10.0.3.105'.
To connect to 10.0.3.105 insecurely, use `--no-check-certificate'

It looks like the wget wrapper that juju creates is being lost at some point.

If I log into machine 1 (the one where we're trying to create containers), I can issue a raw wget like this:

wget --no-check-certificate https://10.0.3.105:17070/environment/e9579db4-134a-4a4e-88db-976c48c696f5/images/lxc/trusty/amd64/ubuntu-14.04-server-cloudimg-amd64-root.tar.gz

and that works (although slowly) to get the image.

Mike McCracken (mikemc) wrote :

ubuntu@ubuntu-local-machine-1:~$ type wget
wget is hashed (/usr/bin/wget)
ubuntu@ubuntu-local-machine-1:~$ ll /tmp
total 16
drwxrwxrwt 2 root root 4096 Jul 6 23:08 ./
drwxr-xr-x 23 root root 4096 Jul 6 23:12 ../
-rwxr-xr-x 1 ubuntu ubuntu 1628 Jul 6 20:53 apt-go-fast*
-rwxr-xr-x 1 ubuntu ubuntu 632 Jul 6 20:52 lxc-host-only*

Ian Booth (wallyworld) wrote :

This looks like the IP address of the machine which is doing the wget has not been added to the SAN list on the cert. It would be very helpful to get a log file from that state server so that we can see the logs pertaining to the operation of the certificate update worker. There will be output like:

State Server cerificate addresses updated to <blah>

That list of addresses should contain the IP address of machine 1 (the one that is presented as the source IP of the wget).

Mike McCracken (mikemc) wrote :

Hi, thanks for looking. The full log is here: http://paste.ubuntu.com/11834149/ , and the messages you mentioned show up around lines 414 and 511:

ubuntu@openstack-single-ubuntu:~$ sudo grep "cer.*ificate addresses" /var/log/juju-ubuntu-local/machine-0.log

2015-07-06 20:48:58 INFO juju.worker.certupdater certupdater.go:127 State Server cerificate addresses updated to ["public:localhost" "local-cloud:10.0.6.1"]
2015-07-06 20:48:58 INFO juju.apiserver apiserver.go:166 new certificate addresses: 10.0.6.1
2015-07-06 20:49:00 INFO juju.worker.certupdater certupdater.go:127 State Server cerificate addresses updated to ["public:localhost" "local-cloud:10.0.6.1"]
2015-07-06 20:49:00 INFO juju.apiserver apiserver.go:166 new certificate addresses: 10.0.6.1

and here's IP info from machine 1

ubuntu@openstack-single-ubuntu:~$ juju ssh 1
Warning: Permanently added '10.0.6.141' (ECDSA) to the list of known hosts.
Welcome to Ubuntu 14.04.2 LTS (GNU/Linux 3.13.0-55-generic x86_64)

 * Documentation: https://help.ubuntu.com/

  System information as of Mon Jul 6 23:08:29 UTC 2015

  System load: 0.0 Processes: 84
  Usage of /: 9.2% of 19.65GB Users logged in: 0
  Memory usage: 3% IP address for lxcbr0: 10.0.6.141
  Swap usage: 0%

  Graph this data and manage this system at:
    https://landscape.canonical.com/

  Get cloud support with Ubuntu Advantage Cloud Guest:
    http://www.ubuntu.com/business/services/cloud

0 packages can be updated.
0 updates are security updates.

*** System restart required ***
Last login: Mon Jul 6 23:08:29 2015 from 10.0.6.1
ubuntu@ubuntu-local-machine-1:~$ ip -4 addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
4: lxcbr0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    inet 10.0.6.141/24 brd 10.0.6.255 scope global lxcbr0
       valid_lft forever preferred_lft forever

Ian Booth (wallyworld) wrote :

So that's the problem: only address 10.0.6.1 is added to the certificate SAN list, whereas the state server (machine 0) also uses address 10.0.3.105. That address is not being picked up as belonging to the machine.

As is seen in the machine logs:

certupdater.go:79 new machine addresses: [public:localhost local-cloud:10.0.6.1]

the certificate updater is only ever informed about 10.0.6.1.

Yet the machine is known to have additional addresses:

machine-0: 2015-07-06 20:48:58 DEBUG juju.network network.go:259 addresses after filtering: [local-machine:127.0.0.1 local-cloud:192.168.122.1 local-cloud:10.0.3.105 local-machine:::1]
machine-0: 2015-07-06 20:48:58 INFO juju.worker.machiner machiner.go:94 setting addresses for machine-0 to ["local-machine:127.0.0.1" "local-cloud:192.168.122.1" "local-cloud:10.0.3.105" "local-machine:::1"]

So there's a problem with the machine AddressWatcher code which reports on a machine's addresses.
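
The mismatch described above can be reproduced outside juju. The following is a minimal sketch, not juju's code: it builds a throwaway self-signed certificate whose SAN list holds only 10.0.6.1 and then checks both addresses from this report against it. The helper name newCertWithIPSANs and the common name are made up for illustration; only Go standard library calls are used.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

// newCertWithIPSANs is a hypothetical helper (not part of juju). It returns
// a parsed self-signed certificate whose subject alternative names contain
// "localhost" plus exactly the given IP addresses.
func newCertWithIPSANs(ips ...string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "illustrative-state-server"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     []string{"localhost"},
	}
	for _, ip := range ips {
		parsed := net.ParseIP(ip)
		if parsed == nil {
			return nil, fmt.Errorf("invalid IP address %q", ip)
		}
		tmpl.IPAddresses = append(tmpl.IPAddresses, parsed)
	}
	// Self-signed: the template is used as both subject and issuer.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	// Only 10.0.6.1 is in the SAN list, as in the certupdater log lines.
	cert, err := newCertWithIPSANs("10.0.6.1")
	if err != nil {
		panic(err)
	}
	for _, host := range []string{"10.0.6.1", "10.0.3.105"} {
		if err := cert.VerifyHostname(host); err != nil {
			fmt.Printf("%s: rejected: %v\n", host, err)
		} else {
			fmt.Printf("%s: accepted\n", host)
		}
	}
}
```

A client connecting to 10.0.3.105 is rejected for the same reason wget refuses the connection: the address it dialled is not among the certificate's IP SANs.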

Changed in juju-core:
milestone: none → 1.25.0
importance: Undecided → High
status: New → Triaged
James Tunnicliffe (dooferlad) wrote :

It turns out that we actually test that only one cloud-local address will be set for each machine:

github.com/juju/juju/agent/agent_test.go

func (*suite) TestSetAPIHostPorts(c *gc.C) {
 conf, err := agent.NewAgentConfig(attributeParams)
 c.Assert(err, jc.ErrorIsNil)

 addrs, err := conf.APIAddresses()
 c.Assert(err, jc.ErrorIsNil)
 c.Assert(addrs, gc.DeepEquals, attributeParams.APIAddresses)

 // The first cloud-local address for each server is used,
 // else if there are none then the first public- or unknown-
 // scope address.
 //
 // If a server has only machine-local addresses, or none
 // at all, then it will be excluded.
 server1 := network.NewAddresses("0.1.2.3", "0.1.2.4", "zeroonetwothree")
 server1[0].Scope = network.ScopeCloudLocal
 server1[1].Scope = network.ScopeCloudLocal
 server1[2].Scope = network.ScopePublic
 server2 := network.NewAddresses("127.0.0.1")
 server2[0].Scope = network.ScopeMachineLocal
 server3 := network.NewAddresses("0.1.2.5", "zeroonetwofive")
 server3[0].Scope = network.ScopeUnknown
 server3[1].Scope = network.ScopeUnknown
 conf.SetAPIHostPorts([][]network.HostPort{
  network.AddressesWithPort(server1, 123),
  network.AddressesWithPort(server2, 124),
  network.AddressesWithPort(server3, 125),
 })
 addrs, err = conf.APIAddresses()
 c.Assert(err, jc.ErrorIsNil)
 c.Assert(addrs, gc.DeepEquals, []string{"0.1.2.3:123", "0.1.2.5:125"})
}

This makes this log message highly confusing (formatting changed for readability):
line 404:
machine-0: 2015-07-06 20:48:58 DEBUG juju.state address.go:136 setting API hostPorts: [[
  localhost:17070
  10.0.3.105:17070
  10.0.6.1:17070
  192.168.122.1:17070
  127.0.0.1:17070
  [::1]:17070]]

So, this is lies. Only the first of those will be chosen.

github.com/juju/juju/state/address.go contains SetAPIHostPorts, which is what logs the above message. Looks like a fix is needed there.
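
For reference, the selection rule spelled out in the test comment can be sketched as below. This is an illustrative reimplementation, not the juju source: the scope and hostPort types and the selectAPIAddresses function are hypothetical stand-ins for the real network package types, and the scope strings are only modelled on the prefixes seen in the logs.

```go
package main

import (
	"fmt"
	"net"
	"strconv"
)

// scope is a hypothetical stand-in for juju's address scope type.
type scope string

const (
	scopeCloudLocal   scope = "local-cloud"
	scopeMachineLocal scope = "local-machine"
	scopePublic       scope = "public"
	scopeUnknown      scope = ""
)

// hostPort is a hypothetical, simplified stand-in for network.HostPort.
type hostPort struct {
	Host  string
	Port  int
	Scope scope
}

// selectAPIAddresses keeps at most one address per server: the first
// cloud-local one, else the first public- or unknown-scope one. Servers
// with only machine-local addresses (or none) contribute nothing.
func selectAPIAddresses(servers [][]hostPort) []string {
	var out []string
	for _, server := range servers {
		var chosen, fallback *hostPort
		for i := range server {
			hp := &server[i]
			if hp.Scope == scopeCloudLocal {
				chosen = hp
				break
			}
			if fallback == nil && (hp.Scope == scopePublic || hp.Scope == scopeUnknown) {
				fallback = hp
			}
		}
		if chosen == nil {
			chosen = fallback
		}
		if chosen != nil {
			out = append(out, net.JoinHostPort(chosen.Host, strconv.Itoa(chosen.Port)))
		}
	}
	return out
}

func main() {
	// Same data as the quoted test.
	servers := [][]hostPort{
		{
			{"0.1.2.3", 123, scopeCloudLocal},
			{"0.1.2.4", 123, scopeCloudLocal},
			{"zeroonetwothree", 123, scopePublic},
		},
		{{"127.0.0.1", 124, scopeMachineLocal}},
		{
			{"0.1.2.5", 125, scopeUnknown},
			{"zeroonetwofive", 125, scopeUnknown},
		},
	}
	fmt.Println(selectAPIAddresses(servers))
}
```

Under this rule the second cloud-local address of a server (0.1.2.4 here, or 10.0.6.1 behind 10.0.3.105 in the log above) is never returned, which is why a log line listing every host/port is misleading.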

Ian Booth (wallyworld) wrote :

The approach we'll take is:
1. set the SAN to all machine addresses discovered by querying the machine's hardware directly
2. update with any addresses obtained using the address watcher

This will ensure the SAN contains all possible addresses on which incoming connections may arrive.
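
The two steps above amount to taking the union of two address sources. A rough sketch of that merge follows; it is an assumption about the shape of the fix, not the committed change, and mergeSANAddresses is a made-up name. Entries that are not IP addresses (such as "localhost") are skipped here because they would belong in the certificate's DNS names instead.

```go
package main

import (
	"fmt"
	"net"
	"sort"
)

// mergeSANAddresses is a hypothetical helper. It returns the sorted,
// de-duplicated union of the IP addresses found in both inputs.
// Non-IP entries are ignored.
func mergeSANAddresses(hardware, watched []string) []string {
	seen := make(map[string]struct{})
	var merged []string
	for _, group := range [][]string{hardware, watched} {
		for _, addr := range group {
			ip := net.ParseIP(addr)
			if ip == nil {
				continue // hostname or garbage, not an IP SAN
			}
			key := ip.String()
			if _, dup := seen[key]; dup {
				continue
			}
			seen[key] = struct{}{}
			merged = append(merged, key)
		}
	}
	sort.Strings(merged)
	return merged
}

func main() {
	// Step 1: addresses found on the machine itself (values from the
	// "addresses after filtering" log line).
	hardware := []string{"127.0.0.1", "192.168.122.1", "10.0.3.105", "::1"}
	// Step 2: addresses reported by the address watcher (values from the
	// "new machine addresses" log line).
	watched := []string{"localhost", "10.0.6.1"}

	fmt.Println(mergeSANAddresses(hardware, watched))
}
```

With the addresses from this report, the merged list contains both 10.0.6.1 and 10.0.3.105, so a request to either one would match the certificate.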

Ian Booth (wallyworld)
Changed in juju-core:
assignee: nobody → Ian Booth (wallyworld)
status: Triaged → In Progress
Ian Booth (wallyworld)
Changed in juju-core:
status: In Progress → Fix Committed
tags: added: addressability network
Curtis Hovey (sinzui)
Changed in juju-core:
status: Fix Committed → Fix Released
Mike McCracken (mikemc) wrote :

We are still seeing this in 1.25.0.
Some details with logs are here:
https://gist.github.com/mikemccracken/883f91e34c3eb8ee4b47

We are also still seeing this in 1.24, as shown here: http://paste.ubuntu.com/13087888/

Cheryl Jennings (cherylj) wrote :

Hi Mike, could you open a new bug to track the reappearance of this issue?

Mike McCracken (mikemc) wrote :