juju 2.1 issues certificates with wrong DNS alternative names on juju upgrade.

Bug #1710886 reported by José Pekkarinen
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
Canonical Juju
Fix Released
High
José Pekkarinen
2.2
Fix Released
High
José Pekkarinen

Bug Description

Hi,

On an environment were juju HA was 2.1.2 version, executing an upgrade to juju 2.1.3, rendered on the secondary nodes not being able to connect to local mongodb:

2017-08-15 13:58:13 DEBUG juju.mongo open.go:144 dialled mongodb server at "127.0.0.1:37017"
2017-08-15 13:58:14 DEBUG juju.mongo open.go:144 dialled mongodb server at "127.0.0.1:37017"
2017-08-15 13:58:15 DEBUG juju.mongo open.go:144 dialled mongodb server at "127.0.0.1:37017"
2017-08-15 13:58:16 DEBUG juju.mongo open.go:144 dialled mongodb server at "127.0.0.1:37017"
2017-08-15 13:58:17 DEBUG juju.mongo open.go:144 dialled mongodb server at "127.0.0.1:37017"
2017-08-15 13:58:18 DEBUG juju.mongo open.go:144 dialled mongodb server at "127.0.0.1:37017"
2017-08-15 13:58:18 DEBUG juju.worker.dependency engine.go:500 "state" manifold worker stopped: cannot connect to mongodb: no reachable servers
2017-08-15 13:58:18 ERROR juju.worker.dependency engine.go:547 "state" manifold worker returned unexpected error: cannot connect to mongodb: no reachable servers
2017-08-15 13:58:21 DEBUG juju.cmd.jujud machine.go:1402 mongo is already initialized
2017-08-15 13:58:21 INFO juju.state open.go:121 opening state, mongo addresses: ["localhost:37017"]; entity machine-5

Reading the certificate, I can see:

# openssl x509 -text -noout -in /var/lib/juju/server.pem
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            1e:29:a6:06:22:87:cf:1f:4b:e8:80:8e:a8:bb:22:03:c4:3e:be
    Signature Algorithm: sha256WithRSAEncryption
        Issuer: O=juju, CN=juju-generated CA for model "juju-ca"/serialNumber=<valid-uuid>
        Validity
...
            X509v3 Subject Alternative Name:
                DNS:juju-mongodb, DNS:local, DNS:juju-apiserver
...

Then the api-server won't start:

# lsof -i | grep LISTEN
sshd 4541 root 3u IPv4 44323 0t0 TCP *:ssh (LISTEN)
mongod 33430 root 6u IPv4 123993 0t0 TCP *:37017 (LISTEN)

And juju status won't notice about it:

$ juju status -m controller
Model Controller Cloud/Region Version
controller mycontroller myregion 2.1.3.1

App Version Status Scale Charm Store Rev OS Notes

Unit Workload Agent Machine Public address Ports Message

Machine State DNS Inst id Series AZ
0 started <valid-ip-address> <maas.machine.name> xenial AZ1
5 started <valid-ip-address> <maas.machine.name> xenial AZ2
9 started <valid-ip-address> <maas.machine.name> xenial AZ3

This is addressed in the following PR:

https://github.com/juju/juju/pull/7748

PR against 2.2:

https://github.com/juju/juju/pull/7755

Thanks!

José.

description: updated
John A Meinel (jameinel)
affects: juju-core → juju
Changed in juju:
importance: Undecided → High
status: New → Triaged
milestone: none → 2.3-alpha1
assignee: nobody → José Pekkarinen (jose-pekkarinen)
Revision history for this message
John A Meinel (jameinel) wrote :

The fix seems like something worth landing, though that particular bug has been present since 1.23. So it isn't something that is wrong 'just now'.
I think we need to figure out what the more underlying issue is.

It may be that something was broken in 2.1.3 that started connecting to 'localhost' instead of connecting to 'juju-mongodb'.

The concern is that while this patch does look like a genuinely correct fix, that doesn't mean that by itself it will actually fix the problem with connecting to the replica set. (And it also may not be the *correct* fix as we should probably be trying to address the local mongo as juju-mongodb instead of 'localhost'.)

Revision history for this message
José Pekkarinen (koalinux) wrote :
Download full text (4.1 KiB)

I found this reproducible if I spawn a localhost HA without the patch:

2017-08-16 08:44:13 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 192.168.0.5:37017: getsockopt: connection refused
2017-08-16 08:44:13 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 192.168.0.5:37017: getsockopt: connection refused
2017-08-16 08:44:14 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 192.168.0.5:37017: getsockopt: connection refused
2017-08-16 08:44:14 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 192.168.0.5:37017: getsockopt: connection refused
2017-08-16 08:44:14 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 192.168.0.5:37017: getsockopt: connection refused
2017-08-16 08:44:45 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 192.168.0.5:37017: getsockopt: connection refused
2017-08-16 08:44:45 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 192.168.0.5:37017: getsockopt: connection refused
2017-08-16 08:44:45 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 192.168.0.5:37017: getsockopt: connection refused
2017-08-16 08:44:45 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 192.168.0.5:37017: getsockopt: connection refused
2017-08-16 08:44:46 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 192.168.0.5:37017: getsockopt: connection refused
2017-08-16 08:44:46 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 192.168.0.5:37017: getsockopt: connection refused

# openssl x509 -text -noout -in /var/lib/juju/server.pem
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
...
            X509v3 Subject Alternative Name:
                DNS:anything, DNS:juju-apiserver, DNS:juju-mongodb, DNS:localhost, DNS:local

Here, the apiserver comes up though:

# lsof -i | grep LISTEN
sshd 258 root 3u IPv4 106318 0t0 TCP *:ssh (LISTEN)
sshd 258 root 4u IPv6 106327 0t0 TCP *:ssh (LISTEN)
mongod 295 root 6u IPv4 110604 0t0 TCP *:37017 (LISTEN)
mongod 295 root 7u IPv6 110605 0t0 TCP *:37017 (LISTEN)
jujud 311 root 16u IPv6 111410 0t0 TCP *:17070 (LISTEN)
lxd-bridg 473 root 3u IPv6 115951 0t0 TCP [fe80::1]:13128 (LISTEN)

Applying the patch, rebuilding, installing, and upgrading juju agents gets this:

openssl x509 -text -noout -in /var/lib/juju/server.pem
...
            X509v3 Subject Alternative Name:
                DNS:anything, DNS:juju-apiserver, DNS:juju-mongodb, DNS:localhost, DNS:local

2017-08-16 08:55:00 INFO juju.apiserver apiserver.go:316 listening on "[::]:17070"
2017-08-16 08:55:00 DEBUG juju.apiserver apiserver.go:367 Starting API http server on address "[::]:17070"
2017-08-16 08:55:00 DEBUG juju.mongo open.go:144 dialled mongodb server at "192.168.0.3:37017"
2017-08-16 08:55:00 DEBUG juju.mongo open.go:144 dialled mongodb server at "192.168.0.3:37017"
2017-08-16 08:55:02 DEBUG juju.apiserver request_notifier.go:7...

Read more...

Revision history for this message
José Pekkarinen (koalinux) wrote :

ok, disregard machine 2, as it doesn't have mongodb installed thanks to needing to monkey patch
the resolv.conf during the boot stage.

Revision history for this message
José Pekkarinen (koalinux) wrote :
Download full text (3.7 KiB)

I managed to reproduce it twice, so seems like reproducing the issue bootstraping from 2.1
branch and enabling HA is fairly having the issue:

machine-1:
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:21:51 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 127.0.0.1:37017: getsockopt: connection refused

Certificate misses DNS alternative names:
...
            X509v3 Subject Key Identifier:
                9E:6A:52:5F:AE:17:AB:A6:C9:94:EA:82:B5:6C:19:52:0E:BD:4D:49
            X509v3 Authority Key Identifier:
                keyid:E8:E4:03:62:97:36:19:39:4A:63:6B:18:A8:97:61:FC:C9:23:9C:C3

    Signature Algorithm: sha256WithRSAEncryption
...
machine-2:

2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:20:49 ERROR juju.worker runner.go:199 fatal "2-container-watcher": connection is shut down
2017-08-16 09:20:49 ERROR juju.api.watcher watcher.go:87 error trying to stop watcher: connection is shut down
2017-08-16 09:22:00 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 127.0.0.1:37017: getsockopt: connection refused
2017-08-16 09:22:00 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 127.0.0.1:37017: getsockopt: connection refused
2017-08-16 09:22:01 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 127.0.0.1:37017: getsockopt: connection refused
2017-08-16 09:22:02 WARNING juju.mongo open.go:133 mongodb connection failed, will retry: dial tcp 127.0.0.1:37017: getsockopt: connection refused
2017-08-16 09:22:02 WARNING juju.mongo open.go:133 mongodb connection...

Read more...

Revision history for this message
José Pekkarinen (koalinux) wrote :

New machine one after juju upgrade:

2017-08-16 09:37:51 DEBUG juju.worker.dependency engine.go:500 "log-forwarder" manifold worker stopped: "upgrade-check-flag" not set: dependency not available
2017-08-16 09:37:51 DEBUG juju.mongo open.go:144 dialled mongodb server at "127.0.0.1:37017"
2017-08-16 09:37:51 DEBUG juju.mongo open.go:144 dialled mongodb server at "192.168.0.8:37017"
2017-08-16 09:37:51 DEBUG juju.mongo open.go:144 dialled mongodb server at "192.168.0.7:37017"
2017-08-16 09:37:51 DEBUG juju.state open.go:127 connection established
2017-08-16 09:37:51 DEBUG juju.mongo open.go:144 dialled mongodb server at "192.168.0.4:37017"
2017-08-16 09:37:51 DEBUG juju.state open.go:134 mongodb login successful
2017-08-16 09:37:51 DEBUG juju.mongo open.go:144 dialled mongodb server at "192.168.0.8:37017"
2017-08-16 09:37:51 INFO juju.state state.go:367 starting standard state workers

Certificate:
...
            X509v3 Subject Alternative Name:
                DNS:localhost, DNS:local, DNS:anything, DNS:juju-apiserver, DNS:juju-mongodb
...

And machine 2:

2017-08-16 09:39:45 DEBUG juju.worker runner.go:183 "txnpruner" started
2017-08-16 09:39:45 DEBUG juju.mongo open.go:144 dialled mongodb server at "127.0.0.1:37017"
2017-08-16 09:39:45 DEBUG juju.mongo open.go:144 dialled mongodb server at "192.168.0.4:37017"
2017-08-16 09:39:45 DEBUG juju.mongo open.go:144 dialled mongodb server at "192.168.0.7:37017"
2017-08-16 09:39:45 DEBUG juju.mongo open.go:144 dialled mongodb server at "192.168.0.8:37017"
2017-08-16 09:39:45 DEBUG juju.state open.go:127 connection established
2017-08-16 09:39:45 DEBUG juju.state open.go:134 mongodb login successful
2017-08-16 09:39:45 DEBUG juju.mongo open.go:144 dialled mongodb server at "192.168.0.8:37017"
2017-08-16 09:39:45 INFO juju.state state.go:367 starting standard state workers

And cert:

...
            X509v3 Subject Alternative Name:
                DNS:juju-mongodb, DNS:localhost, DNS:local, DNS:anything, DNS:juju-apiserver
...

Revision history for this message
José Pekkarinen (koalinux) wrote :

I have revised code in both 2.1 and 2.2, and both seems to use tcp address
to dial in mongo. I hack around some fool bits and that would need to add
the juju-mongodb name to /etc/hosts, otherwise it will try the resolution
through resolv.conf.

Revision history for this message
John A Meinel (jameinel) wrote :

Also, we discussed whether we would target 2.1 for this fix, but there are enough other reasons to move to 2.2 that won't be fixed by just a 2.1.4. So we are unlikely to fix this as part of a 2.1 release.

Revision history for this message
José Pekkarinen (koalinux) wrote :

No problem, the patch is little, I can make it on branch 2.2 as well.

description: updated
tags: added: 4010
Revision history for this message
José Pekkarinen (koalinux) wrote :

New PRs added to complement this work:

https://github.com/juju/juju/pull/7803
https://github.com/juju/utils/pull/285

This makes juju use more the names of juju-mongodb
and juju-apiserver to relay less in localhost. The piece
in utils adds these names to the /etc/hosts file during
bootstrap to be able to dial directly with juju-mongodb
from first instance. These two PR are independent of the
previous PR, but juju core 7803 depends on juju utils 285.

Please let me know your thoughts.

Revision history for this message
Tim Penhey (thumper) wrote :

Curious, but which version of golang are you compiling juju with?

Revision history for this message
José Pekkarinen (koalinux) wrote :

go1.8.3

Ante Karamatić (ivoks)
tags: added: cpe-onsite
Changed in juju:
milestone: 2.3-beta1 → 2.3-beta2
Revision history for this message
Anastasia (anastasia-macmood) wrote :

I suspect that the fix for this has been already ported into 2.3 (develop) as part of one of the larger merges we have undertaken recently.

Changed in juju:
status: Triaged → Fix Committed
Changed in juju:
milestone: 2.3-beta2 → 2.3-beta1
Changed in juju:
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.