Certificate generated by certupdater worker cannot be used by MongoDB
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| juju-core | | Critical | Ian Booth | |
| 1.22 | | Critical | Menno Finlay-Smits | |
| 1.23 | | Critical | Ian Booth | |
Bug Description
Juju CI upgraded all its machines to 1.22.0. All upgrade jobs to 1.23-beta1 and 1.24-alpha1 now fail. This looked like fallout from bug 1434070, but we have confirmed that 1.22.0 cannot upgrade to a previously blessed 1.23-beta1 revision. Since 1.21.3 can upgrade, it appears there is something about upgrading 1.22.0 to 1.23+ that is not accounted for.
| Curtis Hovey (sinzui) wrote : | #3 |
| Curtis Hovey (sinzui) wrote : | #4 |
Attached is a redacted machine-0.log
| Curtis Hovey (sinzui) wrote : | #5 |
Attached is a redacted all-machines.log
| Curtis Hovey (sinzui) wrote : | #6 |
This is my log from a bootstrap with packaged 1.22.0 followed by an upgrade to packaged (but not yet public) 1.23-beta1
| Changed in juju-core: | |
| assignee: | nobody → Menno Smits (menno.smits) |
| Menno Finlay-Smits (menno.smits) wrote : | #7 |
It looks like certificates are getting mixed up somehow. The upgrade is triggered and machine-0 reboots into the new tools version and then it looks like the certificate for API server access is being used for connecting to MongoDB! (or something)
Because the state server can't connect to MongoDB the environment can't come up.
I'll keep digging into the cause.
| Menno Finlay-Smits (menno.smits) wrote : | #8 |
The problem is easy to reproduce with the local provider:
$ /usr/bin/juju bootstrap # where /usr/bin/juju is 1.22.0 from the stable PPA
$ juju upgrade-juju --upload-tools # where juju is 1.23-beta1 or current master
The result is the same with connections to mongodb failing with this error:
juju.mongo open.go:122 TLS handshake failed: x509: certificate is valid for localhost, juju-apiserver, not juju-mongodb
| Menno Finlay-Smits (menno.smits) wrote : | #9 |
Using git bisect, I've found that 3734d91 is the culprit. The change seems like it should be fine, but repeated manual upgrades, with and without it, demonstrate that it's the problem.
I'm still trying to figure out WHY it's the problem.
| Menno Finlay-Smits (menno.smits) wrote : | #10 |
The root cause is actually fairly convoluted.
Rev 3734d91 exposed the problem but it isn't actually the source. That change makes only a small non-functional cleanup to the juju-db upstart script. However, because the upstart script has changed, jujud writes out a new server.pem and restarts juju-db as it starts up into 1.23 or 1.24.
The issue is that the new server.pem is generated from the same cert and key that the API server uses, and since version 1.22 the certupdater worker keeps the API server cert in sync with state server address changes. It also identifies the certificate as issued for the "localhost" and "juju-apiserver" hostnames. Juju's mongodb client connection code expects a certificate for "juju-mongodb", so connections to mongo fail once mongo is using the new certificate file.
Although it is possible to trigger this problem through upgrades, the bug isn't really upgrade related. It is also possible to trigger it with 1.22 alone by making any edit to the juju-db upstart script and restarting jujud.
Updating the ticket title to reflect this.
| summary: |
- 1.22.0 cannot upgrade to 1.23-beta1 or 1.24-alpha1
+ Certificate generated by certupdater worker cannot be used by MongoDB |
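The hostname mismatch described in comment #10 can be reproduced in isolation with Go's standard library. The following is a minimal sketch, not juju code: `newSelfSignedCert` is a hypothetical helper that builds a throwaway self-signed certificate, and only the three hostnames are taken from the report. It is meant to show why a certificate issued for "localhost" and "juju-apiserver" is rejected when the client asks for "juju-mongodb".

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"time"
)

// newSelfSignedCert is a hypothetical helper (not part of juju). It builds a
// throwaway self-signed certificate whose DNS names are the ones given.
func newSelfSignedCert(dnsNames []string) (*x509.Certificate, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	// Using the template as its own parent makes the certificate self-signed.
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificate(der)
}

func main() {
	// Hostnames the report says the certupdater worker puts in the cert.
	cert, err := newSelfSignedCert([]string{"localhost", "juju-apiserver"})
	if err != nil {
		panic(err)
	}
	// "juju-mongodb" is the name the mongo client code expects, per the report.
	for _, host := range []string{"juju-apiserver", "juju-mongodb"} {
		if err := cert.VerifyHostname(host); err != nil {
			fmt.Printf("%s: rejected: %v\n", host, err)
		} else {
			fmt.Printf("%s: accepted\n", host)
		}
	}
}
```

The rejection for "juju-mongodb" comes from the same kind of x509 hostname check that produced the TLS handshake error quoted in comment #8.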
| Menno Finlay-Smits (menno.smits) wrote : | #11 |
The fix for 1.22 was committed in 317ffb1b23f929e
| Menno Finlay-Smits (menno.smits) wrote : | #12 |
The fix for 1.23 and 1.24 (master) is a little more complicated because if the upgrade is coming from 1.22.0 then the certificate in the agent config is already going to be wrong when jujud starts, preventing connections to mongodb and preventing the upgrade from completing.
wallyworld and I have discussed adding some code that runs when the agent's config is first loaded which will fix the cert at that time so that connections to mongodb can work.
There's a proof of concept of how this could work here: http://
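To illustrate the kind of check such a config-load step might begin with, here is a rough sketch under stated assumptions. It is not the proof of concept linked above and not the fix that landed: `missingHostnames` and `demoCertPEM` are hypothetical names, and the list of required hostnames is assumed from the error message in comment #8. The sketch only detects a certificate that does not cover a required name; regenerating the certificate is left out.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"encoding/pem"
	"errors"
	"fmt"
	"math/big"
	"time"
)

// missingHostnames is a hypothetical helper (not juju code). It reports which
// of the required names the PEM-encoded certificate does not cover.
func missingHostnames(certPEM []byte, required []string) ([]string, error) {
	block, _ := pem.Decode(certPEM)
	if block == nil || block.Type != "CERTIFICATE" {
		return nil, errors.New("no PEM certificate found")
	}
	cert, err := x509.ParseCertificate(block.Bytes)
	if err != nil {
		return nil, err
	}
	var missing []string
	for _, name := range required {
		if err := cert.VerifyHostname(name); err != nil {
			missing = append(missing, name)
		}
	}
	return missing, nil
}

// demoCertPEM creates a throwaway self-signed certificate for the demo only.
func demoCertPEM(dnsNames []string) ([]byte, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "demo"},
		NotBefore:    time.Now().Add(-time.Hour),
		NotAfter:     time.Now().Add(time.Hour),
		DNSNames:     dnsNames,
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		return nil, err
	}
	return pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: der}), nil
}

func main() {
	// Simulate the certificate left behind by 1.22.0, as described in the report.
	certPEM, err := demoCertPEM([]string{"localhost", "juju-apiserver"})
	if err != nil {
		panic(err)
	}
	// Assumed set of names a state server certificate would need to cover.
	required := []string{"localhost", "juju-apiserver", "juju-mongodb"}
	missing, err := missingHostnames(certPEM, required)
	if err != nil {
		panic(err)
	}
	if len(missing) > 0 {
		fmt.Println("certificate needs regenerating; missing names:", missing)
	}
}
```

Running the check before the first mongodb connection is the point of the idea discussed above: a 1.22.0 agent config already holds the wrong certificate, so anything that runs later would never get a working connection.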
| Changed in juju-core: | |
| assignee: | Menno Smits (menno.smits) → Ian Booth (wallyworld) |
| status: | Triaged → In Progress |
| Changed in juju-core: | |
| status: | In Progress → Fix Committed |
| Ian Booth (wallyworld) wrote : | #13 |
Several upgrade tests that previously failed have now passed in CI, so marking as Fix Released
| Changed in juju-core: | |
| status: | Fix Committed → Fix Released |
| Aaron Bentley (abentley) wrote : | #14 |
Landed via https:/


I hid the comments with the logs because they might contain confidential information. Engineers can review them and they are also available at http://reports.vapour.ws/releases/2466