juju upgrade-juju on 1.18.3 upgraded my agents to 1.19.2

Bug #1325034 reported by Christian Reis
This bug affects 2 people
Affects: juju-core
Status: Fix Released
Importance: High
Assigned to: Wayne Witzel III

Bug Description

I am using the local provider. I have 1.18.3 installed:

juju@chorus:~$ dpkg -l | grep juju
ii juju 1.18.3-0ubuntu1~12.04.1~juju1 next generation service orchestration system
ii juju-core 1.18.3-0ubuntu1~12.04.1~juju1 Juju is devops distilled - client
ii juju-local 1.18.3-0ubuntu1~12.04.1~juju1 dependency package for the Juju local provider

We had 1.18.1.6 agents running:

http://pastebin.ubuntu.com/7544651/

I ran juju upgrade-juju with no options and found that all my agents had been upgraded to 1.19.2:

juju@chorus:~$ history | grep upgrade-juju
  153 juju upgrade-juju

http://paste.ubuntu.com/7552880/

The help text seems to indicate that that would not happen, and alas, now I've been upgraded and exposed to bug 1309444.
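
For reference, the expectation described here can be sketched roughly as follows. This is only an illustration, not the actual upgrade-juju selection code. It assumes the juju 1.x convention that an odd minor version (1.19) marks a development release and an even one (1.18) a stable release. The function names and the candidate list are hypothetical.

```python
import re


def parse_version(text):
    """Split a dotted version such as '1.18.1.6' or '1.19.2' into a tuple of ints."""
    if not re.fullmatch(r"\d+(\.\d+)+", text):
        raise ValueError(f"not a dotted version: {text!r}")
    return tuple(int(part) for part in text.split("."))


def is_development(version):
    # Assumption: an odd minor number (the second field) means a development release.
    return version[1] % 2 == 1


def pick_upgrade(current, candidates):
    """Return the newest candidate newer than `current`.

    Development releases are skipped unless `current` is itself a
    development release. Returns None when nothing qualifies.
    """
    cur = parse_version(current)
    allowed = []
    for candidate in candidates:
        version = parse_version(candidate)
        if version <= cur:
            continue  # not an upgrade
        if is_development(version) and not is_development(cur):
            continue  # never move a stable environment to development
        allowed.append(version)
    if not allowed:
        return None
    return ".".join(str(n) for n in max(allowed))


# Hypothetical candidate list: the two versions mentioned in this report.
chosen = pick_upgrade("1.18.1.6", ["1.18.3", "1.19.2"])
print(chosen)
```

Under these assumptions, agents on 1.18.1.6 would be offered 1.18.3 rather than 1.19.2.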

Tags: upgrade-juju
Christian Reis (kiko) wrote :

A grep for upgrader in machine-0.log shows me:

2014-05-29 23:34:44 INFO juju.worker.upgrader upgrader.go:121 desired tool version: 1.18.1.6
2014-05-30 00:14:51 INFO juju.worker.upgrader upgrader.go:121 desired tool version: 1.19.2
2014-05-30 00:14:51 INFO juju.worker.upgrader upgrader.go:139 upgrade requested from 1.18.1.6-precise-amd64 to 1.19.2
2014-05-30 00:14:57 INFO juju.worker.upgrader upgrader.go:172 fetching tools from "https://streams.canonical.com/juju/tools/releases/juju-1.19.2-precise-amd64.tgz"
2014-05-30 00:15:27 INFO juju.worker.upgrader upgrader.go:186 unpacked tools 1.19.2-precise-amd64 to /var/lib/juju/.juju/local
2014-05-30 00:15:27 ERROR juju runner.go:209 worker: fatal "upgrader": must restart: an agent upgrade is available
2014-05-30 00:15:32 INFO juju.worker.upgrader error.go:32 upgraded from 1.18.1.6-precise-amd64 to 1.19.2-precise-amd64 ("https://streams.canonical.com/juju/tools/releases/juju-1.19.2-precise-amd64.tgz")

I then got about an hour of an error that repeated 700+ times; only the last two iterations are reproduced here:

2014-05-30 00:54:41 INFO juju runner.go:262 worker: start "upgrader"
2014-05-30 00:54:41 INFO juju.worker.upgrader upgrader.go:121 desired tool version: 1.19.2
2014-05-30 00:54:41 INFO juju.worker.upgrader upgrader.go:139 upgrade requested from 1.18.1.6-precise-amd64 to 1.19.2
2014-05-30 00:54:48 ERROR juju runner.go:209 worker: fatal "upgrader": must restart: an agent upgrade is available
2014-05-30 00:54:48 INFO juju.worker.upgrader error.go:32 upgraded from 1.18.1.6-precise-amd64 to 1.19.2-precise-amd64 ("https://streams.canonical.com/juju/tools/releases/juju-1.19.2-precise-amd64.tgz")
2014-05-30 00:54:51 INFO juju runner.go:262 worker: start "upgrader"
2014-05-30 00:54:51 INFO juju.worker.upgrader upgrader.go:121 desired tool version: 1.19.2
2014-05-30 00:54:51 INFO juju.worker.upgrader upgrader.go:139 upgrade requested from 1.18.1.6-precise-amd64 to 1.19.2
2014-05-30 00:54:57 INFO juju.worker.upgrader upgrader.go:172 fetching tools from "https://streams.canonical.com/juju/tools/releases/juju-1.19.2-precise-amd64.tgz"
2014-05-30 00:55:25 INFO juju.worker.upgrader upgrader.go:186 unpacked tools 1.19.2-precise-amd64 to /home/juju/.juju/local
2014-05-30 00:55:25 ERROR juju runner.go:209 worker: fatal "upgrader": must restart: an agent upgrade is available
2014-05-30 00:55:25 INFO juju.worker.upgrader error.go:32 upgraded from 1.18.1.6-precise-amd64 to 1.19.2-precise-amd64 ("https://streams.canonical.com/juju/tools/releases/juju-1.19.2-precise-amd64.tgz")
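
A loop like this can be quantified with a small script. The sketch below counts "upgrade requested" lines per source/target version pair. The regular expression is derived only from the log lines quoted above, so other agent versions may format these lines differently. The sample list stands in for the real machine-0.log contents.

```python
import re
from collections import Counter

# Pattern based on the "upgrade requested" lines quoted in this report.
UPGRADE_REQUEST = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) INFO juju\.worker\.upgrader "
    r"upgrader\.go:\d+ upgrade requested from (?P<src>\S+) to (?P<dst>\S+)$"
)


def count_upgrade_requests(lines):
    """Count upgrade requests, keyed by (current tools, requested version)."""
    counts = Counter()
    for line in lines:
        match = UPGRADE_REQUEST.match(line.strip())
        if match:
            counts[(match.group("src"), match.group("dst"))] += 1
    return counts


# Sample input: a few of the lines quoted above.
sample = [
    '2014-05-30 00:54:41 INFO juju runner.go:262 worker: start "upgrader"',
    "2014-05-30 00:54:41 INFO juju.worker.upgrader upgrader.go:121 desired tool version: 1.19.2",
    "2014-05-30 00:54:41 INFO juju.worker.upgrader upgrader.go:139 upgrade requested from 1.18.1.6-precise-amd64 to 1.19.2",
    '2014-05-30 00:54:51 INFO juju runner.go:262 worker: start "upgrader"',
    "2014-05-30 00:54:51 INFO juju.worker.upgrader upgrader.go:139 upgrade requested from 1.18.1.6-precise-amd64 to 1.19.2",
]

counts = count_upgrade_requests(sample)
for (src, dst), n in counts.items():
    print(f"{src} -> {dst}: requested {n} times")
```

A high count for a single pair, with the source version never changing, suggests the agent keeps restarting without actually switching to the new tools.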

Christian Reis (kiko) wrote :

I now have an error that recurs roughly every 4 hours:

2014-05-30 17:01:13 DEBUG juju.worker runner.go:241 killing "upgrader"
2014-05-30 17:01:13 ERROR juju.worker runner.go:207 fatal "upgrader": watcher iteration error: read tcp 127.0.0.1:37017: i/o timeout
2014-05-30 17:01:41 INFO juju.worker runner.go:260 start "upgrader"
2014-05-30 17:01:41 INFO juju.worker.upgrader upgrader.go:121 desired tool version: 1.19.2

I assume this failure is linked to bug 1307434.

Curtis Hovey (sinzui) wrote :

What was the version of the juju client that issued the upgrade-juju command? Was it 1.18.x, or was the client 1.19.x?

tags: added: upgrade-juju
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
milestone: none → next-stable
Christian Reis (kiko) wrote :

It was 1.18.3, as I noted in the description. I had never installed 1.19, nor was I trying to move to development!

Wayne Witzel III (wwitzel3) wrote :

We were able to get the agent back online by manually connecting to mongo using the admin-secret from the local.jenv file. We then manually issued the rs.initiate() command. Once this was done, Juju was able to connect to mongo again and continue performing its operations.

After that, we ran into another issue: the LXC containers were trying to connect to what appeared to be the wrong web socket service address. The user (hackedbellini) updated the apiaddress field in one of the agents.conf files to point to the proper wss:// resource, and that agent was then properly upgraded to 1.20.1 as well.

The user is continuing that process for the rest of the agents.conf files for each LXC machine.

I was working with hackedbellini on #juju-dev IRC.

Changed in juju-core:
assignee: nobody → Wayne Witzel III (wwitzel3)
status: Triaged → In Progress
Wayne Witzel III (wwitzel3) wrote :

We also manually added the machine ID as a tag on the matching entry in the members array of the replica set configuration (the system.replset collection in the local database):

db.system.replset.update({_id:"juju", "members._id": 0}, {$set: {"members.$.tags": {"juju-machine-id":"0"}}})

This seems to have resolved all of the issues and the user is back up and running.
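
To make explicit what the positional `$` in that update does, here is a plain-Python sketch operating on an in-memory dictionary instead of a live mongo instance. The shape of the configuration document (in particular the host field) is an assumption for illustration, and the helper name is hypothetical.

```python
def set_member_tags(config, replset_id, member_id, tags):
    """Mimic {$set: {"members.$.tags": tags}} on one replica set config document.

    The document must have the given _id, and only the first member whose
    _id equals `member_id` is changed, like the positional `$` operator.
    Returns True if a member was updated.
    """
    if config.get("_id") != replset_id:
        return False
    for member in config.get("members", []):
        if member.get("_id") == member_id:
            member["tags"] = dict(tags)
            return True
    return False


# Hypothetical configuration document before the update.
config = {
    "_id": "juju",
    "members": [{"_id": 0, "host": "127.0.0.1:37017"}],
}

updated = set_member_tags(config, "juju", 0, {"juju-machine-id": "0"})
print(updated, config["members"][0])
```

After the call, member 0 carries the juju-machine-id tag that the update above sets.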

Wayne Witzel III (wwitzel3) wrote :

During this process I was unable to identify what caused the user to jump to unstable in the first place, and many of my attempts to replicate it have failed.

Thiago Bellini (bellini666) wrote :

The messages wwitzel3 wrote above were about a problem that happened because of this bug.

We were on 1.19.3 and tried to move back to stable by upgrading to version 1.20.1. After the upgrade, the agent would not start at all... juju was practically dead for some weeks.

Now, after wwitzel3's help, everything is running OK and even better than before. Anyone with the same issue should try following the steps described above.

Let's hope now that this is fixed soon and any future upgrade-juju will not upgrade us to another development version.

Thiago Bellini (bellini666) wrote :

I forgot to mention: the upgrade to 1.20.1 broke juju because it was missing (as described by thumper on #juju-dev) the "HA" setup, and that setup would only be triggered by an upgrade from 1.18.x to 1.20.x (and, as described in this bug, we were on 1.19.3).

Changed in juju-core:
status: In Progress → Fix Released
Thiago Bellini (bellini666) wrote :

wwitzel3: You marked this as "Fix Released", but is it really fixed in the sense that, if I run "upgrade-juju" now, it will not upgrade my stable environment to a development release?

Curtis Hovey (sinzui)
Changed in juju-core:
milestone: next-stable → 1.21-alpha1