So it appears 1.19.0 is connecting to the database differently than 1.18.1 did:
2014-04-14 18:11:33 DEBUG juju.state.apiserver apiserver.go:127 -> [1] machine-0 40.970088016s {"RequestId":92,"Error":"watcher has been stopped","ErrorCode":"stopped","Response":{}} NotifyWatcher["16"].Next
2014-04-14 18:11:33 INFO juju.rpc server.go:295 error closing codec: write tcp 127.0.0.1:58053: broken pipe
2014-04-14 18:11:33 INFO juju.state.apiserver apiserver.go:135 [1] machine-0 API connection terminated after 6m45.651136547s
2014-04-14 18:11:33 INFO juju.worker.upgrader error.go:32 upgraded from 1.18.1-precise-amd64 to 1.19.0.1-precise-amd64 ("https://s3.amazonaws.com/b6bbc17f4f914a42840f3e5cf9fec7cf/tools/releases/juju-1.19.0.1-precise-amd64.tgz?AWSAccessKeyId=AKIAJQ4KQVZRZLUR6YWQ&Expires=1713118291&Signature=Ah33VHBEO41byrDWndjXB4plWvA%3D")
2014-04-14 18:11:33 ERROR juju.cmd supercommand.go:300 must restart: an agent upgrade is available
2014-04-14 18:11:34 INFO juju.cmd supercommand.go:296 running juju-1.19.0.1-precise-amd64 [gc]
2014-04-14 18:11:34 INFO juju.cmd.jujud machine.go:147 machine agent machine-0 start (1.19.0.1-precise-amd64 [gc])
2014-04-14 18:11:34 DEBUG juju.agent agent.go:365 read agent config, format "1.18"
2014-04-14 18:11:34 INFO juju runner.go:262 worker: start "api"
2014-04-14 18:11:34 INFO juju runner.go:262 worker: start "statestarter"
2014-04-14 18:11:34 INFO juju runner.go:262 worker: start "termination"
2014-04-14 18:11:34 INFO juju.state.api apiclient.go:201 dialing "wss://localhost:17070/"
2014-04-14 18:11:34 INFO juju runner.go:262 worker: start "state"
2014-04-14 18:11:34 INFO juju.agent.mongo mongo.go:189 Ensuring mongo server is running; dataDir /var/lib/juju; port 37017
2014-04-14 18:11:34 DEBUG juju.agent.mongo mongo.go:255 found mongod at: /usr/bin/mongod
2014-04-14 18:11:34 INFO juju.state.api apiclient.go:209 error dialing "wss://localhost:17070/": websocket.Dial wss://localhost:17070/: dial tcp 127.0.0.1:17070: connection refused
2014-04-14 18:11:34 ERROR juju runner.go:220 worker: exited "api": timed out connecting to "wss://localhost:17070/"
2014-04-14 18:11:34 INFO juju runner.go:254 worker: restarting "api" in 3s
2014-04-14 18:11:34 DEBUG juju.agent.mongo mongo.go:262 mongod --version:
db version v2.4.6
Mon Apr 14 18:11:34.222 git version: nogitversion
2014-04-14 18:11:34 INFO juju.state open.go:81 opening state, mongo addresses: ["localhost:37017"]; entity "machine-0"
2014-04-14 18:11:34 DEBUG juju.state open.go:86 dialing mongo
2014-04-14 18:11:34 DEBUG juju.state open.go:92 connection established
2014-04-14 18:11:34 ERROR juju runner.go:220 worker: exited "state": cannot log in to admin database as "machine-0": unauthorized mongo access: auth fails
2014-04-14 18:11:34 INFO juju runner.go:254 worker: restarting "state" in 3s
I think the key point is that because this is an upgrade, we haven't given "machine-0" access to the "admin" database, which means it can't log in to that database.
So either we need to teach upgrade to give machine-0 admin rights (which we've determined to be hard to do), or we treat failures to connect to "admin" as a soft failure.