worker/provisioner missed signal to start new machine

Bug #1248800 reported by Dave Cheney
22
This bug affects 5 people
Affects Status Importance Assigned to Milestone
juju-core
Won't Fix
Medium
Unassigned

Bug Description

juju bootstrap
juju deploy $SERVICE

worker/uniter fails to provision new machine

2013-11-07 02:35:03 INFO juju machine.go:109 machine agent machine-0 start
2013-11-07 02:35:03 DEBUG juju.agent agent.go:232 Reading agent config, format: format 1.16
2013-11-07 02:35:03 INFO juju machine.go:134 Starting StateWorker for machine-0
2013-11-07 02:35:03 INFO juju runner.go:253 worker: start "state"
2013-11-07 02:35:03 INFO juju.state open.go:67 opening state; mongo addresses: ["localhost:37017"]; entity "machine-0"
2013-11-07 02:35:03 INFO juju runner.go:253 worker: start "api"
2013-11-07 02:35:03 INFO juju apiclient.go:111 state/api: dialing "wss://localhost:17070/"
2013-11-07 02:35:03 ERROR juju apiclient.go:116 state/api: websocket.Dial wss://localhost:17070/: dial tcp 127.0.0.1:17070: connection refused
2013-11-07 02:35:03 ERROR juju runner.go:211 worker: exited "api": websocket.Dial wss://localhost:17070/: dial tcp 127.0.0.1:17070: connection refused
2013-11-07 02:35:03 INFO juju runner.go:245 worker: restarting "api" in 3s
2013-11-07 02:35:03 INFO juju.state open.go:105 connection established
2013-11-07 02:35:03 INFO juju runner.go:253 worker: start "firewaller"
2013-11-07 02:35:03 INFO juju runner.go:253 worker: start "addressupdater"
2013-11-07 02:35:03 INFO juju runner.go:253 worker: start "apiserver"
2013-11-07 02:35:03 INFO juju.state.apiserver apiserver.go:42 listening on "[::]:17070"
2013-11-07 02:35:03 INFO juju runner.go:253 worker: start "cleaner"
2013-11-07 02:35:03 INFO juju runner.go:253 worker: start "resumer"
2013-11-07 02:35:03 INFO juju runner.go:253 worker: start "minunitsworker"
2013-11-07 02:35:03 INFO juju.provider.ec2 ec2.go:176 opening environment "ap-southeast-2"
2013-11-07 02:35:03 ERROR juju.worker environ.go:50 loaded invalid environment configuration: environment has no access-key or secret-key
2013-11-07 02:35:03 INFO juju.provider.ec2 ec2.go:176 opening environment "ap-southeast-2"
2013-11-07 02:35:03 ERROR juju.worker environ.go:50 loaded invalid environment configuration: environment has no access-key or secret-key
2013-11-07 02:35:06 INFO juju runner.go:253 worker: start "api"
2013-11-07 02:35:06 INFO juju apiclient.go:111 state/api: dialing "wss://localhost:17070/"
2013-11-07 02:35:06 INFO juju apiclient.go:121 state/api: connection established
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:102 <- [1] <unknown> {"RequestId":1,"Type":"Admin","Request":"Login","Params":{"AuthTag":"machine-0","Password":"wVudtpZq3hrLeetHf/XKGq9B","Nonce":"user-admin:bootstrap"}}
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:109 <- [1] machine-0 175.561964ms {"RequestId":1,"Response":{}} Admin[""].Login
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:102 <- [1] machine-0 {"RequestId":2,"Type":"Agent","Request":"GetEntities","Params":{"Entities":[{"Tag":"machine-0"}]}}
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:109 <- [1] machine-0 770.651us {"RequestId":2,"Response":{"Entities":[{"Life":"alive","Jobs":["JobManageEnviron","JobManageState","JobHostUnits"],"ContainerType":"","Error":null}]}} Agent[""].GetEntities
2013-11-07 02:35:06 INFO juju runner.go:253 worker: start "machiner"
2013-11-07 02:35:06 INFO juju runner.go:253 worker: start "upgrader"
2013-11-07 02:35:06 INFO juju runner.go:253 worker: start "logger"
2013-11-07 02:35:06 DEBUG juju.worker.logger logger.go:35 initial log config: "<root>=DEBUG"
2013-11-07 02:35:06 DEBUG juju.worker.logger logger.go:60 logger setup
2013-11-07 02:35:06 INFO juju runner.go:253 worker: start "lxc-provisioner"
2013-11-07 02:35:06 INFO juju runner.go:253 worker: start "environ-provisioner"
2013-11-07 02:35:06 INFO juju runner.go:253 worker: start "deployer"
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:102 <- [1] machine-0 {"RequestId":4,"Type":"Machiner","Request":"Life","Params":{"Entities":[{"Tag":"machine-0"}]}}
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:102 <- [1] machine-0 {"RequestId":5,"Type":"Upgrader","Request":"SetTools","Params":{"AgentTools":[{"Tag":"machine-0","Tools":{"Version":"1.17.0.1-precise-amd64"}}]}}
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:102 <- [1] machine-0 {"RequestId":6,"Type":"Logger","Request":"LoggingConfig","Params":{"Entities":[{"Tag":"machine-0"}]}}
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:102 <- [1] machine-0 {"RequestId":7,"Type":"Provisioner","Request":"StateAddresses","Params":{}}
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:102 <- [1] machine-0 {"RequestId":8,"Type":"Provisioner","Request":"WatchForEnvironConfigChanges","Params":{}}
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:102 <- [1] machine-0 {"RequestId":9,"Type":"Deployer","Request":"WatchUnits","Params":{"Entities":[{"Tag":"machine-0"}]}}
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:109 <- [1] machine-0 38.362153ms {"RequestId":4,"Response":{"Results":[{"Life":"alive","Error":null}]}} Machiner[""].Life
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:102 <- [1] machine-0 {"RequestId":10,"Type":"Machiner","Request":"SetStatus","Params":{"Entities":[{"Tag":"machine-0","Status":"started","Info":"","Data":null}],"Machines":null}}
2013-11-07 02:35:06 DEBUG juju.state.apiserver apiserver.go:109 <- [1] machine-0 41.690216ms {"RequestId":6,"Response":{"Results":[{"Error":null,"Result":"\u003croot\u003e=INFO"}]}} Logger[""].LoggingConfig
2013-11-07 02:35:06 DEBUG juju.worker.logger logger.go:45 reconfiguring logging from "<root>=DEBUG" to "<root>=INFO"
2013-11-07 02:35:06 ERROR juju runner.go:211 worker: exited "lxc-provisioner": no state server machines with addresses found
2013-11-07 02:35:06 INFO juju.provider.ec2 ec2.go:176 opening environment "ap-southeast-2"
2013-11-07 02:35:06 INFO juju runner.go:245 worker: restarting "lxc-provisioner" in 3s
2013-11-07 02:35:06 ERROR juju runner.go:211 worker: exited "environ-provisioner": no state server machines with addresses found
2013-11-07 02:35:06 INFO juju runner.go:245 worker: restarting "environ-provisioner" in 3s
2013-11-07 02:35:06 INFO juju.worker.upgrader upgrader.go:134 desired tool version: 1.17.0.1
2013-11-07 02:35:06 INFO juju.worker.machiner machiner.go:52 "machine-0" started
2013-11-07 02:35:08 INFO juju.provider.ec2 ec2.go:176 opening environment "ap-southeast-2"
2013-11-07 02:35:08 INFO juju.worker.addressupdater worker.go:49 address updater received inital environment configuration
2013-11-07 02:35:08 INFO juju.provider.ec2 ec2.go:176 opening environment "ap-southeast-2"
2013-11-07 02:35:08 INFO juju.worker.addressupdater updater.go:226 machine "0" has new addresses: [public:ec2-54-253-140-99.ap-southeast-2.compute.amazonaws.com local-cloud:ip-10-248-60-195.ap-southeast-2.compute.internal public:54.253.140.99 local-cloud:10.248.60.195]
2013-11-07 02:35:09 INFO juju runner.go:253 worker: start "lxc-provisioner"
2013-11-07 02:35:09 INFO juju runner.go:253 worker: start "environ-provisioner"
2013-11-07 02:35:09 INFO juju.provider.ec2 ec2.go:176 opening environment "ap-southeast-2"
2013-11-07 02:35:09 INFO juju.environs.tools tools.go:208 finding exact version 1.17.0.1
2013-11-07 02:35:09 INFO juju.environs.tools tools.go:85 reading tools with major.minor version 1.17
2013-11-07 02:35:09 INFO juju.environs.tools tools.go:93 filtering tools by version: 1.17.0.1
2013-11-07 02:35:09 INFO juju.environs.tools tools.go:96 filtering tools by series: precise
2013-11-07 02:35:09 INFO juju.environs.tools tools.go:99 filtering tools by architecture: amd64
2013-11-07 02:35:09 INFO juju.provider.ec2 ec2.go:176 opening environment "ap-southeast-2"
2013-11-07 02:35:09 INFO juju.provisioner provisioner_task.go:102 Starting up provisioner task machine-0
2013-11-07 02:35:09 INFO juju.provisioner provisioner_task.go:243 machine 0 already started as instance "i-872b3fba"
2013-11-07 02:35:09 INFO juju.provisioner provisioner_task.go:234 cannot get machine "1" status: status not found
2013-11-07 02:35:09 INFO juju.provisioner provisioner_task.go:102 Starting up provisioner task machine-0

restarting the provisioner fixed the problem

2013-11-07 02:49:02 ERROR juju runner.go:211 worker: exited "firewaller": machine 1 is not provisioned
2013-11-07 02:49:02 INFO juju runner.go:245 worker: restarting "firewaller" in 3s
2013-11-07 02:49:02 INFO juju.provisioner provisioner_task.go:102 Starting up provisioner task machine-0
2013-11-07 02:49:02 INFO juju.provisioner provisioner_task.go:239 found machine "1" pending provisioning
2013-11-07 02:49:02 INFO juju.provisioner provisioner_task.go:243 machine 2 already started as instance "i-a1e0179d"
2013-11-07 02:49:02 INFO juju.provisioner provisioner_task.go:243 machine 0 already started as instance "i-872b3fba"
2013-11-07 02:49:02 INFO juju.environs.tools tools.go:85 reading tools with major.minor version 1.17
2013-11-07 02:49:02 INFO juju.environs.tools tools.go:93 filtering tools by version: 1.17.0.1
2013-11-07 02:49:02 INFO juju.environs.tools tools.go:96 filtering tools by series: raring
2013-11-07 02:49:02 INFO juju password.go:94 setting password for "machine-1"
2013-11-07 02:49:05 INFO juju runner.go:253 worker: start "firewaller"
2013-11-07 02:49:05 INFO juju.provider.ec2 ec2.go:176 opening environment "ap-southeast-2"
2013-11-07 02:49:06 ERROR juju runner.go:211 worker: exited "firewaller": machine 1 is not provisioned
2013-11-07 02:49:06 INFO juju runner.go:245 worker: restarting "firewaller" in 3s
2013-11-07 02:49:06 INFO juju.provider.ec2 ec2.go:418 started instance "i-11d4c02c"
2013-11-07 02:49:06 INFO juju.provisioner provisioner_task.go:367 started machine 1 as instance i-11d4c02c with hardware "arch=amd64 cpu-cores=1 cpu-power=100 mem=1740M root-disk=8192M"
2013-11-07 02:49:09 INFO juju runner.go:253 worker: start "firewaller"
2013-11-07 02:49:09 INFO juju.provider.ec2 ec2.go:176 opening environment "ap-southeast-2"
2013-11-07 02:49:12 INFO juju.worker.addressupdater updater.go:226 machine "1" has new addresses: [public:ec2-54-253-152-115.ap-southeast-2.compute.amazonaws.com local-cloud:ip-10-248-215-148.ap-southeast-2.compute.internal public:54.253.152.115 local-cloud:10.248.215.148]
2013-11-07 02:51:45 INFO juju password.go:94 setting password for "machine-1"
2013-11-07 02:51:45 INFO juju.provider.ec2 ec2.go:176 opening environment "ap-southeast-2"
2013-11-07 02:51:45 INFO juju.environs.tools tools.go:208 finding exact version 1.17.0.1
2013-11-07 02:51:45 INFO juju.environs.tools tools.go:85 reading tools with major.minor version 1.17
2013-11-07 02:51:45 INFO juju.environs.tools tools.go:93 filtering tools by version: 1.17.0.1
2013-11-07 02:51:45 INFO juju.environs.tools tools.go:96 filtering tools by series: raring
2013-11-07 02:51:45 INFO juju.environs.tools tools.go:99 filtering tools by architecture: amd64

Revision history for this message
Andrew Wilkins (axwalk) wrote :

I *think* this is the problem:
    2013-11-07 02:35:09 INFO juju.provisioner provisioner_task.go:234 cannot get machine "1" status: status not found

But not sure how that happens. I'm not full bottle on mgo/state/transactions, but it looks like adding a machine creates a status entry in the same transaction.

Curtis Hovey (sinzui)
tags: added: deploy
Changed in juju-core:
status: New → Triaged
importance: Undecided → High
Revision history for this message
Sean Feole (sfeole) wrote :

I've experienced this bug a few times, here is a debug log:: I'll see if I can get more logs/data the next time this happens, but wanted to +1 this bug.

machine-0: 2014-03-05 20:38:49 INFO juju.provider.ec2 ec2.go:200 opening environment "amazon"
machine-0: 2014-03-05 20:38:49 INFO juju.worker.instanceupdater worker.go:49 instance poller received inital environment configuration
machine-0: 2014-03-05 20:38:49 DEBUG juju.state.apiserver apiserver.go:120 <- [2]
machine-0 {"RequestId":9,"Type":"KeyUpdater","Request":"AuthorisedKeys","Params":{"Entities":[{"Tag":"machine-0"}]}}
machine-0: 2014-03-05 20:38:49 DEBUG juju.state.apiserver apiserver.go:120 <- [2]
machine-0 {"RequestId":10,"Type":"Provisioner","Request":"Life","Params":{"Entities":[{"Tag":"machine-0"}]}}
machine-0: 2014-03-05 20:38:49 INFO juju.worker.machiner machiner.go:88 setting addresses for machine-0 to ["10.165.31.233" "fe80::2000:aff:fea5:1fe9"]
machine-0: 2014-03-05 20:38:49 DEBUG juju.worker.logger logger.go:45 reconfiguring logging from "<root>=DEBUG" to "<root>=WARNING;unit=DEBUG"
machine-0: 2014-03-05 20:38:50 ERROR juju runner.go:220 worker: exited "environ-provisioner": no state server machines with addresses found

tags: added: hs-arm64
Revision history for this message
Raghuram Kota (rkota) wrote :

Impacts juju local provider fucntionality

Curtis Hovey (sinzui)
Changed in juju-core:
importance: High → Medium
Revision history for this message
Anastasia (anastasia-macmood) wrote :

There have been a lot of changes in the area. We cannot reproduce this with Juju 2.

If you can, please open a bug against "juju" launchpad project with reproducible scenario and newer logs.

Changed in juju-core:
status: Triaged → Won't Fix
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.