Activity log for bug #1514874

Date Who What changed Old value New value Message
2015-11-10 15:05:35 Jorge Niedbalski bug added bug
2015-11-10 15:09:13 Jorge Niedbalski description
Old value:
[Environment]
Juju-core 1.24.5 (Still reproducible with 1.24.7)
Trusty 14.04

[Description]
We noticed this behavior on multiple machines. First, for some unknown reason, the agent fails to authenticate on the state servers.

184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password

Then, because of this code, the workers are stopped and the agent is uninstalled:
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66

184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished

After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found

New value:
[Environment]
Juju-core 1.24.5 (Still reproducible with 1.24.7)
Trusty 14.04

[Description]
We noticed this behavior on multiple machines. First, for some unknown reason, the agent fails to authenticate on the state servers.

184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password

Then, because of this code, the workers are stopped and the agent is uninstalled:
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66

184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished

After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found

[Other ways to reproduce]
1) Modify the apipassword in /var/lib/juju/agents/machine-8/agent.conf
2) Restart jujud-machine-8
The following error is printed:
2015-11-10 14:14:28 ERROR juju.worker.apicaller open.go:169 agent terminating due to error returned during API open: invalid entity name or password
2015-11-10 14:14:28 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
2015-11-10 14:14:28 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
2015-11-10 14:14:28 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
3) /var/lib/juju disappears
ls -lh /var/lib/juju
ls: cannot access /var/lib/juju: No such file or directory

Note: we still don't know exactly what caused the authentication to fail. Even if agent.conf is modified manually, I don't think wiping the juju agent from the machine is a correct reaction.
2015-11-10 17:04:02 Cheryl Jennings juju-core: status New Triaged
2015-11-10 17:05:57 Cheryl Jennings juju-core: importance Undecided High
2015-11-10 18:26:17 Dominique Poulain bug added subscriber Dominique Poulain
2016-02-08 19:09:55 Jorge Niedbalski tags sts sts sts-needs-review
2016-02-08 19:14:30 Jorge Niedbalski tags sts sts-needs-review
2016-05-05 17:29:44 Jorge Niedbalski description
Old value:
[Environment]
Juju-core 1.24.5 (Still reproducible with 1.24.7)
Trusty 14.04

[Description]
We noticed this behavior on multiple machines. First, for some unknown reason, the agent fails to authenticate on the state servers.

184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password

Then, because of this code, the workers are stopped and the agent is uninstalled:
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66

184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished

After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found

[Other ways to reproduce]
1) Modify the apipassword in /var/lib/juju/agents/machine-8/agent.conf
2) Restart jujud-machine-8
The following error is printed:
2015-11-10 14:14:28 ERROR juju.worker.apicaller open.go:169 agent terminating due to error returned during API open: invalid entity name or password
2015-11-10 14:14:28 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
2015-11-10 14:14:28 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
2015-11-10 14:14:28 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
3) /var/lib/juju disappears
ls -lh /var/lib/juju
ls: cannot access /var/lib/juju: No such file or directory

Note: we still don't know exactly what caused the authentication to fail. Even if agent.conf is modified manually, I don't think wiping the juju agent from the machine is a correct reaction.

New value:
[Environment]
Juju-core 1.24.5 - 1.25.5 (Still reproducible with 1.24.7)
Trusty 14.04

[Description]
We noticed this behavior on multiple machines. First, for some unknown reason, the agent fails to authenticate on the state servers.

184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password

Then, because of this code, the workers are stopped and the agent is uninstalled:
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66

184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished

After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found

[Other ways to reproduce]
1) Modify the apipassword in /var/lib/juju/agents/machine-8/agent.conf
2) Restart jujud-machine-8
The following error is printed:
2015-11-10 14:14:28 ERROR juju.worker.apicaller open.go:169 agent terminating due to error returned during API open: invalid entity name or password
2015-11-10 14:14:28 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
2015-11-10 14:14:28 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
2015-11-10 14:14:28 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
3) /var/lib/juju disappears
ls -lh /var/lib/juju
ls: cannot access /var/lib/juju: No such file or directory

Note: we still don't know exactly what caused the authentication to fail. Even if agent.conf is modified manually, I don't think wiping the juju agent from the machine is a correct reaction.
2016-05-06 15:14:44 Eric Snow juju-core: assignee Eric Snow (ericsnowcurrently)
2016-05-06 16:07:59 Eric Snow juju-core: status Triaged In Progress
2016-05-10 15:56:26 Eric Snow juju-core: assignee Eric Snow (ericsnowcurrently)
2016-05-10 15:56:32 Eric Snow juju-core: status In Progress Triaged
2016-05-10 15:56:46 Eric Snow nominated for series juju-core/1.25
2016-05-10 15:56:46 Eric Snow bug task added juju-core/1.25
2016-05-10 15:56:53 Eric Snow juju-core/1.25: status New Triaged
2016-05-10 15:56:56 Eric Snow juju-core/1.25: importance Undecided High
2016-05-10 16:03:48 Eric Snow summary Invalid entity name or password error, causes Juju to uninstall "Invalid entity name or password" error with valid credentials.
2016-05-10 16:05:31 Eric Snow description
Old value:
[Environment]
Juju-core 1.24.5 - 1.25.5 (Still reproducible with 1.24.7)
Trusty 14.04

[Description]
We noticed this behavior on multiple machines. First, for some unknown reason, the agent fails to authenticate on the state servers.

184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password

Then, because of this code, the workers are stopped and the agent is uninstalled:
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66

184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished

After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found

[Other ways to reproduce]
1) Modify the apipassword in /var/lib/juju/agents/machine-8/agent.conf
2) Restart jujud-machine-8
The following error is printed:
2015-11-10 14:14:28 ERROR juju.worker.apicaller open.go:169 agent terminating due to error returned during API open: invalid entity name or password
2015-11-10 14:14:28 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
2015-11-10 14:14:28 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
2015-11-10 14:14:28 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
3) /var/lib/juju disappears
ls -lh /var/lib/juju
ls: cannot access /var/lib/juju: No such file or directory

Note: we still don't know exactly what caused the authentication to fail. Even if agent.conf is modified manually, I don't think wiping the juju agent from the machine is a correct reaction.

New value:
(agent uninstall addressed in lp:1580233 and lp:1580221)

[Environment]
Juju-core 1.24.5 - 1.25.5 (Still reproducible with 1.24.7)
Trusty 14.04

[Description]
We noticed this behavior on multiple machines. First, for some unknown reason, the agent fails to authenticate on the state servers.

184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password

Then, because of this code, the workers are stopped and the agent is uninstalled:
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66

184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished

After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found

[Other ways to reproduce]
1) Modify the apipassword in /var/lib/juju/agents/machine-8/agent.conf
2) Restart jujud-machine-8
The following error is printed:
2015-11-10 14:14:28 ERROR juju.worker.apicaller open.go:169 agent terminating due to error returned during API open: invalid entity name or password
2015-11-10 14:14:28 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
2015-11-10 14:14:28 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
2015-11-10 14:14:28 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
3) /var/lib/juju disappears
ls -lh /var/lib/juju
ls: cannot access /var/lib/juju: No such file or directory

Note: we still don't know exactly what caused the authentication to fail. Even if agent.conf is modified manually, I don't think wiping the juju agent from the machine is a correct reaction.
2016-05-11 21:05:00 Cheryl Jennings juju-core/1.25: milestone 1.25.6
2016-05-11 21:05:02 Cheryl Jennings juju-core: milestone 2.0-beta7
2016-05-13 16:21:59 Curtis Hovey juju-core: milestone 2.0-beta7 2.0-beta8
2016-05-24 22:48:28 Anastasia juju-core/1.25: assignee Anastasia (anastasia-macmood)
2016-05-24 22:48:31 Anastasia juju-core/1.25: status Triaged In Progress
2016-05-26 22:14:45 Anastasia juju-core/1.25: importance High Critical
2016-05-26 22:15:13 Anastasia tags blocker
2016-05-27 06:09:19 Anastasia juju-core/1.25: status In Progress Fix Committed
2016-05-27 06:09:26 Anastasia juju-core: assignee Anastasia (anastasia-macmood)
2016-05-27 06:09:31 Anastasia juju-core: status Triaged In Progress
2016-05-27 06:12:54 Anastasia tags blocker
2016-05-27 08:22:40 Anastasia juju-core: importance High Critical
2016-05-27 08:22:48 Anastasia tags blocker
2016-05-27 16:15:10 Alexis Bruemmer juju-core: status In Progress Fix Committed
2016-05-27 18:41:48 Curtis Hovey tags blocker
2016-05-30 21:29:49 Anastasia juju-core: status Fix Committed In Progress
2016-05-30 21:29:51 Anastasia juju-core/1.25: status Fix Committed In Progress
2016-05-30 21:29:54 Anastasia juju-core: importance Critical High
2016-05-30 21:29:56 Anastasia juju-core/1.25: importance Critical High
2016-06-01 04:43:25 Anastasia juju-core/1.25: importance High Critical
2016-06-01 04:43:36 Anastasia tags blocker
2016-06-01 06:38:19 Anastasia juju-core: importance High Critical
2016-06-02 21:38:07 Anastasia juju-core: status In Progress Fix Committed
2016-06-02 21:38:10 Anastasia juju-core/1.25: status In Progress Fix Committed
2016-06-03 17:52:30 Curtis Hovey juju-core: status Fix Committed Fix Released
2016-06-07 20:18:21 Cheryl Jennings tags blocker
2016-07-14 21:09:58 Curtis Hovey juju-core/1.25: status Fix Committed Fix Released
2016-08-23 00:57:04 Canonical Juju QA Bot affects juju-core juju
2016-08-23 00:57:04 Canonical Juju QA Bot juju: milestone 2.0-beta8
2016-08-23 00:57:07 Canonical Juju QA Bot juju: milestone 2.0-beta8
2016-08-23 13:24:31 Canonical Juju QA Bot juju-core: importance Undecided Critical
2016-08-23 13:24:31 Canonical Juju QA Bot juju-core: status New Fix Released
2016-08-23 13:24:31 Canonical Juju QA Bot juju-core: assignee Anastasia (anastasia-macmood)
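
The bug description recorded in this log identifies the failure by a log signature: an "invalid entity name or password" error during API open, followed by a fatal "api" worker exit with "agent should be terminated". A minimal sketch of how that signature might be searched for in a jujud log follows. It is not Juju code: the line layout is inferred only from the excerpts quoted in the description, and the names LOG_LINE and find_terminations are hypothetical.

```python
import re

# Layout inferred from the excerpts in the bug description (an assumption):
#   <date> <time> <LEVEL> <module> <file>:<line> <message>
# An optional grep-style prefix such as "184177-" or "184178:" is tolerated.
LOG_LINE = re.compile(
    r"^(?:\d+[-:])?"
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<module>\S+) "
    r"(?P<src>\S+:\d+) "
    r"(?P<msg>.*)$"
)

# Message fragments copied from the log excerpts in the description.
AUTH_FAILURE = "invalid entity name or password"
FATAL_API = 'fatal "api": agent should be terminated'


def find_terminations(lines):
    """Return (auth_failure_ts, fatal_ts) pairs for each auth failure
    that is later followed by a fatal "api" worker exit."""
    events = []
    pending = None  # timestamp of the most recent unmatched auth failure
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if not match:
            continue  # skip lines that do not follow the assumed layout
        msg = match.group("msg")
        if AUTH_FAILURE in msg:
            pending = match.group("ts")
        elif msg == FATAL_API and pending is not None:
            events.append((pending, match.group("ts")))
            pending = None
    return events


if __name__ == "__main__":
    sample = [
        '2015-11-10 14:14:28 ERROR juju.worker.apicaller open.go:169 agent '
        'terminating due to error returned during API open: invalid entity '
        'name or password',
        '2015-11-10 14:14:28 ERROR juju.worker runner.go:212 fatal "api": '
        'agent should be terminated',
    ]
    for auth_ts, fatal_ts in find_terminations(sample):
        print(f"auth failure at {auth_ts}, agent terminated at {fatal_ts}")
```

A match only shows that the agent hit this termination path; it says nothing about why authentication failed, which the report itself leaves open.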