2015-11-10 15:05:35 |
Jorge Niedbalski |
bug |
|
|
added bug |
2015-11-10 15:09:13 |
Jorge Niedbalski |
description |
[Environment]
Juju-core 1.24.5 (Still reproducible with 1.24.7)
Trusty 14.04
[Description]
We noticed this behavior on multiple machines.
First, for some unknown reason, the agent fails to authenticate to the state servers.
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
Then, because of the code linked below, the workers are stopped and the agent is uninstalled.
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66
184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished
After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found |
[Environment]
Juju-core 1.24.5 (Still reproducible with 1.24.7)
Trusty 14.04
[Description]
We noticed this behavior on multiple machines.
First, for some unknown reason, the agent fails to authenticate to the state servers.
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
Then, because of the code linked below, the workers are stopped and the agent is uninstalled.
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66
184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished
After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
[Other ways to reproduce]
1) Modify the apipassword in /var/lib/juju/agents/machine-8/agent.conf
2) Restart jujud-machine-8
The following error is printed:
2015-11-10 14:14:28 ERROR juju.worker.apicaller open.go:169 agent terminating due to error returned during API open: invalid entity name or password
2015-11-10 14:14:28 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
2015-11-10 14:14:28 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
2015-11-10 14:14:28 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
3) /var/lib/juju disappears
ls -lh /var/lib/juju
ls: cannot access /var/lib/juju: No such file or directory
Note:
Please note that we still don't know exactly what caused the authentication to fail. Even if agent.conf was modified manually, I don't think that wiping the Juju agent from the machine is a correct reaction. |
|
2015-11-10 17:04:02 |
Cheryl Jennings |
juju-core: status |
New |
Triaged |
|
2015-11-10 17:05:57 |
Cheryl Jennings |
juju-core: importance |
Undecided |
High |
|
2015-11-10 18:26:17 |
Dominique Poulain |
bug |
|
|
added subscriber Dominique Poulain |
2016-02-08 19:09:55 |
Jorge Niedbalski |
tags |
sts |
sts sts-needs-review |
|
2016-02-08 19:14:30 |
Jorge Niedbalski |
tags |
sts sts-needs-review |
|
|
2016-05-05 17:29:44 |
Jorge Niedbalski |
description |
[Environment]
Juju-core 1.24.5 (Still reproducible with 1.24.7)
Trusty 14.04
[Description]
We noticed this behavior on multiple machines.
First, for some unknown reason, the agent fails to authenticate to the state servers.
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
Then, because of the code linked below, the workers are stopped and the agent is uninstalled.
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66
184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished
After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
[Other ways to reproduce]
1) Modify the apipassword in /var/lib/juju/agents/machine-8/agent.conf
2) Restart jujud-machine-8
The following error is printed:
2015-11-10 14:14:28 ERROR juju.worker.apicaller open.go:169 agent terminating due to error returned during API open: invalid entity name or password
2015-11-10 14:14:28 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
2015-11-10 14:14:28 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
2015-11-10 14:14:28 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
3) /var/lib/juju disappears
ls -lh /var/lib/juju
ls: cannot access /var/lib/juju: No such file or directory
Note:
Please note that we still don't know exactly what caused the authentication to fail. Even if agent.conf was modified manually, I don't think that wiping the Juju agent from the machine is a correct reaction. |
[Environment]
Juju-core 1.24.5 - 1.25.5 (Still reproducible with 1.24.7)
Trusty 14.04
[Description]
We noticed this behavior on multiple machines.
First, for some unknown reason, the agent fails to authenticate to the state servers.
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
Then, because of the code linked below, the workers are stopped and the agent is uninstalled.
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66
184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished
After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
[Other ways to reproduce]
1) Modify the apipassword in /var/lib/juju/agents/machine-8/agent.conf
2) Restart jujud-machine-8
The following error is printed:
2015-11-10 14:14:28 ERROR juju.worker.apicaller open.go:169 agent terminating due to error returned during API open: invalid entity name or password
2015-11-10 14:14:28 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
2015-11-10 14:14:28 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
2015-11-10 14:14:28 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
3) /var/lib/juju disappears
ls -lh /var/lib/juju
ls: cannot access /var/lib/juju: No such file or directory
Note:
Please note that we still don't know exactly what caused the authentication to fail. Even if agent.conf was modified manually, I don't think that wiping the Juju agent from the machine is a correct reaction. |
|
2016-05-06 15:14:44 |
Eric Snow |
juju-core: assignee |
|
Eric Snow (ericsnowcurrently) |
|
2016-05-06 16:07:59 |
Eric Snow |
juju-core: status |
Triaged |
In Progress |
|
2016-05-10 15:56:26 |
Eric Snow |
juju-core: assignee |
Eric Snow (ericsnowcurrently) |
|
|
2016-05-10 15:56:32 |
Eric Snow |
juju-core: status |
In Progress |
Triaged |
|
2016-05-10 15:56:46 |
Eric Snow |
nominated for series |
|
juju-core/1.25 |
|
2016-05-10 15:56:46 |
Eric Snow |
bug task added |
|
juju-core/1.25 |
|
2016-05-10 15:56:53 |
Eric Snow |
juju-core/1.25: status |
New |
Triaged |
|
2016-05-10 15:56:56 |
Eric Snow |
juju-core/1.25: importance |
Undecided |
High |
|
2016-05-10 16:03:48 |
Eric Snow |
summary |
Invalid entity name or password error, causes Juju to uninstall |
"Invalid entity name or password" error with valid credentials. |
|
2016-05-10 16:05:31 |
Eric Snow |
description |
[Environment]
Juju-core 1.24.5 - 1.25.5 (Still reproducible with 1.24.7)
Trusty 14.04
[Description]
We noticed this behavior on multiple machines.
First, for some unknown reason, the agent fails to authenticate to the state servers.
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
Then, because of the code linked below, the workers are stopped and the agent is uninstalled.
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66
184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished
After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
[Other ways to reproduce]
1) Modify the apipassword in /var/lib/juju/agents/machine-8/agent.conf
2) Restart jujud-machine-8
The following error is printed:
2015-11-10 14:14:28 ERROR juju.worker.apicaller open.go:169 agent terminating due to error returned during API open: invalid entity name or password
2015-11-10 14:14:28 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
2015-11-10 14:14:28 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
2015-11-10 14:14:28 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
3) /var/lib/juju disappears
ls -lh /var/lib/juju
ls: cannot access /var/lib/juju: No such file or directory
Note:
Please note that we still don't know exactly what caused the authentication to fail. Even if agent.conf was modified manually, I don't think that wiping the Juju agent from the machine is a correct reaction. |
(agent uninstall addressed in lp:1580233 and lp:1580221)
[Environment]
Juju-core 1.24.5 - 1.25.5 (Still reproducible with 1.24.7)
Trusty 14.04
[Description]
We noticed this behavior on multiple machines.
First, for some unknown reason, the agent fails to authenticate to the state servers.
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
Then, because of the code linked below, the workers are stopped and the agent is uninstalled.
https://github.com/juju/juju/blob/4e3972b33f1fc9db13d0779a5a6b0ff7fab9cdb0/worker/apicaller/open.go#L66
184168-2015-10-22 09:06:02 INFO juju.worker runner.go:275 stopped "api", err: try again
184169-2015-10-22 09:06:02 DEBUG juju.worker runner.go:203 "api" done: try again
184170-2015-10-22 09:06:02 ERROR juju.worker runner.go:223 exited "api": try again
184171-2015-10-22 09:06:02 INFO juju.worker runner.go:261 restarting "api" in 3s
184172-2015-10-22 09:06:05 INFO juju.worker runner.go:269 start "api"
184173-2015-10-22 09:06:05 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184174-2015-10-22 09:06:05 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184175-2015-10-22 09:21:49 INFO juju.api apiclient.go:331 dialing "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184176-2015-10-22 09:21:50 INFO juju.api apiclient.go:263 connection established to "wss://172.30.8.124:17070/environment/cb173249-d3a0-4667-8833-93bf8b804731/api"
184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password
184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
184179:2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
184181-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "statestarter"
184182-2015-10-22 09:21:50 DEBUG juju.worker runner.go:248 killing "termination"
184183-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "statestarter", err: <nil>
184184-2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "termination", err: <nil>
184185-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "statestarter" done: <nil>
184186-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "statestarter" from known workers
184187-2015-10-22 09:21:50 DEBUG juju.worker runner.go:203 "termination" done: <nil>
184188-2015-10-22 09:21:50 DEBUG juju.worker runner.go:227 no restart, removing "termination" from known workers
184189-2015-10-22 09:21:50 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184190-2015-10-22 09:21:51 DEBUG juju.service discovery.go:115 discovered init system "upstart" from local host
184191-2015-10-22 09:21:51 INFO juju.cmd supercommand.go:436 command finished
After this point, you can observe messages such as:
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
/proc/self/fd/9: 9: exec: /var/lib/juju/tools/unit-nova-cloud-controller-0/jujud: not found
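To check other machines for the same sequence, the log excerpt above can be scanned for an authentication failure followed by the fatal "api" termination. The following is only a minimal sketch, not part of Juju: the function name find_terminations and the regular expression are assumptions made for illustration, and the line format (an optional grep-style "184177-" or "184180:" prefix, then a timestamp and level) is inferred from the excerpt shown here.

```python
import re

# Message fragments copied from the log excerpt in this report.
AUTH_FAILURE = "invalid entity name or password"
TERMINATE = 'fatal "api": agent should be terminated'

# Optional grep-style prefix ("184177-" or "184180:"), then timestamp, level, rest.
# This pattern is an assumption based on the lines quoted above.
LINE_RE = re.compile(
    r"^(?:\d+[-:])?(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (.*)$"
)


def find_terminations(lines):
    """Return (auth_failure_time, termination_time) pairs in log order."""
    pairs = []
    pending = None  # timestamp of the most recent unmatched auth failure
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip lines that do not look like agent log entries
        timestamp, _level, rest = match.groups()
        if AUTH_FAILURE in rest:
            pending = timestamp
        elif TERMINATE in rest and pending is not None:
            pairs.append((pending, timestamp))
            pending = None
    return pairs


if __name__ == "__main__":
    sample = [
        '184177-2015-10-22 09:21:50 ERROR juju.cmd.jujud agent.go:298 agent terminating due to error returned during API open: invalid entity name or password',
        '184178:2015-10-22 09:21:50 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated',
        '184180:2015-10-22 09:21:50 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated',
    ]
    for failed_at, terminated_at in find_terminations(sample):
        print("auth failure at", failed_at, "-> agent terminated at", terminated_at)
```

A match only shows that the agent hit this code path; it does not explain why the authentication failed in the first place.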
[Other ways to reproduce]
1) Modify the apipassword in /var/lib/juju/agents/machine-8/agent.conf
2) Restart jujud-machine-8
The following error is printed:
2015-11-10 14:14:28 ERROR juju.worker.apicaller open.go:169 agent terminating due to error returned during API open: invalid entity name or password
2015-11-10 14:14:28 INFO juju.worker runner.go:275 stopped "api", err: agent should be terminated
2015-11-10 14:14:28 DEBUG juju.worker runner.go:203 "api" done: agent should be terminated
2015-11-10 14:14:28 ERROR juju.worker runner.go:212 fatal "api": agent should be terminated
3) /var/lib/juju disappears
ls -lh /var/lib/juju
ls: cannot access /var/lib/juju: No such file or directory
Note:
Please note that we still don't know exactly what caused the authentication to fail. Even if agent.conf was modified manually, I don't think that wiping the Juju agent from the machine is a correct reaction. |
|
2016-05-11 21:05:00 |
Cheryl Jennings |
juju-core/1.25: milestone |
|
1.25.6 |
|
2016-05-11 21:05:02 |
Cheryl Jennings |
juju-core: milestone |
|
2.0-beta7 |
|
2016-05-13 16:21:59 |
Curtis Hovey |
juju-core: milestone |
2.0-beta7 |
2.0-beta8 |
|
2016-05-24 22:48:28 |
Anastasia |
juju-core/1.25: assignee |
|
Anastasia (anastasia-macmood) |
|
2016-05-24 22:48:31 |
Anastasia |
juju-core/1.25: status |
Triaged |
In Progress |
|
2016-05-26 22:14:45 |
Anastasia |
juju-core/1.25: importance |
High |
Critical |
|
2016-05-26 22:15:13 |
Anastasia |
tags |
|
blocker |
|
2016-05-27 06:09:19 |
Anastasia |
juju-core/1.25: status |
In Progress |
Fix Committed |
|
2016-05-27 06:09:26 |
Anastasia |
juju-core: assignee |
|
Anastasia (anastasia-macmood) |
|
2016-05-27 06:09:31 |
Anastasia |
juju-core: status |
Triaged |
In Progress |
|
2016-05-27 06:12:54 |
Anastasia |
tags |
blocker |
|
|
2016-05-27 08:22:40 |
Anastasia |
juju-core: importance |
High |
Critical |
|
2016-05-27 08:22:48 |
Anastasia |
tags |
|
blocker |
|
2016-05-27 16:15:10 |
Alexis Bruemmer |
juju-core: status |
In Progress |
Fix Committed |
|
2016-05-27 18:41:48 |
Curtis Hovey |
tags |
blocker |
|
|
2016-05-30 21:29:49 |
Anastasia |
juju-core: status |
Fix Committed |
In Progress |
|
2016-05-30 21:29:51 |
Anastasia |
juju-core/1.25: status |
Fix Committed |
In Progress |
|
2016-05-30 21:29:54 |
Anastasia |
juju-core: importance |
Critical |
High |
|
2016-05-30 21:29:56 |
Anastasia |
juju-core/1.25: importance |
Critical |
High |
|
2016-06-01 04:43:25 |
Anastasia |
juju-core/1.25: importance |
High |
Critical |
|
2016-06-01 04:43:36 |
Anastasia |
tags |
|
blocker |
|
2016-06-01 06:38:19 |
Anastasia |
juju-core: importance |
High |
Critical |
|
2016-06-02 21:38:07 |
Anastasia |
juju-core: status |
In Progress |
Fix Committed |
|
2016-06-02 21:38:10 |
Anastasia |
juju-core/1.25: status |
In Progress |
Fix Committed |
|
2016-06-03 17:52:30 |
Curtis Hovey |
juju-core: status |
Fix Committed |
Fix Released |
|
2016-06-07 20:18:21 |
Cheryl Jennings |
tags |
blocker |
|
|
2016-07-14 21:09:58 |
Curtis Hovey |
juju-core/1.25: status |
Fix Committed |
Fix Released |
|
2016-08-23 00:57:04 |
Canonical Juju QA Bot |
affects |
juju-core |
juju |
|
2016-08-23 00:57:04 |
Canonical Juju QA Bot |
juju: milestone |
2.0-beta8 |
|
|
2016-08-23 00:57:07 |
Canonical Juju QA Bot |
juju: milestone |
|
2.0-beta8 |
|
2016-08-23 13:24:31 |
Canonical Juju QA Bot |
juju-core: importance |
Undecided |
Critical |
|
2016-08-23 13:24:31 |
Canonical Juju QA Bot |
juju-core: status |
New |
Fix Released |
|
2016-08-23 13:24:31 |
Canonical Juju QA Bot |
juju-core: assignee |
|
Anastasia (anastasia-macmood) |
|