all units have false hook errors after reboot
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| juju-core | Expired | High | Unassigned | |
Bug Description
I have a multi-node deployment with services running in LXC containers across nodes (HA). When I reboot a node, all units on that node show an error for the workload-status once it comes back up. This issue has been seen in the past and was supposedly fixed, but I am running Juju 1.24.5 and I still see it.
# Juju status for one of the services:

```yaml
environment: test
machines:
  "1":
    agent-state: started
    agent-version: 1.24.5
    dns-name: krueger.maas
    instance-id: /MAAS/api/
    series: trusty
    containers:
      1/lxc/0:
        dns-name: 10.232.16.32
        series: trusty
        hardware: arch=amd64
    hardware: arch=amd64 cpu-cores=32 mem=32768M tags=api
  "2":
    agent-state: started
    agent-version: 1.24.5
    dns-name: kearns.maas
    instance-id: /MAAS/api/
    series: trusty
    containers:
      2/lxc/0:
        dns-name: 10.232.16.44
        series: trusty
        hardware: arch=amd64
    hardware: arch=amd64 cpu-cores=32 mem=32768M tags=api
  "3":
    agent-state: started
    agent-version: 1.24.5
    dns-name: doble.maas
    instance-id: /MAAS/api/
    series: trusty
    containers:
      3/lxc/0:
        dns-name: 10.232.16.45
        series: trusty
        hardware: arch=amd64
    hardware: arch=amd64 cpu-cores=32 mem=32768M tags=api
services:
  cinder:
    charm: local:trusty/
    exposed: false
    service-status:
      current: unknown
      since: 17 Sep 2015 11:48:58Z
    relations:
      amqp:
      - rabbitmq-server
      ceph:
      - ceph
      cinder-
      - nova-cloud-
      cluster:
      - cinder
      ha:
      - cinder-hacluster
      identity-
      - keystone
      image-
      - glance
      shared-db:
      - percona-cluster
    units:
      cinder/0:
        workload-status:
          current: unknown
          since: 17 Sep 2015 11:46:25Z
        agent-status:
          current: idle
          since: 24 Sep 2015 12:32:35Z
          version: 1.24.5
        machine: 1/lxc/0
      cinder/1:
        workload-status:
          current: unknown
          since: 17 Sep 2015 11:48:58Z
        agent-status:
          current: idle
          since: 24 Sep 2015 12:32:21Z
          version: 1.24.5
        machine: 2/lxc/0
      cinder/2:
        workload-status:
          current: unknown
          since: 17 Sep 2015 11:48:51Z
        agent-status:
          current: idle
          since: 24 Sep 2015 12:32:20Z
          version: 1.24.5
        machine: 3/lxc/0
  cinder-hacluster:
    charm: local:trusty/
    exposed: false
    service-status: {}
    relations:
      ha:
      - cinder
      hanode:
      - cinder-hacluster
    subordinate-to:
    - cinder
networks:
  maas-eth0:
    provider-id: maas-eth0
    cidr: 10.232.16.0/21
```
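Given status output like the above, one quick way to list the units stuck this way is to scan the YAML for the `current:` value inside each `workload-status:` block (the agent's own `current: idle` must be skipped). A minimal sketch; the file path and the excerpt below are illustrative stand-ins, not taken from the deployment:

```shell
# Stand-in excerpt mirroring the juju 1.24 workload-status/agent-status
# layout (path is hypothetical).
STATUS=/tmp/juju-status-sample.yaml
cat > "$STATUS" <<'EOF'
units:
  cinder/0:
    workload-status:
      current: unknown
    agent-status:
      current: idle
  cinder/1:
    workload-status:
      current: unknown
    agent-status:
      current: idle
EOF
# Print each unit with its workload status: only the first "current:"
# after a "workload-status:" line belongs to the workload.
awk '
  /^  cinder\/[0-9]+:/ { unit = $1 }
  in_ws && /current:/  { print unit, $2; in_ws = 0 }
  /workload-status:/   { in_ws = 1 }
' "$STATUS"
# prints:
#   cinder/0: unknown
#   cinder/1: unknown
```

Both units here report `unknown` for the workload while the agent itself is `idle`, which is exactly the mismatch described above.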
# Looking at the unit logs there are no errors:

```
ubuntu@
Warning: Permanently added '10.232.16.5' (ECDSA) to the list of known hosts.
Warning: Permanently added '10.232.16.32' (ECDSA) to the list of known hosts.
sudo: unable to resolve host juju-machine-
Connection to 10.232.16.32 closed.
```
On the juju state-server I see a ton of this in /var/log/
```
2015-09-24 12:26:48 ERROR juju.rpc server.go:573 error writing response: EOF
2015-09-24 12:26:48 ERROR juju.rpc server.go:573 error writing response: EOF
2015-09-24 12:26:48 ERROR juju.rpc server.go:573 error writing response: EOF
2015-09-24 12:26:48 ERROR juju.rpc server.go:573 error writing response: EOF
...
```
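The flood of identical RPC errors is easy to quantify with a plain `grep` count. The full log path is truncated in the report, so this sketch writes the repeated lines to a stand-in file first; the path is an assumption:

```shell
# Stand-in log file (the real path under /var/log/ is truncated above).
LOG=/tmp/machine-log-sample.log
cat > "$LOG" <<'EOF'
2015-09-24 12:26:48 ERROR juju.rpc server.go:573 error writing response: EOF
2015-09-24 12:26:48 ERROR juju.rpc server.go:573 error writing response: EOF
2015-09-24 12:26:48 ERROR juju.rpc server.go:573 error writing response: EOF
EOF
# Count the repeated RPC write failures.
grep -c 'error writing response: EOF' "$LOG"
# prints 3 for this sample
```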
tags: added: sts

Changed in juju-core:
- status: New → Triaged
- importance: Undecided → High
- milestone: none → 1.25-beta2

Changed in juju-core:
- milestone: 1.25-beta2 → 1.24.7

Changed in juju-core:
- importance: High → Critical

Changed in juju-core:
- importance: Critical → High

Changed in juju-core:
- milestone: 1.24.7 → 1.24.8

Changed in juju-core:
- milestone: 1.24.8 → none

Changed in juju-core:
- assignee: Cheryl Jennings (cherylj) → nobody
Taking a look...