Many charms are in an error state and can't set or get relation data on different units.
For example, running: relation-get --format=json -r dashboards:209 - grafana/0 in debug-hooks results in 'ERROR permission denied'.
strace shows the hook tool connecting to '/var/lib/juju/agents/unit-grafana-0/agent.socket'. However, this file doesn't exist; there is a run.socket, but no agent.socket.
The strace output:
newfstatat(AT_FDCWD, ".", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/var/lib/juju/agents/unit-grafana-0/charm", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, ".", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
newfstatat(AT_FDCWD, "/var/lib/juju/agents/unit-grafana-0/charm", {st_mode=S_IFDIR|0755, st_size=4096, ...}, 0) = 0
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC|SOCK_NONBLOCK, 0) = 3
setsockopt(3, SOL_SOCKET, SO_BROADCAST, [1], 4) = 0
connect(3, {sa_family=AF_UNIX, sun_path=@"/var/lib/juju/agents/unit-grafana-0/agent.socket"}, 51) = 0
epoll_create1(EPOLL_CLOEXEC) = 4
pipe2([5, 6], O_NONBLOCK|O_CLOEXEC) = 0
epoll_ctl(4, EPOLL_CTL_ADD, 5, {EPOLLIN, {u32=12757296, u64=12757296}}) = 0
epoll_ctl(4, EPOLL_CTL_ADD, 3, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=4000936304, u64=139796596486512}}) = 0
getsockname(3, {sa_family=AF_UNIX}, [112->2]) = 0
getpeername(3, {sa_family=AF_UNIX, sun_path=@"/var/lib/juju/agents/unit-grafana-0/agent.socket"}, [112->51]) = 0
write(3, "/\377\201\3\1\1\7Request\1\377\202\0\1\2\1\rServiceMet"..., 344) = 344
read(3, 0xc00016d000, 4096) = -1 EAGAIN (Resource temporarily unavailable)
epoll_pwait(4, [{EPOLLOUT, {u32=4000936304, u64=139796596486512}}], 128, 0, NULL, 2) = 1
epoll_pwait(4, [{EPOLLIN|EPOLLOUT, {u32=4000936304, u64=139796596486512}}], 128, -1, NULL, 0) = 1
futex(0xc001b8, FUTEX_WAKE_PRIVATE, 1) = 1
read(3, ":\377\201\3\1\1\10Response\1\377\202\0\1\3\1\rServiceMe"..., 4096) = 165
futex(0xc000046bc8, FUTEX_WAKE_PRIVATE, 1) = 1
read(3, 0xc00016d000, 4096) = -1 EAGAIN (Resource temporarily unavailable)
write(1, "", 0) = 0
write(2, "ERROR permission denied\n", 24ERROR permission denied
) = 24
epoll_ctl(4, EPOLL_CTL_DEL, 3, 0xc00011da5c) = 0
close(3) = 0
exit_group(1) = ?
+++ exited with 1 +++
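To capture the failure from a script rather than under strace, something like the following could be used. This is only a minimal sketch: the helper name relation_get_safe and its run parameter are hypothetical; only the relation-get arguments are taken from the command shown in this report.

```python
import json
import subprocess


def relation_get_safe(relation_id, unit, run=subprocess.run):
    """Run relation-get and return (data, error) instead of raising.

    `run` is injectable so the helper can be exercised outside a hook
    context; by default it is subprocess.run.
    """
    cmd = ["relation-get", "--format=json", "-r", relation_id, "-", unit]
    proc = run(cmd, capture_output=True, text=True)
    if proc.returncode != 0:
        # The hook tool reports failures on stderr,
        # e.g. "ERROR permission denied".
        return None, proc.stderr.strip()
    out = proc.stdout.strip()
    # Empty output is treated as "no data" rather than a JSON error.
    return (json.loads(out) if out else None), None
```

Inside debug-hooks this would be called as relation_get_safe("dashboards:209", "grafana/0"); a non-None second value carries the stderr text.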
Seen in 2.9.16 with an update-status hook failing on relation-get. When we restarted the unit workers, it turned out the relation had been removed.
In this case as well, unit prometheus-libvirt-exporter/91 was the leader, though 5 of 42 units had the error.
unit-prometheus-libvirt-exporter-126: 15:49:47 WARNING unit.prometheus-libvirt-exporter/91.update-status subprocess.CalledProcessError: Command '['relation-get', '--format=json', '-r', 'scrape:713', '-', 'prometheus/0']' returned non-zero exit status 1.
unit-prometheus-libvirt-exporter-126: 15:49:47 ERROR juju.worker.uniter.operation hook "update-status" (via explicit, bespoke hook script) failed: exit status 1
...
unit-prometheus-libvirt-exporter-126: 15:49:46 INFO unit.prometheus-libvirt-exporter/91.juju-log Reactive main running for hook update-status
unit-prometheus-libvirt-exporter-126: 15:49:47 WARNING unit.prometheus-libvirt-exporter/91.update-status ERROR permission denied
2022-02-17 17:19:30 WARNING unit.prometheus-libvirt-exporter/91.update-status logger.go:60 subprocess.CalledProcessError: Command '['relation-get', '--format=json', '-r', 'scrape:668', '-', 'prometheus-libvirt-exporter/91']' returned non-zero exit status 1.
2022-02-17 17:19:31 ERROR juju.worker.uniter.operation runhook.go:146 hook "update-status" (via explicit, bespoke hook script) failed: exit status 1
2022-02-17 17:19:31 INFO juju.worker.uniter resolver.go:150 awaiting error resolution for "update-status" hook
2022-02-17 17:21:03 INFO juju.worker.logger logger.go:136 logger worker stopped
2022-02-17 17:21:03 INFO juju.worker.uniter uniter.go:323 unit "prometheus-libvirt-exporter/91" shutting down: catacomb 0xc0006dc000 is dying
2022-02-17 17:22:42 INFO juju unit_agent.go:253 Starting unit workers for "prometheus-libvirt-exporter/91"
2022-02-17 17:22:42 INFO juju.agent.setup agentconf.go:128 setting logging config to "<root>=INFO"
2022-02-17 17:22:42 INFO juju.worker.apicaller connect.go:158 [666cbd] "unit-prometheus-libvirt-exporter-91" successfully connected to "10.130.12.22:17070"
2022-02-17 17:22:42 INFO juju.worker.migrationminion worker.go:140 migration phase is now: NONE
2022-02-17 17:22:42 INFO juju.worker.logger logger.go:120 logger worker started
2022-02-17 17:22:42 INFO juju.worker.upgrader upgrader.go:219 no waiter, upgrader is done
2022-02-17 17:22:43 WARNING juju.worker.uniter.relation statemanager.go:68 unit prometheus/0 in relation 713 no longer exists
2022-02-17 17:22:43 INFO juju.worker.uniter uniter.go:339 unit "prometheus-libvirt-exporter/91" started
2022-02-17 17:22:43 INFO juju.worker.uniter uniter.go:357 hooks are retried true
2022-02-17 17:22:43 INFO juju.worker.uniter resolver.go:150 awaiting error resolution for "update-status" hook
2022-02-17 17:22:48 INFO juju.worker.uniter resolver.go:150 awaiting error resolution for "update-status" hook
2022-02-17 17:22:49 INFO unit.prometheus-libvirt-exporter/91.juju-log server.go:327 Reactive main running for hook update-status
2022-02-17 17:22:49 INFO unit.prometheus-libvirt-exporter/91.juju-log server.go:327 Initializing Snap Layer
2022-02-17 17:22:49 INFO unit.prometheus-libvirt-exporter/91.juju-log server.go:327 Initializing Leadership L...
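For anyone wanting to check how many units hit this across a model, a rough sketch for tallying the error from collected log text follows. It assumes each log entry is on a single line with the hook output logged as "WARNING unit.<app>/<n>.<hook> ... ERROR permission denied", as in the entries quoted in this report; the function name and the regular expression are hypothetical and may need adjusting for other log layouts.

```python
import re
from collections import Counter

# Matches hook stderr lines such as:
#   ... WARNING unit.prometheus-libvirt-exporter/91.update-status ERROR permission denied
# An optional source location (e.g. "logger.go:60") may sit before "ERROR".
PERMISSION_DENIED = re.compile(
    r"WARNING unit\.(?P<unit>[\w-]+/\d+)\.(?P<hook>[\w-]+) .*ERROR permission denied"
)


def units_with_permission_denied(lines):
    """Count 'ERROR permission denied' hook failures per (unit, hook)."""
    hits = Counter()
    for line in lines:
        match = PERMISSION_DENIED.search(line)
        if match:
            hits[(match.group("unit"), match.group("hook"))] += 1
    return hits
```

Feeding it the lines of a saved debug-log (for example via open(path) as the iterable) gives a count per unit and hook, which can be compared against the total number of units.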