Charm installation fails with hook failed: "local-monitors-relation-changed"

Bug #2015344 reported by Bas de Bruijne
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
| --- | --- | --- | --- | --- |
| PostgreSQL Charm | New | Undecided | Unassigned | |

Bug Description

In test run [28fcfc8b-9180-48ed-9d51-7613538d11e9](https://solutions.qa.canonical.com/v2/testruns/28fcfc8b-9180-48ed-9d51-7613538d11e9), we install [a landscape bundle](https://oil-jenkins.canonical.com/artifacts/28fcfc8b-9180-48ed-9d51-7613538d11e9/generated/generated/lma-maas/bundle.yaml) but the postgresql charm fails with status:

```
landscape-postgresql/0* error idle 2 10.246.166.55 hook failed: "local-monitors-relation-changed"
  canonical-livepatch/9 active idle 10.246.166.55 Running kernel 5.15.0-69.76-generic, patchState: nothing-to-apply (source version/commit f1e83ae)
  filebeat/9 active idle 10.246.166.55 Filebeat ready.
  landscape-client/9 maintenance idle 10.246.166.55 Need computer-title and juju-info to proceed
  logrotated/9 active idle 10.246.166.55 Unit is ready.
  nrpe/9 active idle 10.246.166.55 icmp,5666/tcp Ready
  ntp/9 active idle 10.246.166.55 123/udp chrony: Ready, OK: offset is 0.000054
  telegraf/9 active idle 10.246.166.55 9103/tcp Monitoring landscape-postgresql/0 (source version/commit 23.01-4-...)
landscape-rabbitmq-server/0* waiting idle 7 10.246.166.155 5672/tcp,15672/tcp Not reached target cluster-partition-handling mode
  canonical-livepatch/3 active idle 10.246.166.155 Running kernel 5.15.0-69.76-generic, patchState: nothing-to-apply (source version/commit f1e83ae)
  filebeat/3 active idle 10.246.166.155 Filebeat ready.
  landscape-client/3 maintenance idle 10.246.166.155 Need computer-title and juju-info to proceed
  logrotated/3 active idle 10.246.166.155 Unit is ready.
  nrpe/3 active idle 10.246.166.155 icmp,5666/tcp Ready
  ntp/4 active idle 10.246.166.155 123/udp chrony: Ready, OK: offset is 0.000053
  telegraf/3 active idle 10.246.166.155 9103/tcp Monitoring landscape-rabbitmq-server/0 (source version/commit 23.01-4-...)
landscape-server-haproxy/0* active idle 5 10.246.165.90 80/tcp,443/tcp Unit is ready
  canonical-livepatch/1 active idle 10.246.165.90 Running kernel 5.4.0-146.163-generic, patchState: nothing-to-apply (source version/commit f1e83ae)
  filebeat/1 active idle 10.246.165.90 Filebeat ready.
  landscape-client/1 maintenance idle 10.246.165.90 Need computer-title and juju-info to proceed
  logrotated/1 active idle 10.246.165.90 Unit is ready.
  nrpe/1 active idle 10.246.165.90 icmp,5666/tcp Ready
  ntp/1 active idle 10.246.165.90 123/udp chrony: Ready, OK: offset is 0.000184
  telegraf/1 active idle 10.246.165.90 9103/tcp Monitoring landscape-server-haproxy/0 (source version/commit 23.01-4-...)
landscape-server/0* waiting idle 6 10.246.167.157 Waiting on relations: db
  canonical-livepatch/11 active idle 10.246.167.157 Running kernel 5.15.0-69.76-generic, patchState: nothing-to-apply (source version/commit f1e83ae)
  filebeat/11 active idle 10.246.167.157 Filebeat ready.
  landscape-client/11 maintenance idle 10.246.167.157 Need computer-title and juju-info to proceed
  logrotated/11 active idle 10.246.167.157 Unit is ready.
  nrpe/11 active idle 10.246.167.157 icmp,5666/tcp Ready
  ntp/11 active idle 10.246.167.157 123/udp chrony: Ready, OK: offset is 0.000163
  telegraf/11 active idle 10.246.167.157 9103/tcp Monitoring landscape-server/0 (source version/commit 23.01-4-...)
```

In the debug-log we see:
```
unit-landscape-postgresql-0: 09:34:36 INFO unit.landscape-postgresql/0.juju-log local-monitors:63: Setting charm primary status True
unit-landscape-postgresql-0: 09:34:36 ERROR unit.landscape-postgresql/0.juju-log local-monitors:63: Hook error:
Traceback (most recent call last):
  File "/usr/lib/python3.10/shutil.py", line 815, in move
    os.rename(src, real_dst)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpx8jh2efa' -> '/var/lib/postgresql/scripts/find_latest_ready_wal.py'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-landscape-postgresql-0/.venv/lib/python3.10/site-packages/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "/var/lib/juju/agents/unit-landscape-postgresql-0/.venv/lib/python3.10/site-packages/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "/var/lib/juju/agents/unit-landscape-postgresql-0/.venv/lib/python3.10/site-packages/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "/var/lib/juju/agents/unit-landscape-postgresql-0/.venv/lib/python3.10/site-packages/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-landscape-postgresql-0/charm/reactive/postgresql/nagios.py", line 117, in update_nrpe_config
    helpers.write(check_script_path, check_script, mode=0o755)
  File "/var/lib/juju/agents/unit-landscape-postgresql-0/charm/reactive/postgresql/helpers.py", line 75, in write
    shutil.move(f.name, path)
  File "/usr/lib/python3.10/shutil.py", line 835, in move
    copy_function(src, real_dst)
  File "/usr/lib/python3.10/shutil.py", line 434, in copy2
    copyfile(src, dst, follow_symlinks=follow_symlinks)
  File "/usr/lib/python3.10/shutil.py", line 256, in copyfile
    with open(dst, 'wb') as fdst:
FileNotFoundError: [Errno 2] No such file or directory: '/var/lib/postgresql/scripts/find_latest_ready_wal.py'
```
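
The chained traceback is what `shutil.move` does when the destination's parent directory is missing: `os.rename` fails first, then the copy fallback fails when it tries to open the destination. A minimal reproduction sketch, using a throwaway temp directory rather than the real `/var/lib/postgresql` path (the file contents and directory layout here are placeholders, not taken from the charm):

```python
import os
import shutil
import tempfile

# Scratch area standing in for /var/lib/postgresql; "scripts" is never created.
base = tempfile.mkdtemp()
missing_dir = os.path.join(base, "scripts")
dst = os.path.join(missing_dir, "find_latest_ready_wal.py")

# Temp file standing in for the one helpers.write moves into place.
with tempfile.NamedTemporaryFile(dir=base, delete=False) as f:
    f.write(b"# placeholder check script\n")

try:
    shutil.move(f.name, dst)
    error = None
except FileNotFoundError as exc:
    # os.rename fails first; the copy fallback then fails opening dst.
    error = exc

print(type(error).__name__)
```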

Maybe the `/var/lib/postgresql/scripts/` directory needs to be created before the check script is written to it? I'm not sure why the installation succeeds in some cases.
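
If the missing directory is indeed the cause, one possible fix is to have the write helper create the parent directory before moving the file into place. The sketch below is an assumption about what that could look like, not the charm's actual `helpers.write`, whose body is not shown in this report; `write_with_parents` is a hypothetical name.

```python
import os
import shutil
import tempfile


def write_with_parents(path, content, mode=0o644):
    """Write content to path, creating missing parent directories first.

    Hypothetical helper for illustration; not the charm's helpers.write.
    """
    parent = os.path.dirname(path)
    if parent:
        # exist_ok=True makes this safe to call on every hook run.
        os.makedirs(parent, exist_ok=True)
    data = content.encode("utf-8") if isinstance(content, str) else content
    # Stage the temp file next to the target so the move stays on one filesystem.
    with tempfile.NamedTemporaryFile(dir=parent or ".", delete=False) as f:
        f.write(data)
        tmp_name = f.name
    os.chmod(tmp_name, mode)
    shutil.move(tmp_name, path)
```

With a helper like this, a call such as `write_with_parents(check_script_path, check_script, mode=0o755)` would no longer depend on some other hook having created the `scripts` directory earlier.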

Crashdumps and configs can be found here:
https://oil-jenkins.canonical.com/artifacts/28fcfc8b-9180-48ed-9d51-7613538d11e9/index.html
