Unable to deploy LXD VMs on new 3.5.0~rc4-16292-g.18b753d78 install with new DB

Bug #2067474 reported by Jeff Rivero
This bug affects 1 person
Affects   Status          Importance   Assigned to      Milestone
MAAS      Fix Committed   High         Anton Troyanov
3.5       Fix Committed   High         Anton Troyanov

Bug Description

root@mgt:/home/jeffreyr# maas status
Service          Startup   Current   Since
agent            disabled  inactive  -
apiserver        enabled   active    yesterday at 15:55 UTC
bind9            disabled  active    yesterday at 15:56 UTC
dhcpd            disabled  active    yesterday at 16:05 UTC
dhcpd6           disabled  inactive  -
http             disabled  active    yesterday at 15:56 UTC
ntp              disabled  active    yesterday at 15:56 UTC
proxy            disabled  active    yesterday at 16:26 UTC
rackd            enabled   active    yesterday at 15:55 UTC
regiond          enabled   active    yesterday at 15:55 UTC
syslog           disabled  active    yesterday at 15:56 UTC
temporal         disabled  active    yesterday at 15:56 UTC
temporal-worker  disabled  active    yesterday at 15:56 UTC

Looks like Agent is down.

The agent log shows an error:

journalctl -u snap.maas.pebble -t maas-agent

May 29 06:54:01 mgt maas-agent[150201]: INF Starting power-service Attempt=1 Namespace=default RunID=5324438b-9cef-46ab-8f84-1727df54e6ef TaskQueue=cfypnq@agent:main WorkerID=cfypnq@agent:150201 WorkflowID=configure-power-service:cfypnq WorkflowType=configure-power-service
May 29 06:54:01 mgt maas-agent[150201]: ERR Workflow configure-agent failed error="workflow execution error (type: configure-agent, workflowID: d2cc4390-1423-4d2d-b8d3-3097b9068f83, runID: 6820963b-d2ea-4434-810b-01d74bde8f5f): child workflow execution error (type: configure-httpproxy-service, workflowID: configure-httpproxy-service:cfypnq, runID: 7e9df91d-bddd-4817-bcd9-8d221d2b50b6, initiatedEventID: 14, startedEventID: 15): listen unix /run/snap.maas/httpproxy.sock: bind: no such file or directory (type: OpError, retryable: true): bind: no such file or directory (type: SyscallError, retryable: true): no such file or directory (type: Errno, retryable: true)"
May 29 06:54:02 mgt maas-agent[150224]: INF Logger is configured with log level "info"

Jacopo Rota (r00ta) wrote :

Hi Jeff,

thank you very much for reporting this.

I suspect this is related to https://bugs.launchpad.net/maas/+bug/2060288 . If you are able to test, could you please try to enlist, commission, or deploy a machine and check on the machine's console whether the bootloader has size 0?

I hit the bug linked above in the last couple of days, but I thought it was a misconfiguration of my test environment.

Anton Troyanov (troyanov) wrote :

Hi Jeff,

Can you please provide the output of:
```
ls -al /run/snap.maas
```

The agent uses the following call to start the listener (the socket itself is created by `net.Listen`):
```
s.listener, err = net.Listen("unix", s.socketPath)
if err != nil {
    return err
}
```
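
To illustrate why a missing directory matters for this call: `net.Listen` creates the socket file but not its parent directory, so binding under a path that does not exist fails with `no such file or directory`. The following is a minimal standalone sketch, not the agent's code. The helper names `listenIn`, `errorLayers`, and `reproduce` are made up for this example, and a temporary directory stands in for `/run/snap.maas`.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"net"
	"os"
	"path/filepath"
)

// listenIn binds a Unix socket named httpproxy.sock inside dir, then closes it.
func listenIn(dir string) error {
	l, err := net.Listen("unix", filepath.Join(dir, "httpproxy.sock"))
	if err != nil {
		return err
	}
	return l.Close()
}

// errorLayers walks the wrapped error chain and records each layer's Go type.
func errorLayers(err error) []string {
	var layers []string
	for ; err != nil; err = errors.Unwrap(err) {
		layers = append(layers, fmt.Sprintf("%T", err))
	}
	return layers
}

// reproduce tries to listen under a directory that does not exist (a stand-in
// for a missing /run/snap.maas). It returns the error layers and whether the
// failure means "no such file or directory".
func reproduce() (layers []string, notExist bool) {
	base, err := os.MkdirTemp("", "sock-demo")
	if err != nil {
		return nil, false
	}
	defer os.RemoveAll(base)

	err = listenIn(filepath.Join(base, "snap.maas"))
	return errorLayers(err), errors.Is(err, fs.ErrNotExist)
}

func main() {
	layers, notExist := reproduce()
	fmt.Println("error layers:", layers)
	fmt.Println("missing directory:", notExist)
}
```

The printed layers should correspond to the nested `OpError`, `SyscallError`, and `Errno` entries in the agent log above.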

Jeff Rivero (jeffrm2) wrote :

jeffreyr@mgt:~$ ls -al /run/snap.maas
ls: cannot access '/run/snap.maas': No such file or directory
jeffreyr@mgt:~$

Anton Troyanov (troyanov) wrote :

Can you please provide the exact steps you followed during the MAAS installation?

I just did a clean install in a VM and failed to reproduce it:

lxc launch ubuntu:jammy --vm -c limits.cpu=4 -c limits.memory=4GiB
lxc shell game-shrew

root@game-shrew:~# snap install maas-test-db --channel 3.5/edge
root@game-shrew:~# snap install maas --channel 3.5/candidate
root@game-shrew:~# maas init region+rack
root@game-shrew:~# maas createadmin --username admin --password admin --email admin@localhost --ssh-import gh:troyanov

root@game-shrew:~# ls -al /run/snap.maas
total 0
drwxr-xr-x 3 root root 80 May 29 13:46 .
drwxr-xr-x 28 root root 820 May 29 13:46 ..
drwxr-x--- 2 root root 80 May 29 13:46 chrony
srw-rw---- 1 root root 0 May 29 13:46 httpproxy.sock

root@game-shrew:~# journalctl -u snap.maas.pebble -t maas-agent
May 29 13:46:26 game-shrew maas-agent[2652]: INF Logger is configured with log level "info"
May 29 13:46:26 game-shrew maas-agent[2652]: INF Started Worker Namespace=default TaskQueue=w6t3nr@agent:main WorkerID=w6t3nr@agent:2652
May 29 13:46:27 game-shrew maas-agent[2652]: INF Started Worker Namespace=default TaskQueue=agent:power@vlan-1 WorkerID=w6t3nr@agent:2652
May 29 13:46:27 game-shrew maas-agent[2652]: INF Started Worker Namespace=default TaskQueue=agent:power@vlan-1 WorkerID=w6t3nr@agent:2652
May 29 13:46:27 game-shrew maas-agent[2652]: INF Started Worker Namespace=default TaskQueue=w6t3nr@agent:power WorkerID=w6t3nr@agent:2652
May 29 13:46:27 game-shrew maas-agent[2652]: INF Starting power-service Attempt=1 Namespace=default RunID=ef734129-b998-4e3e-9414-179f3ff1ee97 TaskQueue=w6t3nr@agent:main>
May 29 13:46:27 game-shrew maas-agent[2652]: INF Starting httpproxy-service Attempt=1 Namespace=default RunID=01702bd7-8aae-499b-b7f5-469b40df529b TaskQueue=w6t3nr@agent:>
May 29 13:46:27 game-shrew maas-agent[2652]: INF Service MAAS Agent started

Jeff Rivero (jeffrm2) wrote :

Sure

sudo snap install --channel=3.5/candidate maas

sudo maas init region+rack --database-uri "postgres://$MAAS_DBUSER:$MAAS_DBPASS@localhost/$MAAS_DBNAME"

then

sudo maas createadmin

Anton Troyanov (troyanov) wrote :

That seems to be the same as what I did (except that I am using maas-test-db, which doesn't make any difference).

Can you please provide the output of the following commands after running `sudo snap restart maas`:
journalctl -u snap.maas.pebble -t maas.pebble --since '1 minute ago'
journalctl -u snap.maas.pebble -t maas-agent --since '1 minute ago'

Jeff Rivero (jeffrm2) wrote :

root@mgt:/home/jaya# journalctl -u snap.maas.pebble -t maas.pebble --since '1 minute ago'
May 29 14:40:31 mgt maas.pebble[2520]: 2024-05-29T14:40:31.182Z [pebble] GET /v1/services?names=http 140.168µs 200
May 29 14:40:31 mgt maas.pebble[2520]: 2024-05-29T14:40:31.182Z [pebble] GET /v1/services?names=dhcpd 126.317µs 200
May 29 14:40:31 mgt maas.pebble[2520]: 2024-05-29T14:40:31.182Z [pebble] GET /v1/services?names=dhcpd6 89.666µs 200
May 29 14:40:31 mgt maas.pebble[2520]: 2024-05-29T14:40:31.182Z [pebble] GET /v1/services?names=agent 61.456µs 200
May 29 14:40:31 mgt maas.pebble[2520]: 2024-05-29T14:40:31.325Z [pebble] POST /v1/services 135.230643ms 202
May 29 14:40:31 mgt maas.pebble[2520]: 2024-05-29T14:40:31.449Z [pebble] Service "agent" starting: sh -c "exec systemd-cat -t maas-agent $SNAP/bin/run-maas-agent"
May 29 14:40:31 mgt maas.pebble[2520]: 2024-05-29T14:40:31.930Z [pebble] Change 5480 task (Start service "agent") failed: cannot start service: exited quickly with code>
May 29 14:40:32 mgt maas.pebble[2520]: 2024-05-29T14:40:32.607Z [pebble] POST /v1/services 148.632495ms 202
May 29 14:40:33 mgt maas.pebble[2520]: 2024-05-29T14:40:33.058Z [pebble] Service "agent" starting: sh -c "exec systemd-cat -t maas-agent $SNAP/bin/run-maas-agent"
May 29 14:40:33 mgt maas.pebble[2520]: 2024-05-29T14:40:33.607Z [pebble] Change 5481 task (Start service "agent") failed: cannot start service: exited quickly with code>
May 29 14:40:34 mgt maas.pebble[2520]: 2024-05-29T14:40:34.709Z [pebble] GET /v1/services?names=bind9 103.808µs 200
May 29 14:40:34 mgt maas.pebble[2520]: 2024-05-29T14:40:34.709Z [pebble] GET /v1/services?names=ntp 107.891µs 200
May 29 14:40:34 mgt maas.pebble[2520]: 2024-05-29T14:40:34.709Z [pebble] GET /v1/services?names=http 91.584µs 200
May 29 14:40:34 mgt maas.pebble[2520]: 2024-05-29T14:40:34.709Z [pebble] GET /v1/services?names=syslog 86.714µs 200
May 29 14:40:34 mgt maas.pebble[2520]: 2024-05-29T14:40:34.709Z [pebble] GET /v1/services?names=temporal 64.24µs 200
May 29 14:40:34 mgt maas.pebble[2520]: 2024-05-29T14:40:34.709Z [pebble] GET /v1/services?names=temporal-worker 68.63µs 200
May 29 14:40:34 mgt maas.pebble[2520]: 2024-05-29T14:40:34.730Z [pebble] GET /v1/services?names=proxy 83.358µs 200
May 29 14:41:01 mgt maas.pebble[2520]: 2024-05-29T14:41:01.207Z [pebble] GET /v1/services?names=http 150.391µs 200
May 29 14:41:01 mgt maas.pebble[2520]: 2024-05-29T14:41:01.207Z [pebble] GET /v1/services?names=dhcpd 121.378µs 200
May 29 14:41:01 mgt maas.pebble[2520]: 2024-05-29T14:41:01.207Z [pebble] GET /v1/services?names=dhcpd6 123.708µs 200
May 29 14:41:01 mgt maas.pebble[2520]: 2024-05-29T14:41:01.207Z [pebble] GET /v1/services?names=agent 89.917µs 200
May 29 14:41:01 mgt maas.pebble[2520]: 2024-05-29T14:41:01.379Z [pebble] POST /v1/services 164.349115ms 202
May 29 14:41:01 mgt maas.pebble[2520]: 2024-05-29T14:41:01.524Z [pebble] Service "agent" starting: sh -c "exec systemd-cat -t maas-agent $SNAP/bin/run-maas-agent"
May 29 14:41:02 mgt maas.pebble[2520]: 2024-05-29T14:41:02.018Z [pebble] Change 5482 task (Start service "agent") failed: cannot start service: exited quickly with code>
May 29 14:41:02 mgt maas.pebble[2520]: 2024-05...


Jeff Rivero (jeffrm2) wrote :

root@mgt:/home/jaya# journalctl -u snap.maas.pebble -t maas-agent --since '1 minute ago'
May 29 14:40:31 mgt maas-agent[226412]: INF Logger is configured with log level "info"
May 29 14:40:31 mgt maas-agent[226412]: INF Started Worker Namespace=default TaskQueue=cfypnq@agent:main WorkerID=cfypnq@agent:226412
May 29 14:40:31 mgt maas-agent[226412]: INF Started Worker Namespace=default TaskQueue=agent:power@vlan-1 WorkerID=cfypnq@agent:226412
May 29 14:40:31 mgt maas-agent[226412]: INF Started Worker Namespace=default TaskQueue=agent:power@vlan-1 WorkerID=cfypnq@agent:226412
May 29 14:40:31 mgt maas-agent[226412]: INF Started Worker Namespace=default TaskQueue=agent:power@vlan-1 WorkerID=cfypnq@agent:226412
May 29 14:40:31 mgt maas-agent[226412]: INF Started Worker Namespace=default TaskQueue=cfypnq@agent:power WorkerID=cfypnq@agent:226412
May 29 14:40:31 mgt maas-agent[226412]: INF Starting power-service Attempt=1 Namespace=default RunID=c949fe24-506b-4db3-a02e-e3648740ea28 TaskQueue=cfypnq@agent:main Wo>
May 29 14:40:31 mgt maas-agent[226412]: ERR Workflow configure-agent failed error="workflow execution error (type: configure-agent, workflowID: 78a64c7f-53d3-437f-a6b3->
May 29 14:40:33 mgt maas-agent[226435]: INF Logger is configured with log level "info"
May 29 14:40:33 mgt maas-agent[226435]: INF Started Worker Namespace=default TaskQueue=cfypnq@agent:main WorkerID=cfypnq@agent:226435
May 29 14:40:33 mgt maas-agent[226435]: INF Started Worker Namespace=default TaskQueue=agent:power@vlan-1 WorkerID=cfypnq@agent:226435
May 29 14:40:33 mgt maas-agent[226435]: INF Started Worker Namespace=default TaskQueue=agent:power@vlan-1 WorkerID=cfypnq@agent:226435
May 29 14:40:33 mgt maas-agent[226435]: INF Started Worker Namespace=default TaskQueue=agent:power@vlan-1 WorkerID=cfypnq@agent:226435
May 29 14:40:33 mgt maas-agent[226435]: INF Started Worker Namespace=default TaskQueue=cfypnq@agent:power WorkerID=cfypnq@agent:226435
May 29 14:40:33 mgt maas-agent[226435]: INF Starting power-service Attempt=1 Namespace=default RunID=55900644-a745-47dd-a167-122d8aaabb6e TaskQueue=cfypnq@agent:main Wo>
May 29 14:40:33 mgt maas-agent[226435]: ERR Workflow configure-agent failed error="workflow execution error (type: configure-agent, workflowID: 1e680b0f-5537-443f-b1a9->
May 29 14:41:01 mgt maas-agent[226467]: INF Logger is configured with log level "info"
May 29 14:41:01 mgt maas-agent[226467]: INF Started Worker Namespace=default TaskQueue=cfypnq@agent:main WorkerID=cfypnq@agent:226467
May 29 14:41:01 mgt maas-agent[226467]: INF Started Worker Namespace=default TaskQueue=agent:power@vlan-1 WorkerID=cfypnq@agent:226467
May 29 14:41:01 mgt maas-agent[226467]: INF Started Worker Namespace=default TaskQueue=agent:power@vlan-1 WorkerID=cfypnq@agent:226467
May 29 14:41:01 mgt maas-agent[226467]: INF Started Worker Namespace=default TaskQueue=agent:power@vlan-1 WorkerID=cfypnq@agent:226467
May 29 14:41:01 mgt maas-agent[226467]: INF Started Worker Namespace=default TaskQueue=cfypnq@agent:power WorkerID=cfypnq@agent:226467
May 29 14:41:01 mgt maas-agent[226467]: INF Starting power-service Attempt=1 Namespace=default RunID=337b70f5-f75b-47f5-b655-0eb1194498f0 TaskQueue=...


Jacopo Rota (r00ta) wrote :

Could you also run
```
snap version
```
and paste here the result?

Anton Troyanov (troyanov) wrote :

Jeff,

Which version of Ubuntu are you running?

Anton Troyanov (troyanov) wrote :

The following might also be useful. It seems we have an issue where `/run/snap.maas` is created only when chrony starts.

journalctl -u snap.maas.pebble -t chronyd

Jeff Rivero (jeffrm2) wrote :

May 29 14:41:03 mgt maas-agent[226490]: ERR Workflow configure-agent failed error="workflow execution error (type: configure-agent, workflowID: 96552f3d-2548-4307-8775->
root@mgt:/home/jaya# snap version
snap 2.63
snapd 2.63
series 16
ubuntu 22.04
kernel 5.15.0-107-generic

Jeff Rivero (jeffrm2) wrote :

root@mgt:/home/jaya# cat /etc/os-release
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
root@mgt:/home/jaya#

Jeff Rivero (jeffrm2) wrote :

root@mgt:/home/jaya# journalctl -u snap.maas.pebble -t chronyd
-- No entries --

Anton Troyanov (troyanov) wrote :

Are there any tracebacks or errors in the output of `journalctl -u snap.maas.pebble -t maas-regiond`?

When the agent starts, it pushes a `configure-agent` workflow, which should be picked up by regiond. So the problem might be in regiond.

Jeff Rivero (jeffrm2) wrote :

No errors, some warnings:

May 29 11:26:47 mgt maas-regiond[2564]: maasserver.region_controller: [warn] The dynamic dns update notification '' is not valid. It will be dropped.
May 29 11:26:47 mgt maas-regiond[2564]: * ip 10.11.11.119 linked to resource juju-16749e-1 on zone maas
May 29 11:26:47 mgt maas-regiond[2564]: * zone maas added resource juju-16749e-1
May 29 11:26:47 mgt maas-regiond[2564]: maasserver.region_controller: [warn] The dynamic dns update notification '' is not valid. It will be dropped.
May 29 11:26:47 mgt maas-regiond[2564]: maasserver.region_controller: [warn] The dynamic dns update notification '' is not valid. It will be dropped.
May 29 11:26:48 mgt maas-regiond[2564]: maasserver.region_controller: [warn] The dynamic dns update notification '' is not valid. It will be dropped.
May 29 11:26:49 mgt maas-regiond[2564]: * ip 10.11.11.125 linked to resource juju-16749e-9 on zone maas
May 29 11:26:49 mgt maas-regiond[2564]: * zone maas added resource juju-16749e-9
May 29 11:26:49 mgt maas-regiond[2564]: * ip 10.11.11.124 linked to resource juju-16749e-5 on zone maas
May 29 11:26:49 mgt maas-regiond[2564]: * zone maas added resource juju-16749e-5
May 29 11:26:49 mgt maas-regiond[2564]: * ip 10.11.11.123 linked to resource juju-16749e-3 on zone maas
May 29 11:26:49 mgt maas-regiond[2564]: * zone maas added resource juju-16749e-3
May 29 11:26:49 mgt maas-regiond[2564]: * ip 10.11.11.122 linked to resource juju-16749e-7 on zone maas
May 29 11:26:49 mgt maas-regiond[2564]: * zone maas added resource juju-16749e-7
May 29 11:26:49 mgt maas-regiond[2564]: * ip 10.11.11.121 linked to resource juju-16749e-2 on zone maas
May 29 11:26:49 mgt maas-regiond[2564]: * zone maas added resource juju-16749e-2
May 29 11:26:49 mgt maas-regiond[2564]: * ip 10.11.11.120 linked to resource juju-16749e-8 on zone maas
May 29 11:26:49 mgt maas-regiond[2564]: * zone maas added resource juju-16749e-8
May 29 11:26:50 mgt maas-regiond[2564]: maasserver.region_controller: [warn] The dynamic dns update notification '' is not valid. It will be dropped.
May 29 11:26:52 mgt maas-regiond[2564]: maasserver.region_controller: [warn] The dynamic dns update notification '' is not valid. It will be dropped.
May 29 11:26:52 mgt maas-regiond[2564]: * ip 10.11.11.126 linked to resource juju-16749e-4 on zone maas
May 29 11:26:52 mgt maas-regiond[2564]: * zone maas added resource juju-16749e-4
May 29 11:26:54 mgt maas-regiond[2564]: * ip 10.11.11.127 linked to resource juju-16749e-0 on zone maas
May 29 11:26:54 mgt maas-regiond[2564]: * zone maas added resource juju-16749e-0
May 29 11:27:08 mgt maas-regiond[2564]: maasserver.region_controller: [warn] The dynamic dns update notification '' is not valid. It will be dropped.
May 29 11:27:08 mgt maas-regiond[2564]: * ip 10.11.11.128 linked to resource juju-16749e-6 on zone maas
May 29 11:27:08 mgt maas-regiond[2564]: * zone maas added resource juju-16749e-6
May 29 11:28:18 mgt maas-regiond[2564]: maasserver.region_controller: [warn] The dynamic dns upd...


Anton Troyanov (troyanov) wrote :

And what about:
journalctl -fu snap.maas.pebble -t maas-rackd
journalctl -fu snap.maas.pebble -t maas-temporal
journalctl -fu snap.maas.pebble -t maas-temporal-worker

Anton Troyanov (troyanov) wrote :

To summarise:

1. The missing `/run/snap.maas` directory is the result of chrony not starting (it is not clear to me why).
The proper fix is to create this path in a snap install hook:
https://code.launchpad.net/~troyanov/maas/+git/maas/+merge/466606

2. The agent should start the `configure-agent` workflow, which is sent to `maas-temporal`.
3. Regiond runs `maas-temporal-worker` alongside it, which should send all the required configuration back to the agent.

So if the agent is failing to start, the cause might be regiond and temporal-worker, or something wrong with Temporal itself.

Did you get any errors when you installed the snap? Who is the owner of the MAAS database?

Jeff Rivero (jeffrm2) wrote :

root@mgt:/home/jeffreyr# journalctl -fu snap.maas.pebble -t maas-rackd
May 29 16:10:33 mgt maas-rackd[2568]: return g.throw(self.type, self.value, self.tb)
May 29 16:10:33 mgt maas-rackd[2568]: File "/snap/maas/35434/lib/python3.10/site-packages/provisioningserver/utils/service_monitor.py", line 375, in restartService
May 29 16:10:33 mgt maas-rackd[2568]: yield self._performServiceAction(service, "restart")
May 29 16:10:33 mgt maas-rackd[2568]: File "/snap/maas/35434/usr/lib/python3/dist-packages/twisted/internet/defer.py", line 1660, in _inlineCallbacks
May 29 16:10:33 mgt maas-rackd[2568]: result = current_context.run(gen.send, result)
May 29 16:10:33 mgt maas-rackd[2568]: File "/snap/maas/35434/lib/python3.10/site-packages/provisioningserver/utils/service_monitor.py", line 696, in _performServiceAction
May 29 16:10:33 mgt maas-rackd[2568]: raise ServiceActionError(error_msg)
May 29 16:10:33 mgt maas-rackd[2568]: provisioningserver.utils.service_monitor.ServiceActionError: Service 'agent' failed to restart: Pebble change 5841 failed with an error: cannot perform the following tasks:
May 29 16:10:33 mgt maas-rackd[2568]: - Start service "agent" (cannot start service: exited quickly with code 1)
May 29 16:10:33 mgt maas-rackd[2568]:

Jeff Rivero (jeffrm2) wrote :

root@mgt:/home/jeffreyr# journalctl -fu snap.maas.pebble -t maas-temporal
May 29 10:56:31 mgt maas-temporal[2739]: {"level":"error","ts":"2024-05-29T10:56:31.726Z","msg":"Persistent store operation failure","service":"matching","component":"matching-engine","wf-task-queue-name":"mgt:f2b8310e-6e8f-47f5-a81b-1d4c48e9ea9d","wf-task-queue-type":"Workflow","wf-namespace":"default","store-operation":"update-task-queue","error":"context canceled","logging-call-at":"task_reader.go:214","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/build/temporal-8dYnj9/temporal-1.22.5/src/common/log/zap_logger.go:156\ngo.temporal.io/server/service/matching.(*taskReader).getTasksPump\n\t/build/temporal-8dYnj9/temporal-1.22.5/src/service/matching/task_reader.go:214\ngo.temporal.io/server/internal/goro.(*Group).Go.func1\n\t/build/temporal-8dYnj9/temporal-1.22.5/src/internal/goro/group.go:58"}
May 29 10:56:31 mgt maas-temporal[2739]: {"level":"error","ts":"2024-05-29T10:56:31.726Z","msg":"Persistent store operation failure","service":"matching","component":"matching-engine","wf-task-queue-name":"mgt:0c6fe162-ecb7-4473-a0ec-2189f0914c3e","wf-task-queue-type":"Workflow","wf-namespace":"default","store-operation":"update-task-queue","error":"context canceled","logging-call-at":"task_reader.go:214","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/build/temporal-8dYnj9/temporal-1.22.5/src/common/log/zap_logger.go:156\ngo.temporal.io/server/service/matching.(*taskReader).getTasksPump\n\t/build/temporal-8dYnj9/temporal-1.22.5/src/service/matching/task_reader.go:214\ngo.temporal.io/server/internal/goro.(*Group).Go.func1\n\t/build/temporal-8dYnj9/temporal-1.22.5/src/internal/goro/group.go:58"}
May 29 11:15:31 mgt maas-temporal[2739]: {"level":"error","ts":"2024-05-29T11:15:31.746Z","msg":"Persistent store operation failure","service":"matching","component":"matching-engine","wf-task-queue-name":"mgt:1982de8a-96fb-455a-81b5-ed0af235f6a8","wf-task-queue-type":"Workflow","wf-namespace":"default","store-operation":"update-task-queue","error":"context canceled","logging-call-at":"task_reader.go:214","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/build/temporal-8dYnj9/temporal-1.22.5/src/common/log/zap_logger.go:156\ngo.temporal.io/server/service/matching.(*taskReader).getTasksPump\n\t/build/temporal-8dYnj9/temporal-1.22.5/src/service/matching/task_reader.go:214\ngo.temporal.io/server/internal/goro.(*Group).Go.func1\n\t/build/temporal-8dYnj9/temporal-1.22.5/src/internal/goro/group.go:58"}
May 29 11:53:04 mgt maas-temporal[2739]: {"level":"error","ts":"2024-05-29T11:53:04.223Z","msg":"Persistent store operation failure","service":"matching","component":"matching-engine","wf-task-queue-name":"mgt:1d1e3068-776a-4b07-b393-b5c447ffa35a","wf-task-queue-type":"Workflow","wf-namespace":"default","store-operation":"update-task-queue","error":"context canceled","logging-call-at":"task_reader.go:214","stacktrace":"go.temporal.io/server/common/log.(*zapLogger).Error\n\t/build/temporal-8dYnj9/temporal-1.22.5/src/common/log/zap_logger.go:156\ngo.temporal.io/server/service/matching.(*taskReader).getTasksPump\n\t/build/temporal-8dYnj9...


Jeff Rivero (jeffrm2) wrote :

root@mgt:/home/jeffreyr# journalctl -fu snap.maas.pebble -t maas-temporal-worker

Anton Troyanov (troyanov) wrote :

Is it a clean 22.04 install without any custom configuration? Were any tweaks made to AppArmor?

I will need the full journal to understand what could have gone wrong:
journalctl -u snap.maas.pebble --no-tail > /tmp/maas.log

Jeff Rivero (jeffrm2) wrote :

Log attached

Anton Troyanov (troyanov) wrote :

Thanks. I will try to go through it tomorrow.

Meanwhile, can you please try the following:

mkdir /run/snap.maas
sudo snap restart maas

Changed in maas:
milestone: none → 3.6.0
milestone: 3.6.0 → none
Jeff Rivero (jeffrm2) wrote :

root@mgt:/home/jeffreyr# htop
root@mgt:/home/jeffreyr# mkdir /run/snap.maas
root@mgt:/home/jeffreyr# sudo snap restart maas

root@mgt:/home/jeffreyr# maas status
Service          Startup   Current   Since
agent            disabled  active    today at 18:22 UTC
apiserver        enabled   active    today at 18:22 UTC
bind9            disabled  active    today at 18:22 UTC
dhcpd            disabled  active    today at 18:22 UTC
dhcpd6           disabled  inactive  -
http             disabled  active    today at 18:22 UTC
ntp              disabled  active    today at 18:22 UTC
proxy            disabled  active    today at 18:22 UTC
rackd            enabled   active    today at 18:22 UTC
regiond          enabled   active    today at 18:22 UTC
syslog           disabled  active    today at 18:22 UTC
temporal         disabled  active    today at 18:22 UTC
temporal-worker  disabled  active    today at 18:22 UTC

Anton Troyanov (troyanov) wrote :

And what is the error reported by the agent this time?
> journalctl -u snap.maas.pebble -t maas-agent

Jeff Rivero (jeffrm2) wrote :

No errors, that fixed it (I assumed it would), but I wanted to figure out why it was broken. I am able to deploy now.

Anton Troyanov (troyanov) wrote :

That's good to know. Thank you for the information; we will try to figure out how that happened.

One thing I still don't understand is why there were no socket and PID files for `chrony`; so far I don't see any errors related to `chrony`.

1. MAAS is crafting the following config:
(.ve) ubuntu@maas:~/maas$ sudo cat /var/snap/maas/x1/etc/chrony/maas.conf
# MAAS NTP configuration.
hwtimestamp *
pool ntp.ubuntu.com iburst
local stratum 8 orphan
allow
dumpdir /run/snap.maas/chrony
pidfile /run/snap.maas/chrony/chronyd.pid
bindcmdaddress /run/snap.maas/chrony/chronyd.sock

2. Chrony creates /run/snap.maas/chrony in a way similar to `mkdir -p`, so it also creates `/run/snap.maas`.
3. The agent should recover if the directory is missing at first, because chrony will eventually create it.

We will update the logic in the agent to check whether `/run/snap.maas` needs to be created.
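
As a rough illustration of that idea (a sketch only, not the committed fix), the listener could make sure the socket's parent directory exists before binding. The function names `listenUnix` and `demo` and the `0o755` mode are assumptions made for this example; a temporary directory stands in for `/run`.

```go
package main

import (
	"fmt"
	"net"
	"os"
	"path/filepath"
)

// listenUnix makes sure the socket's parent directory exists (like mkdir -p)
// before binding. The 0o755 mode is an assumption for this sketch.
func listenUnix(socketPath string) (net.Listener, error) {
	if err := os.MkdirAll(filepath.Dir(socketPath), 0o755); err != nil {
		return nil, fmt.Errorf("create socket directory: %w", err)
	}
	return net.Listen("unix", socketPath)
}

// demo binds a socket under a directory that does not exist yet, inside a
// temporary directory, and closes the listener again.
func demo() error {
	base, err := os.MkdirTemp("", "sock-demo")
	if err != nil {
		return err
	}
	defer os.RemoveAll(base)

	l, err := listenUnix(filepath.Join(base, "snap.maas", "httpproxy.sock"))
	if err != nil {
		return err
	}
	return l.Close()
}

func main() {
	if err := demo(); err != nil {
		fmt.Println("listen failed:", err)
		return
	}
	fmt.Println("listener started")
}
```

`os.MkdirAll` returns nil when the directory already exists, so the check is safe to run on every start regardless of whether chrony or an install hook created the path first.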

Changed in maas:
status: New → Triaged
milestone: none → 3.6.0
assignee: nobody → Anton Troyanov (troyanov)
importance: Undecided → High
Changed in maas:
status: Triaged → Fix Committed