Restarting snap maas 3.3.2 breaks MAAS DNS

Bug #2017684 reported by Jean-Fabrice Bobo
This bug affects 1 person
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| MAAS | Invalid | Medium | Unassigned | |

Bug Description

MAAS (snap flavor) DNS works as expected after a fresh server start and correctly serves the MAAS zone. However, manually restarting MAAS (using `snap restart maas`) causes the DNS component (bind9) to restart twice, and the second restart does not properly load the MAAS DNS zone.

See log details in https://discourse.maas.io/t/restarting-snap-maas-3-3-2-breaks-maas-dns/7022

As requested in the above thread, I'm performing a `snap restart maas` and attaching:
* the bind9 logs
* the rackd logs
* the regiond logs

```
root@maas:/home/ubuntu# snap list
Name Version Rev Tracking Publisher Notes
core 16-2.58.2 14789 latest/stable canonical✓ core
core18 20230320 2724 latest/stable canonical✓ base
core20 20230308 1856 latest/stable canonical✓ base
core22 20230325 611 latest/stable canonical✓ base
lxd 5.13-cea5ee2 24761 latest/stable/… canonical✓ -
maas 3.3.2-13177-g.a73a6e2bd 27110 3.3/stable canonical✓ -
maas-cli 0.6.8 81 latest/stable canonical✓ -
root@maas:/home/ubuntu# lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.2 LTS
Release: 22.04
Codename: jammy
root@maas:/home/ubuntu# uname -a
Linux maas 5.15.0-1027-raspi #29-Ubuntu SMP PREEMPT Mon Apr 3 10:12:21 UTC 2023 aarch64 aarch64 aarch64 GNU/Linux
root@maas:/home/ubuntu# host nuc10i7fnh-bm1.maas localhost
Using domain server:
Name: localhost
Address: 127.0.0.1#53
Aliases:

nuc10i7fnh-bm1.maas has address 192.168.10.61
root@maas:/home/ubuntu# date && snap restart maas
Tue Apr 25 16:25:39 UTC 2023
2023-04-25T16:25:40Z INFO Waiting for "snap.maas.supervisor.service" to stop.
Restarted.
root@maas:/home/ubuntu# date && host nuc10i7fnh-bm1.maas localhost
Tue Apr 25 16:26:53 UTC 2023
Using domain server:
Name: localhost
Address: 127.0.0.1#53
Aliases:

Host nuc10i7fnh-bm1.maas not found: 2(SERVFAIL)
```
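For anyone who wants to script the before/after check shown in the transcript above, here is a minimal sketch. It assumes the `host` utility is installed and only classifies the text it prints; the function names `classify_host_output` and `check` are made up for this illustration and are not part of MAAS or bind9.

```python
import re
import subprocess


def classify_host_output(output: str) -> str:
    """Classify text printed by `host` as 'resolved', 'servfail', or 'other'."""
    # A successful lookup prints "<name> has address <ip>" (or "has IPv6 address").
    if re.search(r"\bhas (IPv6 )?address\b", output):
        return "resolved"
    # The failure in this report prints "... not found: 2(SERVFAIL)".
    if "SERVFAIL" in output:
        return "servfail"
    return "other"


def check(name: str, server: str = "localhost") -> str:
    """Run `host NAME SERVER` and classify whatever it printed."""
    proc = subprocess.run(
        ["host", name, server],
        capture_output=True,
        text=True,
        check=False,  # `host` exits non-zero on SERVFAIL; we still want the text
    )
    return classify_host_output(proc.stdout + proc.stderr)
```

Calling `check("nuc10i7fnh-bm1.maas")` once before and once after `snap restart maas` should show whether the zone is still being served.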

Jean-Fabrice Bobo (jeanfabrice) wrote :
Alberto Donato (ack) wrote :

Could you please also provide the output of `maas $profile region-controllers read`?

Changed in maas:
status: New → Incomplete
Jean-Fabrice Bobo (jeanfabrice) wrote :

Sure thing!
```
root@maas:/home/ubuntu# maas $profile region-controllers read
Success.
Machine-readable output follows:
[
    {
        "current_testing_result_id": null,
        "interface_set": [
            {
                "sriov_max_vf": 0,
                "type": "physical",
                "numa_node": 0,
                "name": "eth0",
                "params": "",
                "vendor": null,
                "system_id": "tydnmg",
                "discovered": null,
                "enabled": true,
                "interface_speed": 1000,
                "mac_address": "dc:a6:32:45:61:99",
                "link_speed": 1000,
                "vlan": {
                    "vid": 0,
                    "mtu": 1500,
                    "dhcp_on": true,
                    "external_dhcp": null,
                    "relay_vlan": null,
                    "secondary_rack": null,
                    "id": 5001,
                    "fabric_id": 0,
                    "space": "undefined",
                    "primary_rack": "tydnmg",
                    "name": "untagged",
                    "fabric": "fabric-0",
                    "resource_uri": "/MAAS/api/2.0/vlans/5001/"
                },
                "effective_mtu": 1500,
                "children": [],
                "firmware_version": null,
                "link_connected": true,
                "parents": [],
                "id": 3,
                "tags": [],
                "product": null,
                "links": [
                    {
                        "id": 714,
                        "mode": "static",
                        "ip_address": "192.168.10.4",
                        "subnet": {
                            "name": "LAN",
                            "description": "",
                            "vlan": {
                                "vid": 0,
                                "mtu": 1500,
                                "dhcp_on": true,
                                "external_dhcp": null,
                                "relay_vlan": null,
                                "secondary_rack": null,
                                "id": 5001,
                                "fabric_id": 0,
                                "space": "undefined",
                                "primary_rack": "tydnmg",
                                "name": "untagged",
                                "fabric": "fabric-0",
                                "resource_uri": "/MAAS/api/2.0/vlans/5001/"
                            },
                            "cidr": "192.168.10.0/24",
                            "rdns_mode": 2,
                            "gateway_ip": "192.168.10.1",
                            "dns_servers": [
                                "192.168.10.14"
                            ],
                            "allow_dns": true,
                            "allow_proxy": true,
                            "active_discovery": false,
                            "managed": false,
                            "disabled_boot_architectures": [],
                            "id": 12,
                            "space": "undefined",
                   ...

```

Alberto Donato (ack) wrote :

The only issue I'm seeing in the logs is the following error, which could be related to the region failing to update its status. Unfortunately, the assertion error doesn't contain much info, but I've seen this error raised in other contexts as well.

2023-04-25 16:28:33 maasserver: [error] ################################ Exception: ################################
2023-04-25 16:28:33 maasserver: [error] Traceback (most recent call last):
  File "/snap/maas/27110/usr/lib/python3/dist-packages/django/core/handlers/base.py", line 181, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "/snap/maas/27110/lib/python3.10/site-packages/maasserver/utils/views.py", line 293, in view_atomic_with_post_commit_savepoint
    return view_atomic(*args, **kwargs)
  File "/usr/lib/python3.10/contextlib.py", line 79, in inner
    return func(*args, **kwds)
  File "/snap/maas/27110/lib/python3.10/site-packages/maasserver/api/support.py", line 62, in __call__
    response = super().__call__(request, *args, **kwargs)
  File "/snap/maas/27110/usr/lib/python3/dist-packages/django/views/decorators/vary.py", line 20, in inner_func
    response = func(*args, **kwargs)
  File "/snap/maas/27110/usr/lib/python3.10/dist-packages/piston3/resource.py", line 197, in __call__
    result = self.error_handler(e, request, meth, em_format)
  File "/snap/maas/27110/usr/lib/python3.10/dist-packages/piston3/resource.py", line 195, in __call__
    result = meth(request, *args, **kwargs)
  File "/snap/maas/27110/lib/python3.10/site-packages/maasserver/api/support.py", line 370, in dispatch
    return function(self, request, *args, **kwargs)
  File "/snap/maas/27110/lib/python3.10/site-packages/metadataserver/api.py", line 860, in signal
    target_status = process(node, request, status)
  File "/snap/maas/27110/lib/python3.10/site-packages/metadataserver/api.py", line 682, in _process_commissioning
    self._store_results(
  File "/snap/maas/27110/lib/python3.10/site-packages/metadataserver/api.py", line 565, in _store_results
    script_result.store_result(
  File "/snap/maas/27110/lib/python3.10/site-packages/metadataserver/models/scriptresult.py", line 270, in store_result
    assert self.status in SCRIPT_STATUS_RUNNING_OR_PENDING
AssertionError
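
To illustrate why the traceback above ends in an `AssertionError` with no text: a bare `assert` carries no message, so the offending value is lost. The sketch below uses hypothetical stand-ins (`RUNNING_OR_PENDING`, `store_result_bare`, `store_result_with_message`, and the status strings are all assumptions for illustration, not MAAS's actual definitions) and assumes Python is not run with `-O`, which strips asserts.

```python
# Hypothetical stand-ins: these names and values only mirror the shape of the
# assertion in the traceback; they are not MAAS's real definitions.
RUNNING_OR_PENDING = {"pending", "running"}


def store_result_bare(status: str) -> None:
    # Same form as the failing line: a bare assert with no message.
    assert status in RUNNING_OR_PENDING


def store_result_with_message(status: str) -> None:
    # Same check, but the offending value travels with the exception.
    assert status in RUNNING_OR_PENDING, f"unexpected script status: {status!r}"


def describe(func, status: str) -> str:
    """Call func(status) and return the text of any AssertionError raised."""
    try:
        func(status)
    except AssertionError as exc:
        return str(exc)
    return "no error"
```

With the bare form, `describe(store_result_bare, "passed")` returns an empty string, which matches the empty `AssertionError` line in the log; the second form would report the status that triggered it.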

Changed in maas:
status: Incomplete → Triaged
importance: Undecided → High
milestone: none → 3.4.0
Jean-Fabrice Bobo (jeanfabrice) wrote :

Anything I can help with?

summary: - Restarting snap maas 3.3.2 breaks MaaS DNS
+ Restarting snap maas 3.3.2 breaks MAAS DNS
Changed in maas:
importance: High → Medium
milestone: 3.4.0 → 3.5.0
Jean-Fabrice Bobo (jeanfabrice) wrote :

I cannot reproduce this after upgrading to 3.4.0.
I think we can close the issue for now. I will reopen it if the issue arises again.

Jacopo Rota (r00ta)
Changed in maas:
status: Triaged → Invalid
Changed in maas:
milestone: 3.5.0 → 3.5.0-beta1