DNS updates are consumed concurrently, leading to an incorrect nsupdate payload
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
MAAS |
Fix Released
|
High
|
Jacopo Rota | ||
3.3 |
Fix Released
|
High
|
Jacopo Rota | ||
3.4 |
Fix Released
|
High
|
Jacopo Rota | ||
3.5 |
Fix Released
|
High
|
Jacopo Rota |
Bug Description
SOLQA run https:/
The 3 regions
10.246.64.5
10.246.64.6
10.246.64.7
Have a different record set for the zone `zone.64.
10.246.64.5 (looks good):
212 PTR eth0.meowth.
PTR meowth.
213 PTR eth0.duosion.
PTR duosion.
214 PTR eth0.mudkip.
PTR mudkip.
10.246.64.6 (NOT good):
211 PTR eth0.taillow.
PTR taillow.
212 PTR eth0.meowth.
PTR meowth.
214 PTR eth0.mudkip.
PTR mudkip.
10.246.64.7 (looks good):
212 PTR eth0.meowth.
PTR meowth.
213 PTR eth0.duosion.
PTR duosion.
214 PTR eth0.mudkip.
PTR mudkip.
All of 3 send the same nsupdate payload, but with a different order of records.
The 10.246.64.5 sends
server localhost
zone 64.246.
...
update delete 213.64.
update delete 213.64.
update add 213.64.
update add 213.64.
...
and looking at the bind logs, the records are first deleted and then added
Nov 25 20:24:17 anahuac named[1232742]: client @0x7fa57002dab8 ::1#36021/key maas: updating zone '64.246.
Nov 25 20:24:17 anahuac named[1232742]: client @0x7fa57002dab8 ::1#36021/key maas: updating zone '64.246.
Nov 25 20:24:17 anahuac named[1232742]: client @0x7fa57002dab8 ::1#36021/key maas: updating zone '64.246.
Nov 25 20:24:17 anahuac named[1232742]: client @0x7fa57002dab8 ::1#36021/key maas: updating zone '64.246.
so this region looks good and it's actually ok.
The 10.246.64.6, the faulty one, sends
update add 213.64.
update add 213.64.
update delete 213.64.
update delete 213.64.
i.e. first add and then delete. Looking at the bind logs,
Nov 25 20:24:18 noma named[2789281]: client @0x7f193801e098 ::1#35153/key maas: updating zone '64.246.
Nov 25 20:24:18 noma named[2789281]: client @0x7f193801e098 ::1#35153/key maas: updating zone '64.246.
Nov 25 20:24:18 noma named[2789281]: client @0x7f193801e098 ::1#35153/key maas: updating zone '64.246.
Nov 25 20:24:18 noma named[2789281]: client @0x7f193801e098 ::1#35153/key maas: updating zone '64.246.
it is adding and then deleting.
The region 10.246.64.7 is exactly doing the same as 10.246.64.5 and it works fine.
One possible issue is that in the RegionControlle
The handler should be a synchronous function.
By patching the function with
def queueDynamicDNS
"""
Called when the `sys_dns_update` message is received
and queues updates for existing domains
"""
print("XXX: " + str(message))
if message == "":
return
(new_updates, need_reload) = yield deferToDatabase(
)
self.
)
if self._dns_
else:
import time
print("XXX: " + str(message) + " SLEEPING")
time.sleep(2)
print("XXX: " + str(message) + " RETURNING")
As you can see the execution is interleaved
2023-11-29 15:26:27 stdout: [info] XXX: DELETE maas maas-edge A 10.0.1.18
2023-11-29 15:26:27 stdout: [info] XXX: DELETE maas eth0.maas-edge A 10.0.1.18
2023-11-29 15:26:27 stdout: [info] XXX: INSERT maas maas-edge A 0 10.0.1.18
2023-11-29 15:26:27 stdout: [info] XXX: INSERT maas eth0.maas-edge A 0 10.0.1.18
2023-11-29 15:26:27 stdout: [info] XXX: DELETE maas maas-edge A 10.0.2.254
2023-11-29 15:26:27 stdout: [info] XXX: DELETE maas eth1.maas-edge A 10.0.2.254
2023-11-29 15:26:27 stdout: [info] XXX: INSERT maas maas-edge A 0 10.0.2.254
2023-11-29 15:26:27 stdout: [info] XXX: INSERT maas eth1.maas-edge A 0 10.0.2.254
2023-11-29 15:26:27 stdout: [info] XXX: DELETE maas maas-edge A 10.115.195.1
2023-11-29 15:26:27 stdout: [info] XXX: DELETE maas lxdbr0.maas-edge A 10.115.195.1
2023-11-29 15:26:27 stdout: [info] XXX: INSERT maas maas-edge A 0 10.115.195.1
2023-11-29 15:26:27 stdout: [info] XXX: INSERT maas lxdbr0.maas-edge A 0 10.115.195.1
2023-11-29 15:26:27 stdout: [info] XXX: DELETE maas maas-edge A fd42:271e:
2023-11-29 15:26:27 stdout: [info] XXX: DELETE maas lxdbr0.maas-edge A fd42:271e:
2023-11-29 15:26:27 stdout: [info] XXX: INSERT maas maas-edge A 0 fd42:271e:
2023-11-29 15:26:27 stdout: [info] XXX: INSERT maas lxdbr0.maas-edge A 0 fd42:271e:
2023-11-29 15:26:27 stdout: [info] XXX: RELOAD
2023-11-29 15:26:27 stdout: [info] XXX: DELETE maas maas-edge A 10.0.2.254 SLEEPING
2023-11-29 15:26:29 stdout: [info] XXX: DELETE maas maas-edge A 10.0.2.254 RETURNING
2023-11-29 15:26:29 stdout: [info] XXX: DELETE maas eth0.maas-edge A 10.0.1.18 SLEEPING
2023-11-29 15:26:31 stdout: [info] XXX: DELETE maas eth0.maas-edge A 10.0.1.18 RETURNING
2023-11-29 15:26:31 stdout: [info] XXX: DELETE maas eth1.maas-edge A 10.0.2.254 SLEEPING
2023-11-29 15:26:33 stdout: [info] XXX: DELETE maas eth1.maas-edge A 10.0.2.254 RETURNING
2023-11-29 15:26:33 stdout: [info] XXX: DELETE maas maas-edge A 10.0.1.18 SLEEPING
2023-11-29 15:26:35 stdout: [info] XXX: DELETE maas maas-edge A 10.0.1.18 RETURNING
2023-11-29 15:26:35 stdout: [info] XXX: DELETE maas lxdbr0.maas-edge A 10.115.195.1 SLEEPING
2023-11-29 15:26:37 stdout: [info] XXX: DELETE maas lxdbr0.maas-edge A 10.115.195.1 RETURNING
Related branches
- Jacopo Rota: Approve
- MAAS Lander: Approve
-
Diff: 719 lines (+163/-77)7 files modifiedsrc/maasserver/eventloop.py (+8/-3)
src/maasserver/region_controller.py (+21/-13)
src/maasserver/tests/test_eventloop.py (+13/-1)
src/maasserver/tests/test_plugin.py (+2/-0)
src/maasserver/tests/test_region_controller.py (+73/-56)
src/maasserver/utils/dbtasks.py (+25/-4)
src/maasserver/utils/tests/test_dbtasks.py (+21/-0)
- Jacopo Rota: Approve
- MAAS Lander: Approve
-
Diff: 719 lines (+163/-77)7 files modifiedsrc/maasserver/eventloop.py (+8/-3)
src/maasserver/region_controller.py (+21/-13)
src/maasserver/tests/test_eventloop.py (+13/-1)
src/maasserver/tests/test_plugin.py (+2/-0)
src/maasserver/tests/test_region_controller.py (+73/-56)
src/maasserver/utils/dbtasks.py (+25/-4)
src/maasserver/utils/tests/test_dbtasks.py (+21/-0)
- Christian Grabowski: Approve
- MAAS Lander: Approve
-
Diff: 719 lines (+163/-77)7 files modifiedsrc/maasserver/eventloop.py (+8/-3)
src/maasserver/region_controller.py (+21/-13)
src/maasserver/tests/test_eventloop.py (+13/-1)
src/maasserver/tests/test_plugin.py (+2/-0)
src/maasserver/tests/test_region_controller.py (+73/-56)
src/maasserver/utils/dbtasks.py (+25/-4)
src/maasserver/utils/tests/test_dbtasks.py (+21/-0)
- MAAS Lander: Approve
- Jacopo Rota: Approve
-
Diff: 152 lines (+133/-0)2 files modifiedsrc/maasserver/region_controller.py (+1/-0)
src/maasserver/tests/test_region_controller.py (+132/-0)
Changed in maas: | |
milestone: | 3.5.0 → 3.5.0-beta1 |
Changed in maas: | |
status: | Fix Committed → Fix Released |
According to the failed run with verbose logging here https:/ /solutions. qa.canonical. com/testruns/ 312e80e0- b17a-466d- be30-e03788fbfd 6c
we are still consuming the Postgres notifications concurrently
2023-12-07 20:03:26 maasserver. region_ controller: [info] [Dynamic update e2abeef3- 46e1-4d89- 8725-6177713b41 81] start processing message DELETE silo2.lab0. solutionsqa landscapeamqp-3-4 A 10.246.64.192 region_ controller: [info] [Dynamic update 3ef5ea26- 48c6-460a- a765-4b01945b23 21] start processing message DELETE silo2.lab0. solutionsqa ubuntu A 10.246.64.192 region_ controller: [info] [Dynamic update d606a8ad- 6288-4b3c- af21-5cff388a9d bd] start processing message INSERT silo2.lab0. solutionsqa landscapeamqp-3-4 A 0 10.246.64.192 region_ controller: [info] [Dynamic update 6c557e22- b621-4031- baf1-d77136f3e9 4e] start processing message region_ controller: [info] [Dynamic update 3ef5ea26- 48c6-460a- a765-4b01945b23 21] _dns_requires_ full_reload = False, need_reload = False region_ controller: [info] [Dynamic update 3ef5ea26- 48c6-460a- a765-4b01945b23 21] requires full reload False region_ controller: [info] [Dynamic update 3ef5ea26- 48c6-460a- a765-4b01945b23 21] dns update is NOT in progress. The updates are [] and the new updates are [DynamicDNSUpda te(operation= 'DELETE' , name='ubuntu. silo2.lab0. solutionsqa' , zone='silo2. lab0.solutionsq a', rectype='A', rev_zone=None, ttl=None, subnet=None, answer= '10.246. 64.192' , ip=None)] region_ controller: [info] [Dynamic update e2abeef3- 46e1-4d89- 8725-6177713b41 81] _dns_requires_ full_reload = False, need_reload = False region_ controller: [info] [Dynamic update e2abeef3- 46e1-4d89- 8725-6177713b41 81] requires full reload False region_ controller: [info] [Dynamic update e2abeef3- 46e1-4d89- 8725-6177713b41 81] dns update is NOT in progress. The updates are [DynamicDNSUpda te(operation= 'DELETE' , name='ubuntu. silo2.lab0. solutionsqa' , zone='silo2. lab0.solutionsq a', rectype='A', rev_zone=None, ttl=None, subnet=None, answer= '10.246. 64.192' , ip=None)] and the new updates are [DynamicDNSUpda te(operation= 'DELETE' , name='landscape amqp-3- 4.silo2. lab0.solutionsq a', zone='silo2. lab0.solutionsq a', rectype='A', rev_zone=None, ttl=None, subnet=None, answer= '10.246. 64.192' , ip=None)] region_ controller: [info] [Dynamic update d606a8ad- 6288-4b3c- af21-5cff388a9d bd] _dns_requires_ full_reload = False, need_reload = False region_ controller: [info] [Dynamic update d606a8ad- 6288-4b3c- af21-5cff388a9d bd] requires full reload False region_ controller: [info] [Dynamic update d606a8ad- 6288-4b3c- af21-5cff388a9d bd] dns update is NOT in progress. The updates are [DynamicDNSUpda te(operation= 'DELETE' , name='ubuntu. silo2.lab0. solutionsqa' , zone='silo2. lab0.solutionsq a', rectype='A', rev_zone=None, ttl=None, subnet=None, answer= '10.246. 64.192' , ip=None), DynamicDNSUpdat e(operation= 'DELETE' , name='landscape am...
2023-12-07 20:03:26 maasserver.
2023-12-07 20:03:26 maasserver.
2023-12-07 20:03:26 maasserver.
2023-12-07 20:03:26 maasserver.
2023-12-07 20:03:26 maasserver.
2023-12-07 20:03:26 maasserver.
2023-12-07 20:03:26 maasserver.
2023-12-07 20:03:26 maasserver.
2023-12-07 20:03:26 maasserver.
2023-12-07 20:03:26 maasserver.
2023-12-07 20:03:26 maasserver.
2023-12-07 20:03:26 maasserver.