Comment 29 for bug 1817484

Dmitrii Shcherbakov (dmitriis) wrote :

Another update: I managed to reproduce the issue with missing regiond sockets.

See maas-vhost2 logs:
https://private-fileshare.canonical.com/~dima/maas-dumps/2019-03-01-maas-vhost1-2-3-etc-var-log.tar.gz

# no listening sockets
ubuntu@maas-vhost2:~$ sudo ss -tlpna 'sport = 5240'
State Recv-Q Send-Q Local Address:Port Peer Address:Port

# however, the processes are running
ubuntu@maas-vhost2:~$ pgrep -af regiond
1002 /bin/sh -c exec /usr/sbin/regiond 2>&1 | tee -a $LOGFILE
1004 /usr/bin/python3 /usr/sbin/regiond
1005 tee -a /var/log/maas/regiond.log
1967 /usr/bin/python3 /usr/sbin/regiond
1969 /usr/bin/python3 /usr/sbin/regiond
1972 /usr/bin/python3 /usr/sbin/regiond
1973 /usr/bin/python3 /usr/sbin/regiond
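
The same "process running but nothing listening" check can be scripted. Below is a minimal sketch, assuming a plain TCP connect to 127.0.0.1:5240 is a good enough probe; the function name `port_is_listening` is made up for illustration and is not part of MAAS or the resource agent.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError subclasses on failure, including
        # ConnectionRefusedError (errno 111) and timeouts.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Calling `port_is_listening("127.0.0.1", 5240)` on maas-vhost2 in the state shown above would be expected to return False, matching the empty `ss` output.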

No API sockets are present (port 5240), yet the node is clearly receiving DNS updates, since its zone file has the right content:

ubuntu@maas-vhost2:~$ cat /etc/bind/maas/zone.test
; Zone file modified: 2019-02-28 23:26:43.245638.
$TTL 30
#...
@ 30 IN NS maas.
maas-region 0 IN A 10.100.1.2

ubuntu@maas-vhost1:~$ cat /etc/bind/maas/zone.test
; Zone file modified: 2019-02-28 23:26:32.619195.
$TTL 30
#...
@ 30 IN NS maas.
maas-region 0 IN A 10.100.1.2
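
To compare when each node last rewrote its zone file, the header comment can be parsed. This is only a sketch: it assumes the "; Zone file modified: <timestamp>." header always has the format shown in the two outputs above, and `zone_modified_at` is a hypothetical helper name.

```python
import re
from datetime import datetime
from typing import Optional

# Matches the header comment seen in the zone files above; the trailing
# period after the microseconds is deliberately left out of the capture group.
HEADER_RE = re.compile(
    r"Zone file modified: (\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)"
)


def zone_modified_at(zone_text: str) -> Optional[datetime]:
    """Return the modification time from the zone file header, if present."""
    m = HEADER_RE.search(zone_text)
    if m is None:
        return None
    return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S.%f")
```

For the two headers above, the maas-vhost2 zone file was written roughly 10.6 seconds after the maas-vhost1 one.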

Since our resource agent uses http://localhost:5240/MAAS for DNS update API calls, the relevant timestamps can be seen in the "connection refused" errors it reports:

root@maas-vhost2:~# grep -B1 -A1 -a refused /var/log/pacemaker.log
Feb 28 23:26:30 [1401] maas-vhost2 lrmd: notice: operation_finished: res_maas_region_hostname_start_0:2891:stderr [ sock.connect(sa) ]
Feb 28 23:26:30 [1401] maas-vhost2 lrmd: notice: operation_finished: res_maas_region_hostname_start_0:2891:stderr [ ConnectionRefusedError: [Errno 111] Connection refused ]
Feb 28 23:26:30 [1401] maas-vhost2 lrmd: notice: operation_finished: res_maas_region_hostname_start_0:2891:stderr [ ]
--
Feb 28 23:26:30 [1401] maas-vhost2 lrmd: notice: operation_finished: res_maas_region_hostname_start_0:2891:stderr [ raise URLError(err) ]
Feb 28 23:26:30 [1401] maas-vhost2 lrmd: notice: operation_finished: res_maas_region_hostname_start_0:2891:stderr [ urllib.error.URLError: <urlopen error [Errno 111] Connection refused> ]
Feb 28 23:26:30 [1401] maas-vhost2 lrmd: info: log_finished: finished - rsc:res_maas_region_hostname action:start call_id:25 pid:2891 exit-code:1 exec-time:9651ms queue-time:0ms
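
For longer logs, the timestamps of the refused connections can be pulled out programmatically. A rough sketch follows, assuming every relevant line starts with "<month> <day> <time> [<pid>] <host> " as in the excerpt above; `refused_timestamps` is an illustrative name, not an existing tool.

```python
import re

# Prefix of a pacemaker.log line as seen above:
# "Feb 28 23:26:30 [1401] maas-vhost2 lrmd: ..."
LINE_RE = re.compile(
    r"^(?P<ts>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) \[\d+\] (?P<host>\S+) "
)


def refused_timestamps(lines):
    """Return unique (timestamp, host) pairs for 'Connection refused' lines."""
    seen = []
    for line in lines:
        if "Connection refused" not in line:
            continue
        m = LINE_RE.match(line)
        if m is None:
            continue  # line does not follow the assumed prefix format
        key = (m.group("ts"), m.group("host"))
        if key not in seen:
            seen.append(key)  # keep first-seen order
    return seen
```

Usage would be along the lines of `refused_timestamps(open("/var/log/pacemaker.log", errors="replace"))`; `errors="replace"` is there because the `-a` flag in the grep above suggests the file may contain non-text bytes.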

The sockets may have been gone even before that, because I can see "request to http://127.0.0.1:5240/MAAS/metadata/2012-03-01/ failed" messages earlier in regiond.log.