Intermittent DNS server non-responsiveness
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
OpenStack Charm Test Infra | Invalid | Undecided | Unassigned |
Bug Description
On multiple occasions I have observed charms in long-running deployments, on either ServerStack or Icarus MAAS, erroring out during the update-status hook.
Looking at one occurrence of this issue, the cause appears to be the DNS server not responding (see the traceback below).
That charms run unrelated work in the update-status hook is a generic challenge with the reactive framework that we need to tackle separately, but it does not change the fact that we appear to have intermittent DNS server (or networking) issues in our lab.
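To illustrate the first half of that point, here is a minimal sketch of a reverse lookup that tolerates resolver failures rather than letting the exception abort a periodic hook. It is not the charm's code: the traceback shows the charm going through dnspython, whereas this sketch swaps in the standard library's `socket.gethostbyaddr`, and the helper name `safe_reverse_lookup` and its `lookup` parameter are hypothetical.

```python
import socket
from typing import Callable, Optional


def safe_reverse_lookup(
    ip: str,
    lookup: Callable[[str], tuple] = socket.gethostbyaddr,
) -> Optional[str]:
    """Return the PTR hostname for ip, or None if the lookup fails.

    Hypothetical helper: swallowing the failure lets a periodic hook log
    the problem and carry on instead of erroring out.
    """
    try:
        hostname, _aliases, _addresses = lookup(ip)
    except OSError as exc:
        # socket.herror, socket.gaierror and socket timeouts are all
        # OSError subclasses in Python 3.
        print(f"reverse lookup for {ip} failed: {exc}")
        return None
    return hostname
```

Whether returning None is acceptable depends on the caller; a certificate request that needs the hostname would still have to be retried on a later hook run.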
Example Traceback:
2019-10-09 09:21:42 DEBUG juju-log tracer>
tracer: hooks phase, 1 handlers queued
tracer: ++ queue handler reactive/
2019-10-09 09:21:42 INFO juju-log Invoking reactive handler: reactive/
2019-10-09 09:21:42 DEBUG juju-log tracer: set flag run-default-
2019-10-09 09:21:43 DEBUG juju-log tracer>
tracer: main dispatch loop, 7 handlers queued
tracer: ++ queue handler hooks/relations
tracer: ++ queue handler hooks/relations
tracer: ++ queue handler reactive/
tracer: ++ queue handler reactive/
tracer: ++ queue handler reactive/
tracer: ++ queue handler reactive/
tracer: ++ queue handler reactive/
2019-10-09 09:21:43 INFO juju-log Invoking reactive handler: reactive/
2019-10-09 09:21:43 DEBUG juju-log tracer>
tracer: cleared flag run-default-
tracer: -- dequeue handler reactive/
2019-10-09 09:21:43 INFO juju-log Invoking reactive handler: reactive/
2019-10-09 09:22:13 ERROR juju-log Hook error:
Traceback (most recent call last):
File "/var/lib/
bus.
File "/var/lib/
_invoke(
File "/var/lib/
handler.
File "/var/lib/
self.
File "/var/lib/
for cn, req in instance.
File "/var/lib/
json_
File "/var/lib/
req.
File "/var/lib/
'cn': get_hostname(ip),
File "/var/lib/
result = ns_query(rev)
File "/var/lib/
answers = dns.resolver.
File "/var/lib/
lifetime)
File "/var/lib/
timeout = self._compute_
File "/var/lib/
raise Timeout(
dns.exception.
Actually, the DNS server appears to be dead in general atm!
root@juju-7153cf-0-lxd-3:~# systemd-resolve --status
Global
          DNSSEC NTA: 10.in-addr.arpa
                      16.172.in-addr.arpa
                      168.192.in-addr.arpa
                      17.172.in-addr.arpa
                      18.172.in-addr.arpa
                      19.172.in-addr.arpa
                      20.172.in-addr.arpa
                      21.172.in-addr.arpa
                      22.172.in-addr.arpa
                      23.172.in-addr.arpa
                      24.172.in-addr.arpa
                      25.172.in-addr.arpa
                      26.172.in-addr.arpa
                      27.172.in-addr.arpa
                      28.172.in-addr.arpa
                      29.172.in-addr.arpa
                      30.172.in-addr.arpa
                      31.172.in-addr.arpa
                      corp
                      d.f.ip6.arpa
                      home
                      internal
                      intranet
                      lan
                      local
                      private
                      test
Link 29 (eth0)
      Current Scopes: DNS
       LLMNR setting: yes
MulticastDNS setting: no
      DNSSEC setting: no
    DNSSEC supported: no
         DNS Servers: 10.246.112.3
          DNS Domain: maas
root@juju-7153cf-0-lxd-3:~# ip addr show
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
29: eth0@if30: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 9000 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:59:2b:82 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.246.114.24/21 brd 10.246.119.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe59:2b82/64 scope link
       valid_lft forever preferred_lft forever
root@juju-7153cf-0-lxd-3:~# time host 10.246.114.24 10.246.112.3
;; connection timed out; no servers could be reached
real 0m10.015s
user 0m0.004s
sys 0m0.011s
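The `time host` check above can be repeated from a script to catch the intermittent windows. The following is a rough standard-library sketch, not an existing tool: `build_ptr_query` and `probe_dns` are hypothetical names, and the probe sends a single UDP PTR query (RFC 1035 wire format) and reports how long the server took to answer, or None if nothing came back within the timeout.

```python
import ipaddress
import socket
import struct
import time
from typing import Optional


def build_ptr_query(ip: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS PTR query packet for ip."""
    # For 10.246.114.24 this is "24.114.246.10.in-addr.arpa".
    name = ipaddress.ip_address(ip).reverse_pointer
    # Header: id, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.split(".")
    ) + b"\x00"
    # QTYPE 12 = PTR, QCLASS 1 = IN.
    return header + qname + struct.pack("!HH", 12, 1)


def probe_dns(
    server: str, ip: str, port: int = 53, timeout: float = 2.0
) -> Optional[float]:
    """Return seconds until server answers a PTR query for ip, else None."""
    query = build_ptr_query(ip)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        start = time.monotonic()
        try:
            sock.sendto(query, (server, port))
            reply, _addr = sock.recvfrom(512)
        except OSError:
            # Covers timeouts as well as unreachable-network errors.
            return None
        if reply[:2] != query[:2]:
            # Reply does not carry our query id; treat as no answer.
            return None
        return time.monotonic() - start
```

Calling `probe_dns("10.246.112.3", "10.246.114.24")` mirrors the `host` invocation above; running it periodically and logging the None results would show when the server stops answering. The sketch only checks that some reply with a matching id arrived and does not parse the answer section.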