2021-08-24 09:03:17 |
TJ |
bug |
|
|
added bug |
2021-08-24 09:12:17 |
TJ |
description |
With systemd v245 (and v247) and systemd-resolved we're seeing frequent problems due to resolved rapidly closing the socket on which it sends out a query before the server has answered. The server answers and then resolved sends an ICMP Destination Unreachable (Port Unreachable) response!
This breaks name lookups frequently. In our case the DNS server is reached via a WireGuard tunnel over a satellite link, and latencies can vary.
A typical example captured via tcpdump:
07:22:03.446919 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338 > fddc:7e00:e001:ee00::1.53: 2963+ [1au] AAAA? contile-images.services.mozilla.com. (64)
07:22:03.501089 IP6 fddc:7e00:e001:ee00::1.53 > fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338: 2963 1/0/1 AAAA 2a01:7e00:e001:ee64::2278:7366 (92)
07:22:03.501152 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 > fddc:7e00:e001:ee00::1: ICMP6, destination unreachable, unreachable port, fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 udp port 45338, length 148
The time difference between query and answer here is only 0.054170 seconds, and there is no way to alter the timeout in resolved.
There are recent upstream commits to fix this which ought to be cherry-picked. See:
https://github.com/systemd/systemd/issues/17421
https://github.com/systemd/systemd/pull/17535
https://github.com/systemd/systemd/commit/e03d156f78cb5a0cac85d1e1310d89fdfa4f1b88 |
Affects Ubuntu 18.04 through 21.04 (fixes are in systemd v248)
With systemd v245 (and v247) and systemd-resolved we're seeing frequent problems due to resolved rapidly closing the socket on which it sends out a query before the server has answered. The server answers and then resolved sends an ICMP Destination Unreachable (Port Unreachable) response!
This breaks name lookups frequently. In our case the DNS server is reached via a WireGuard tunnel over a satellite link, and latencies can vary.
A typical example captured via tcpdump:
07:22:03.446919 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338 > fddc:7e00:e001:ee00::1.53: 2963+ [1au] AAAA? contile-images.services.mozilla.com. (64)
07:22:03.501089 IP6 fddc:7e00:e001:ee00::1.53 > fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338: 2963 1/0/1 AAAA 2a01:7e00:e001:ee64::2278:7366 (92)
07:22:03.501152 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 > fddc:7e00:e001:ee00::1: ICMP6, destination unreachable, unreachable port, fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 udp port 45338, length 148
The time difference between query and answer here is only 0.054170 seconds, and there is no way to alter the timeout in resolved.
There are recent upstream commits to fix this which ought to be cherry-picked. See:
https://github.com/systemd/systemd/issues/17421
https://github.com/systemd/systemd/pull/17535
https://github.com/systemd/systemd/commit/e03d156f78cb5a0cac85d1e1310d89fdfa4f1b88 |
|
2021-08-24 09:24:26 |
TJ |
description |
Affects Ubuntu 18.04 through 21.04 (fixes are in systemd v248)
With systemd v245 (and v247) and systemd-resolved we're seeing frequent problems due to resolved rapidly closing the socket on which it sends out a query before the server has answered. The server answers and then resolved sends an ICMP Destination Unreachable (Port Unreachable) response!
This breaks name lookups frequently. In our case the DNS server is reached via a WireGuard tunnel over a satellite link, and latencies can vary.
A typical example captured via tcpdump:
07:22:03.446919 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338 > fddc:7e00:e001:ee00::1.53: 2963+ [1au] AAAA? contile-images.services.mozilla.com. (64)
07:22:03.501089 IP6 fddc:7e00:e001:ee00::1.53 > fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338: 2963 1/0/1 AAAA 2a01:7e00:e001:ee64::2278:7366 (92)
07:22:03.501152 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 > fddc:7e00:e001:ee00::1: ICMP6, destination unreachable, unreachable port, fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 udp port 45338, length 148
The time difference between query and answer here is only 0.054170 seconds, and there is no way to alter the timeout in resolved.
There are recent upstream commits to fix this which ought to be cherry-picked. See:
https://github.com/systemd/systemd/issues/17421
https://github.com/systemd/systemd/pull/17535
https://github.com/systemd/systemd/commit/e03d156f78cb5a0cac85d1e1310d89fdfa4f1b88 |
Affects Ubuntu 18.04 through 21.04 (fixes are in systemd v248)
With systemd v245 (and v247) and systemd-resolved we're seeing frequent problems due to resolved rapidly closing the socket on which it sends out a query before the server has answered. The server answers and then resolved sends an ICMP Destination Unreachable (Port Unreachable) response!
This breaks name lookups frequently. In our case the DNS server is reached via a WireGuard tunnel over a satellite link, and latencies can vary.
A typical example captured via tcpdump:
07:22:03.446919 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338 > fddc:7e00:e001:ee00::1.53: 2963+ [1au] AAAA? contile-images.services.mozilla.com. (64)
07:22:03.501089 IP6 fddc:7e00:e001:ee00::1.53 > fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338: 2963 1/0/1 AAAA 2a01:7e00:e001:ee64::2278:7366 (92)
07:22:03.501152 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 > fddc:7e00:e001:ee00::1: ICMP6, destination unreachable, unreachable port, fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 udp port 45338, length 148
The time difference between query and answer here is only 0.054170 seconds, and there is no way to alter the timeout in resolved.
There are recent upstream commits to fix this which ought to be cherry-picked. See:
https://github.com/systemd/systemd/issues/17421
https://github.com/systemd/systemd/pull/17535
https://github.com/systemd/systemd/commit/e03d156f78cb5a0cac85d1e1310d89fdfa4f1b88
If I am reading the code correctly, the per-attempt timeout is derived from these macros:
src/resolve/resolved-dns-transaction.c:22:#define DNS_TIMEOUT_USEC (SD_RESOLVED_QUERY_TIMEOUT_USEC / DNS_TRANSACTION_ATTEMPTS_MAX)
src/resolve/resolved-def.h:79:#define SD_RESOLVED_QUERY_TIMEOUT_USEC (120 * USEC_PER_SEC)
src/resolve/resolved-dns-transaction.h:212:#define DNS_TRANSACTION_ATTEMPTS_MAX 24
So that is (120 * USEC_PER_SEC) / 24 = 5,000,000 microseconds, i.e. 5 seconds per attempt, with, as inferred, up to 24 attempts (I don't see multiple duplicate requests on the wire, so I'm not sure DNS_TRANSACTION_ATTEMPTS_MAX affects this). That is far longer than the 0.054170 seconds between query and answer in the capture above, so this timeout alone does not explain the early socket close. |
|
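The macro arithmetic quoted in the description can be checked with a small standalone sketch. It reproduces the three quoted macros and assumes USEC_PER_SEC is 1,000,000 (it is not quoted in the report). The two helper functions, dns_timeout_usec() and observed_gap_usec(), are hypothetical names introduced here for illustration and are not part of systemd. The gap is taken from the sub-second parts of the two tcpdump timestamps, which fall within the same second.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: USEC_PER_SEC is 1,000,000 (microseconds per second);
 * the report does not quote its definition. */
#define USEC_PER_SEC ((uint64_t) 1000000)

/* The three macros quoted in the description. */
#define SD_RESOLVED_QUERY_TIMEOUT_USEC (120 * USEC_PER_SEC)
#define DNS_TRANSACTION_ATTEMPTS_MAX 24
#define DNS_TIMEOUT_USEC (SD_RESOLVED_QUERY_TIMEOUT_USEC / DNS_TRANSACTION_ATTEMPTS_MAX)

/* Compile-time check: 120 s spread over 24 attempts is 5 s per attempt. */
static_assert(DNS_TIMEOUT_USEC == 5 * USEC_PER_SEC,
              "per-attempt timeout should be 5 seconds");

/* Hypothetical helper: the per-attempt timeout in microseconds. */
uint64_t dns_timeout_usec(void) {
    return DNS_TIMEOUT_USEC;
}

/* Hypothetical helper: microseconds between the query (07:22:03.446919)
 * and the answer (07:22:03.501089) in the tcpdump capture. Both packets
 * share the same second, so only the fractional parts are subtracted. */
uint64_t observed_gap_usec(void) {
    const uint64_t query_usec = 446919;
    const uint64_t answer_usec = 501089;
    return answer_usec - query_usec;
}
```

Under these assumptions the per-attempt timeout is 5,000,000 microseconds while the answer arrived after 54,170 microseconds, so the answer was well inside the timeout window when the ICMP port-unreachable was sent.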
2021-08-24 11:40:24 |
Dan Streetman |
bug |
|
|
added subscriber Dan Streetman |
2021-08-24 11:42:40 |
Dan Streetman |
systemd (Ubuntu): status |
New |
Incomplete |
|
2021-10-25 04:17:25 |
Launchpad Janitor |
systemd (Ubuntu): status |
Incomplete |
Expired |
|