Activity log for bug #1940908

Date Who What changed Old value New value Message
2021-08-24 09:03:17 TJ bug added bug
2021-08-24 09:12:17 TJ description
Old value:
With systemd v245 (and v247) and systemd-resolved we're seeing frequent problems due to resolved rapidly closing the socket on which it sends out a query before the server has answered. The server answers and then resolved sends an ICMP Destination Unreachable (Port Unreachable) response! This breaks name lookups frequently. In our case the DNS server is reached via a WireGuard tunnel over a satellite link and latencies can vary.
A typical example captured via tcpdump:
    07:22:03.446919 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338 > fddc:7e00:e001:ee00::1.53: 2963+ [1au] AAAA? contile-images.services.mozilla.com. (64)
    07:22:03.501089 IP6 fddc:7e00:e001:ee00::1.53 > fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338: 2963 1/0/1 AAAA 2a01:7e00:e001:ee64::2278:7366 (92)
    07:22:03.501152 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 > fddc:7e00:e001:ee00::1: ICMP6, destination unreachable, unreachable port, fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 udp port 45338, length 148
The time difference here is only 0.054170 seconds and there is no way to alter the timeout in resolved.
There are recent upstream commits to fix this which ought to be cherry-picked. See:
https://github.com/systemd/systemd/issues/17421
https://github.com/systemd/systemd/pull/17535
https://github.com/systemd/systemd/commit/e03d156f78cb5a0cac85d1e1310d89fdfa4f1b88
New value:
Affects Ubuntu 18.04 through 21.04 (fixes are in systemd v248)
With systemd v245 (and v247) and systemd-resolved we're seeing frequent problems due to resolved rapidly closing the socket on which it sends out a query before the server has answered. The server answers and then resolved sends an ICMP Destination Unreachable (Port Unreachable) response! This breaks name lookups frequently. In our case the DNS server is reached via a WireGuard tunnel over a satellite link and latencies can vary.
A typical example captured via tcpdump:
    07:22:03.446919 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338 > fddc:7e00:e001:ee00::1.53: 2963+ [1au] AAAA? contile-images.services.mozilla.com. (64)
    07:22:03.501089 IP6 fddc:7e00:e001:ee00::1.53 > fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338: 2963 1/0/1 AAAA 2a01:7e00:e001:ee64::2278:7366 (92)
    07:22:03.501152 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 > fddc:7e00:e001:ee00::1: ICMP6, destination unreachable, unreachable port, fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 udp port 45338, length 148
The time difference here is only 0.054170 seconds and there is no way to alter the timeout in resolved.
There are recent upstream commits to fix this which ought to be cherry-picked. See:
https://github.com/systemd/systemd/issues/17421
https://github.com/systemd/systemd/pull/17535
https://github.com/systemd/systemd/commit/e03d156f78cb5a0cac85d1e1310d89fdfa4f1b88
2021-08-24 09:24:26 TJ description
Old value:
Affects Ubuntu 18.04 through 21.04 (fixes are in systemd v248)
With systemd v245 (and v247) and systemd-resolved we're seeing frequent problems due to resolved rapidly closing the socket on which it sends out a query before the server has answered. The server answers and then resolved sends an ICMP Destination Unreachable (Port Unreachable) response! This breaks name lookups frequently. In our case the DNS server is reached via a WireGuard tunnel over a satellite link and latencies can vary.
A typical example captured via tcpdump:
    07:22:03.446919 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338 > fddc:7e00:e001:ee00::1.53: 2963+ [1au] AAAA? contile-images.services.mozilla.com. (64)
    07:22:03.501089 IP6 fddc:7e00:e001:ee00::1.53 > fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338: 2963 1/0/1 AAAA 2a01:7e00:e001:ee64::2278:7366 (92)
    07:22:03.501152 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 > fddc:7e00:e001:ee00::1: ICMP6, destination unreachable, unreachable port, fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 udp port 45338, length 148
The time difference here is only 0.054170 seconds and there is no way to alter the timeout in resolved.
There are recent upstream commits to fix this which ought to be cherry-picked. See:
https://github.com/systemd/systemd/issues/17421
https://github.com/systemd/systemd/pull/17535
https://github.com/systemd/systemd/commit/e03d156f78cb5a0cac85d1e1310d89fdfa4f1b88
New value:
Affects Ubuntu 18.04 through 21.04 (fixes are in systemd v248)
With systemd v245 (and v247) and systemd-resolved we're seeing frequent problems due to resolved rapidly closing the socket on which it sends out a query before the server has answered. The server answers and then resolved sends an ICMP Destination Unreachable (Port Unreachable) response! This breaks name lookups frequently. In our case the DNS server is reached via a WireGuard tunnel over a satellite link and latencies can vary.
A typical example captured via tcpdump:
    07:22:03.446919 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338 > fddc:7e00:e001:ee00::1.53: 2963+ [1au] AAAA? contile-images.services.mozilla.com. (64)
    07:22:03.501089 IP6 fddc:7e00:e001:ee00::1.53 > fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4.45338: 2963 1/0/1 AAAA 2a01:7e00:e001:ee64::2278:7366 (92)
    07:22:03.501152 IP6 fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 > fddc:7e00:e001:ee00::1: ICMP6, destination unreachable, unreachable port, fddc:7e00:e001:ee00:fffe:f875:a4f3:42b4 udp port 45338, length 148
The time difference here is only 0.054170 seconds and there is no way to alter the timeout in resolved.
There are recent upstream commits to fix this which ought to be cherry-picked. See:
https://github.com/systemd/systemd/issues/17421
https://github.com/systemd/systemd/pull/17535
https://github.com/systemd/systemd/commit/e03d156f78cb5a0cac85d1e1310d89fdfa4f1b88
If I am reading the code correctly, the per-attempt timeout is derived from these definitions:
    src/resolve/resolved-dns-transaction.c:22:#define DNS_TIMEOUT_USEC (SD_RESOLVED_QUERY_TIMEOUT_USEC / DNS_TRANSACTION_ATTEMPTS_MAX)
    src/resolve/resolved-def.h:79:#define SD_RESOLVED_QUERY_TIMEOUT_USEC (120 * USEC_PER_SEC)
    src/resolve/resolved-dns-transaction.h:212:#define DNS_TRANSACTION_ATTEMPTS_MAX 24
So that is 120 s / 24 = 5 s (5,000,000 microseconds) per attempt with, as inferred, up to 24 attempts (I don't see multiple duplicate requests on the wire, so I'm not sure DNS_TRANSACTION_ATTEMPTS_MAX affects this).
2021-08-24 11:40:24 Dan Streetman bug added subscriber Dan Streetman
2021-08-24 11:42:40 Dan Streetman systemd (Ubuntu): status New Incomplete
2021-10-25 04:17:25 Launchpad Janitor systemd (Ubuntu): status Incomplete Expired
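
As a rough cross-check of the numbers in the 09:24:26 description update, the sketch below recomputes the per-attempt timeout from the three quoted #define lines and the gaps between the three tcpdump timestamps. It is only an illustration, not systemd code: USEC_PER_SEC = 1,000,000 is an assumption (its definition is not quoted in the report), and usec_between is a hypothetical helper introduced here.

```python
from datetime import datetime, timedelta

# Constants mirroring the #define lines quoted in the description.
USEC_PER_SEC = 1_000_000  # assumption: microseconds per second, not quoted in the report
SD_RESOLVED_QUERY_TIMEOUT_USEC = 120 * USEC_PER_SEC
DNS_TRANSACTION_ATTEMPTS_MAX = 24
DNS_TIMEOUT_USEC = SD_RESOLVED_QUERY_TIMEOUT_USEC // DNS_TRANSACTION_ATTEMPTS_MAX


def usec_between(start: str, end: str) -> int:
    """Hypothetical helper: microseconds between two tcpdump HH:MM:SS.ffffff stamps."""
    fmt = "%H:%M:%S.%f"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta // timedelta(microseconds=1)


if __name__ == "__main__":
    # Timestamps copied from the capture in the description.
    query, answer, icmp = "07:22:03.446919", "07:22:03.501089", "07:22:03.501152"
    print("per-attempt timeout (usec):", DNS_TIMEOUT_USEC)
    print("query -> answer (usec):", usec_between(query, answer))
    print("answer -> ICMP unreachable (usec):", usec_between(answer, icmp))
```

Under these assumptions the division yields a value in seconds-scale microseconds (120 s / 24), so the observed query-to-answer gap can be compared directly against it in the same unit.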