Bug #1988119 “systemd-udevd: Run net_setup_link on 'change' ueve...” : Bugs : systemd package : Ubuntu

Revision history for this message

Launchpad Janitor (janitor) wrote on 2022-08-30:

#1

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in systemd (Ubuntu):
status:	New → Confirmed

Revision history for this message

Pieter Lexis (pieter-lexis-tt) wrote on 2022-08-30:

#2

Download full text (15.3 KiB)

We've just had the same problem, on multiple VMs running in Azure.

In the dpkg log we can see that systemd was indeed updated (times in UTC):

2022-08-30 06:31:18 status unpacked udev:amd64 237-3ubuntu10.54
2022-08-30 06:31:18 status half-configured udev:amd64 237-3ubuntu10.54
2022-08-30 06:31:19 status installed udev:amd64 237-3ubuntu10.54
2022-08-30 06:31:19 status triggers-pending initramfs-tools:all 0.130ubuntu3.13
2022-08-30 06:31:19 trigproc man-db:amd64 2.8.3-2ubuntu0.1 <none>
2022-08-30 06:31:19 status half-configured man-db:amd64 2.8.3-2ubuntu0.1
2022-08-30 06:31:19 status installed man-db:amd64 2.8.3-2ubuntu0.1
2022-08-30 06:31:19 trigproc ureadahead:amd64 0.100.0-21 <none>
2022-08-30 06:31:19 status half-configured ureadahead:amd64 0.100.0-21
2022-08-30 06:31:20 status installed ureadahead:amd64 0.100.0-21
2022-08-30 06:31:20 trigproc libc-bin:amd64 2.27-3ubuntu1.5 <none>
2022-08-30 06:31:20 status half-configured libc-bin:amd64 2.27-3ubuntu1.5
2022-08-30 06:31:20 status installed libc-bin:amd64 2.27-3ubuntu1.5
2022-08-30 06:31:20 trigproc systemd:amd64 237-3ubuntu10.53 <none>
2022-08-30 06:31:20 status half-configured systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:20 status installed systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:20 trigproc initramfs-tools:all 0.130ubuntu3.13 <none>
2022-08-30 06:31:20 status half-configured initramfs-tools:all 0.130ubuntu3.13
2022-08-30 06:31:34 status installed initramfs-tools:all 0.130ubuntu3.13
2022-08-30 06:31:37 startup archives unpack
2022-08-30 06:31:38 upgrade libnss-systemd:amd64 237-3ubuntu10.53 237-3ubuntu10.54
2022-08-30 06:31:38 status triggers-pending libc-bin:amd64 2.27-3ubuntu1.5
2022-08-30 06:31:38 status half-configured libnss-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status unpacked libnss-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status half-installed libnss-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status triggers-pending man-db:amd64 2.8.3-2ubuntu0.1
2022-08-30 06:31:38 status half-installed libnss-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status unpacked libnss-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:38 status unpacked libnss-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:38 upgrade libpam-systemd:amd64 237-3ubuntu10.53 237-3ubuntu10.54
2022-08-30 06:31:38 status half-configured libpam-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status unpacked libpam-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status half-installed libpam-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status half-installed libpam-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status unpacked libpam-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:38 status unpacked libpam-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:38 upgrade systemd:amd64 237-3ubuntu10.53 237-3ubuntu10.54
2022-08-30 06:31:38 status half-configured systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status unpacked systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status half-installed systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:39 status triggers-pending ureadahead:amd64 0.100.0-21
2022-08-30 06:31:39 status triggers-pending dbus:amd64 1.12.2-1ubuntu1.3
2022-08-...

We've just had the same problem, on multiple VMs running in Azure.

In the dpkg log we can see that systemd was indeed updated (times in UTC):

2022-08-30 06:31:18 status unpacked udev:amd64 237-3ubuntu10.54
2022-08-30 06:31:18 status half-configured udev:amd64 237-3ubuntu10.54
2022-08-30 06:31:19 status installed udev:amd64 237-3ubuntu10.54
2022-08-30 06:31:19 status triggers-pending initramfs-tools:all 0.130ubuntu3.13
2022-08-30 06:31:19 trigproc man-db:amd64 2.8.3-2ubuntu0.1 <none>
2022-08-30 06:31:19 status half-configured man-db:amd64 2.8.3-2ubuntu0.1
2022-08-30 06:31:19 status installed man-db:amd64 2.8.3-2ubuntu0.1
2022-08-30 06:31:19 trigproc ureadahead:amd64 0.100.0-21 <none>
2022-08-30 06:31:19 status half-configured ureadahead:amd64 0.100.0-21
2022-08-30 06:31:20 status installed ureadahead:amd64 0.100.0-21
2022-08-30 06:31:20 trigproc libc-bin:amd64 2.27-3ubuntu1.5 <none>
2022-08-30 06:31:20 status half-configured libc-bin:amd64 2.27-3ubuntu1.5
2022-08-30 06:31:20 status installed libc-bin:amd64 2.27-3ubuntu1.5
2022-08-30 06:31:20 trigproc systemd:amd64 237-3ubuntu10.53 <none>
2022-08-30 06:31:20 status half-configured systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:20 status installed systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:20 trigproc initramfs-tools:all 0.130ubuntu3.13 <none>
2022-08-30 06:31:20 status half-configured initramfs-tools:all 0.130ubuntu3.13
2022-08-30 06:31:34 status installed initramfs-tools:all 0.130ubuntu3.13
2022-08-30 06:31:37 startup archives unpack
2022-08-30 06:31:38 upgrade libnss-systemd:amd64 237-3ubuntu10.53 237-3ubuntu10.54
2022-08-30 06:31:38 status triggers-pending libc-bin:amd64 2.27-3ubuntu1.5
2022-08-30 06:31:38 status half-configured libnss-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status unpacked libnss-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status half-installed libnss-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status triggers-pending man-db:amd64 2.8.3-2ubuntu0.1
2022-08-30 06:31:38 status half-installed libnss-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status unpacked libnss-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:38 status unpacked libnss-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:38 upgrade libpam-systemd:amd64 237-3ubuntu10.53 237-3ubuntu10.54
2022-08-30 06:31:38 status half-configured libpam-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status unpacked libpam-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status half-installed libpam-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status half-installed libpam-systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status unpacked libpam-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:38 status unpacked libpam-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:38 upgrade systemd:amd64 237-3ubuntu10.53 237-3ubuntu10.54
2022-08-30 06:31:38 status half-configured systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status unpacked systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:38 status half-installed systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:39 status triggers-pending ureadahead:amd64 0.100.0-21
2022-08-30 06:31:39 status triggers-pending dbus:amd64 1.12.2-1ubuntu1.3
2022-08-30 06:31:39 status triggers-pending dbus:amd64 1.12.2-1ubuntu1.3
2022-08-30 06:31:39 status half-installed systemd:amd64 237-3ubuntu10.53
2022-08-30 06:31:39 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:39 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 upgrade libsystemd0:amd64 237-3ubuntu10.53 237-3ubuntu10.54
2022-08-30 06:31:40 status half-configured libsystemd0:amd64 237-3ubuntu10.53
2022-08-30 06:31:40 status unpacked libsystemd0:amd64 237-3ubuntu10.53
2022-08-30 06:31:40 status half-installed libsystemd0:amd64 237-3ubuntu10.53
2022-08-30 06:31:40 status half-installed libsystemd0:amd64 237-3ubuntu10.53
2022-08-30 06:31:40 status unpacked libsystemd0:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status unpacked libsystemd0:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 startup packages configure
2022-08-30 06:31:40 configure libsystemd0:amd64 237-3ubuntu10.54 <none>
2022-08-30 06:31:40 status unpacked libsystemd0:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status half-configured libsystemd0:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status installed libsystemd0:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 startup packages configure
2022-08-30 06:31:40 configure systemd:amd64 237-3ubuntu10.54 <none>
2022-08-30 06:31:40 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status unpacked systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:40 status half-configured systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:41 status installed systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:41 configure libnss-systemd:amd64 237-3ubuntu10.54 <none>
2022-08-30 06:31:41 status unpacked libnss-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:41 status half-configured libnss-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:41 status installed libnss-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:41 configure libpam-systemd:amd64 237-3ubuntu10.54 <none>
2022-08-30 06:31:41 status unpacked libpam-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:41 status half-configured libpam-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:42 status installed libpam-systemd:amd64 237-3ubuntu10.54
2022-08-30 06:31:42 trigproc libc-bin:amd64 2.27-3ubuntu1.5 <none>
2022-08-30 06:31:42 status half-configured libc-bin:amd64 2.27-3ubuntu1.5
2022-08-30 06:31:42 status installed libc-bin:amd64 2.27-3ubuntu1.5
2022-08-30 06:31:42 trigproc man-db:amd64 2.8.3-2ubuntu0.1 <none>
2022-08-30 06:31:42 status half-configured man-db:amd64 2.8.3-2ubuntu0.1
2022-08-30 06:31:44 status installed man-db:amd64 2.8.3-2ubuntu0.1
2022-08-30 06:31:44 trigproc dbus:amd64 1.12.2-1ubuntu1.3 <none>
2022-08-30 06:31:44 status half-configured dbus:amd64 1.12.2-1ubuntu1.3
2022-08-30 06:31:44 status installed dbus:amd64 1.12.2-1ubuntu1.3
2022-08-30 06:31:44 trigproc ureadahead:amd64 0.100.0-21 <none>
2022-08-30 06:31:44 status half-configured ureadahead:amd64 0.100.0-21
2022-08-30 06:31:44 status installed ureadahead:amd64 0.100.0-21

syslog also shows that systemd was reloaded after the update:

Aug 30 06:31:01 provisioning systemd[1]: Starting Daily apt upgrade and clean activities...
Aug 30 06:31:17 provisioning systemd[1]: Reloading.
Aug 30 06:31:17 provisioning systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network-online.target ignored (.device units cannot be delayed)
Aug 30 06:31:17 provisioning systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network.target ignored (.device units cannot be delayed)
Aug 30 06:31:17 provisioning systemd[1]: Starting Message of the Day...
Aug 30 06:31:18 provisioning 50-motd-news[13975]:  * Super-optimized for small spaces - read how we shrank the memory
Aug 30 06:31:18 provisioning 50-motd-news[13975]:    footprint of MicroK8s to make it the smallest full K8s around.
Aug 30 06:31:18 provisioning 50-motd-news[13975]:    https://ubuntu.com/blog/microk8s-memory-optimisation
Aug 30 06:31:18 provisioning systemd[1]: Started Message of the Day.
Aug 30 06:31:18 provisioning systemd[1]: Reloading.
Aug 30 06:31:19 provisioning systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network-online.target ignored (.device units cannot be delayed)
Aug 30 06:31:19 provisioning systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network.target ignored (.device units cannot be delayed)
Aug 30 06:31:19 provisioning systemd[1]: Reloading.
Aug 30 06:31:19 provisioning systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network-online.target ignored (.device units cannot be delayed)
Aug 30 06:31:19 provisioning systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network.target ignored (.device units cannot be delayed)
Aug 30 06:31:19 provisioning systemd[1]: Stopping udev Kernel Device Manager...
Aug 30 06:31:19 provisioning systemd[1]: Stopped udev Kernel Device Manager.
Aug 30 06:31:19 provisioning systemd[1]: Starting udev Kernel Device Manager...
Aug 30 06:31:19 provisioning systemd[1]: Started udev Kernel Device Manager.
Aug 30 06:31:19 provisioning systemd[1]: Reloading.
Aug 30 06:31:19 provisioning systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network-online.target ignored (.device units cannot be delayed)
Aug 30 06:31:19 provisioning systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network.target ignored (.device units cannot be delayed)
Aug 30 06:31:20 provisioning systemd[1]: Reloading.
Aug 30 06:31:20 provisioning systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network-online.target ignored (.device units cannot be delayed)
Aug 30 06:31:20 provisioning systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network.target ignored (.device units cannot be delayed)
Aug 30 06:31:39 provisioning dbus-daemon[1096]: [system] Reloaded configuration
Aug 30 06:31:39 provisioning dbus-daemon[1096]: message repeated 15 times: [ [system] Reloaded configuration]
Aug 30 06:31:40 provisioning systemd[1]: Reexecuting.
Aug 30 06:31:40 provisioning kernel: [6452333.017505] printk: systemd: 36 output lines suppressed due to ratelimiting
Aug 30 06:31:40 provisioning kernel: [6452333.025951] systemd[1]: systemd 237 running in system mode. (+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN2 +IDN -PCRE2 default-hierarchy=hybrid)
Aug 30 06:31:40 provisioning kernel: [6452333.026024] systemd[1]: Detected virtualization microsoft.
Aug 30 06:31:40 provisioning kernel: [6452333.026029] systemd[1]: Detected architecture x86-64.
Aug 30 06:31:40 provisioning kernel: [6452333.119296] systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network-online.target ignored (.device units cannot be delayed)
Aug 30 06:31:40 provisioning kernel: [6452333.119301] systemd[1]: dev-disk-cloud-azure_resource\x2dpart1.device: Dependency Before=network.target ignored (.device units cannot be delayed)
Aug 30 06:31:40 provisioning systemd[1]: Stopped Wait for Network to be Configured.
Aug 30 06:31:40 provisioning systemd[1]: Stopping Wait for Network to be Configured...
Aug 30 06:31:40 provisioning systemd[1]: Stopping Network Service...
Aug 30 06:31:40 provisioning systemd-timesyncd[759]: Network configuration changed, trying to establish connection.
Aug 30 06:31:40 provisioning systemd[1]: Stopped Network Service.
Aug 30 06:31:40 provisioning systemd[1]: Starting Network Service...
Aug 30 06:31:40 provisioning systemd-timesyncd[759]: Synchronized to time server 91.189.94.4:123 (ntp.ubuntu.com).
Aug 30 06:31:41 provisioning systemd-networkd[20642]: eth0: Gained IPv6LL
Aug 30 06:31:41 provisioning systemd-timesyncd[759]: Network configuration changed, trying to establish connection.
Aug 30 06:31:41 provisioning systemd-networkd[20642]: Enumeration completed
Aug 30 06:31:41 provisioning systemd[1]: Started Network Service.
Aug 30 06:31:41 provisioning systemd[1]: Starting Wait for Network to be Configured...
Aug 30 06:31:41 provisioning systemd[1]: Stopping Network Name Resolution...
Aug 30 06:31:41 provisioning systemd[1]: Stopped Network Name Resolution.
Aug 30 06:31:41 provisioning systemd[1]: Starting Network Name Resolution...
Aug 30 06:31:41 provisioning systemd-networkd-wait-online[20649]: ignoring: lo
Aug 30 06:31:41 provisioning systemd-timesyncd[759]: Synchronized to time server 91.189.94.4:123 (ntp.ubuntu.com).
Aug 30 06:31:41 provisioning systemd-networkd-wait-online[20649]: ignoring: lo
Aug 30 06:31:41 provisioning systemd-timesyncd[759]: Network configuration changed, trying to establish connection.
Aug 30 06:31:41 provisioning systemd[1]: Started Wait for Network to be Configured.
Aug 30 06:31:41 provisioning systemd-timesyncd[759]: Synchronized to time server 91.189.94.4:123 (ntp.ubuntu.com).
Aug 30 06:31:41 provisioning systemd-resolved[20653]: Positive Trust Anchors:
Aug 30 06:31:41 provisioning systemd-resolved[20653]: . IN DS 19036 8 2 49aac11d7b6f6446702e54a1607371607a1a41855200fd2ce1cdde32f24e8fb5
Aug 30 06:31:41 provisioning systemd-resolved[20653]: . IN DS 20326 8 2 e06d44b80b8f1d39a95c0b0d7c65d08458e880409bbc683457104237c7f8ec8d
Aug 30 06:31:41 provisioning systemd-resolved[20653]: Negative trust anchors: 10.in-addr.arpa 16.172.in-addr.arpa 17.172.in-addr.arpa 18.172.in-addr.arpa 19.172.in-addr.arpa 20.172.in-addr.arpa 21.172.in-addr.arpa 22.172.in-addr.arpa 23.172.in-addr.arpa 24.172.in-addr.arpa 25.172.in-addr.arpa 26.172.in-addr.arpa 27.172.in-addr.arpa 28.172.in-addr.arpa 29.172.in-addr.arpa 30.172.in-addr.arpa 31.172.in-addr.arpa 168.192.in-addr.arpa d.f.ip6.arpa corp home internal intranet lan local private test
Aug 30 06:31:41 provisioning systemd-resolved[20653]: Using system hostname 'provisioning'.
Aug 30 06:31:41 provisioning systemd[1]: Started Network Name Resolution.
Aug 30 06:31:41 provisioning systemd[1]: Stopping Network Time Synchronization...
Aug 30 06:31:41 provisioning systemd[1]: Stopped Network Time Synchronization.
Aug 30 06:31:41 provisioning systemd[1]: Starting Network Time Synchronization...
Aug 30 06:31:41 provisioning systemd[1]: Started Network Time Synchronization.
Aug 30 06:31:41 provisioning kernel: [6452333.564134] systemd[1]: Stopping Journal Service...
Aug 30 06:31:41 provisioning kernel: [6452333.564676] systemd-journald[434]: Received SIGTERM from PID 1 (systemd).
Aug 30 06:31:41 provisioning systemd[1]: Stopped Flush Journal to Persistent Storage.
Aug 30 06:31:41 provisioning systemd[1]: Stopping Flush Journal to Persistent Storage...
Aug 30 06:31:41 provisioning kernel: [6452333.754503] systemd[1]: Stopped Journal Service.
Aug 30 06:31:41 provisioning kernel: [6452333.756181] systemd[1]: Starting Journal Service...
Aug 30 06:31:41 provisioning kernel: [6452333.791287] systemd[1]: Started Journal Service.
Aug 30 06:31:41 provisioning systemd[1]: Starting Flush Journal to Persistent Storage...
Aug 30 06:31:41 provisioning systemd[1]: Started Flush Journal to Persistent Storage.
Aug 30 06:31:44 provisioning dbus-daemon[1096]: [system] Reloaded configuration
Aug 30 06:31:47 provisioning python3[1141]: 2022-08-30T06:31:47.178650Z INFO EnvHandler ExtHandler EnvMonitor: Detected dhcp client restart. Restoring routing table.
Aug 30 06:31:47 provisioning python3[1141]: 2022-08-30T06:31:47.179058Z INFO EnvHandler ExtHandler Configure routes
Aug 30 06:31:47 provisioning python3[1141]: 2022-08-30T06:31:47.179171Z INFO EnvHandler ExtHandler Gateway:None
Aug 30 06:31:47 provisioning python3[1141]: 2022-08-30T06:31:47.179278Z INFO EnvHandler ExtHandler Routes:None
Aug 30 06:31:47 provisioning systemd[1]: Started Daily apt upgrade and clean activities.

PID 1141 is the Azure agent.

As we're using custom DNS resolvers, the work-around does not work. But a reboot made the issue go away as the system would re-initialize its network configuration using DHCP

Revision history for this message

Lutz Willek (willek) wrote on 2022-08-30:

#3

Seems to be a duplicate of https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1938791 - same symptoms.

[Workaround]

Reboot the node, DNS should return back to normal.

Revision history for this message

Pieter Lexis (pieter-lexis-tt) wrote on 2022-08-30:

#4

Microsoft has created an incident for this. https://azure.status.microsoft/en-us/status reports:

Azure customers running Canonical Ubuntu 18.04 experiencing DNS errors - Investigating

Starting at approximately 06:00 UTC on 30 Aug 2022, a number of customers running Ubuntu 18.04 (bionic) VMs recently upgraded to systemd version 237-3ubuntu10.54 reported experiencing DNS errors when trying to access their resources. Reports of this issue are confined to this single Ubuntu version.

This bug and a potential fix have been highlighted on the Canonical / Ubuntu site, which we encourage impacted customers to read:

https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1988119

An additional potential workaround customers can consider is to reboot impacted VM instances so that they receive a fresh DHCP lease and new DNS resolver(s).

Any Azure service, including AKS, that uses Canonical Ubuntu version 18.04 of Linux may have some impact from this issue. We are working on mitigations across Azure services that are impacted.

More information will be provided within 60 minutes, when we expect to know more about the root cause and mitigation workstreams.

This message was last updated at 09:20 UTC on 30 August 2022

Revision history for this message

Iain Lane (laney) wrote on 2022-08-30:

#5

I've removed the update from bionic-security and bionic-updates, and restored the versions which were previously in there.

This won't help anyone that has already received the broken update - I think the advice there is to restart, or there is a workaround in the OP here - but it should prevent any further occurrences.

Note that there will be a delay of up to an hour or so for mirrors to receive the deletion.

Revision history for this message

Pieter Lexis (pieter-lexis-tt) wrote on 2022-08-30:

#6

> This won't help anyone that has already received the broken update - I think the advice there is to restart, or there is a workaround in the OP here - but it should prevent any further occurrences.

Do note this is not a solution for those using non-Azure resolvers provided via DHCP through their VNET. These users must reboot or manually set the fallback servers to their custom DNS resolver addresses

Vasili (vasili.namatov) on 2022-08-30

no longer affects:

systemd

Revision history for this message

Lee Van Steerthem (leevs) wrote on 2022-08-30:

#7

Not sure if this is the best place to help people out understanding if nodes are impacted.
We already saw 2 different types of impact on our Azure AKS clusters.
- Pod not able to Terminate
- New images being pulled from ACR (or any container registry

Sometimes it was very clear that we saw the nodes where "Not Ready` in order cases it's very hard to detect.

We have found a way to detect if your nodes are affected.

kubectl logs <pod name>
When you get the following error you know it's impacted: Error from server (InternalError): Internal error occurred: Authorization error (user=masterclient, verb=get, resource=nodes, subresource=proxy)

So restarting the node will help and especially if your cluster is sensitive then you can be more granular about the restart.

I hope it helps some visitors from the azure status page

Revision history for this message

Luciano Santos da Silva (lucianosilva7374) wrote on 2022-08-30:

#8

Hey guys, nothing is working. My application has been out since this early morning. We have already tried to restart the nodes, restart the VM, but nothing has been working and we don't have any update from Microsoft. 4 hours ago they said "More information will be provided within 60 minutes, when we expect to know more about the root cause and mitigation workstreams.".

Revision history for this message

Mark Gerrits (skinny79) wrote on 2022-08-30 (last edit on 2022-08-30):

#9

For anyone hitting this issue with AKS clusters: I have embedded the workaround above in a daemonset to avoid having to restart all nodes (for now)

https://gist.github.com/skinny/96e7feb6b347299ebfacaa76295a82e7

- Please check the image+tag used in the daemonset to whatever is available in your cluster.
- Deploy this daemonset to the cluster (default namespace is used)
- After all pods are running 1/1 for a bit, you can delete it
- Images can be pulled again :)

HTH

Revision history for this message

Robert Bopko (zer69) wrote on 2022-08-30 (last edit on 2022-08-30):

#10

For people having this is issue on AKS clusters with custom DNS...

We have done this on all affected nodepools:

$ VMSS=XXX-vmss
$ nodeResourceGroup=XXX-worker
$ az vmss list-instances -g $nodeResourceGroup -n $VMSS --query "[].id" --output tsv | az vmss run-command invoke --scripts "systemd-resolve --set-dns=your_dns --set-dns=your_dns --set-domain=reddog.microsoft.com --interface=eth0" --command-id RunShellScript --ids @-

Revision history for this message

JG (jgentworth) wrote on 2022-08-30:

#11

We are testing this in our AKS clusters now, but we were able to manually scale up a node pool which brought up new "working" nodes. Then manually scaled the pool back down to remove the "non-working" nodes. This left only new nodes up and the services are functioning properly now.

Revision history for this message

Richard Prammer (richardprammer) wrote on 2022-08-30:

#12

Could this be a related issue, when deployment to aks fails, due to a connection refused when pulling images from azure container registry(ImagePullBackOff). This problem started this morning out of the blue.
Credentials for azure container service are ok and about every 20 image pulls I get one, and the container would start.

Revision history for this message

Stefan Zwanenburg (zwaantje) wrote on 2022-08-30 (last edit on 2022-08-30):

#13

> Could this be a related issue, when deployment to aks fails, due to a connection refused when pulling images from azure container registry(ImagePullBackOff).

I you look closer at the message accompanying the ImagePullBackOff, you should see something like:
dial tcp: lookup registry-1.docker.io on [::1]:53: read udp [::1]:36288->[::1]:53: read: connection refused

Port 53 is the port a DNS server usually listens on.

If this is what you're seeing, then yes: your problems are caused by the issue described in here.

Revision history for this message

Mark Lopez (silvenga) wrote on 2022-08-30:

#14

Yes @richardprammer, it appears ImagePullBackOff is one of the symptoms of this issue.

Revision history for this message

William Bergmann Børresen (williambb) wrote on 2022-08-30:

#15

To temporary mitigate the ImagePullBackOff I scaled up a new functional node (DNS wise) and used this command to reconcile the AKS cluster:
az resource update --resource-group <RESOURCE_GROUP> --name <CLUSTER_NAME> --namespace Microsoft.ContainerService --resource-type ManagedClusters

This recovered CoreDNS in the kube-system namespace, which fixed the ImagePullBackOff

Revision history for this message

Liam Macgillavry (cjdmax) wrote on 2022-08-30:

#16

az cli from cmd.exe, something like this for AKS nodes experiencing the issue: az vmss list-instances -g <resourcegroup> -n vmss --query "[].id" --output tsv | az vmss run-command invoke --scripts "echo FallbackDNS=168.63.129.16 >> /etc/systemd/resolved.conf; systemctl restart systemd-resolved.service" --command-id RunShellScript --ids @-

Revision history for this message

Anton Tykhyi (atykhyy) wrote on 2022-08-30:

#17

Is it safe to downgrade from systemd 237-3ubuntu10.54 to the previous 237-3ubuntu10.50?

Andreas Hasenack (ahasenack) on 2022-08-30

tags:

added: regression-update

Revision history for this message

James Adler (jamesadler) wrote on 2022-08-30:

#18

@atykhyy thank you that worked for VMSS!

I also had some VMs without scale sets, fixed those with:

az vm availability-set list -g <resourcegroup> --query "[].virtualMachines[].id" --output tsv | az vm run-command invoke --scripts "echo FallbackDNS=168.63.129.16 >> /etc/systemd/resolved.conf; systemctl restart systemd-resolved.service" --command-id RunShellScript --ids @-

Revision history for this message

Sebastien Tardif (sebastientardifverituity) wrote on 2022-08-30:

#19

Microsoft Support provided fix for AKS, which I also tested successfully is:

kubectl get no -o json | jq -r '.items[].spec.providerID' | cut -c 9- | az vmss run-command invoke --ids @- \
--command-id RunShellScript \
--scripts 'grep nameserver /etc/resolv.conf || { dhclient -x; dhclient -i eth0; sleep 10; pkill dhclient; grep nameserver /etc/resolv.conf; }'

Revision history for this message

Adrian Joian (ajoian-2) wrote on 2022-08-30:

#20

I've added a few alternatives how to fix the problem, mainly using az cli for vmss, ansible or running a daemonset in this gist : https://gist.github.com/naioja/eb8bac307a711e704b7923400b10bc14

Revision history for this message

bob sacamano (bobsacamano) wrote on 2022-08-30:

#21

this worked for us: https://github.com/joaguas/aksdnsfallback#if-the-above-method-fails-because-dhclient-might-stall-another-alternative-is-to-configure-resolved-to-use-a-fallback-dns-server-which-we-can-hardcode-in-its-configuration

Revision history for this message

ForEachToil (foreachtoil) wrote on 2022-08-30:

#22

You can find here some simple Python script to run a command to the VMSS instances for all subscriptions [or filtered ones]: https://github.com/foreachtoil/execute-command-on-all-vmss
I still lack threading, so this might take a little bit.

Matthew Ruffell (mruffell) on 2022-08-31

Changed in systemd (Ubuntu Bionic):
status:	New → In Progress
Changed in systemd (Ubuntu):
status:	Confirmed → Fix Released
Changed in systemd (Ubuntu Bionic):
importance:	Undecided → Critical

Revision history for this message

AMAN PURWAR (aman1159) wrote on 2022-08-31:

#23

manually scaling node pool/reboot nodes solves this issue.

Matthew Ruffell (mruffell) on 2022-08-31

Changed in systemd (Ubuntu Bionic):
assignee:	nobody → Matthew Ruffell (mruffell)
tags:	added: bionic sts

Revision history for this message

Matthew Ruffell (mruffell) wrote on 2022-08-31:

#24

Debdiff for systemd on Bionic Edit (5.3 KiB, text/plain)

Attached is a debdiff for systemd on Bionic which fixes this bug.

description:	updated
summary:	- Update to systemd 237-3ubuntu10.54 broke dns + systemd-udevd: Run net_setup_link on 'change' uevents to prevent DNS + outages on Azure

Revision history for this message

Westerman (corwesterman) wrote on 2022-08-31:

#25

Is there an workaround for Azure Container Apps at this point?

Revision history for this message

Severity1 (johnreilly-pospos) wrote on 2022-08-31:

#26

@sebastientardifverituity the Microsoft support fix you mentioned worked for me.

Revision history for this message

Łukasz Zemczak (sil2100) wrote on 2022-08-31:

#27

@mruffel thank you for the debdiff! With my limited systemd codebase knowledge, this change feels fine. But I agree with the regression potential section of the SRU description - we should make sure that the update is well tested before going out as potentially it can change behavior.

Revision history for this message

Steffen Vinther Sørensen (arihtmtrx) wrote on 2022-08-31:

#28

Sebastien Tardif (sebastientardifverituity) the fix you mentioned works for me, thanks

Revision history for this message

Matthew Ruffell (mruffell) wrote on 2022-08-31:

#29

Hello everyone,

I know there are quite a few people watching this bug, so I will provide a status update.

The test package has been looking good throughout our internal testing, and we have proceeded to build the next systemd update, version 237-3ubuntu10.55, and it is currently in the bionic-security -proposed ppa.

If you would like to help test, that would be greatly appreciated. Please use a fresh VM on Azure, and please don't put the package into production just yet.

Instructions to install (On a Bionic system):
1) sudo add-apt-repository ppa:ubuntu-security-proposed/ppa
2) sudo apt update
3) sudo apt install libnss-systemd libpam-systemd libsystemd0 libudev1 systemd systemd-sysv udev
4) sudo apt-cache policy systemd | grep Installed
Installed: 237-3ubuntu10.55
5) sudo rm /etc/apt/sources.list.d/ubuntu-security-proposed-ubuntu-ppa-bionic.list
6) sudo apt update

From there you can run the reproducer:

$ sudo udevadm trigger && sudo systemctl restart systemd-networkd
$ ping google.com
PING google.com (172.253.122.138) 56(84) bytes of data.
64 bytes from bh-in-f138.1e100.net (172.253.122.138): icmp_seq=1 ttl=103 time=1.67 ms

if you do test, comment here on how it went. Again, please don't put the package into production until it has had a little more testing, and we will get this released to the world as quickly and safely as we can.

Thanks,
Matthew

Revision history for this message

Milan Barton (miba1248) wrote on 2022-08-31:

#30

Hi Matthew,
on our production Ubuntu VM in Azure we have problem to ping google.com
The version of prod Ubuntu is:

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.5 LTS"
NAME="Ubuntu"
VERSION="18.04.5 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.5 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

I have installed new test Ubuntu VM now:

DISTRIB_ID=Ubuntu
DISTRIB_RELEASE=18.04
DISTRIB_CODENAME=bionic
DISTRIB_DESCRIPTION="Ubuntu 18.04.6 LTS"
NAME="Ubuntu"
VERSION="18.04.6 LTS (Bionic Beaver)"
ID=ubuntu
ID_LIKE=debian
PRETTY_NAME="Ubuntu 18.04.6 LTS"
VERSION_ID="18.04"
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
VERSION_CODENAME=bionic
UBUNTU_CODENAME=bionic

ping google.com was working fine.

Then I have applied your steps above and ping google.com is still working fine.

Milan

Revision history for this message

maniak (maruniakl) wrote on 2022-08-31:

#31

I confirm that Sebastiens approach worked also for my AKS instance.
Thank you and to everyone involved, I owe you a pint :)

https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1988119/comments/19

Revision history for this message

Ray Veldkamp (rayveldkamp) wrote on 2022-08-31 (last edit on 2022-08-31):

#32

@mruffell, spinning up a clean Azure 18.04 Bionic VM and following your steps + reproducer, I can confirm DNS and network connectivity work fine after installing systemd from the security proposed ppa:

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 18.04.6 LTS
Release: 18.04
Codename: bionic

$ sudo apt-cache policy systemd | grep Installed
Installed: 237-3ubuntu10.55
$ sudo udevadm trigger && sudo systemctl restart systemd-networkd
$ ping google.com
PING google.com (216.58.214.14) 56(84) bytes of data.
64 bytes from lhr26s05-in-f14.1e100.net (216.58.214.14): icmp_seq=1 ttl=112 time=2.46 ms
64 bytes from lhr26s05-in-f14.1e100.net (216.58.214.14): icmp_seq=2 ttl=112 time=2.87 ms
64 bytes from lhr26s05-in-f14.1e100.net (216.58.214.14): icmp_seq=3 ttl=112 time=2.30 ms

Revision history for this message

Andres Hojman (ahojman) wrote on 2022-08-31 (last edit on 2022-08-31):

#33

Can confirm we get rid off this issue on our Azure AKS setup; by updating our NodePool's OS Image to
AKSUbuntu-1804gen2containerd-2022.08.10 (using k8s version 1.22.11)

RAVI SHANKAR TEKKAM (trsmcse) on 2022-08-31

Changed in systemd (Ubuntu Bionic):
status:	In Progress → Fix Released

Revision history for this message

Luciano Santos da Silva (lucianosilva7374) wrote on 2022-08-31:

#34

I confirm that Mark Gerrits' approach worked also for my AKS instance.
Tahnk you very much.

Revision history for this message

Sander Aerts (vonkenketser) wrote on 2022-08-31:

#35

I just created a new worker nodepool this morning, en redschedulded all pods to the new workers. Solved it for us.

Matthew Ruffell (mruffell) on 2022-08-31

Changed in systemd (Ubuntu Bionic):
status:	Fix Released → Fix Committed

Revision history for this message

Matthew Ruffell (mruffell) wrote on 2022-09-01:

#36

Download full text (5.4 KiB)

The failure mode still exists if "udevadm trigger" has been issued before the package upgrade to systemd 237-3ubuntu10.55.

That is, if unattended-upgrades or the user had installed open-vm-tools, and has not rebooted yet, they will lose network connection on upgrade to 237-3ubuntu10.55.

We need to implement a way to add ID_NET_DRIVER back to the device before the systemd upgrade takes place, otherwise an outage will occur.

Release admins - DO NOT RELEASE systemd 237-3ubuntu10.55 yet.

Tagging block-proposed.

$ ping google.com
PING google.com (142.251.45.110) 56(84) bytes of data.
64 bytes from iad23s04-in-f14.1e100.net (142.251.45.110): icmp_seq=1 ttl=56 time=1.51 ms
64 bytes from iad23s04-in-f14.1e100.net (142.251.45.110): icmp_seq=2 ttl=56 time=1.35 ms
64 bytes from iad23s04-in-f14.1e100.net (142.251.45.110): icmp_seq=3 ttl=56 time=1.17 ms
^C
--- google.com ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2003ms
rtt min/avg/max/mdev = 1.172/1.349/1.516/0.140 ms
azureuser@mruffell-test:~$ sudo apt-cache policy systemd | grep Installed
  Installed: 237-3ubuntu10.53
azureuser@mruffell-test:~$ udevadm info /sys/class/net/eth0 | grep ID_NET_DRIVER
E: ID_NET_DRIVER=hv_netvsc
azureuser@mruffell-test:~$ sudo udevadm trigger
azureuser@mruffell-test:~$ ping google.com
PING google.com (142.251.45.110) 56(84) bytes of data.
64 bytes from iad23s04-in-f14.1e100.net (142.251.45.110): icmp_seq=1 ttl=56 time=2.15 ms
64 bytes from iad23s04-in-f14.1e100.net (142.251.45.110): icmp_seq=2 ttl=56 time=1.21 ms
^C
--- google.com ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1001ms
rtt min/avg/max/mdev = 1.212/1.682/2.152/0.470 ms
azureuser@mruffell-test:~$ udevadm info /sys/class/net/eth0 | grep ID_NET_DRIVER
azureuser@mruffell-test:~$ sudo apt install libnss-systemd libpam-systemd libsystemd0 libudev1 systemd systemd-sysv udev
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following package was automatically installed and is no longer required:
  linux-headers-4.15.0-191
Use 'sudo apt autoremove' to remove it.
Suggested packages:
  systemd-container
The following packages will be upgraded:
  libnss-systemd libpam-systemd libsystemd0 libudev1 systemd systemd-sysv udev
7 upgraded, 0 newly installed, 0 to remove and 8 not upgraded.
Need to get 4497 kB of archives.
After this operation, 8192 B of additional disk space will be used.
Get:1 http://ppa.launchpad.net/ubuntu-security-proposed/ppa/ubuntu bionic/main amd64 libsystemd0 amd64 237-3ubuntu10.55 [205 kB]
Get:2 http://ppa.launchpad.net/ubuntu-security-proposed/ppa/ubuntu bionic/main amd64 libnss-systemd amd64 237-3ubuntu10.55 [105 kB]
Get:3 http://ppa.launchpad.net/ubuntu-security-proposed/ppa/ubuntu bionic/main amd64 libpam-systemd amd64 237-3ubuntu10.55 [107 kB]
Get:4 http://ppa.launchpad.net/ubuntu-security-proposed/ppa/ubuntu bionic/main amd64 systemd amd64 237-3ubuntu10.55 [2915 kB]
Get:5 http://ppa.launchpad.net/ubuntu-security-proposed/ppa/ubuntu bionic/main amd64 udev amd64 237-3ubuntu10.55 [1099 kB]
Get:6 http://ppa.launchpad.net/ubuntu-security-proposed/ppa/ubuntu bionic/main amd64 libudev1 am...

Ubuntu
systemd package

systemd-udevd: Run net_setup_link on 'change' uevents to prevent DNS outages on Azure

Bug Description

Duplicates of this bug

Other bug subscribers

Patches

Remote bug watches

Affects		Status	Importance	Assigned to	Milestone
	systemd (Ubuntu)	Fix Released	Undecided	Unassigned
	Bionic	Fix Released	Critical	Matthew Ruffell

Changed in systemd (Ubuntu Bionic):
status:	Fix Committed → Fix Released

Ubuntusystemd package

systemd-udevd: Run net_setup_link on 'change' uevents to prevent DNS outages on Azure

Bug Description

Duplicates of this bug

Other bug subscribers

Patches

Remote bug watches

Ubuntu
systemd package