udevadm trigger subsystem-match=net doesn't always run rules because of reconfiguration rate-limiting

Bug #1669564 reported by Ryan Harper on 2017-03-02
This bug affects 1 person
Affects         Status  Importance  Assigned to  Milestone
nplan (Ubuntu)          Undecided   Unassigned
Xenial                  Undecided   Unassigned
Zesty                   Undecided   Unassigned
Artful                  Undecided   Unassigned
Bionic                  Undecided   Unassigned

Bug Description

[Impact]
Proper udev trigger behavior following a 'netplan apply' is essential to having all the configuration applied for an interface.

[Test case]
1- Write a netplan configuration file that sets MTU for a device, or renames a device.
2- Run 'netplan apply'
Validate that all the changes were correctly applied: netplan apply will run 'udevadm trigger' for the user, and udev should apply all low-level link changes (MTU and renames).
Make sure to watch out for device renames on devices that are blacklisted for replugging such as mwifiex, XEN VIF, etc.

[Regression potential]
Verification should watch for an MTU that is set to the wrong value, or applied to all interfaces rather than only the interface it was configured for. Users should also check that the network device is renamed correctly and then appears with the correct name and state in both networkd and the ip command.
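
The MTU check from the test case can be sketched as a small shell helper. This is only an illustration: the function names link_mtu and check_mtu are made up for this example, the optional third argument (an alternate sysfs root) exists only to make it testable, and it handles only plain integer MTUBytes values, not values with size suffixes.

```shell
# Sketch only: link_mtu and check_mtu are hypothetical helper names.

# Print the MTUBytes value from a .link file (plain integers only).
link_mtu() {
    sed -n 's/^MTUBytes=\([0-9][0-9]*\)[[:space:]]*$/\1/p' "$1" | tail -n 1
}

# Compare the MTU requested in a .link file with the live value in sysfs.
# $1: .link file, $2: interface name, $3: sysfs root (assumed default: /sys)
check_mtu() {
    want=$(link_mtu "$1")
    [ -n "$want" ] || { echo "no MTUBytes in $1"; return 2; }
    have=$(cat "${3:-/sys}/class/net/$2/mtu") || return 2
    if [ "$want" = "$have" ]; then
        echo "OK: $2 mtu $have"
    else
        echo "MISMATCH: $2 mtu $have, expected $want"
        return 1
    fi
}
```

For the configuration in this report, one would run: check_mtu /run/systemd/network/10-netplan-interface1.link interface1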

---

1. root@ubuntu:~# lsb_release -rd
Description: Ubuntu Zesty Zapus (development branch)
Release: 17.04

2. root@ubuntu:~# apt-cache policy udev
udev:
  Installed: 232-18ubuntu1
  Candidate: 232-18ubuntu1
  Version table:
 *** 232-18ubuntu1 500
        500 http://archive.ubuntu.com/ubuntu zesty/main amd64 Packages
        100 /var/lib/dpkg/status

3. Running 'udevadm trigger --verbose --subsystem-match=net --action=add' should make udev read the .link file /run/systemd/network/10-netplan-interface1.link and apply its MTU setting.

4. When (3) runs during system boot, the MTU is not set; when (3) runs after boot has completed, the MTU is set correctly.

Here's a log from a boot where cloud-init generates a netplan config,
invokes `netplan generate` (which writes out the networkd config),
and then runs udevadm trigger (3). Upon logging in, interface1 has an MTU of 1500. Re-running udevadm trigger at that point runs the rules/link files and updates the MTU.

Note that running udevadm test /sys/class/net/interface1 also applies
the MTU (test probably shouldn't change the interface; I'll file a
bug for that as well).

# journalctl -o short-precise --no-pager -b | grep WARK
Mar 02 19:17:19.839797 ubuntu cloud-init[647]: WARK: ['netplan', '--debug', 'generate']:
Mar 02 19:17:19.839797 ubuntu cloud-init[647]: WARK: ['stat', '/run/systemd/network/10-netplan-interface1.link']:
Mar 02 19:17:19.839797 ubuntu cloud-init[647]: WARK: ['cat', '/run/systemd/network/10-netplan-interface1.link']:
Mar 02 19:17:19.839797 ubuntu cloud-init[647]: WARK: ['systemctl', 'start', '--no-block', 'systemd-udev-trigger.service']:
Mar 02 19:17:19.839797 ubuntu cloud-init[647]: WARK: ['udevadm', 'trigger', '--verbose', '--subsystem-match=net', '--action=add']:

root@ubuntu:~# cat /run/systemd/network/10-netplan-interface1.link
[Match]
MACAddress=52:54:00:12:34:02

[Link]
Name=interface1
WakeOnLan=off
MTUBytes=1492

root@ubuntu:~# ifconfig interface1
interface1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
        inet 10.0.2.100 netmask 255.255.255.0 broadcast 10.0.2.255
        inet6 fe80::5054:ff:fe12:3402 prefixlen 64 scopeid 0x20<link>
        inet6 fec0::5054:ff:fe12:3402 prefixlen 64 scopeid 0x40<site>
        ether 52:54:00:12:34:02 txqueuelen 1000 (Ethernet)
        RX packets 16 bytes 5053 (5.0 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 35 bytes 3287 (3.2 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

root@ubuntu:~# udevadm trigger --verbose --subsystem-match=net --action=add
/sys/devices/pci0000:00/0000:00:04.0/virtio1/net/interface1
/sys/devices/pci0000:00/0000:00:05.0/virtio2/net/interface2
/sys/devices/pci0000:00/0000:00:06.0/virtio3/net/interface0
/sys/devices/virtual/net/lo

root@ubuntu:~# ifconfig interface1
interface1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1492
        inet 10.0.2.100 netmask 255.255.255.0 broadcast 10.0.2.255
        inet6 fe80::5054:ff:fe12:3402 prefixlen 64 scopeid 0x20<link>
        inet6 fec0::5054:ff:fe12:3402 prefixlen 64 scopeid 0x40<site>
        ether 52:54:00:12:34:02 txqueuelen 1000 (Ethernet)
        RX packets 16 bytes 5053 (5.0 KB)
        RX errors 0 dropped 0 overruns 0 frame 0
        TX packets 37 bytes 3504 (3.5 KB)
        TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

ProblemType: Bug
DistroRelease: Ubuntu 17.04
Package: udev 232-18ubuntu1
ProcVersionSignature: Ubuntu 4.10.0-8.10-generic 4.10.0-rc8
Uname: Linux 4.10.0-8-generic x86_64
ApportVersion: 2.20.4-0ubuntu2
Architecture: amd64
Date: Thu Mar 2 19:22:14 2017
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: QEMU Standard PC (i440FX + PIIX, 1996)
ProcEnviron:
 TERM=vt220
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=/boot/vmlinuz-4.10.0-8-generic root=UUID=8bbb84fe-91e8-4a9a-bd91-f6af4793727e ro console=ttyS0
SourcePackage: systemd
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 04/01/2014
dmi.bios.vendor: SeaBIOS
dmi.bios.version: 1.10.1-1ubuntu1
dmi.chassis.type: 1
dmi.chassis.vendor: QEMU
dmi.chassis.version: pc-i440fx-zesty
dmi.modalias: dmi:bvnSeaBIOS:bvr1.10.1-1ubuntu1:bd04/01/2014:svnQEMU:pnStandardPC(i440FX+PIIX,1996):pvrpc-i440fx-zesty:cvnQEMU:ct1:cvrpc-i440fx-zesty:
dmi.product.name: Standard PC (i440FX + PIIX, 1996)
dmi.product.version: pc-i440fx-zesty
dmi.sys.vendor: QEMU

Ryan Harper (raharper) wrote :
Ryan Harper (raharper) on 2017-03-02
description: updated
description: updated
tags: added: rls-z-incoming
Dimitri John Ledkov (xnox) wrote :

Should one not restart systemd-networkd after writing out .link et.al. units?

Dimitri John Ledkov (xnox) wrote :

Please provide full journal log for the boot.

Changed in systemd (Ubuntu):
status: New → Incomplete

On Tue, Mar 7, 2017 at 6:27 AM, Dimitri John Ledkov <email address hidden>
wrote:

> Should one not restart systemd-networkd after writing out .link et.al.
> units?
>

systemd-networkd only deals with .netdev and .network files; .link files
are handled by udev.

This occurs during boot, before systemd-networkd starts.

cloud-init-local.service runs before network-pre.target and writes out
/etc/netplan/nplan.yaml, then invokes `netplan generate`, which writes out
/run/systemd/network/10-netplan-xx.{link,network,netdev} as needed.
However, the cold plug of devices (systemd-udev-trigger.service) happens
prior to cloud-init-local.service, as it's required for mounting the
rootfs, among other things.

The udev service is also used for device renaming, which needs to occur
before starting networkd. As such, in cloud-init-local, after generating
the netplan configuration, we re-trigger the net subsystem events, which
*should* process any .link files.

The bug, I believe, is that during cloud-init-local execution, udev does
*not* process the .link files prior to starting systemd.

I'm currently working around this using the side effect of udevadm
test-builtin net_setup_link, which will process the .link files.

I'll collect a journal and attach it.


Dimitri John Ledkov (xnox) wrote :

On 7 March 2017 at 14:37, Ryan Harper <email address hidden> wrote:
> On Tue, Mar 7, 2017 at 6:27 AM, Dimitri John Ledkov <email address hidden>
> wrote:
>
>> Should one not restart systemd-networkd after writing out .link et.al.
>> units?
>>
>
> systemd-networkd only deals with .netdev and .network files; .link files
> are handled by udev
>
>
> This occurs during boot, before systemd-networkd starts.
>
> cloud-init-local.service runs before network-pre.target and writes out a
> /etc/netplan/nplan.yaml,
> invokes `netplan generate` which writes out
> /run/systemd/network/10-netplan-xx.{.link,network,netdev}
> as needed; however, the cold plug of devices (systemd-udev-trigger.service)

does netplan rename the interfaces? this will then introduce a race,
ideally we'd want to rename interfaces only once.
(E.g. does udev in the initramfs rename the interface from eth0 to e.g.
ens3, and then netplan's .link files rename ens3 again into something
else?)

> happens prior to
> cloud-init-local.service; as it's required for mounting the rootfs among
> other things.
>
> The udev service is also used for device renaming, which would need to
> occur before starting
> networkd. As such, in cloud-init-local, after generating netplan
> configuration, we re-trigger
> the net subsystem events which *should* process any .link files.
>
> The bug, I believe, is that during cloud-init-local execution, udev does
> *not* process the .link files
> prior to starting systemd.
>

Is there a typo here, and/or can you rephrase this assertion?
cloud-init-local is a systemd unit therefore by definition systemd and
udev are running "during cloud-init-local execution".

The rest of the comment sort of makes sense.

> I'm currently working around this using the side effect of udevadm
> test-builtin net_setup_link,
> which will process the .link files.
>
> I'll collect a journal and attach it.

--
Regards,

Dimitri.

Ryan Harper (raharper) wrote :

On Tue, Mar 7, 2017 at 9:41 AM, Dimitri John Ledkov <email address hidden>
wrote:

> On 7 March 2017 at 14:37, Ryan Harper <email address hidden> wrote:
> > On Tue, Mar 7, 2017 at 6:27 AM, Dimitri John Ledkov <
> <email address hidden>>
> > wrote:
> >
> >> Should one not restart systemd-networkd after writing out .link et.al.
> >> units?
> >>
> >
> > systemd-networkd only deals with .netdev and .network files; .link files
> > are handled by udev
> >
> >
> > This occurs during boot, before systemd-networkd starts.
> >
> > cloud-init-local.service runs before network-pre.target and writes out a
> > /etc/netplan/nplan.yaml,
> > invokes `netplan generate` which writes out
> > /run/systemd/network/10-netplan-xx.{.link,network,netdev}
> > as needed; however, the cold plug of devices
> (systemd-udev-trigger.service)
>
> does netplan rename the interfaces?

Netplan does not rename interfaces directly; it writes .link files, which
are processed by udev. cloud-init itself may rename interfaces via
ip link set <dev> name <newname>, because udev refuses to rename
interfaces after they've been renamed once.

> this will then introduce a race,
> ideally we'd want to rename interfaces only once.
>

All of the renames happen in sequence, not in parallel, so I'm not sure I
follow where the race is. The kernel boots with one name for the devices;
udev then renames the interfaces by policy; then, during rootfs boot, any
.link files apply a third name if needed. In my case, all of the .link
files match by MAC, so the interface name does not come into play.

> (E.g. does udev in the initramfs rename the interface from eth0 to e.g.
> ens3, and then netplan's .link files rename ens3 again into something
> else?)
>

Yes, see above.

>
> > happens prior to
> > cloud-init-local.service; as it's required for mounting the rootfs among
> > other things.
> >
> > The udev service is also used for device renaming, which would need to
> > occur before starting
> > networkd. As such, in cloud-init-local, after generating netplan
> > configuration, we re-trigger
> > the net subsystem events which *should* process any .link files.
> >
> > The bug, I believe, is that during cloud-init-local execution, udev does
> > *not* process the .link files
> > prior to starting systemd.
> >
>
> Is there a typo here, and/or can you rephrase this assertion?
> cloud-init-local is a systemd unit therefore by definition systemd and
> udev are running "during cloud-init-local execution".
>

Sure. The cloud-init-local unit will do the following things:

1) write /etc/netplan/nplan.yml
2) exec 'netplan generate'
3) exec 'udevadm trigger --subsystem-match=net'

After (2), we have .link files in /run/systemd/network/ which *should* get
processed by "cold-plugging" the net subsystem (3), but they are not.

After boot is complete, one can log in and re-run (3), and udev will at
that time process the .link files. This bug asks what the difference is
between running (3) under the cloud-init-local unit and running it as a
logged-in user after boot is complete.

We'd kind of need this to work anyway outside of the context of cloud-init-local.

If one sets up netplan rules for a device, and for instance, attempts to set the MTU in networkd .link files, this change would not take effect until the system is rebooted. netplan could trigger --subsystem-match=net; but it's probably best to make sure anything configured in the link is properly (re-)applied when networkd restarts.

Dimitri John Ledkov (xnox) wrote :

I wonder if following should be done (in netplan code that does the re-trigger, or outside after .link files modified):

$ sync
$ udevadm control --reload
$ udevadm trigger --verbose --subsystem-match=net --action=add

This is a hypothesis, to try out first to reproduce the original bug, and then see if extra reload helps udevd to re-read .link files.

On Wed, Mar 8, 2017 at 5:03 AM, Dimitri John Ledkov <email address hidden>
wrote:

> I wonder if following should be done (in netplan code that does the re-
> trigger, or outside after .link files modified):
>
> $ sync
> $ udevadm control --reload
> $ udevadm trigger --verbose --subsystem-match=net --action=add
>
> This is a hypothesis, to try out first to reproduce the original bug,
> and then see if extra reload helps udevd to re-read .link files.
>

I suspect there's something in the subsystem trigger which does not
replay add events on devices (for some reason unknown).

A more focused path is:
udevadm trigger --verbose --action=add /sys/class/net/$iface

Which also fails to get .link files read.

I'm working on writing steps to recreate this; in the meantime I tested
this in a zesty container:

 root@z2:~# cat /etc/netplan/50-cloud-init.yaml
# This file is generated from information provided by
# the datasource. Changes to it will not persist across an instance.
# To disable cloud-init's network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    version: 2
    ethernets:
        eth0:
            dhcp4: true
            mtu: 1492
            match:
              macaddress: '00:16:3e:67:2b:8f'
        eth1:
            addresses:
            - 192.168.23.2/14
            mtu: 9000
            match:
              macaddress: '00:16:3e:b9:7b:7a'

root@z2:~# cat /run/systemd/network/10-netplan-eth0.link
[Match]
MACAddress=00:16:3e:67:2b:8f

[Link]
WakeOnLan=off
MTUBytes=1492

root@z2:~# ifconfig eth0 mtu 4800
root@z2:~# cat /sys/class/net/eth0/mtu
4800
root@z2:~# udevadm control --reload
root@z2:~# cat /sys/class/net/eth0/mtu
4800
root@z2:~# udevadm trigger --verbose --subsystem-match=net --action=add
/sys/devices/virtual/net/eth0
/sys/devices/virtual/net/eth1
/sys/devices/virtual/net/lo
root@z2:~# cat /sys/class/net/eth0/mtu
4800
root@z2:~# udevadm test-builtin net_setup_link /sys/class/net/eth0
calling: test-builtin
=== trie on-disk ===
tool version: 232
file size: 8441068 bytes
header size 80 bytes
strings 1846908 bytes
nodes 6594080 bytes
Load module index
Found container virtualization lxc
timestamp of '/etc/systemd/network' changed
timestamp of '/run/systemd/network' changed
Parsed configuration file /lib/systemd/network/99-default.link
Parsed configuration file /run/systemd/network/10-netplan-eth1.link
Parsed configuration file /run/systemd/network/10-netplan-eth0.link
Created link configuration context.
ID_NET_DRIVER=veth
Assertion 'udev_device' failed at ../src/libudev/libudev-device.c:128,
function udev_device_get_driver(). Ignoring.
Config file /run/systemd/network/10-netplan-eth0.link applies to device eth0
Could not set WakeOnLan of eth0 to off: Operation not supported
ID_NET_LINK_FILE=/run/systemd/network/10-netplan-eth0.link
Unload module index
Unloaded link configuration context.
root@z2:~# cat /sys/class/net/eth0/mtu
1492

Here's the journal from a boot which fails.

Ryan Harper (raharper) wrote :

Script to generate a zesty cloud-image that can recreate the issue.

Ryan Harper (raharper) wrote :

Script which launches the image created with the previous script to recreate issue.

Ryan Harper (raharper) wrote :

I've attached the requested journal. I've also attached two scripts which can be used to recreate the issue for further investigation.

Changed in systemd (Ubuntu):
status: Incomplete → New
tags: added: rls-aa-incoming
removed: rls-z-incoming
Dimitri John Ledkov (xnox) wrote :

Looking at the journal files from the netplan-udev that fail, I see that links are already renamed, before cloud-init renders things:

Mar 09 17:51:12 ubuntu kernel: virtio_net virtio0 ens3: renamed from eth0
Mar 09 17:51:12 ubuntu kernel: virtio_net virtio1 ens4: renamed from eth1
Mar 09 17:51:14 ubuntu kernel: virtio_net virtio0 interface0: renamed from ens3
Mar 09 17:51:14 ubuntu kernel: virtio_net virtio1 interface1: renamed from ens4
Mar 09 17:51:15 ubuntu cloud-init[382]: WARK: ['netplan', 'generate']: stdout:
Mar 09 17:51:15 ubuntu cloud-init[382]: WARK: ['udevadm', 'trigger', '--verbose', '--subsystem-match=net', '--action=add']: stdout:

This looks to me like a test case that did not fail, or stray files were left over during this test run.

A full sosreport might be helpful, maybe there are persistent-net rules written out, in addition to link files?

I will try to use your scripts to recreate the bug.

On Tue, May 2, 2017 at 3:05 AM, Dimitri John Ledkov <email address hidden>
wrote:

> Looking at the journal files from the netplan-udev that fail, I see that
> links are already renamed, before cloud-init renders things:
>
> Mar 09 17:51:12 ubuntu kernel: virtio_net virtio0 ens3: renamed from eth0
> Mar 09 17:51:12 ubuntu kernel: virtio_net virtio1 ens4: renamed from eth1
> Mar 09 17:51:14 ubuntu kernel: virtio_net virtio0 interface0: renamed from
> ens3
> Mar 09 17:51:14 ubuntu kernel: virtio_net virtio1 interface1: renamed from
> ens4
> Mar 09 17:51:15 ubuntu cloud-init[382]: WARK: ['netplan', 'generate']:
> stdout:
> Mar 09 17:51:15 ubuntu cloud-init[382]: WARK: ['udevadm', 'trigger',
> '--verbose', '--subsystem-match=net', '--action=add']: stdout:
>

> This seems to me as a test case that did not fail, or there are stray
> files left during this test run.
>

The renames are not the issue; it's the .link files which set things like
MTU.

cloud-init may issue ip link set dev <dev> name <name> commands to apply a
new name if it's not already set, but won't issue them if the names are
what it expects.

>
> A full sosreport might be helpful, maybe there are persistent-net rules
> written out, in addition to link files?
>

I can collect a sosreport, but I don't think it provides anything
additional. The core issue is inside systemd itself, w.r.t. how the udev
code is invoked by systemd-udevd.

Hm.

udevd applies MTU property from the link file in the link_config_apply() function, which is called by builtin_net_setup_link(), from the net_setup_link builtin.

Reading the conditions for builtin_net_setup_link_init() and builtin_net_setup_link_validate(), the link configuration context is never recreated; instead, paths_check_timestamp(link_dirs, &ctx->link_dirs_ts_usec, false) is used to determine whether things need to be reloaded. It checks the time of last modification (st_mtim) of the directories that hold the link files.

However, udevd only calls validate on the builtins at most every 3 seconds (see udevd.c). Thus, if one changes link files (e.g. the modification time on the directory is less than 3 seconds after the last udev scan) and then triggers 80-net-setup-link to run, the link files will not be re-read and nothing new will happen.

From the original logs, it seems like everything happens within the same second on boot. Hence the race.
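
To make the timing argument concrete, here is a toy model of the rate limit as described above. It is a sketch under stated assumptions, not udevd's actual logic: the 3-second window is taken from this comment, timestamps are whole epoch seconds, and the names would_revalidate and seconds_until_revalidate are invented for illustration.

```shell
# Toy model of the rate limit described in this comment (not udevd code).
# Assumption: validation is skipped if fewer than 3 seconds have passed.
VALIDATE_INTERVAL=3

# Succeeds if a trigger at time $2 would lead to a re-validation, given
# that the last validation happened at time $1 (both in epoch seconds).
would_revalidate() {
    [ $(( $2 - $1 )) -ge "$VALIDATE_INTERVAL" ]
}

# Print how many more seconds one would have to wait at time $2.
seconds_until_revalidate() {
    remaining=$(( VALIDATE_INTERVAL - ($2 - $1) ))
    [ "$remaining" -gt 0 ] || remaining=0
    echo "$remaining"
}
```

In this model, a trigger issued one second after the last validation (would_revalidate 100 101) is too early, which matches a boot log where everything happens within the same second.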

It would be interesting to boot the bad instances with udev debugging enabled, to observe the messages related to reloading net link configuration and/or applying it, e.g. the interesting messages are:

Created link configuration context.
Unloaded link configuration context.
Check if link configuration needs reloading.

Udev debugging can be set by changing /etc/udev/udev.conf and use udev_log="debug", or by booting with udev.log-priority=debug on the kernel command line.

Also it would be interesting to see, if we can sleep for 3 seconds, before triggering udevadm add.

I understand that hardcoding 3 second sleep is sub-optimal, this is purely to establish if the above analysis is a complete red-herring or not =)
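
Once debug logging is enabled, the three messages listed above could be pulled out of the log with a filter along these lines. This is a minimal sketch: the function name link_reload_messages is made up, and it simply matches the literal message strings quoted in this comment.

```shell
# Sketch: keep only the link-configuration messages quoted above.
# Reads log text on stdin; link_reload_messages is a hypothetical name.
link_reload_messages() {
    grep -F -e 'Created link configuration context.' \
            -e 'Unloaded link configuration context.' \
            -e 'Check if link configuration needs reloading.'
}
```

For example: journalctl -b -u systemd-udevd --no-pager | link_reload_messages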

Dimitri John Ledkov (xnox) wrote :

Ideally, the following should happen:
* boot
* Created link configuration context
<netplan creates/changes .link files to have a new MTU setting>
* Check if link configuration needs reloading -> appears in the debug logs
* New MTU is successfully applied

If the 'Check ...' is missing from the debug logs, after netplan has run, udevd will not reload the configs.

I also would have thought that calling udevadm control --reload would force it to reload the contexts for the builtins.

From the original bug report description there is a call to:
'systemctl', 'start', '--no-block', 'systemd-udev-trigger.service'

But if one is doing that, to avoid races, one should call udevadm settle -t 3 before re-triggering add.

These things are possibly red herrings too.

On Tue, May 2, 2017 at 8:54 AM, Dimitri John Ledkov <email address hidden>
wrote:

> Ideally, the following should happen:
> * boot
> * Created link configuration context
> <netplan creates/changes .link files to have a new MTU setting>
> * Check if link configuration needs reloading -> appears in the debug logs
> * New MTU is successfully applied
>

Let me put in the cloud-init sequence and see if we can figure out what may
be missing.

* boot
* systemd-udev* sockets, trigger, settle, udevd , cloud-init-local : run
* cloud-init writes a netplan.yaml
* cloud-init invokes netplan-generate
* system reaches sysinit.target

Looking at the deps on the systemd-udev units, there's no strict ordering
in cloud-init that says it would run before, or strictly after, the
udev-related units.

Your note about udev ignoring files updated less than 3 seconds ago,
together with systemd's variability in unit ordering due to missing
explicit Before/After directives, might explain the race.

I'd prefer not to wait for 3 seconds just because, so it may be useful to
see if we can run cloud-init before udev; however, it's not clear whether
invoking udev settle in cloud-init (which is done in various places,
directly or indirectly through calling programs like blkid or other
system programs) would end up waiting.

We could also explicitly order cloud-init-local after the udevd service,
which depends on all other udev units; however, if cloud-init runs
immediately after it, the 3 seconds may not have passed.

I like running after udevd, but I would like to see if we can
force/configure udev not to wait those 3 seconds.

> If the 'Check ...' is missing from the debug logs, after netplan has
> run, udevd will not reload the configs.
>
> I also would have thought that calling udevadm control --reload would
> force it to reload the contexts for the builtins.
>
> From the original bug report description there is a call to:
> 'systemctl', 'start', '--no-block', 'systemd-udev-trigger.service'
>
> But if one is doing that, to avoid races, one should call udevadm settle
> -t 3 before re-triggering add.
>

Note that -t 3 (--timeout=3) only sets the maximum wait time; this means
that if it processes any event within those 3 seconds, we may not have
waited long enough when udev re-reads link files and says it has not been
3 seconds since the last time.

The 3-second re-read interval is arbitrary and ideally should be replaced
by content hashing or use of inotify, so that changes to a .link file are
picked up automatically without hacky things like sleep 3.
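
As a rough illustration of the content-hashing idea (not something udev does today, as far as this thread shows), a change detector could digest the names and contents of the .link files instead of comparing timestamps. The function name link_dir_digest is invented, and sha256sum from coreutils is assumed to be available.

```shell
# Sketch of change detection by content hashing rather than timestamps.
# link_dir_digest is a hypothetical helper; assumes coreutils sha256sum.
# Prints one digest covering the names and contents of all .link files
# found under directory $1.
link_dir_digest() {
    (
        cd "$1" || exit 1
        find . -type f -name '*.link' | LC_ALL=C sort | while IFS= read -r f; do
            printf '%s\n' "$f"
            cat "$f"
        done
    ) | sha256sum | cut -d' ' -f1
}
```

Comparing the digest taken before and after `netplan generate` would show whether any .link file actually changed, no matter how quickly the two steps follow each other.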

> These things are possibly red herrings too.

Steve Langasek (vorlon) on 2017-07-27
summary: - udevadm trigger subsystem-match=net doesn't always run rules
+ udevadm trigger subsystem-match=net doesn't always run rules because of
+ reconfiguration rate-limiting
Changed in systemd (Ubuntu):
importance: Undecided → High
tags: removed: rls-aa-incoming
Ryan Harper (raharper) wrote :

Revisiting this on an artful image: nothing besides the driver replug (what netplan apply does) appears to work to process .link files. I suspect something changed in systemd w.r.t. the test-builtin net_setup_link path that would process .link files.

Changed in systemd (Ubuntu Artful):
assignee: nobody → Balint Reczey (rbalint)
Dimitri John Ledkov (xnox) wrote :

Something similar was reported upstream recently, and I learned that one should actually call `udevadm control --reload` whenever configuration is changed, as upstream knows that all of their internal state caching and reloading is racy.

Could you please modify your reproducer test case to have: udevadm control --reload, before calling trigger action and check if this resolves your issue?

If true, I believe netplan must call udevadm control --reload after writing out anything that udevd reads, which is udev .rules files and .link files.
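
The proposed ordering could be wrapped roughly as follows. This is a sketch, not netplan's code: the function name reload_and_trigger_net and the UDEVADM override variable are made up for illustration (the override allows a dry run without root).

```shell
# Sketch of the reload-then-trigger ordering suggested above.
# UDEVADM is a hypothetical override, e.g. UDEVADM=echo for a dry run.
: "${UDEVADM:=udevadm}"

reload_and_trigger_net() {
    # Ask udevd to drop its cached rules and link configuration first...
    "$UDEVADM" control --reload || return 1
    # ...then replay "add" events for network devices.
    "$UDEVADM" trigger --subsystem-match=net --action=add
}
```

Running it with UDEVADM=echo prints the two command lines instead of executing them.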

tags: added: id-597a09900a9f730ee1bfade0
Dimitri John Ledkov (xnox) wrote :

@original reporter have you added `udevadm control --reload` in the appropriate points in the test-harness and/or cloud-init? And does this resolve the race you have previously observed?

Unassigning, Removing artful series target and marking incomplete, until further information is provided.

no longer affects: systemd (Ubuntu Artful)
no longer affects: nplan (Ubuntu Artful)
Changed in systemd (Ubuntu):
assignee: Balint Reczey (rbalint) → nobody
status: New → Incomplete
Changed in nplan (Ubuntu):
status: New → Incomplete
Ryan Harper (raharper) wrote :

I re-ran the recreate on a daily artful image, updated cloud-init in the image to use the udevadm trigger as before and noticed that it still fails to apply the MTU to the second interface.

Taking the suggestion of including a udevadm control reload, I further modified the image to add that reload instruction.

When using the reload then I can confirm that the MTU setting is applied. It appears that netplan indeed should run the reload operation in the 'netplan generate' call.

ubuntu@ubuntu:/$ cat /etc/cloud/build.info
build_name: server
serial: 20171003
ubuntu@ubuntu:/$ apt-cache policy cloud-init
cloud-init:
  Installed: 17.1-0ubuntu1
  Candidate: 17.1-0ubuntu1
  Version table:
 *** 17.1-0ubuntu1 500
        500 http://archive.ubuntu.com/ubuntu artful/main amd64 Packages
        100 /var/lib/dpkg/status
ubuntu@ubuntu:/$
ubuntu@ubuntu:/$ cat /run/systemd/network/*.link
[Match]
MACAddress=52:54:00:12:34:00

[Link]
Name=interface0
WakeOnLan=off
[Match]
MACAddress=52:54:00:12:34:02

[Link]
Name=interface1
WakeOnLan=off
MTUBytes=1492
ubuntu@ubuntu:/$ cat /sys/class/net/interface0/mtu
1500
ubuntu@ubuntu:/$ cat /sys/class/net/interface1/mtu
1492
ubuntu@ubuntu:/$ grep udevadm /var/log/cloud-init.log
2017-10-04 21:05:40,636 - util.py[DEBUG]: Running command ['udevadm', 'control', '--reload'] with allowed return codes [0] (shell=False, capture=True)
2017-10-04 21:05:40,642 - util.py[DEBUG]: Running command ['udevadm', 'trigger', '--verbose', '--subsystem-match=net', '--action=add'] with allowed return codes [0] (shell=False, capture=True)

Changed in nplan (Ubuntu):
status: Incomplete → New
Changed in systemd (Ubuntu):
status: Incomplete → Invalid
Changed in nplan (Ubuntu):
status: New → Triaged
no longer affects: systemd (Ubuntu)
no longer affects: systemd (Ubuntu Xenial)
no longer affects: systemd (Ubuntu Zesty)
no longer affects: systemd (Ubuntu Artful)
no longer affects: systemd (Ubuntu Bb-series)
Dimitri John Ledkov (xnox) wrote :

I agree, made a merge proposal to the same effect.

Dimitri John Ledkov (xnox) wrote :

I have merged the netplan code to that effect in netplan master now - i.e. generate reloads udevadm rules.

However, I have now realised something.

MTUBytes can be set in two places: the Link.MTUBytes key in the .link file and the Link.MTUBytes key in the .network file.

The .link setting is applied by udev, whereas the .network setting is applied by networkd on link up.

I guess it matters if the link is managed by networkd subsequently or not. I wonder if we want to set mtu in .link file, .network, or both. Such that mtu is correct if networkd doesn't manage the interface (.link file alone), and the mtu is corrected upon link up too (.network setting) if it is changed, given that .link files are not.
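
For illustration, the two places would look roughly like this. In both file types the key sits in a [Link] section, as in the .link files shown earlier in this report. The .network file name and its Match section are assumptions made up for this example, not output from netplan.

```ini
# /run/systemd/network/10-netplan-interface1.link (read by udev)
[Match]
MACAddress=52:54:00:12:34:02

[Link]
Name=interface1
MTUBytes=1492

# Hypothetical /run/systemd/network/10-netplan-interface1.network (read by networkd)
[Match]
Name=interface1

[Link]
MTUBytes=1492
```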

On Wed, Nov 1, 2017 at 8:11 AM, Dimitri John Ledkov <email address hidden>
wrote:

> I have merged the netplan code to that effect in netplan master now -
> i.e. generate reloads udevadm rules.
>
> However, I have now realised something.
>
> MTUBytes can be set in two places: the Link.MTUBytes key in the .link file
> and the Link.MTUBytes key in the .network file.
>
> The .link setting is applied by udev, whereas the .network setting is
> applied by networkd on link up.
>

It used to be that networkd did not apply any MTU values from .network
files. At the time, the MTU was only applied via .link files. networkd
applying the MTU outside of the .link file seems a bit like a bug to me;
in general the MTU is a property of the device.

> I guess it matters if the link is managed by networkd subsequently or
> not. I wonder if we want to set mtu in .link file, .network, or both.
> Such that mtu is correct if networkd doesn't manage the interface (.link
> file alone), and the mtu is corrected upon link up too (.network
> setting) if it is changed, given that .link files are not.
>

I think it's generally a bug to have the setting in multiple places; that
just can't be the right thing, and it smells like it's working around some
fundamental issue with networkd and udev (which is what consumes the .link
files).


Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nplan - 0.32

---------------
nplan (0.32) bionic; urgency=medium

  * src/nm.c: better handle the UUID generation; the order of iterating
    through interfaces may affect things here. Also make sure the tests catch
    a null UUID.

 -- Mathieu Trudel-Lapierre <email address hidden> Tue, 14 Nov 2017 08:53:51 -0500

Changed in nplan (Ubuntu Bionic):
status: New → Fix Released
Łukasz Zemczak (sil2100) wrote :

Can we get the description to follow https://wiki.ubuntu.com/StableReleaseUpdates#SRU_Bug_Template ? Thank you!

description: updated
description: updated

Hello Ryan, or anyone else affected,

Accepted nplan into zesty-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nplan/0.32~17.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-zesty to verification-done-zesty. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-zesty. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nplan (Ubuntu Zesty):
status: New → Fix Committed
tags: added: verification-needed verification-needed-zesty
Łukasz Zemczak (sil2100) wrote :

Hello Ryan, or anyone else affected,

Accepted nplan into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nplan/0.32~16.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nplan (Ubuntu Xenial):
status: New → Fix Committed
tags: added: verification-needed-xenial

I cannot verify this one yet; the previous SRU to xenial is blocking it:

https://bugs.launchpad.net/ubuntu/+source/nplan/+bug/1713142

nplan 0.32~16.04.1 fails to build because I mismerged 0.32 and broke the code that skips the test_routes_v6 test in the NetworkManager case. Therefore, it can't possibly pass SRU verification.

tags: added: verification-failed-xenial
removed: verification-needed-xenial

Hello Ryan, or anyone else affected,

Accepted nplan into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nplan/0.32~16.04.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-needed-xenial
removed: verification-failed-xenial

Autopkgtests are still failing for xenial; the test is still not being skipped (we know it won't work on Xenial due to the version of NM shipped there). Marking verification-failed-xenial.

tags: added: verification-failed-xenial
removed: verification-needed-xenial
Łukasz Zemczak (sil2100) wrote :

Hello Ryan, or anyone else affected,

Accepted nplan into artful-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nplan/0.32~17.10.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-artful to verification-done-artful. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-artful. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

Changed in nplan (Ubuntu Artful):
status: Triaged → Fix Committed
tags: added: verification-needed-artful
Łukasz Zemczak (sil2100) wrote :

Hello Ryan, or anyone else affected,

Accepted nplan into xenial-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/nplan/0.32~16.04.3 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-xenial to verification-done-xenial. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-xenial. In either case, details of your testing will help us make a better decision.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance!

tags: added: verification-needed-xenial
removed: verification-failed-xenial

verification-done for xenial 0.32~16.04.3, and zesty 0.32~17.04.1:

netplan appears to behave correctly w.r.t. setting the MTU; using the example from Ryan, the MTU appears to be correctly applied every time.
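
A check like this can be scripted. The following is only a sketch of one way to do it, not the procedure used for this verification: it assumes the kernel exposes the current MTU at /sys/class/net/<iface>/mtu, and the function names and the interface names in the usage note are hypothetical.

```python
from pathlib import Path


def read_mtu(iface: str, sysfs_root: str = "/sys/class/net") -> int:
    """Return the MTU currently reported for iface under sysfs_root."""
    return int((Path(sysfs_root) / iface / "mtu").read_text().strip())


def mtu_mismatches(expected: dict, sysfs_root: str = "/sys/class/net") -> dict:
    """Map interface name -> (expected, actual) for each interface whose MTU differs."""
    bad = {}
    for iface, want in expected.items():
        got = read_mtu(iface, sysfs_root)
        if got != want:
            bad[iface] = (want, got)
    return bad
```

Calling mtu_mismatches({"interface1": 9000, "interface2": 1500}) after 'netplan apply' would return an empty dict when every listed interface has the expected value. Listing the interfaces that should not have changed also covers the regression case of an MTU being applied to all interfaces.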

tags: added: verification-done-artful verification-done-xenial verification-done-zesty
removed: verification-needed verification-needed-artful verification-needed-xenial verification-needed-zesty

verification-done for artful with 0.32~17.10.1:

As above, MTU changes are applied correctly.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nplan - 0.32~17.10.1

---------------
nplan (0.32~17.10.1) artful; urgency=medium

  * Backport 0.32 to Ubuntu 17.10. (LP: #1713142)

nplan (0.32) bionic; urgency=medium

  * src/nm.c: better handle the UUID generation; the order of iterating
    through interfaces may affect things here. Also make sure the tests catch
    a null UUID.

nplan (0.31) bionic; urgency=medium

  [ Mathieu Trudel-Lapierre ]
  * src/nm.c: generate a UUID for a connection only as needed; when we're
    dealing with NM VLANs. (LP: #1712921)
  * debian/tests/autostart: Make the autostart test more verbose and avoid
    failing right from the start when systemd-networkd is disabled.
    (LP: #1699371)
  * tests/integration.py: bump the NetworkManager timeout for settling to
    120 seconds, autopkgtest infrastructure tends to be a little slow for the
    network device configuration to be applied and noticed by NM.
    (LP: #1699371)

  [ Dimitri John Ledkov ]
  * Reload udevd to invalidate configuration cache of .rules/.link files
    as generate step may have changed them. LP: #1669564

  [ Dan Streetman ]
  * Add another interface driver exception to netplan replug to prevent unbind
    of the Xen VIF interfaces. (LP: #1729573)

 -- Mathieu Trudel-Lapierre <email address hidden> Thu, 23 Nov 2017 12:30:51 -0500
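
The "Reload udevd" entry above is the change relevant to this bug. As a rough sketch of the idea only (this is not the code shipped in nplan): reload udevd so it drops its cached .rules/.link configuration, then re-trigger the net subsystem with the same trigger arguments as in the bug description. The function names, the ordering and the final 'udevadm settle' are assumptions made for illustration.

```python
import subprocess


def udev_refresh_commands(subsystem: str = "net") -> list:
    """Commands to make udevd re-read its configuration, then replay 'add' events."""
    return [
        ["udevadm", "control", "--reload"],
        ["udevadm", "trigger", f"--subsystem-match={subsystem}", "--action=add"],
        ["udevadm", "settle"],  # assumption: wait for the event queue to drain
    ]


def refresh_udev(subsystem: str = "net", run=subprocess.check_call) -> None:
    """Run the commands in order; 'run' can be replaced for dry runs or tests."""
    for cmd in udev_refresh_commands(subsystem):
        run(cmd)
```

Passing run=print gives a dry run that only shows the commands; the default needs root and a running udevd.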

Changed in nplan (Ubuntu Artful):
status: Fix Committed → Fix Released

The verification of the Stable Release Update for nplan has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nplan - 0.32~16.04.3

---------------
nplan (0.32~16.04.3) xenial; urgency=medium

  * tests/integration.py: Really fix skipping test_routes_v6 for the NM
    backend.

nplan (0.32~16.04.2) xenial; urgency=medium

  * tests/integration.py: Fix test_routes_v6 that I clobbered when I re-applied
    the skip rules for 16.04 after merging in 0.32.

nplan (0.32~16.04.1) xenial; urgency=medium

  * Backport netplan 0.32 to 16.04. (LP: #1713142)
  * debian/control: Depend on systemd (>= 229-4ubuntu20) for the PrimarySlave
    feature backported in that revision.
  * tests/integration.py: Skip tests that are still not yet supported in xenial

nplan (0.32) bionic; urgency=medium

  * src/nm.c: better handle the UUID generation; the order of iterating
    through interfaces may affect things here. Also make sure the tests catch
    a null UUID.

nplan (0.31) bionic; urgency=medium

  [ Mathieu Trudel-Lapierre ]
  * src/nm.c: generate a UUID for a connection only as needed; when we're
    dealing with NM VLANs. (LP: #1712921)
  * debian/tests/autostart: Make the autostart test more verbose and avoid
    failing right from the start when systemd-networkd is disabled.
    (LP: #1699371)
  * tests/integration.py: bump the NetworkManager timeout for settling to
    120 seconds, autopkgtest infrastructure tends to be a little slow for the
    network device configuration to be applied and noticed by NM.
    (LP: #1699371)

  [ Dimitri John Ledkov ]
  * Reload udevd to invalidate configuration cache of .rules/.link files
    as generate step may have changed them. LP: #1669564

  [ Dan Streetman ]
  * Add another interface driver exception to netplan replug to prevent unbind
    of the Xen VIF interfaces. (LP: #1729573)

nplan (0.30) artful; urgency=medium

  * Add an "optional" syntax node for now to all devices. This is unimplemented
    for now, but intended to allow users to mark some devices as optional: to
    make sure they do not delay boot when configured. (LP: #1664844)

nplan (0.29) artful; urgency=medium

  * Fix autopkgtests in a world where /run/NetworkManager/conf.d already
    exists. nplan is enabled by default, so it might well have the directory
    already created on the filesystem.

nplan (0.28) artful; urgency=medium

  * Revert 56cd3eec which disabled IPv6 Router Advertisements by default. It
    broke default network config in LXD and was contrary to the defaults used
    by the kernel. Reopens LP: 1655440. (LP: #1717404)
  * Add "accept-ra:" key for all device types; this will default to OFF but
    allow users to disable processing Router Advertisements when required by
    their network setup. (LP: #1655440)

nplan (0.27) artful; urgency=medium

  [ Mathieu Trudel-Lapierre ]
  * Fix crash in systemd generator if called by a user on the command-line
  * coverage: fix exclusions to properly not cover our "never reached defaults"

  [ Dimitri John Ledkov ]
  * tests/integration.py: In teardown, stop systemd-networkd.socket.
  * src/networkd.c: Set UseMTU=true by default, whenever DHCP is in use.
    (LP: #1717471)
  * tests/integration.py: fix resolved detection.

nplan (0.26) artful; urgency=medium

 ...


Changed in nplan (Ubuntu Xenial):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package nplan - 0.32~17.04.1

---------------
nplan (0.32~17.04.1) zesty; urgency=medium

  * Backport 0.32 to 17.04. (LP: #1713142)

nplan (0.32) bionic; urgency=medium

  * src/nm.c: better handle the UUID generation; the order of iterating
    through interfaces may affect things here. Also make sure the tests catch
    a null UUID.

nplan (0.31) bionic; urgency=medium

  [ Mathieu Trudel-Lapierre ]
  * src/nm.c: generate a UUID for a connection only as needed; when we're
    dealing with NM VLANs. (LP: #1712921)
  * debian/tests/autostart: Make the autostart test more verbose and avoid
    failing right from the start when systemd-networkd is disabled.
    (LP: #1699371)
  * tests/integration.py: bump the NetworkManager timeout for settling to
    120 seconds, autopkgtest infrastructure tends to be a little slow for the
    network device configuration to be applied and noticed by NM.
    (LP: #1699371)

  [ Dimitri John Ledkov ]
  * Reload udevd to invalidate configuration cache of .rules/.link files
    as generate step may have changed them. LP: #1669564

  [ Dan Streetman ]
  * Add another interface driver exception to netplan replug to prevent unbind
    of the Xen VIF interfaces. (LP: #1729573)

nplan (0.30) artful; urgency=medium

  * Add an "optional" syntax node for now to all devices. This is unimplemented
    for now, but intended to allow users to mark some devices as optional: to
    make sure they do not delay boot when configured. (LP: #1664844)

nplan (0.29) artful; urgency=medium

  * Fix autopkgtests in a world where /run/NetworkManager/conf.d already
    exists. nplan is enabled by default, so it might well have the directory
    already created on the filesystem.

nplan (0.28) artful; urgency=medium

  * Revert 56cd3eec which disabled IPv6 Router Advertisements by default. It
    broke default network config in LXD and was contrary to the defaults used
    by the kernel. Reopens LP: 1655440. (LP: #1717404)
  * Add "accept-ra:" key for all device types; this will default to OFF but
    allow users to disable processing Router Advertisements when required by
    their network setup. (LP: #1655440)

nplan (0.27) artful; urgency=medium

  [ Mathieu Trudel-Lapierre ]
  * Fix crash in systemd generator if called by a user on the command-line
  * coverage: fix exclusions to properly not cover our "never reached defaults"

  [ Dimitri John Ledkov ]
  * tests/integration.py: In teardown, stop systemd-networkd.socket.
  * src/networkd.c: Set UseMTU=true by default, whenever DHCP is in use.
    (LP: #1717471)
  * tests/integration.py: fix resolved detection.

nplan (0.26) artful; urgency=medium

  * Bonding:
    - Add support for specifying a primary slave. (LP: #1709135)
  * Rebind:
    - Fix brcmfmac harder. Treat any 'brcmfmac' driver as not supporting
      rebind. (LP: #1712224)
  * Autopkgtests:
    - Add allow-stderr. Systemd now bleats about the networkd socket still
      being around and enabled when we restart the service; but we don't need
      to care since we're /restarting/ the service to load the new config.
    - Fix the autostart package to be more sensible: we don't really care if
 ...


Changed in nplan (Ubuntu Zesty):
status: Fix Committed → Fix Released