netplan apply fails if NIC alias exists

Bug #1810043 reported by Don Thornton Jr.
This bug affects 3 people
Affects               Status         Importance   Assigned to   Milestone
Netplan               Fix Released   Undecided    Deltik
netplan.io (Ubuntu)   Fix Released   Undecided    Unassigned
  Bionic              Fix Released   Undecided    Unassigned
  Disco               Fix Released   Undecided    Unassigned

Bug Description

[Impact]
Running 'netplan apply' fails with a traceback when an interface has an alias/label defined for it.

[Test case]
On a system which has a label set for an interface:
ip addr add 192.168.0.1/24 dev eth0 label eth0:0

1) run 'netplan apply'
2) verify that the configuration for netplan can be applied without errors.

[Regression potential]
This change has minimal risk: it only adds an extra check before a subprocess call that would otherwise fail, so that the call is safely skipped. The aim is to avoid crashing in netplan when the files required to run 'net_setup_link' are not available; the rest of the process already runs net_setup_link for the "master" interface, which has its own files in /sys/class/net.
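
The extra check described above can be pictured roughly as follows. This is only a sketch under assumptions, not the code that was merged: the helper name `devices_with_sysfs_entry` and the `sysfs_root` parameter are hypothetical, introduced so the filter can be tried against a temporary directory instead of the real /sys/class/net.

```python
import os
import tempfile


def devices_with_sysfs_entry(devices, sysfs_root="/sys/class/net"):
    """Return only the names that have an entry under sysfs_root.

    Hypothetical helper: labels such as "eth0:0" have no entry of their
    own, so they are dropped here instead of being passed to udevadm.
    """
    return [d for d in devices if os.path.exists(os.path.join(sysfs_root, d))]


# Demonstration against a fake sysfs tree (assumed layout: one entry per link).
with tempfile.TemporaryDirectory() as fake_sysfs:
    os.mkdir(os.path.join(fake_sysfs, "eth0"))
    kept = devices_with_sysfs_entry(["eth0", "eth0:0"], sysfs_root=fake_sysfs)
    print(kept)  # prints ['eth0']
```

Filtering before the call, rather than catching the failure afterwards, keeps genuine udevadm errors on real devices visible.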

---

# netplan apply
Traceback (most recent call last):
  File "/usr/sbin/netplan", line 23, in <module>
    netplan.main()
  File "/usr/share/netplan/netplan/cli/core.py", line 50, in main
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 43, in run
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 93, in command_apply
    stderr=subprocess.DEVNULL)
  File "/usr/lib/python3.6/subprocess.py", line 291, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['udevadm', 'test-builtin', 'net_setup_link', '/sys/class/net/eth0:0']' returned non-zero exit status 4.
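
For readers unfamiliar with the last two frames: `subprocess.check_call` raises `CalledProcessError` whenever the child process exits with a non-zero status, which is how udevadm's exit status 4 surfaces as a Python traceback. A small illustration, using a stand-in child process (the current Python interpreter exiting with status 4) rather than udevadm itself:

```python
import subprocess
import sys

# Stand-in for the failing udevadm call: a child that exits with status 4.
cmd = [sys.executable, "-c", "import sys; sys.exit(4)"]

try:
    subprocess.check_call(cmd, stderr=subprocess.DEVNULL)
    returncode = 0
except subprocess.CalledProcessError as err:
    # check_call turns any non-zero exit status into this exception.
    returncode = err.returncode

print(returncode)  # prints 4
```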

Don Thornton Jr. (donthorntonjr) wrote :

More specifically, the failure occurs if a label is set for an alias, as in:

ifconfig eth0:0 192.168.0.1

 -- or --

ip addr add 192.168.0.1/24 dev eth0 label eth0:0
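
To check whether a system has such labels before running 'netplan apply', the text output of `ip addr show` can be scanned: on `inet` lines the last word is the label, and it differs from the device name when an alias is set. A rough sketch, assuming that output layout; the function name `find_labels` and the sample text are illustrative only.

```python
import re

# Illustrative sample in the layout printed by `ip addr show`.
SAMPLE = """\
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 00:16:3e:01:ac:f3 brd ff:ff:ff:ff:ff:ff
    inet 10.64.229.148/24 brd 10.64.229.255 scope global dynamic eth0
       valid_lft 3575sec preferred_lft 3575sec
    inet 192.168.0.1/24 scope global eth0:0
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fe01:acf3/64 scope link
       valid_lft forever preferred_lft forever
"""


def find_labels(ip_addr_output):
    """Return (device, label) pairs where an IPv4 label differs from the device."""
    labels = []
    device = None
    for line in ip_addr_output.splitlines():
        # Header lines look like "2: eth0: <...>" or "15: eth0@if16: <...>".
        header = re.match(r"^\d+:\s+([^:@\s]+)", line)
        if header:
            device = header.group(1)
            continue
        words = line.split()
        if words and words[0] == "inet" and device is not None:
            label = words[-1]
            if label != device:
                labels.append((device, label))
    return labels


print(find_labels(SAMPLE))  # prints [('eth0', 'eth0:0')]
```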

Ryan Harper (raharper)
Changed in netplan:
status: New → Confirmed
Matt Heller (matthew-f-heller) wrote :

A very similar problem, perhaps the same bug, seems to affect VLAN interfaces. "netplan apply" fails on the first run with the output shown below (and leaves the networking in a bad state), then succeeds the second time it is run.

# netplan apply
Traceback (most recent call last):
  File "/usr/sbin/netplan", line 23, in <module>
    netplan.main()
  File "/usr/share/netplan/netplan/cli/core.py", line 50, in main
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 43, in run
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 93, in command_apply
    stderr=subprocess.DEVNULL)
  File "/usr/lib/python3.6/subprocess.py", line 291, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['udevadm', 'test-builtin', 'net_setup_link', '/sys/class/net/vlan557']' returned non-zero exit status 4.

# netplan apply
(successful the second time)

# lsb_release -d
Description: Ubuntu 18.04.2 LTS

# dpkg -l | grep -i netplan
ii netplan.io 0.40.1~18.04.4 amd64 YAML network conf...

# cat /etc/netplan/network.yaml
network:
  version: 2
  renderer: NetworkManager
  ethernets:
    ext:
      match:
        macaddress: 00:1b:21:xx:xx:xx
      set-name: ext
      mtu: 9000
  vlans:
    vlan557:
      id: 557
      link: ext
      mtu: 9000
      addresses: [172.16.1.100/22]

(network.yaml edited for privacy and brevity)

Deltik (deltik) wrote :

I have a patch that fixes this bug.

================================================================================
EXPLANATION
================================================================================

When using the NetworkManager renderer, the logic in `NetplanApply.command_apply()` tells `nmcli` to disconnect the devices previously populated by `netifaces.interfaces()`.

In this bug, interface "eth0:0" was deleted by NetworkManager, but other NetworkManager-managed devices such as "vlan557" or "br0" may also be deleted.

This means that the `devices` variable may be out-of-date with more devices than actually exist after `nmcli device disconnect` runs.

The attached patch fixes this bug by rescanning the devices after `nmcli device disconnect` so that the upcoming `udevadm test-builtin net_setup_link /sys/class/net/XXX` works with existing devices.
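
In outline, the rescan amounts to listing the interfaces a second time once the disconnect step has run, and only handing names that still exist to udevadm. The sketch below illustrates that idea under assumptions and is not the attached patch: `list_devices` and `disconnect` are hypothetical callables standing in for `netifaces.interfaces()` and the `nmcli device disconnect` calls.

```python
def devices_after_disconnect(list_devices, disconnect):
    """Disconnect every device seen beforehand, then return a fresh listing."""
    stale = list_devices()      # snapshot taken before touching anything
    for name in stale:
        disconnect(name)        # may delete virtual devices as a side effect
    return list_devices()       # rescan instead of reusing the stale snapshot


# Simulated system: disconnecting the virtual devices removes them.
present = ["eth0", "vlan557", "br0"]


def fake_disconnect(name):
    if name in ("vlan557", "br0"):
        present.remove(name)


remaining = devices_after_disconnect(lambda: list(present), fake_disconnect)
print(remaining)  # prints ['eth0']
```

Reusing `stale` for the later udevadm loop would reproduce the failure, since it still names "vlan557" and "br0".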

summary: - netplan apply fails if NIC alias exists
+ netplan apply fails if NetworkManager deletes devices during apply
Deltik (deltik) wrote : Re: netplan apply fails if NetworkManager deletes devices during apply

This bug affects bridges like "br0" as well: https://superuser.com/q/1435615/83694

information type: Public → Public Security
information type: Public Security → Public
Don Thornton Jr. (donthorntonjr) wrote :

@Deltik - applied patch has no effect on original bug: netplan apply fails if NIC alias exists

Deltik (deltik) wrote :

@donthorntonjr: It looks like you're using the networkd renderer. I had only considered the NetworkManager renderer for the patch.

I was able to reproduce your issue with the networkd renderer and will work on a patch that fixes the bug for both networkd and NetworkManager.

Deltik (deltik)
summary: - netplan apply fails if NetworkManager deletes devices during apply
+ netplan apply fails if NIC alias exists
Deltik (deltik) wrote :

@donthorntonjr: Please try the updated patch attached to this comment.

Don Thornton Jr. (donthorntonjr) wrote :

@Deltik - applied most recent patch - confirmed resolves original bug on Ubuntu 18.04.2 LTS

Deltik (deltik)
Changed in netplan:
assignee: nobody → Deltik (deltik)
status: Confirmed → Fix Committed
Deltik (deltik) wrote :

The fix has been merged into CanonicalLtd/netplan: https://github.com/CanonicalLtd/netplan/pull/86

AW01545 (ricknickle) wrote :

I ran into this one on 18.04 because I have a Pacemaker/Corosync virtual IP failover, which aliases the ethernet in the same way. Thanks for the bug fix, I will try to test soon.

Changed in netplan:
status: Fix Committed → Fix Released
Ubuntu Foundations Team Bug Bot (crichton) wrote :

The attachment "Rescans network interfaces after NetworkManager may have deleted some" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package netplan.io - 0.98-0ubuntu1

---------------
netplan.io (0.98-0ubuntu1) eoan; urgency=medium

  * New upstream release: 0.98 (LP: #1840832)
    - Added new "feature flags" to identify new features
    - Added support for "use-domains" for DHCP overrides
    - Added support for setting IPv6 MTU Bytes (LP: #1671951)
    - Added a DBus interface to query and run 'netplan apply' via other apps
    - Various build system fixes
    - Improved validation for bonding modes
    - Added support for "hash:" for hashed 802.1x passwords (LP: #1819831)
    - Tolerate devices without a /sys path (LP: #1810043)
    - Fix incorrect separator for networkd with ARP IP targets (LP: #1829264)
  * debian/control: Add Build-Depends on libsystemd-dev for DBus feature, and
    on dbus-x11 for dbus-launch used in tests.

 -- Mathieu Trudel-Lapierre <email address hidden> Wed, 21 Aug 2019 14:49:16 -0400

Changed in netplan.io (Ubuntu):
status: New → Fix Released
description: updated
Brian Murray (brian-murray) wrote : Please test proposed package

Hello Don, or anyone else affected,

Accepted netplan.io into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/netplan.io/0.98-0ubuntu1~18.04.1 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us in getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in netplan.io (Ubuntu Bionic):
status: New → Fix Committed
tags: added: verification-needed verification-needed-bionic
Deltik (deltik) wrote :

Fix confirmed in LXD container:

============================================================================
BEFORE
============================================================================

root@demo:~# ip -c a
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
15: eth0@if16: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 00:16:3e:01:ac:f3 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 10.64.229.148/24 brd 10.64.229.255 scope global dynamic eth0
       valid_lft 3575sec preferred_lft 3575sec
    inet6 fd42:7a56:79d:e8cb:216:3eff:fe01:acf3/64 scope global dynamic mngtmpaddr noprefixroute
       valid_lft 3593sec preferred_lft 3593sec
    inet6 fe80::216:3eff:fe01:acf3/64 scope link
       valid_lft forever preferred_lft forever
root@demo:~# netplan apply
root@demo:~# ip addr add 192.168.0.1/24 dev eth0 label eth0:0
root@demo:~# netplan apply
Traceback (most recent call last):
  File "/usr/sbin/netplan", line 23, in <module>
    netplan.main()
  File "/usr/share/netplan/netplan/cli/core.py", line 50, in main
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 43, in run
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 106, in command_apply
    stderr=subprocess.DEVNULL)
  File "/usr/lib/python3.6/subprocess.py", line 311, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['udevadm', 'test-builtin', 'net_setup_link', '/sys/class/net/eth0:0']' returned non-zero exit status 4.

============================================================================
FIX
============================================================================

root@demo:~# apt install netplan.io
Reading package lists... Done
Building dependency tree
Reading state information... Done
Suggested packages:
  network-manager | wpasupplicant
The following packages will be upgraded:
  netplan.io
1 upgraded, 0 newly installed, 0 to remove and 21 not upgraded.
Need to get 64.8 kB of archives.
After this operation, 22.5 kB of additional disk space will be used.
Get:1 http://security.ubuntu.com/ubuntu bionic-proposed/main amd64 netplan.io amd64 0.98-0ubuntu1~18.04.1 [64.8 kB]
Fetched 64.8 kB in 0s (247 kB/s)
(Reading database ... 13924 files and directories currently installed.)
Preparing to unpack .../netplan.io_0.98-0ubuntu1~18.04.1_amd64.deb ...
Unpacking netplan.io (0.98-0ubuntu1~18.04.1) over (0.97-0ubuntu1~18.04.1) ...
Setting up netplan.io (0.98-0ubuntu1~18.04.1) ...
Processing triggers for dbus (1.12.2-1ubuntu1.1) ...

============================================================================
AFTER
=======================================================...


tags: added: verification-done-bionic
removed: verification-needed verification-needed-bionic
Mathieu Trudel-Lapierre (cyphermox) wrote :

Verification-done on disco as well:

After applying the update I can successfully run 'netplan apply' on a system on which a label exists; this would otherwise fail even if the interface isn't mentioned in YAML:

ubuntu@oddish:~$ cat /etc/netplan/01-network-manager-all.yaml
# Let NetworkManager manage all devices on this system
network:
  version: 2
  renderer: NetworkManager
ubuntu@oddish:~$ ip addr show wlp58s0
2: wlp58s0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether 44:85:00:1d:8f:df brd ff:ff:ff:ff:ff:ff
    inet 10.3.1.243/22 brd 10.3.3.255 scope global dynamic noprefixroute wlp58s0
       valid_lft 14118sec preferred_lft 14118sec
    inet 192.168.0.1/24 scope global wlp58s0:0
       valid_lft forever preferred_lft forever
    inet6 2001:470:b0cc::78b/128 scope global dynamic noprefixroute
       valid_lft 28528sec preferred_lft 6928sec
    inet6 fe80::3070:bc33:a203:cbc3/64 scope link noprefixroute
       valid_lft forever preferred_lft forever

Before:

ubuntu@oddish:~$ sudo netplan apply
Traceback (most recent call last):
  File "/usr/sbin/netplan", line 23, in <module>
    netplan.main()
  File "/usr/share/netplan/netplan/cli/core.py", line 50, in main
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 43, in run
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 130, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 106, in command_apply
    stderr=subprocess.DEVNULL)
  File "/usr/lib/python3.7/subprocess.py", line 347, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['udevadm', 'test-builtin', 'net_setup_link', '/sys/class/net/wlp58s0:0']' returned non-zero exit status 1.

After:

ubuntu@oddish:~$ sudo netplan apply
[prompt returns with no error]

tags: added: verification-done-disco
Changed in netplan.io (Ubuntu Disco):
status: New → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package netplan.io - 0.98-0ubuntu1~19.04.1

---------------
netplan.io (0.98-0ubuntu1~19.04.1) disco; urgency=medium

  * Backport netplan.io 0.98 to 19.04. (LP: #1840832)

 -- Mathieu Trudel-Lapierre <email address hidden> Mon, 26 Aug 2019 16:41:36 -0400

Changed in netplan.io (Ubuntu Disco):
status: Fix Committed → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package netplan.io - 0.98-0ubuntu1~18.04.1

---------------
netplan.io (0.98-0ubuntu1~18.04.1) bionic; urgency=medium

  * Backport netplan.io 0.98 to 18.04. (LP: #1840832)
  * Keep patches specific to 18.04 support:
    - disable-networkd-tunnels-ipip-gre.patch: disable tests for unsupported
      tunnel types (ipip and gre) in the 18.04 version of systemd-networkd.
  * Drop debian/patches/glib_changes.patch: No longer necessary, changes were
    made upstream to better account for the changes in HashTable.
  * debian/netplan.io.install: add /usr/share/dbus-1

netplan.io (0.98-0ubuntu1) eoan; urgency=medium

  * New upstream release: 0.98 (LP: #1840832)
    - Added new "feature flags" to identify new features
    - Added support for "use-domains" for DHCP overrides
    - Added support for setting IPv6 MTU Bytes (LP: #1671951)
    - Added a DBus interface to query and run 'netplan apply' via other apps
    - Various build system fixes
    - Improved validation for bonding modes
    - Added support for "hash:" for hashed 802.1x passwords (LP: #1819831)
    - Tolerate devices without a /sys path (LP: #1810043)
    - Fix incorrect separator for networkd with ARP IP targets (LP: #1829264)
  * debian/control: Add Build-Depends on libsystemd-dev for DBus feature, and
    on dbus-x11 for dbus-launch used in tests.

 -- Mathieu Trudel-Lapierre <email address hidden> Mon, 26 Aug 2019 16:36:03 -0400

Changed in netplan.io (Ubuntu Bionic):
status: Fix Committed → Fix Released
Brian Murray (brian-murray) wrote : Update Released

The verification of the Stable Release Update for netplan.io has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
