netplan crash on ubuntu 20.04 disabling network

Bug #1930482 reported by Ruben Cheng
This bug affects 2 people
Affects             Status         Importance  Assigned to  Milestone
netplan             Invalid        Undecided   Unassigned
systemd (Ubuntu)    Fix Released   Undecided   Unassigned
  Focal             Incomplete     Medium      Unassigned

Bug Description

On Ubuntu 20.04 with netplan 0.102-0ubuntu1~20.04.2, the server loses both its IPv4 and IPv6 addresses after a while with a DHCP config. There is no way to restore the network other than rebooting the server.

Trying to restart the server networking using "systemctl restart systemd-networkd" or "netplan apply" fails.

Note: "systemctl restart systemd-networkd" and "netplan apply" also fail with a netplan config that uses a static IP address.

The server runs on OVS in cloud.ramnode.com

I'm pasting the config, logs, netplan and restart output (IP and MAC addresses are masked).

I don't see a workaround yet, other than rebooting the server to change the network configuration.

File: 50-cloud-init.yaml
==============================================
# This file is generated from information provided by the datasource. Changes
# to it will not persist across an instance reboot. To disable cloud-init's
# network configuration capabilities, write a file
# /etc/cloud/cloud.cfg.d/99-disable-network-config.cfg with the following:
# network: {config: disabled}
network:
    version: 2
    ethernets:
        ens3:
            accept-ra: true
            dhcp4: true
            dhcp6: true
            match:
                macaddress: **:**:**:**:**:**:**
            mtu: 1500
            set-name: ens3
=============================================

Syslog
=============================================
2021-06-01T19:31:01.063935-04:00 server systemd[1]: Stopped Network Service.
2021-06-01T19:31:01.066766-04:00 server systemd[1]: Starting Network Service...
2021-06-01T19:31:01.161797-04:00 server systemd-networkd[2302]: /run/systemd/network/10-netplan-ens3.network: MTUBytes= in [Link] section and UseMTU= in [DHCP] section are set. Disabling UseMTU=.
2021-06-01T19:31:01.161990-04:00 server systemd-networkd[2302]: loop3456: netdev ready
2021-06-01T19:31:01.162073-04:00 server systemd-networkd[2302]: Tunnel127: Gained IPv6LL
2021-06-01T19:31:01.162184-04:00 server systemd-networkd[2302]: Tunnel126: Gained IPv6LL
2021-06-01T19:31:01.162275-04:00 server systemd-networkd[2302]: loop3456: Gained IPv6LL
2021-06-01T19:31:01.162349-04:00 server systemd-networkd[2302]: ens3: Gained IPv6LL
2021-06-01T19:31:01.164685-04:00 server systemd-networkd[2302]: Assertion 'ifindex' failed at src/network/networkd-link.c:757, function link_get(). Aborting.
2021-06-01T19:31:01.553611-04:00 server systemd[1]: systemd-networkd.service: Main process exited, code=dumped, status=6/ABRT
2021-06-01T19:31:01.553796-04:00 server systemd[1]: systemd-networkd.service: Failed with result 'core-dump'.
2021-06-01T19:31:01.553892-04:00 server systemd[1]: Failed to start Network Service.
2021-06-01T19:31:01.553983-04:00 server systemd[1]: systemd-networkd.service: Scheduled restart job, restart counter is at 2.
============================================
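
The assertion line in the syslog excerpt above can be picked out of a larger log mechanically. The following is a minimal sketch, assuming the message keeps the same shape as in this report; the pattern and the helper name `find_assertions` are illustrative and not part of systemd or netplan.

```python
import re

# Pattern modelled on the assertion line quoted in this report (an assumption:
# other systemd versions may word the message differently).
ASSERT_RE = re.compile(
    r"Assertion '(?P<expr>[^']+)' failed at (?P<file>\S+):(?P<line>\d+), "
    r"function (?P<func>\w+)\(\)"
)


def find_assertions(log_text):
    """Return (expression, file, line, function) for each assertion failure."""
    hits = []
    for row in log_text.splitlines():
        m = ASSERT_RE.search(row)
        if m:
            hits.append((m["expr"], m["file"], int(m["line"]), m["func"]))
    return hits


# Sample line copied from the syslog excerpt in this report.
sample = (
    "2021-06-01T19:31:01.164685-04:00 server systemd-networkd[2302]: "
    "Assertion 'ifindex' failed at src/network/networkd-link.c:757, "
    "function link_get(). Aborting."
)

if __name__ == "__main__":
    for hit in find_assertions(sample):
        print(hit)
```

Feeding it the output of "journalctl -u systemd-networkd" would show whether every crash hits the same assertion or several different ones.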

output of: netplan --debug try
============================================
DEBUG:ens3 not found in {}
DEBUG:loop3456 not found in {}
DEBUG:Merged config:
network:
  bridges:
    loop3456:
      accept-ra: false
      addresses:
      - 10.65.0.3/32
      dhcp4: false
      dhcp6: false
      interfaces: []
  ethernets:
    ens3:
      accept-ra: true
      addresses:
      - XXX.XXX.XXX.XXX/24
      dhcp4: false
      dhcp6: true
      gateway4: XXX.XXX.XXX.XXX
      match:
        macaddress: **:**:**:**:**:**:**
      mtu: 1500
      nameservers:
        addresses:
        - 8.8.8.8
        - 8.8.4.4
        search:
        - uc.edu.ve
      set-name: ens3
  version: 2

DEBUG:New interfaces: set()
** (generate:1484): DEBUG: 20:14:22.895: Processing input file /etc/netplan/50-static.yaml..
** (generate:1484): DEBUG: 20:14:22.895: starting new processing pass
** (generate:1484): DEBUG: 20:14:22.895: Processing input file /etc/netplan/99-local.yaml..
** (generate:1484): DEBUG: 20:14:22.895: starting new processing pass
** (generate:1484): DEBUG: 20:14:22.895: We have some netdefs, pass them through a final round of validation
** (generate:1484): DEBUG: 20:14:22.895: ens3: setting default backend to 1
** (generate:1484): DEBUG: 20:14:22.895: Configuration is valid
** (generate:1484): DEBUG: 20:14:22.895: loop3456: setting default backend to 1
** (generate:1484): DEBUG: 20:14:22.895: Configuration is valid
** (generate:1484): DEBUG: 20:14:22.896: Generating output files..
** (generate:1484): DEBUG: 20:14:22.896: openvswitch: definition ens3 is not for us (backend 1)
** (generate:1484): DEBUG: 20:14:22.896: NetworkManager: definition ens3 is not for us (backend 1)
** (generate:1484): DEBUG: 20:14:22.896: openvswitch: definition loop3456 is not for us (backend 1)
** (generate:1484): DEBUG: 20:14:22.896: NetworkManager: definition loop3456 is not for us (backend 1)
(generate:1484): GLib-DEBUG: 20:14:22.896: posix_spawn avoided (fd close requested)
(generate:1484): GLib-DEBUG: 20:14:22.897: posix_spawn avoided (fd close requested)
DEBUG:netplan generated networkd configuration changed, restarting networkd

Job for systemd-networkd.service failed because a fatal signal was delivered causing the control process to dump core.
See "systemctl status systemd-networkd.service" and "journalctl -xe" for details.

An error occurred: Command '['systemctl', 'start', 'systemd-networkd.service', 'netplan-ovs-cleanup.service']' returned non-zero exit status 1.

Reverting.
DEBUG:netplan generated networkd configuration changed, restarting networkd
DEBUG:ens3 not found in {}
DEBUG:loop3456 not found in {}
DEBUG:Merged config:
network:
  bridges:
    loop3456:
      accept-ra: false
      addresses:
      - 10.65.0.3/32
      dhcp4: false
      dhcp6: false
      interfaces: []
  ethernets:
    ens3:
      accept-ra: true
      addresses:
      - XXX.XXX.XXX.XXX/24
      dhcp4: false
      dhcp6: true
      gateway4: XXX.XXX.XXX.XXX
      match:
        macaddress: **:**:**:**:**:**:**
      mtu: 1500
      nameservers:
        addresses:
        - 8.8.8.8
        - 8.8.4.4
        search:
        - uc.edu.ve
      set-name: ens3
  version: 2

Job for systemd-networkd.service failed.
See "systemctl status systemd-networkd.service" and "journalctl -xe" for details.
Traceback (most recent call last):
  File "/usr/share/netplan/netplan/cli/commands/try_command.py", line 84, in command_try
    NetplanApply().command_apply(run_generate=True, sync=True, exit_on_error=False)
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 236, in command_apply
    utils.systemctl_networkd('start', sync=True, extra_services=netplan_wpa + netplan_ovs)
  File "/usr/share/netplan/netplan/cli/utils.py", line 131, in systemctl_networkd
    subprocess.check_call(command)
  File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['systemctl', 'start', 'systemd-networkd.service', 'netplan-ovs-cleanup.service']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/sbin/netplan", line 23, in <module>
    netplan.main()
  File "/usr/share/netplan/netplan/cli/core.py", line 50, in main
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 264, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/try_command.py", line 66, in run
    self.run_command()
  File "/usr/share/netplan/netplan/cli/utils.py", line 264, in run_command
    self.func()
  File "/usr/share/netplan/netplan/cli/commands/try_command.py", line 95, in command_try
    self.revert()
  File "/usr/share/netplan/netplan/cli/commands/try_command.py", line 118, in revert
    NetplanApply().command_apply(run_generate=False, sync=True, exit_on_error=False)
  File "/usr/share/netplan/netplan/cli/commands/apply.py", line 236, in command_apply
    utils.systemctl_networkd('start', sync=True, extra_services=netplan_wpa + netplan_ovs)
  File "/usr/share/netplan/netplan/cli/utils.py", line 131, in systemctl_networkd
    subprocess.check_call(command)
  File "/usr/lib/python3.8/subprocess.py", line 364, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['systemctl', 'start', 'systemd-networkd.service', 'netplan-ovs-cleanup.service']' returned non-zero exit status 1.
==============================================

output of: journalctl -xe
==============================================
-- An ExecStart= process belonging to unit systemd-networkd.service has exited.
--
-- The process' exit code is 'dumped' and its exit status is 6.
Jun 01 20:14:25 snotra systemd[1]: systemd-networkd.service: Failed with result 'core-dump'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- The unit systemd-networkd.service has entered the 'failed' state with result 'core-dump'.
Jun 01 20:14:25 snotra systemd[1]: Failed to start Network Service.
-- Subject: A start job for unit systemd-networkd.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- A start job for unit systemd-networkd.service has finished with a failure.
--
-- The job identifier is 784 and the job result is failed.
Jun 01 20:14:25 snotra systemd[1]: systemd-networkd.service: Scheduled restart job, restart counter is at 3.
-- Subject: Automatic restarting of a unit has been scheduled
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- Automatic restarting of the unit systemd-networkd.service has been scheduled, as the result for
-- the configured Restart= setting for the unit.
Jun 01 20:14:25 snotra systemd[1]: Condition check resulted in OpenVSwitch configuration for cleanup being skipped.
-- Subject: A start job for unit netplan-ovs-cleanup.service has finished successfully
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- A start job for unit netplan-ovs-cleanup.service has finished successfully.
--
-- The job identifier is 791.
Jun 01 20:14:25 snotra systemd[1]: Stopped Network Service.
-- Subject: A stop job for unit systemd-networkd.service has finished
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- A stop job for unit systemd-networkd.service has finished.
--
-- The job identifier is 796 and the job result is done.
Jun 01 20:14:25 snotra systemd[1]: systemd-networkd.service: Start request repeated too quickly.
Jun 01 20:14:25 snotra systemd[1]: systemd-networkd.service: Failed with result 'core-dump'.
-- Subject: Unit failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
--
-- The unit systemd-networkd.service has entered the 'failed' state with result 'core-dump'.
Jun 01 20:14:25 snotra systemd[1]: Failed to start Network Service.
-- Subject: A start job for unit systemd-networkd.service has failed
-- Defined-By: systemd
-- Support: http://www.ubuntu.com/support
=================================================

Arunas B. (arunasb) wrote :

Since the last apt update, the server has lost network completely. Trying to restart gives this line, which I suspect most:

Assertion 'ifindex' failed at src/network/networkd-link.c:757, function link_get(). Aborting.

The server configuration uses bonding and bridging.
This could mean that the problem appears when creating the bond (it had been working fine since the 22.04 install).

netplan --debug generate
DEBUG:command generate: running ['/lib/netplan/generate']
** (generate:12272): DEBUG: 15:03:42.857: starting new processing pass
** (generate:12272): DEBUG: 15:03:42.858: starting new processing pass
** (generate:12272): DEBUG: 15:03:42.858: We have some netdefs, pass them through a final round of validation
** (generate:12272): DEBUG: 15:03:42.858: br0: setting default backend to 1
** (generate:12272): DEBUG: 15:03:42.858: Configuration is valid
** (generate:12272): DEBUG: 15:03:42.858: enp6s0: setting default backend to 1
** (generate:12272): DEBUG: 15:03:42.858: Configuration is valid
** (generate:12272): DEBUG: 15:03:42.858: enp3s0: setting default backend to 1
** (generate:12272): DEBUG: 15:03:42.858: Configuration is valid
** (generate:12272): DEBUG: 15:03:42.858: enp66s0f1: setting default backend to 1
** (generate:12272): DEBUG: 15:03:42.858: Configuration is valid
** (generate:12272): DEBUG: 15:03:42.858: enp4s0: setting default backend to 1
** (generate:12272): DEBUG: 15:03:42.858: Configuration is valid
** (generate:12272): DEBUG: 15:03:42.858: enp66s0f0: setting default backend to 1
** (generate:12272): DEBUG: 15:03:42.858: Configuration is valid
** (generate:12272): DEBUG: 15:03:42.858: bond0: setting default backend to 1
** (generate:12272): DEBUG: 15:03:42.858: Configuration is valid
** (generate:12272): DEBUG: 15:03:42.858: Generating output files..
** (generate:12272): DEBUG: 15:03:42.858: openvswitch: definition enp3s0 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: NetworkManager: definition enp3s0 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: openvswitch: definition enp4s0 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: NetworkManager: definition enp4s0 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: openvswitch: definition enp6s0 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: NetworkManager: definition enp6s0 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: openvswitch: definition enp66s0f0 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: NetworkManager: definition enp66s0f0 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: openvswitch: definition enp66s0f1 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: NetworkManager: definition enp66s0f1 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: openvswitch: definition bond0 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: NetworkManager: definition bond0 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: openvswitch: definition br0 is not for us (backend 1)
** (generate:12272): DEBUG: 15:03:42.858: NetworkManager: definition br0 is not for us (backend 1)
(generate:12272): GLib-...

Lukas Märdian (slyon) wrote :

This hits an assert() in systemd-networkd so I'm assigning it to the src:systemd package.

=> Assertion 'ifindex' failed at src/network/networkd-link.c:757, function link_get(). Aborting.

Might be fixed upstream via https://github.com/systemd/systemd/pull/16271

Lukas Märdian (slyon)
tags: added: rls-ff-incoming
Changed in netplan:
status: New → Invalid
Lukas Märdian (slyon) wrote :

Should be fixed in systemd v246+
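
A rough way to check a machine against that version is to read the first line of "systemctl --version". The sketch below is only an illustration: the helper names and the sample version strings are hypothetical, and a plain major-version comparison cannot tell whether a distribution has backported the patch to an older release.

```python
import re

# Threshold taken from the comment above ("fixed in systemd v246+").
FIXED_IN = 246


def systemd_major(version_output):
    """Extract the major version from the first line of `systemctl --version`."""
    first = version_output.splitlines()[0] if version_output else ""
    m = re.match(r"systemd (\d+)", first)
    return int(m.group(1)) if m else None


def likely_has_fix(version_output):
    """True if the reported major version is at or above FIXED_IN."""
    major = systemd_major(version_output)
    return major is not None and major >= FIXED_IN


if __name__ == "__main__":
    # Hypothetical sample strings, not taken from the reporter's machine.
    print(likely_has_fix("systemd 245 (245.4-4ubuntu3)\n+PAM +AUDIT"))
    print(likely_has_fix("systemd 249 (249.11-0ubuntu3)\n+PAM +AUDIT"))
```

On a live system the input could come from subprocess.run(["systemctl", "--version"], capture_output=True, text=True).stdout.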

Changed in systemd (Ubuntu Focal):
status: New → Triaged
importance: Undecided → Medium
tags: added: fr-2567
tags: removed: fr-2567 rls-ff-incoming
Changed in systemd (Ubuntu):
status: New → Fix Released
Nick Rosbrook (enr0n) wrote :

Hi Ruben,

I have tried to reproduce this by duplicating your config as best I can, but I have not been able to trigger the issue. Can you please try and provide a minimal reproducer that can be demonstrated on another machine?

Changed in systemd (Ubuntu Focal):
status: Triaged → Incomplete
Ruben Cheng (rcheng) wrote :

Hi Nick,

I can no longer reproduce this. This was about 1 year ago on a VPS at OVH that we no longer have.
