[master] Restarting systemd-networkd breaks keepalived, heartbeat, corosync, pacemaker (interface aliases are restarted)

Bug #1815101 reported by Leroy Tennison
This bug affects 46 people
Affects               Status        Importance  Assigned to      Milestone
Netplan               Triaged       Low         Unassigned
heartbeat (Ubuntu)    Won't Fix     Low         Unassigned
keepalived (Ubuntu)   Fix Released  Medium      Unassigned
  Bionic              Won't Fix     Medium      Athos Ribeiro
  Focal               Confirmed     Undecided   Athos Ribeiro
systemd (Ubuntu)      Fix Released  Medium      Unassigned
  Xenial              Won't Fix     Medium      Unassigned
  Bionic              Fix Released  Medium      Eric Desrochers
  Disco               Won't Fix     Medium      Unassigned
  Eoan                Fix Released  Medium      Unassigned
  Focal               Fix Released  Undecided   Unassigned

Bug Description

[impact]

- All HA software has a problem when interfaces are managed by systemd-networkd: NIC restarts/reconfigurations always wipe all interface aliases, even when the HA software does not expect it (there is no coordination between them).

- keepalived, smb ctdb, and pacemaker all suffer from this. Pacemaker copes better because its service monitor restarts the virtual IP resource, on the affected node and NIC, before considering it a real failure, but other HA services might treat the loss as a real failure when it is not.

[test case]

- comment #14 is a full test case: set up a 3-node pacemaker cluster, as in that example, and cause a networkd service restart. It will trigger a failure for the virtual IP resource monitor.

- another example is given in the original description for keepalived. Both suffer from the same issue (as does other HA software).

[regression potential]

- this backports the KeepConfiguration parameter, which adds significant complexity to networkd's configuration and behavior. That could lead to regressions such as failing to configure the network correctly at networkd start, incorrectly maintaining configuration at networkd restart, or losing network state at networkd stop.

- Any regressions are most likely to occur during networkd start, restart, or stop, and most likely to involve missing or incorrect ip address(es).

- the change is based on upstream patches adding the exact feature we needed to fix this issue, and it will be integrated with a netplan change to add the needed stanza to the systemd NIC configuration file (KeepConfiguration=).

[other info]

original description:
---

Configure netplan for interfaces, for example (a working config with IP addresses obfuscated)

network:
    ethernets:
        eth0:
            addresses: [192.168.0.5/24]
            dhcp4: false
            nameservers:
              search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
              addresses: [10.22.11.1]
        eth2:
            addresses:
              - 12.13.14.18/29
              - 12.13.14.19/29
            gateway4: 12.13.14.17
            dhcp4: false
            nameservers:
              search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
              addresses: [10.22.11.1]
        eth3:
            addresses: [10.22.11.6/24]
            dhcp4: false
            nameservers:
              search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
              addresses: [10.22.11.1]
        eth4:
            addresses: [10.22.14.6/24]
            dhcp4: false
            nameservers:
              search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
              addresses: [10.22.11.1]
        eth7:
            addresses: [9.5.17.34/29]
            dhcp4: false
            optional: true
            nameservers:
              search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
              addresses: [10.22.11.1]
    version: 2

Configure keepalived (again, a working config with IP addresses obfuscated)

global_defs # Block id
{
notification_email {
        <email address hidden>
}
        notification_email_from <email address hidden>
        smtp_server 10.22.11.7 # IP
        smtp_connect_timeout 30 # integer, seconds
        router_id system3 # string identifying the machine,
                                     # (doesn't have to be hostname).
        vrrp_mcast_group4 224.0.0.18 # optional, default 224.0.0.18
        vrrp_mcast_group6 ff02::12 # optional, default ff02::12
        enable_traps # enable SNMP traps
}
vrrp_sync_group collection {
        group {
                wan
                lan
                phone
        }
}
vrrp_instance wan {
        state MASTER
        interface eth2
        virtual_router_id 77
        priority 150
        advert_int 1
        smtp_alert
        authentication {
                auth_type PASS
                auth_pass BlahBlah
        }
        virtual_ipaddress {
        12.13.14.20
        }
}
vrrp_instance lan {
        state MASTER
        interface eth3
        virtual_router_id 78
        priority 150
        advert_int 1
        smtp_alert
        authentication {
                auth_type PASS
                auth_pass MoreBlah
        }
        virtual_ipaddress {
                10.22.11.13/24
        }
}
vrrp_instance phone {
        state MASTER
        interface eth4
        virtual_router_id 79
        priority 150
        advert_int 1
        smtp_alert
        authentication {
                auth_type PASS
                auth_pass MostBlah
        }
        virtual_ipaddress {
                10.22.14.3/24
        }
}

At boot the affected interfaces have:
5: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:90:c0:e3 brd ff:ff:ff:ff:ff:ff
    inet 10.22.14.6/24 brd 10.22.14.255 scope global eth4
       valid_lft forever preferred_lft forever
    inet 10.22.14.3/24 scope global secondary eth4
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:fe90:c0e3/64 scope link
       valid_lft forever preferred_lft forever
7: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:b0:26:29 brd ff:ff:ff:ff:ff:ff
    inet 10.22.11.6/24 brd 10.22.11.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.22.11.13/24 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:feb0:2629/64 scope link
       valid_lft forever preferred_lft forever
9: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:b0:26:2b brd ff:ff:ff:ff:ff:ff
    inet 12.13.14.18/29 brd 12.13.14.23 scope global eth2
       valid_lft forever preferred_lft forever
    inet 12.13.14.20/32 scope global eth2
       valid_lft forever preferred_lft forever
    inet 12.33.89.19/29 brd 12.13.14.23 scope global secondary eth2
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:feb0:262b/64 scope link
       valid_lft forever preferred_lft forever

Run 'netplan try' (without even making any changes to the configuration) and the keepalived addresses disappear, never to return. The affected interfaces then have:
5: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:90:c0:e3 brd ff:ff:ff:ff:ff:ff
    inet 10.22.14.6/24 brd 10.22.14.255 scope global eth4
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:fe90:c0e3/64 scope link
       valid_lft forever preferred_lft forever
7: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:b0:26:29 brd ff:ff:ff:ff:ff:ff
    inet 10.22.11.6/24 brd 10.22.11.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:feb0:2629/64 scope link
       valid_lft forever preferred_lft forever
9: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:b0:26:2b brd ff:ff:ff:ff:ff:ff
    inet 12.13.14.18/29 brd 12.13.14.23 scope global eth2
       valid_lft forever preferred_lft forever
    inet 12.33.89.19/29 brd 12.13.14.23 scope global secondary eth2
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:feb0:262b/64 scope link
       valid_lft forever preferred_lft forever
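
The two listings above can be compared mechanically. The following is a minimal sketch, not part of the original report, that parses `ip addr` text and reports which IPv4 addresses vanished per interface. The function names and the shortened sample strings are illustrative assumptions; the sample lines themselves are copied from the eth4 output above.

```python
import re

# Interface header lines, e.g. "5: eth4: <BROADCAST,...> mtu 1500 ...".
LINK_RE = re.compile(r"^\d+:\s+([^:@\s]+)")
# IPv4 address lines, e.g. "    inet 10.22.14.3/24 scope global secondary eth4".
# "inet6" lines do not match because whitespace must follow "inet".
INET_RE = re.compile(r"^\s+inet\s+(\S+)\s")


def ipv4_by_interface(ip_addr_output):
    """Map interface name -> set of IPv4 addresses (CIDR) from `ip addr` text."""
    result = {}
    current = None
    for line in ip_addr_output.splitlines():
        link = LINK_RE.match(line)
        if link:
            current = link.group(1)
            result.setdefault(current, set())
            continue
        inet = INET_RE.match(line)
        if inet and current is not None:
            result[current].add(inet.group(1))
    return result


def removed_addresses(before, after):
    """Return {interface: sorted addresses present before but missing after}."""
    old, new = ipv4_by_interface(before), ipv4_by_interface(after)
    removed = {}
    for nic, addrs in old.items():
        gone = addrs - new.get(nic, set())
        if gone:
            removed[nic] = sorted(gone)
    return removed


BEFORE = """\
5: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:90:c0:e3 brd ff:ff:ff:ff:ff:ff
    inet 10.22.14.6/24 brd 10.22.14.255 scope global eth4
       valid_lft forever preferred_lft forever
    inet 10.22.14.3/24 scope global secondary eth4
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:fe90:c0e3/64 scope link
       valid_lft forever preferred_lft forever
"""

AFTER = """\
5: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:90:c0:e3 brd ff:ff:ff:ff:ff:ff
    inet 10.22.14.6/24 brd 10.22.14.255 scope global eth4
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:fe90:c0e3/64 scope link
       valid_lft forever preferred_lft forever
"""

if __name__ == "__main__":
    print(removed_addresses(BEFORE, AFTER))
```

Feeding it the full before/after captures of a node would list every address that networkd dropped, which is a quick way to confirm that only the keepalived-managed addresses are affected.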


Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

This isn't netplan, it's systemd-networkd. Netplan only writes configuration for the chosen renderer (in this case, systemd-networkd).

Either systemd needs to not wipe out foreign addresses (I believe there is a PR in git for that) or keepalived should somehow interface with systemd so they can collaborate on setting and keeping up the IP addresses.

Reassigning.

no longer affects: ubuntu
Changed in netplan:
status: New → Invalid
Changed in keepalived (Ubuntu):
status: New → Incomplete
Changed in systemd (Ubuntu):
status: New → Triaged
Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

Kept a task for keepalived (Incomplete) in case it turns out there's something we can do there.

Also added a task for systemd, since that would definitely require development work.

Marked Invalid for netplan: since netplan only translates config from the YAML to what networkd or NetworkManager require, there isn't really anything I see we can do in netplan directly. Applying absolutely does need to 'poke' the renderer somehow for the configuration to be applied; but if it turns out there's something to change in netplan we can update the task.

Turns out there isn't really a PR about foreign addresses handling; though two are somewhat relevant:

https://github.com/systemd/systemd/pull/9956
and
https://github.com/systemd/systemd/pull/7403

But neither will completely address the problem: systemd-networkd expects to be authoritative on the network setup, which is somewhat counter to its use in conjunction with keepalived.

As a workaround, for now, one can use /etc/network/interfaces (and/or no configuration in netplan for the interfaces handled by keepalived) to configure the network.

Revision history for this message
Leroy Tennison (ltennison) wrote :

I am trying ifupdown. Do I need to do anything else or is what I've done adequate?

Revision history for this message
cdmiller (cdmiller) wrote :

Newer keepalived (> 2.0.x) addresses the systemd-networkd behavior. From keepalived 2.0.0 release notes: "Transition to backup state if a VIP or eVIP is removed When we next transition to master the addresses will be restored. If nopreempt is not set, that will be almost immediately."

Any chance of a keepalived 2.0.x backport package for Ubuntu 18.04?

Revision history for this message
Leroy Tennison (ltennison) wrote :

I note this bug is marked Incomplete, meaning that information is missing. What else is needed?

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :

Might I ask - how much is this bug related or a dup to bug 1819074?

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

Seems a dupe to me.

For the bionic case, with keepalived < 2.0, is there some keepalived script that can be run to restore the vip, after networkd removed it? We could run it as a network-dispatcher hook then. Has this been considered?
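
One way such a hook could look is sketched below. It is only an illustration of the idea raised in this comment, not a tested fix: the `EXPECTED_VIPS` table is hypothetical (filled in from the virtual_ipaddress blocks in the original description), and reading the interface name from an `IFACE` environment variable is an assumption about how the dispatcher invokes hooks. The sketch also does not check whether this node currently holds the VRRP MASTER role, which a real hook would have to do before re-adding anything.

```python
import os
import subprocess

# Hypothetical table: VIPs keepalived is expected to hold, per interface.
EXPECTED_VIPS = {
    "eth3": ["10.22.11.13/24"],
    "eth4": ["10.22.14.3/24"],
}


def present_addresses(ip_output):
    """Collect IPv4 CIDR strings from `ip -4 -o addr show dev <nic>` output."""
    found = set()
    for line in ip_output.splitlines():
        fields = line.split()
        # One-line format: "<index>: <nic> inet <addr>/<prefix> ..."
        if "inet" in fields:
            pos = fields.index("inet") + 1
            if pos < len(fields):
                found.add(fields[pos])
    return found


def restore_commands(nic, ip_output, expected=EXPECTED_VIPS):
    """Build the `ip addr add` argument lists for VIPs missing from the NIC."""
    present = present_addresses(ip_output)
    missing = [vip for vip in expected.get(nic, []) if vip not in present]
    return [["ip", "addr", "add", vip, "dev", nic] for vip in missing]


def main(nic):
    """Query the NIC and re-add whatever expected VIPs are missing."""
    out = subprocess.run(
        ["ip", "-4", "-o", "addr", "show", "dev", nic],
        check=True, capture_output=True, text=True,
    ).stdout
    for argv in restore_commands(nic, out):
        subprocess.run(argv, check=True)


# Only act when an interface name was handed over (assumed: IFACE variable).
if __name__ == "__main__" and os.environ.get("IFACE"):
    main(os.environ["IFACE"])
```

Keeping the command construction in `restore_commands` separate from the `subprocess` calls makes the decision logic checkable without touching the network stack.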

summary: - netplan removes keepalived configuration
+ Restarting systemd-networkd breaks keepalived clusters
summary: - Restarting systemd-networkd breaks keepalived clusters
+ [master] Restarting systemd-networkd breaks keepalived clusters
Revision history for this message
Leroy Tennison (ltennison) wrote : Re: [master] Restarting systemd-networkd breaks keepalived clusters

If I understand the keepalived > 2.0.x behavior referred to by cdmiller above (see 2019-03-07 comment) that is not the appropriate response to the problem. Granted, it mitigates the consequences but doesn't address the underlying issue. A systemd-source issue should not cause keepalived failover since failover is designed to address issues of system or hardware failure, not the bad behavior of other system software. systemd needs to be made to cooperate with other software rather than assuming it is the only authority on the system.

Revision history for this message
Robie Basak (racb) wrote :

It looks like there is some clear and actionable work in keepalived here (even if as a workaround and the real fix ends up being in systemd), so I'm marking it as Triaged.

FTR, the Ubuntu Server Team is aware of this as a high level issue and it is high up in our list of priorities to determine how to address it properly.

Changed in keepalived (Ubuntu):
status: Incomplete → Triaged
Revision history for this message
Jorge Niedbalski (niedbalski) wrote :
Revision history for this message
Bryce Harrington (bryce) wrote :

The aforementioned link shows there's been work towards a fix in systemd. Can't say if that suggests what can be done to improve keepalived, but I've tagged this "server-next" to get it on the Ubuntu Server Team's high priority list, as per Robie's earlier comment.

tags: added: server-next
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

The following 3 bugs:

https://bugs.launchpad.net/bugs/1815101
https://bugs.launchpad.net/bugs/1819074
https://bugs.launchpad.net/bugs/1810583

Have the same root cause: the fact that systemd-networkd messes with secondary IP addresses in NICs managed by systemd.

I'm marking all other cases as a duplicate of LP: #1815101.

TODO here is the following:

- There are mainly 2 "fixes" for this issue:

1) keepalived is able to recognize systemd-networkd changes and change cluster status in order to reconfigure managed NICs (keepalived (> 2.0.x)).

2) systemd-networkd implements a new stanza (KeepConfiguration=) in its .network configuration files, in order to fix this behavior not only for keepalived but for all HA-related software that manages secondary IPs and/or aliases on NICs managed by systemd-networkd.

I think the most appropriate approach would be to make sure both features work together in Eoan, and then make sure the SRUs are done for Disco and Bionic. One problem with item (2) is that netplan will also have to support the new "KeepConfiguration=" stanza, but fix (2) is more appropriate for all the other HA-related software controlling virtual IPs (CTDB, Pacemaker, and so on).

Changed in netplan:
status: Invalid → Confirmed
Changed in keepalived (Ubuntu):
status: Triaged → Confirmed
Changed in systemd (Ubuntu):
status: Triaged → Confirmed
Changed in keepalived (Ubuntu Bionic):
status: New → Confirmed
Changed in keepalived (Ubuntu Disco):
status: New → Confirmed
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Based on comment #12, and other comments from other duplicate cases, I'll summarize here in a better (and consolidated) way how to reproduce the issue, how to mitigate it using the dummy workaround, and how to fix it (with the backports/merge requests). At the end I might provide a PPA asking for feedback.

Changed in systemd (Ubuntu Bionic):
status: New → Confirmed
Changed in systemd (Ubuntu Disco):
status: New → Confirmed
Changed in keepalived (Ubuntu Bionic):
importance: Undecided → Medium
Changed in keepalived (Ubuntu Disco):
importance: Undecided → Medium
Changed in keepalived (Ubuntu Eoan):
importance: Undecided → Medium
Changed in systemd (Ubuntu Bionic):
importance: Undecided → Medium
Changed in systemd (Ubuntu Disco):
importance: Undecided → Medium
Changed in systemd (Ubuntu Eoan):
importance: Undecided → Medium
Changed in keepalived (Ubuntu Bionic):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in keepalived (Ubuntu Disco):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in keepalived (Ubuntu Eoan):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Bionic):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Disco):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Eoan):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in netplan:
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Eoan):
status: Confirmed → In Progress
Changed in keepalived (Ubuntu Eoan):
status: Confirmed → In Progress
Changed in heartbeat (Ubuntu Bionic):
importance: Undecided → Medium
status: New → Triaged
Changed in heartbeat (Ubuntu Disco):
importance: Undecided → Medium
status: New → Triaged
Changed in heartbeat (Ubuntu Eoan):
importance: Undecided → Low
status: New → Triaged
Changed in heartbeat (Ubuntu Bionic):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in heartbeat (Ubuntu Disco):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in heartbeat (Ubuntu Eoan):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Alright,

As this is a problem that affects not only keepalived but all cluster-like software dealing with aliases on any existing interface, managed or not by systemd, I have run the same test case in a pacemaker-based cluster with 3 nodes, having 1 virtual IP + a lighttpd instance running in the same resource group:

----

(k)inaddy@kcluster01:~$ crm config show
node 1: kcluster01
node 2: kcluster02
node 3: kcluster03
primitive fence_kcluster01 stonith:fence_virsh \
 params ipaddr=192.168.100.205 plug=kcluster01 action=off login=stonithmgr passwd=xxxx use_sudo=true delay=2 \
 op monitor interval=60s
primitive fence_kcluster02 stonith:fence_virsh \
 params ipaddr=192.168.100.205 plug=kcluster02 action=off login=stonithmgr passwd=xxxx use_sudo=true delay=4 \
 op monitor interval=60s
primitive fence_kcluster03 stonith:fence_virsh \
 params ipaddr=192.168.100.205 plug=kcluster03 action=off login=stonithmgr passwd=xxxx use_sudo=true delay=6 \
 op monitor interval=60s
primitive virtual_ip IPaddr2 \
 params ip=10.0.3.1 nic=eth3 \
 op monitor interval=10s
primitive webserver systemd:lighttpd \
 op monitor interval=10 timeout=60
group webserver_virtual_ip webserver virtual_ip
location l_fence_kcluster01 fence_kcluster01 -inf: kcluster01
location l_fence_kcluster02 fence_kcluster02 -inf: kcluster02
location l_fence_kcluster03 fence_kcluster03 -inf: kcluster03
property cib-bootstrap-options: \
 have-watchdog=true \
 dc-version=2.0.1-9e909a5bdd \
 cluster-infrastructure=corosync \
 cluster-name=debian \
 stonith-enabled=true \
 stonith-action=off \
 no-quorum-policy=stop

----

(k)inaddy@kcluster01:~$ cat /etc/netplan/cluster.yaml
network:
    version: 2
    renderer: networkd
    ethernets:
        eth1:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.1.2/24]
        eth2:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.2.2/24]
        eth3:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.3.2/24]
        eth4:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.4.2/24]
        eth5:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.5.2/24]

----

And the virtual IP failed right after netplan acted on the systemd network interface.

(k)inaddy@kcluster03:~$ sudo netplan apply
(k)inaddy@kcluster03:~$ ping 10.0.3.1
PING 10.0.3.1 (10.0.3.1) 56(84) bytes of data.
From 10.0.3.4 icmp_seq=1 Destination Host Unreachable
From 10.0.3.4 icmp_seq=2 Destination Host Unreachable
From 10.0.3.4 icmp_seq=3 Destination Host Unreachable
From 10.0.3.4 icmp_seq=4 Destination Host Unreachable
From 10.0.3.4 icmp_seq=5 Destination Host Unreachable
From 10.0.3.4 icmp_seq=6 Destination Host Unreachable
64 bytes from 10.0.3.1: icmp_seq=7 ttl=64 time=0.088 ms
64 bytes from 10.0.3.1: icmp_seq=8 ttl=64 time=0.076 ms

--- 10.0.3.1 ping statistics ---
8 packets transmitted, 2 received, +6 errors, 75% packet loss, time 7128ms
rtt min/avg/max/mdev = 0.076/0.082/0.088/0.006 ms, pipe 4

As explained in this bug's description. With that, virtual_ip_monitor, from pacemaker, noticed the virtual IP was gone and restarted it on the same node:

----

(k)inaddy@k...


summary: - [master] Restarting systemd-networkd breaks keepalived clusters
+ [master] Restarting systemd-networkd breaks keepalived, heartbeat,
+ corosync, pacemaker (interface aliases are restarted)
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

The commits below implement support to "keep configuration":

commit 1e498853a39b46155cb89b5c9e74ecb27aaba3ed
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 01:21:13 2019

    test-network: add tests for KeepConfiguration=

commit c98d78d32abba6aadbe89eece7acf0742f59047c
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 03:37:25 2019

    man: add documentation about KeepConfiguration

commit db51778f85cb076e9ed1fe7f7e29cc740365c245
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 00:33:13 2019

    network: make KeepConfiguration=static drop DHCP addresses and routes

    Also, KeepConfiguration=dhcp drops static foreign addresses and routes.

commit 95355a281c06c5970b7355c38b066910c3be4958
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 14:05:26 2019

    network: add KeepConfiguration=dhcp-on-stop

    The option prevents to drop lease address on stop.
    By setting this, we can safely restart networkd.

commit 7da377ef16a2112a673247b39041a180b07e973a
Author: Susant Sahani <email address hidden>
Date: Mon Jun 3 00:31:13 2019

    networkd: add support to keep configuration

for systemd-networkd.

IMO, we should rely on setting the keep configuration flag for the interfaces to be managed by 3rd-party software (adding/removing aliases for virtual networks, VRRP interfaces, etc).

Revision history for this message
Christian Ehrhardt  (paelzer) wrote :
Revision history for this message
Edward Hope-Morley (hopem) wrote :

Thanks Rafael/Christian,

I see that all those patches are in 243 and Eoan is currently on 242 (albeit -6, but I don't think any are already backported), so we'll need to get this backported all the way down to Bionic.

max@power:~/git/systemd$ _c=( 7da377e 95355a2 db51778 c98d78d 1e49885 )
max@power:~/git/systemd$ for c in ${_c[@]}; do git tag --contains $c| egrep -v "\-rc"; done| sort -u
v243

Do we have a feel for if/when the keepalived fix(es) will be backportable to B (1.x) as well? Since those fixes already exist in Disco (2.0.10), it might be easier to start with those?

I will add the charm-keepalived to this LP since it will need support for the networkd/netplan fix once that is available.

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

@ed,

I just finished the backport to Eoan; it was straightforward. I'll finish tests tomorrow with HA-related software and networkd-enabled HA clusters. After that I'll give you a better estimate for Disco and Bionic.

This is the total size of changes (systemd-networkd-tests.py is not so great to backport, will review that):

$ cat *.patch | diffstat
 man/systemd.network.xml | 27 +-
 src/network/networkd-dhcp4.c | 8
 src/network/networkd-link.c | 57 +++++-
 src/network/networkd-link.h | 2
 src/network/networkd-manager.c | 2
 src/network/networkd-network-gperf.gperf | 3
 src/network/networkd-network.c | 44 ++++
 src/network/networkd-network.h | 26 ++
 test/fuzz/fuzz-network-parser/directives.network | 1
 test/test-network/conf/24-keep-configuration-static.network | 5
 test/test-network/conf/dhcp-client-keep-configuration-dhcp-on-stop.network | 4
 test/test-network/conf/dhcp-client-keep-configuration-dhcp.network | 7
 test/test-network/systemd-networkd-tests.py | 94 +++++++++-
 13 files changed, 235 insertions(+), 45 deletions(-)

Good thing is that the logic is not drastically changed for this feature to exist. Sorry for the delay here, because of freeze we were running to close out some urgent issues for Eoan.

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Test Case:

(k)rafaeldtinoco@kcluster03:~$ crm status
Stack: corosync
Current DC: kcluster02 (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Thu Oct 10 17:13:19 2019
Last change: Thu Oct 10 17:11:48 2019 by root via cibadmin on kcluster01

3 nodes configured
5 resources configured

Online: [ kcluster01 kcluster02 kcluster03 ]

Full list of resources:

 fence_kcluster01 (stonith:fence_virsh): Started kcluster02
 fence_kcluster02 (stonith:fence_virsh): Started kcluster01
 fence_kcluster03 (stonith:fence_virsh): Started kcluster01
 Resource Group: webserver_virtual_ip
     webserver (systemd:lighttpd): Started kcluster03
     virtual_ip (ocf::heartbeat:IPaddr2): Started kcluster03

(k)rafaeldtinoco@kcluster03:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:c3:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:c306/64 scope link
       valid_lft forever preferred_lft forever

(k)rafaeldtinoco@kcluster03:~$ systemctl restart systemd-networkd

(k)rafaeldtinoco@kcluster03:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:c3:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:c306/64 scope link
       valid_lft forever preferred_lft forever

<wait for resource monitor timeout, pacemaker starts virtual_ip again>

(k)rafaeldtinoco@kcluster03:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:c3:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:c306/64 scope link
       valid_lft forever preferred_lft forever
(k)rafaeldtinoco@kcluster03:~$

Pacemaker logs:

Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6901]: INFO: IP status = no, IP_CIP=
Oct 10 17:14:37 kcluster03 pacemaker-controld[1266]: notice: Result of stop operation for virtual_ip on kcluster03: 0 (ok)
Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6951]: INFO: Adding inet address 10.0.3.1/24 with broadcast address 10.0.3.255 to device eth3
Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6956]: INFO: Bringing device eth3 up
Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6961]: INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /run/resource-agents/send_arp-10.0.3.1 eth3 10.0.3.1 auto not_used not_used
Oct 10 17:14:37 kcluster03 pacemaker-controld[1266]: notice: Result of start operation for virtual_ip on kcluster03: 0 (ok)

for the operation.

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

(k)rafaeldtinoco@kcluster01:~$ sudo vi /etc/systemd/network/10-netplan-eth3.network

<add KeepConfiguration=static to .network file>

(k)rafaeldtinoco@kcluster01:~$ systemctl restart systemd-networkd

(k)rafaeldtinoco@kcluster01:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:f0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:f003/64 scope link
       valid_lft forever preferred_lft forever

(k)rafaeldtinoco@kcluster01:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:f0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:f003/64 scope link
       valid_lft forever preferred_lft forever

(k)rafaeldtinoco@kcluster01:~$ systemctl restart systemd-networkd

(k)rafaeldtinoco@kcluster01:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:f0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:f003/64 scope link
       valid_lft forever preferred_lft forever

<interface does NOT restart the aliases>

Voila. Needs better testing with KeepConfiguration=dhcp.
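
The manual edit in the transcript above could be scripted along these lines. This is a minimal sketch under stated assumptions: the helper name `set_keep_configuration` is made up for illustration, the sample text mirrors the eth3 file shown in this thread, and the code only rewrites text without touching any file. A file generated by netplan may be rewritten when netplan regenerates its configuration, so a hand-applied change like this is a test aid rather than a permanent fix.

```python
def set_keep_configuration(network_text, value="static"):
    """Return .network text with exactly one KeepConfiguration=<value>.

    The key is placed right after the first [Network] header, and any earlier
    KeepConfiguration= line in a [Network] section is dropped. A [Network]
    section is created if none exists. All other lines are left untouched.
    """
    out = []
    in_network = False
    inserted = False
    for line in network_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            in_network = stripped == "[Network]"
            out.append(line)
            if in_network and not inserted:
                out.append(f"KeepConfiguration={value}")
                inserted = True
            continue
        # Drop a previous setting so the key appears only once.
        if in_network and stripped.split("=", 1)[0].strip() == "KeepConfiguration":
            continue
        out.append(line)
    if not inserted:
        out += ["", "[Network]", f"KeepConfiguration={value}"]
    return "\n".join(out) + "\n"


SAMPLE = """\
[Match]
Name=eth3

[Network]
LinkLocalAddressing=ipv6
Address=10.0.3.2/24
"""

if __name__ == "__main__":
    print(set_keep_configuration(SAMPLE), end="")
```

Plain line handling is used instead of an INI parser because .network files may repeat keys such as Address=, which a dictionary-based parser would collapse.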

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :
tags: added: sts
Dan Streetman (ddstreet)
description: updated
description: updated
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Leroy, or anyone else affected,

Accepted systemd into eoan-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/242-7ubuntu3.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-eoan to verification-done-eoan. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-eoan. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Eoan):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-eoan
Revision history for this message
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/242-7ubuntu3.2)

All autopkgtests for the newly accepted systemd (242-7ubuntu3.2) for eoan have finished running.
The following regressions have been reported in tests triggered by the package:

gvfs/1.42.1-1ubuntu1 (amd64)
systemd/242-7ubuntu3.2 (ppc64el)
ndctl/unknown (armhf)
casper/1.427 (amd64)
netplan.io/0.98-0ubuntu1 (ppc64el)
munin/unknown (armhf)
linux-oem-osp1/5.0.0-1026.29 (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/eoan/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

(k)rafaeldtinoco@kcluster01:~$ dpkg -l | grep "ii systemd "
ii systemd 243-3ubuntu1 amd64 system and service manager

(k)rafaeldtinoco@kcluster01:~$ for name in kcluster01 kcluster02 kcluster03; do ssh $name "dpkg -l | grep systemd "; done | grep "ii systemd "

ii systemd 243-3ubuntu1 amd64 system and service manager
ii systemd 243-3ubuntu1 amd64 system and service manager
ii systemd 243-3ubuntu1 amd64 system and service manager
----

(k)rafaeldtinoco@kcluster01:~$ for name in kcluster01 kcluster02 kcluster03; do ssh $name "cat /etc/systemd/network/10-netplan-eth3.network"; done
[Match]
Name=eth3

[Network]
LinkLocalAddressing=ipv6
Address=10.0.3.2/24
KeepConfiguration=static
[Match]
Name=eth3

[Network]
LinkLocalAddressing=ipv6
Address=10.0.3.3/24
KeepConfiguration=static
[Match]
Name=eth3

[Network]
LinkLocalAddressing=ipv6
Address=10.0.3.4/24
KeepConfiguration=static

----

(k)rafaeldtinoco@kcluster01:~$ crm status
Stack: corosync
Current DC: kcluster01 (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Tue Nov 19 16:38:15 2019
Last change: Mon Nov 18 12:41:14 2019 by root via crm_resource on kcluster01

3 nodes configured
5 resources configured

Online: [ kcluster01 kcluster02 kcluster03 ]

Full list of resources:

 fence_kcluster01 (stonith:fence_virsh): Started kcluster02
 fence_kcluster02 (stonith:fence_virsh): Started kcluster01
 fence_kcluster03 (stonith:fence_virsh): Started kcluster01
 Resource Group: webserver_virtual_ip
     webserver (systemd:lighttpd): Started kcluster01
     virtual_ip (ocf::heartbeat:IPaddr2): Started kcluster01

----

(k)rafaeldtinoco@kcluster01:~$ for name in kcluster01 kcluster02 kcluster03; do ssh $name "hostname ; ip addr show eth3"; done

kcluster01
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:a0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:a003/64 scope link
       valid_lft forever preferred_lft forever
kcluster02
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:1d:1a:cc brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.3/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe1d:1acc/64 scope link
       valid_lft forever preferred_lft forever
kcluster03
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:13:16 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:1316/64 scope link
       valid_lft forever preferred_lft forever

----

in parallel:

(k)rafaeldtinoco@kcluster01:~$ journalctl -f -u pacemaker

and check if events are generated (vip monitor detects changes)

----

(k)rafaeldtinoco@kcluster01:~$ systemctl restart sy...
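
A minimal sketch of the check this test case works toward, assuming the virtual IP is the 10.0.3.1 secondary address on eth3 shown above and that the unit being restarted is systemd-networkd (as in the bug title). The names `has_addr` and `vip_survives_restart` and the 5-second settle time are illustrative and not part of the original test. Since pacemaker's monitor may re-add the address after a failure, the journal output remains the more reliable signal.

```shell
# Succeeds if the IPv4 address $1 appears in `ip -o -4 addr show` output read from stdin.
has_addr() {
    awk -v want="$1" '
        $3 == "inet" { split($4, cidr, "/"); if (cidr[1] == want) found = 1 }
        END { exit !found }
    '
}

# Hypothetical wrapper: check that a VIP is present before and after restarting networkd.
# Arguments: interface name, VIP address (eth3 and 10.0.3.1 in the transcript above).
vip_survives_restart() {
    dev=$1
    vip=$2
    ip -o -4 addr show dev "$dev" | has_addr "$vip" || return 2   # VIP was not here to begin with
    systemctl restart systemd-networkd || return 3
    sleep 5   # arbitrary settle time (an assumption)
    ip -o -4 addr show dev "$dev" | has_addr "$vip"
}
```

On the node holding the resource this could be run as root with `vip_survives_restart eth3 10.0.3.1`; a non-zero exit status means the address was missing at one of the two checks.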


tags: added: verification-done verification-done-eoan
removed: verification-needed verification-needed-eoan
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Flagging this as Won't Fix, as heartbeat is already being kept just for historical reasons (and systemd-networkd can work around the problem with the fix we're backporting to it: the KeepConfiguration= setting in .network files).

Changed in heartbeat (Ubuntu Eoan):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in heartbeat (Ubuntu Disco):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in heartbeat (Ubuntu Bionic):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in heartbeat (Ubuntu):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
no longer affects: heartbeat (Ubuntu Eoan)
no longer affects: heartbeat (Ubuntu Disco)
no longer affects: heartbeat (Ubuntu Bionic)
Changed in heartbeat (Ubuntu):
status: Triaged → Won't Fix
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 242-7ubuntu3.2

---------------
systemd (242-7ubuntu3.2) eoan; urgency=medium

  [ Dan Streetman ]
  * d/extra/dhclient-enter-resolved-hook:
    - Replace use of bash-only &> with > and 2> (LP: #1849608)
  * d/p/lp1849658-resolved-set-stream-type-during-DnsStream-creation.patch:
    - Fix bug in refcounting TCP stream types (LP: #1849658)
  * d/extra/dhclient-enter-resolved-hook: cleanup temp $newstate file

  [ Rafael David Tinoco ]
  * Add support to KeepConfiguration= fixing behaviour for HA (LP: #1815101)
    - d/p/lp1815101-01-networkd-add-support-to-keep-configuration.patch
    - d/p/lp1815101-02-networkd-stop-clients-when-networkd-shuts-down.patch
    - d/p/lp1815101-03-network-add-KeepConfiguration-dhcp-on-stop.patch
    - d/p/lp1815101-04-network-make-KeepConfiguration-static-drop-DHCP-addr.patch
    - d/p/lp1815101-05-man-add-documentation-about-KeepConfiguration.patch

systemd (242-7ubuntu3.1) eoan; urgency=medium

  [ Balint Reczey ]
  * Fix shutdown and related actions from the login screen (LP: #1847896)
    File: debian/patches/logind-consider-greeter-sessions-suitable-as-display-sess.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=b407dfd8c9dc81594553c27467c35b383333d74c
  * debian/gbp.conf: Set debian-branch to ubuntu-eoan
    File: debian/gbp.conf
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=f399ce2cf4701a2dbb4b3505d2dd17a210d62f5c

  [ Dan Streetman ]
  * Fix bogus routes after DHCP lease change (LP: #1831787)
    Files:
    - debian/patches/lp1831787/0001-networkd-Add-back-static-routes-after-DHCPv4-lease-e.patch
    - debian/patches/lp1831787/0002-network-set-preferred-source-in-removing-route-entry.patch
    - debian/patches/lp1831787/0003-network-lower-log-level-about-critical-connection.patch
    - debian/patches/lp1831787/0004-network-reset-Link-dhcp4_configured-flag-earlier.patch
    - debian/patches/lp1831787/0005-network-split-dhcp_lease_lost-into-small-pieces.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=ced3f5c2f619083f7beb164d94d4ccfe52222fe8
  * Set src address for dhcp 'classless' routes (LP: #1835581)
    File: debian/patches/lp1835581-src-network-networkd-dhcp4.c-set-prefsrc-for-classle.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=6a7ef370fb1335548448920be4ae6176b67044a8
  * Allows cache=no-negative option to be set, ignoring negative answers to
    be cached (LP: #1668771)
    File: debian/patches/lp1668771-resolved-switch-cache-option-to-a-tri-state-option-s.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=27def26f5b1d1b8ba314c4a925fc1b7c43837f86

 -- Dan Streetman <email address hidden> Fri, 01 Nov 2019 16:33:08 -0400

Changed in systemd (Ubuntu Eoan):
status: Fix Committed → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for systemd has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Dan Streetman (ddstreet) wrote :

as disco reaches EOL next week, marking this as wontfix for disco.

Changed in systemd (Ubuntu Disco):
status: Confirmed → Won't Fix
Changed in keepalived (Ubuntu Disco):
status: Confirmed → Won't Fix
Revision history for this message
David Negreira (dnegreira) wrote :

Can we backport the fixes to Bionic?

Revision history for this message
Balint Kovacs (kovacs-balint-o) wrote :

Hi all,

thanks for the fixes in Eoan. Unfortunately we have a product based on disco and cannot move forward at this time. Being a networking shop, this issue has a serious effect on us and we would like to avoid moving to something like ifupdown2 within our stable branch.

For our users the real impact of the bug is not that the interface we are currently reconfiguring suffers downtime, but that _all_ interfaces have their aliases removed if networkd is restarted. The proposed KeepConfiguration solution rather defeats the purpose of reconfiguring the interfaces, as old addresses are kept and need to be handled manually. It also interferes with how DHCP works. I believe this might be an issue for others as well.

From our point of view the ideal solution would be a combination of the keepalived patch that detects VIP removal and systemd version 244 that already supports "networkctl reconfigure" and "networkctl reload".

Is there any chance that v244 is backported to bionic? It is already included in focal and debian stable backports, but unfortunately I am not familiar enough with systemd development to tell what the impact of this would be.

As for keepalived, in bug #1819074 there was an ongoing investigation into the patch that implements the keepalived transition on removing the VIP. We have traced this functionality back to this patch:

https://github.com/acassen/keepalived/commit/0b1528c76d3fe8d1c5765841df86c59570a036da

It was born before v1.3.6 was released, so we hope that it is self-contained enough for a backport if v2.0 of keepalived is not included in bionic-backports.

Best,
Balint

Changed in keepalived (Ubuntu Xenial):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
importance: Undecided → Medium
status: New → Confirmed
no longer affects: heartbeat (Ubuntu Xenial)
Changed in systemd (Ubuntu Xenial):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote : Re: [Bug 1815101] Re: [master] Restarting systemd-networkd breaks keepalived, heartbeat, corosync, pacemaker (interface aliases are restarted)

Balint, based on your input...

> thanks for the fixes in Eoan. Unfortunately we have a product based on
> disco and cannot move forward at this time. Being a networking shop,
> this issue has a serious effect on us and we would like to avoid moving
> to something like ifupdown2 within our stable branch.

So, Disco is EOL as it is not an LTS version; that is why it did not
get a fix (as the fix is very close to the one done in Eoan). Since
it's unsupported by the community, it's up to you to backport the Eoan
fixes to Disco if you'd like... you can even create a PPA for your
product and distribute it along.

> For our users the real impact of the bug is not that the interface
> that we are currently reconfiguring is suffering a downtime, but the
> fact that _all_ interfaces have their aliases removed if networkd is
> restarted. The proposed KeepConfiguration solution kind of beats the
> purpose of reconfiguring the interfaces, as old addresses are kept and
> need to be handled manually. Also it interferes with how DHCP works. I
> believe this might be an issue for others as well.

We are following systemd-networkd upstream decisions here. The option
"dhcp" only exists for CERTAIN scenarios (when root disk depends on
that connection, for iSCSI and/or NFS/ROOT for example). It is
explicitly said in the documentation:

"""
Takes a boolean or one of "static", "dhcp-on-stop", "dhcp". When
"static", systemd-networkd will not drop static addresses and routes
on starting up process. When set to "dhcp-on-stop", systemd-networkd
will not drop addresses and routes on stopping the daemon. When
"dhcp", the addresses and routes provided by a DHCP server will never
be dropped even if the DHCP lease expires. This is contrary to the
DHCP specification, but may be the best choice if, e.g., the root
filesystem relies on this connection. The setting "dhcp" implies
"dhcp-on-stop", and "yes" implies "dhcp" and "static". Defaults to
"no".
"""

and it is a question of choice: to have a window of opportunity for
duplicate IPs - in cases where there is no dynamic IP mapping to that
mac address - but possibly maintain the connection instead of causing
uninterruptible I/Os trying to shut down a machine, for example. I
particularly don't like this option but it is not the default one and
was meant for a specific purpose.
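
The implication rules in the quoted man page text can be spelled out as a small lookup. This is only a sketch of the documented semantics, not networkd's own parser: `keepconfig_expand` is a hypothetical name, and of the possible boolean spellings only "yes" and "no" are handled here.

```shell
# Print the behaviours that a KeepConfiguration= value enables, following the
# quoted documentation: "dhcp" implies "dhcp-on-stop", "yes" implies "dhcp" and "static".
keepconfig_expand() {
    case $1 in
        no|"")        echo "none" ;;                       # empty means the default, "no"
        static)       echo "static" ;;
        dhcp-on-stop) echo "dhcp-on-stop" ;;
        dhcp)         echo "dhcp dhcp-on-stop" ;;
        yes)          echo "static dhcp dhcp-on-stop" ;;
        *)            echo "unknown value: $1" >&2; return 1 ;;
    esac
}
```

For the HA case discussed in this bug only `static` is needed, which leaves the DHCP behaviour untouched.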

>
> From our point of view the ideal solution would be a combination of the
> keepalived patch that detects VIP removal and systemd version 244 that
> already supports "networkctl reconfigure" and "networkctl reload".

networkctl reconfigure/reload is a new functionality and won't be
added to previous already released versions as this is against SRU
guidelines. Systemd 244.2-1ubuntu1 is being included in 20.04, our
NEXT LTS version.

As I said before, you can try backporting systemd 244 to disco, or
bionic, if you are willing to support it on your own as it was already
EOL for community support. You should follow:
https://packaging.ubuntu.com/html/backports.html if you would like to
do that.

For the keepalived patches, they could be backported to Eoan, maybe
Bionic and Xenial depending on the amount of work. But then I would
need a practical example of wh...


Dan Streetman (ddstreet)
tags: added: ddstreet
Revision history for this message
George Kraft (cynerva) wrote :

Removing charm-keepalived since I believe no changes are needed there. It should pick up fixes once they are available on apt archives.

no longer affects: charm-keepalived
Revision history for this message
Claudio Kuenzler (napsty) wrote :

FYI, I stumbled on this problem after a system update (which broke production!), collected data in a troubleshooting session and documented it here: https://www.claudiokuenzler.com/blog/959/keepalived-virtual-ip-addresses-gone-lost-after-systemd-update

Only once I had found out that restarting systemd-networkd causes the keepalived VIPs to disappear did I come across this bug.

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

@napsty: the "workaround" (from your blog) is actually to use:

- ifupdown/bridge-utils/vlan/resolvconf for network setup OR
- use systemd-networkd DIRECTLY with the KeepConfiguration= option in .network file

Just highlighting it here.

@ddstreet, you said you would try to come up with the netplan change for KeepConfiguration. Did you have time to check on this ? (just checking).

Cheers o/

Changed in keepalived (Ubuntu):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in keepalived (Ubuntu Xenial):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in keepalived (Ubuntu Bionic):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in keepalived (Ubuntu Disco):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in keepalived (Ubuntu Eoan):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in systemd (Ubuntu):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in systemd (Ubuntu Xenial):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in systemd (Ubuntu Bionic):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in systemd (Ubuntu Disco):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in systemd (Ubuntu Eoan):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in netplan:
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
no longer affects: keepalived (Ubuntu Eoan)
no longer affects: keepalived (Ubuntu Disco)
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

TL;DR TODO SUMMARY:

- netplan change to support KeepConfiguration= for systemd-networkd backend (Groovy)
- backport this change: netplan for Ubuntu Focal (SRU)
- backport this change: netplan for Ubuntu Eoan (SRU, WontFix due to EOL ?)
- backport this change: netplan for Ubuntu Bionic (SRU)
- backport this change: netplan for Ubuntu Xenial (SRU, WontFix ?)

Changed in systemd (Ubuntu Focal):
status: New → Fix Released
Changed in keepalived (Ubuntu Focal):
status: New → Confirmed
no longer affects: heartbeat (Ubuntu Focal)
Dan Streetman (ddstreet)
Changed in systemd (Ubuntu Bionic):
assignee: nobody → Jorge Niedbalski (niedbalski)
status: Confirmed → In Progress
Dan Streetman (ddstreet)
Changed in systemd (Ubuntu Bionic):
assignee: Jorge Niedbalski (niedbalski) → Eric Desrochers (slashd)
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Ubuntu HA wise:

I'm recommending all HA clusters to have the cluster interfaces configured with systemd-networkd DIRECTLY instead of wrapping it through netplan.io. At least until we're sure that HA has no issues with netplan.io, having it configured directly will allow us to isolate possible issues.

I see that this has been assigned to @slashd. Eric, the important thing here is to have the netplan fix in focal (as it is the latest LTS) in order for HA to be supported with it. KeepConfiguration= is good enough, for now, if using systemd-networkd only.

Thank you!

tags: removed: server-next
Revision history for this message
Eric Desrochers (slashd) wrote :

I have a first iteration of a package:

It's neither a final solution nor a long-term one. It is only meant to determine whether it fixes the problem before considering an SRU. (Ideally one would test this package in a non-production environment.)

Adding this PPA to your system
sudo add-apt-repository ppa:slashd/sf263217
sudo apt-get update

Please report any feedback in this bug.

- Eric

Revision history for this message
Eric Desrochers (slashd) wrote :

The above test package has been made for 'systemd' in bionic ^

Revision history for this message
Sheng-Kai Lin (kester-lin) wrote :

Dear Eric Desrochers,
I added the PPA to my Ubuntu 18.04 corosync/pacemaker service node.
Then I upgraded the following packages: libnss-systemd libpam-systemd libsystemd0 libudev1 systemd systemd-sysv udev.

But it still failed after reconnecting the network.
crmsh shows the following:
ERROR: status: crm_mon (rc=107): Connection to cluster failed: Transport endpoint is not connected.

I also checked dmesg but it seems OK.
e1000: enp0s8 NIC Link is Down
e1000: enp0s8 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX

Could you describe this in more detail, to help me figure out whether it is a mistake in my procedure or something wrong in my environment?

Thank you.

Eric Desrochers (slashd)
Changed in systemd (Ubuntu Bionic):
assignee: Eric Desrochers (slashd) → nobody
Revision history for this message
Dan Streetman (ddstreet) wrote :

i'm marking this as wont-fix for xenial.

i'm inclined to also mark this as wont-fix for bionic, unless there are still people affected by this problem using bionic without some other workaround.

Changed in systemd (Ubuntu Xenial):
status: Confirmed → Won't Fix
Changed in systemd (Ubuntu Bionic):
status: In Progress → Incomplete
Revision history for this message
Jasper Spaans (jap171) wrote :

For the people on Focal that want to use netplan and keepalived together: you can just put in an override for the network unit file, to keep systemd-networkd from touching your interface!

```
$ cat /etc/systemd/network/10-netplan-eno1.network.d/override.conf
[Network]
KeepConfiguration=static
$
```

This might be a good enough workaround until this is really fixed.

Revision history for this message
Sebastian (slovdahl) wrote :

> i'm inclined to also mark this as wont-fix for bionic, unless there are still people affected by this problem using bionic without some other workaround.

What kind of workarounds for bionic does this refer to? I have not found any reliable workarounds yet, at least.

Revision history for this message
Jianan Wang (wangjianan-zju) wrote :

> > i'm inclined to also mark this as wont-fix for bionic, unless there are still people affected
  by this problem using bionic without some other workaround.

> What kind of workarounds for bionic does this refer to? I have not found any reliable workarounds
  yet at least.

+1. I did not find any other solution either. Please backport the fix to bionic as well. Thanks

Revision history for this message
Eric Desrochers (slashd) wrote :

Any volunteers to test a package in Bionic that attempts to support KeepConfiguration?

Revision history for this message
Maanus Kask (maanus) wrote :

I want to test it in Bionic.

Revision history for this message
Eric Desrochers (slashd) wrote :

@maanus, That is a first iteration.

sudo add-apt-repository ppa:slashd/keepconfiguration
sudo apt-get update

Let me know the outcome.

- Eric

Revision history for this message
Maanus Kask (maanus) wrote :

I am using netplan, the default in bionic.
I added the repo, updated and rebooted.
'systemctl restart systemd-networkd' resulted in the keepalived VIP being lost.

I added the following file and rebooted:
/etc/systemd/network/10-netplan-ens160.network :
---
[Match]
Name=ens160

[Network]
KeepConfiguration=static
---
Network did not come up after reboot.

I then changed /etc/systemd/network/10-netplan-ens160.network and rebooted:
---
[Match]
Name=ens160

[Network]
Address=10.1.1.233/29
Gateway=10.1.1.238
KeepConfiguration=static
---
'systemctl restart systemd-networkd' did not have any bad results - keepalived VIP remained as expected.

Now I need to have IP and gateway in two files - /etc/netplan/01-netcfg.yaml and /etc/systemd/network/10-netplan-ens160.network

Maanus

Revision history for this message
Eric Desrochers (slashd) wrote :

Thanks @Maanus.

So the outcome is positive here then.

- Eric

Revision history for this message
Dan Streetman (ddstreet) wrote :

> Now I need to have IP and gateway in two files - /etc/netplan/01-netcfg.yaml and
> /etc/systemd/network/10-netplan-ens160.network

to clarify, don't do that, you should create a systemd-networkd 'drop-in' instead of copying/modifying the file.

For example, if the networkd config filename for your interface is '10-netplan-ens160.network' (regardless of which directory it's located in), you should create a new file '/etc/systemd/network/10-netplan-ens160.network.d/override.conf' with only the content you want to add/modify, specifically in this case:

---
[Network]
KeepConfiguration=static
---

For more detail on systemd-networkd 'drop-in' files, see 'man systemd.network' in the first paragraph.

Revision history for this message
Eric Desrochers (slashd) wrote :

Good catch Dan.

Maanus could you repeat the testing with what Dan brought up ?

Revision history for this message
Maanus Kask (maanus) wrote :

The "default bionic install" uses only netplan yaml and /etc/systemd/network/ is empty.

If I switch to systemd.network style, I have to create config files from scratch anyway and then this KeepConfiguration helps. (Of course I do not need to keep the netplan yaml after enabling systemd-networkd, my remark for having the IP configuration in two files was wrong)

Using '/etc/systemd/network/10-netplan-ens160.network.d/override.conf' with KeepConfiguration works perfectly the same.

Revision history for this message
Dan Streetman (ddstreet) wrote :

> The "default bionic install" uses only netplan yaml and /etc/systemd/network/ is empty.

yes, but netplan creates .network config files named in a deterministic way so you know what the name of the networkd file it creates will be. I don't think netplan currently has a mechanism to include keep-configuration options in the networkd config it creates.

> Using '/etc/systemd/network/10-netplan-ens160.network.d/override.conf' with KeepConfiguration works perfectly the same.

exactly, which is why you should use the drop-in instead of manually duplicating and editing the entire netplan-created .network config file, as you complained about when you said:
> Now I need to have IP and gateway in two files
since, no, you don't.

Or, drop netplan and just directly configure systemd-networkd.

Revision history for this message
Teluka (mateusz-p) wrote :

I've tested 237-3ubuntu10.50+testpkg20210802b3 from the PPA and it resolves the issue.

Revision history for this message
Eric Desrochers (slashd) wrote :

Uploaded into bionic upload queue, now waiting for SRU approval.

- Eric

Changed in systemd (Ubuntu Bionic):
assignee: nobody → Eric Desrochers (slashd)
status: Incomplete → In Progress
Revision history for this message
Maanus Kask (maanus) wrote :

Thank you Dan and Eric!

Summarizing the how-to for bionic, for myself:
- Upgrade systemd to the version with the "keepconfiguration" fix.
- Look at the /run/systemd/network/*.network filename; in my example it is "10-netplan-ens160.network".
- Make a directory by adding ".d" to this name: /etc/systemd/network/10-netplan-ens160.network.d
- Add a file "override.conf" to this directory, with the content:
---
[Network]
KeepConfiguration=static
---
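
The steps above could be scripted roughly as follows. This is a sketch under assumptions: `add_keepconfig_dropin` is a hypothetical helper name, and the optional second argument exists only so the function can be tried against a scratch directory instead of /etc/systemd/network.

```shell
# Create <etc_dir>/<unit>.d/override.conf containing KeepConfiguration=static.
# $1: name (or full path) of the netplan-generated .network file,
#     e.g. /run/systemd/network/10-netplan-ens160.network
# $2: target configuration directory (defaults to /etc/systemd/network)
add_keepconfig_dropin() {
    unit=$(basename "$1")                 # accept a full path or a bare file name
    etc_dir=${2:-/etc/systemd/network}
    case $unit in
        *.network) ;;
        *) echo "not a .network file: $unit" >&2; return 1 ;;
    esac
    mkdir -p "$etc_dir/$unit.d" || return 1
    printf '[Network]\nKeepConfiguration=static\n' > "$etc_dir/$unit.d/override.conf"
}
```

For the example in this comment that would be `add_keepconfig_dropin /run/systemd/network/10-netplan-ens160.network`, run as root. The setting only takes effect once systemd-networkd has re-read its configuration.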

Revision history for this message
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Leroy, or anyone else affected,

Accepted systemd into bionic-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/237-3ubuntu10.51 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested, what testing has been performed on the package and change the tag from verification-needed-bionic to verification-done-bionic. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-bionic. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Bionic):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-bionic
removed: verification-done
Revision history for this message
Łukasz Zemczak (sil2100) wrote :

Since the changeset is composed from multiple patches, please be sure to execute both the test cases mentioned in the Test Case section. Thanks!

Revision history for this message
Eric Desrochers (slashd) wrote :

Hi Maanus,

Would you mind performing the 'test cases' now against systemd's bionic-proposed package, and reporting the outcome?

This is one of the last steps before Lukasz can approve the proposed package, once the minimum aging period (7 days) in the verification phase has passed.

- Eric

Revision history for this message
Maanus Kask (maanus) wrote :

I installed the package from bionic-proposed and performed the test cases using keepalived - VIP is not lost. This resolves it.

Thank you!

Revision history for this message
Eric Desrochers (slashd) wrote :

Thanks Maanus for all the testing. Much appreciated.

I'll take care of the rest in a couple of days with Lukasz.
The package needs to stay in proposed for a couple more days (minimum 7 days).

- Eric

tags: added: verification-done-bionic
removed: verification-needed-bionic
Revision history for this message
Eric Desrochers (slashd) wrote :

systemd has reached day 7 without any autopkgtest failure or negative feedback.

I have asked 'sil2100' to promote the package into bionic-updates.

Stay tuned ...

- Eric

Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 237-3ubuntu10.51

---------------
systemd (237-3ubuntu10.51) bionic; urgency=medium

  * Add support to keepconfiguration (LP: #1815101)
    - lp1815101-0001-add-macro-if-flags-are-set.patch
    - lp1815101-0002-networkd-add-support-to-keepconfiguration.patch
    - lp1815101-0003-network-use-hashmap_steal_first-rather-than-hashmap_.patch
    - lp1815101-0004-networkd-stop-clients-when-networkd-shuts-down.patch
    - lp1815101-0005-network-add-KeepConfiguration-dhcp-on-stop.patch
    - lp1815101-0006-network-make-KeepConfiguration-static-drop-DHCP-addr.patch
    - lp1815101-0007-man-add-documentation-about-KeepConfiguration.patch

 -- Eric Desrochers <email address hidden> Mon, 26 Jul 2021 11:31:02 -0400

Changed in systemd (Ubuntu Bionic):
status: Fix Committed → Fix Released
Revision history for this message
Lukas Märdian (slyon) wrote :

netplan uses the "networkctl reload/reconfigure" commands nowadays, instead of hard restarting systemd-networkd: https://github.com/canonical/netplan/pull/200

This change was activated in v0.104 in the Distro, which landed in Jammy and is currently being SRUed to Focal and Impish (LP: #1964481).

Please re-open if you can still observe this issue with netplan 0.104.
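
As a rough illustration of the difference, the sketch below picks `networkctl reload` when the installed systemd is at least 244 (the version named earlier in this bug as the first to support reload/reconfigure) and only falls back to a hard restart otherwise. Both function names are hypothetical, and this is not how netplan itself implements the change.

```shell
# Succeeds if the version banner in $1 (the first line of `networkctl --version`,
# e.g. "systemd 249 (249.11-0ubuntu3.6)") reports systemd 244 or later.
supports_networkctl_reload() {
    major=$(printf '%s\n' "$1" | awk 'NR == 1 && $1 == "systemd" { print int($2) }')
    [ -n "$major" ] && [ "$major" -ge 244 ]
}

# Hypothetical wrapper: prefer a reload over restarting the daemon.
apply_network_change() {
    if supports_networkctl_reload "$(networkctl --version)"; then
        networkctl reload
    else
        # Hard restart: this is the path that drops externally managed
        # addresses unless KeepConfiguration= is set for the interface.
        systemctl restart systemd-networkd
    fi
}
```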

Changed in netplan:
status: Confirmed → Fix Released
Revision history for this message
Garagoth (garagoth) wrote :

Hi!

I am having trouble making this work in a graceful manner.
Fresh Ubuntu 22.04, systemd 249.11-0ubuntu3.6, netplan 0.104-0ubuntu2.1.
Host has multiple interfaces (6 physical, 2 bonds, 3 vlans on those bonds) and multiple (20 to 30) IP addresses on those vlan interfaces.

EVERY time I issue `netplan apply`, everything goes down and up again, even with no changes to the netplan configuration.
systemd-networkd logs interfaces going down and up again.

Changing anything in netplan config (like adding or removing one of IP addresses from single interface), doing `netplan generate` && `networkctl reload` restarts everything as well.

How am I supposed to simply add or remove an IP address on an existing interface without causing a delay of about 5 seconds in network traffic while all interfaces are being restarted?
Am I doing something incorrectly here?

Regards,
Marcin.

Revision history for this message
Lukas Märdian (slyon) wrote :

The KeepConfiguration= setting is part of systemd nowadays, so I'm closing the systemd component.

Netplan does not currently make use of "KeepConfiguration=" [0] though. Marcin, could you try to place a systemd-networkd override file, e.g. in /etc/systemd/network/10-netplan-eth0.network.d/override.conf (depending on your interface name), that contains a corresponding KeepConfiguration setting and check if that makes any difference?

[0] https://www.freedesktop.org/software/systemd/man/systemd.network.html#KeepConfiguration=

Changed in netplan:
importance: Undecided → Medium
status: Fix Released → Incomplete
Changed in systemd (Ubuntu):
status: In Progress → Fix Released
Revision history for this message
Garagoth (garagoth) wrote :

I do not see any difference.
However, I am not convinced that this is not a networkd issue anyway... here are journal logs from `netplan apply`, where n1p1 and n1p2 are the first and second ports of the first network card, and n2p1 and n2p2 those of the second network card:

systemd[1]: Reloading.
systemd-networkd[2352]: enp161s0f1np1: Re-configuring with /run/systemd/network/10-netplan-n2p2.network
systemd-networkd[2352]: lan_si: Re-configuring with /run/systemd/network/10-netplan-lan_si.network
systemd-networkd[2352]: eno1: Re-configuring with /run/systemd/network/10-netplan-zzz-all-en.network
systemd-networkd[2352]: wan: Re-configuring with /run/systemd/network/10-netplan-wan.network
systemd-networkd[2352]: bond_wan: Re-configuring with /run/systemd/network/10-netplan-bond_wan.network
systemd-networkd[2352]: eno2: Re-configuring with /run/systemd/network/10-netplan-zzz-all-en.network
systemd-networkd[2352]: lan_mgmnt: Re-configuring with /run/systemd/network/10-netplan-lan_mgmnt.network
systemd-networkd[2352]: eno33np0: Re-configuring with /run/systemd/network/10-netplan-n1p1.network
systemd-networkd[2352]: eno34np1: Re-configuring with /run/systemd/network/10-netplan-n1p2.network
systemd-networkd[2352]: bond_lan: Re-configuring with /run/systemd/network/10-netplan-bond_lan.network
systemd-networkd[2352]: enp161s0f0np0: Re-configuring with /run/systemd/network/10-netplan-n2p1.network
systemd-networkd[2352]: enp161s0f1np1: Link DOWN
systemd-networkd[2352]: enp161s0f1np1: Lost carrier
systemd-networkd[2352]: enp161s0f1np1: Re-configuring with /run/systemd/network/10-netplan-n2p2.network
systemd-networkd[2352]: eno2: Re-configuring with /run/systemd/network/10-netplan-zzz-all-en.network
systemd[1]: Condition check resulted in OpenVSwitch configuration for cleanup being skipped.
chronyd[17235]: Source 10.160.4.2 offline
chronyd[17235]: Source 10.160.4.1 offline
kernel: bond_wan: (slave enp161s0f0np0): link status definitely down, disabling slave
kernel: bond_wan: now running without any active interface!
kernel: bond_wan: (slave eno33np0): link status definitely down, disabling slave
kernel: bond_lan: (slave enp161s0f1np1): link status definitely down, disabling slave
kernel: bond_lan: (slave eno34np1): link status definitely down, disabling slave
kernel: bond_lan: now running without any active interface!
systemd-networkd[2352]: enp161s0f0np0: Re-configuring with /run/systemd/network/10-netplan-n2p1.network
systemd-networkd[2352]: eno34np1: Re-configuring with /run/systemd/network/10-netplan-n1p2.network
systemd-networkd[2352]: eno33np0: Re-configuring with /run/systemd/network/10-netplan-n1p1.network
systemd-networkd[2352]: eno33np0: Link DOWN
systemd-networkd[2352]: eno33np0: Lost carrier
systemd-networkd[2352]: eno1: Re-configuring with /run/systemd/network/10-netplan-zzz-all-en.network
systemd-networkd[2352]: eno34np1: Link DOWN
systemd-networkd[2352]: eno34np1: Lost carrier
systemd-networkd[2352]: bond_wan: Re-configuring with /run/systemd/network/10-netplan-bond_wan.network
systemd-networkd[2352]: bond_lan: Re-configuring with /run/systemd/network/10-netplan-bond_lan.network
systemd-networkd[2352]: enp161s0f0np0: Link DOWN
systemd-networkd[2352]: enp161s0f0np0: Los...

Garagoth (garagoth) wrote :

It is reproducible on a VM with a simple netplan file with only one ethernet and one bond interface:

network:
    ethernets:
        n1p1:
            match:
                name: ens192
    bonds:
        bond_all:
            interfaces:
            - n1p1
            parameters:
                mode: balance-rr
            addresses:
            - 10.1.1.206/24
            nameservers:
                addresses:
                - 10.1.1.1
            routes:
            - metric: 150
              to: 0.0.0.0/0
              via: 10.1.1.1

This always produces the following logs, together with a brief network loss:
Oct 18 09:03:17 test01 systemd[1]: Reloading.
Oct 18 09:03:18 test01 systemd-networkd[738]: bond_all: Re-configuring with /run/systemd/network/10-netplan-bond_all.network
Oct 18 09:03:18 test01 systemd-networkd[738]: ens192: Re-configuring with /run/systemd/network/10-netplan-n1p1.network
Oct 18 09:03:18 test01 systemd-networkd[738]: ens192: Link DOWN
Oct 18 09:03:18 test01 systemd-networkd[738]: ens192: Lost carrier
Oct 18 09:03:18 test01 systemd-networkd[738]: bond_all: Re-configuring with /run/systemd/network/10-netplan-bond_all.network
Oct 18 09:03:18 test01 systemd-networkd[738]: ens192: Re-configuring with /run/systemd/network/10-netplan-n1p1.network
Oct 18 09:03:18 test01 kernel: vmxnet3 0000:0b:00.0 ens192: intr type 3, mode 0, 5 vectors allocated
Oct 18 09:03:18 test01 kernel: vmxnet3 0000:0b:00.0 ens192: NIC Link is Up 10000 Mbps
Oct 18 09:03:18 test01 systemd-networkd[738]: ens192: Link UP
Oct 18 09:03:18 test01 systemd-networkd[738]: ens192: Gained carrier
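
To check a reproduction attempt without reading the journal by hand, the flap can be counted from output like the above. This is a minimal sketch that assumes the message format shown in these logs (`<iface>: Link DOWN`). The name `count_link_down` is made up for illustration, and the input lines would come from something like `journalctl -u systemd-networkd`.

```python
import re
from collections import Counter

# Matches networkd lines such as "systemd-networkd[738]: ens192: Link DOWN".
EVENT_RE = re.compile(
    r"systemd-networkd\[\d+\]: (?P<iface>\S+): "
    r"(?P<event>Link DOWN|Link UP|Lost carrier|Gained carrier)\s*$"
)


def count_link_down(journal_lines):
    """Return a Counter mapping interface name to number of 'Link DOWN' events."""
    downs = Counter()
    for line in journal_lines:
        match = EVENT_RE.search(line)
        if match and match.group("event") == "Link DOWN":
            downs[match.group("iface")] += 1
    return downs
```

An empty result after `netplan apply` would suggest that no interface was taken down during the reconfiguration.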

Dave Jones (waveform)
Changed in netplan:
status: Incomplete → Triaged
Lukas Märdian (slyon) wrote :

Thank you for the minimal reproducer!

Using that, I can reproduce the issue in a Kinetic LXD container and observe the same logs as you do, especially the "eth0: Link DOWN" part.

Oct 20 13:43:47 test systemd-networkd[1276]: eth0: Reconfiguring with /run/systemd/network/10-netplan-eth0.network.
Oct 20 13:43:47 test systemd-networkd[1276]: bond_all: Reconfiguring with /run/systemd/network/10-netplan-bond_all.network.
Oct 20 13:43:47 test systemd-networkd[1276]: bond_all: DHCPv6 lease lost
Oct 20 13:43:47 test systemd-networkd[1276]: eth0: Link DOWN
Oct 20 13:43:47 test systemd-networkd[1276]: eth0: Lost carrier
Oct 20 13:43:47 test systemd-networkd[1276]: eth0: Configuring with /run/systemd/network/10-netplan-eth0.network.
Oct 20 13:43:47 test systemd-networkd[1276]: bond_all: Configuring with /run/systemd/network/10-netplan-bond_all.network.
Oct 20 13:43:47 test systemd-networkd[1276]: bond_all: DHCPv6 lease lost
Oct 20 13:43:47 test systemd-networkd[1276]: eth0: Link UP
Oct 20 13:43:47 test systemd-networkd[1276]: eth0: Gained carrier

I tried using override configs, but that didn't change anything.
# cat /etc/systemd/network/10-netplan-{eth0,bond_all}.network.d/override.conf
[Network]
KeepConfiguration=true

The same issue happens when calling "netplan apply", "networkctl reconfigure eth0/bond_all" (which netplan apply is calling under the hood) or "systemctl restart systemd-networkd.service".

I think there isn't a lot that netplan could do to avoid this if sd-networkd keeps insisting on restarting the interface. Would you mind opening a bug report with the upstream systemd developers about this? https://github.com/systemd/systemd/issues

Changed in netplan:
importance: Medium → Low
Lukas Märdian (slyon) wrote :

In bug #1992241 it is suggested that this problem is due to sd-networkd relying on the file's timestamp to decide whether the interface needs to be restarted. Netplan generate/apply would re-create the (identical) files in /run/systemd/network/10-netplan-*.network, but those would have a newer timestamp...
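
If the timestamp hypothesis holds, one way a generator could avoid the problem is to leave unchanged files alone. The following is only a sketch of that idea, not netplan's actual implementation; `write_if_changed` is a hypothetical helper.

```python
import os
import tempfile


def write_if_changed(path: str, content: str) -> bool:
    """Write content to path only if it differs; return True if written.

    Leaving an identical file untouched preserves its modification time.
    """
    try:
        with open(path, "r", encoding="utf-8") as existing:
            if existing.read() == content:
                return False  # identical: keep the old file and its mtime
    except FileNotFoundError:
        pass  # no previous file, fall through and create it

    # Write to a temporary file in the same directory, then rename atomically.
    # mkstemp creates the file with mode 0600; adjust permissions as needed.
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w", encoding="utf-8") as tmp:
        tmp.write(content)
    os.replace(tmp_path, path)
    return True
```

With this approach, a second run with identical input would not touch the existing file, so its modification time would stay the same.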

sles (slesru) wrote :

If the problem is the timestamp, then this is a netplan problem: why does it regenerate files when nothing has changed?

Garagoth (garagoth) wrote :

I did report to systemd: https://github.com/systemd/systemd/issues/25067
As for timestamps, systemd-networkd does not do any weird things when timestamps change on non-bridge interfaces.

sles (slesru) wrote :

Well, nothing weird, yes; it probably just restarts the interface,
which is wrong if it is another interface's config that changed...

Garagoth (garagoth) wrote :

I think the fix will be released in systemd 253.

Alex Kompel (velocloud) wrote :

We are seeing a SIGSEGV related to this in Bionic. lp1815101-0006-network-make-KeepConfiguration-static-drop-DHCP-addr.patch is missing network checks in link_drop_foreign_config.

Would it be possible to incorporate this patch from upstream to prevent this? https://github.com/systemd/systemd/commit/b1b0b42e48303134731e017a108c6c334ef5f4c8

----

Core was generated by `/lib/systemd/systemd-networkd'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000555962aa1c34 in link_drop_foreign_config (link=link@entry=0x555963583f30) at ../src/network/networkd-link.c:2741
2741 ../src/network/networkd-link.c: No such file or directory.
(gdb) bt
#0 0x0000555962aa1c34 in link_drop_foreign_config (link=link@entry=0x555963583f30) at ../src/network/networkd-link.c:2741
#1 0x0000555962aa233d in link_carrier_lost.lto_priv.328 (link=<optimized out>, link=<optimized out>) at ../src/network/networkd-link.c:3462
#2 0x0000555962a8e9b2 in link_update (m=0x5559635702c0, link=<optimized out>) at ../src/network/networkd-link.c:3698
#3 manager_rtnl_process_link (rtnl=<optimized out>, message=0x5559635702c0, userdata=<optimized out>) at ../src/network/networkd-manager.c:713
#4 0x0000555962a48a16 in process_match (m=0x5559635702c0, rtnl=0x55596355d990) at ../src/libsystemd/sd-netlink/sd-netlink.c:388
#5 process_running (ret=0x0, rtnl=0x55596355d990) at ../src/libsystemd/sd-netlink/sd-netlink.c:418
#6 sd_netlink_process (rtnl=0x55596355d990, ret=ret@entry=0x0) at ../src/libsystemd/sd-netlink/sd-netlink.c:452
#7 0x0000555962a48cb3 in time_callback (s=<optimized out>, usec=<optimized out>, userdata=<optimized out>) at ../src/libsystemd/sd-netlink/sd-netlink.c:759
#8 0x0000555962a4dbae in source_dispatch (s=s@entry=0x55596355dce0) at ../src/libsystemd/sd-event/sd-event.c:2311
#9 0x0000555962a4de2a in sd_event_dispatch (e=<optimized out>, e@entry=0x55596355bf70) at ../src/libsystemd/sd-event/sd-event.c:2663
#10 0x0000555962a4dfb9 in sd_event_run (e=<optimized out>, e@entry=0x55596355bf70, timeout=timeout@entry=18446744073709551615) at ../src/libsystemd/sd-event/sd-event.c:2723
#11 0x0000555962a4e1fb in sd_event_loop (e=<optimized out>) at ../src/libsystemd/sd-event/sd-event.c:2744
#12 0x0000555962a223d6 in main (argc=<optimized out>, argv=<optimized out>) at ../src/network/networkd.c:158

(gdb) p link->network
$2 = (struct Network *) 0x0

Nick Rosbrook (enr0n) wrote :

Alex - Can you please open a new bug report about the crash you are seeing? Please include any details you have about how to reproduce the crash.

Robie Basak (racb) wrote :

What's the status of keepalived in Mantic on this bug?

tags: added: server-triage-discuss
Christian Ehrhardt (paelzer) wrote :

Marking todo to recheck how the situation is today.

tags: added: server-todo
removed: server-triage-discuss
Changed in keepalived (Ubuntu):
assignee: nobody → Athos Ribeiro (athos-ribeiro)
Changed in keepalived (Ubuntu Xenial):
assignee: nobody → Athos Ribeiro (athos-ribeiro)
Changed in keepalived (Ubuntu Bionic):
assignee: nobody → Athos Ribeiro (athos-ribeiro)
no longer affects: keepalived (Ubuntu Xenial)
Changed in keepalived (Ubuntu Focal):
assignee: nobody → Athos Ribeiro (athos-ribeiro)
Christian Ehrhardt (paelzer) wrote :

Since Athos is having a look again and the old SRU process is complete I'll remove the verification tags to clear the view.

tags: removed: verification-done-bionic verification-done-eoan verification-needed
Athos Ribeiro (athos-ribeiro) wrote :

Closing bionic as per https://bugs.launchpad.net/netplan/+bug/1815101/comments/59 (it is also on EOSS).

Changed in keepalived (Ubuntu Bionic):
status: Confirmed → Won't Fix
Athos Ribeiro (athos-ribeiro) wrote :

As discussed in https://bugs.launchpad.net/ubuntu/+source/keepalived/+bug/1819074 and pointed out in https://bugs.launchpad.net/ubuntu/bionic/+source/keepalived/+bug/1815101/comments/65, KeepConfiguration= is part of systemd nowadays and the keepalived fix has been available since 2.x, so this should be fixed by now. I am marking the keepalived task as such.

tags: removed: server-todo
Changed in keepalived (Ubuntu):
status: In Progress → Fix Released
assignee: Athos Ribeiro (athos-ribeiro) → nobody