[master] Restarting systemd-networkd breaks keepalived, heartbeat, corosync, pacemaker (interface aliases are restarted)

Bug #1815101 reported by Leroy Tennison
This bug affects 40 people
Affects               Importance  Assigned to
netplan               Undecided   Unassigned
heartbeat (Ubuntu)    Low         Unassigned
keepalived (Ubuntu)   Medium      Unassigned
  Xenial              Medium      Unassigned
  Bionic              Medium      Unassigned
  Focal               Undecided   Unassigned
systemd (Ubuntu)      Medium      Unassigned
  Xenial              Medium      Unassigned
  Bionic              Medium      Unassigned
  Disco               Medium      Unassigned
  Eoan                Medium      Unassigned
  Focal               Undecided   Unassigned

Bug Description

[impact]

- ALL related HA software has a small problem if interfaces are managed by systemd-networkd: NIC restarts/reconfigurations always wipe all interface aliases, even when the HA software is not expecting it (there is no coordination between them).

- keepalived, smb ctdb and pacemaker all suffer from this. Pacemaker is smarter in this case because it has a service monitor that restarts the virtual IP resource, on the affected node & NIC, before considering it a real failure, but other HA services might treat it as a real failure when it is not.

[test case]

- Comment #14 is a full test case: set up a 3-node pacemaker cluster, as in that example, and restart the networkd service; this triggers a failure in the virtual IP resource monitor.

- Another example is given in the original description, for keepalived. Both suffer from the same issue (as does other HA software).

[regression potential]

- This backports the KeepConfiguration parameter, which adds some significant complexity to networkd's configuration and behavior. That could lead to regressions in correctly configuring the network at networkd start, in maintaining configuration at networkd restart, or in losing network state at networkd stop.

- Any regressions are most likely to occur during networkd start, restart, or stop, and most likely to involve missing or incorrect IP address(es).

- The change is based on upstream patches adding the exact feature we need to fix this issue, and it will be integrated with a netplan change to add the needed stanza to the systemd NIC configuration file (KeepConfiguration=).

[other info]

original description:
---

Configure netplan for interfaces, for example (a working config with IP addresses obfuscated)

network:
    ethernets:
        eth0:
            addresses: [192.168.0.5/24]
            dhcp4: false
            nameservers:
              search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
              addresses: [10.22.11.1]
        eth2:
            addresses:
              - 12.13.14.18/29
              - 12.13.14.19/29
            gateway4: 12.13.14.17
            dhcp4: false
            nameservers:
              search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
              addresses: [10.22.11.1]
        eth3:
            addresses: [10.22.11.6/24]
            dhcp4: false
            nameservers:
              search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
              addresses: [10.22.11.1]
        eth4:
            addresses: [10.22.14.6/24]
            dhcp4: false
            nameservers:
              search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
              addresses: [10.22.11.1]
        eth7:
            addresses: [9.5.17.34/29]
            dhcp4: false
            optional: true
            nameservers:
              search: [blah.com, other.blah.com, hq.blah.com, cust.blah.com, phone.blah.com]
              addresses: [10.22.11.1]
    version: 2

Configure keepalived (again, a working config with IP addresses obfuscated)

global_defs # Block id
{
notification_email {
        <email address hidden>
}
        notification_email_from <email address hidden>
        smtp_server 10.22.11.7 # IP
        smtp_connect_timeout 30 # integer, seconds
        router_id system3 # string identifying the machine,
                                     # (doesn't have to be hostname).
        vrrp_mcast_group4 224.0.0.18 # optional, default 224.0.0.18
        vrrp_mcast_group6 ff02::12 # optional, default ff02::12
        enable_traps # enable SNMP traps
}
vrrp_sync_group collection {
        group {
                wan
                lan
                phone
        }
}
vrrp_instance wan {
        state MASTER
        interface eth2
        virtual_router_id 77
        priority 150
        advert_int 1
        smtp_alert
        authentication {
                auth_type PASS
                auth_pass BlahBlah
        }
        virtual_ipaddress {
        12.13.14.20
        }
}
vrrp_instance lan {
        state MASTER
        interface eth3
        virtual_router_id 78
        priority 150
        advert_int 1
        smtp_alert
        authentication {
                auth_type PASS
                auth_pass MoreBlah
        }
        virtual_ipaddress {
                10.22.11.13/24
        }
}
vrrp_instance phone {
        state MASTER
        interface eth4
        virtual_router_id 79
        priority 150
        advert_int 1
        smtp_alert
        authentication {
                auth_type PASS
                auth_pass MostBlah
        }
        virtual_ipaddress {
                10.22.14.3/24
        }
}

At boot the affected interfaces have:
5: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:90:c0:e3 brd ff:ff:ff:ff:ff:ff
    inet 10.22.14.6/24 brd 10.22.14.255 scope global eth4
       valid_lft forever preferred_lft forever
    inet 10.22.14.3/24 scope global secondary eth4
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:fe90:c0e3/64 scope link
       valid_lft forever preferred_lft forever
7: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:b0:26:29 brd ff:ff:ff:ff:ff:ff
    inet 10.22.11.6/24 brd 10.22.11.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.22.11.13/24 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:feb0:2629/64 scope link
       valid_lft forever preferred_lft forever
9: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:b0:26:2b brd ff:ff:ff:ff:ff:ff
    inet 12.13.14.18/29 brd 12.13.14.23 scope global eth2
       valid_lft forever preferred_lft forever
    inet 12.13.14.20/32 scope global eth2
       valid_lft forever preferred_lft forever
    inet 12.33.89.19/29 brd 12.13.14.23 scope global secondary eth2
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:feb0:262b/64 scope link
       valid_lft forever preferred_lft forever

Run 'netplan try' (without even making any changes to the configuration) and the keepalived addresses disappear, never to return. The affected interfaces then have:
5: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:90:c0:e3 brd ff:ff:ff:ff:ff:ff
    inet 10.22.14.6/24 brd 10.22.14.255 scope global eth4
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:fe90:c0e3/64 scope link
       valid_lft forever preferred_lft forever
7: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:b0:26:29 brd ff:ff:ff:ff:ff:ff
    inet 10.22.11.6/24 brd 10.22.11.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:feb0:2629/64 scope link
       valid_lft forever preferred_lft forever
9: eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    link/ether ab:cd:ef:b0:26:2b brd ff:ff:ff:ff:ff:ff
    inet 12.13.14.18/29 brd 12.13.14.23 scope global eth2
       valid_lft forever preferred_lft forever
    inet 12.33.89.19/29 brd 12.13.14.23 scope global secondary eth2
       valid_lft forever preferred_lft forever
    inet6 fe80::ae1f:6bff:feb0:262b/64 scope link
       valid_lft forever preferred_lft forever
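
To make the difference between the two listings above easier to spot, here is a minimal Python sketch that compares two captured `ip addr` outputs and reports the IPv4 addresses that vanished per interface. The function names and the shortened sample text are illustrative assumptions and not part of the original report; only IPv4 `inet` lines are considered.

```python
import re

# Interface header line of `ip addr` output, e.g. "5: eth4: <BROADCAST,...>".
IFACE_RE = re.compile(r"^\d+:\s+([^:@\s]+)")
# IPv4 address line, e.g. "    inet 10.22.14.3/24 scope global secondary eth4".
# "inet6" lines do not match because whitespace must follow "inet".
INET_RE = re.compile(r"^\s+inet\s+(\S+)")


def ipv4_by_interface(ip_addr_output):
    """Return {interface: set of IPv4 addresses} parsed from `ip addr` text."""
    result = {}
    current = None
    for line in ip_addr_output.splitlines():
        header = IFACE_RE.match(line)
        if header:
            current = header.group(1)
            result.setdefault(current, set())
            continue
        inet = INET_RE.match(line)
        if inet and current is not None:
            result[current].add(inet.group(1))
    return result


def removed_addresses(before, after):
    """Return {interface: sorted addresses found in `before` but not in `after`}."""
    old, new = ipv4_by_interface(before), ipv4_by_interface(after)
    missing = {}
    for iface, addrs in old.items():
        gone = addrs - new.get(iface, set())
        if gone:
            missing[iface] = sorted(gone)
    return missing


# Shortened copies of the eth4 listings from this report (hypothetical variable names).
BEFORE = """5: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet 10.22.14.6/24 brd 10.22.14.255 scope global eth4
    inet 10.22.14.3/24 scope global secondary eth4
    inet6 fe80::ae1f:6bff:fe90:c0e3/64 scope link
"""
AFTER = """5: eth4: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq state UP group default qlen 1000
    inet 10.22.14.6/24 brd 10.22.14.255 scope global eth4
    inet6 fe80::ae1f:6bff:fe90:c0e3/64 scope link
"""

print(removed_addresses(BEFORE, AFTER))  # → {'eth4': ['10.22.14.3/24']}
```

Run against the full listings, this should flag exactly the keepalived virtual addresses, since the statically configured ones survive the restart.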


Revision history for this message
Mathieu Trudel-Lapierre (cyphermox) wrote :

This isn't netplan, it's systemd-networkd. Netplan only writes configuration for the chosen renderer (in this case, systemd-networkd).

Either systemd needs to not wipe out foreign addresses (I believe there is a PR in git for that) or keepalived should somehow interface with systemd so they can collaborate on setting and keeping up the IP addresses.

Reassigning.

no longer affects: ubuntu
Changed in netplan:
status: New → Invalid
Changed in keepalived (Ubuntu):
status: New → Incomplete
Changed in systemd (Ubuntu):
status: New → Triaged
Mathieu Trudel-Lapierre (cyphermox) wrote :

Kept a task for keepalived (Incomplete) in case it turns out there's something we can do there.

Also added a task for systemd, since that would definitely require development work.

Marked Invalid for netplan: since netplan only translates config from the YAML to what networkd or NetworkManager require, there isn't really anything I see we can do in netplan directly. Applying absolutely does need to 'poke' the renderer somehow for the configuration to be applied; but if it turns out there's something to change in netplan we can update the task.

Turns out there isn't really a PR about foreign addresses handling; though two are somewhat relevant:

https://github.com/systemd/systemd/pull/9956
and
https://github.com/systemd/systemd/pull/7403

But neither will completely address the problem: systemd-networkd expects to be authoritative on the network setup, which is somewhat counter to its use in conjunction with keepalived.

As a workaround, for now, one can use /etc/network/interfaces (and/or no configuration in netplan for the interfaces handled by keepalived) to configure the network.

Leroy Tennison (ltennison) wrote :

I am trying ifupdown. Do I need to do anything else or is what I've done adequate?

cdmiller (cdmiller) wrote :

Newer keepalived (> 2.0.x) addresses the systemd-networkd behavior. From keepalived 2.0.0 release notes: "Transition to backup state if a VIP or eVIP is removed When we next transition to master the addresses will be restored. If nopreempt is not set, that will be almost immediately."

Any chance of a keepalived 2.0.x backport package for Ubuntu 18.04?

Leroy Tennison (ltennison) wrote :

I note this bug is marked Incomplete, meaning that information is missing. What else is needed?

Christian Ehrhardt  (paelzer) wrote :

Might I ask - how much is this bug related or a dup to bug 1819074?

Andreas Hasenack (ahasenack) wrote :

Seems a dupe to me.

For the bionic case, with keepalived < 2.0, is there some keepalived script that can be run to restore the VIP after networkd has removed it? We could then run it as a networkd-dispatcher hook. Has this been considered?

summary: - netplan removes keepalived configuration
+ Restarting systemd-networkd breaks keepalived clusters
summary: - Restarting systemd-networkd breaks keepalived clusters
+ [master] Restarting systemd-networkd breaks keepalived clusters
Leroy Tennison (ltennison) wrote : Re: [master] Restarting systemd-networkd breaks keepalived clusters

If I understand the keepalived > 2.0.x behavior referred to by cdmiller above (see 2019-03-07 comment), that is not the appropriate response to the problem. Granted, it mitigates the consequences but doesn't address the underlying issue. A systemd-source issue should not cause keepalived failover, since failover is designed to address issues of system or hardware failure, not the bad behavior of other system software. systemd needs to be made to cooperate with other software rather than assuming it is the only authority on the system.

Robie Basak (racb) wrote :

It looks like there is some clear and actionable work in keepalived here (even if as a workaround and the real fix ends up being in systemd), so I'm marking it as Triaged.

FTR, the Ubuntu Server Team is aware of this as a high level issue and it is high up in our list of priorities to determine how to address it properly.

Changed in keepalived (Ubuntu):
status: Incomplete → Triaged
Jorge Niedbalski (niedbalski) wrote :
Bryce Harrington (bryce) wrote :

The aforementioned link shows there's been work towards a fix in systemd. Can't say if that suggests what can be done to improve keepalived, but I've tagged this "server-next" to get it on the Ubuntu Server Team's high priority list, as per Robie's earlier comment.

tags: added: server-next
Rafael David Tinoco (rafaeldtinoco) wrote :

The following 3 bugs:

https://bugs.launchpad.net/bugs/1815101
https://bugs.launchpad.net/bugs/1819074
https://bugs.launchpad.net/bugs/1810583

Have the same root cause: the fact that systemd-networkd messes with secondary IP addresses on NICs managed by systemd.

I'm marking all other cases as a duplicate of LP: #1815101.

TODO here is the following:

- There are mainly 2 "fixes" for this issue:

1) keepalived is able to recognize systemd-networkd changes and change cluster status in order to reconfigure managed NICs (keepalived (> 2.0.x)).

2) systemd-networkd implements a new stanza (KeepConfiguration=) for its .network configuration files, in order to fix this behavior not only here but for all HA-related software that manages secondary IPs and/or aliases on NICs managed by systemd-networkd.

I think the most appropriate approach would be to make sure both features work together in Eoan, and then make sure the SRUs are done for Disco and Bionic. One problem with item (2) is that netplan will also have to support the new "KeepConfiguration=" .network file stanza, but fix (2) is more appropriate for all other HA-related software controlling virtual IPs (CTDB, Pacemaker, and so on).

Changed in netplan:
status: Invalid → Confirmed
Changed in keepalived (Ubuntu):
status: Triaged → Confirmed
Changed in systemd (Ubuntu):
status: Triaged → Confirmed
Changed in keepalived (Ubuntu Bionic):
status: New → Confirmed
Changed in keepalived (Ubuntu Disco):
status: New → Confirmed
Rafael David Tinoco (rafaeldtinoco) wrote :

Based on comment #12, and other comments from other duplicate cases, I'll summarize here in a better (and consolidated) way how to reproduce the issue, how to mitigate it using the dummy workaround, and how to fix it (with the backports/merge requests). At the end I might provide a PPA asking for feedback.

Changed in systemd (Ubuntu Bionic):
status: New → Confirmed
Changed in systemd (Ubuntu Disco):
status: New → Confirmed
Changed in keepalived (Ubuntu Bionic):
importance: Undecided → Medium
Changed in keepalived (Ubuntu Disco):
importance: Undecided → Medium
Changed in keepalived (Ubuntu Eoan):
importance: Undecided → Medium
Changed in systemd (Ubuntu Bionic):
importance: Undecided → Medium
Changed in systemd (Ubuntu Disco):
importance: Undecided → Medium
Changed in systemd (Ubuntu Eoan):
importance: Undecided → Medium
Changed in keepalived (Ubuntu Bionic):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in keepalived (Ubuntu Disco):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in keepalived (Ubuntu Eoan):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Bionic):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Disco):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Eoan):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in netplan:
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Eoan):
status: Confirmed → In Progress
Changed in keepalived (Ubuntu Eoan):
status: Confirmed → In Progress
Changed in heartbeat (Ubuntu Bionic):
importance: Undecided → Medium
status: New → Triaged
Changed in heartbeat (Ubuntu Disco):
importance: Undecided → Medium
status: New → Triaged
Changed in heartbeat (Ubuntu Eoan):
importance: Undecided → Low
status: New → Triaged
Changed in heartbeat (Ubuntu Bionic):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in heartbeat (Ubuntu Disco):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in heartbeat (Ubuntu Eoan):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
Rafael David Tinoco (rafaeldtinoco) wrote :

Alright,

As this is a problem that affects not only keepalived but all cluster-like software dealing with aliases on any existing interface, managed or not by systemd, I have tested the same test case in a pacemaker-based cluster with 3 nodes, having 1 virtual IP + a lighttpd instance running in the same resource group:

----

(k)inaddy@kcluster01:~$ crm config show
node 1: kcluster01
node 2: kcluster02
node 3: kcluster03
primitive fence_kcluster01 stonith:fence_virsh \
 params ipaddr=192.168.100.205 plug=kcluster01 action=off login=stonithmgr passwd=xxxx use_sudo=true delay=2 \
 op monitor interval=60s
primitive fence_kcluster02 stonith:fence_virsh \
 params ipaddr=192.168.100.205 plug=kcluster02 action=off login=stonithmgr passwd=xxxx use_sudo=true delay=4 \
 op monitor interval=60s
primitive fence_kcluster03 stonith:fence_virsh \
 params ipaddr=192.168.100.205 plug=kcluster03 action=off login=stonithmgr passwd=xxxx use_sudo=true delay=6 \
 op monitor interval=60s
primitive virtual_ip IPaddr2 \
 params ip=10.0.3.1 nic=eth3 \
 op monitor interval=10s
primitive webserver systemd:lighttpd \
 op monitor interval=10 timeout=60
group webserver_virtual_ip webserver virtual_ip
location l_fence_kcluster01 fence_kcluster01 -inf: kcluster01
location l_fence_kcluster02 fence_kcluster02 -inf: kcluster02
location l_fence_kcluster03 fence_kcluster03 -inf: kcluster03
property cib-bootstrap-options: \
 have-watchdog=true \
 dc-version=2.0.1-9e909a5bdd \
 cluster-infrastructure=corosync \
 cluster-name=debian \
 stonith-enabled=true \
 stonith-action=off \
 no-quorum-policy=stop

----

(k)inaddy@kcluster01:~$ cat /etc/netplan/cluster.yaml
network:
    version: 2
    renderer: networkd
    ethernets:
        eth1:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.1.2/24]
        eth2:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.2.2/24]
        eth3:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.3.2/24]
        eth4:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.4.2/24]
        eth5:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.5.2/24]

----

And the virtual IP failed right after netplan acted on the systemd-networkd interface.

(k)inaddy@kcluster03:~$ sudo netplan apply
(k)inaddy@kcluster03:~$ ping 10.0.3.1
PING 10.0.3.1 (10.0.3.1) 56(84) bytes of data.
From 10.0.3.4 icmp_seq=1 Destination Host Unreachable
From 10.0.3.4 icmp_seq=2 Destination Host Unreachable
From 10.0.3.4 icmp_seq=3 Destination Host Unreachable
From 10.0.3.4 icmp_seq=4 Destination Host Unreachable
From 10.0.3.4 icmp_seq=5 Destination Host Unreachable
From 10.0.3.4 icmp_seq=6 Destination Host Unreachable
64 bytes from 10.0.3.1: icmp_seq=7 ttl=64 time=0.088 ms
64 bytes from 10.0.3.1: icmp_seq=8 ttl=64 time=0.076 ms

--- 10.0.3.1 ping statistics ---
8 packets transmitted, 2 received, +6 errors, 75% packet loss, time 7128ms
rtt min/avg/max/mdev = 0.076/0.082/0.088/0.006 ms, pipe 4

As explained in this bug description. With that, virtual_ip_monitor, from pacemaker, realized the virtual IP was gone and restarted it on the same node:
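
The ping transcript above shows six probes failing before the virtual IP answers again. As a rough sketch for anyone scripting this reproducer, the Python below extracts the counters and the first successful sequence number from captured ping output. The function name and the abbreviated sample are assumptions made for illustration; the regular expressions are written against the output format shown in this comment only.

```python
import re

# Summary line, e.g. "8 packets transmitted, 2 received, +6 errors, 75% packet loss, ...".
SUMMARY_RE = re.compile(
    r"(?P<sent>\d+) packets transmitted, (?P<received>\d+) received"
    r"(?:, \+(?P<errors>\d+) errors)?, (?P<loss>[\d.]+)% packet loss"
)
# Successful reply, e.g. "64 bytes from 10.0.3.1: icmp_seq=7 ttl=64 time=0.088 ms".
REPLY_RE = re.compile(r"bytes from \S+ icmp_seq=(\d+)")


def summarize_ping(output):
    """Return the ping counters plus the sequence number of the first reply."""
    summary = SUMMARY_RE.search(output)
    if summary is None:
        raise ValueError("no ping summary line found")
    replies = [int(seq) for seq in REPLY_RE.findall(output)]
    return {
        "sent": int(summary["sent"]),
        "received": int(summary["received"]),
        "errors": int(summary["errors"] or 0),
        "loss_percent": float(summary["loss"]),
        "first_reply_seq": min(replies) if replies else None,
    }


# Abbreviated copy of the transcript above (hypothetical variable name).
SAMPLE = """From 10.0.3.4 icmp_seq=6 Destination Host Unreachable
64 bytes from 10.0.3.1: icmp_seq=7 ttl=64 time=0.088 ms
64 bytes from 10.0.3.1: icmp_seq=8 ttl=64 time=0.076 ms

--- 10.0.3.1 ping statistics ---
8 packets transmitted, 2 received, +6 errors, 75% packet loss, time 7128ms
"""

print(summarize_ping(SAMPLE)["first_reply_seq"])  # → 7
```

With ping's default one-second interval, a first reply at sequence 7 corresponds to an outage of roughly six seconds in this run; that figure will vary with the resource monitor interval.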

----

(k)inaddy@k...


summary: - [master] Restarting systemd-networkd breaks keepalived clusters
+ [master] Restarting systemd-networkd breaks keepalived, heartbeat,
+ corosync, pacemaker (interface aliases are restarted)
Rafael David Tinoco (rafaeldtinoco) wrote :

The commits below implement support for "keep configuration":

commit 1e498853a39b46155cb89b5c9e74ecb27aaba3ed
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 01:21:13 2019

    test-network: add tests for KeepConfiguration=

commit c98d78d32abba6aadbe89eece7acf0742f59047c
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 03:37:25 2019

    man: add documentation about KeepConfiguration

commit db51778f85cb076e9ed1fe7f7e29cc740365c245
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 00:33:13 2019

    network: make KeepConfiguration=static drop DHCP addresses and routes

    Also, KeepConfiguration=dhcp drops static foreign addresses and routes.

commit 95355a281c06c5970b7355c38b066910c3be4958
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 14:05:26 2019

    network: add KeepConfiguration=dhcp-on-stop

    The option prevents to drop lease address on stop.
    By setting this, we can safely restart networkd.

commit 7da377ef16a2112a673247b39041a180b07e973a
Author: Susant Sahani <email address hidden>
Date: Mon Jun 3 00:31:13 2019

    networkd: add support to keep configuration

for systemd-networkd.

IMO, we should rely on setting the keep configuration flag for the interfaces to be managed by third-party software (adding/removing aliases for virtual networks, VRRP interfaces, etc).

Christian Ehrhardt  (paelzer) wrote :
Edward Hope-Morley (hopem) wrote :

Thanks Rafael/Christian,

I see that all those patches are in 243 and Eoan is currently on 242 (albeit -6, but I don't think any are already backported), so we'll need to get this backported all the way down to Bionic.

max@power:~/git/systemd$ _c=( 7da377e 95355a2 db51778 c98d78d 1e49885 )
max@power:~/git/systemd$ for c in ${_c[@]}; do git tag --contains $c| egrep -v "\-rc"; done| sort -u
v243

Do we have a feel for if/when the keepalived fix(es) will be backportable to B (1.x) as well? Since those fixes already exist in Disco (2.0.10), it might be easier to start with those?

I will add the charm-keepalived to this LP since it will need support for the networkd/netplan fix once that is available.

Rafael David Tinoco (rafaeldtinoco) wrote :

@ed,

I just finished the backport to Eoan; it was straightforward. I'll finish tests tomorrow with HA-related software and networkd-enabled HA clusters. After that I'll give you a better estimate for Disco and Bionic.

This is the total size of changes (systemd-networkd-tests.py is not so great to backport, will review that):

$ cat *.patch | diffstat
 man/systemd.network.xml | 27 +-
 src/network/networkd-dhcp4.c | 8
 src/network/networkd-link.c | 57 +++++-
 src/network/networkd-link.h | 2
 src/network/networkd-manager.c | 2
 src/network/networkd-network-gperf.gperf | 3
 src/network/networkd-network.c | 44 ++++
 src/network/networkd-network.h | 26 ++
 test/fuzz/fuzz-network-parser/directives.network | 1
 test/test-network/conf/24-keep-configuration-static.network | 5
 test/test-network/conf/dhcp-client-keep-configuration-dhcp-on-stop.network | 4
 test/test-network/conf/dhcp-client-keep-configuration-dhcp.network | 7
 test/test-network/systemd-networkd-tests.py | 94 +++++++++-
 13 files changed, 235 insertions(+), 45 deletions(-)

Good thing is that the logic is not drastically changed for this feature to exist. Sorry for the delay here, because of freeze we were running to close out some urgent issues for Eoan.

Rafael David Tinoco (rafaeldtinoco) wrote :

Test Case:

(k)rafaeldtinoco@kcluster03:~$ crm status
Stack: corosync
Current DC: kcluster02 (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Thu Oct 10 17:13:19 2019
Last change: Thu Oct 10 17:11:48 2019 by root via cibadmin on kcluster01

3 nodes configured
5 resources configured

Online: [ kcluster01 kcluster02 kcluster03 ]

Full list of resources:

 fence_kcluster01 (stonith:fence_virsh): Started kcluster02
 fence_kcluster02 (stonith:fence_virsh): Started kcluster01
 fence_kcluster03 (stonith:fence_virsh): Started kcluster01
 Resource Group: webserver_virtual_ip
     webserver (systemd:lighttpd): Started kcluster03
     virtual_ip (ocf::heartbeat:IPaddr2): Started kcluster03

(k)rafaeldtinoco@kcluster03:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:c3:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:c306/64 scope link
       valid_lft forever preferred_lft forever

(k)rafaeldtinoco@kcluster03:~$ systemctl restart systemd-networkd

(k)rafaeldtinoco@kcluster03:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:c3:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:c306/64 scope link
       valid_lft forever preferred_lft forever

<wait for resource monitor timeout, pacemaker starts virtual_ip again>

(k)rafaeldtinoco@kcluster03:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:c3:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:c306/64 scope link
       valid_lft forever preferred_lft forever
(k)rafaeldtinoco@kcluster03:~$

Pacemaker logs:

Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6901]: INFO: IP status = no, IP_CIP=
Oct 10 17:14:37 kcluster03 pacemaker-controld[1266]: notice: Result of stop operation for virtual_ip on kcluster03: 0 (ok)
Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6951]: INFO: Adding inet address 10.0.3.1/24 with broadcast address 10.0.3.255 to device eth3
Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6956]: INFO: Bringing device eth3 up
Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6961]: INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /run/resource-agents/send_arp-10.0.3.1 eth3 10.0.3.1 auto not_used not_used
Oct 10 17:14:37 kcluster03 pacemaker-controld[1266]: notice: Result of start operation for virtual_ip on kcluster03: 0 (ok)

for the operation.
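
For anyone who wants to count how often pacemaker had to re-add the virtual IP, the following Python sketch pulls the IPaddr2 "Adding inet address" events out of log text in the format shown above. The function name is hypothetical and the pattern is an assumption based solely on the log lines in this comment; other resource-agent versions may log differently.

```python
import re

# Matches lines such as:
# "Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6951]: INFO: Adding inet address
#  10.0.3.1/24 with broadcast address 10.0.3.255 to device eth3"
ADD_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d\d:\d\d:\d\d) (?P<host>\S+) "
    r"IPaddr2\((?P<resource>[^)]+)\)\[\d+\]: INFO: Adding inet address "
    r"(?P<address>\S+) with broadcast address \S+ to device (?P<device>\S+)"
)


def vip_readds(log_text):
    """Return one dict per 'Adding inet address' event found in the log text."""
    events = []
    for line in log_text.splitlines():
        match = ADD_RE.match(line.strip())
        if match:
            events.append(match.groupdict())
    return events


# Two lines copied from the pacemaker log above (hypothetical variable name).
LOG = """Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6901]: INFO: IP status = no, IP_CIP=
Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6951]: INFO: Adding inet address 10.0.3.1/24 with broadcast address 10.0.3.255 to device eth3
"""

for event in vip_readds(LOG):
    print(event["stamp"], event["resource"], event["address"], event["device"])
```

Each returned event carries the timestamp, host, resource name, address and device, so repeated networkd restarts can be correlated with re-add events.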

Rafael David Tinoco (rafaeldtinoco) wrote :

(k)rafaeldtinoco@kcluster01:~$ sudo vi /etc/systemd/network/10-netplan-eth3.network

<add KeepConfiguration=static to .network file>

(k)rafaeldtinoco@kcluster01:~$ systemctl restart systemd-networkd

(k)rafaeldtinoco@kcluster01:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:f0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:f003/64 scope link
       valid_lft forever preferred_lft forever

(k)rafaeldtinoco@kcluster01:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:f0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:f003/64 scope link
       valid_lft forever preferred_lft forever

(k)rafaeldtinoco@kcluster01:~$ systemctl restart systemd-networkd

(k)rafaeldtinoco@kcluster01:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:f0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:f003/64 scope link
       valid_lft forever preferred_lft forever

<interface does NOT restart the aliases>

Voila. Needs better testing with KeepConfiguration=dhcp.
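
The manual edit above (adding KeepConfiguration=static to the .network file) can be sketched as a small text transformation. This is only an illustration under stated assumptions: the function name is hypothetical, it operates on file contents passed in as a string rather than touching /etc/systemd/network, and it does not address whether netplan later regenerates the file. It is line-based on purpose, because .network files may legitimately repeat keys such as Address=.

```python
def ensure_keep_configuration(network_text, value="static"):
    """Return .network file text with KeepConfiguration=<value> set in [Network]."""
    setting = f"KeepConfiguration={value}"
    out = []
    in_network = False
    done = False
    for line in network_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving [Network] without having seen the key: add it first.
            if in_network and not done:
                out.append(setting)
                done = True
            in_network = stripped == "[Network]"
        elif in_network and stripped.split("=", 1)[0].strip() == "KeepConfiguration":
            # Key already present: overwrite its value.
            line = setting
            done = True
        out.append(line)
    if in_network and not done:
        # [Network] was the last section in the file.
        out.append(setting)
        done = True
    if not done:
        # No [Network] section at all: append one.
        out.extend(["", "[Network]", setting])
    return "\n".join(out) + "\n"


# Contents modelled on the eth3 file used in this test (hypothetical variable name).
ORIGINAL = """[Match]
Name=eth3

[Network]
LinkLocalAddressing=ipv6
Address=10.0.3.2/24
"""

print(ensure_keep_configuration(ORIGINAL), end="")
```

Applying the function twice yields the same text as applying it once, so it is safe to rerun; networkd still has to be restarted afterwards for the setting to take effect, as in the transcript.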

Rafael David Tinoco (rafaeldtinoco) wrote :
tags: added: sts
Dan Streetman (ddstreet)
description: updated
description: updated
Łukasz Zemczak (sil2100) wrote : Please test proposed package

Hello Leroy, or anyone else affected,

Accepted systemd into eoan-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/242-7ubuntu3.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-eoan to verification-done-eoan. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-eoan. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Eoan):
status: In Progress → Fix Committed
tags: added: verification-needed verification-needed-eoan
Ubuntu SRU Bot (ubuntu-sru-bot) wrote : Autopkgtest regression report (systemd/242-7ubuntu3.2)

All autopkgtests for the newly accepted systemd (242-7ubuntu3.2) for eoan have finished running.
The following regressions have been reported in tests triggered by the package:

gvfs/1.42.1-1ubuntu1 (amd64)
systemd/242-7ubuntu3.2 (ppc64el)
ndctl/unknown (armhf)
casper/1.427 (amd64)
netplan.io/0.98-0ubuntu1 (ppc64el)
munin/unknown (armhf)
linux-oem-osp1/5.0.0-1026.29 (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/eoan/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Rafael David Tinoco (rafaeldtinoco) wrote :

(k)rafaeldtinoco@kcluster01:~$ dpkg -l | grep "ii systemd "
ii systemd 243-3ubuntu1 amd64 system and service manager

(k)rafaeldtinoco@kcluster01:~$ for name in kcluster01 kcluster02 kcluster03; do ssh $name "dpkg -l | grep systemd "; done | grep "ii systemd "

ii systemd 243-3ubuntu1 amd64 system and service manager
ii systemd 243-3ubuntu1 amd64 system and service manager
ii systemd 243-3ubuntu1 amd64 system and service manager
----

(k)rafaeldtinoco@kcluster01:~$ for name in kcluster01 kcluster02 kcluster03; do ssh $name "cat /etc/systemd/network/10-netplan-eth3.network"; done
[Match]
Name=eth3

[Network]
LinkLocalAddressing=ipv6
Address=10.0.3.2/24
KeepConfiguration=static
[Match]
Name=eth3

[Network]
LinkLocalAddressing=ipv6
Address=10.0.3.3/24
KeepConfiguration=static
[Match]
Name=eth3

[Network]
LinkLocalAddressing=ipv6
Address=10.0.3.4/24
KeepConfiguration=static

----

(k)rafaeldtinoco@kcluster01:~$ crm status
Stack: corosync
Current DC: kcluster01 (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Tue Nov 19 16:38:15 2019
Last change: Mon Nov 18 12:41:14 2019 by root via crm_resource on kcluster01

3 nodes configured
5 resources configured

Online: [ kcluster01 kcluster02 kcluster03 ]

Full list of resources:

 fence_kcluster01 (stonith:fence_virsh): Started kcluster02
 fence_kcluster02 (stonith:fence_virsh): Started kcluster01
 fence_kcluster03 (stonith:fence_virsh): Started kcluster01
 Resource Group: webserver_virtual_ip
     webserver (systemd:lighttpd): Started kcluster01
     virtual_ip (ocf::heartbeat:IPaddr2): Started kcluster01

----

(k)rafaeldtinoco@kcluster01:~$ for name in kcluster01 kcluster02 kcluster03; do ssh $name "hostname ; ip addr show eth3"; done

kcluster01
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:a0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:a003/64 scope link
       valid_lft forever preferred_lft forever
kcluster02
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:1d:1a:cc brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.3/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe1d:1acc/64 scope link
       valid_lft forever preferred_lft forever
kcluster03
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:13:16 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:1316/64 scope link
       valid_lft forever preferred_lft forever
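
A minimal sketch of how the listing above could be turned into a scripted before/after check that the virtual IP survives a networkd restart. The helper name `has_addr` is hypothetical, and 10.0.3.1/24 is assumed to be the virtual IP only because it is the one `secondary` address shown on kcluster01. The function parses `ip -o -4 addr show` output given on stdin and touches nothing on the system.

```shell
# has_addr CIDR: succeed if CIDR appears as an inet address in
# `ip -o -4 addr show` output read from stdin (hypothetical helper).
has_addr() {
    awk -v want="$1" '
        { for (i = 1; i < NF; i++) if ($i == "inet" && $(i + 1) == want) found = 1 }
        END { exit(found ? 0 : 1) }
    '
}

# Usage on a cluster node (not run here, needs root):
#   ip -o -4 addr show dev eth3 | has_addr 10.0.3.1/24 || echo "VIP missing before restart"
#   systemctl restart systemd-networkd
#   ip -o -4 addr show dev eth3 | has_addr 10.0.3.1/24 || echo "VIP lost after restart"
```

With KeepConfiguration=static in place, the second check would be expected to pass without pacemaker having to restart the virtual_ip resource.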

----

in parallel:

(k)rafaeldtinoco@kcluster01:~$ journalctl -f -u pacemaker

and check if events are generated (vip monitor detects changes)

----

(k)rafaeldtinoco@kcluster01:~$ systemctl restart sy...


tags: added: verification-done verification-done-eoan
removed: verification-needed verification-needed-eoan
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Flagging this as Won't Fix, as heartbeat is only being kept for historical reasons (and systemd-networkd can work around the problem with the fix we're backporting to it: the KeepConfiguration= .network file stanza).

Changed in heartbeat (Ubuntu Eoan):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in heartbeat (Ubuntu Disco):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in heartbeat (Ubuntu Bionic):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in heartbeat (Ubuntu):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
no longer affects: heartbeat (Ubuntu Eoan)
no longer affects: heartbeat (Ubuntu Disco)
no longer affects: heartbeat (Ubuntu Bionic)
Changed in heartbeat (Ubuntu):
status: Triaged → Won't Fix
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package systemd - 242-7ubuntu3.2

---------------
systemd (242-7ubuntu3.2) eoan; urgency=medium

  [ Dan Streetman ]
  * d/extra/dhclient-enter-resolved-hook:
    - Replace use of bash-only &> with > and 2> (LP: #1849608)
  * d/p/lp1849658-resolved-set-stream-type-during-DnsStream-creation.patch:
    - Fix bug in refcounting TCP stream types (LP: #1849658)
  * d/extra/dhclient-enter-resolved-hook: cleanup temp $newstate file

  [ Rafael David Tinoco ]
  * Add support to KeepConfiguration= fixing behaviour for HA (LP: #1815101)
    - d/p/lp1815101-01-networkd-add-support-to-keep-configuration.patch
    - d/p/lp1815101-02-networkd-stop-clients-when-networkd-shuts-down.patch
    - d/p/lp1815101-03-network-add-KeepConfiguration-dhcp-on-stop.patch
    - d/p/lp1815101-04-network-make-KeepConfiguration-static-drop-DHCP-addr.patch
    - d/p/lp1815101-05-man-add-documentation-about-KeepConfiguration.patch

systemd (242-7ubuntu3.1) eoan; urgency=medium

  [ Balint Reczey ]
  * Fix shutdown and related actions from the login screen (LP: #1847896)
    File: debian/patches/logind-consider-greeter-sessions-suitable-as-display-sess.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=b407dfd8c9dc81594553c27467c35b383333d74c
  * debian/gbp.conf: Set debian-branch to ubuntu-eoan
    File: debian/gbp.conf
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=f399ce2cf4701a2dbb4b3505d2dd17a210d62f5c

  [ Dan Streetman ]
  * Fix bogus routes after DHCP lease change (LP: #1831787)
    Files:
    - debian/patches/lp1831787/0001-networkd-Add-back-static-routes-after-DHCPv4-lease-e.patch
    - debian/patches/lp1831787/0002-network-set-preferred-source-in-removing-route-entry.patch
    - debian/patches/lp1831787/0003-network-lower-log-level-about-critical-connection.patch
    - debian/patches/lp1831787/0004-network-reset-Link-dhcp4_configured-flag-earlier.patch
    - debian/patches/lp1831787/0005-network-split-dhcp_lease_lost-into-small-pieces.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=ced3f5c2f619083f7beb164d94d4ccfe52222fe8
  * Set src address for dhcp 'classless' routes (LP: #1835581)
    File: debian/patches/lp1835581-src-network-networkd-dhcp4.c-set-prefsrc-for-classle.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=6a7ef370fb1335548448920be4ae6176b67044a8
  * Allows cache=no-negative option to be set, ignoring negative answers to
    be cached (LP: #1668771)
    File: debian/patches/lp1668771-resolved-switch-cache-option-to-a-tri-state-option-s.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=27def26f5b1d1b8ba314c4a925fc1b7c43837f86

 -- Dan Streetman <email address hidden> Fri, 01 Nov 2019 16:33:08 -0400

Changed in systemd (Ubuntu Eoan):
status: Fix Committed → Fix Released
Revision history for this message
Łukasz Zemczak (sil2100) wrote : Update Released

The verification of the Stable Release Update for systemd has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message
Dan Streetman (ddstreet) wrote :

as disco reaches EOL next week, marking this as wontfix for disco.

Changed in systemd (Ubuntu Disco):
status: Confirmed → Won't Fix
Changed in keepalived (Ubuntu Disco):
status: Confirmed → Won't Fix
Revision history for this message
David Negreira (dnegreira) wrote :

Can we backport the fixes to Bionic?

Revision history for this message
Balint Kovacs (kovacs-balint-o) wrote :

Hi all,

thanks for the fixes in Eoan. Unfortunately we have a product based on disco and cannot move forward at this time. Being a networking shop, this issue has a serious effect on us and we would like to avoid moving to something like ifupdown2 within our stable branch.

For our users the real impact of the bug is not that the interface we are currently reconfiguring suffers downtime, but that _all_ interfaces have their aliases removed if networkd is restarted. The proposed KeepConfiguration solution rather defeats the purpose of reconfiguring the interfaces, as old addresses are kept and need to be handled manually. It also interferes with how DHCP works. I believe this might be an issue for others as well.

From our point of view the ideal solution would be a combination of the keepalived patch that detects VIP removal and systemd version 244 that already supports "networkctl reconfigure" and "networkctl reload".

Is there any chance that v244 is backported to bionic? It is already included in focal and debian stable backports, but unfortunately I am not familiar enough with systemd development to tell what the impact of this would be.

As for keepalived, in bug #1819074 there was an ongoing investigation on the patch, that implements the keepalived transition on removing the VIP. We have traced back this functionality to this patch:

https://github.com/acassen/keepalived/commit/0b1528c76d3fe8d1c5765841df86c59570a036da

It was born before v1.3.6 was released, so we hope that it is self-contained enough for a backport if v2.0 of keepalived is not included in bionic-backports.

Best,
Balint

Changed in keepalived (Ubuntu Xenial):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
importance: Undecided → Medium
status: New → Confirmed
no longer affects: heartbeat (Ubuntu Xenial)
Changed in systemd (Ubuntu Xenial):
assignee: nobody → Rafael David Tinoco (rafaeldtinoco)
importance: Undecided → Medium
status: New → Confirmed
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote : Re: [Bug 1815101] Re: [master] Restarting systemd-networkd breaks keepalived, heartbeat, corosync, pacemaker (interface aliases are restarted)

Balint, based on your input...

> thanks for the fixes in Eoan. Unfortunately we have a product based on
> disco and cannot move forward at this time. Being a networking shop,
> this issue has a serious effect on us and we would like to avoid moving
> to something like ifupdown2 within our stable branch.

So, Disco is EOL as it is not an LTS version; that is why it did not
get a fix (as the fix is very close to the one done in Eoan). Since
it's unsupported by the community, it's up to you to backport the Eoan
fixes to Disco if you'd like... you can even create a PPA for your
product and distribute it along.

> For our users the real impact of the bug is not that the interface
> that we are currently reconfiguring is suffering a downtime, but the
> fact that _all_ interfaces have their aliases removed if networkd is
> restarted. The proposed KeepConfiguration solution kind of beats the
> purpose of reconfiguring the interfaces, as old addresses are kept and
> need to be handled manually. Also it interferes with how DHCP works. I
> believe this might be an issue for others as well.

We are following systemd-networkd upstream decisions here. The option
"dhcp" only exists for CERTAIN scenarios (when root disk depends on
that connection, for iSCSI and/or NFS/ROOT for example). It is
explicitly said in the documentation:

"""
Takes a boolean or one of "static", "dhcp-on-stop", "dhcp". When
"static", systemd-networkd will not drop static addresses and routes
on starting up process. When set to "dhcp-on-stop", systemd-networkd
will not drop addresses and routes on stopping the daemon. When
"dhcp", the addresses and routes provided by a DHCP server will never
be dropped even if the DHCP lease expires. This is contrary to the
DHCP specification, but may be the best choice if, e.g., the root
filesystem relies on this connection. The setting "dhcp" implies
"dhcp-on-stop", and "yes" implies "dhcp" and "static". Defaults to
"no".
"""

and it is a question of choice: to have a window of opportunity for
duplicate IPs - in cases where there is no dynamic IP mapping to that
mac address - but possibly maintain the connection instead of causing
uninterruptible I/O while trying to shut down a machine, for example. I
particularly don't like this option but it is not the default one and
was meant for a specific purpose.
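
The implication rules in the quoted documentation can be made concrete with a small sketch. The function name `keepconf_expand` is hypothetical and is not part of systemd. It encodes only the quoted sentence ("dhcp" implies "dhcp-on-stop", and "yes" implies "dhcp" and "static") and, as a simplification, accepts just "yes" and "no" as boolean spellings.

```shell
# keepconf_expand VALUE: print the space-separated set of behaviours that a
# KeepConfiguration= value enables, following the quoted documentation only.
# Hypothetical helper for illustration; networkd does its own parsing.
keepconf_expand() {
    case "$1" in
        static)       echo "static" ;;
        dhcp-on-stop) echo "dhcp-on-stop" ;;
        dhcp)         echo "dhcp dhcp-on-stop" ;;
        yes)          echo "static dhcp dhcp-on-stop" ;;
        no)           echo "" ;;
        *)            echo "unknown value: $1" >&2; return 1 ;;
    esac
}
```

Read this way, the HA case in this bug only needs "static": virtual IPs added by keepalived or pacemaker are kept at networkd start, while DHCP lease handling stays as the specification expects.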

>
> From our point of view the ideal solution would be a combination of the
> keepalived patch that detects VIP removal and systemd version 244 that
> already supports "networkctl reconfigure" and "networkctl reload".

networkctl reconfigure/reload is a new functionality and won't be
added to previous already released versions as this is against SRU
guidelines. Systemd 244.2-1ubuntu1 is being included in 20.04, our
NEXT LTS version.

As said before, you can try backporting systemd 244 to disco, or
bionic, if you are willing to support it on your own as it was already
EOL for community support. You should follow:
https://packaging.ubuntu.com/html/backports.html if you would like to
do that.

For the keepalived patches, they could be backported to Eoan, maybe
Bionic and Xenial depending on the amount of work. But then I would
need a practical example of wh...


Dan Streetman (ddstreet)
tags: added: ddstreet
Revision history for this message
George Kraft (cynerva) wrote :

Removing charm-keepalived since I believe no changes are needed there. It should pick up fixes once they are available on apt archives.

no longer affects: charm-keepalived
Revision history for this message
Claudio Kuenzler (napsty) wrote :

FYI, I ran into this problem after a system update (which broke production!). I collected data in a troubleshooting session and documented it here: https://www.claudiokuenzler.com/blog/959/keepalived-virtual-ip-addresses-gone-lost-after-systemd-update. Only after finding out that restarting systemd-networkd removes the keepalived VIPs did I come across this bug.

Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

@napsty: the "workaround" (from your blog) is actually to use:

- ifupdown/bridge-utils/vlan/resolvconf for network setup OR
- use systemd-networkd DIRECTLY with the KeepConfiguration= option in .network file

Just highlighting it here.
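
As an illustration of the second option, a minimal sketch that writes a standalone .network file with KeepConfiguration=static. The function name `write_network_file`, the `10-` file name prefix, and the target directory argument are assumptions made for illustration; the interface and address reuse the eth3 example from the verification comment in this report.

```shell
# write_network_file DIR IFACE ADDRESS: write DIR/10-IFACE.network for
# systemd-networkd with a static address and KeepConfiguration=static.
# Hypothetical helper; the heredoc body is written to the file verbatim.
write_network_file() {
    dir=$1 iface=$2 addr=$3
    mkdir -p "$dir" || return 1
    cat > "$dir/10-$iface.network" <<EOF
[Match]
Name=$iface

[Network]
Address=$addr
KeepConfiguration=static
EOF
}

# Example (writes to a scratch directory; on a real node the directory
# would be /etc/systemd/network and networkd would need a restart):
#   write_network_file /tmp/networkd-test eth3 10.0.3.2/24
```

The option requires a systemd build that carries the KeepConfiguration= patches, such as the 242-7ubuntu3.2 Eoan upload above.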

@ddstreet, you said you would try to come up with the netplan change for KeepConfiguration. Did you have time to check on this ? (just checking).

Cheers o/

Changed in keepalived (Ubuntu):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in keepalived (Ubuntu Xenial):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in keepalived (Ubuntu Bionic):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in keepalived (Ubuntu Disco):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in keepalived (Ubuntu Eoan):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in systemd (Ubuntu):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in systemd (Ubuntu Xenial):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in systemd (Ubuntu Bionic):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in systemd (Ubuntu Disco):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in systemd (Ubuntu Eoan):
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in netplan:
assignee: Rafael David Tinoco (rafaeldtinoco) → nobody
no longer affects: keepalived (Ubuntu Eoan)
no longer affects: keepalived (Ubuntu Disco)
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

TL;DR TODO SUMMARY:

- netplan change to support KeepConfiguration= for systemd-networkd backend (Groovy)
- backport this change: netplan for Ubuntu Focal (SRU)
- backport this change: netplan for Ubuntu Eoan (SRU, WontFix due to EOL ?)
- backport this change: netplan for Ubuntu Bionic (SRU)
- backport this change: netplan for Ubuntu Xenial (SRU, WontFix ?)

Changed in systemd (Ubuntu Focal):
status: New → Fix Released
Changed in keepalived (Ubuntu Focal):
status: New → Confirmed
no longer affects: heartbeat (Ubuntu Focal)
Dan Streetman (ddstreet)
Changed in systemd (Ubuntu Bionic):
assignee: nobody → Jorge Niedbalski (niedbalski)
status: Confirmed → In Progress
Dan Streetman (ddstreet)
Changed in systemd (Ubuntu Bionic):
assignee: Jorge Niedbalski (niedbalski) → Eric Desrochers (slashd)
Revision history for this message
Rafael David Tinoco (rafaeldtinoco) wrote :

Ubuntu HA wise:

I recommend that all HA clusters have their cluster interfaces configured with systemd-networkd DIRECTLY instead of wrapping it through netplan.io. At least until we're sure that HA has no issues with netplan.io, configuring networkd directly will allow us to isolate possible issues.

I see that this has been assigned to @slashd. Eric, the important thing here is to have the netplan fix in focal (as it is the latest LTS) in order for HA to be supported with it. KeepConfiguration= is good enough, for now, if using systemd-networkd only.

Thank you!

tags: removed: server-next
Revision history for this message
Eric Desrochers (slashd) wrote :

I have a first iteration of a package:

It's neither a final nor a long-term solution. It is only meant to determine whether it fixes the problem before considering an SRU. (Ideally, one would test this package in a non-production environment.)

To add this PPA to your system:
sudo add-apt-repository ppa:slashd/sf263217
sudo apt-get update

Please report any feedback in this bug.

- Eric

Revision history for this message
Eric Desrochers (slashd) wrote :

The above test package has been made for 'systemd' in bionic ^

Revision history for this message
Sheng-Kai Lin (kester-lin) wrote :

Dear Eric Desrochers,
  I added the PPA to my Ubuntu 18.04 corosync/pacemaker service node.
Then I upgraded the following packages: libnss-systemd libpam-systemd libsystemd0 libudev1 systemd systemd-sysv udev.

But it still failed after reconnecting the network.
crmsh shows the following:
ERROR: status: crm_mon (rc=107): Connection to cluster failed: Transport endpoint is not connected.

I also checked dmesg, but it seems OK:
e1000: enp0s8 NIC Link is Down
e1000: enp0s8 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX

Could you give more detail to help me figure out whether I made a mistake or something is wrong in my environment?

Thank you.

Eric Desrochers (slashd)
Changed in systemd (Ubuntu Bionic):
assignee: Eric Desrochers (slashd) → nobody
Revision history for this message
Dan Streetman (ddstreet) wrote :

i'm marking this as wont-fix for xenial.

i'm inclined to also mark this as wont-fix for bionic, unless there are still people affected by this problem using bionic without some other workaround.

Changed in systemd (Ubuntu Xenial):
status: Confirmed → Won't Fix
Changed in systemd (Ubuntu Bionic):
status: In Progress → Incomplete
Revision history for this message
Jasper Spaans (jap171) wrote :

For the people on Focal that want to use netplan and keepalived together: you can just put in an override for the network unit file, to keep systemd-networkd from touching your interface!

```
$ cat /etc/systemd/network/10-netplan-eno1.network.d/override.conf
[Network]
KeepConfiguration=static
$
```

This might be a good enough workaround until this is really fixed.
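
To find interfaces that still lack such an override, something like the following sketch could be used. The function name `missing_keepconf` is hypothetical, and both directories are passed as arguments because where netplan writes its generated .network files on a given release is an assumption to verify. The match is deliberately simple: only lines starting with `KeepConfiguration=` are recognised.

```shell
# missing_keepconf NETDIR DROPIN_ROOT: print each NETDIR/*.network file name
# that sets no KeepConfiguration= itself and has no drop-in under
# DROPIN_ROOT/<name>.d/ that sets it. Hypothetical helper for illustration.
missing_keepconf() {
    netdir=$1 dropin_root=$2
    for f in "$netdir"/*.network; do
        [ -e "$f" ] || continue          # glob matched nothing
        name=$(basename "$f")
        # -s hides "no such file" errors when there is no drop-in directory
        if grep -qs '^KeepConfiguration=' "$f" "$dropin_root/$name.d"/*.conf; then
            continue
        fi
        echo "$name"
    done
}

# Example (assumed paths, read-only):
#   missing_keepconf /run/systemd/network /etc/systemd/network
```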

Revision history for this message
Sebastian (slovdahl) wrote :

> i'm inclined to also mark this as wont-fix for bionic, unless there are still people affected by this problem using bionic without some other workaround.

What kind of workarounds for bionic does this refer to? I have not found any reliable workarounds yet, at least.

Revision history for this message
Jianan Wang (wangjianan-zju) wrote :

> > i'm inclined to also mark this as wont-fix for bionic, unless there are still people affected
> > by this problem using bionic without some other workaround.

> What kind of workarounds for bionic does this refer to? I have not found any reliable workarounds
> yet at least.

+1. I did not find any other solution either. Please backport the fix to bionic as well. Thanks
