Bug #1815101 “[master] Restarting systemd-networkd breaks keepal...” : Bugs : Netplan

Revision history for this message

Mathieu Trudel-Lapierre (cyphermox) wrote on 2019-02-07:

#1

This isn't netplan, it's systemd-networkd. Netplan only writes configuration for the chosen renderer (in this case, systemd-networkd).

Either systemd needs to not wipe out foreign addresses (I believe there is a PR in git for that) or keepalived should somehow interface with systemd so they can collaborate on setting and keeping up the IP addresses.

Reassigning.

no longer affects:	ubuntu
Changed in netplan:
status:	New → Invalid
Changed in keepalived (Ubuntu):
status:	New → Incomplete
Changed in systemd (Ubuntu):
status:	New → Triaged

Mathieu Trudel-Lapierre (cyphermox) wrote on 2019-02-07:

#2

Kept a task for keepalived (Incomplete) in case it turns out there's something we can do there.

Also added a task for systemd, since that would definitely require development work.

Marked Invalid for netplan: since netplan only translates config from the YAML to what networkd or NetworkManager require, I don't see anything we can do in netplan directly. Applying absolutely does need to 'poke' the renderer somehow for the configuration to be applied; but if it turns out there's something to change in netplan we can update the task.

Turns out there isn't really a PR about foreign addresses handling; though two are somewhat relevant:

https://github.com/systemd/systemd/pull/9956
and
https://github.com/systemd/systemd/pull/7403

But neither will completely address the problem: systemd-networkd expects to be authoritative on the network setup, which is somewhat counter to its use in conjunction with keepalived.

As a workaround, for now, one can use /etc/network/interfaces (and/or no configuration in netplan for the interfaces handled by keepalived) to configure the network.

Leroy Tennison (ltennison) wrote on 2019-02-07:

#3

I am trying ifupdown. Do I need to do anything else or is what I've done adequate?

cdmiller (cdmiller) wrote on 2019-03-07:

#4

Newer keepalived (> 2.0.x) addresses the systemd-networkd behavior. From keepalived 2.0.0 release notes: "Transition to backup state if a VIP or eVIP is removed When we next transition to master the addresses will be restored. If nopreempt is not set, that will be almost immediately."

Any chance of a keepalived 2.0.x backport package for Ubuntu 18.04?

Leroy Tennison (ltennison) wrote on 2019-03-07:

#5

I note this bug is marked Incomplete, meaning that information is missing. What else is needed?

Christian Ehrhardt  (paelzer) wrote on 2019-03-08:

#6

Might I ask - how closely is this bug related to, or is it a dup of, bug 1819074?

Andreas Hasenack (ahasenack) wrote on 2019-03-11:

#7

Seems a dupe to me.

For the bionic case, with keepalived < 2.0, is there some keepalived script that can be run to restore the VIP after networkd has removed it? We could then run it as a networkd-dispatcher hook. Has this been considered?

Mathieu Trudel-Lapierre (cyphermox) on 2019-03-15

summary:

- netplan removes keepalived configuration
+ Restarting systemd-networkd breaks keepalived clusters

summary:

- Restarting systemd-networkd breaks keepalived clusters
+ [master] Restarting systemd-networkd breaks keepalived clusters

Leroy Tennison (ltennison) wrote on 2019-05-08: Re: [master] Restarting systemd-networkd breaks keepalived clusters

#8

If I understand the keepalived > 2.0.x behavior referred to by cdmiller above (see 2019-03-07 comment), that is not the appropriate response to the problem. Granted, it mitigates the consequences but doesn't address the underlying issue. A systemd-source issue should not cause keepalived failover, since failover is designed to address issues of system or hardware failure, not the bad behavior of other system software. systemd needs to be made to cooperate with other software rather than assuming it is the only authority on the system.

Robie Basak (racb) wrote on 2019-05-09:

#9

It looks like there is some clear and actionable work in keepalived here (even if as a workaround and the real fix ends up being in systemd), so I'm marking it as Triaged.

FTR, the Ubuntu Server Team is aware of this as a high level issue and it is high up in our list of priorities to determine how to address it properly.

Changed in keepalived (Ubuntu):
status:	Incomplete → Triaged

Jorge Niedbalski (niedbalski) wrote on 2019-08-21:

#10

For reference: https://github.com/systemd/systemd/pull/12511

Bryce Harrington (bryce) wrote on 2019-08-27:

#11

The aforementioned link shows there's been work towards a fix in systemd. Can't say if that suggests what can be done to improve keepalived, but I've tagged this "server-next" to get it on the Ubuntu Server Team's high priority list, as per Robie's earlier comment.

tags:

added: server-next

Rafael David Tinoco (rafaeldtinoco) wrote on 2019-09-13:

#12

The following 3 bugs:

https://bugs.launchpad.net/bugs/1815101
https://bugs.launchpad.net/bugs/1819074
https://bugs.launchpad.net/bugs/1810583

Have the same root cause: the fact that systemd-networkd messes with secondary IP addresses on NICs managed by systemd.

I'm marking all other cases as a duplicate of LP: #1815101.

TODO here is the following:

- There are mainly 2 "fixes" for this issue:

1) keepalived is able to recognize systemd-networkd changes and change cluster status in order to reconfigure managed NICs (keepalived (> 2.0.x)).

2) systemd-networkd implements a new option (KeepConfiguration=) for .network files, which fixes this behavior not only for keepalived but for all HA-related software that manages secondary IPs and/or aliases on NICs managed by systemd-networkd.

I think the most appropriate approach would be to make sure both features work, together, in Eoan, and then make sure the SRUs are done for Disco and Bionic. One problem with item (2) is that netplan will also have to support the new "KeepConfiguration=" .network file option; still, fix (2) is more appropriate for all the other HA-related software controlling virtual IPs (CTDB, Pacemaker, and so on).

Rafael David Tinoco (rafaeldtinoco) on 2019-09-13

Changed in netplan:
status:	Invalid → Confirmed
Changed in keepalived (Ubuntu):
status:	Triaged → Confirmed
Changed in systemd (Ubuntu):
status:	Triaged → Confirmed
Changed in keepalived (Ubuntu Bionic):
status:	New → Confirmed
Changed in keepalived (Ubuntu Disco):
status:	New → Confirmed

Rafael David Tinoco (rafaeldtinoco) wrote on 2019-09-13:

#13

Based on comment #12, and other comments from other duplicate cases, I'll summarize here in a better (and consolidated) way how to reproduce the issue, how to mitigate it using the dummy workaround, and how to fix it (with the backports/merge requests). At the end I might provide a PPA asking for feedback.

Changed in systemd (Ubuntu Bionic):
status:	New → Confirmed
Changed in systemd (Ubuntu Disco):
status:	New → Confirmed
Changed in keepalived (Ubuntu Bionic):
importance:	Undecided → Medium
Changed in keepalived (Ubuntu Disco):
importance:	Undecided → Medium
Changed in keepalived (Ubuntu Eoan):
importance:	Undecided → Medium
Changed in systemd (Ubuntu Bionic):
importance:	Undecided → Medium
Changed in systemd (Ubuntu Disco):
importance:	Undecided → Medium
Changed in systemd (Ubuntu Eoan):
importance:	Undecided → Medium
Changed in keepalived (Ubuntu Bionic):
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in keepalived (Ubuntu Disco):
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in keepalived (Ubuntu Eoan):
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Bionic):
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Disco):
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Eoan):
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in netplan:
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in systemd (Ubuntu Eoan):
status:	Confirmed → In Progress
Changed in keepalived (Ubuntu Eoan):
status:	Confirmed → In Progress

Lucas Kanashiro (lucaskanashiro) on 2019-09-20

Changed in heartbeat (Ubuntu Bionic):
importance:	Undecided → Medium
status:	New → Triaged
Changed in heartbeat (Ubuntu Disco):
importance:	Undecided → Medium
status:	New → Triaged
Changed in heartbeat (Ubuntu Eoan):
importance:	Undecided → Low
status:	New → Triaged
Changed in heartbeat (Ubuntu Bionic):
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in heartbeat (Ubuntu Disco):
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)
Changed in heartbeat (Ubuntu Eoan):
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)

Rafael David Tinoco (rafaeldtinoco) wrote on 2019-09-25:

#14

Alright,

As this is a problem that affects not only keepalived but all cluster-like software dealing with aliases on any existing interface, managed or not by systemd, I have run the same test case on a pacemaker-based cluster with 3 nodes, having 1 virtual IP + a lighttpd instance running in the same resource group:

----

(k)inaddy@kcluster01:~$ crm config show
node 1: kcluster01
node 2: kcluster02
node 3: kcluster03
primitive fence_kcluster01 stonith:fence_virsh \
	params ipaddr=192.168.100.205 plug=kcluster01 action=off login=stonithmgr passwd=xxxx use_sudo=true delay=2 \
	op monitor interval=60s
primitive fence_kcluster02 stonith:fence_virsh \
	params ipaddr=192.168.100.205 plug=kcluster02 action=off login=stonithmgr passwd=xxxx use_sudo=true delay=4 \
	op monitor interval=60s
primitive fence_kcluster03 stonith:fence_virsh \
	params ipaddr=192.168.100.205 plug=kcluster03 action=off login=stonithmgr passwd=xxxx use_sudo=true delay=6 \
	op monitor interval=60s
primitive virtual_ip IPaddr2 \
	params ip=10.0.3.1 nic=eth3 \
	op monitor interval=10s
primitive webserver systemd:lighttpd \
	op monitor interval=10 timeout=60
group webserver_virtual_ip webserver virtual_ip
location l_fence_kcluster01 fence_kcluster01 -inf: kcluster01
location l_fence_kcluster02 fence_kcluster02 -inf: kcluster02
location l_fence_kcluster03 fence_kcluster03 -inf: kcluster03
property cib-bootstrap-options: \
	have-watchdog=true \
	dc-version=2.0.1-9e909a5bdd \
	cluster-infrastructure=corosync \
	cluster-name=debian \
	stonith-enabled=true \
	stonith-action=off \
	no-quorum-policy=stop

----

(k)inaddy@kcluster01:~$ cat /etc/netplan/cluster.yaml 
network:
    version: 2
    renderer: networkd
    ethernets:
        eth1:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.1.2/24]
        eth2:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.2.2/24]
        eth3:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.3.2/24]
        eth4:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.4.2/24]
        eth5:
            dhcp4: no
            dhcp6: no
            addresses: [10.0.5.2/24]

----

And the virtual IP failed right after netplan acted on the systemd-networkd-managed interface.

(k)inaddy@kcluster03:~$ sudo netplan apply
(k)inaddy@kcluster03:~$ ping 10.0.3.1
PING 10.0.3.1 (10.0.3.1) 56(84) bytes of data.
From 10.0.3.4 icmp_seq=1 Destination Host Unreachable
From 10.0.3.4 icmp_seq=2 Destination Host Unreachable
From 10.0.3.4 icmp_seq=3 Destination Host Unreachable
From 10.0.3.4 icmp_seq=4 Destination Host Unreachable
From 10.0.3.4 icmp_seq=5 Destination Host Unreachable
From 10.0.3.4 icmp_seq=6 Destination Host Unreachable
64 bytes from 10.0.3.1: icmp_seq=7 ttl=64 time=0.088 ms
64 bytes from 10.0.3.1: icmp_seq=8 ttl=64 time=0.076 ms

--- 10.0.3.1 ping statistics ---
8 packets transmitted, 2 received, +6 errors, 75% packet loss, time 7128ms
rtt min/avg/max/mdev = 0.076/0.082/0.088/0.006 ms, pipe 4

As explained in this bug's description. With that, virtual_ip_monitor, from pacemaker, noticed the virtual IP was gone and restarted it on the same node:

----

(k)inaddy@kcluster01:~$ crm status
Stack: corosync
Current DC: kcluster01 (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Wed Sep 25 13:11:05 2019
Last change: Wed Sep 25 12:49:56 2019 by root via cibadmin on kcluster01

3 nodes configured
5 resources configured

Online: [ kcluster01 kcluster02 kcluster03 ]

Full list of resources:

fence_kcluster01	(stonith:fence_virsh):	Started kcluster02
 fence_kcluster02	(stonith:fence_virsh):	Started kcluster01
 fence_kcluster03	(stonith:fence_virsh):	Started kcluster01
 Resource Group: webserver_virtual_ip
     webserver	(systemd:lighttpd):	Started kcluster03
     virtual_ip	(ocf::heartbeat:IPaddr2):	FAILED kcluster03

Failed Resource Actions:
* virtual_ip_monitor_10000 on kcluster03 'not running' (7): call=100, status=complete, exitreason='',
    last-rc-change='Wed Sep 25 13:11:05 2019', queued=0ms, exec=0ms

----

(k)inaddy@kcluster01:~$ crm status
Stack: corosync
Current DC: kcluster01 (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Wed Sep 25 13:11:07 2019
Last change: Wed Sep 25 12:49:56 2019 by root via cibadmin on kcluster01

3 nodes configured
5 resources configured

Online: [ kcluster01 kcluster02 kcluster03 ]

Full list of resources:

fence_kcluster01	(stonith:fence_virsh):	Started kcluster02
 fence_kcluster02	(stonith:fence_virsh):	Started kcluster01
 fence_kcluster03	(stonith:fence_virsh):	Started kcluster01
 Resource Group: webserver_virtual_ip
     webserver	(systemd:lighttpd):	Started kcluster03
     virtual_ip	(ocf::heartbeat:IPaddr2):	Started kcluster03

Failed Resource Actions:
* virtual_ip_monitor_10000 on kcluster03 'not running' (7): call=100, status=complete, exitreason='',
    last-rc-change='Wed Sep 25 13:11:05 2019', queued=0ms, exec=0ms

----

And, if I want, I can query the number of restarts that particular resource (the virtual_ip monitor) had in that node, to check if the resource was about to migrate to another node, thinking this was a real failure (and it is ?):

(k)inaddy@kcluster01:~$ sudo crm_failcount --query -r virtual_ip  -N kcluster03
scope=status  name=fail-count-virtual_ip value=5

So this resource has already failed 5 times on that node, and a "netplan apply" could have caused the resource to migrate, for example.
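
A minimal sketch, in Python, of how a fail count like the one printed above could be compared against a threshold. The helper names are hypothetical, and the threshold is an assumption: the cluster configuration in this comment does not set Pacemaker's migration-threshold meta-attribute, so the numbers passed in below are for illustration only.

```python
def parse_failcount(output):
    """Parse one line of `crm_failcount --query` output into a dict."""
    fields = {}
    for token in output.split():
        key, sep, value = token.partition("=")
        if sep:  # keep only key=value tokens
            fields[key] = value
    return fields


def reached_threshold(output, migration_threshold):
    """True if the reported fail count is at or above the given threshold.

    `migration_threshold` is supplied by the caller; it is an assumed value,
    not something read from the cluster.
    """
    value = parse_failcount(output)["value"]
    # The fail count may be reported as INFINITY rather than a number.
    count = float("inf") if value == "INFINITY" else int(value)
    return count >= migration_threshold


# The exact line shown in the transcript above.
SAMPLE = "scope=status  name=fail-count-virtual_ip value=5"
```

With the sample line, `reached_threshold(SAMPLE, 5)` is true and `reached_threshold(SAMPLE, 6)` is false, which is the kind of check that tells you how close a node is to losing the resource.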

----

For pacemaker, the issue is not *that big* if the cluster is configured correctly - with a resource monitor - as the cluster will always try to restart the virtual IP associated with the resource - lighttpd in my case - being managed. Nevertheless, resource migrations and possible downtime could happen in the event of multiple resource monitor failures.

I'll now check why keepalived can't simply re-establish the virtual IPs in the event of a failure, like pacemaker does, and whether systemd-networkd should be altered to leave aliases alone when a specific flag is set, or whether things are good the way they are.

Rafael David Tinoco (rafaeldtinoco) on 2019-09-26

summary:

- [master] Restarting systemd-networkd breaks keepalived clusters
+ [master] Restarting systemd-networkd breaks keepalived, heartbeat,
+ corosync, pacemaker (interface aliases are restarted)

Rafael David Tinoco (rafaeldtinoco) wrote on 2019-09-26:

#15

The commits below implement support to "keep configuration":

commit 1e498853a39b46155cb89b5c9e74ecb27aaba3ed
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 01:21:13 2019

test-network: add tests for KeepConfiguration=

commit c98d78d32abba6aadbe89eece7acf0742f59047c
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 03:37:25 2019

man: add documentation about KeepConfiguration

commit db51778f85cb076e9ed1fe7f7e29cc740365c245
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 00:33:13 2019

network: make KeepConfiguration=static drop DHCP addresses and routes

Also, KeepConfiguration=dhcp drops static foreign addresses and routes.

commit 95355a281c06c5970b7355c38b066910c3be4958
Author: Yu Watanabe <email address hidden>
Date: Mon Jun 3 14:05:26 2019

network: add KeepConfiguration=dhcp-on-stop

The option prevents to drop lease address on stop.
By setting this, we can safely restart networkd.

commit 7da377ef16a2112a673247b39041a180b07e973a
Author: Susant Sahani <email address hidden>
Date: Mon Jun 3 00:31:13 2019

networkd: add support to keep configuration

for systemd-networkd.

IMO, we should rely on setting the keep configuration flag for the interfaces to be managed by 3rd-party software (adding/removing aliases for virtual networks, VRRP interfaces, etc).

Christian Ehrhardt  (paelzer) wrote on 2019-09-26:

#16

If you are too lazy to look for these commits, feel free to use these links:
https://github.com/systemd/systemd/commit/7da377ef16a2112a673247b39041a180b07e973a
https://github.com/systemd/systemd/commit/95355a281c06c5970b7355c38b066910c3be4958
https://github.com/systemd/systemd/commit/db51778f85cb076e9ed1fe7f7e29cc740365c245
https://github.com/systemd/systemd/commit/c98d78d32abba6aadbe89eece7acf0742f59047c
https://github.com/systemd/systemd/commit/1e498853a39b46155cb89b5c9e74ecb27aaba3ed

Edward Hope-Morley (hopem) wrote on 2019-09-28:

#17

Thanks Rafael/Christian,

I see that all those patches are in 243 and Eoan is currently on 242 (albeit -6, but I don't think any are already backported), so we'll need to get this backported all the way down to Bionic.

max@power:~/git/systemd$ _c=( 7da377e 95355a2 db51778 c98d78d 1e49885 )
max@power:~/git/systemd$ for c in ${_c[@]}; do git tag --contains $c| egrep -v "\-rc"; done| sort -u
v243

Do we have a feel for if/when the keepalived fix(es) will be backportable to B (1.x) as well? Since those fixes already exist in Disco (2.0.10) it might be easier to start with those?

I will add the charm-keepalived to this LP since it will need support for the networkd/netplan fix once that is available.

Rafael David Tinoco (rafaeldtinoco) wrote on 2019-10-09:

#18

@ed,

I just finished the backport to Eoan; it was straightforward. I'll finish tests tomorrow with HA-related software and networkd-enabled HA clusters. After that I'll give you a better estimate for Disco and Bionic.

This is the total size of changes (systemd-networkd-tests.py is not so great to backport, will review that):

$ cat *.patch | diffstat
man/systemd.network.xml | 27 +-
src/network/networkd-dhcp4.c | 8
src/network/networkd-link.c | 57 +++++-
src/network/networkd-link.h | 2
src/network/networkd-manager.c | 2
src/network/networkd-network-gperf.gperf | 3
src/network/networkd-network.c | 44 ++++
src/network/networkd-network.h | 26 ++
test/fuzz/fuzz-network-parser/directives.network | 1
test/test-network/conf/24-keep-configuration-static.network | 5
test/test-network/conf/dhcp-client-keep-configuration-dhcp-on-stop.network | 4
test/test-network/conf/dhcp-client-keep-configuration-dhcp.network | 7
test/test-network/systemd-networkd-tests.py | 94 +++++++++-
13 files changed, 235 insertions(+), 45 deletions(-)

Good thing is that the logic is not drastically changed for this feature to exist. Sorry for the delay here, because of freeze we were running to close out some urgent issues for Eoan.

Rafael David Tinoco (rafaeldtinoco) wrote on 2019-10-10:

#19

Test Case:

(k)rafaeldtinoco@kcluster03:~$ crm status
Stack: corosync
Current DC: kcluster02 (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Thu Oct 10 17:13:19 2019
Last change: Thu Oct 10 17:11:48 2019 by root via cibadmin on kcluster01

3 nodes configured
5 resources configured

Online: [ kcluster01 kcluster02 kcluster03 ]

Full list of resources:

fence_kcluster01 (stonith:fence_virsh): Started kcluster02
fence_kcluster02 (stonith:fence_virsh): Started kcluster01
fence_kcluster03 (stonith:fence_virsh): Started kcluster01
Resource Group: webserver_virtual_ip
webserver (systemd:lighttpd): Started kcluster03
virtual_ip (ocf::heartbeat:IPaddr2): Started kcluster03

(k)rafaeldtinoco@kcluster03:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:c3:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:c306/64 scope link
       valid_lft forever preferred_lft forever

(k)rafaeldtinoco@kcluster03:~$ systemctl restart systemd-networkd

(k)rafaeldtinoco@kcluster03:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:c3:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:c306/64 scope link
       valid_lft forever preferred_lft forever

(k)rafaeldtinoco@kcluster03:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:c3:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:c306/64 scope link
       valid_lft forever preferred_lft forever
(k)rafaeldtinoco@kcluster03:~$

Pacemaker logs:

Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6901]: INFO: IP status = no, IP_CIP=
Oct 10 17:14:37 kcluster03 pacemaker-controld[1266]: notice: Result of stop operation for virtual_ip on kcluster03: 0 (ok)
Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6951]: INFO: Adding inet address 10.0.3.1/24 with broadcast address 10.0.3.255 to device eth3
Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6956]: INFO: Bringing device eth3 up
Oct 10 17:14:37 kcluster03 IPaddr2(virtual_ip)[6961]: INFO: /usr/lib/heartbeat/send_arp -i 200 -r 5 -p /run/resource-agents/send_arp-10.0.3.1 eth3 10.0.3.1 auto not_used not_used
Oct 10 17:14:37 kcluster03 pacemaker-controld[1266]: notice: Result of start operation for virtual_ip on kcluster03: 0 (ok)

for the operation.
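
The check done by eye in the test case above (is 10.0.3.1 still listed on eth3?) can be sketched in Python as below. This is only an illustration: the function names are hypothetical, and the sample strings are copied from the `ip addr show eth3` transcripts in this comment rather than read from a live system.

```python
import re

# Matches the "inet <addr>/<prefix> ..." lines printed by `ip addr show`.
# "inet6" lines do not match because whitespace must follow "inet".
_INET_RE = re.compile(r"^\s*inet\s+(\d+\.\d+\.\d+\.\d+)/(\d+)\s+(.*)$")


def ipv4_addresses(ip_addr_output):
    """Return (address, prefix_length, is_secondary) for each IPv4 line."""
    found = []
    for line in ip_addr_output.splitlines():
        match = _INET_RE.match(line)
        if match:
            address, prefix, rest = match.groups()
            found.append((address, int(prefix), "secondary" in rest.split()))
    return found


def has_address(ip_addr_output, address):
    """True if `address` appears as an IPv4 address in the captured output."""
    return any(addr == address for addr, _, _ in ipv4_addresses(ip_addr_output))


# Captured right after `systemctl restart systemd-networkd`: the VIP is gone.
AFTER_RESTART = """\
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:c3:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:c306/64 scope link
       valid_lft forever preferred_lft forever
"""

# Captured after Pacemaker re-added the address as a secondary on eth3.
AFTER_RECOVERY = """\
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:c3:06 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:c306/64 scope link
       valid_lft forever preferred_lft forever
"""
```

Calling `has_address(AFTER_RESTART, "10.0.3.1")` and `has_address(AFTER_RECOVERY, "10.0.3.1")` distinguishes the two states shown in the transcript; on a real node the text would have to come from running `ip addr show eth3` yourself.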

Rafael David Tinoco (rafaeldtinoco) wrote on 2019-10-10:

#20

(k)rafaeldtinoco@kcluster01:~$ sudo vi /etc/systemd/network/10-netplan-eth3.network

(k)rafaeldtinoco@kcluster01:~$ systemctl restart systemd-networkd

(k)rafaeldtinoco@kcluster01:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:f0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:f003/64 scope link
       valid_lft forever preferred_lft forever

(k)rafaeldtinoco@kcluster01:~$ systemctl restart systemd-networkd

(k)rafaeldtinoco@kcluster01:~$ ip addr show eth3
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:f0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:f003/64 scope link
       valid_lft forever preferred_lft forever

Voila. Needs better testing with KeepConfiguration=dhcp.

Rafael David Tinoco (rafaeldtinoco) wrote on 2019-10-11:

#21

Eoan SRU:

MR: https://code.launchpad.net/~rafaeldtinoco/ubuntu/+source/systemd/+git/systemd/+merge/374027
PPA: https://launchpad.net/~rafaeldtinoco/+archive/ubuntu/lp1815101

Add:

KeepConfiguration=static

to the [Network] section of the .network file of each interface whose aliases should be kept.
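
For anyone scripting this change, here is a rough Python sketch that adds the setting to the text of a .network file. It works on strings only and assumes a hypothetical helper name (`add_keep_configuration`); it does not touch any file, and a .network file that netplan regenerates may not keep a hand-made edit, so treat it as an illustration rather than a supported procedure.

```python
def add_keep_configuration(network_text, mode="static"):
    """Return `network_text` with KeepConfiguration=<mode> in [Network].

    The text is returned unchanged if [Network] already sets
    KeepConfiguration=. A [Network] section is appended if none exists.
    """
    lines = network_text.splitlines()
    in_network = False
    network_header = None  # index of the first "[Network]" line, if any
    for index, line in enumerate(lines):
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            in_network = stripped == "[Network]"
            if in_network and network_header is None:
                network_header = index
        elif in_network and stripped.split("=", 1)[0].strip() == "KeepConfiguration":
            return network_text  # already set; leave the text untouched
    setting = "KeepConfiguration=" + mode
    if network_header is None:
        lines += ["", "[Network]", setting]
    else:
        lines.insert(network_header + 1, setting)
    return "\n".join(lines) + "\n"


# Example input modelled on the eth3 configuration used in this bug.
EXAMPLE = (
    "[Match]\n"
    "Name=eth3\n"
    "\n"
    "[Network]\n"
    "LinkLocalAddressing=ipv6\n"
    "Address=10.0.3.2/24\n"
)
```

Running `add_keep_configuration(EXAMPLE)` places `KeepConfiguration=static` directly under the `[Network]` header, and running it again on the result changes nothing.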

Edward Hope-Morley (hopem) on 2019-10-31

tags:

added: sts

Dan Streetman (ddstreet) on 2019-11-07

description:

updated

Rafael David Tinoco (rafaeldtinoco) on 2019-11-07

description:

updated

Łukasz Zemczak (sil2100) wrote on 2019-11-07: Please test proposed package

#22

Hello Leroy, or anyone else affected,

Accepted systemd into eoan-proposed. The package will build now and be available at https://launchpad.net/ubuntu/+source/systemd/242-7ubuntu3.2 in a few hours, and then in the -proposed repository.

Please help us by testing this new package. See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Your feedback will aid us getting this update out to other Ubuntu users.

If this package fixes the bug for you, please add a comment to this bug, mentioning the version of the package you tested and change the tag from verification-needed-eoan to verification-done-eoan. If it does not fix the bug for you, please add a comment stating that, and change the tag to verification-failed-eoan. In either case, without details of your testing we will not be able to proceed.

Further information regarding the verification process can be found at https://wiki.ubuntu.com/QATeam/PerformingSRUVerification . Thank you in advance for helping!

N.B. The updated package will be released to -updates after the bug(s) fixed by this package have been verified and the package has been in -proposed for a minimum of 7 days.

Changed in systemd (Ubuntu Eoan):
status:	In Progress → Fix Committed
tags:	added: verification-needed verification-needed-eoan

Revision history for this message

Ubuntu SRU Bot (ubuntu-sru-bot) wrote on 2019-11-07: Autopkgtest regression report (systemd/242-7ubuntu3.2)

#23

All autopkgtests for the newly accepted systemd (242-7ubuntu3.2) for eoan have finished running.
The following regressions have been reported in tests triggered by the package:

gvfs/1.42.1-1ubuntu1 (amd64)
systemd/242-7ubuntu3.2 (ppc64el)
ndctl/unknown (armhf)
casper/1.427 (amd64)
netplan.io/0.98-0ubuntu1 (ppc64el)
munin/unknown (armhf)
linux-oem-osp1/5.0.0-1026.29 (amd64)

Please visit the excuses page listed below and investigate the failures, proceeding afterwards as per the StableReleaseUpdates policy regarding autopkgtest regressions [1].

https://people.canonical.com/~ubuntu-archive/proposed-migration/eoan/update_excuses.html#systemd

[1] https://wiki.ubuntu.com/StableReleaseUpdates#Autopkgtest_Regressions

Thank you!

Revision history for this message

Rafael David Tinoco (rafaeldtinoco) wrote on 2019-11-19:

#24

(k)rafaeldtinoco@kcluster01:~$ dpkg -l | grep "ii  systemd " 
ii  systemd    243-3ubuntu1    amd64    system and service manager

(k)rafaeldtinoco@kcluster01:~$ for name in kcluster01 kcluster02 kcluster03; do ssh $name "dpkg -l | grep systemd "; done | grep "ii  systemd "

ii  systemd    243-3ubuntu1    amd64    system and service manager
ii  systemd    243-3ubuntu1    amd64    system and service manager
ii  systemd    243-3ubuntu1    amd64    system and service manager
----

(k)rafaeldtinoco@kcluster01:~$ for name in kcluster01 kcluster02 kcluster03; do ssh $name "cat /etc/systemd/network/10-netplan-eth3.network"; done 
[Match]
Name=eth3

[Network]
LinkLocalAddressing=ipv6
Address=10.0.3.2/24
KeepConfiguration=static
[Match]
Name=eth3

[Network]
LinkLocalAddressing=ipv6
Address=10.0.3.3/24
KeepConfiguration=static
[Match]
Name=eth3

[Network]
LinkLocalAddressing=ipv6
Address=10.0.3.4/24
KeepConfiguration=static

----

(k)rafaeldtinoco@kcluster01:~$ crm status
Stack: corosync
Current DC: kcluster01 (version 2.0.1-9e909a5bdd) - partition with quorum
Last updated: Tue Nov 19 16:38:15 2019
Last change: Mon Nov 18 12:41:14 2019 by root via crm_resource on kcluster01

3 nodes configured
5 resources configured

Online: [ kcluster01 kcluster02 kcluster03 ]

Full list of resources:

 fence_kcluster01       (stonith:fence_virsh):  Started kcluster02
 fence_kcluster02       (stonith:fence_virsh):  Started kcluster01
 fence_kcluster03       (stonith:fence_virsh):  Started kcluster01
 Resource Group: webserver_virtual_ip
     webserver  (systemd:lighttpd):     Started kcluster01
     virtual_ip (ocf::heartbeat:IPaddr2):       Started kcluster01

----

(k)rafaeldtinoco@kcluster01:~$ for name in kcluster01 kcluster02 kcluster03; do ssh $name "hostname ; ip addr show eth3"; done

kcluster01
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:a0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:a003/64 scope link 
       valid_lft forever preferred_lft forever
kcluster02
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:1d:1a:cc brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.3/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe1d:1acc/64 scope link 
       valid_lft forever preferred_lft forever
kcluster03
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:13:16 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:1316/64 scope link 
       valid_lft forever preferred_lft forever

----

in parallel:

(k)rafaeldtinoco@kcluster01:~$ journalctl -f -u pacemaker

and check if events are generated (vip monitor detects changes)

----

(k)rafaeldtinoco@kcluster01:~$ systemctl restart systemd-networkd

----

No VIP changes:

(k)rafaeldtinoco@kcluster01:~$ for name in kcluster01 kcluster02 kcluster03; do ssh $name "hostname ; ip addr show eth3"; done

kcluster01
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:11:a0:03 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.2/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet 10.0.3.1/24 brd 10.0.3.255 scope global secondary eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe11:a003/64 scope link 
       valid_lft forever preferred_lft forever
kcluster02
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:1d:1a:cc brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.3/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:fe1d:1acc/64 scope link 
       valid_lft forever preferred_lft forever
kcluster03
5: eth3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP group default qlen 1000
    link/ether 52:54:00:b0:13:16 brd ff:ff:ff:ff:ff:ff
    inet 10.0.3.4/24 brd 10.0.3.255 scope global eth3
       valid_lft forever preferred_lft forever
    inet6 fe80::5054:ff:feb0:1316/64 scope link 
       valid_lft forever preferred_lft forever

and no events generated!
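
The before/after comparison above can also be scripted. This is a minimal sketch, assuming the `ip addr show` output format shown in this comment; the helper names `list_inet4` and `has_vip` are hypothetical and not part of any package.

```shell
#!/bin/sh
# Sketch: print the IPv4 addresses found in `ip addr show` output read from stdin.
# Lines starting with "inet6" are skipped because only field 1 == "inet" matches.
list_inet4() {
    awk '$1 == "inet" { print $2 }'
}

# Succeeds if the address in $2 is listed in the `ip addr show` text passed as $1.
has_vip() {
    printf '%s\n' "$1" | list_inet4 | grep -qxF "$2"
}

# Example on a live node:
#   before=$(ip addr show eth3)
#   systemctl restart systemd-networkd
#   after=$(ip addr show eth3)
#   has_vip "$before" 10.0.3.1/24 && has_vip "$after" 10.0.3.1/24 && echo "VIP kept"
```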

verification-done

tags:

added: verification-done verification-done-eoan
removed: verification-needed verification-needed-eoan

Revision history for this message

Rafael David Tinoco (rafaeldtinoco) wrote on 2019-11-19:

#25

Flagging this as Won't Fix, as heartbeat is already being kept only for historical reasons (and systemd-networkd can work around the problem with the fix we're backporting to it: the KeepConfiguration= setting in the .network file).

Changed in heartbeat (Ubuntu Eoan):
assignee:	Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in heartbeat (Ubuntu Disco):
assignee:	Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in heartbeat (Ubuntu Bionic):
assignee:	Rafael David Tinoco (rafaeldtinoco) → nobody
Changed in heartbeat (Ubuntu):
assignee:	Rafael David Tinoco (rafaeldtinoco) → nobody
no longer affects:	heartbeat (Ubuntu Eoan)
no longer affects:	heartbeat (Ubuntu Disco)
no longer affects:	heartbeat (Ubuntu Bionic)
Changed in heartbeat (Ubuntu):
status:	Triaged → Won't Fix

Revision history for this message

Launchpad Janitor (janitor) wrote on 2019-11-25:

#27

This bug was fixed in the package systemd - 242-7ubuntu3.2

---------------
systemd (242-7ubuntu3.2) eoan; urgency=medium

  [ Dan Streetman ]
  * d/extra/dhclient-enter-resolved-hook:
    - Replace use of bash-only &> with > and 2> (LP: #1849608)
  * d/p/lp1849658-resolved-set-stream-type-during-DnsStream-creation.patch:
    - Fix bug in refcounting TCP stream types (LP: #1849658)
  * d/extra/dhclient-enter-resolved-hook: cleanup temp $newstate file

  [ Rafael David Tinoco ]
  * Add support to KeepConfiguration= fixing behaviour for HA (LP: #1815101)
    - d/p/lp1815101-01-networkd-add-support-to-keep-configuration.patch
    - d/p/lp1815101-02-networkd-stop-clients-when-networkd-shuts-down.patch
    - d/p/lp1815101-03-network-add-KeepConfiguration-dhcp-on-stop.patch
    - d/p/lp1815101-04-network-make-KeepConfiguration-static-drop-DHCP-addr.patch
    - d/p/lp1815101-05-man-add-documentation-about-KeepConfiguration.patch

systemd (242-7ubuntu3.1) eoan; urgency=medium

  [ Balint Reczey ]
  * Fix shutdown and related actions from the login screen (LP: #1847896)
    File: debian/patches/logind-consider-greeter-sessions-suitable-as-display-sess.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=b407dfd8c9dc81594553c27467c35b383333d74c
  * debian/gbp.conf: Set debian-branch to ubuntu-eoan
    File: debian/gbp.conf
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=f399ce2cf4701a2dbb4b3505d2dd17a210d62f5c

  [ Dan Streetman ]
  * Fix bogus routes after DHCP lease change (LP: #1831787)
    Files:
    - debian/patches/lp1831787/0001-networkd-Add-back-static-routes-after-DHCPv4-lease-e.patch
    - debian/patches/lp1831787/0002-network-set-preferred-source-in-removing-route-entry.patch
    - debian/patches/lp1831787/0003-network-lower-log-level-about-critical-connection.patch
    - debian/patches/lp1831787/0004-network-reset-Link-dhcp4_configured-flag-earlier.patch
    - debian/patches/lp1831787/0005-network-split-dhcp_lease_lost-into-small-pieces.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=ced3f5c2f619083f7beb164d94d4ccfe52222fe8
  * Set src address for dhcp 'classless' routes (LP: #1835581)
    File: debian/patches/lp1835581-src-network-networkd-dhcp4.c-set-prefsrc-for-classle.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=6a7ef370fb1335548448920be4ae6176b67044a8
  * Allows cache=no-negative option to be set, ignoring negative answers to
    be cached (LP: #1668771)
    File: debian/patches/lp1668771-resolved-switch-cache-option-to-a-tri-state-option-s.patch
    https://git.launchpad.net/~ubuntu-core-dev/ubuntu/+source/systemd/commit/?id=27def26f5b1d1b8ba314c4a925fc1b7c43837f86

-- Dan Streetman <email address hidden> Fri, 01 Nov 2019 16:33:08 -0400

Changed in systemd (Ubuntu Eoan):
status:	Fix Committed → Fix Released

Revision history for this message

Łukasz Zemczak (sil2100) wrote on 2019-11-25: Update Released

#26

The verification of the Stable Release Update for systemd has completed successfully and the package is now being released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.

Revision history for this message

Dan Streetman (ddstreet) wrote on 2020-01-14:

#28

As Disco reaches EOL next week, marking this as Won't Fix for Disco.

Changed in systemd (Ubuntu Disco):
status:	Confirmed → Won't Fix
Changed in keepalived (Ubuntu Disco):
status:	Confirmed → Won't Fix

Revision history for this message

David Negreira (dnegreira) wrote on 2020-02-07:

#29

Can we backport the fixes to Bionic?

Revision history for this message

Balint Kovacs (kovacs-balint-o) wrote on 2020-02-11:

#30

Hi all,

thanks for the fixes in Eoan. Unfortunately we have a product based on disco and cannot move forward at this time. Being a networking shop, this issue has a serious effect on us and we would like to avoid moving to something like ifupdown2 within our stable branch.

For our users the real impact of the bug is not that the interface that we are currently reconfiguring is suffering a downtime, but the fact that _all_ interfaces have their aliases removed if networkd is restarted. The proposed KeepConfiguration solution kind of defeats the purpose of reconfiguring the interfaces, as old addresses are kept and need to be handled manually. Also it interferes with how DHCP works. I believe this might be an issue for others as well.

From our point of view the ideal solution would be a combination of the keepalived patch that detects VIP removal and systemd version 244 that already supports "networkctl reconfigure" and "networkctl reload".

Is there any chance that v244 is backported to bionic? It is already included in focal and debian stable backports, but unfortunately I am not familiar enough with systemd development to tell what the impact of this would be.

As for keepalived, in bug #1819074 there was an ongoing investigation of the patch that implements the keepalived transition when the VIP is removed. We have traced this functionality back to this patch:

https://github.com/acassen/keepalived/commit/0b1528c76d3fe8d1c5765841df86c59570a036da

It was born before v1.3.6 was released, so we hope that it is self-contained enough for a backport if v2.0 of keepalived is not included in bionic-backports.

Best,
Balint

Rafael David Tinoco (rafaeldtinoco) on 2020-02-13

Changed in keepalived (Ubuntu Xenial):
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)
importance:	Undecided → Medium
status:	New → Confirmed
no longer affects:	heartbeat (Ubuntu Xenial)
Changed in systemd (Ubuntu Xenial):
assignee:	nobody → Rafael David Tinoco (rafaeldtinoco)
importance:	Undecided → Medium
status:	New → Confirmed

Revision history for this message

Rafael David Tinoco (rafaeldtinoco) wrote on 2020-02-13: Re: [Bug 1815101] Re: [master] Restarting systemd-networkd breaks keepalived, heartbeat, corosync, pacemaker (interface aliases are restarted)

#31

Balint, based on your input...

> thanks for the fixes in Eoan. Unfortunately we have a product based on
> disco and cannot move forward at this time. Being a networking shop,
> this issue has a serious effect on us and we would like to avoid moving
> to something like ifupdown2 within our stable branch.

So, Disco is EOL, as it is not an LTS version; that is why it did not
get a fix (as the fix is very close to the one done in Eoan). Since
it's unsupported by the community, it's up to you to backport the Eoan
fixes to Disco if you'd like... you can even create a PPA for your
product and distribute it along.

> For our users the real impact of the bug is not that the interface
> that we are currently reconfiguring is suffering a downtime, but the
> fact that _all_ interfaces have their aliases removed if networkd is
> restarted. The proposed KeepConfiguration solution kind of defeats the
> purpose of reconfiguring the interfaces, as old addresses are kept and
> need to be handled manually. Also it interferes with how DHCP works. I
> believe this might be an issue for others as well.

We are following systemd-networkd upstream decisions here. The option
"dhcp" only exists for CERTAIN scenarios (when root disk depends on
that connection, for iSCSI and/or NFS/ROOT for example). It is
explicitly said in the documentation:

"""
Takes a boolean or one of "static", "dhcp-on-stop", "dhcp". When
"static", systemd-networkd will not drop static addresses and routes
on starting up process. When set to "dhcp-on-stop", systemd-networkd
will not drop addresses and routes on stopping the daemon. When
"dhcp", the addresses and routes provided by a DHCP server will never
be dropped even if the DHCP lease expires. This is contrary to the
DHCP specification, but may be the best choice if, e.g., the root
filesystem relies on this connection. The setting "dhcp" implies
"dhcp-on-stop", and "yes" implies "dhcp" and "static". Defaults to
"no".
"""

and it is a question of choice: to have a window of opportunity for
duplicate IPs - in cases where there is no dynamic IP mapping to that
MAC address - but possibly maintain the connection instead of causing
uninterruptible I/Os when trying to shut down a machine, for example. I
particularly don't like this option, but it is not the default one and
was meant for a specific purpose.
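
To make the "implies" relationships in the quoted documentation concrete, here is a small sketch. The function name `keepconfig_implies` is hypothetical, and as an assumption it only handles the spellings "yes" and "no" for the boolean forms; it merely restates the quoted text and does not query networkd.

```shell
#!/bin/sh
# Sketch: expand a KeepConfiguration= value into the behaviours it implies,
# following the quoted text: "dhcp" implies "dhcp-on-stop",
# and "yes" implies "dhcp" and "static". "no" (the default) implies nothing.
keepconfig_implies() {
    case $1 in
        no)           echo "" ;;
        static)       echo "static" ;;
        dhcp-on-stop) echo "dhcp-on-stop" ;;
        dhcp)         echo "dhcp dhcp-on-stop" ;;
        yes)          echo "static dhcp dhcp-on-stop" ;;
        *)            echo "unknown value: $1" >&2; return 1 ;;
    esac
}

# Example:
#   keepconfig_implies dhcp
```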

>
> From our point of view the ideal solution would be a combination of the
> keepalived patch that detects VIP removal and systemd version 244 that
> already supports "networkctl reconfigure" and "networkctl reload".

networkctl reconfigure/reload is new functionality and won't be
added to already released versions, as this is against SRU
guidelines. Systemd 244.2-1ubuntu1 is being included in 20.04, our
next LTS version.
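
For anyone scripting around this difference between releases, a rough sketch follows. It assumes the first line of `networkctl --version` has the form "systemd <number> (...)"; the helper name `supports_networkctl_reload` is hypothetical, and the threshold of 244 comes from the comment quoted above.

```shell
#!/bin/sh
# Sketch: succeed if the version text passed as $1 (e.g. the output of
# `networkctl --version`) reports systemd 244 or newer on its first line.
supports_networkctl_reload() {
    ver=$(printf '%s\n' "$1" | awk 'NR == 1 && $1 == "systemd" { print $2 }')
    case $ver in
        ''|*[!0-9]*) return 1 ;;   # missing or non-numeric version: assume unsupported
    esac
    [ "$ver" -ge 244 ]
}

# Example on a live node:
#   if supports_networkctl_reload "$(networkctl --version)"; then
#       networkctl reload            # re-read the .network files
#       networkctl reconfigure eth3  # re-apply configuration to a single link
#   else
#       systemctl restart systemd-networkd
#   fi
```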

As said before, you can try backporting systemd 244 to disco, or
bionic, if you are willing to support it on your own as it was already
EOL for community support. You should follow:
https://packaging.ubuntu.com/html/backports.html if you would like to
do that.

For the keepalived patches, they could be backported to Eoan, maybe
Bionic and Xenial depending on the amount of work. But then I would
need a practical example of wh...