systemd-networkd thinks it loses its lease every renewal
| Affects | Status | Importance | Assigned to | Milestone |
|---|---|---|---|---|
| systemd (Ubuntu) | Invalid | Undecided | Unassigned | |
Bug Description
With a server running 20.04 on AWS, I noticed connectivity glitches once per half hour. I eventually managed to correlate them with DHCP renewals. Each time systemd-networkd does a renewal, it logs that the lease was lost and goes through a cycle of removing and re-adding the IP and routes (even though it's the same IP and routes). This causes disruption, especially to SNATted flows; if a packet arrives for an SNATted flow during the window where the IP is removed then (I think) the host sends a RST and the flow gets torn down. (In any case, such flows get lost during the glitch.)
I'd expect a DHCP renewal to be completely transparent; the IP shouldn't flap, it should just be updated to have a longer lifetime.
I managed to capture a PCAP of the DHCP renewals along with a debug log from systemd-networkd.
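For anyone wanting to check for the same flap independently of the networkd log, here is a minimal sketch that scans saved output from `ip -o monitor address` for an address that is deleted and then re-added. The function name `find_flaps` is hypothetical, and the line layout it matches (a `Deleted` prefix on removals, `inet <addr>/<prefix>` for the address) is an assumption that may differ between iproute2 versions. It is not what I used to produce the capture above.

```python
import re
from typing import Iterable, List, Tuple

# Pulls the address out of a line such as the ones `ip -o monitor address`
# prints. The exact layout is an assumption, not taken from this report.
ADDR_RE = re.compile(r"\binet6?\s+(\S+)")


def find_flaps(lines: Iterable[str]) -> List[Tuple[str, int, int]]:
    """Return (address, deleted_at, readded_at) for each address that was
    removed and later re-added. Positions are 0-based line indices."""
    pending = {}  # address -> index of the line that reported its deletion
    flaps = []
    for idx, line in enumerate(lines):
        match = ADDR_RE.search(line)
        if not match:
            continue
        addr = match.group(1)
        if line.lstrip().startswith("Deleted"):
            # Removal event: remember it until the address shows up again.
            pending[addr] = idx
        elif addr in pending:
            # The same address came back after a removal: that is a flap.
            flaps.append((addr, pending.pop(addr), idx))
    return flaps
```

To use it, save the monitor output across at least one renewal interval (for example `ip -o monitor address > addr.log`) and call `find_flaps(open("addr.log"))`. A transparent renewal should return an empty list, while the behaviour described here should produce one entry per renewal.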
ProblemType: Bug
DistroRelease: Ubuntu 19.10
Package: systemd 242-7ubuntu3.11
ProcVersionSign
Uname: Linux 5.3.0-1030-aws x86_64
ApportVersion: 2.20.11-0ubuntu8.9
Architecture: amd64
Date: Fri Sep 18 13:05:22 2020
Ec2AMI: ami-0d3d788094d
Ec2AMIManifest: (unknown)
Ec2Availability
Ec2InstanceType: t3.large
Ec2Kernel: unavailable
Ec2Ramdisk: unavailable
Lsusb: Error: command ['lsusb'] failed with exit code 1:
MachineType: Amazon EC2 t3.large
ProcEnviron:
TERM=xterm-
PATH=(custom, no user)
XDG_RUNTIME_
LANG=C.UTF-8
SHELL=/bin/bash
ProcKernelCmdLine: BOOT_IMAGE=
SourcePackage: systemd
UpgradeStatus: No upgrade log present (probably fresh install)
dmi.bios.date: 10/16/2017
dmi.bios.vendor: Amazon EC2
dmi.bios.version: 1.0
dmi.board.
dmi.board.vendor: Amazon EC2
dmi.chassis.
dmi.chassis.type: 1
dmi.chassis.vendor: Amazon EC2
dmi.modalias: dmi:bvnAmazonEC
dmi.product.name: t3.large
dmi.sys.vendor: Amazon EC2

Hmm, looks like this server is on 19.10, not 20.04 as I'd thought.