libreswan unconfigures vti interfaces in temporary network outage

Bug #1751379 reported by Joel N. Weber II
This bug affects 1 person
Affects: libreswan (Ubuntu)
Status: New
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

On an Ubuntu 17.10 system, libreswan unconfigures the vti interfaces when a temporary network outage occurs and does not reconfigure them when the outage is over. Examples of such an outage are a firmware upgrade on an Ethernet switch in the network path, temporarily disconnecting the interface via the virtualization platform, or failing to configure AWS's recommended lifetime and/or dead peer detection settings. The vti interfaces disappear from the output of "ip addr" during the temporary failure and do not reappear until "systemctl restart ipsec" is run manually, so the system does not recover from the outage on its own.

Additionally, libreswan doesn't seem to configure the vti interfaces successfully at boot time, but manually running systemctl restart ipsec shortly after a reboot works. (Given that I'm relying on systemd-networkd to configure the dummy0 interface with the globally routable IP address being used, there's a chance that libreswan might be starting before dummy0 gets configured.)
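The symptom described above (vti interfaces missing from the output of ip addr) can be checked for mechanically. The following is only a minimal sketch, not something taken from this report: it assumes the one-line-per-address format printed by "ip -o addr show", in which the second whitespace-separated field is the interface name, and the function names are hypothetical.

```python
import subprocess


def interfaces_with_addresses(ip_output):
    """Return the set of interface names seen in `ip -o addr show` output.

    Assumption: each line looks like "5: vti01    inet 169.254.255.254/30 ...",
    so the second field is the interface name. A trailing ":" or an "@peer"
    suffix is stripped in case the output comes from `ip -o link show`.
    """
    names = set()
    for line in ip_output.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue
        names.add(fields[1].rstrip(":").split("@")[0])
    return names


def missing_interfaces(expected, ip_output):
    """Return the expected interface names that do not appear in the output."""
    present = interfaces_with_addresses(ip_output)
    return [name for name in expected if name not in present]


def check_live(expected):
    """Run `ip -o addr show` on this host and report missing interfaces."""
    result = subprocess.run(
        ["ip", "-o", "addr", "show"],
        capture_output=True, text=True, check=True,
    )
    return missing_interfaces(expected, result.stdout)
```

Calling check_live(["vti01"]) on the affected host would return ["vti01"] while the interface is unconfigured and an empty list once it is back.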

left=, right=, and leftvti= values have been redacted for posting in this bug report, and I have only included one of the several connections here, but the rest of the configuration below reflects what I have in /etc/ipsec.d/aws.conf.

Additionally, the documentation suggested that I could set mark to -1 for all tunnels to automatically get a unique mark for each one, but I found that some of the tunnels failed to work when I used -1 and started working when I manually assigned a unique mark value to each.

I am using bird to run BGP across these tunnels.

conn aws-base
     fragmentation=yes
     dpdaction=restart
     dpddelay=10
     dpdtimeout=30
     ikelifetime=28800
     salifetime=3600
     auto=start
     authby=secret
     ike=aes256-sha2-dh24
     phase2=esp
     phase2alg=aes256-sha2;dh24
     type=tunnel
     vti-routing=no
     left=100.64.36.16
     leftsubnet=0.0.0.0/0
     rightsubnet=0.0.0.0/0

conn aws-1
     also=aws-base
     vti-interface=vti01
     leftvti=169.254.255.254/30
     right=100.64.25.4
     mark=1001/0xffffffff
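Since each tunnel ended up needing its own manually assigned mark, a quick duplicate check over the configuration may be useful. This is a rough sketch only: it assumes the simple "conn NAME" / "key=value" layout shown above, does not follow also= references, and uses hypothetical function names.

```python
from collections import defaultdict


def marks_by_conn(conf_text):
    """Map each conn name to the mark= value set directly in its section."""
    marks = {}
    current = None
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line:
            continue
        if line.startswith("conn "):
            current = line.split(None, 1)[1].strip()
        elif line.startswith("config "):
            current = None  # "config setup" is not a connection
        elif current is not None and "=" in line:
            key, value = line.split("=", 1)
            if key.strip() == "mark":
                marks[current] = value.strip()
    return marks


def duplicate_marks(conf_text):
    """Return {mark: [conn, ...]} for every mark used by more than one conn."""
    by_mark = defaultdict(list)
    for conn, mark in marks_by_conn(conf_text).items():
        by_mark[mark].append(conn)
    return {mark: conns for mark, conns in by_mark.items() if len(conns) > 1}
```

Running duplicate_marks on the contents of /etc/ipsec.d/aws.conf would return an empty dict when every connection has a distinct mark.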

Joel N. Weber II (joelweber) wrote :

Upgrading from Ubuntu 17.10 to 18.04 appears to have fixed the problem with the vti interfaces disappearing from the output of ip addr during a network glitch.

However, I still see a failure of the vti interfaces to come up automatically at boot without manually running systemctl start ipsec, and I still find that after a temporary network glitch, the tunnels do not promptly resume passing traffic.

Joel N. Weber II (joelweber) wrote :

I've ended up with a combination of two workarounds: a leftupdown script modified so that it does not unconfigure the interface, and a cron job that checks the output of birdc show protocols all. If bird reports a failure, the cron job runs ipsec auto --down [tunnelname], waits ten seconds, and then runs ipsec auto --up [tunnelname]. This finally seems to be an adequate workaround for achieving a reasonable approximation of the desired stability.

That cron job is relatively recent, and I have not done testing to determine whether the cron job makes the modified leftupdown script obsolete.

An earlier version of the cron job relied on looking at the output of ipsec status to determine whether a tunnel was working, and in some cases that led to the script not restarting a tunnel that needed to be restarted.
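For readers who want to try something similar, here is a rough sketch of such a watchdog. It is not the actual cron job from this report. It assumes the summary lines of birdc's output have the columns name, proto, table, state, since, info, with "Established" appearing in the info column for a healthy BGP session, and the mapping from bird protocol names to libreswan connection names (PROTOCOL_TO_TUNNEL, "aws1") is hypothetical.

```python
import subprocess
import time

# Hypothetical mapping from bird BGP protocol names to libreswan conn names.
PROTOCOL_TO_TUNNEL = {"aws1": "aws-1"}


def failed_bgp_sessions(birdc_output):
    """Return names of BGP protocols whose summary line lacks 'Established'.

    Indented lines (the per-protocol details printed by `show protocols all`)
    and non-BGP protocols are ignored.
    """
    failed = []
    for line in birdc_output.splitlines():
        if not line or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 4 or fields[1] != "BGP":
            continue
        if not any(f.lower() == "established" for f in fields[4:]):
            failed.append(fields[0])
    return failed


def restart_tunnel(tunnel, run=subprocess.run, sleep=time.sleep):
    """Take the tunnel down, wait ten seconds, and bring it back up."""
    run(["ipsec", "auto", "--down", tunnel])
    sleep(10)
    run(["ipsec", "auto", "--up", tunnel])


def main():
    """Entry point intended to be invoked from cron."""
    result = subprocess.run(
        ["birdc", "show", "protocols", "all"],
        capture_output=True, text=True,
    )
    for protocol in failed_bgp_sessions(result.stdout):
        tunnel = PROTOCOL_TO_TUNNEL.get(protocol)
        if tunnel is not None:
            restart_tunnel(tunnel)
```

Keying the decision on the BGP session state rather than on ipsec status follows the observation above that ipsec status sometimes missed tunnels that needed a restart.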

The startup at boot order problem mentioned in this bug report has not occurred recently.

Joel N. Weber II (joelweber) wrote :

Because Debian 10 seems to have a more robust security update policy for the libreswan package than Ubuntu does, I've moved this functionality from Ubuntu to Debian 10, and I believe I got a newer version of libreswan in the process.

On Debian 10, the cron job still seems to be needed: it checks birdc's output for BGP session status and, when the BGP session isn't ESTABLISHED, runs ipsec auto --down [tunnelname], sleep 10, and ipsec auto --up [tunnelname]. The default supplied leftupdown script, however, seems to work fine unmodified.
