Buffer overflow when open fds exceed FD_SETSIZE

Bug #1549436 reported by ruslan_ka on 2016-02-24
This bug affects 2 people
Affects                     Importance  Assigned to
strongswan (Ubuntu)         Undecided   Unassigned
strongswan (Ubuntu Trusty)  Undecided   Unassigned

Bug Description

Under some conditions AppArmor denies /usr/lib/ipsec/charon access to /dev/tty, which causes the daemon to restart:

    Feb 24 07:06:04 vpn-01 kernel: [548017.000283] type=1400 audit(1456297564.902:21): apparmor="DENIED" operation="open" profile="/usr/lib/ipsec/charon" name="/dev/tty" pid=24255 comm="charon" requested_mask="rw" denied_mask="rw" fsuid=0 ouid=0
    Feb 24 07:06:10 vpn-01 charon: 00[DMN] Starting IKE charon daemon (strongSwan 5.1.2, Linux 3.13.0-48-generic, x86_64)

I'm not sure why charon requests RW access to /dev/tty, but it started after I installed and configured the xauth-eap plugin (which allows an EAP plugin to be used as a backend for XAuth credential verification).

When strongSwan is used with a RADIUS backend, the restart causes further problems beyond client reconnection: RADIUS continues to think that all users are still logged in.
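
To make the denial record easier to read, here is a minimal sketch that splits the audit line quoted above into key/value fields. The helper name `parse_audit_fields` is hypothetical and is not part of AppArmor or strongSwan; the input is the log line from this report, starting at `type=1400`.

```python
import shlex

# Audit record copied from the log line in this report (syslog prefix dropped).
AUDIT_LINE = (
    'type=1400 audit(1456297564.902:21): apparmor="DENIED" operation="open" '
    'profile="/usr/lib/ipsec/charon" name="/dev/tty" pid=24255 comm="charon" '
    'requested_mask="rw" denied_mask="rw" fsuid=0 ouid=0'
)


def parse_audit_fields(text):
    """Hypothetical helper: collect key=value tokens from an audit record."""
    fields = {}
    # shlex.split strips the double quotes around values such as "DENIED".
    for token in shlex.split(text):
        if "=" in token:
            key, value = token.split("=", 1)
            fields[key] = value
    return fields


fields = parse_audit_fields(AUDIT_LINE)
print(fields["profile"], fields["name"], fields["denied_mask"])
```

The fields that matter here are `profile` (the confined binary), `name` (the path that was refused) and `denied_mask` (the access mode that was refused).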

# lsb_release -rd
Description: Ubuntu 14.04.3 LTS
Release: 14.04

# apt-cache policy strongswan
strongswan:
  Installed: 5.1.2-0ubuntu2.4
  Candidate: 5.1.2-0ubuntu2.4
  Version table:
 *** 5.1.2-0ubuntu2.4 0
        500 http://us-west-2.ec2.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
        500 http://security.ubuntu.com/ubuntu/ trusty-security/main amd64 Packages
        100 /var/lib/dpkg/status
     5.1.2-0ubuntu2 0
        500 http://us-west-2.ec2.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages

# apt-cache policy strongswan-plugin-xauth-eap
strongswan-plugin-xauth-eap:
  Installed: 5.1.2-0ubuntu2.4
  Candidate: 5.1.2-0ubuntu2.4
  Version table:
 *** 5.1.2-0ubuntu2.4 0
        500 http://us-west-2.ec2.archive.ubuntu.com/ubuntu/ trusty-updates/universe amd64 Packages
        500 http://security.ubuntu.com/ubuntu/ trusty-security/universe amd64 Packages
        100 /var/lib/dpkg/status
     5.1.2-0ubuntu2 0
        500 http://us-west-2.ec2.archive.ubuntu.com/ubuntu/ trusty/universe amd64 Packages

# apt-cache policy apparmor
apparmor:
  Installed: 2.8.95~2430-0ubuntu5.3
  Candidate: 2.8.95~2430-0ubuntu5.3
  Version table:
 *** 2.8.95~2430-0ubuntu5.3 0
        500 http://us-west-2.ec2.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
        100 /var/lib/dpkg/status
     2.8.95~2430-0ubuntu5.1 0
        500 http://security.ubuntu.com/ubuntu/ trusty-security/main amd64 Packages
     2.8.95~2430-0ubuntu5 0
        500 http://us-west-2.ec2.archive.ubuntu.com/ubuntu/ trusty/main amd64 Packages

For now I've disabled AppArmor for strongSwan and continue to monitor the situation:
# sudo apparmor_parser -R /etc/apparmor.d/usr.lib.ipsec.charon
# sudo apparmor_parser -R /etc/apparmor.d/usr.lib.ipsec.stroke
# sudo ln -s /etc/apparmor.d/usr.lib.ipsec.charon /etc/apparmor.d/disable/
# sudo ln -s /etc/apparmor.d/usr.lib.ipsec.stroke /etc/apparmor.d/disable/
# sudo apparmor_status
apparmor module is loaded.
5 profiles are loaded.
5 profiles are in enforce mode.
   /sbin/dhclient
   /usr/lib/NetworkManager/nm-dhcp-client.action
   /usr/lib/connman/scripts/dhclient-script
   /usr/sbin/named
   /usr/sbin/tcpdump
0 profiles are in complain mode.
2 processes have profiles defined.
2 processes are in enforce mode.
   /sbin/dhclient (697)
   /usr/sbin/named (1097)
0 processes are in complain mode.
0 processes are unconfined but have a profile defined.

# sudo cat /etc/apparmor.d/usr.lib.ipsec.charon
# ------------------------------------------------------------------
#
# Copyright (C) 2013 Canonical Ltd.
#
# This program is free software; you can redistribute it and/or
# modify it under the terms of version 2 of the GNU General Public
# License published by the Free Software Foundation.
#
# Author: Jonathan Davies <email address hidden>
#
# ------------------------------------------------------------------

#include <tunables/global>

/usr/lib/ipsec/charon {
  #include <abstractions/base>
  #include <abstractions/nameservice>
  #include <abstractions/authentication>
  #include <abstractions/openssl>

  capability net_admin,
  capability net_raw,

  network,
  network raw,

  /bin/dash rmPUx,

  /etc/ipsec.conf r,
  /etc/ipsec.secrets r,
  /etc/ipsec.*.secrets r,
  /etc/ipsec.d/ r,
  /etc/ipsec.d/** r,
  /etc/strongswan.conf r,
  /etc/strongswan.d/ r,
  /etc/strongswan.d/** r,
  /etc/tnc_config r,

  /proc/sys/net/core/xfrm_acq_expires w,

  /run/charon.* rw,

  /usr/lib/ipsec/charon rmix,
  /usr/lib/ipsec/imcvs/ r,
  /usr/lib/ipsec/imcvs/** rm,

  # Site-specific additions and overrides. See local/README for details.
  #include <local/usr.lib.ipsec.charon>
}

Simon Déziel (sdeziel) wrote :

@ruslan_ka, after disabling the Apparmor profiles, did you receive a prompt for a user/password or something when starting Strongswan?

Changed in strongswan (Ubuntu):
status: New → Incomplete
ruslan_ka (r-kalakutsky) wrote :

Hello Simon,

No, I do not have encrypted certs and StrongSwan works well as a service without user interaction:

# sudo ipsec start --nofork
Starting strongSwan 5.1.2 IPsec [starter]...
00[DMN] Starting IKE charon daemon (strongSwan 5.1.2, Linux 3.13.0-48-generic, x86_64)
00[CFG] loading ca certificates from '/etc/ipsec.d/cacerts'
00[CFG] loaded ca certificate "C=US, O=ShareG.co, OU=VPN Dept, CN=ca-root.shareg.co, <email address hidden>" from '/etc/ipsec.d/cacerts/cacert.pem'
00[CFG] loading aa certificates from '/etc/ipsec.d/aacerts'
00[CFG] loading ocsp signer certificates from '/etc/ipsec.d/ocspcerts'
00[CFG] loading attribute certificates from '/etc/ipsec.d/acerts'
00[CFG] loading crls from '/etc/ipsec.d/crls'
00[CFG] loading secrets from '/etc/ipsec.secrets'
00[CFG] loaded RSA private key from '/etc/ipsec.d/private/vpn.shareg.co.pem'
00[CFG] loaded IKE secret for vpn.shareg.co
00[LIB] loaded plugins: charon test-vectors aes rc2 sha1 sha2 md4 md5 rdrand random nonce x509 revocation constraints pkcs1 pkcs7 pkcs8 pkcs12 pem openssl xcbc cmac hmac ctr ccm gcm attr kernel-netlink resolve socket-default stroke updown eap-identity eap-mschapv2 eap-radius xauth-eap addrblock
...

OR

# sudo service strongswan start && sudo tail /var/log/syslog
Feb 24 22:20:56 vpn-01 charon: 00[DMN] Starting IKE charon daemon (strongSwan 5.1.2, Linux 3.13.0-48-generic, x86_64)
Feb 24 22:20:56 vpn-01 charon: 00[CFG] loading ca certificates from '/etc/ipsec.d/cacerts'
Feb 24 22:20:56 vpn-01 charon: 00[CFG] loaded ca certificate "C=US, O=ShareG.co, OU=VPN Dept, CN=ca-root.shareg.co, <email address hidden>" from '/etc/ipsec.d/cacerts/cacert.pem'
Feb 24 22:20:56 vpn-01 charon: 00[CFG] loading aa certificates from '/etc/ipsec.d/aacerts'
Feb 24 22:20:56 vpn-01 charon: 00[CFG] loading ocsp signer certificates from '/etc/ipsec.d/ocspcerts'
Feb 24 22:20:56 vpn-01 charon: 00[CFG] loading attribute certificates from '/etc/ipsec.d/acerts'
Feb 24 22:20:56 vpn-01 charon: 00[CFG] loading crls from '/etc/ipsec.d/crls'
Feb 24 22:20:56 vpn-01 charon: 00[CFG] loading secrets from '/etc/ipsec.secrets'
Feb 24 22:20:56 vpn-01 charon: 00[CFG] loaded RSA private key from '/etc/ipsec.d/private/vpn.shareg.co.pem'
Feb 24 22:20:56 vpn-01 charon: 00[CFG] loaded IKE secret for vpn.shareg.co
Feb 24 22:20:56 vpn-01 charon: 00[LIB] loaded plugins: charon test-vectors aes rc2 sha1 sha2 md4 md5 rdrand random nonce x509 revocation constraints pkcs1 pkcs7 pkcs8 pkcs12 pem openssl xcbc cmac hmac ctr ccm gcm attr kernel-netlink resolve socket-default stroke updown eap-identity eap-mschapv2 eap-radius xauth-eap addrblock
...

Simon Déziel (sdeziel) wrote :

If you re-enable the Apparmor profile and set your connection to not auto start (use "auto=add") when do you get the access denial on /dev/tty? Is it after restarting the strongswan service or when you call "ipsec up $conn"?

Lastly, would you mind providing an obfuscated version of your ipsec.secrets and ipsec.conf?

ruslan_ka (r-kalakutsky) wrote :

The server only handles incoming VPN requests; it is for mobile road warriors. The error does not occur right after starting strongSwan or bringing tunnels up, so whether auto=add is set makes no difference.

strongSwan serves clients fine and runs for a long time before the first denial. It seems to be related to reauthentication of an XAuth iOS client, but I can't reproduce it. Sometimes a client reauthenticates without problems, as I can see in the logs, but sometimes I see this error right after a successful reauthentication. There are about 5 active clients right now with 20-30 connections per day, and the server gives me the error once or twice per day. I would not even have noticed it if it did not break accounting in RADIUS.

If ipsec runs in debug mode on the console (--nofork), I don't get this error.

$ sudo cat /etc/ipsec.secrets
# This file holds shared secrets or RSA private keys for authentication.

: RSA vpn.server.name.pem
vpn.server.name : PSK "simpletestpsk"

$ sudo cat /etc/ipsec.conf
# ipsec.conf - strongSwan IPsec configuration file

# basic configuration

config setup
 strictcrlpolicy=yes
 # uniqueids = no

# default options

conn %default
        ikelifetime=60m
        keylife=20m
        rekeymargin=3m
        keyingtries=1
        inactivity = 60s
        dpdaction = clear
        dpdtimeout = 5s
        dpddelay = 5s

# Add connections here.

conn ikev1-psk-xauth
        leftsubnet=0.0.0.0/0
        leftfirewall=yes
        <email address hidden>
        leftauth=psk
        right=%any
        rightsourceip=10.0.0.0/9
        rightauth=psk
        rightauth2=xauth-eap
        auto=add

conn ikev2-with-eap
        keyexchange=ikev2
        leftsubnet=0.0.0.0/0
        leftfirewall=yes
        leftid="C=US, O=Server.name.co, OU=VPN Dept, CN=vpn.server.name, <email address hidden>"
        leftauth=pubkey
        leftcert=vpn.server.name.pem
        right=%any
        rightsourceip=10.0.0.0/16
        rightsendcert=never
        rightauth=eap-radius
        eap_identity=%identity
        auto=add

$ sudo cat /etc/strongswan.conf
# strongswan.conf - strongSwan configuration file

charon {
 load_modular = yes
 plugins {
  include strongswan.d/charon/*.conf
 }
 dns1 = 8.8.8.8
}

include strongswan.d/*.conf

$ sudo cat /etc/strongswan.d/charon.conf | grep -v '^[[:space:]]*#'| grep .
charon {
    crypto_test {
    }
    host_resolver {
    }
    leak_detective {
    }
    processor {
        priority_threads {
        }
    }
    tls {
    }
    x509 {
    }
}

$ sudo cat /etc/strongswan.d/charon/xauth-eap.conf | grep -v '^[[:space:]]*#'| grep .
xauth-eap {
    backend = radius
    load = yes
}

$ sudo cat /etc/strongswan.d/charon/eap-radius.conf | grep -v '^[[:space:]]*#'| grep .
eap-radius {
    accounting = yes
    load = yes
    port = 1812
    secret = secret
    server = 127.0.0.1
    sockets = 1000
    dae {
        enable = yes
        listen = 0.0.0.0
        port = 3799
        secret = dae_secret
    }
    forward {
    }
    servers {
    }
    xauth {
    }
}

Simon Déziel (sdeziel) wrote :

On 2016-02-25 10:50 AM, ruslan_ka wrote:
> The server serves only incoming VPN requests, it is for mobile road-
> warriors. And the error does not occur right after starting a
> strongswan or bringing tunnels up. So it makes no sense to run it with
> auto=add or not.

I somehow assumed it was an initiator (client) and not a responder
(server), sorry.

> Strongswan is serving clients ok. It is working for a long time until a
> first DENIAL. It looks like it is somehow related to reauthentication of
> xauth iOS client, but I can't reproduce it. Sometimes client can reauth
> ok, as I can see at logs, but sometimes right after successful reauth I
> see this error. There are about 5 active clients right now with 20-30
> connections per/day, and server gives me an error once/twice per day. I
> would not even note it, if it'd not break accounting at radius.

I have no idea what can cause this access to /dev/tty. I never ran into
this problem on my own server which is similar minus the EAP/RADIUS
part, I use xauth-generic only.

> $ sudo cat /etc/ipsec.conf
> # ipsec.conf - strongSwan IPsec configuration file
>
> # basic configuration
>
> config setup
> strictcrlpolicy=yes
> # uniqueids = no
>
> # default options
>
> conn %default
> ikelifetime=60m
> keylife=20m
> rekeymargin=3m
> keyingtries=1
> inactivity = 60s
> dpdaction = clear
> dpdtimeout = 5s
> dpddelay = 5s

Not related to the problem at hand, but you generally don't want
dpdtimeout to be equal to dpddelay. Having them equal means that losing
a single DPD packet will kill the tunnel and make the client reconnect.

With mobile clients, occasional packet loss shouldn't force the
connection to be re-established. You usually want to redial only after
losing, say, 3 DPD packets. This better detects peers going offline or
being affected by more severe connectivity problems.

As such, I'd recommend something like this:

  dpdtimeout=15s
  dpddelay=5s

Also, keep in mind that a low dpddelay drains the clients' battery as it
keeps the radio transmitter active more often.
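
The arithmetic behind that recommendation can be sketched as follows. This is a simplified model of the reasoning above (how many probe intervals fit into the timeout), not strongSwan's actual timer logic, and the function name is hypothetical.

```python
def dpd_probes_before_teardown(dpddelay_s, dpdtimeout_s):
    """Simplified model: number of dpddelay intervals that fit in dpdtimeout.

    Read it as roughly how many consecutive DPD probes can go unanswered
    before the peer is declared dead.
    """
    if dpddelay_s <= 0:
        raise ValueError("dpddelay must be positive")
    return dpdtimeout_s // dpddelay_s


# (dpddelay, dpdtimeout) pairs: the reported config and the suggested one.
for delay, timeout in [(5, 5), (5, 15)]:
    probes = dpd_probes_before_teardown(delay, timeout)
    print(f"dpddelay={delay}s dpdtimeout={timeout}s -> {probes} probe(s)")
```

With the reported values (5s/5s) the model gives one probe; with 15s/5s it gives three, which matches the "say 3 DPD packets" guideline.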

> # Add connections here.
>
> conn ikev1-psk-xauth
> leftsubnet=0.0.0.0/0
> leftfirewall=yes
> <email address hidden>
> leftauth=psk
> right=%any
> rightsourceip=10.0.0.0/9
> rightauth=psk
> rightauth2=xauth-eap
> auto=add
>
> conn ikev2-with-eap
> keyexchange=ikev2
> leftsubnet=0.0.0.0/0
> leftfirewall=yes
> leftid="C=US, O=Server.name.co, OU=VPN Dept, CN=vpn.server.name, <email address hidden>"
> leftauth=pubkey
> leftcert=vpn.server.name.pem
> right=%any
> rightsourceip=10.0.0.0/16
> rightsendcert=never
> rightauth=eap-radius
> eap_identity=%identity
> auto=add

Again, not related but aren't the 2 rightsourceip= overlapping?

> $ sudo cat /etc/strongswan.conf
> # strongswan.conf - strongSwan configuration file
>
> charon {
> load_modular = yes
> plugins {
> include strongswan.d/charon/*.conf
> }
> dns1 = 8.8.8.8
> }
>
> include strongswan.d/*.conf
>
>
> $ sudo cat /etc/strongswan.d/ch...

ruslan_ka (r-kalakutsky) wrote :

> I have no idea what can cause this access to /dev/tty. I never ran into
> this problem on my own server which is similar minus the EAP/RADIUS
> part, I use xauth-generic only.
xauth-eap works differently. It takes the cleartext password from the client and makes an EAP request to a RADIUS server (EAP-MSCHAPv2 in my case). This allows user passwords to be stored encrypted.

A quick look through the code shows many uses of stdout (for example), but I'm not enough of an expert to analyze them (https://git.strongswan.org/?p=strongswan.git&a=search&h=ddf1fc7692889298e04a4c799bf0c2f67b61ebe9&st=grep&s=stdout).

> As such, I'd recommend something like this:
> dpdtimeout=15s
> dpddelay=5s

Thanks for pointing this out.

> Again, not related but aren't the 2 rightsourceip= overlapping?
It is a strongSwan feature: the IP pool is shared in that case. You can either use
   rightsourceip=%poolname
or use an identical definition in rightsourceip, and strongSwan will share the same pool implicitly.

> I honestly don't know why charon tries to access /dev/tty. Are you able
> to see that message on the console or the upstart log when the Apparmor
> profile is disabled?
With the AppArmor profile disabled everything works fine.

I've just managed to reproduce this error reliably, and it is not related to the xauth-eap module!

Server 1 (where the error occurs) has almost the same config, with an added load-testing section:

$ sudo cat /etc/ipsec.conf | grep -v '^\s*#' | grep .
config setup
 strictcrlpolicy=yes
 uniqueids = no
conn %default
        ikelifetime=60m
        keylife=20m
        rekeymargin=3m
        keyingtries=1
        inactivity = 60s
        dpdaction = clear
        dpdtimeout = 6s
        dpddelay = 5s
conn ikev1-psk-xauth
        leftsubnet=0.0.0.0/0
        leftfirewall=yes
        <email address hidden>
        leftauth=psk
        right=%any
        rightsourceip=10.0.0.0/9
        rightauth=psk
        rightauth2=xauth-eap
        auto=add
conn ikev2-with-eap
        keyexchange=ikev2
        leftsubnet=0.0.0.0/0
        leftfirewall=yes
        leftid="C=US, O=server, OU=VPN Dept, CN=test-vpn.server.name, <email address hidden>"
        leftauth=pubkey
        leftcert=test-vpn.server.name.pem
        right=%any
        rightsourceip=10.0.0.0/16
        rightsendcert=never
        rightauth=eap-radius
        eap_identity=%identity
        auto=add
conn ikev2-with-eap-loadtest
        keyexchange=ikev2
        leftsubnet=0.0.0.0/0
        leftfirewall=yes
        leftid="CN=srv, OU=load-test, O=strongSwan"
        leftauth=pubkey
        leftcert=resp.pem
        right=%any
        rightsourceip=10.0.0.0/16
        rightsendcert=never
        rightauth=eap-radius
        eap_identity=%identity
        auto=add

$ sudo cat /etc/ipsec.secrets | grep -v '^\s*#' | grep .
: RSA test-vpn.server.name.pem
: RSA resp.pem
test-vpn.server.name : PSK "testtest"

Everything else is the same.

Server 2 - load-tester:

$ sudo cat /etc/ipsec.conf | grep -v '^\s*#' | grep .
config setup

$ sudo cat /etc/strongswan.d/charon/load-tester.conf | grep -v '^\s*#' | grep .
load-tester {
    child_rekey = 60
    delay = 500
    delete_after_established = no
    dpd_delay =...

ruslan_ka (r-kalakutsky) wrote :

Looks like I've found the reason charon wants to open /dev/tty. It is only trying to report a buffer overflow error:

01[IKE] CHILD_SA ikev2-with-eap-loadtest{221} established with SPIs c26fb333_i c1ac3989_o and TS 172.31.59.95/32 === 10.0.0.221/32
16[IKE] CHILD_SA ikev2-with-eap-loadtest{222} established with SPIs c0abb568_i c9bb167e_o and TS 172.31.59.95/32 === 10.0.0.222/32
14[NET] received packet: from 172.31.62.150[500] to 172.31.59.95[500] (76 bytes)
02[N*** buffer overflow detected ***: /usr/lib/ipsec/charon terminated
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x7338f)[0x7fbfa132538f]
/lib/x86_64-linux-gnu/libc.so.6(__fortify_fail+0x5c)[0x7fbfa13bcc9c]
/lib/x86_64-linux-gnu/libc.so.6(+0x109b60)[0x7fbfa13bbb60]
/lib/x86_64-linux-gnu/libc.so.6(+0x10abe7)[0x7fbfa13bcbe7]
/usr/lib/ipsec/libradius.so.0(+0x2fcf)[0x7fbf9c0ecfcf]
/usr/lib/ipsec/libradius.so.0(+0x3660)[0x7fbf9c0ed660]
/usr/lib/ipsec/plugins/libstrongswan-eap-radius.so(+0x4af1)[0x7fbf9c2f5af1]
/usr/lib/ipsec/plugins/libstrongswan-eap-radius.so(+0x4f37)[0x7fbf9c2f5f37]
/usr/lib/ipsec/plugins/libstrongswan-eap-radius.so(+0x52af)[0x7fbf9c2f62af]
/usr/lib/ipsec/libcharon.so.0(+0x9e3d)[0x7fbfa189ee3d]
/usr/lib/ipsec/libcharon.so.0(+0x25419)[0x7fbfa18ba419]
/usr/lib/ipsec/libcharon.so.0(+0x302eb)[0x7fbfa18c52eb]
/usr/lib/ipsec/libcharon.so.0(+0x25e4f)[0x7fbfa18bae4f]
/usr/lib/ipsec/libcharon.so.0(+0x2006f)[0x7fbfa18b506f]
/usr/lib/ipsec/libstrongswan.so.0(+0x28df2)[0x7fbfa1d37df2]
/usr/lib/ipsec/libstrongswan.so.0(+0x2bc14)[0x7fbfa1d3ac14]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x8182)[0x7fbfa167f182]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fbfa13ac47d]
======= Memory map: ========
7fbf6c000000-7fbf6c0bc000 rw-p 00000000 00:00 0
7fbf6c0bc000-7fbf70000000 ---p 00000000 00:00 0
7fbf74000000-7fbf740d2000 rw-p 00000000 00:00 0
....

On 2016-02-26 01:11 PM, ruslan_ka wrote:
>> I have no idea what can cause this access to /dev/tty. I never ran into
>> this problem on my own server which is similar minus the EAP/RADIUS
>> part, I use xauth-generic only.
> xauth-eap works in a different way. It takes clear text password from client and makes EAP request to a radius server (in my case EAP-MSCHAPv2). It allows to store user passwords encrypted.
>
> Quick look through the code gives many uses for stdout (as example), but
> I'm not an expert to analyze them
> (https://git.strongswan.org/?p=strongswan.git&a=search&h=ddf1fc7692889298e04a4c799bf0c2f67b61ebe9&st=grep&s=stdout).

Maybe you have some log output configured to go to stdout/stderr?

>> Again, not related but aren't the 2 rightsourceip= overlapping?
> it is a StrongSwan feature. It manages ip pool as shared in such case. You can either use
> rightsourceip=%poolname
> or just use identical definition in rightsourceip and StrongSwan will share the same pool implicitly.

It's what I assumed you were doing but your 2 CIDRs are not identical:
ikev1-psk-xauth uses a /9 and ikev2-with-eap a /16.
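
A quick way to check the relationship between the two ranges is Python's standard `ipaddress` module. This sketch only compares the CIDRs; it says nothing about how strongSwan itself decides whether two `rightsourceip` definitions share a pool.

```python
import ipaddress

# The two rightsourceip values from the posted ipsec.conf.
xauth_pool = ipaddress.ip_network("10.0.0.0/9")   # conn ikev1-psk-xauth
eap_pool = ipaddress.ip_network("10.0.0.0/16")    # conn ikev2-with-eap

overlapping = xauth_pool.overlaps(eap_pool)   # do they share any address?
identical = xauth_pool == eap_pool            # same network and prefix length?
nested = eap_pool.subnet_of(xauth_pool)       # is the /16 inside the /9?

print(f"overlapping={overlapping} identical={identical} nested={nested}")
# prints: overlapping=True identical=False nested=True
```

So the /16 lies entirely inside the /9, but the two definitions are not identical.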

>> I honestly don't know why charon tries to access /dev/tty. Are you able
>> to see that message on the console or the upstart log when the Apparmor
>> profile is disabled?
> With disabled Apparmor profile everything work pretty good.

When doing the load testing, do you get something logged or displayed on
the console with the Apparmor profile disabled?

> I can provide any additional information about this system or can do
> some tests.

Well, at this point you demonstrated that you can have charon access
/dev/tty when you fully control the 2 sides of the connections (with
your load tester setup).

This means that those accesses to /dev/tty are quite probably not the
result of an attack of some kind. They are more likely the result of
normal operations carried out by charon. As such, I feel the proper fix
would be to update the Apparmor profile to grant access to /dev/tty and
avoid causing a crash.

Regards,
Simon

ruslan_ka (r-kalakutsky) wrote :

Hello Simon,

I'm not sure whether I should post this here, file a new bug, or report it to the strongSwan project directly.

I can reproduce this buffer overflow every time. It does not depend on resources: strongSwan fails on a t1.micro as well as on instances with more resources.

The buffer overflow depends on the number of connections (a few hundred, from 150 up to almost 400, depending on the time between connections).

Other resource usage:
* CPU load less than 50% on t1.micro and less than 10% on t1.large - https://www.dropbox.com/s/kqox2t2u86ws49c/Screenshot%202016-02-27%2018.51.01.png?dl=0 (green)
* a lot of free memory - https://www.dropbox.com/s/vzah4itqmrioksn/Screenshot%202016-02-27%2018.50.45.png?dl=0
* about 1100 sockets are used - https://www.dropbox.com/s/vkqf4ziuz19m20f/Screenshot%202016-02-27%2018.50.18.png?dl=0
* write peak 90kB/sec (/var/log) - https://www.dropbox.com/s/6kyt81lh4wnmh5h/Screenshot%202016-02-27%2018.50.30.png?dl=0

ipsec statusall before the failure:

xd@test-vpn-01:~$ sudo ipsec statusall | head -4
Status of IKE charon daemon (strongSwan 5.1.2, Linux 3.13.0-74-generic, x86_64):
  uptime: 83 seconds, since Feb 27 18:02:36 2016
  malloc: sbrk 4870144, mmap 0, used 4382032, free 488112
  worker threads: 507 of 512 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 876
xd@test-vpn-01:~$ sudo ipsec statusall | head -4
Status of IKE charon daemon (strongSwan 5.1.2, Linux 3.13.0-74-generic, x86_64):
  uptime: 85 seconds, since Feb 27 18:02:36 2016
  malloc: sbrk 4927488, mmap 0, used 4428240, free 499248
  worker threads: 507 of 512 idle, 5/0/0/0 working, job queue: 0/0/0/0, scheduled: 889
xd@test-vpn-01:~$ sudo ipsec statusall | head -4
Status of IKE charon daemon (strongSwan 5.1.2, Linux 3.13.0-74-generic, x86_64):
  uptime: 86 seconds, since Feb 27 18:02:36 2016
  malloc: sbrk 4980736, mmap 0, used 4488352, free 492384
  worker threads: 506 of 512 idle, 5/0/1/0 working, job queue: 0/0/0/0, scheduled: 897
xd@test-vpn-01:~$ sudo ipsec statusall | head -4
xd@test-vpn-01:~$ sudo ipsec statusall | head -4

# cat log.txt | grep -B 20 -A 20 'overflow'
216[CFG] received RADIUS Access-Accept from server '127.0.0.1'
216[CFG] scheduling RADIUS Interim-Updates every 15s
216[IKE] RADIUS authentication of 'loadtest-197' successful
216[IKE] EAP method EAP_MSCHAPV2 succeeded, MSK established
216[ENC] generating IKE_AUTH response 4 [ EAP/SUCC ]
216[NET] sending packet: from 172.31.59.95[500] to 172.31.62.150[500] (76 bytes)
217[NET] received packet: from 172.31.62.150[500] to 172.31.59.95[500] (92 bytes)
217[ENC] parsed IKE_AUTH request 5 [ AUTH ]
217[IKE] authentication of 'loadtest-197' with EAP successful
217[IKE] authentication of 'CN=srv, OU=load-test, O=strongSwan' (myself) with EAP
217[IKE] IKE_SA ikev2-with-eap-loadtest[248] established between 172.31.59.95[CN=srv, OU=load-test, O=strongSwan]...172.31.62.150[loadtest-197]
217[IKE] scheduling reauthentication in 3403s
217[IKE] maximum IKE_SA lifetime 3583s
217[IKE] peer requested virtual IP %any
217[CFG] assigning new lease to 'loadtest-197'
217[IKE] assigning virtual IP 10.0.0.231 to peer 'loadtest-197'
217[IKE] CHILD_SA ikev2-with-eap-loadtest{231} established with ...

Simon Déziel (sdeziel) wrote :

The crash signature looks a lot like this one: https://wiki.strongswan.org/issues/757

Changed in strongswan (Ubuntu):
status: Incomplete → Confirmed
Simon Déziel (sdeziel) wrote :

Ruslan, upstream mentions that lowering the number of sockets used for RADIUS is a possible workaround: https://wiki.strongswan.org/issues/757#note-7

Also, you might want to try Ubuntu Xenial, which ships strongSwan 5.3.5 with the fix included.

ruslan_ka (r-kalakutsky) wrote :

Simon, thank you.

Looks like lowering the number of sockets helps.

BR,
Ruslan.

Fixed upstream in strongswan 5.2.2. Maybe a backport of any recent strongswan to trusty would be helpful.

Robie Basak (racb) wrote :

It seems that this is fixed in Xenial and Yakkety then, and exists in Trusty only?

It also seems that a workaround is available (reduce the number of concurrent fds) and fixing this properly would involve refactoring to use poll() instead of select().
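
As a rough illustration of why the descriptor count matters, the sketch below models the `fd_set` limit and shows `poll()` as the alternative. It assumes `FD_SETSIZE` is 1024 (the usual glibc value on Linux; Python's `select` module does not export it), the helper names are hypothetical, and the figure of 100 non-RADIUS descriptors is only a guess chosen to match the "about 1100 sockets" seen in the report. It is not taken from the strongSwan code.

```python
import select
import socket

# Assumption: the usual glibc value on Linux.
FD_SETSIZE = 1024


def fits_in_fd_set(fd, fd_setsize=FD_SETSIZE):
    """select() can only track descriptors numbered 0 .. fd_setsize - 1."""
    return 0 <= fd < fd_setsize


def highest_fd_estimate(radius_sockets, other_fds):
    """Hypothetical estimate: descriptors are handed out lowest-first,
    so the highest number is roughly the total count minus one."""
    return radius_sockets + other_fds - 1


# "sockets = 1000" is from the posted eap-radius.conf; 100 other fds is a guess.
print(fits_in_fd_set(highest_fd_estimate(1000, 100)))  # False
print(fits_in_fd_set(highest_fd_estimate(100, 100)))   # True

# poll() takes a list of descriptors rather than a fixed-size bitmap,
# so the numeric value of a descriptor does not matter to it.
reader, writer = socket.socketpair()
try:
    writer.send(b"x")
    poller = select.poll()
    poller.register(reader.fileno(), select.POLLIN)
    events = poller.poll(1000)  # timeout in milliseconds
    readable = [fd for fd, mask in events if mask & select.POLLIN]
finally:
    reader.close()
    writer.close()

print(len(readable))  # 1
```

Under these assumptions, 1000 RADIUS sockets leave almost no headroom below 1024, which is consistent with the workaround of lowering that number.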

I suspect this would be too invasive for an SRU (see https://wiki.ubuntu.com/StableReleaseUpdates for the policy) but I would consider a patch. Nevertheless, I'm setting this to Won't Fix to make it clear that I don't expect this to be fixed in Trusty (affected users can use the workaround). This isn't final though - discussion welcome, though I think any proponent for a fix in Trusty would also need to supply a patch - only then can we consider the regression risk.

summary: - AppArmor kills StronSwan daemon 'charon'
+ Buffer overflow when open fds exceed FD_SETSIZE
Changed in strongswan (Ubuntu):
status: Confirmed → Fix Released
Changed in strongswan (Ubuntu Trusty):
status: New → Won't Fix