2019-11-19 20:42:15 |
MIKE OLLIFF |
bug |
|
|
added bug |
2019-11-19 20:42:15 |
MIKE OLLIFF |
attachment added |
|
additional data and patches https://bugs.launchpad.net/bugs/1853197/+attachment/5306500/+files/leakinfo.txt |
|
2019-11-19 21:00:10 |
Ubuntu Kernel Bot |
linux (Ubuntu): status |
New |
Incomplete |
|
2019-11-19 21:13:03 |
MIKE OLLIFF |
linux (Ubuntu): status |
Incomplete |
Confirmed |
|
2019-11-19 21:40:52 |
MIKE OLLIFF |
description |
Ubuntu linux distro, 4.15.0-62 kernel, server platform.
This OS is used as an IPsec VPN gateway. It serves up to several hundred concurrent connections.
In an attempt to upgrade from the 4.4 kernel to 4.15, the team noticed that VPN gateway VMs were running out of physical memory after 12-48 hours, depending on load.
Data from a server machine in this state is in the attached leakinfo.txt:
output of free -t
output of /proc/meminfo in out of memory condition
output of slabtop -o -sc
/sys/kernel/debug/page_owner sorted and aggregated after server ran for 12 hrs and ran out of memory
Patches for 4.15 and 5.4
Highlight from page_owner: the leak is a buffer associated with the IPsec implementation. Each connection leaks 32k of memory via alloc_page with order=3.
Page allocated via order 3, mask 0x1085220(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP)
get_page_from_freelist+0xd64/0x1250
__alloc_pages_nodemask+0x11c/0x2e0
alloc_pages_current+0x6a/0xe0
skb_page_frag_refill+0x71/0x100
esp_output_head+0x265/0x3e0 [esp4]
esp_output+0xbc/0x180 [esp4]
xfrm_output_resume+0x179/0x530
xfrm_output+0x8e/0x230
xfrm4_output_finish+0x2b/0x30
__xfrm4_output+0x3a/0x50
xfrm4_output+0x43/0xc0
ip_forward_finish+0x51/0x80
ip_forward+0x38a/0x480
ip_rcv_finish+0x122/0x410
ip_rcv+0x292/0x360
__netif_receive_skb_core+0x815/0xbd0
Patch to fix this issue in 4.15 (tested and verified on same server exhibiting above leak):
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 728272f..7842f83 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -451,6 +451,10 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x)
 	}
 	xfrm_dev_state_free(x);
 	security_xfrm_state_free(x);
+
+	if(x->xfrag.page)
+		put_page(x->xfrag.page);
+
 	kfree(x);
 }
Patch for master branch (5.4 I believe) from Paul Wouters (paul@nohats.ca)
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index c6f3c4a1bd99..f3423562d933 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -495,6 +495,8 @@ static void ___xfrm_state_destroy(struct xfrm_state *x)
 		x->type->destructor(x);
 		xfrm_put_type(x->type);
 	}
+	if (x->xfrag.page)
+		put_page(x->xfrag.page);
 	xfrm_dev_state_free(x);
 	security_xfrm_state_free(x);
 	xfrm_state_free(x);
Severity: Critical - we are unable to use any kernel later than 4.11, and are sticking with 4.4 in production. |
Ubuntu linux distro, 4.15.0-62 kernel, server platform.
This OS is used as an IPsec VPN gateway. It serves up to several hundred concurrent connections.
In an attempt to upgrade from the 4.4 kernel to 4.15, the team noticed that VPN gateway VMs were running out of physical memory after 12-48 hours, depending on load.
Data from a server machine in this state is in the attached leakinfo.txt:
output of free -t
output of /proc/meminfo in out of memory condition
output of slabtop -o -sc
/sys/kernel/debug/page_owner sorted and aggregated after server ran for 12 hrs and ran out of memory
Patches for 4.15 and 5.4
Highlight from page_owner: the leak is a buffer associated with the IPsec implementation. Each connection leaks 32k of memory via alloc_page with order=3.
100960 times:
Page allocated via order 3, mask 0x1085220(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP)
get_page_from_freelist+0xd64/0x1250
__alloc_pages_nodemask+0x11c/0x2e0
alloc_pages_current+0x6a/0xe0
skb_page_frag_refill+0x71/0x100
esp_output_head+0x265/0x3e0 [esp4]
esp_output+0xbc/0x180 [esp4]
xfrm_output_resume+0x179/0x530
xfrm_output+0x8e/0x230
xfrm4_output_finish+0x2b/0x30
__xfrm4_output+0x3a/0x50
xfrm4_output+0x43/0xc0
ip_forward_finish+0x51/0x80
ip_forward+0x38a/0x480
ip_rcv_finish+0x122/0x410
ip_rcv+0x292/0x360
__netif_receive_skb_core+0x815/0xbd0
Patch to fix this issue in 4.15 (tested and verified on same server exhibiting above leak):
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 728272f..7842f83 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -451,6 +451,10 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x)
 	}
 	xfrm_dev_state_free(x);
 	security_xfrm_state_free(x);
+
+	if(x->xfrag.page)
+		put_page(x->xfrag.page);
+
 	kfree(x);
 }
Patch for master branch (5.4 I believe) from Paul Wouters (paul@nohats.ca)
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index c6f3c4a1bd99..f3423562d933 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -495,6 +495,8 @@ static void ___xfrm_state_destroy(struct xfrm_state *x)
 		x->type->destructor(x);
 		xfrm_put_type(x->type);
 	}
+	if (x->xfrag.page)
+		put_page(x->xfrag.page);
 	xfrm_dev_state_free(x);
 	security_xfrm_state_free(x);
 	xfrm_state_free(x);
Severity: Critical - we are unable to use any kernel later than 4.11, and are sticking with 4.4 in production. |
|
2019-11-19 23:51:59 |
Terry Rudd |
bug |
|
|
added subscriber Terry Rudd |
2019-11-29 11:24:48 |
Stefan Bader |
nominated for series |
|
Ubuntu Disco |
|
2019-11-29 11:24:48 |
Stefan Bader |
bug task added |
|
linux (Ubuntu Disco) |
|
2019-11-29 11:24:48 |
Stefan Bader |
nominated for series |
|
Ubuntu Eoan |
|
2019-11-29 11:24:48 |
Stefan Bader |
bug task added |
|
linux (Ubuntu Eoan) |
|
2019-11-29 11:24:48 |
Stefan Bader |
nominated for series |
|
Ubuntu Bionic |
|
2019-11-29 11:24:48 |
Stefan Bader |
bug task added |
|
linux (Ubuntu Bionic) |
|
2019-11-29 11:25:08 |
Stefan Bader |
linux (Ubuntu Bionic): importance |
Undecided |
High |
|
2019-11-29 11:25:13 |
Stefan Bader |
linux (Ubuntu Disco): importance |
Undecided |
High |
|
2019-11-29 11:25:16 |
Stefan Bader |
linux (Ubuntu Eoan): importance |
Undecided |
High |
|
2019-11-29 11:25:26 |
Stefan Bader |
linux (Ubuntu Bionic): status |
New |
Triaged |
|
2019-11-29 11:25:30 |
Stefan Bader |
linux (Ubuntu Disco): status |
New |
Triaged |
|
2019-11-29 11:25:34 |
Stefan Bader |
linux (Ubuntu Eoan): status |
New |
Triaged |
|
2019-11-29 11:27:37 |
Stefan Bader |
linux (Ubuntu): status |
Confirmed |
Invalid |
|
2019-11-29 11:32:52 |
Stefan Bader |
description |
[SRU Justification]
== Impact ==
An upstream change in v4.11 made xfrm leak memory (8 pages per IPsec connection). This was fixed in v5.4 by:
commit 86c6739eda7d "xfrm: Fix memleak on xfrm state destroy"
== Fix ==
Pick the upstream fix into all affected series.
== Testcase ==
see below
== Risk of Regression ==
Low. The change adds a single memory release case in one driver, and the effect can be verified.
---
Ubuntu linux distro, 4.15.0-62 kernel, server platform.
This OS is used as an IPsec VPN gateway. It serves up to several hundred concurrent connections.
In an attempt to upgrade from the 4.4 kernel to 4.15, the team noticed that VPN gateway VMs were running out of physical memory after 12-48 hours, depending on load.
Data from a server machine in this state is in the attached leakinfo.txt:
output of free -t
output of /proc/meminfo in out of memory condition
output of slabtop -o -sc
/sys/kernel/debug/page_owner sorted and aggregated after server ran for 12 hrs and ran out of memory
Patches for 4.15 and 5.4
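For readers who want to reproduce the "sorted and aggregated" view of page_owner, here is a minimal Python sketch. It is not the tool used for leakinfo.txt. It assumes records are separated by blank lines and that each record carries a per-page line starting with "PFN " that must be dropped before identical stacks can be grouped. The function name aggregate_page_owner and the sample text (including the frame example_alloc) are made up for illustration.

```python
from collections import Counter


def aggregate_page_owner(text):
    """Group page_owner-style records by identical header and stack trace.

    Assumption: records are separated by blank lines, and the only
    per-page detail is a line starting with "PFN ", which is dropped.
    """
    counts = Counter()
    for record in text.split("\n\n"):
        lines = [ln.strip() for ln in record.splitlines() if ln.strip()]
        lines = [ln for ln in lines if not ln.startswith("PFN ")]
        if lines:
            counts["\n".join(lines)] += 1
    # Most frequent allocation site first.
    return counts.most_common()


# Synthetic sample, shortened and invented for this sketch.
sample = """\
Page allocated via order 3, mask 0x1085220(GFP_ATOMIC)
PFN 1000 type Unmovable
 skb_page_frag_refill+0x71/0x100
 esp_output_head+0x265/0x3e0 [esp4]

Page allocated via order 3, mask 0x1085220(GFP_ATOMIC)
PFN 1008 type Unmovable
 skb_page_frag_refill+0x71/0x100
 esp_output_head+0x265/0x3e0 [esp4]

Page allocated via order 0, mask 0x0(example)
PFN 2000 type Movable
 example_alloc+0x0/0x0
"""

for trace, count in aggregate_page_owner(sample):
    print(f"{count} times:")
    print(trace)
    print()
```

On a real system the input would be the contents of /sys/kernel/debug/page_owner, read as root on a kernel booted with page_owner enabled.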
Highlight from page_owner: the leak is a buffer associated with the IPsec implementation. Each connection leaks 32k of memory via alloc_page with order=3.
100960 times:
Page allocated via order 3, mask 0x1085220(GFP_ATOMIC|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP)
get_page_from_freelist+0xd64/0x1250
__alloc_pages_nodemask+0x11c/0x2e0
alloc_pages_current+0x6a/0xe0
skb_page_frag_refill+0x71/0x100
esp_output_head+0x265/0x3e0 [esp4]
esp_output+0xbc/0x180 [esp4]
xfrm_output_resume+0x179/0x530
xfrm_output+0x8e/0x230
xfrm4_output_finish+0x2b/0x30
__xfrm4_output+0x3a/0x50
xfrm4_output+0x43/0xc0
ip_forward_finish+0x51/0x80
ip_forward+0x38a/0x480
ip_rcv_finish+0x122/0x410
ip_rcv+0x292/0x360
__netif_receive_skb_core+0x815/0xbd0
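The 32k figure and the scale of the leak follow from the allocation order and the count reported above. A small Python sketch of the arithmetic, assuming 4 KiB base pages (an assumption, though it is consistent with order 3 giving 32k); the helper name leaked_bytes is hypothetical:

```python
PAGE_SIZE = 4096  # assumption: 4 KiB base pages


def leaked_bytes(order, allocations, page_size=PAGE_SIZE):
    """Bytes held by `allocations` blocks of 2**order pages each."""
    pages_per_alloc = 1 << order  # order 3 -> 8 pages
    return pages_per_alloc * page_size * allocations


per_alloc = leaked_bytes(order=3, allocations=1)
total = leaked_bytes(order=3, allocations=100960)

print(per_alloc // 1024, "KiB per allocation")   # 32 KiB
print(total // (1024 * 1024), "MiB in total")    # 3155 MiB
```

So the 100960 outstanding order-3 allocations account for roughly 3 GiB of memory on the affected server.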
Patch to fix this issue in 4.15 (tested and verified on same server exhibiting above leak):
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 728272f..7842f83 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -451,6 +451,10 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x)
 	}
 	xfrm_dev_state_free(x);
 	security_xfrm_state_free(x);
+
+	if(x->xfrag.page)
+		put_page(x->xfrag.page);
+
 	kfree(x);
 }
Patch for master branch (5.4 I believe) from Paul Wouters (paul@nohats.ca)
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index c6f3c4a1bd99..f3423562d933 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -495,6 +495,8 @@ static void ___xfrm_state_destroy(struct xfrm_state *x)
 		x->type->destructor(x);
 		xfrm_put_type(x->type);
 	}
+	if (x->xfrag.page)
+		put_page(x->xfrag.page);
 	xfrm_dev_state_free(x);
 	security_xfrm_state_free(x);
 	xfrm_state_free(x);
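Why a single put_page is enough can be illustrated with a toy reference-counting model in Python. This is not kernel code. It assumes, as a simplification, that each state holds exactly one reference on its cached fragment page and that nothing else references it at destroy time. The Python classes and the with_fix flag are hypothetical; only put_page and the xfrag page come from the patches above.

```python
class Page:
    def __init__(self):
        self.refcount = 1  # reference held by whoever allocated the page


class State:
    def __init__(self):
        self.xfrag_page = None  # stands in for x->xfrag.page


def put_page(page, freed):
    """Drop one reference; the page is freed when the count reaches zero."""
    page.refcount -= 1
    if page.refcount == 0:
        freed.append(page)


def esp_output(state):
    """First output on a state allocates and caches a fragment page."""
    if state.xfrag_page is None:
        state.xfrag_page = Page()


def destroy(state, freed, with_fix):
    """Tear down a state; only the fixed variant releases the cached page."""
    if with_fix and state.xfrag_page is not None:
        put_page(state.xfrag_page, freed)
    state.xfrag_page = None


def leaked_pages(connections, with_fix):
    pages, freed = [], []
    for _ in range(connections):
        state = State()
        esp_output(state)
        pages.append(state.xfrag_page)
        destroy(state, freed, with_fix)
    return len(pages) - len(freed)


print(leaked_pages(100, with_fix=False))  # every state leaks its page
print(leaked_pages(100, with_fix=True))   # nothing left behind
```

In this model the unfixed destroy path forgets the pointer without dropping the reference, which matches the one leaked order-3 allocation per connection described in the report.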
Severity: Critical - we are unable to use any kernel later than 4.11, and are sticking with 4.4 in production. |
|
2019-11-29 11:34:24 |
Stefan Bader |
linux (Ubuntu Eoan): assignee |
|
Stefan Bader (smb) |
|
2019-11-29 11:34:34 |
Stefan Bader |
linux (Ubuntu Disco): assignee |
|
Stefan Bader (smb) |
|
2019-11-29 11:34:38 |
Stefan Bader |
linux (Ubuntu Bionic): assignee |
|
Stefan Bader (smb) |
|
2019-11-29 12:30:52 |
Bernd Schütte |
bug |
|
|
added subscriber Bernd Schütte |
2019-12-02 06:54:18 |
Khaled El Mously |
linux (Ubuntu Bionic): status |
Triaged |
Fix Committed |
|
2019-12-02 06:54:21 |
Khaled El Mously |
linux (Ubuntu Disco): status |
Triaged |
Fix Committed |
|
2019-12-02 06:54:23 |
Khaled El Mously |
linux (Ubuntu Eoan): status |
Triaged |
Fix Committed |
|
2019-12-03 15:43:22 |
Ubuntu Kernel Bot |
tags |
ipsec kernel kernel-bug leak linux memory vpn |
ipsec kernel kernel-bug leak linux memory verification-needed-disco vpn |
|
2019-12-03 15:44:57 |
Ubuntu Kernel Bot |
tags |
ipsec kernel kernel-bug leak linux memory verification-needed-disco vpn |
ipsec kernel kernel-bug leak linux memory verification-needed-bionic verification-needed-disco vpn |
|
2019-12-05 11:27:32 |
Ubuntu Kernel Bot |
tags |
ipsec kernel kernel-bug leak linux memory verification-needed-bionic verification-needed-disco vpn |
ipsec kernel kernel-bug leak linux memory verification-needed-bionic verification-needed-disco verification-needed-eoan vpn |
|
2019-12-09 07:38:05 |
Bernd Schütte |
linux (Ubuntu Bionic): status |
Fix Committed |
Confirmed |
|
2019-12-10 06:27:14 |
Stefan Bader |
linux (Ubuntu Bionic): status |
Confirmed |
Fix Committed |
|
2019-12-10 06:27:43 |
Stefan Bader |
tags |
ipsec kernel kernel-bug leak linux memory verification-needed-bionic verification-needed-disco verification-needed-eoan vpn |
ipsec kernel kernel-bug leak linux memory verification-done-bionic verification-needed-disco verification-needed-eoan vpn |
|
2019-12-18 06:43:01 |
Aleksei |
tags |
ipsec kernel kernel-bug leak linux memory verification-done-bionic verification-needed-disco verification-needed-eoan vpn |
ipsec kernel kernel-bug leak linux memory verification-done-bionic verification-done-eoan verification-needed-disco vpn |
|
2019-12-19 19:13:35 |
Khaled El Mously |
tags |
ipsec kernel kernel-bug leak linux memory verification-done-bionic verification-done-eoan verification-needed-disco vpn |
ipsec kernel kernel-bug leak linux memory verification-done-bionic verification-done-disco verification-done-eoan vpn |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
linux (Ubuntu Eoan): status |
Fix Committed |
Fix Released |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-14895 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-14896 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-14897 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-14901 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-18660 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-19055 |
|
2020-01-06 12:53:38 |
Launchpad Janitor |
cve linked |
|
2019-19072 |
|
2020-01-06 13:12:44 |
Launchpad Janitor |
linux (Ubuntu Disco): status |
Fix Committed |
Fix Released |
|
2020-01-06 13:12:44 |
Launchpad Janitor |
cve linked |
|
2019-2214 |
|
2020-01-06 13:26:17 |
Launchpad Janitor |
linux (Ubuntu Bionic): status |
Fix Committed |
Fix Released |
|
2020-01-06 13:26:17 |
Launchpad Janitor |
cve linked |
|
2019-19083 |
|