2016-08-23 19:30:44 |
Dave Chiluk |
bug |
|
|
added bug |
2016-08-23 19:30:44 |
Dave Chiluk |
attachment added |
|
Additional var/log/kern.log output showing fragmentation https://bugs.launchpad.net/bugs/1616193/+attachment/4726550/+files/bug.txt |
|
2016-08-23 20:00:10 |
Brad Figg |
linux (Ubuntu): status |
New |
Incomplete |
|
2016-08-23 20:00:11 |
Brad Figg |
tags |
sts |
sts trusty |
|
2016-08-24 04:13:42 |
Dave Chiluk |
linux (Ubuntu): status |
Incomplete |
Confirmed |
|
2016-08-25 19:42:43 |
Dave Chiluk |
description |
[Impact]
* libvirtd is no longer able to open the vhost_net device. This causes the guest VM to hang. This happens if memory becomes fragmented to the point where vhost_net_open is not able to successfully kmalloc.
* Gratuitous stack trace.
libvirtd: page allocation failure: order:4, mode:0x1040d0
CPU: 14 PID: 82768 Comm: libvirtd Not tainted 3.13.0-85-generic #129-Ubuntu
Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.5.4 10/002/2015
0000000000000000 ffff88003b419990 ffffffff8172b6a7 00000000001040d0
0000000000000000 ffff88003b419a18 ffffffff811580eb ffff88187fffce48
ffff88003b4199b8 ffffffff8115abd6 ffff88003b4199e8 0000000000000286
Call Trace:
[<ffffffff8172b6a7>] dump_stack+0x64/0x82
[<ffffffff811580eb>] warn_alloc_failed+0xeb/0x140
[<ffffffff8115abd6>] ? drain_local_pages+0x16/0x20
[<ffffffff8115c8c0>] __alloc_pages_nodemask+0x980/0xb90
[<ffffffff8119b3a3>] alloc_pages_current+0xa3/0x160
[<ffffffff811570ae>] __get_free_pages+0xe/0x50
[<ffffffff811743be>] kmalloc_order_trace+0x2e/0xc0
[<ffffffffa04e79c9>] vhost_net_open+0x29/0x1b0 [vhost_net]
[<ffffffff81484283>] misc_open+0xb3/0x170
[<ffffffff811c63ff>] chrdev_open+0x9f/0x1d0
[<ffffffff811bef13>] do_dentry_open+0x233/0x2e0
[<ffffffff811c6360>] ? cdev_put+0x30/0x30
[<ffffffff811bf249>] vfs_open+0x49/0x50
[<ffffffff811d0812>] do_last+0x562/0x1370
[<ffffffff811d16db>] path_openat+0xbb/0x670
[<ffffffff811d2afa>] do_filp_open+0x3a/0x90
[<ffffffff811df957>] ? __alloc_fd+0xa7/0x130
[<ffffffff811c0d69>] do_sys_open+0x129/0x2a0
[<ffffffff811c0efe>] SyS_open+0x1e/0x20
[<ffffffff8173c39d>] system_call_fastpath+0x1a/0x1f
* justification: cloud.
* The patches fix this issue by allowing vhost_net_open to use vmalloc when kmalloc fails to find a sufficient page.
[Test Case]
* Fragment kernel memory. Write to the NIC from within a KVM guest that uses a virtio NIC.
[Regression Potential]
* Fix was implemented upstream in 3.15, and still exists.
* Testing TBD
[Other Info]
* https://lkml.org/lkml/2013/1/23/492
* http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=23cc5a991c7a9fb7e6d6550e65cee4f4173111c5
* http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d04257b07f2362d4eb550952d5bf5f4241a8046d |
[Impact]
* libvirtd is no longer able to open the vhost_net device. This causes the guest VM to hang. This happens if memory becomes fragmented to the point where vhost_net_open is not able to successfully kmalloc.
* Gratuitous stack trace.
libvirtd: page allocation failure: order:4, mode:0x1040d0
CPU: 14 PID: 82768 Comm: libvirtd Not tainted 3.13.0-85-generic #129-Ubuntu
Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.5.4 10/002/2015
0000000000000000 ffff88003b419990 ffffffff8172b6a7 00000000001040d0
0000000000000000 ffff88003b419a18 ffffffff811580eb ffff88187fffce48
ffff88003b4199b8 ffffffff8115abd6 ffff88003b4199e8 0000000000000286
Call Trace:
[<ffffffff8172b6a7>] dump_stack+0x64/0x82
[<ffffffff811580eb>] warn_alloc_failed+0xeb/0x140
[<ffffffff8115abd6>] ? drain_local_pages+0x16/0x20
[<ffffffff8115c8c0>] __alloc_pages_nodemask+0x980/0xb90
[<ffffffff8119b3a3>] alloc_pages_current+0xa3/0x160
[<ffffffff811570ae>] __get_free_pages+0xe/0x50
[<ffffffff811743be>] kmalloc_order_trace+0x2e/0xc0
[<ffffffffa04e79c9>] vhost_net_open+0x29/0x1b0 [vhost_net]
[<ffffffff81484283>] misc_open+0xb3/0x170
[<ffffffff811c63ff>] chrdev_open+0x9f/0x1d0
[<ffffffff811bef13>] do_dentry_open+0x233/0x2e0
[<ffffffff811c6360>] ? cdev_put+0x30/0x30
[<ffffffff811bf249>] vfs_open+0x49/0x50
[<ffffffff811d0812>] do_last+0x562/0x1370
[<ffffffff811d16db>] path_openat+0xbb/0x670
[<ffffffff811d2afa>] do_filp_open+0x3a/0x90
[<ffffffff811df957>] ? __alloc_fd+0xa7/0x130
[<ffffffff811c0d69>] do_sys_open+0x129/0x2a0
[<ffffffff811c0efe>] SyS_open+0x1e/0x20
[<ffffffff8173c39d>] system_call_fastpath+0x1a/0x1f
* justification: because cloud.
* The patches fix this issue by allowing vhost_net_open to fall back to vmalloc when kmalloc cannot find a large enough physically contiguous block of pages.
[Test Case]
* Fragment kernel memory. Write to the NIC from within a KVM guest that uses a virtio NIC.
[Regression Potential]
* Fix was implemented upstream in 3.15, and still exists.
* The fix is fairly straightforward given the stack trace and the upstream fix.
* The fix is hard to verify, as it requires significant memory fragmentation and an overactive guest. The user whose machine was experiencing this worked around it by removing VMs from the compute host and setting vm.vfs_cache_pressure=600.
[Other Info]
* https://lkml.org/lkml/2013/1/23/492
* http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=23cc5a991c7a9fb7e6d6550e65cee4f4173111c5
* http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d04257b07f2362d4eb550952d5bf5f4241a8046d
* I'm going on vacation, and Eric Desrochers will be following up on this in my absence. This is also the reason for submitting before receiving verification. |
|
2016-08-25 20:37:26 |
Kamal Mostafa |
nominated for series |
|
Ubuntu Trusty |
|
2016-08-25 20:37:26 |
Kamal Mostafa |
bug task added |
|
linux (Ubuntu Trusty) |
|
2016-08-25 20:37:38 |
Kamal Mostafa |
linux (Ubuntu Trusty): status |
New |
Fix Committed |
|
2016-08-25 21:21:57 |
Eric Desrochers |
bug |
|
|
added subscriber Eric Desrochers |
2016-09-06 13:42:36 |
Tim Gardner |
tags |
sts trusty |
sts trusty verification-needed-trusty |
|
2016-09-12 16:55:47 |
Dave Chiluk |
tags |
sts trusty verification-needed-trusty |
sts trusty verification-done-trusty |
|
2016-09-19 14:54:06 |
Launchpad Janitor |
linux (Ubuntu Trusty): status |
Fix Committed |
Fix Released |
|
2016-09-19 14:54:06 |
Launchpad Janitor |
cve linked |
|
2015-8767 |
|
2016-09-19 14:54:06 |
Launchpad Janitor |
cve linked |
|
2016-3841 |
|
2018-09-07 02:44:56 |
zouqian |
description |
[Impact]
* libvirtd is no longer able to open the vhost_net device. This causes the guest VM to hang. This happens if memory becomes fragmented to the point where vhost_net_open is not able to successfully kmalloc.
* Gratuitous stack trace.
libvirtd: page allocation failure: order:4, mode:0x1040d0
CPU: 14 PID: 82768 Comm: libvirtd Not tainted 3.13.0-85-generic #129-Ubuntu
Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.5.4 10/002/2015
0000000000000000 ffff88003b419990 ffffffff8172b6a7 00000000001040d0
0000000000000000 ffff88003b419a18 ffffffff811580eb ffff88187fffce48
ffff88003b4199b8 ffffffff8115abd6 ffff88003b4199e8 0000000000000286
Call Trace:
[<ffffffff8172b6a7>] dump_stack+0x64/0x82
[<ffffffff811580eb>] warn_alloc_failed+0xeb/0x140
[<ffffffff8115abd6>] ? drain_local_pages+0x16/0x20
[<ffffffff8115c8c0>] __alloc_pages_nodemask+0x980/0xb90
[<ffffffff8119b3a3>] alloc_pages_current+0xa3/0x160
[<ffffffff811570ae>] __get_free_pages+0xe/0x50
[<ffffffff811743be>] kmalloc_order_trace+0x2e/0xc0
[<ffffffffa04e79c9>] vhost_net_open+0x29/0x1b0 [vhost_net]
[<ffffffff81484283>] misc_open+0xb3/0x170
[<ffffffff811c63ff>] chrdev_open+0x9f/0x1d0
[<ffffffff811bef13>] do_dentry_open+0x233/0x2e0
[<ffffffff811c6360>] ? cdev_put+0x30/0x30
[<ffffffff811bf249>] vfs_open+0x49/0x50
[<ffffffff811d0812>] do_last+0x562/0x1370
[<ffffffff811d16db>] path_openat+0xbb/0x670
[<ffffffff811d2afa>] do_filp_open+0x3a/0x90
[<ffffffff811df957>] ? __alloc_fd+0xa7/0x130
[<ffffffff811c0d69>] do_sys_open+0x129/0x2a0
[<ffffffff811c0efe>] SyS_open+0x1e/0x20
[<ffffffff8173c39d>] system_call_fastpath+0x1a/0x1f
* justification: because cloud.
* The patches fix this issue by allowing vhost_net_open to fall back to vmalloc when kmalloc cannot find a large enough physically contiguous block of pages.
[Test Case]
* Fragment kernel memory. Write to the NIC from within a KVM guest that uses a virtio NIC.
[Regression Potential]
* Fix was implemented upstream in 3.15, and still exists.
* The fix is fairly straightforward given the stack trace and the upstream fix.
* The fix is hard to verify, as it requires significant memory fragmentation and an overactive guest. The user whose machine was experiencing this worked around it by removing VMs from the compute host and setting vm.vfs_cache_pressure=600.
[Other Info]
* https://lkml.org/lkml/2013/1/23/492
* http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=23cc5a991c7a9fb7e6d6550e65cee4f4173111c5
* http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d04257b07f2362d4eb550952d5bf5f4241a8046d
* I'm going on vacation, and Eric Desrochers will be following up on this in my absence. This is also the reason for submitting before receiving verification. |
[Impact]
* libvirtd is no longer able to open the vhost_net device. This causes the guest VM to hang. This happens if memory becomes fragmented to the point where vhost_net_open is not able to successfully kmalloc.
* Gratuitous stack trace.
libvirtd: page allocation failure: order:4, mode:0x1040d0
CPU: 14 PID: 82768 Comm: libvirtd Not tainted 3.13.0-85-generic #129-Ubuntu
Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.5.4 10/002/2015
0000000000000000 ffff88003b419990 ffffffff8172b6a7 00000000001040d0
0000000000000000 ffff88003b419a18 ffffffff811580eb ffff88187fffce48
ffff88003b4199b8 ffffffff8115abd6 ffff88003b4199e8 0000000000000286
Call Trace:
[<ffffffff8172b6a7>] dump_stack+0x64/0x82
[<ffffffff811580eb>] warn_alloc_failed+0xeb/0x140
[<ffffffff8115abd6>] ? drain_local_pages+0x16/0x20
[<ffffffff8115c8c0>] __alloc_pages_nodemask+0x980/0xb90
[<ffffffff8119b3a3>] alloc_pages_current+0xa3/0x160
[<ffffffff811570ae>] __get_free_pages+0xe/0x50
[<ffffffff811743be>] kmalloc_order_trace+0x2e/0xc0
[<ffffffffa04e79c9>] vhost_net_open+0x29/0x1b0 [vhost_net]
[<ffffffff81484283>] misc_open+0xb3/0x170
[<ffffffff811c63ff>] chrdev_open+0x9f/0x1d0
[<ffffffff811bef13>] do_dentry_open+0x233/0x2e0
[<ffffffff811c6360>] ? cdev_put+0x30/0x30
[<ffffffff811bf249>] vfs_open+0x49/0x50
[<ffffffff811d0812>] do_last+0x562/0x1370
[<ffffffff811d16db>] path_openat+0xbb/0x670
[<ffffffff811d2afa>] do_filp_open+0x3a/0x90
[<ffffffff811df957>] ? __alloc_fd+0xa7/0x130
[<ffffffff811c0d69>] do_sys_open+0x129/0x2a0
[<ffffffff811c0efe>] SyS_open+0x1e/0x20
[<ffffffff8173c39d>] system_call_fastpath+0x1a/0x1f
* justification: because cloud.
* The patches fix this issue by allowing vhost_net_open to fall back to vmalloc when kmalloc cannot find a large enough physically contiguous block of pages.
[Test Case]
* Fragment kernel memory. Write to the NIC from within a KVM guest that uses a virtio NIC.
[Regression Potential]
* Fix was implemented upstream in 3.15, and still exists.
* The fix is fairly straightforward given the stack trace and the upstream fix.
* The fix is hard to verify, as it requires significant memory fragmentation and an overactive guest. The user whose machine was experiencing this worked around it by removing VMs from the compute host and setting vm.vfs_cache_pressure=600.
[Other Info]
* https://lkml.org/lkml/2013/1/23/492
* http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=23cc5a991c7a9fb7e6d6550e65cee4f4173111c5
* http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d04257b07f2362d4eb550952d5bf5f4241a8046d
* I'm going on vacation, and Eric Desrochers will be following up on this in my absence. This is also the reason for submitting before receiving verification. |
|
2019-07-24 21:21:28 |
Brad Figg |
tags |
sts trusty verification-done-trusty |
cscc sts trusty verification-done-trusty |
|
2020-07-14 15:41:01 |
Dan Streetman |
linux (Ubuntu): status |
Confirmed |
Fix Released |
|