Activity log for bug #1616193

Date Who What changed Old value New value Message
2016-08-23 19:30:44 Dave Chiluk bug added bug
2016-08-23 19:30:44 Dave Chiluk attachment added Additional var/log/kern.log output showing fragmentation https://bugs.launchpad.net/bugs/1616193/+attachment/4726550/+files/bug.txt
2016-08-23 20:00:10 Brad Figg linux (Ubuntu): status New Incomplete
2016-08-23 20:00:11 Brad Figg tags sts sts trusty
2016-08-24 04:13:42 Dave Chiluk linux (Ubuntu): status Incomplete Confirmed
2016-08-25 19:42:43 Dave Chiluk description

Old value:

[Impact]
 * libvirtd is no longer able to open the vhost_net device. This causes the guest VM to hang. This happens if memory becomes fragmented to the point where vhost_net_open is not able to successfully kmalloc.
 * Gratuitous stack trace.
libvirtd: page allocation failure: order:4, mode:0x1040d0
CPU: 14 PID: 82768 Comm: libvirtd Not tainted 3.13.0-85-generic #129-Ubuntu
Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.5.4 10/002/2015
 0000000000000000 ffff88003b419990 ffffffff8172b6a7 00000000001040d0
 0000000000000000 ffff88003b419a18 ffffffff811580eb ffff88187fffce48
 ffff88003b4199b8 ffffffff8115abd6 ffff88003b4199e8 0000000000000286
Call Trace:
 [<ffffffff8172b6a7>] dump_stack+0x64/0x82
 [<ffffffff811580eb>] warn_alloc_failed+0xeb/0x140
 [<ffffffff8115abd6>] ? drain_local_pages+0x16/0x20
 [<ffffffff8115c8c0>] __alloc_pages_nodemask+0x980/0xb90
 [<ffffffff8119b3a3>] alloc_pages_current+0xa3/0x160
 [<ffffffff811570ae>] __get_free_pages+0xe/0x50
 [<ffffffff811743be>] kmalloc_order_trace+0x2e/0xc0
 [<ffffffffa04e79c9>] vhost_net_open+0x29/0x1b0 [vhost_net]
 [<ffffffff81484283>] misc_open+0xb3/0x170
 [<ffffffff811c63ff>] chrdev_open+0x9f/0x1d0
 [<ffffffff811bef13>] do_dentry_open+0x233/0x2e0
 [<ffffffff811c6360>] ? cdev_put+0x30/0x30
 [<ffffffff811bf249>] vfs_open+0x49/0x50
 [<ffffffff811d0812>] do_last+0x562/0x1370
 [<ffffffff811d16db>] path_openat+0xbb/0x670
 [<ffffffff811d2afa>] do_filp_open+0x3a/0x90
 [<ffffffff811df957>] ? __alloc_fd+0xa7/0x130
 [<ffffffff811c0d69>] do_sys_open+0x129/0x2a0
 [<ffffffff811c0efe>] SyS_open+0x1e/0x20
 [<ffffffff8173c39d>] system_call_fastpath+0x1a/0x1f
 * justification: cloud.
 * The patches fix this issue by allowing vhost_net_open to use vmalloc when kmalloc fails to find a sufficient page.

[Test Case]
 * Fragment kernel memory. Write to the NIC from within a KVM guest that uses a virtio NIC.

[Regression Potential]
 * Fix was implemented upstream in 3.15, and still exists.
 * Testing TBD

[Other Info]
 * https://lkml.org/lkml/2013/1/23/492
 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=23cc5a991c7a9fb7e6d6550e65cee4f4173111c5
 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d04257b07f2362d4eb550952d5bf5f4241a8046d

New value:

[Impact]
 * libvirtd is no longer able to open the vhost_net device. This causes the guest VM to hang. This happens if memory becomes fragmented to the point where vhost_net_open is not able to successfully kmalloc.
 * Gratuitous stack trace.
libvirtd: page allocation failure: order:4, mode:0x1040d0
CPU: 14 PID: 82768 Comm: libvirtd Not tainted 3.13.0-85-generic #129-Ubuntu
Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.5.4 10/002/2015
 0000000000000000 ffff88003b419990 ffffffff8172b6a7 00000000001040d0
 0000000000000000 ffff88003b419a18 ffffffff811580eb ffff88187fffce48
 ffff88003b4199b8 ffffffff8115abd6 ffff88003b4199e8 0000000000000286
Call Trace:
 [<ffffffff8172b6a7>] dump_stack+0x64/0x82
 [<ffffffff811580eb>] warn_alloc_failed+0xeb/0x140
 [<ffffffff8115abd6>] ? drain_local_pages+0x16/0x20
 [<ffffffff8115c8c0>] __alloc_pages_nodemask+0x980/0xb90
 [<ffffffff8119b3a3>] alloc_pages_current+0xa3/0x160
 [<ffffffff811570ae>] __get_free_pages+0xe/0x50
 [<ffffffff811743be>] kmalloc_order_trace+0x2e/0xc0
 [<ffffffffa04e79c9>] vhost_net_open+0x29/0x1b0 [vhost_net]
 [<ffffffff81484283>] misc_open+0xb3/0x170
 [<ffffffff811c63ff>] chrdev_open+0x9f/0x1d0
 [<ffffffff811bef13>] do_dentry_open+0x233/0x2e0
 [<ffffffff811c6360>] ? cdev_put+0x30/0x30
 [<ffffffff811bf249>] vfs_open+0x49/0x50
 [<ffffffff811d0812>] do_last+0x562/0x1370
 [<ffffffff811d16db>] path_openat+0xbb/0x670
 [<ffffffff811d2afa>] do_filp_open+0x3a/0x90
 [<ffffffff811df957>] ? __alloc_fd+0xa7/0x130
 [<ffffffff811c0d69>] do_sys_open+0x129/0x2a0
 [<ffffffff811c0efe>] SyS_open+0x1e/0x20
 [<ffffffff8173c39d>] system_call_fastpath+0x1a/0x1f
 * justification: because cloud.
 * The patches fix this issue by allowing vhost_net_open to use vmalloc when kmalloc fails to find a sufficient page size.

[Test Case]
 * Fragment kernel memory. Write to the NIC from within a KVM guest that uses a virtio NIC.

[Regression Potential]
 * Fix was implemented upstream in 3.15, and still exists.
 * The fix is fairly straightforward given the stack trace and the upstream fix.
 * The fix is hard to verify, as it requires significant memory fragmentation and an over-active guest. The user's machine that was experiencing this has worked around it by removing VMs from the compute host and using vfs.cache.pressure=600.

[Other Info]
 * https://lkml.org/lkml/2013/1/23/492
 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=23cc5a991c7a9fb7e6d6550e65cee4f4173111c5
 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d04257b07f2362d4eb550952d5bf5f4241a8046d
 * I'm going on vacation, and Eric Desrochers will be following up on this in my absence. This is also the reason for submitting before receiving verification.

2016-08-25 20:37:26 Kamal Mostafa nominated for series Ubuntu Trusty
2016-08-25 20:37:26 Kamal Mostafa bug task added linux (Ubuntu Trusty)
2016-08-25 20:37:38 Kamal Mostafa linux (Ubuntu Trusty): status New Fix Committed
2016-08-25 21:21:57 Eric Desrochers bug added subscriber Eric Desrochers
2016-09-06 13:42:36 Tim Gardner tags sts trusty sts trusty verification-needed-trusty
2016-09-12 16:55:47 Dave Chiluk tags sts trusty verification-needed-trusty sts trusty verification-done-trusty
2016-09-19 14:54:06 Launchpad Janitor linux (Ubuntu Trusty): status Fix Committed Fix Released
2016-09-19 14:54:06 Launchpad Janitor cve linked 2015-8767
2016-09-19 14:54:06 Launchpad Janitor cve linked 2016-3841
2018-09-07 02:44:56 zouqian description

Old value:

[Impact]
 * libvirtd is no longer able to open the vhost_net device. This causes the guest VM to hang. This happens if memory becomes fragmented to the point where vhost_net_open is not able to successfully kmalloc.
 * Gratuitous stack trace.
libvirtd: page allocation failure: order:4, mode:0x1040d0
CPU: 14 PID: 82768 Comm: libvirtd Not tainted 3.13.0-85-generic #129-Ubuntu
Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.5.4 10/002/2015
 0000000000000000 ffff88003b419990 ffffffff8172b6a7 00000000001040d0
 0000000000000000 ffff88003b419a18 ffffffff811580eb ffff88187fffce48
 ffff88003b4199b8 ffffffff8115abd6 ffff88003b4199e8 0000000000000286
Call Trace:
 [<ffffffff8172b6a7>] dump_stack+0x64/0x82
 [<ffffffff811580eb>] warn_alloc_failed+0xeb/0x140
 [<ffffffff8115abd6>] ? drain_local_pages+0x16/0x20
 [<ffffffff8115c8c0>] __alloc_pages_nodemask+0x980/0xb90
 [<ffffffff8119b3a3>] alloc_pages_current+0xa3/0x160
 [<ffffffff811570ae>] __get_free_pages+0xe/0x50
 [<ffffffff811743be>] kmalloc_order_trace+0x2e/0xc0
 [<ffffffffa04e79c9>] vhost_net_open+0x29/0x1b0 [vhost_net]
 [<ffffffff81484283>] misc_open+0xb3/0x170
 [<ffffffff811c63ff>] chrdev_open+0x9f/0x1d0
 [<ffffffff811bef13>] do_dentry_open+0x233/0x2e0
 [<ffffffff811c6360>] ? cdev_put+0x30/0x30
 [<ffffffff811bf249>] vfs_open+0x49/0x50
 [<ffffffff811d0812>] do_last+0x562/0x1370
 [<ffffffff811d16db>] path_openat+0xbb/0x670
 [<ffffffff811d2afa>] do_filp_open+0x3a/0x90
 [<ffffffff811df957>] ? __alloc_fd+0xa7/0x130
 [<ffffffff811c0d69>] do_sys_open+0x129/0x2a0
 [<ffffffff811c0efe>] SyS_open+0x1e/0x20
 [<ffffffff8173c39d>] system_call_fastpath+0x1a/0x1f
 * justification: because cloud.
 * The patches fix this issue by allowing vhost_net_open to use vmalloc when kmalloc fails to find a sufficient page size.

[Test Case]
 * Fragment kernel memory. Write to the NIC from within a KVM guest that uses a virtio NIC.

[Regression Potential]
 * Fix was implemented upstream in 3.15, and still exists.
 * The fix is fairly straightforward given the stack trace and the upstream fix.
 * The fix is hard to verify, as it requires significant memory fragmentation and an over-active guest. The user's machine that was experiencing this has worked around it by removing VMs from the compute host and using vfs.cache.pressure=600.

[Other Info]
 * https://lkml.org/lkml/2013/1/23/492
 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=23cc5a991c7a9fb7e6d6550e65cee4f4173111c5
 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d04257b07f2362d4eb550952d5bf5f4241a8046d
 * I'm going on vacation, and Eric Desrochers will be following up on this in my absence. This is also the reason for submitting before receiving verification.

New value:

[Impact]
 * libvirtd is no longer able to open the vhost_net device. This causes the guest VM to hang. This happens if memory becomes fragmented to the point where vhost_net_open is not able to successfully kmalloc.
 * Gratuitous stack trace.
libvirtd: page allocation failure: order:4, mode:0x1040d0
CPU: 14 PID: 82768 Comm: libvirtd Not tainted 3.13.0-85-generic #129-Ubuntu
Hardware name: Dell Inc. PowerEdge R730/0599V5, BIOS 1.5.4 10/002/2015
 0000000000000000 ffff88003b419990 ffffffff8172b6a7 00000000001040d0
 0000000000000000 ffff88003b419a18 ffffffff811580eb ffff88187fffce48
 ffff88003b4199b8 ffffffff8115abd6 ffff88003b4199e8 0000000000000286
Call Trace:
 [<ffffffff8172b6a7>] dump_stack+0x64/0x82
 [<ffffffff811580eb>] warn_alloc_failed+0xeb/0x140
 [<ffffffff8115abd6>] ? drain_local_pages+0x16/0x20
 [<ffffffff8115c8c0>] __alloc_pages_nodemask+0x980/0xb90
 [<ffffffff8119b3a3>] alloc_pages_current+0xa3/0x160
 [<ffffffff811570ae>] __get_free_pages+0xe/0x50
 [<ffffffff811743be>] kmalloc_order_trace+0x2e/0xc0
 [<ffffffffa04e79c9>] vhost_net_open+0x29/0x1b0 [vhost_net]
 [<ffffffff81484283>] misc_open+0xb3/0x170
 [<ffffffff811c63ff>] chrdev_open+0x9f/0x1d0
 [<ffffffff811bef13>] do_dentry_open+0x233/0x2e0
 [<ffffffff811c6360>] ? cdev_put+0x30/0x30
 [<ffffffff811bf249>] vfs_open+0x49/0x50
 [<ffffffff811d0812>] do_last+0x562/0x1370
 [<ffffffff811d16db>] path_openat+0xbb/0x670
 [<ffffffff811d2afa>] do_filp_open+0x3a/0x90
 [<ffffffff811df957>] ? __alloc_fd+0xa7/0x130
 [<ffffffff811c0d69>] do_sys_open+0x129/0x2a0
 [<ffffffff811c0efe>] SyS_open+0x1e/0x20
 [<ffffffff8173c39d>] system_call_fastpath+0x1a/0x1f
 * justification: because cloud.
 * The patches fix this issue by allowing vhost_net_open to use vmalloc when kmalloc fails to find a sufficient page size.

[Test Case]
 * Fragment kernel memory. Write to the NIC from within a KVM guest that uses a virtio NIC.

[Regression Potential]
 * Fix was implemented upstream in 3.15, and still exists.
 * The fix is fairly straightforward given the stack trace and the upstream fix.
 * The fix is hard to verify, as it requires significant memory fragmentation and an over-active guest. The user's machine that was experiencing this has worked around it by removing VMs from the compute host and using vfs.cache.pressure=600.

[Other Info]
 * https://lkml.org/lkml/2013/1/23/492
 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=23cc5a991c7a9fb7e6d6550e65cee4f4173111c5
 * http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=d04257b07f2362d4eb550952d5bf5f4241a8046d
 * I'm going on vacation, and Eric Desrochers will be following up on this in my absence. This is also the reason for submitting before receiving verification.

2019-07-24 21:21:28 Brad Figg tags sts trusty verification-done-trusty cscc sts trusty verification-done-trusty
2020-07-14 15:41:01 Dan Streetman linux (Ubuntu): status Confirmed Fix Released
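
Note on the "order:4" failure recorded in the description: an order-n allocation asks the page allocator for 2^n physically contiguous pages, so order 4 is 16 pages, or 64 KiB assuming 4 KiB pages. The following is a minimal sketch, not part of the original bug report, of how one might gauge whether a host is fragmented enough to hit this failure by counting free blocks of order 4 or higher in text laid out like /proc/buddyinfo. The helper names and the sample numbers are made up for illustration, and the 4 KiB page size is an assumption.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, typical on x86_64


def order_bytes(order, page_size=PAGE_SIZE):
    """Size in bytes of one contiguous block of the given order."""
    return (1 << order) * page_size


def free_blocks_at_least(buddyinfo_text, min_order):
    """Count free blocks of order >= min_order per (node, zone).

    Expects lines shaped like /proc/buddyinfo:
        Node 0, zone   Normal  c0 c1 c2 ...
    where cN is the number of free blocks of order N.
    """
    result = {}
    for line in buddyinfo_text.splitlines():
        parts = line.split()
        if len(parts) < 5 or parts[0] != "Node" or parts[2] != "zone":
            continue  # skip anything that is not a zone line
        node = int(parts[1].rstrip(","))
        zone = parts[3]
        counts = [int(x) for x in parts[4:]]
        result[(node, zone)] = sum(counts[min_order:])
    return result


# Hypothetical sample, NOT taken from the affected machine.
SAMPLE = """\
Node 0, zone      DMA      1      1      0      0      2      1      1      0      1      1      3
Node 0, zone    DMA32   4000   1200    300     20      0      0      0      0      0      0      0
Node 0, zone   Normal  90000  15000    800     12      0      0      0      0      0      0      0
"""

if __name__ == "__main__":
    print("order-4 block size:", order_bytes(4), "bytes")
    for (node, zone), count in free_blocks_at_least(SAMPLE, 4).items():
        print(f"node {node} zone {zone}: {count} free blocks of order >= 4")
```

On a live host the same function could be fed the contents of /proc/buddyinfo. A zone with plenty of low-order free pages but no free blocks of order 4 or higher matches the fragmentation condition the description names as the trigger.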