Activity log for bug #1466135

Date Who What changed Old value New value Message
2015-06-17 15:33:45 Chris J Arges bug added bug
2015-06-17 15:33:53 Chris J Arges nominated for series Ubuntu Trusty
2015-06-17 15:33:53 Chris J Arges bug task added linux (Ubuntu Trusty)
2015-06-17 15:33:58 Chris J Arges linux (Ubuntu Trusty): assignee Chris J Arges (arges)
2015-06-17 15:34:00 Chris J Arges linux (Ubuntu Trusty): importance Undecided Medium
2015-06-17 15:34:02 Chris J Arges linux (Ubuntu Trusty): status New In Progress
2015-06-17 15:34:04 Chris J Arges linux (Ubuntu): assignee Chris J Arges (arges)
2015-06-17 15:34:06 Chris J Arges linux (Ubuntu): status In Progress Fix Released
2015-06-17 15:34:08 Chris J Arges linux (Ubuntu): importance Medium Undecided
2015-06-17 15:44:11 Chris J Arges description
  Old value:
    [Impact]
    Occasionally starting new containers or creating new net namespaces may fail because of improper refcounting of conntrack entries.
    [Test Case]
    bug 1403152 has a testcase which can occasionally hit this issue
    [Fix]
    $ git describe --contains e53376bef2cd97d3e3f61fdc677fb8da7d03d0da
    v3.14-rc3~36^2~28^2~12
  New value:
    [Impact]
    Occasionally starting new containers or creating new net namespaces may soft lockup because of improper refcounting of conntrack entries.
    Softlockup backtrace:
    [<ffffffff81723b69>] schedule_preempt_disabled+0x29/0x70
    [<ffffffff817259d5>] __mutex_lock_slowpath+0x135/0x1b0
    [<ffffffff811a2679>] ? __kmalloc+0x1e9/0x230
    [<ffffffff81725a6f>] mutex_lock+0x1f/0x2f
    [<ffffffff8161c2c1>] copy_net_ns+0x71/0x130
    [<ffffffff8108f889>] create_new_namespaces+0xf9/0x180
    [<ffffffff8108f983>] copy_namespaces+0x73/0xa0
    [<ffffffff81065b16>] copy_process.part.26+0x9a6/0x16b0
    [<ffffffff810669f5>] do_fork+0xd5/0x340
    [<ffffffff810c8e8d>] ? call_rcu_sched+0x1d/0x20
    [<ffffffff81066ce6>] SyS_clone+0x16/0x20
    [<ffffffff81730089>] stub_clone+0x69/0x90
    [<ffffffff8172fd2d>] ? system_call_fastpath+0x1a/0x1f
    [Test Case]
    bug 1403152 has a testcase which can occasionally hit this issue
    [Fix]
    $ git describe --contains e53376bef2cd97d3e3f61fdc677fb8da7d03d0da
    v3.14-rc3~36^2~28^2~12
2015-06-17 17:39:49 Chris J Arges description
  Old value:
    [Impact]
    Occasionally starting new containers or creating new net namespaces may soft lockup because of improper refcounting of conntrack entries.
    Softlockup backtrace:
    [<ffffffff81723b69>] schedule_preempt_disabled+0x29/0x70
    [<ffffffff817259d5>] __mutex_lock_slowpath+0x135/0x1b0
    [<ffffffff811a2679>] ? __kmalloc+0x1e9/0x230
    [<ffffffff81725a6f>] mutex_lock+0x1f/0x2f
    [<ffffffff8161c2c1>] copy_net_ns+0x71/0x130
    [<ffffffff8108f889>] create_new_namespaces+0xf9/0x180
    [<ffffffff8108f983>] copy_namespaces+0x73/0xa0
    [<ffffffff81065b16>] copy_process.part.26+0x9a6/0x16b0
    [<ffffffff810669f5>] do_fork+0xd5/0x340
    [<ffffffff810c8e8d>] ? call_rcu_sched+0x1d/0x20
    [<ffffffff81066ce6>] SyS_clone+0x16/0x20
    [<ffffffff81730089>] stub_clone+0x69/0x90
    [<ffffffff8172fd2d>] ? system_call_fastpath+0x1a/0x1f
    [Test Case]
    bug 1403152 has a testcase which can occasionally hit this issue
    [Fix]
    $ git describe --contains e53376bef2cd97d3e3f61fdc677fb8da7d03d0da
    v3.14-rc3~36^2~28^2~12
  New value:
    [Impact]
    Occasionally starting new containers or creating new net namespaces may soft lockup because of improper refcounting of conntrack entries.
    In the issue that I face, I can find a kworker thread using up an entire core, and when I cat /proc/$pid/stack I see this:
    [<ffffffffbe01e9b6>] ___preempt_schedule+0x56/0xb0
    [<ffffffffc02223e4>] nf_ct_iterate_cleanup+0x134/0x160 [nf_conntrack]
    [<ffffffffc0223dae>] nf_conntrack_cleanup_net_list+0x4e/0x170 [nf_conntrack]
    [<ffffffffc022436d>] nf_conntrack_pernet_exit+0x4d/0x60 [nf_conntrack]
    [<ffffffffbe6040d3>] ops_exit_list.isra.1+0x53/0x60
    [<ffffffffbe6048d0>] cleanup_net+0x100/0x1d0
    [<ffffffffbe084991>] process_one_work+0x171/0x470
    [<ffffffffbe08563b>] worker_thread+0x11b/0x3a0
    [<ffffffffbe08bb82>] kthread+0xd2/0xf0
    [<ffffffffbe71757c>] ret_from_fork+0x7c/0xb0
    [<ffffffffffffffff>] 0xffffffffffffffff
    The kworker is looping forever and failing to clean up conntrack state. All the while, it holds the global netns lock.
    Given that I've bisected to commit e53376bef2cd97d3e3f61fdc677fb8da7d03d0da which is to do with refcounting, I suspect that borked refcounting on conntrack entries makes them impossible to properly free/destroy, which prevents this worker from cleaning up the namespace, which then goes on to prevent anything else from interacting with namespaces (add/delete/etc).
    [Test Case]
    bug 1403152 has a testcase which can occasionally hit this issue
    [Fix]
    $ git describe --contains e53376bef2cd97d3e3f61fdc677fb8da7d03d0da
    v3.14-rc3~36^2~28^2~12
2015-06-17 18:56:26 Chris J Arges bug added subscriber Joe Stringer
2015-06-19 15:13:50 Brad Figg linux (Ubuntu Trusty): status In Progress Fix Committed
2015-07-09 15:14:43 Brad Figg tags verification-needed-trusty
2015-07-09 17:27:50 Joe Stringer tags verification-needed-trusty verification-done-trusty
2015-07-22 15:40:58 Launchpad Janitor linux (Ubuntu Trusty): status Fix Committed Fix Released
2015-07-22 15:40:58 Launchpad Janitor cve linked 2015-1805