Comment 8 for bug 1021471

Clint Byrum (clint-fewbar) wrote : Re: stuck on mutex_lock creating a new network namespace when starting a container

On my somewhat lagged quantal, I have been seeing similar issues:

Linux clint-MacBookPro 3.5.0-8-generic #8-Ubuntu SMP Sat Aug 4 04:42:28 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

[194038.144050] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194040.576173] INFO: task lxc-start:23872 blocked for more than 120 seconds.
[194040.576178] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[194040.576180] lxc-start D ffff88014fd13980 0 23872 1 0x00000000
[194040.576186] ffff880116909cc0 0000000000000086 ffff880090ad2e00 ffff880116909fd8
[194040.576192] ffff880116909fd8 ffff880116909fd8 ffff880144830000 ffff880090ad2e00
[194040.576197] ffff880116909cc0 ffffffff81ca91a0 ffff880090ad2e00 ffffffff81ca91a4
[194040.576202] Call Trace:
[194040.576212] [<ffffffff8167f519>] schedule+0x29/0x70
[194040.576217] [<ffffffff8167f7de>] schedule_preempt_disabled+0xe/0x10
[194040.576221] [<ffffffff8167e2f7>] __mutex_lock_slowpath+0xd7/0x150
[194040.576225] [<ffffffff8167ddca>] mutex_lock+0x2a/0x50
[194040.576230] [<ffffffff8156ab01>] copy_net_ns+0x71/0x100
[194040.576236] [<ffffffff8107b39b>] create_new_namespaces+0xdb/0x190
[194040.576239] [<ffffffff8107b58c>] copy_namespaces+0x8c/0xd0
[194040.576245] [<ffffffff81050112>] copy_process.part.22+0x902/0x1520
[194040.576249] [<ffffffff81050eb5>] do_fork+0x135/0x390
[194040.576254] [<ffffffff811820d5>] ? vfs_write+0x105/0x180
[194040.576258] [<ffffffff8101c2e8>] sys_clone+0x28/0x30
[194040.576263] [<ffffffff816889b3>] stub_clone+0x13/0x20
[194040.576267] [<ffffffff816886a9>] ? system_call_fastpath+0x16/0x1b
[194048.384149] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194058.624071] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194068.864079] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194079.104158] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194089.344152] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194099.584105] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194109.824044] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194120.064158] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194130.304148] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194140.544146] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194150.784065] unregister_netdevice: waiting for lo to become free. Usage count = 1
[194160.576246] INFO: task lxc-start:23872 blocked for more than 120 seconds.
[194160.576251] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[194160.576253] lxc-start D ffff88014fd13980 0 23872 1 0x00000000
[194160.576259] ffff880116909cc0 0000000000000086 ffff880090ad2e00 ffff880116909fd8
[194160.576265] ffff880116909fd8 ffff880116909fd8 ffff880144830000 ffff880090ad2e00
[194160.576270] ffff880116909cc0 ffffffff81ca91a0 ffff880090ad2e00 ffffffff81ca91a4
[194160.576275] Call Trace:
[194160.576286] [<ffffffff8167f519>] schedule+0x29/0x70
[194160.576290] [<ffffffff8167f7de>] schedule_preempt_disabled+0xe/0x10
[194160.576294] [<ffffffff8167e2f7>] __mutex_lock_slowpath+0xd7/0x150
[194160.576299] [<ffffffff8167ddca>] mutex_lock+0x2a/0x50
[194160.576304] [<ffffffff8156ab01>] copy_net_ns+0x71/0x100
[194160.576309] [<ffffffff8107b39b>] create_new_namespaces+0xdb/0x190
[194160.576313] [<ffffffff8107b58c>] copy_namespaces+0x8c/0xd0
[194160.576318] [<ffffffff81050112>] copy_process.part.22+0x902/0x1520
[194160.576322] [<ffffffff81050eb5>] do_fork+0x135/0x390
[194160.576327] [<ffffffff811820d5>] ? vfs_write+0x105/0x180
[194160.576332] [<ffffffff8101c2e8>] sys_clone+0x28/0x30
[194160.576337] [<ffffffff816889b3>] stub_clone+0x13/0x20
[194160.576341] [<ffffffff816886a9>] ? system_call_fastpath+0x16/0x1b
[194161.024151] unregister_netdevice: waiting for lo to become free. Usage count = 1
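
For anyone comparing their own dmesg output against the log above, here is a rough sketch of a summarizer that counts the "waiting for lo to become free" messages and the hung-task reports per task. The function name `summarize` and the regular expressions are illustrative assumptions based only on the message formats pasted in this comment; other kernel versions may word these lines differently.

```python
import re
from collections import Counter

# Patterns are assumptions derived from the log lines pasted in this comment.
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>.*)$")
WAIT_RE = re.compile(
    r"unregister_netdevice: waiting for (?P<dev>\S+) to become free\. "
    r"Usage count = (?P<count>\d+)"
)
HUNG_RE = re.compile(
    r"INFO: task (?P<task>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def summarize(log_text):
    """Return (waits, hung) for a block of dmesg text.

    waits: list of (timestamp, device, usage_count) tuples
    hung:  Counter keyed by (task_name, pid) counting hung-task reports
    """
    waits = []
    hung = Counter()
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # not a timestamped kernel log line
        ts = float(m.group("ts"))
        msg = m.group("msg")
        w = WAIT_RE.search(msg)
        if w:
            waits.append((ts, w.group("dev"), int(w.group("count"))))
            continue
        h = HUNG_RE.search(msg)
        if h:
            hung[(h.group("task"), int(h.group("pid")))] += 1
    return waits, hung
```

Feeding it the text of `dmesg` (for example via `subprocess.check_output(["dmesg"], text=True)`) should make it easy to see whether the wait messages keep recurring for the same device and which PIDs are reported as blocked.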

I've been creating and destroying a lot of LXC containers, so it's possible the veth devices created for them are causing some issues. I also have a ton of network-interface-security jobs still running, which suggests they're being added when an interface appears but not removed when it goes away:

network-interface-security (network-interface/vethx3SWbR) start/running
network-interface-security (network-interface/vethWUOSpt) start/running
network-interface-security (network-interface/veth90RDZM) start/running
network-interface-security (network-interface/vethCdnGSx) start/running
network-interface-security (network-interface/vetha8REFc) start/running
network-interface-security (network-interface/veth8yrXSC) start/running
network-interface-security (network-interface/vethvtEy9P) start/running
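
To check whether jobs like the ones listed above really are left over from interfaces that no longer exist, something along these lines could help. It is only a sketch: `stale_security_jobs` is a made-up helper name, and the line format it parses is assumed from the listing pasted here.

```python
import re

# Line format assumed from the job listing pasted in this comment.
JOB_RE = re.compile(
    r"^network-interface-security \(network-interface/(?P<iface>[^)]+)\)\s+(?P<state>\S+)"
)


def stale_security_jobs(initctl_output, live_interfaces):
    """Return interface names that have a running job but no live interface."""
    stale = []
    for line in initctl_output.splitlines():
        m = JOB_RE.match(line.strip())
        if not m:
            continue  # some other job, or an unrelated line
        if m.group("state") != "start/running":
            continue
        if m.group("iface") not in live_interfaces:
            stale.append(m.group("iface"))
    return stale
```

On an affected machine, `initctl_output` would be the text printed by `initctl list`, and `live_interfaces` could be built with `set(os.listdir("/sys/class/net"))`. Any names returned are jobs that outlived their veth device.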

These issues are blocking some LXC work I'm doing, so I'm going to try upgrading, which may take me out of the 'affected' category. I've apt-cloned the system first so we can get back to this state if need be: