Stack from kworker:
[ 240.608612] INFO: task kworker/0:2:861 blocked for more than 120 seconds.
[ 240.608617] Not tainted 3.13.0-17-generic #37-Ubuntu
[ 240.608618] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.608620] kworker/0:2 D ffff88001e414440 0 861 2 0x00000000
[ 240.608628] Workqueue: events hot_add_req [hv_balloon]
[ 240.608630] ffff88001a00fb30 0000000000000002 ffff88001a6f8000 ffff88001a00ffd8
[ 240.608632] 0000000000014440 0000000000014440 ffff88001a6f8000 ffff88001aac6c98
[ 240.608635] ffff88001aac6c9c ffff88001a6f8000 00000000ffffffff ffff88001aac6ca0
[ 240.608637] Call Trace:
[ 240.608643] [<ffffffff817159f9>] schedule_preempt_disabled+0x29/0x70
[ 240.608645] [<ffffffff81717865>] __mutex_lock_slowpath+0x135/0x1b0
[ 240.608647] [<ffffffff817178ff>] mutex_lock+0x1f/0x2f
[ 240.608651] [<ffffffff8148a5bd>] device_attach+0x1d/0xa0
[ 240.608653] [<ffffffff81489a38>] bus_probe_device+0x98/0xc0
[ 240.608656] [<ffffffff81487895>] device_add+0x4c5/0x640
[ 240.608658] [<ffffffff81487a2a>] device_register+0x1a/0x20
[ 240.608661] [<ffffffff8149e000>] init_memory_block+0xd0/0xf0
[ 240.608663] [<ffffffff8149e141>] register_new_memory+0x91/0xa0
[ 240.608666] [<ffffffff81700d10>] __add_pages+0x140/0x240
[ 240.608670] [<ffffffff81055649>] arch_add_memory+0x59/0xd0
[ 240.608672] [<ffffffff81700fe4>] add_memory+0xe4/0x1f0
[ 240.608675] [<ffffffffa00411cf>] hot_add_req+0x31f/0x1150 [hv_balloon]
[ 240.608679] [<ffffffff810824a2>] process_one_work+0x182/0x450
[ 240.608681] [<ffffffff81083241>] worker_thread+0x121/0x410
[ 240.608683] [<ffffffff81083120>] ? rescuer_thread+0x3e0/0x3e0
[ 240.608686] [<ffffffff81089ed2>] kthread+0xd2/0xf0
[ 240.608688] [<ffffffff81089e00>] ? kthread_create_on_node+0x190/0x190
[ 240.608691] [<ffffffff817219bc>] ret_from_fork+0x7c/0xb0
[ 240.608693] [<ffffffff81089e00>] ? kthread_create_on_node+0x190/0x190
kworker looks to be blocked on the device lock in device_attach:
int device_attach(struct device *dev)
{
int ret = 0;
device_lock(dev);
[...]
}
If we follow the call trace we take mem_hotplug_mutex in add_memory():
int __ref add_memory(int nid, u64 start, u64 size)
{
lock_memory_hotplug();
[...]
}
We later call device_add(), which triggers the udev event for this block:
int device_add(struct device *dev)
{
kobject_uevent(&dev->kobj, KOBJ_ADD);
[...]
}
Finally, after emitting this event and while still holding the memory
hotplug lock, we call device_attach() above, nesting device_lock(dev)
inside the memory hotplug lock.
Stack from systemd-udevd:
[ 240.608705] INFO: task systemd-udevd:1906 blocked for more than 120 seconds.
[ 240.608706] Not tainted 3.13.0-17-generic #37-Ubuntu
[ 240.608707] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.608708] systemd-udevd D ffff88001e414440 0 1906 404 0x00000004
[ 240.608710] ffff88001a97bd20 0000000000000002 ffff8800170e0000 ffff88001a97bfd8
[ 240.608712] 0000000000014440 0000000000014440 ffff8800170e0000 ffffffff81c620e0
[ 240.608714] ffffffff81c620e4 ffff8800170e0000 00000000ffffffff ffffffff81c620e8
[ 240.608716] Call Trace:
[ 240.608719] [<ffffffff817159f9>] schedule_preempt_disabled+0x29/0x70
[ 240.608721] [<ffffffff81717865>] __mutex_lock_slowpath+0x135/0x1b0
[ 240.608725] [<ffffffff8115a8ae>] ? lru_cache_add+0xe/0x10
[ 240.608727] [<ffffffff817178ff>] mutex_lock+0x1f/0x2f
[ 240.608729] [<ffffffff817019c3>] online_pages+0x33/0x570
[ 240.608731] [<ffffffff8149dd98>] memory_subsys_online+0x68/0xd0
[ 240.608733] [<ffffffff814881e5>] device_online+0x65/0x90
[ 240.608735] [<ffffffff8149da24>] store_mem_state+0x64/0x160
[ 240.608738] [<ffffffff81485748>] dev_attr_store+0x18/0x30
[ 240.608742] [<ffffffff8122e698>] sysfs_write_file+0x128/0x1c0
[ 240.608745] [<ffffffff811b88c4>] vfs_write+0xb4/0x1f0
[ 240.608747] [<ffffffff811b92f9>] SyS_write+0x49/0xa0
[ 240.608749] [<ffffffff81721c7f>] tracesys+0xe1/0xe6
udevd seems to be blocked on the hotplug lock:
int __ref online_pages(unsigned long pfn, unsigned long nr_pages, int online_type)
{
[...]
lock_memory_hotplug();
[...]
mutex_lock(&zonelists_mutex);
[...]
}
Note that udevd would have taken the device lock in device_online():
int device_online(struct device *dev)
{
int ret = 0;
device_lock(dev);
[...]
}
While holding this lock, we call online_pages() as above, nesting the
memory hotplug lock inside device_lock(dev).
This looks to be an ABBA deadlock, assuming dev is the same in these two
cases, which seems plausible as we emit the udev event in the middle.
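The inversion can be reproduced in miniature in userspace. The following is only a sketch under stated assumptions, not kernel code: two pthread mutexes stand in for the memory hotplug lock and device_lock(dev), and every name in it (hotplug_lock, dev_lock, run_path, abba_would_deadlock) is hypothetical. Each thread takes its first lock, waits until the other thread holds its own first lock, and then uses pthread_mutex_trylock() on the second lock so that the program reports the conflict instead of hanging as the kworker and udevd did.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-ins for the two kernel locks discussed above. */
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER; /* lock A */
static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;     /* lock B */

static pthread_barrier_t both_hold_first; /* both threads hold their first lock */
static pthread_barrier_t both_tried;      /* both threads have tried the second */

struct path {
	pthread_mutex_t *first;  /* outer lock on this code path */
	pthread_mutex_t *second; /* lock nested inside it */
	int second_rc;           /* result of trying the nested lock */
};

static void *run_path(void *arg)
{
	struct path *p = arg;
	int rc;

	rc = pthread_mutex_lock(p->first);
	assert(rc == 0);
	pthread_barrier_wait(&both_hold_first);

	/* A blocking lock here would hang forever; trylock only reports it. */
	p->second_rc = pthread_mutex_trylock(p->second);

	/* Keep the first lock held until the other thread has tried as well. */
	pthread_barrier_wait(&both_tried);
	if (p->second_rc == 0)
		pthread_mutex_unlock(p->second);
	pthread_mutex_unlock(p->first);
	return NULL;
}

/* Returns 1 if both paths found their nested lock busy, i.e. ABBA. */
int abba_would_deadlock(void)
{
	/* kworker-like path: hotplug lock first, then device lock. */
	struct path kworker = { &hotplug_lock, &dev_lock, 0 };
	/* udevd-like path: device lock first, then hotplug lock. */
	struct path udevd = { &dev_lock, &hotplug_lock, 0 };
	pthread_t t1, t2;
	int rc;

	pthread_barrier_init(&both_hold_first, NULL, 2);
	pthread_barrier_init(&both_tried, NULL, 2);

	rc = pthread_create(&t1, NULL, run_path, &kworker);
	assert(rc == 0);
	rc = pthread_create(&t2, NULL, run_path, &udevd);
	assert(rc == 0);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);

	pthread_barrier_destroy(&both_hold_first);
	pthread_barrier_destroy(&both_tried);

	return kworker.second_rc == EBUSY && udevd.second_rc == EBUSY;
}
```

Calling abba_would_deadlock() from a small main() (built with -pthread) should return 1: each thread finds the lock it wants already held by the other, which is the state the two hung tasks above appear to be in. The barriers exist only to make the interleaving deterministic; in the real case the window is opened by the udev event being emitted between the two lock acquisitions.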