memory hotplug test stuck on Azure BasicA1 node with X-HWE azure kernel

Bug #1781902 reported by Po-Hsu Lin
Affects                      Status   Importance  Assigned to  Milestone
ubuntu-kernel-tests          Invalid  Undecided   Unassigned
linux-signed-azure (Ubuntu)  Invalid  Undecided   Unassigned

Bug Description

The memory hotplug test gets stuck on an Azure Basic A1 node running kernel 4.15.0-1017-azure #17~16.04.1-Ubuntu. The output stops at the first offline attempt and never returns:

 Running 'sudo make -C linux/tools/testing/selftests/memory-hotplug all run_tests'
 make: Entering directory '/home/azure/autotest/client/tmp/ubuntu_kernel_selftests/src/linux/tools/testing/selftests/memory-hotplug'
 make: Nothing to be done for 'all'.
 ./mem-on-off-test.sh -r 2 || echo "selftests: memory-hotplug [FAIL]"
 Test scope: 2% hotplug memory
   online all hot-pluggable memory in offline state:
    SKIPPED - no hot-pluggable memory in offline state
   offline 2% hot-pluggable memory in online state
   trying to offline 1 out of 1 memory block(s):
 online->offline memory39
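
The last line of the output is the point where the test tries to offline block memory39. As a rough way to inspect that block from another shell while the test hangs, here is a minimal sketch. It assumes the standard sysfs memory-hotplug layout (/sys/devices/system/memory/memoryN/state); the helper name mem_block_state and the SYSFS_MEM override are hypothetical and are not part of the selftest.

```shell
#!/bin/sh
# Hypothetical helper, not part of mem-on-off-test.sh.
# SYSFS_MEM can be overridden (e.g. for testing); it defaults to the
# standard sysfs location of the memory block directories.
SYSFS_MEM="${SYSFS_MEM:-/sys/devices/system/memory}"

mem_block_state() {
    # $1 is the block number, e.g. 39 for memory39.
    state_file="$SYSFS_MEM/memory$1/state"
    if [ -r "$state_file" ]; then
        # Prints the kernel-reported state string for the block.
        cat "$state_file"
    else
        echo "unknown"
    fi
}

# Inspect the block the test was trying to offline.
mem_block_state 39
```

If the block does not exist on the machine where this runs, the helper prints "unknown" instead of failing.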

The following hung-task traces appear in dmesg:
[ 605.156847] INFO: task kworker/0:0:3 blocked for more than 120 seconds.
[ 605.165522] Not tainted 4.15.0-1017-azure #17~16.04.1-Ubuntu
[ 605.171822] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 605.182439] kworker/0:0 D 0 3 2 0x80000000
[ 605.182450] Workqueue: cgroup_destroy css_free_work_fn
[ 605.182452] Call Trace:
[ 605.182460] __schedule+0x3d6/0x8b0
[ 605.182464] ? enqueue_entity+0x112/0x670
[ 605.182467] schedule+0x36/0x80
[ 605.182470] rwsem_down_read_failed+0x10a/0x170
[ 605.182473] call_rwsem_down_read_failed+0x18/0x30
[ 605.182474] ? call_rwsem_down_read_failed+0x18/0x30
[ 605.182477] __percpu_down_read+0x54/0x80
[ 605.182481] get_online_mems+0x32/0x40
[ 605.182485] memcg_destroy_kmem_caches+0x14/0x90
[ 605.182488] mem_cgroup_css_free+0x144/0x180
[ 605.182491] css_free_work_fn+0x4c/0x370
[ 605.182494] process_one_work+0x14d/0x410
[ 605.182495] worker_thread+0x4b/0x460
[ 605.182499] kthread+0x105/0x140
[ 605.182500] ? process_one_work+0x410/0x410
[ 605.182503] ? kthread_associate_blkcg+0xa0/0xa0
[ 605.182504] ret_from_fork+0x35/0x40
[ 725.984133] INFO: task kworker/0:0:3 blocked for more than 120 seconds.
[ 725.996091] Not tainted 4.15.0-1017-azure #17~16.04.1-Ubuntu
[ 726.012768] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 726.030221] kworker/0:0 D 0 3 2 0x80000000
[ 726.030232] Workqueue: cgroup_destroy css_free_work_fn
[ 726.030234] Call Trace:
[ 726.030242] __schedule+0x3d6/0x8b0
[ 726.030246] ? enqueue_entity+0x112/0x670
[ 726.030249] schedule+0x36/0x80
[ 726.030252] rwsem_down_read_failed+0x10a/0x170
[ 726.030255] call_rwsem_down_read_failed+0x18/0x30
[ 726.030256] ? call_rwsem_down_read_failed+0x18/0x30
[ 726.030259] __percpu_down_read+0x54/0x80
[ 726.030264] get_online_mems+0x32/0x40
[ 726.030268] memcg_destroy_kmem_caches+0x14/0x90
[ 726.030271] mem_cgroup_css_free+0x144/0x180
[ 726.030273] css_free_work_fn+0x4c/0x370
[ 726.030276] process_one_work+0x14d/0x410
[ 726.030278] worker_thread+0x4b/0x460
[ 726.030281] kthread+0x105/0x140
[ 726.030283] ? process_one_work+0x410/0x410
[ 726.030285] ? kthread_associate_blkcg+0xa0/0xa0
[ 726.030287] ret_from_fork+0x35/0x40
[ 846.820680] INFO: task kworker/0:0:3 blocked for more than 120 seconds.
[ 846.831583] Not tainted 4.15.0-1017-azure #17~16.04.1-Ubuntu
[ 846.842013] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 846.851811] kworker/0:0 D 0 3 2 0x80000000
[ 846.851822] Workqueue: cgroup_destroy css_free_work_fn
[ 846.851823] Call Trace:
[ 846.851831] __schedule+0x3d6/0x8b0
[ 846.851835] ? enqueue_entity+0x112/0x670
[ 846.851838] schedule+0x36/0x80
[ 846.851841] rwsem_down_read_failed+0x10a/0x170
[ 846.851844] call_rwsem_down_read_failed+0x18/0x30
[ 846.851845] ? call_rwsem_down_read_failed+0x18/0x30
[ 846.851848] __percpu_down_read+0x54/0x80
[ 846.851852] get_online_mems+0x32/0x40
[ 846.851856] memcg_destroy_kmem_caches+0x14/0x90
[ 846.851859] mem_cgroup_css_free+0x144/0x180
[ 846.851862] css_free_work_fn+0x4c/0x370
[ 846.851865] process_one_work+0x14d/0x410
[ 846.851867] worker_thread+0x4b/0x460
[ 846.851870] kthread+0x105/0x140
[ 846.851872] ? process_one_work+0x410/0x410
[ 846.851874] ? kthread_associate_blkcg+0xa0/0xa0
[ 846.851876] ret_from_fork+0x35/0x40
[ 967.651013] INFO: task kworker/0:0:3 blocked for more than 120 seconds.
[ 967.661072] Not tainted 4.15.0-1017-azure #17~16.04.1-Ubuntu
[ 967.676751] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 967.695953] kworker/0:0 D 0 3 2 0x80000000
[ 967.695964] Workqueue: cgroup_destroy css_free_work_fn
[ 967.695965] Call Trace:
[ 967.695973] __schedule+0x3d6/0x8b0
[ 967.695977] ? enqueue_entity+0x112/0x670
[ 967.695980] schedule+0x36/0x80
[ 967.695984] rwsem_down_read_failed+0x10a/0x170
[ 967.695986] call_rwsem_down_read_failed+0x18/0x30
[ 967.695988] ? call_rwsem_down_read_failed+0x18/0x30
[ 967.695991] __percpu_down_read+0x54/0x80
[ 967.695995] get_online_mems+0x32/0x40
[ 967.695999] memcg_destroy_kmem_caches+0x14/0x90
[ 967.696012] mem_cgroup_css_free+0x144/0x180
[ 967.696014] css_free_work_fn+0x4c/0x370
[ 967.696017] process_one_work+0x14d/0x410
[ 967.696019] worker_thread+0x4b/0x460
[ 967.696023] kthread+0x105/0x140
[ 967.696024] ? process_one_work+0x410/0x410
[ 967.696026] ? kthread_associate_blkcg+0xa0/0xa0
[ 967.696028] ret_from_fork+0x35/0x40
[ 1088.484528] INFO: task kworker/0:0:3 blocked for more than 120 seconds.
[ 1088.496479] Not tainted 4.15.0-1017-azure #17~16.04.1-Ubuntu
[ 1088.505639] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1088.515429] kworker/0:0 D 0 3 2 0x80000000
[ 1088.515441] Workqueue: cgroup_destroy css_free_work_fn
[ 1088.515442] Call Trace:
[ 1088.515451] __schedule+0x3d6/0x8b0
[ 1088.515455] ? enqueue_entity+0x112/0x670
[ 1088.515458] schedule+0x36/0x80
[ 1088.515461] rwsem_down_read_failed+0x10a/0x170
[ 1088.515464] call_rwsem_down_read_failed+0x18/0x30
[ 1088.515465] ? call_rwsem_down_read_failed+0x18/0x30
[ 1088.515468] __percpu_down_read+0x54/0x80
[ 1088.515473] get_online_mems+0x32/0x40
[ 1088.515477] memcg_destroy_kmem_caches+0x14/0x90
[ 1088.515480] mem_cgroup_css_free+0x144/0x180
[ 1088.515483] css_free_work_fn+0x4c/0x370
[ 1088.515486] process_one_work+0x14d/0x410
[ 1088.515487] worker_thread+0x4b/0x460
[ 1088.515491] kthread+0x105/0x140
[ 1088.515492] ? process_one_work+0x410/0x410
[ 1088.515495] ? kthread_associate_blkcg+0xa0/0xa0
[ 1088.515497] ret_from_fork+0x35/0x40
[ 1088.515524] INFO: task kworker/0:1:8902 blocked for more than 120 seconds.
[ 1088.525792] Not tainted 4.15.0-1017-azure #17~16.04.1-Ubuntu
[ 1088.536508] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1088.549442] kworker/0:1 D 0 8902 2 0x80000000
[ 1088.549452] Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
[ 1088.549454] Call Trace:
[ 1088.549462] __schedule+0x3d6/0x8b0
[ 1088.549465] schedule+0x36/0x80
[ 1088.549468] rwsem_down_read_failed+0x10a/0x170
[ 1088.549471] call_rwsem_down_read_failed+0x18/0x30
[ 1088.549473] ? call_rwsem_down_read_failed+0x18/0x30
[ 1088.549476] __percpu_down_read+0x54/0x80
[ 1088.549480] get_online_mems+0x32/0x40
[ 1088.549484] memcg_create_kmem_cache+0x1b/0x110
[ 1088.549487] memcg_kmem_cache_create_func+0x20/0x70
[ 1088.549489] process_one_work+0x14d/0x410
[ 1088.549491] worker_thread+0x4b/0x460
[ 1088.549494] kthread+0x105/0x140
[ 1088.549496] ? process_one_work+0x410/0x410
[ 1088.549498] ? kthread_associate_blkcg+0xa0/0xa0
[ 1088.549501] ? do_syscall_64+0x73/0x130
[ 1088.549504] ? SyS_exit_group+0x14/0x20
[ 1088.549506] ret_from_fork+0x35/0x40
[ 1209.316846] INFO: task kworker/0:0:3 blocked for more than 120 seconds.
[ 1209.326607] Not tainted 4.15.0-1017-azure #17~16.04.1-Ubuntu
[ 1209.336199] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1209.351335] kworker/0:0 D 0 3 2 0x80000000
[ 1209.351347] Workqueue: cgroup_destroy css_free_work_fn
[ 1209.351349] Call Trace:
[ 1209.351357] __schedule+0x3d6/0x8b0
[ 1209.351362] ? enqueue_entity+0x112/0x670
[ 1209.351365] schedule+0x36/0x80
[ 1209.351368] rwsem_down_read_failed+0x10a/0x170
[ 1209.351371] call_rwsem_down_read_failed+0x18/0x30
[ 1209.351372] ? call_rwsem_down_read_failed+0x18/0x30
[ 1209.351375] __percpu_down_read+0x54/0x80
[ 1209.351380] get_online_mems+0x32/0x40
[ 1209.351384] memcg_destroy_kmem_caches+0x14/0x90
[ 1209.351387] mem_cgroup_css_free+0x144/0x180
[ 1209.351390] css_free_work_fn+0x4c/0x370
[ 1209.351393] process_one_work+0x14d/0x410
[ 1209.351394] worker_thread+0x4b/0x460
[ 1209.351398] kthread+0x105/0x140
[ 1209.351399] ? process_one_work+0x410/0x410
[ 1209.351401] ? kthread_associate_blkcg+0xa0/0xa0
[ 1209.351403] ret_from_fork+0x35/0x40
[ 1209.351432] INFO: task kworker/0:1:8902 blocked for more than 120 seconds.
[ 1209.360304] Not tainted 4.15.0-1017-azure #17~16.04.1-Ubuntu
[ 1209.375588] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1209.387152] kworker/0:1 D 0 8902 2 0x80000000
[ 1209.387163] Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
[ 1209.387164] Call Trace:
[ 1209.387172] __schedule+0x3d6/0x8b0
[ 1209.387175] schedule+0x36/0x80
[ 1209.387178] rwsem_down_read_failed+0x10a/0x170
[ 1209.387181] call_rwsem_down_read_failed+0x18/0x30
[ 1209.387183] ? call_rwsem_down_read_failed+0x18/0x30
[ 1209.387186] __percpu_down_read+0x54/0x80
[ 1209.387190] get_online_mems+0x32/0x40
[ 1209.387194] memcg_create_kmem_cache+0x1b/0x110
[ 1209.387196] memcg_kmem_cache_create_func+0x20/0x70
[ 1209.387199] process_one_work+0x14d/0x410
[ 1209.387201] worker_thread+0x4b/0x460
[ 1209.387204] kthread+0x105/0x140
[ 1209.387206] ? process_one_work+0x410/0x410
[ 1209.387208] ? kthread_associate_blkcg+0xa0/0xa0
[ 1209.387211] ? do_syscall_64+0x73/0x130
[ 1209.387213] ? SyS_exit_group+0x14/0x20
[ 1209.387215] ret_from_fork+0x35/0x40
[ 1330.144221] INFO: task kworker/0:0:3 blocked for more than 120 seconds.
[ 1330.158150] Not tainted 4.15.0-1017-azure #17~16.04.1-Ubuntu
[ 1330.167192] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1330.182753] kworker/0:0 D 0 3 2 0x80000000
[ 1330.182765] Workqueue: cgroup_destroy css_free_work_fn
[ 1330.182767] Call Trace:
[ 1330.182775] __schedule+0x3d6/0x8b0
[ 1330.182779] ? enqueue_entity+0x112/0x670
[ 1330.182782] schedule+0x36/0x80
[ 1330.182785] rwsem_down_read_failed+0x10a/0x170
[ 1330.182788] call_rwsem_down_read_failed+0x18/0x30
[ 1330.182789] ? call_rwsem_down_read_failed+0x18/0x30
[ 1330.182792] __percpu_down_read+0x54/0x80
[ 1330.182797] get_online_mems+0x32/0x40
[ 1330.182801] memcg_destroy_kmem_caches+0x14/0x90
[ 1330.182805] mem_cgroup_css_free+0x144/0x180
[ 1330.182807] css_free_work_fn+0x4c/0x370
[ 1330.182810] process_one_work+0x14d/0x410
[ 1330.182812] worker_thread+0x4b/0x460
[ 1330.182815] kthread+0x105/0x140
[ 1330.182817] ? process_one_work+0x410/0x410
[ 1330.182819] ? kthread_associate_blkcg+0xa0/0xa0
[ 1330.182821] ret_from_fork+0x35/0x40
[ 1330.182848] INFO: task kworker/0:1:8902 blocked for more than 120 seconds.
[ 1330.196283] Not tainted 4.15.0-1017-azure #17~16.04.1-Ubuntu
[ 1330.204686] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 1330.222151] kworker/0:1 D 0 8902 2 0x80000000
[ 1330.222162] Workqueue: memcg_kmem_cache memcg_kmem_cache_create_func
[ 1330.222164] Call Trace:
[ 1330.222171] __schedule+0x3d6/0x8b0
[ 1330.222174] schedule+0x36/0x80
[ 1330.222178] rwsem_down_read_failed+0x10a/0x170
[ 1330.222181] call_rwsem_down_read_failed+0x18/0x30
[ 1330.222183] ? call_rwsem_down_read_failed+0x18/0x30
[ 1330.222186] __percpu_down_read+0x54/0x80
[ 1330.222190] get_online_mems+0x32/0x40
[ 1330.222194] memcg_create_kmem_cache+0x1b/0x110
[ 1330.222197] memcg_kmem_cache_create_func+0x20/0x70
[ 1330.222199] process_one_work+0x14d/0x410
[ 1330.222201] worker_thread+0x4b/0x460
[ 1330.222204] kthread+0x105/0x140
[ 1330.222206] ? process_one_work+0x410/0x410
[ 1330.222208] ? kthread_associate_blkcg+0xa0/0xa0
[ 1330.222211] ? do_syscall_64+0x73/0x130
[ 1330.222214] ? SyS_exit_group+0x14/0x20
[ 1330.222215] ret_from_fork+0x35/0x40
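
Both blocked workers in the traces above are waiting in get_online_mems(). To get a quick overview of which tasks the hung-task detector reported, and how often, something like the following could be used. It is a minimal sketch: the function name summarize_hung_tasks is hypothetical, and it only assumes the "task NAME blocked for more than" wording seen in the log above.

```shell
#!/bin/sh
# Hypothetical helper: reads kernel log text on stdin and prints one
# "COUNT TASK" line per task reported as blocked.
summarize_hung_tasks() {
    grep -o 'task [^ ]* blocked for more than' |
        awk '{ count[$2]++ } END { for (t in count) print count[t], t }' |
        LC_ALL=C sort -k2
}

# Example usage on a live system (reading dmesg may require root):
#   dmesg | summarize_hung_tasks
```

The output is sorted by task name so that repeated runs are easy to compare.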

ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: linux-image-4.15.0-1017-azure 4.15.0-1017.17~16.04.1
ProcVersionSignature: Ubuntu 4.15.0-1017.17~16.04.1-azure 4.15.18
Uname: Linux 4.15.0-1017-azure x86_64
ApportVersion: 2.20.1-0ubuntu2.18
Architecture: amd64
Date: Mon Jul 16 10:23:21 2018
ProcEnviron:
 TERM=xterm-256color
 PATH=(custom, no user)
 XDG_RUNTIME_DIR=<set>
 LANG=en_US.UTF-8
 SHELL=/bin/bash
SourcePackage: linux-signed-azure
UpgradeStatus: No upgrade log present (probably fresh install)

Po-Hsu Lin (cypressyew) wrote :
Po-Hsu Lin (cypressyew)
tags: added: ubuntu-kernel-selftests
Sean Feole (sfeole) wrote :

We no longer test with the A1 instances because of problems like this. A1 is an extremely minimal, low-resource instance, and we should not be offlining CPU and memory in these VMs.
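
One way a test harness could avoid this class of hang is to skip the memory-hotplug selftest on small instances. The following is only a sketch of such a guard, not what ubuntu-kernel-tests actually does: the function name should_run_hotplug_test, the MEMINFO override, and the 4 GiB MIN_MEM_KB threshold are all assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical guard, not taken from ubuntu-kernel-tests.
# MEMINFO can be overridden for testing; MIN_MEM_KB is an arbitrary
# illustrative threshold (4 GiB expressed in kB).
MEMINFO="${MEMINFO:-/proc/meminfo}"
MIN_MEM_KB="${MIN_MEM_KB:-4194304}"

should_run_hotplug_test() {
    # MemTotal in /proc/meminfo is reported in kB.
    total_kb=$(awk '/^MemTotal:/ { print $2 }' "$MEMINFO")
    [ -n "$total_kb" ] && [ "$total_kb" -ge "$MIN_MEM_KB" ]
}

if should_run_hotplug_test; then
    echo "running memory-hotplug selftest"
else
    echo "SKIPPED - instance too small for memory-hotplug selftest"
fi
```

The threshold would need tuning per instance type; the point is only that the decision is made before any memory block is taken offline.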

Changed in ubuntu-kernel-tests:
status: New → Invalid
Changed in linux-signed-azure (Ubuntu):
status: New → Invalid