qemu-nbd causes the cluster to hang

Bug #1279366 reported by sirio81
This bug affects 2 people
Affects: sheepdog
Status: New
Importance: Undecided
Assigned to: Unassigned

Bug Description

Description
Writing data to an nbd0 device causes sheepdog to hang.
'dog vdi list' doesn't work on any of the nodes.
'dog node list' hangs only on the node using nbd.
If no guests are running, nothing is shown in sheep.log.
The node running nbd doesn't get disconnected from the cluster.
If guests are running (on any node), they are unable to issue I/O requests, and errors appear in sheep.log [1] and /var/log/messages [2].
The only way to fix the situation is to reboot the node running nbd.
If the destination vdi uses erasure coding, the write hangs once the file being written is larger than 1 GB.
Note: the same procedure doesn't cause any problem when using qemu-nbd on a regular file instead of a sheepdog vdi.
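To see whether the hang has already set in on a given node, the dog commands above can be wrapped in a timeout so the check doesn't block forever (just a convenience sketch using the coreutils timeout command):

# Quick check: report whether 'dog vdi list' / 'dog node list' hang on this node.
for cmd in "dog vdi list" "dog node list"; do
    if ! timeout 5 $cmd > /dev/null 2>&1; then
        echo "'$cmd' hung or failed on $(hostname)"
    fi
done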

QEMU emulator version 1.7.50, Copyright (c) 2003-2008 Fabrice Bellard
Sheepdog daemon version 0.8.0_31_gf2a00e1
Linux pbds 3.2.0-4-amd64 #1 SMP Debian 3.2.51-1 x86_64 GNU/Linux (wheezy)

How to reproduce:

dog vdi list | grep test
  test 0 25 GB 512 MB 0.0 MB 2014-02-12 14:22 7c2b25 2:1

qemu-nbd -c /dev/nbd0 sheepdog:test

dd if=bigfile of=/dev/nbd0 bs=1M
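
The same steps as a single sequence, for convenience (a sketch only; it assumes the nbd module is not yet loaded and that the erasure-coded vdi 'test' from the listing above already exists):

modprobe nbd                          # make /dev/nbd0 available
qemu-nbd -c /dev/nbd0 sheepdog:test   # attach the sheepdog vdi to /dev/nbd0
dd if=bigfile of=/dev/nbd0 bs=1M      # writing more than ~1 GB triggers the hang
qemu-nbd -d /dev/nbd0                 # disconnect (only reachable if nothing hung)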

[1] sheep.log (/var/sheep/sheep.log):
[266904.388934] nbd: registered device at major 43
[266947.188736] block nbd0: NBD_DISCONNECT
[266947.189041] block nbd0: Unexpected reply (ffff8800084a3c58)
[266947.189309] block nbd0: queue cleared
[267212.302371] block nbd0: NBD_DISCONNECT
[267212.302655] block nbd0: Unexpected reply (ffff880004df9c58)
[267212.368741] block nbd0: queue cleared
[267741.369185] block nbd0: NBD_DISCONNECT
[267741.369451] block nbd0: Unexpected reply (ffff880004d4fc58)
[267741.618210] block nbd0: queue cleared
[268030.252831] block nbd0: NBD_DISCONNECT
[268030.253102] block nbd0: Unexpected reply (ffff880006b01c58)
[268030.506146] block nbd0: queue cleared
[605094.364246] block nbd0: NBD_DISCONNECT
[605094.364655] block nbd0: Unexpected reply (ffff880005707c58)
[605094.364884] block nbd0: queue cleared
[605151.748821] block nbd0: NBD_DISCONNECT
[605151.749114] block nbd0: Unexpected reply (ffff880074081c58)
[605151.750838] block nbd0: queue cleared
[605366.036589] block nbd0: NBD_DISCONNECT
[605366.036882] block nbd0: Unexpected reply (ffff880074171c58)
[605366.038608] block nbd0: queue cleared
[608189.537841] block nbd0: NBD_DISCONNECT
[608189.946757] block nbd0: Unexpected reply (ffff880006bddc58)
[608190.198368] block nbd0: queue cleared
[608246.316551] block nbd0: NBD_DISCONNECT
[608246.316849] block nbd0: Unexpected reply (ffff880020db3c58)
[608246.577224] block nbd0: queue cleared
[608318.444758] block nbd0: NBD_DISCONNECT
[608318.445048] block nbd0: Unexpected reply (ffff88004d5b9c58)
[608318.701296] block nbd0: queue cleared
[608395.089149] block nbd0: NBD_DISCONNECT
[608395.089434] block nbd0: Unexpected reply (ffff8800134efc58)
[608395.346413] block nbd0: queue cleared
[608427.549061] block nbd0: NBD_DISCONNECT
[608427.549333] block nbd0: Unexpected reply (ffff880005091c58)
[608427.808279] block nbd0: queue cleared
[608514.375923] block nbd0: NBD_DISCONNECT
[608514.376223] block nbd0: Unexpected reply (ffff8800440afc58)
[608514.378241] block nbd0: queue cleared
[608792.236539] block nbd0: NBD_DISCONNECT
[608792.236832] block nbd0: Unexpected reply (ffff8800440afc58)
[608792.238694] block nbd0: queue cleared
[608870.784907] block nbd0: NBD_DISCONNECT
[608870.785199] block nbd0: Unexpected reply (ffff880050cf7c58)
[608870.786924] block nbd0: queue cleared
[608966.753806] block nbd0: NBD_DISCONNECT
[608968.083637] block nbd0: Unexpected reply (ffff880045dbfc58)
[608968.337105] block nbd0: queue cleared
[611601.621622] block nbd0: NBD_DISCONNECT
[611601.621901] block nbd0: Unexpected reply (ffff8800295f1c58)
[611601.879094] block nbd0: queue cleared
[612807.021413] block nbd0: NBD_DISCONNECT
[612807.021707] block nbd0: Unexpected reply (ffff8800247b9c58)
[612807.280124] block nbd0: queue cleared
[612859.680467] block nbd0: NBD_DISCONNECT
[612859.680757] block nbd0: Unexpected reply (ffff880045c31c58)
[612859.936098] block nbd0: queue cleared
[612911.461513] block nbd0: NBD_DISCONNECT
[612911.461816] block nbd0: Unexpected reply (ffff88000f5adc58)
[612911.719748] block nbd0: queue cleared
[614041.648259] INFO: task sheep:25482 blocked for more than 120 seconds.
[614041.648399] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[614041.648551] sheep D ffff880077c13780 0 25482 1 0x00000000
[614041.648563] ffff880072b1a080 0000000000000082 ffffffff810b938f ffffffff8160d020
[614041.648576] 0000000000013780 ffff880044151fd8 ffff880044151fd8 ffff880072b1a080
[614041.648586] ffff880044151778 ffff880044151778 ffff88003740b8a0 ffff880077c13fd0
[614041.648595] Call Trace:
[614041.648611] [<ffffffff810b938f>] ? get_pageblock_flags_group+0x3a/0x6e
[614041.648623] [<ffffffff810b47b3>] ? lock_page+0x20/0x20
[614041.648632] [<ffffffff8134dac1>] ? io_schedule+0x59/0x71
[614041.648641] [<ffffffff810b47b9>] ? sleep_on_page+0x6/0xa
[614041.648649] [<ffffffff8134df04>] ? __wait_on_bit+0x3e/0x71
[614041.648658] [<ffffffff810b48f5>] ? wait_on_page_bit+0x6e/0x73
[614041.648668] [<ffffffff8105fb09>] ? autoremove_wake_function+0x2a/0x2a
[614041.648678] [<ffffffff810c2850>] ? shrink_page_list+0x166/0x73f
[614041.648689] [<ffffffff810c9cfa>] ? zone_page_state_add+0x14/0x23
[614041.648698] [<ffffffff810ca514>] ? __mod_zone_page_state+0x3c/0x45
[614041.648706] [<ffffffff810c0e13>] ? update_isolated_counts+0x13b/0x15a
[614041.648715] [<ffffffff810c32c4>] ? shrink_inactive_list+0x2cd/0x3f0
[614041.648724] [<ffffffff810c3a89>] ? shrink_zone+0x3c0/0x4e6
[614041.648734] [<ffffffff810c3fa7>] ? do_try_to_free_pages+0x1cc/0x41c
[614041.648743] [<ffffffff810c4462>] ? try_to_free_pages+0xa9/0xe9
[614041.648752] [<ffffffff810364e8>] ? should_resched+0x5/0x23
[614041.648761] [<ffffffff810bb3ee>] ? __alloc_pages_nodemask+0x4ed/0x7aa
[614041.648770] [<ffffffff8134d86c>] ? _cond_resched+0x7/0x1c
[614041.648779] [<ffffffff8103ec07>] ? set_task_cpu+0xc1/0xd4
[614041.648789] [<ffffffff810e5f05>] ? alloc_pages_vma+0x12d/0x136
[614041.648798] [<ffffffff8103ec07>] ? set_task_cpu+0xc1/0xd4
[614041.648807] [<ffffffff810f1912>] ? do_huge_pmd_anonymous_page+0xad/0x259
[614041.648817] [<ffffffff81351b2e>] ? do_page_fault+0x320/0x345
[614041.648827] [<ffffffff810d60b0>] ? mmap_region+0x353/0x44a
[614041.648837] [<ffffffff8134f135>] ? page_fault+0x25/0x30
[614041.648854] INFO: task flush-43:0:2797 blocked for more than 120 seconds.

Feb 03 17:09:12 WARN [gway 10579] wait_forward_request(413) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again
Feb 03 17:09:12 WARN [gway 11263] wait_forward_request(413) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again
Feb 03 17:09:12 WARN [gway 11262] wait_forward_request(413) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again
Feb 03 17:09:12 WARN [gway 11261] wait_forward_request(413) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again
...
Feb 03 17:15:17 ERROR [gway 11686] do_write(281) failed to write to socket: Resource temporarily unavailable
Feb 03 17:15:17 ERROR [gway 11686] send_req(319) failed to send request a5, 262144: Resource temporarily unavailable
...
Feb 03 17:15:17 WARN [gway 11261] wait_forward_request(413) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again
Feb 03 17:15:17 WARN [gway 11684] wait_forward_request(413) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again
Feb 03 17:15:17 WARN [gway 11262] wait_forward_request(413) poll timeout 1, disks of some nodes or network is busy. Going to poll-wait again
...
Feb 03 17:29:43 ERROR [gway 11684] do_write(281) failed to write to socket: Resource temporarily unavailable
Feb 03 17:29:43 ERROR [gway 11684] send_req(319) failed to send request a5, 32768: Resource temporarily unavailable
Feb 03 17:29:46 ERROR [gway 11686] do_write(281) failed to write to socket: Resource temporarily unavailable
Feb 03 17:29:46 ERROR [gway 11686] send_req(319) failed to send request a5, 32768: Resource temporarily unavailable
Feb 03 17:30:02 ERROR [gway 11683] do_write(281) failed to write to socket: Resource temporarily unavailable
Feb 03 17:30:02 ERROR [gway 11683] send_req(319) failed to send request a5, 262144: Resource temporarily unavailable

[2] /var/log/messages:
Feb 12 14:56:55 test004 kernel: [610801.624515] sheep D ffff880077c13780 0 3151 1 0x00000000
Feb 12 14:56:55 test004 kernel: [610801.624528] ffff88003773a280 0000000000000082 ffffffff00000000 ffffffff8160d020
Feb 12 14:56:55 test004 kernel: [610801.624539] 0000000000013780 ffff880073d9bfd8 ffff880073d9bfd8 ffff88003773a280
Feb 12 14:56:55 test004 kernel: [610801.624549] ffff880073d9b778 0000000173d9b778 ffff880074489980 ffff880077c13fd0
Feb 12 14:56:55 test004 kernel: [610801.624559] Call Trace:
Feb 12 14:56:55 test004 kernel: [610801.624575] [<ffffffff810b47b3>] ? lock_page+0x20/0x20
Feb 12 14:56:55 test004 kernel: [610801.624586] [<ffffffff8134dac1>] ? io_schedule+0x59/0x71
Feb 12 14:56:55 test004 kernel: [610801.624596] [<ffffffff810b47b9>] ? sleep_on_page+0x6/0xa
Feb 12 14:56:55 test004 kernel: [610801.624603] [<ffffffff8134df04>] ? __wait_on_bit+0x3e/0x71
Feb 12 14:56:55 test004 kernel: [610801.624612] [<ffffffff810b48f5>] ? wait_on_page_bit+0x6e/0x73
Feb 12 14:56:55 test004 kernel: [610801.624623] [<ffffffff8105fb09>] ? autoremove_wake_function+0x2a/0x2a
Feb 12 14:56:55 test004 kernel: [610801.624633] [<ffffffff810c2850>] ? shrink_page_list+0x166/0x73f
Feb 12 14:56:55 test004 kernel: [610801.624644] [<ffffffff810c9cfa>] ? zone_page_state_add+0x14/0x23
Feb 12 14:56:55 test004 kernel: [610801.624653] [<ffffffff810ca514>] ? __mod_zone_page_state+0x3c/0x45
Feb 12 14:56:55 test004 kernel: [610801.624661] [<ffffffff810c0e13>] ? update_isolated_counts+0x13b/0x15a
Feb 12 14:56:55 test004 kernel: [610801.624670] [<ffffffff810c32c4>] ? shrink_inactive_list+0x2cd/0x3f0
Feb 12 14:56:55 test004 kernel: [610801.624680] [<ffffffff810bd999>] ? add_page_to_lru_list+0x64/0x64
Feb 12 14:56:55 test004 kernel: [610801.624689] [<ffffffff810c3a89>] ? shrink_zone+0x3c0/0x4e6
Feb 12 14:56:55 test004 kernel: [610801.624699] [<ffffffff810c3fa7>] ? do_try_to_free_pages+0x1cc/0x41c
Feb 12 14:56:55 test004 kernel: [610801.624708] [<ffffffff810c4462>] ? try_to_free_pages+0xa9/0xe9
Feb 12 14:56:55 test004 kernel: [610801.624716] [<ffffffff810364e8>] ? should_resched+0x5/0x23
Feb 12 14:56:55 test004 kernel: [610801.624726] [<ffffffff810bb3ee>] ? __alloc_pages_nodemask+0x4ed/0x7aa
Feb 12 14:56:55 test004 kernel: [610801.624735] [<ffffffff811af277>] ? rb_insert_color+0xb1/0xd9
Feb 12 14:56:55 test004 kernel: [610801.624745] [<ffffffff8103ec07>] ? set_task_cpu+0xc1/0xd4
Feb 12 14:56:55 test004 kernel: [610801.624754] [<ffffffff810e5f05>] ? alloc_pages_vma+0x12d/0x136
Feb 12 14:56:55 test004 kernel: [610801.624764] [<ffffffff8103ec07>] ? set_task_cpu+0xc1/0xd4
Feb 12 14:56:55 test004 kernel: [610801.624773] [<ffffffff810f1912>] ? do_huge_pmd_anonymous_page+0xad/0x259
Feb 12 14:56:55 test004 kernel: [610801.624783] [<ffffffff81351b2e>] ? do_page_fault+0x320/0x345
Feb 12 14:56:55 test004 kernel: [610801.624793] [<ffffffff810d60b0>] ? mmap_region+0x353/0x44a
Feb 12 14:56:55 test004 kernel: [610801.624803] [<ffffffff8134f135>] ? page_fault+0x25/0x30

Tags: qemu-nbd
sirio81 (sirio81) wrote:

I've been repeating the test on a new vdi created with --copies 2, and it never hung.
So this issue is related to erasure-coded vdis.

dog vdi list
  testc2 0 25 GB 25 GB 0.0 MB 2014-02-12 16:05 cdab40 2
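
For anyone trying to reproduce the comparison, the two vdis could be created along these lines (a sketch; the -c copy-policy syntax is an assumption inferred from the copy-policy column in the listings and may vary by sheepdog version):

dog vdi create test   25G -c 2:1   # erasure-coded (copy policy 2:1): hangs on large writes
dog vdi create testc2 25G -c 2     # plain replication (--copies 2): never hung in my tests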
