pvmove hangs while moving LVs

Bug #1514428 reported by Guillaume Penin on 2015-11-09
This bug affects 1 person
Affects: lvm2 (Ubuntu)
Importance: Undecided
Assigned to: Unassigned

Bug Description

Operating System : Ubuntu 12.04 LTS
LVM version : 2.02.66-4ubuntu7.1

When running pvmove from one device to another on an active production system, we appear to be hitting the following bug : https://bugzilla.redhat.com/show_bug.cgi?id=706036

Symptoms :
- pvmove command hangs indefinitely
- iostat shows busy disks without any I/O
- dmesg shows :

kernel: [119143.641376] INFO: task jbd2/dm-5-8:1801 blocked for more than 120 seconds.
kernel: [119143.641456] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kernel: [119143.641529] jbd2/dm-5-8 D ffffffff81806200 0 1801 2 0x00000000
kernel: [119143.641535] ffff88022d489ac0 0000000000000046 ffff88022d489a60 ffffffff8103ec29
kernel: [119143.641540] ffff88022d489fd8 ffff88022d489fd8 ffff88022d489fd8 0000000000012800
kernel: [119143.641543] ffff880232169700 ffff880227d0ae00 ffff88022d489a90 ffff88023fc930c0
kernel: [119143.641547] Call Trace:
kernel: [119143.641561] [<ffffffff8103ec29>] ? default_spin_lock_flags+0x9/0x10
kernel: [119143.641568] [<ffffffff81119ec0>] ? __lock_page+0x70/0x70
kernel: [119143.641577] [<ffffffff81666bff>] schedule+0x3f/0x60
kernel: [119143.641580] [<ffffffff81666caf>] io_schedule+0x8f/0xd0
kernel: [119143.641584] [<ffffffff81119ece>] sleep_on_page+0xe/0x20
kernel: [119143.641587] [<ffffffff816674bf>] __wait_on_bit+0x5f/0x90
kernel: [119143.641590] [<ffffffff8111a038>] wait_on_page_bit+0x78/0x80
kernel: [119143.641595] [<ffffffff8108c640>] ? autoremove_wake_function+0x40/0x40
kernel: [119143.641598] [<ffffffff8111a14c>] filemap_fdatawait_range+0x10c/0x1a0
kernel: [119143.641603] [<ffffffff81501a58>] ? dm_request+0x28/0x40
kernel: [119143.641607] [<ffffffff812f6cf4>] ? generic_make_request.part.52+0x74/0xb0
kernel: [119143.641610] [<ffffffff812f7108>] ? generic_make_request+0x68/0x70
kernel: [119143.641613] [<ffffffff8111a20b>] filemap_fdatawait+0x2b/0x30
kernel: [119143.641617] [<ffffffff81265ec0>] journal_finish_inode_data_buffers+0x70/0x170
kernel: [119143.641621] [<ffffffff812667c7>] jbd2_journal_commit_transaction+0x677/0x12a0
kernel: [119143.641625] [<ffffffff81079184>] ? try_to_del_timer_sync+0xa4/0x110
kernel: [119143.641629] [<ffffffff8126af5b>] kjournald2+0xbb/0x220
kernel: [119143.641632] [<ffffffff8108c600>] ? add_wait_queue+0x60/0x60
kernel: [119143.641635] [<ffffffff8126aea0>] ? commit_timeout+0x10/0x10
kernel: [119143.641638] [<ffffffff8108bb5c>] kthread+0x8c/0xa0
kernel: [119143.641646] [<ffffffff816733b4>] kernel_thread_helper+0x4/0x10
kernel: [119143.641649] [<ffffffff8108bad0>] ? flush_kthread_worker+0xa0/0xa0
kernel: [119143.641652] [<ffffffff816733b0>] ? gs_change+0x13/0x13
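
For anyone triaging similar logs, the hung-task warning at the top of the trace above can be picked out mechanically. The following is only a minimal sketch in Python: the function name `find_hung_tasks` and the regular expression are illustrative assumptions based on the single message format shown in this report, not part of any LVM or kernel tooling.

```python
import re

# Assumed pattern, derived from the one warning line quoted in this report:
# "INFO: task <name>:<pid> blocked for more than <N> seconds."
HUNG_TASK = re.compile(
    r"INFO: task (?P<name>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def find_hung_tasks(lines):
    """Return a (task_name, pid, seconds) tuple for each hung-task warning."""
    hits = []
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            hits.append(
                (match.group("name"), int(match.group("pid")), int(match.group("secs")))
            )
    return hits


# Two lines copied from the dmesg output in this report.
sample = [
    "kernel: [119143.641376] INFO: task jbd2/dm-5-8:1801 blocked for more than 120 seconds.",
    "kernel: [119143.641547] Call Trace:",
]
print(find_hung_tasks(sample))
```

The input lines could equally come from saved `dmesg` output or a syslog file; the task name (here the jbd2 journal thread for dm-5) indicates which device-mapper volume is stalled.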

The system had to be reset to become operational again.

I'm unable to use apport-collect on our servers as it does not seem to use proxy settings.

Some more information about this problem :

We wanted to pvmove approximately 600 GB of data between two (VMware) disks : /dev/sdb and /dev/sdc

- With database activity (PostgreSQL) :
    pvmove always hangs at the same point (60%)

- Without database activity (PostgreSQL) :
    pvmove finishes correctly
