soft lockup during btrfs balance

Bug #1559723 reported by Thomas Mayer on 2016-03-20
This bug affects 1 person
Affects: linux (Ubuntu)
Status: Expired
Importance: Medium
Assigned to: Unassigned

Bug Description

I performed a btrfs balance on a disk that had run out of space. When I did this some months ago, I got an error message; this time, I got a soft lockup and the file system was remounted read-only:

# cat /proc/version_signature
Ubuntu 4.2.0-27.32~14.04.1-generic 4.2.8-ckt1
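For reference, the balance was presumably started with something along these lines (the /mnt/x mount point only appears in a later comment; this is a sketch, not the exact invocation):

# btrfs balance start /mnt/x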

I see multiple errors in dmesg output, like

[3099814.088753] NMI watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [btrfs:27516]
[3099814.089403] Modules linked in: ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs libcrc32c intel_rapl w83627ehf x86_pkg_temp_thermal intel_powerclamp hwmon_vid coretemp kvm_intel kvm snd_hda_codec_realtek snd_hda_codec_generic lp joydev input_leds ppdev gpio_ich 8250_fintek parport_pc parport nuvoton_cir rc_core mac_hid snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd lpc_ich mei_me soundcore mei shpchp serio_raw btrfs drbg ansi_cprng dm_crypt raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq raid0 multipath linear raid1 crct10dif_pclmul crc32_pclmul hid_generic i915 aesni_intel ahci aes_x86_64 lrw usbhid video gf128mul i2c_algo_bit glue_helper e1000e ablk_helper drm_kms_helper libahci ptp cryptd hid psmouse pps_core drm pata_acpi
[3099814.089436] CPU: 3 PID: 27516 Comm: btrfs Tainted: G U W L 4.2.0-27-generic #32~14.04.1-Ubuntu
[3099814.089437] Hardware name: /DQ67OW, BIOS SWQ6710H.86A.0067.2014.0313.1347 03/13/2014
[3099814.089438] task: ffff8800aca0a940 ti: ffff880110c5c000 task.ti: ffff880110c5c000
[3099814.089438] RIP: 0010:[<ffffffffc031f9d1>] [<ffffffffc031f9d1>] __del_reloc_root+0xd1/0x100 [btrfs]
[3099814.089451] RSP: 0018:ffff880110c5fa60 EFLAGS: 00000246
[3099814.089452] RAX: ffff880142cd36f8 RBX: ffffffffc02a912a RCX: 00000000ffffffe2
[3099814.089453] RDX: 000007ba9c902000 RSI: ffffffffc034e280 RDI: ffff88002a9e8d70
[3099814.089454] RBP: ffff880110c5fa78 R08: 0000000000000000 R09: 00000000ffffffe2
[3099814.089454] R10: ffffffffc02a912a R11: ffffea00029bc9c0 R12: 00000000ffffffe2
[3099814.089455] R13: ffff880110c5fa78 R14: 000000002b4bf801 R15: ffff88042b4bf800
[3099814.089456] FS: 00007f93651f0880(0000) GS:ffff88043e2c0000(0000) knlGS:0000000000000000
[3099814.089457] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[3099814.089458] CR2: 0000559c1ffff7d0 CR3: 00000002fca04000 CR4: 00000000000406e0
[3099814.089459] Stack:
[3099814.089460] ffff88002a9e8800 ffff88042b52b000 ffff880110c5fab8 ffff880110c5fa98
[3099814.089461] ffffffffc031fa25 ffff88042b4bfbc0 ffff88042b4bfbc0 ffff880110c5faf8
[3099814.089462] ffffffffc0325f2a ffff88002a9e8d78 000000002a9e8d78 ffff88042b4bdbc0
[3099814.089464] Call Trace:
[3099814.089471] [<ffffffffc031fa25>] free_reloc_roots+0x25/0x40 [btrfs]
[3099814.089477] [<ffffffffc0325f2a>] merge_reloc_roots+0x16a/0x230 [btrfs]
[3099814.089483] [<ffffffffc0326246>] relocate_block_group+0x256/0x600 [btrfs]
[3099814.089489] [<ffffffffc03267b3>] btrfs_relocate_block_group+0x1c3/0x2d0 [btrfs]
[3099814.089496] [<ffffffffc02f9dee>] btrfs_relocate_chunk.isra.39+0x3e/0xc0 [btrfs]
[3099814.089502] [<ffffffffc02fb26f>] __btrfs_balance+0x48f/0x8c0 [btrfs]
[3099814.089509] [<ffffffffc02fba1d>] btrfs_balance+0x37d/0x650 [btrfs]
[3099814.089516] [<ffffffffc03085e2>] ? btrfs_ioctl_balance+0x282/0x530 [btrfs]
[3099814.089522] [<ffffffffc0308756>] btrfs_ioctl_balance+0x3f6/0x530 [btrfs]
[3099814.089528] [<ffffffffc030a17f>] btrfs_ioctl+0x56f/0x2470 [btrfs]
[3099814.089532] [<ffffffff8118559b>] ? lru_cache_add_active_or_unevictable+0x2b/0xa0
[3099814.089534] [<ffffffff811a5e55>] ? handle_mm_fault+0xba5/0x1840
[3099814.089536] [<ffffffff811fdfbd>] do_vfs_ioctl+0x2cd/0x4b0
[3099814.089539] [<ffffffff810642a6>] ? __do_page_fault+0x1b6/0x430
[3099814.089541] [<ffffffff811fe219>] SyS_ioctl+0x79/0x90
[3099814.089543] [<ffffffff817bc3b2>] entry_SYSCALL_64_fastpath+0x16/0x75
[3099814.089543] Code: f0 01 00 00 49 89 84 24 c0 03 00 00 49 89 84 24 c8 03 00 00 48 81 c7 38 09 00 00 c6 07 00 66 66 66 90 48 89 df e8 b0 d3 ea c0 5b <41> 5c 41 5d 41 5e 5d c3 76 09 48 8b 5b 08 e9 5a ff ff ff 49 8d

I also can't access some of the directories now (even in read-only mode):

cd somedir
-su: cd: somedir: Eingabe-/Ausgabefehler   (Input/output error)

# lsb_release -rd
Description: Ubuntu 14.04.4 LTS
Release: 14.04

Thomas Mayer (thomas303) wrote :

related: #1349711

Thomas Mayer (thomas303) wrote :

Note that some months ago (when I got the error message), I was still able to cleanly free up some disk space and perform a clean btrfs balance (which freed a lot of space).

My hope was that, if disk space runs out, I'd get an error message and the operation would exit cleanly. Instead, I got the soft lockup.
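For context (not a command from this report): a filtered balance is a common way to reclaim space on a nearly full btrfs file system, since it only rewrites mostly-empty data chunks. A minimal sketch, assuming a mount point of /mnt/x and an example usage threshold:

# btrfs balance start -dusage=10 /mnt/x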

This change was made by a bot.

Changed in linux (Ubuntu):
status: New → Confirmed
tags: added: trusty
Thomas Mayer (thomas303) wrote :

In my setup, the btrfs file system runs on top of a LUKS-encrypted partition, which itself sits on top of an mdraid mirror.

The layout is:

# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 2,7T 0 disk
├─sda1 8:1 0 7M 0 part
├─sda2 8:2 0 286M 0 part /boot
└─sda3 8:3 0 2,7T 0 part
  └─md0 9:0 0 2,7T 0 raid1
    └─md0_crypt (dm-0) 252:0 0 2,7T 0 crypt
sdb 8:16 0 2,7T 0 disk
├─sdb1 8:17 0 7M 0 part
├─sdb2 8:18 0 286M 0 part /boot2
└─sdb3 8:19 0 2,7T 0 part
  └─md0 9:0 0 2,7T 0 raid1
    └─md0_crypt (dm-0) 252:0 0 2,7T 0 crypt

The btrfs filesystem does not use btrfs-raid functionality.

Scenario: 3 hosts are backed up using rsync and btrfs-snapshots on this server. I have 3 groups of subvolumes with about 50 snapshots in each group.
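A sketch of how the snapshots in these groups could be enumerated (assuming the top-level subvolume is mounted at /mnt/x; this is not output from this system):

# btrfs subvolume list -s /mnt/x | wc -l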

Thomas Mayer (thomas303) wrote :

# btrfs --version
Btrfs v3.12

The kernel is from the 14.04.4 hardware enablement stack; btrfs-tools is the corresponding package from Ubuntu's archive.

Thomas Mayer (thomas303) wrote :

# apt-cache policy btrfs-tools
btrfs-tools:
  Installiert: 3.12-1ubuntu0.1

Thomas Mayer (thomas303) wrote :

related: https://<email address hidden>/msg48609.html

Thomas Mayer (thomas303) wrote :

Some relevant lines from /var/log/syslog:

Mar 19 19:15:10 server smartd[2251]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 107 to 106

Mar 19 07:59:21 server console-kit-daemon[2041]: GLib-CRITICAL: Source ID 376 was not found when attempting to remove it
Mar 19 19:23:17 server console-kit-daemon[2041]: GLib-CRITICAL: Source ID 332 was not found when attempting to remove it
Mar 19 19:23:17 server console-kit-daemon[2041]: GLib-CRITICAL: Source ID 332 was not found when attempting to remove it
Mar 19 19:23:41 server console-kit-daemon[2041]: WARNING: Record was not written to disk (No space left on device)

Mar 19 19:24:39 server console-kit-daemon[2041]: message repeated 3 times: [ WARNING: Record was not written to disk (No space left on device)]
Mar 19 19:45:07 server console-kit-daemon[2041]: GLib-CRITICAL: Source ID 384 was not found when attempting to remove it
Mar 19 19:45:11 server smartd[2251]: Device: /dev/sdb [SAT], SMART Usage Attribute: 194 Temperature_Celsius changed from 106 to 107

So besides the full disk and a possibly broken file system, the temperature of one drive was also elevated, presumably during the balance operation. After more than 10 hours of idling, the temperature is back to normal (46 degrees Celsius):

smartctl -a /dev/sdb
194 Temperature_Celsius 0x0022 104 101 000 Old_age Always - 46

Thomas Mayer (thomas303) wrote :

The underlying md-raid (mirroring) looks fine. According to the md-raid status, there is nothing wrong with the mirrored partitions:

cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda3[0] sdb3[1]
      2929833792 blocks super 1.2 [2/2] [UU]

Basically, that means that if there is no issue with LUKS (and I don't see any), the btrfs balance on a full disk led to the read-only-mounted, presumably corrupted file system.

I'll try to check the raid integrity, which might take a day or two.

Thomas Mayer (thomas303) wrote :

Currently, I'm checking the md-raid (raid 1) with

echo check > /sys/block/md0/md/sync_action

I have 2 x 3.5TB hard drives, so this will take a while.
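Progress and result of the check can be watched while it runs, e.g. (the interval is an example value; paths as in the command above):

# watch -n 60 cat /proc/mdstat
# cat /sys/block/md0/md/mismatch_cnt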

Thomas Mayer (thomas303) wrote :

Something noticeable: the btrfs command has been consuming 100% of one CPU for a long time:

# top
[...]
PID BENUTZER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
27516 root 20 0 15476 132 0 R 100,0 0,0 1348:33 btrfs

I assume that this process originates from my "btrfs balance ..." from some days ago. At least I can say that I did not start any other btrfs command in the meantime.
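One way to confirm where PID 27516 came from would be to look at its start time and command line, e.g. (a sketch, not something that was run here):

# ps -o lstart,args -p 27516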

Thomas Mayer (thomas303) wrote :

Error messages in dmesg | tail are still ongoing, presumably originating from the btrfs command, which is still consuming 100% of one CPU.

I have not rebooted yet, nor killed the btrfs process (PID 27516).

Here are two lines from two of the latest error messages in dmesg:
[3111643.576434] [<ffffffff817bc3b2>] entry_SYSCALL_64_fastpath+0x16/0x75
[3111671.589435] [<ffffffff817bc3b2>] entry_SYSCALL_64_fastpath+0x16/0x75

So the time interval between two messages is about 28 seconds (3111671.59 - 3111643.58 ≈ 28 s).

Thomas Mayer (thomas303) wrote :

I found a few entries in kern.log; nothing special during the btrfs balance:

Mar 19 03:47:03 server kernel: [2970569.341605] BTRFS info (device dm-0): relocating block group 9553211031552 flags 36
Mar 19 04:01:08 server kernel: [2971415.588391] BTRFS info (device dm-0): relocating block group 9552674160640 flags 36
Mar 19 04:11:30 server kernel: [2972037.616074] BTRFS info (device dm-0): relocating block group 9552137289728 flags 36
Mar 19 04:24:01 server kernel: [2972788.648812] BTRFS info (device dm-0): relocating block group 9488249651200 flags 36
Mar 19 04:58:24 server kernel: [2974852.613046] BTRFS info (device dm-0): relocating block group 9440468140032 flags 36
Mar 19 05:39:25 server kernel: [2977314.792513] BTRFS info (device dm-0): relocating block group 9424898883584 flags 36
Mar 19 06:29:42 server kernel: [2980333.819035] BTRFS info (device dm-0): relocating block group 9266521964544 flags 36
Mar 19 11:01:36 server kernel: [2996654.661281] BTRFS info (device dm-0): relocating block group 9215519227904 flags 36

Thomas Mayer (thomas303) wrote :

cat /proc/mounts
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
udev /dev devtmpfs rw,relatime,size=8157548k,nr_inodes=2039387,mode=755 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,noexec,relatime,size=1634416k,mode=755 0 0
/dev/mapper/md0_crypt / btrfs ro,relatime,space_cache,subvolid=256,subvol=/@ 0 0
none /sys/fs/cgroup tmpfs rw,relatime,size=4k,mode=755 0 0
none /sys/fs/fuse/connections fusectl rw,relatime 0 0
none /sys/kernel/debug debugfs rw,relatime 0 0
none /sys/kernel/security securityfs rw,relatime 0 0
none /tmp tmpfs rw,relatime,size=6291456k 0 0
none /run/lock tmpfs rw,nosuid,nodev,noexec,relatime,size=5120k 0 0
none /run/shm tmpfs rw,nosuid,nodev,relatime 0 0
none /run/user tmpfs rw,nosuid,nodev,noexec,relatime,size=102400k,mode=755 0 0
none /sys/fs/pstore pstore rw,relatime 0 0
none /var/tmp/mysql tmpfs rw,relatime,size=6291456k 0 0
/dev/mapper/md0_crypt /home btrfs ro,relatime,space_cache,subvolid=258,subvol=/@home 0 0
/dev/mapper/md0_crypt /backup btrfs ro,relatime,space_cache,subvolid=263,subvol=/@backup 0 0
/dev/sda2 /boot ext4 rw,relatime,stripe=4,data=ordered 0 0
/dev/sdb2 /boot2 ext4 rw,relatime,stripe=4,data=ordered 0 0
systemd /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,name=systemd 0 0
/dev/mapper/md0_crypt /mnt/x btrfs ro,relatime,space_cache,subvolid=5,subvol=/ 0 0

All the btrfs mount points are mounted read-only (they all originate from one partition, /dev/mapper/md0_crypt). I started the btrfs balance online (some days ago), meaning I had mounted /dev/mapper/md0_crypt at /mnt/x and ran the balance operation on the live file system.
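That mount corresponds to the last /proc/mounts line above and would have been created with something along these lines (a sketch, not the exact command used):

# mount -o subvolid=5 /dev/mapper/md0_crypt /mnt/x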

Thomas Mayer (thomas303) wrote :

Now, this one is funny after ~2 days:

# btrfs balance status x
Balance on 'x' is running
0 out of about 2797 chunks balanced (533 considered), 100% left

# btrfs balance cancel x
ERROR: balance cancel on 'x' failed - Read-only file system

So the balance is still running, consuming 100% of one CPU, but now that the file system is read-only, nothing can be written to disk. The interval of ~30 s between error messages in dmesg suggests that this corresponds to the periodic commit interval described at https://btrfs.wiki.kernel.org/index.php/Mount_options
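For reference, btrfs commits every 30 seconds by default; a different interval would be set via the commit= mount option, e.g. (illustrative only, and not applicable to a read-only file system):

# mount -o remount,commit=30 /mnt/x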

Thomas Mayer (thomas303) wrote :

Checking the underlying mdraid with

echo check > /sys/block/md0/md/sync_action

has now finished, and the result is fine:

cat /proc/mdstat
Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10]
md0 : active raid1 sda3[0] sdb3[1]
      2929833792 blocks super 1.2 [2/2] [UU]

Joseph Salisbury (jsalisbury) wrote :

Would it be possible for you to test the latest upstream kernel? Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Please test the latest v4.5 kernel[0].

If this bug is fixed in the mainline kernel, please add the following tag 'kernel-fixed-upstream'.

If the mainline kernel does not fix this bug, please add the tag: 'kernel-bug-exists-upstream'.

Once testing of the upstream kernel is complete, please mark this bug as "Confirmed".

Thanks in advance.

[0] http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.5-wily/
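For an amd64 system, installing a mainline test build typically amounts to downloading the generic .deb packages from [0] and installing them with dpkg; a sketch with placeholder file names (use the actual names listed in the directory):

# wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.5-wily/<linux-headers-...-all.deb>
# wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.5-wily/<linux-headers-...-generic-amd64.deb>
# wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.5-wily/<linux-image-...-generic-amd64.deb>
# dpkg -i linux-*.deb
# reboot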

Changed in linux (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Incomplete
Thomas Mayer (thomas303) wrote :

@jsalisbury

What exactly should I test on a broken file system with a newer kernel? How would this help you? If btrfs balance corrupted my file system, how would a newer kernel fix that?

If you can tell me a test scenario that would help you see whether the bug is fixed upstream, I'd be happy to hear about it.

Thomas Mayer (thomas303) wrote :

I rebooted the system in the meantime. It is now running kernel 4.2.0-30.36 from earlier updates:

cat /proc/version_signature
Ubuntu 4.2.0-30.36~14.04.1-generic 4.2.8-ckt3

During the reboot, the system performed disk I/O for several hours before services like sshd became available. I guess this disk activity happened before the file system was mounted read-write.

What I can say (from syslog) is that the system tried to continue the failed balance operation during boot, without any intervention on my side:

BTRFS info (device dm-0): continuing balance
BTRFS info (device dm-0): relocating block group 9553211031552 flags 36
found 106150 extents
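An interrupted balance resumes automatically at the next read-write mount; as an aside (not something used here), that resume could be suppressed with the skip_balance mount option, a sketch:

# mount -o skip_balance /dev/mapper/md0_crypt /mnt/x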

The btrfs balance operation failed again, but this time in the manner I was used to from before filing this bug report. This is MUCH better than ending up in an unpredictable state (read-only mode with Input/Output errors). I guess the OS had gotten totally out of sync before.

BTRFS info (device dm-0): 2 enospc errors during balance

The btrfs file system was mounted in READ-WRITE-mode this time (not in read-only mode as before!). And this time, I can read the files and navigate through directories without any Input/Output errors (before the reboot, I had lots of them!).

Finally, everything looks fine for a btrfs file system with no space left on the disk.

Now that the fs is online and accessible, I can try to add another disk to it to get some empty space again. After that, I can hopefully delete some data and redo the btrfs balance operation.
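Roughly, the plan as a sketch (the device name of the new disk below is a placeholder):

# btrfs device add /dev/sdX1 /mnt/x
# btrfs balance start /mnt/x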

/var/log/syslog gives me:

Mar 22 12:42:37 server kernel: [ 3481.725273] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 22 12:42:37 server kernel: [ 3481.726122] mount D ffff88043e296640 0 587 455 0x00000000
Mar 22 12:42:37 server kernel: [ 3481.726125] ffff88042798f6a8 0000000000000086 ffff88042b98b700 ffff880427946e00
Mar 22 12:42:37 server kernel: [ 3481.726127] ffff88042798f788 ffff880427990000 0000000000000000 7fffffffffffffff
Mar 22 12:42:37 server kernel: [ 3481.726129] ffff88043e5aeee8 ffffffff817b9240 ffff88042798f6c8 ffffffff817b8ac7
Mar 22 12:42:37 server kernel: [ 3481.726130] Call Trace:
Mar 22 12:42:37 server kernel: [ 3481.726136] [<ffffffff817b9240>] ? bit_wait+0x60/0x60
Mar 22 12:42:37 server kernel: [ 3481.726138] [<ffffffff817b8ac7>] schedule+0x37/0x80
Mar 22 12:42:37 server kernel: [ 3481.726140] [<ffffffff817bb4a1>] schedule_timeout+0x201/0x2a0
Mar 22 12:42:37 server kernel: [ 3481.726143] [<ffffffff8101d749>] ? read_tsc+0x9/0x10
Mar 22 12:42:37 server kernel: [ 3481.726146] [<ffffffff810e6e5e>] ? ktime_get+0x3e/0xa0
Mar 22 12:42:37 server kernel: [ 3481.726147] [<ffffffff817b9240>] ? bit_wait+0x60/0x60
Mar 22 12:42:37 server kernel: [ 3481.726149] [<ffffffff817b8136>] io_schedule_timeout+0xa6/0x110
Mar 22 12:42:37 server kernel: [ 3481.726151] [<ffffffff817b925f>] bit_wait_io+0x1f/0x70
Mar 22 12:42:37 server kernel: [ 3481.726152] [<ffffffff817b8e90>] __wait_on_bit+0x60/0x90
Mar 22 12:42:37 server kernel: [ 3481.726155] [<ffffffff81176708>] ? find_get_pages_tag+0xc8/0x170
Mar 22 12:42:37 server kernel: [ 3481.726157] [<ffffffff81175980>] wait_on_page_bit+0xc0/0xd0
Mar 22 ...

Thomas Mayer (thomas303) wrote :

As described in my previous comment, the fs was available in read-write mode again.

I added another disk to the btrfs file system, and with the additional space I was able to rerun and successfully finish the btrfs balance operation (as described in https://www.slicewise.net/debian/balancierung-eines-vollen-btrfs-dateisystems/ ):

btrfs balance start x
Done, had to relocate 2797 out of 2797 chunks

So my file system is completely rescued now.

To sum it up:

There is no way for me to reproduce this error again with a recent kernel.

All I can say is that Kernel 4.2.0-27.32~14.04.1-generic 4.2.8-ckt1 ran into unexpected behaviour when doing a btrfs balance:

Expected Result: Error "enospc errors during balance" (at least)

Actual Result: kernel bug in dmesg, the file system goes read-only, Input/Output errors occur, many directories are not readable even in read-only mode, and even ls does not work. It seems as if the system got totally messed up.

Suggested Fix: all of these problems should be avoidable, because the later reboot "healed" them (the error "enospc errors during balance" was reported, the fs was mounted read-write, and the btrfs balance still fails, which is totally fine at this point).

Launchpad Janitor (janitor) wrote :

[Expired for linux (Ubuntu) because there has been no activity for 60 days.]

Changed in linux (Ubuntu):
status: Incomplete → Expired