fsck.btrfs of a 1TB volume triggers OOM kill on a 1GB RAM machine. (32-bit)

Bug #992480 reported by Robert Collins
This bug affects 2 people
Affects: btrfs-tools (Ubuntu)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned
Milestone: (not set)

Bug Description

fsck.btrfs of a 1TB volume triggers OOM kill on a 1GB RAM machine. (32-bit)

That's about all there is to say:

sudo fsck -v /dev/sdd1
fsck from util-linux 2.20.1
fsck: Warning... fsck.btrfs for device /dev/sdd1 exited with signal 9.

May 1 17:49:10 lifelesswks kernel: [639863.533203] Out of memory: Kill process 32192 (fsck.btrfs) score 846 or sacrifice child
May 1 17:49:10 lifelesswks kernel: [639863.533210] Killed process 32192 (fsck.btrfs) total-vm:2555884kB, anon-rss:772228kB, file-rss:0kB
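
The kernel line above reports memory in kB, which makes the figures hard to compare with the 1GB of installed RAM. The following is a minimal sketch that extracts the numbers from a "Killed process" line and prints them in MiB. The function name `parse_oom_kill` and the regular expression are illustrative assumptions written against the exact line format shown in this report; other kernel versions may log additional fields.

```python
import re

# Sample line copied verbatim from the report above.
LINE = (
    "May 1 17:49:10 lifelesswks kernel: [639863.533210] Killed process 32192 "
    "(fsck.btrfs) total-vm:2555884kB, anon-rss:772228kB, file-rss:0kB"
)

# Assumption: the line follows the format seen in this report.
KILLED_RE = re.compile(
    r"Killed process (?P<pid>\d+) \((?P<name>[^)]+)\) "
    r"total-vm:(?P<total_vm>\d+)kB, "
    r"anon-rss:(?P<anon_rss>\d+)kB, "
    r"file-rss:(?P<file_rss>\d+)kB"
)


def parse_oom_kill(line):
    """Return a dict of the fields in a 'Killed process' line, or None."""
    match = KILLED_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    return {
        "pid": int(fields["pid"]),
        "name": fields["name"],
        "total_vm_kb": int(fields["total_vm"]),
        "anon_rss_kb": int(fields["anon_rss"]),
        "file_rss_kb": int(fields["file_rss"]),
    }


info = parse_oom_kill(LINE)
print(
    f"{info['name']} (pid {info['pid']}): "
    f"anon-rss {info['anon_rss_kb'] / 1024:.0f} MiB, "
    f"total-vm {info['total_vm_kb'] / 1024:.0f} MiB"
)
```

For the line in this report, the resident figure comes out at roughly 754 MiB, which is most of the machine's 1GB.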

Tags: precise
tags: added: precise
Changed in btrfs-tools (Ubuntu):
importance: Undecided → Medium
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in btrfs-tools (Ubuntu):
status: New → Confirmed
Arie Skliarouk (skliarie) wrote :

Same story, 4TB volume on a 4GB RAM machine. (64-bit).

btrfs-tools version 0.19+20100601-3ubuntu3 (as provided by ubuntu 12.04)

The reason I want to run btrfsck is that there is a subvolume I cannot delete anymore:

/sbin/btrfs subvolume delete backup_2012-05-17
Delete subvolume '/backups/backup_2012-05-17'
ERROR: cannot delete '/backups/backup_2012-05-17'

Dimitri John Ledkov (xnox) wrote :

There is 0.19+20120328-2ubuntu1 in quantal, which has a much newer implementation of btrfs-tools. Can you try that?

Arie Skliarouk (skliarie) wrote :

It got better, printing a bunch of output:

Backref 3770329751552 root 260 not referenced back 0x541da560
Incorrect global backref count on 3770329751552 found 1 wanted 0
backpointer mismatch on [3770329751552 4096]
owner ref check failed [3770329751552 4096]
ref mismatch on [3770329800704 4096] extent item 1, found 0
Backref 3770329800704 root 260 not referenced back 0x4db0d110
Incorrect global backref count on 3770329800704 found 1 wanted 0
backpointer mismatch on [3770329800704 4096]
owner ref check failed [3770329800704 4096]
Errors found in extent allocation tree
checking fs roots
Killed

Eventually it also ate all 4GB of memory and got killed by the OOM killer. Note that I intentionally disabled swap here to avoid thrashing.
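
When reproducing this kind of report, it helps to record the peak memory of the checker and whether it was killed by a signal. The sketch below shows one way to do that on Linux with the Python standard library. The helper name `run_and_report_peak` is hypothetical, and the demo runs a harmless Python child process rather than btrfsck. Two assumptions to keep in mind: `ru_maxrss` is reported in KiB on Linux (other platforms differ), and `RUSAGE_CHILDREN` covers all children waited for so far, so the helper should be called from a fresh process for a clean reading.

```python
import resource
import subprocess
import sys


def run_and_report_peak(cmd):
    """Run cmd and return (exit_code, peak_rss_kib).

    A negative exit code means the child was terminated by a signal,
    for example -9 for SIGKILL, which is what the OOM killer sends.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


# Demo with a harmless child process. To measure a real check one could
# pass a command such as ["btrfsck", "/dev/sdX"] instead (device name is
# a placeholder, and root privileges would be needed).
exit_code, peak_kib = run_and_report_peak([sys.executable, "-c", "pass"])

if exit_code < 0:
    print(f"child killed by signal {-exit_code}")
else:
    print(f"child exited with status {exit_code}")
print(f"peak RSS of children: {peak_kib / 1024:.1f} MiB")
```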

Dimitri John Ledkov (xnox) wrote :

Right, I am waiting for my 1TB hard drives to arrive. I suspect that even git master will die as well. If you are able to test with the btrfs kernel/dkms modules and btrfs-tools master (danger do not use one), then by all means email the btrfs mailing list about this. Alternatively, I will run this test and submit a bug upstream when I get my 1TB hard drives.

Arie Skliarouk (skliarie) wrote :

I added 2GB to the machine, so it now has 6GB, and the script finished by itself after using 3457MB of memory:

Incorrect local backref count on 307295731712 root 260 owner 4526738 offset 0 found 0 wanted 1 back 0xc5909180
backpointer mismatch on [307295731712 57344]
owner ref check failed [307295731712 57344]
ref mismatch on [307295789056 53248] extent item 1, found 0
repair deleting extent record: key 307295789056 168 53248
failed to repair damaged filesystem, aborting

It failed to repair the filesystem, but that is a topic for a different bug.

Arie Skliarouk (skliarie) wrote :

I reformatted the 6TB volume and filled it with data (3.5TB). Yesterday there was a power failure and the partition became corrupted.
btrfsck 0.19+20120328-2ubuntu1 used all available memory (6GB) and got killed by the OOM killer:

...
ref mismatch on [5606072983552 8192] extent item 1, found 0
Incorrect local backref count on 5606072983552 root 4095 owner 14571496 offset 0 found 0 wanted 1 back 0x846449d0
backpointer mismatch on [5606072983552 8192]
owner ref check failed [5606072983552 8192]
Errors found in extent allocation tree
checking fs roots
Killed

I then added 2GB more, and after using 6.5GB of memory btrfsck managed to complete, but it did not fix the filesystem:
Errors found in extent allocation tree
checking fs roots
checking root refs
found 3304564289536 bytes used err is 0
total csum bytes: 3195312740
total tree bytes: 30077300736
total fs tree bytes: 24014831616
btree space waste bytes: 7517736113
file data blocks allocated: 3557121048576
 referenced 3557121048576
Btrfs Btrfs v0.19

Is there a special flag for btrfsck to fix the filesystem?
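
The byte counts in the btrfsck summary above are easier to compare with the machine's RAM once converted to binary units. This is a small sketch that parses the "label: number" lines from that output and prints them in human-readable form. The names `parse_summary` and `human` are illustrative, and the parser assumes the exact line layout shown in this comment.

```python
import re

# Summary lines copied from the btrfsck output above.
SUMMARY = """\
found 3304564289536 bytes used err is 0
total csum bytes: 3195312740
total tree bytes: 30077300736
total fs tree bytes: 24014831616
btree space waste bytes: 7517736113
file data blocks allocated: 3557121048576
"""


def parse_summary(text):
    """Map each 'label: number' line to an integer; other lines are skipped."""
    pairs = re.findall(r"^([a-z ]+): (\d+)$", text, flags=re.MULTILINE)
    return {label: int(value) for label, value in pairs}


def human(n):
    """Format a byte count using binary units (KiB, MiB, ...)."""
    units = ["B", "KiB", "MiB", "GiB", "TiB"]
    value = float(n)
    for unit in units:
        if value < 1024 or unit == units[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024


stats = parse_summary(SUMMARY)
for label, value in stats.items():
    print(f"{label:28s} {human(value)}")
```

On these numbers the metadata trees alone ("total tree bytes") come to about 28 GiB, several times the RAM available on the machine.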

Arie Skliarouk (skliarie) wrote :

After some searching I found that the --repair flag of btrfsck causes it to try to fix the filesystem. Yet in my case btrfsck failed to do that:

# btrfsck --repair /dev/sda2
...
Incorrect local backref count on 51589234688 root 4095 owner 6854501 offset 0 found 0 wanted 1 back 0xa00a0980
backpointer mismatch on [51589234688 8192]
owner ref check failed [51589234688 8192]
ref mismatch on [51589242880 8192] extent item 1, found 0
repair deleting extent record: key 51589242880 168 8192
failed to repair damaged filesystem, aborting

I tried zeroing the log, but it did not help:
btrfs-zero-log /dev/sda2

Any ideas I can try before reformatting the partition and starting from scratch?

Dimitri John Ledkov (xnox) wrote : Re: [Bug 992480] Re: fsck.btrfs of a 1TB volume triggers OOM kill on a 1GB RAM machine. (32-bit)

On 25/06/12 17:00, Arie Skliarouk wrote:
>
> Any ideas I can try before reformatting the partition and starting from
> scratch?
>

Read: http://btrfs.wiki.kernel.org
Ask on #btrfs freenode IRC
..

--
Regards,
Dmitrijs.
