rsync between two btrfs filesystems causes machines to crash with symptoms of memory leak

Bug #1238658 reported by Sean Clarke
This bug affects 1 person
Affects | Status | Importance | Assigned to | Milestone
linux (Ubuntu) | Invalid | High | Unassigned |

Bug Description

Using the latest 13.10 (kernel 3.11.0-12-generic), we performed a backup, which is essentially an rsync from a 6-disk array to a USB HDD. Both are BTRFS filesystems.

After several hours the machine was unresponsive, and the screen was full of out-of-memory errors as the kernel killed lots of different processes.
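To gather evidence for a report like this one, it can help to log memory figures while the backup runs, so there is a record up to the point where the machine stops responding. The following is a minimal sketch, not part of the original report. It parses text in the format of /proc/meminfo and prints free memory and kernel slab usage. The sample values are made up, and the assumption that the leak in this bug would show up in the Slab or SUnreclaim fields has not been confirmed.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into {field name: numeric value}.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info


def summary(info: dict) -> str:
    """Build one log line with free memory and slab usage, in MiB."""
    fields = ("MemFree", "Slab", "SUnreclaim")
    return " ".join(f"{f}={info.get(f, 0) // 1024}MiB" for f in fields)


# Made-up sample values, for illustration only (not taken from the affected machine).
SAMPLE = """\
MemTotal:       16384000 kB
MemFree:          204800 kB
Slab:           10240000 kB
SUnreclaim:      9216000 kB
"""

print(summary(parse_meminfo(SAMPLE)))
```

On a live system, the same two functions could be fed the contents of /proc/meminfo once a minute, with the output appended to a file on a disk that is not involved in the backup.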

This was happening at the same time as bug #1237794 - the bugs may be related, they may not.

Sean Clarke (sean-clarke) wrote :

Output from:

ubuntu-bug linux

Sean Clarke (sean-clarke) wrote :

Now running the mainline kernel (3.12.0-999-generic #201310090426) and retesting.

Brad Figg (brad-figg) wrote : Missing required logs.

This bug is missing log files that will aid in diagnosing the problem. From a terminal window please run:

apport-collect 1238658

and then change the status of the bug to 'Confirmed'.

If, due to the nature of the issue you have encountered, you are unable to run this command, please add a comment stating that fact and change the bug status to 'Confirmed'.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: New → Incomplete
Joseph Salisbury (jsalisbury) wrote :

Can you also test the mainline kernel for this bug, as you did in bug 1237794?

Changed in linux (Ubuntu):
importance: Undecided → Medium
tags: added: kernel-da-key saucy
Changed in linux (Ubuntu):
importance: Medium → High
Sean Clarke (sean-clarke) wrote :

Installed the 3.12.0-999-generic #201310090426 kernel; the backup now runs and completes as expected.

Adding tag as instructed

tags: added: kernel-fixed-upstream
Changed in linux (Ubuntu):
status: Incomplete → Confirmed
Joseph Salisbury (jsalisbury) wrote :

We can perform a "Reverse" kernel bisect to identify the commit that fixes this in the upstream kernel. Can you first test the latest 3.11 upstream stable kernel, to see if the fix in mainline was already sent to stable? The latest 3.11 stable kernel is available from:

http://kernel.ubuntu.com/~kernel-ppa/mainline/v3.11.5-saucy/
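For readers unfamiliar with the term, a "reverse" bisect searches for the first build that contains the fix, rather than the first build that introduced a bug. The search logic can be sketched as below. The build names between the two kernels from this report are hypothetical placeholders, and the `is_fixed` callback stands in for the manual step of installing a build and re-running the backup. The sketch assumes the builds are ordered oldest to newest, the oldest is known broken, the newest is known fixed, and the fix is never reverted in between.

```python
from typing import Callable, Sequence


def first_fixed(versions: Sequence[str], is_fixed: Callable[[str], bool]) -> str:
    """Return the earliest version for which is_fixed() is True.

    Requires at least two versions: versions[0] known broken,
    versions[-1] known fixed.
    """
    lo, hi = 0, len(versions) - 1  # lo: known broken, hi: known fixed
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_fixed(versions[mid]):
            hi = mid  # the fix landed at or before mid
        else:
            lo = mid  # the fix landed after mid
    return versions[hi]


# Hypothetical ordering of builds between the two kernels named in this report.
builds = ["3.11.0-12", "build-a", "build-b", "build-c", "build-d", "3.12.0-999"]
fixed_from = "build-c"  # made-up answer, used only to simulate test results


def simulated_test(version: str) -> bool:
    """Stand-in for installing a build and checking whether the backup completes."""
    return builds.index(version) >= builds.index(fixed_from)


print(first_fixed(builds, simulated_test))
```

Each round halves the remaining range, so even a long list of candidate builds needs only a handful of test installs.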

tags: removed: kernel-fixed-upstream
Sean Clarke (sean-clarke) wrote :

Installed 3.12.0-999-generic #201310170405 and retrying.

Sean Clarke (sean-clarke) wrote :

OK, it is reproducible - I have a fileserver with 6x 3TB disks in a BTRFS RAID 1+0 configuration.

From a client (and using NFS) I copy a 95GB tar file from the fileserver to a USB HD.

It seems that at the very end (when BTRFS deletes the 95GB file on the server) it falls over - btrfs-transaction and btrfs-flush_del are using 3 to 5 cores at 50 to 100%.

Joseph Salisbury (jsalisbury) wrote :

It sounds like you got two separate results with the upstream 3.12 kernel, per comments #5 and #7/8. Was your testing in comment #8 done with the 3.12 kernel, or with the latest 3.11 kernel suggested in comment #6?

Sean Clarke (sean-clarke) wrote :

Sorry Joseph, I should have been clearer. It was with the mainline kernel 3.12.0-999-generic #201310090426; I never got around to trying the rc5 kernel as you suggested - this is my company fileserver, so I cannot just make immediate changes (unless of course it has just crashed and I am recovering).

I think the lock-up problem now is related to bug #1237794, which I am also looking at. I have not encountered any more memory issues, so if you wish you could close this as fixed upstream and I will continue with #1237794.

Sean Clarke (sean-clarke) wrote :

This problem was not reproducible with the 3.12 kernel; I am happy to have it closed as fixed in 3.12.

Changed in linux (Ubuntu):
status: Confirmed → Invalid