Existing eCryptfs inodes are not evicted when they're the target of a rename()/mv

Bug #561129 reported by kurapix on 2010-04-12
This bug affects 9 people
Affects                  Status  Importance  Assigned to  Milestone
eCryptfs                         Medium      Tyler Hicks
ecryptfs-utils (Ubuntu)          Undecided   Unassigned
linux (Ubuntu)                   Undecided   Unassigned

ecryptfs-utils (Ubuntu) and linux (Ubuntu) are each nominated for Oneiric, Precise and Quantal by Tyler Hicks.

Bug Description

NOTE: A test case for this bug has been created at tests/kernel/lp-561129.sh (revno 731) in the upstream ecryptfs-utils project.

This bug is the result of existing eCryptfs inodes not being evicted when they are the target of a rename() syscall. The existing inodes are left around, meaning that the lower inodes are also left around, until the eCryptfs filesystem is unmounted. This means that disk space is not properly freed when mv'ing a file on top of another file.
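
The access pattern described above can be sketched as a small shell loop. This is a hypothetical illustration, not the upstream tests/kernel/lp-561129.sh test: the helper names avail_kib and rename_over, the file names, and the sizes are all assumptions made for the example. By default it runs in a scratch directory; pass a directory on an eCryptfs mount as the first argument to exercise the behavior described here.

```shell
#!/bin/sh
# Hypothetical sketch (not the upstream lp-561129.sh test): repeatedly
# rename a freshly written file over an existing one and report how the
# available space on the filesystem holding DIR changes.

# Available space, in KiB, on the filesystem containing $1.
avail_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# rename_over DIR COUNT SIZE_KIB: write COUNT new files of SIZE_KIB each
# and mv every one of them on top of the same target name.
rename_over() {
    dir=$1 count=$2 size_kib=$3
    i=0
    while [ "$i" -lt "$count" ]; do
        dd if=/dev/zero of="$dir/new" bs=1024 count="$size_kib" 2>/dev/null
        mv -f "$dir/new" "$dir/target"   # rename() onto an existing file
        i=$((i + 1))
    done
}

# DIR defaults to a scratch directory (an assumption for this sketch).
DIR=${1:-$(mktemp -d)}
before=$(avail_kib "$DIR")
rename_over "$DIR" 10 256
sync
after=$(avail_kib "$DIR")
echo "available before: ${before} KiB, after: ${after} KiB"
```

On an unaffected filesystem the drop should stay close to the size of the one surviving file. With the behavior described in this report, the expectation would be a drop closer to COUNT times that size, recovered only after unmounting.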

I've verified that 2.6.39 and newer kernels are affected.

Here's the original bug report:
---

1 - Try to download some Ubuntu DVD versions with bit torrent (so it reserves space).
2 - Fill your disk to the maximum and leave something like 5GB.
3 - Wait
4 - Disk space will go to 0 (it doesn't make any sense since space has already been reserved)
5 - Ctrl+Alt+F1

There we can see some messages about ecryptfs :
ecryptfs_write_lower: octets_written [-28]; expected [4096]
ecryptfs_encrypt_page: Error Attempting to write lower page; rc = [-22]
[...]

I don't really know if it's really ecryptfs, but it doesn't make any sense that disk space just gets used up like this when only my Firefox browser and BitTorrent client are open and not trying to write anything else.

While your disk space is at 0, the system is slow as hell and it swaps quite a lot (even though your RAM isn't full).

Thanks for the report. I'm seeing some funny behavior using bit
torrent too. My disk isn't going to 0. But things have slowed down.

Thanks for the report. We have a handful of issues around torrents inside of ecryptfs that we need to get cleaned up. I suspect all of them come down to the same one or two fundamental issues. Marking incomplete until we get a handle on the problem.

Changed in ecryptfs:
importance: Undecided → Medium
tags: added: torrent
summary: - ecryptfs suck up disk space and doesn't seem to use swap
+ ecryptfs suck up disk space and doesn't seem to use swap while
+ downloading torrent
Changed in ecryptfs:
status: New → Incomplete
Adam Porter (alphapapa) wrote :

Since upgrading to Oneiric I am seeing this. I am not using torrents, only Firefox. One minute I have 700 MB free, or even 2+GB free...the next moment I have 37 MB free! Then a minute later, zero bytes free! I delete some files, free up some space, and then a few minutes later, 0 bytes free again! I fscked the fs, and then I had 2.9 GB free. But a few minutes of Firefox and, bam, 0 free space. This is crazy. It never happened with Natty, but as soon as I upgraded to Oneiric, boom. This is utterly unusable.

Adam Porter (alphapapa) wrote :

This is a regression in Oneiric and/or the 3.0.0 kernel. This doesn't happen using the 2.6.38 kernel from Natty. Here's the sequence of events:

1. Boot 3.0.0 Oneiric kernel. ~1.9 GB free space.
2. Log in to user with ecryptfs home.
3. Do random ops in the filesystem. Free space remains the same.
4. $(du -hcs ~/.mozilla/firefox) ~= 400 MB.
5. Start Firefox. After it loads, free space is down to ~300 MB (1.6 GB less), but ~/.mozilla/firefox is still ~400MB.
6. Free space continues to decrease. A minute later, it's at 0 bytes free. ~/.mozilla/firefox is still ~400MB.
(6a. plasma-desktop hangs with 0 bytes free space [another regression], have to log out through DBUS call to ksmserver or use SAK, etc)
7. After logging out, free space still at 0 bytes.
8. Reboot into 2.6.38 kernel from Natty (rest of system still Oneiric). Free space is back at ~1.9 GB!
9. Log in to user with ecryptfs home.
10. Do random ops. Free space the same.
11. Start Firefox. After it loads, free space still ~1.9 GB.
12. Use Firefox for a while. Free space still ~1.9 GB.
13. Conclude that there's an ecryptfs bug in 3.0.0 being triggered by Firefox somehow.

I ran $(lsof | grep firefox) while running 3.0.0 and at 0 bytes free and checked with du all the directories Firefox could be writing to (i.e. not /usr/lib, /usr/share, etc), but none were using an abnormal amount of space, certainly nothing adding up to anywhere near 1.9 GB.

I did not note any ecryptfs errors in dmesg before it reached 0 bytes free. I did not run fsck between switching kernels this time. A few days ago I did run fsck several times but always found the fs to be clean in spite of this bug.

It seems strange that upon rebooting the free space is available again. Perhaps it's ecryptfs interacting with ext4 in some way to incorrectly allocate or preallocate space?
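
The comparison made in the steps above (free space shrinking while du on the profile directory stays flat) can be sampled with a short script. This is a sketch only: the helper names used_kib and tree_kib, the default target of the current directory, and the three one-second samples are assumptions chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper, not part of ecryptfs-utils: sample what df and du
# report so that a gap between the two becomes visible over time.

# KiB in use on the filesystem containing $1 (df's view).
used_kib() {
    df -Pk "$1" | awk 'NR == 2 { print $3 }'
}

# KiB occupied by the files visible under $1 (du's view).
tree_kib() {
    du -sk "$1" 2>/dev/null | awk '{ print $1 }'
}

# TARGET is an assumption; pass e.g. ~/.mozilla/firefox as the argument.
TARGET=${1:-.}
SAMPLES=3
n=0
while [ "$n" -lt "$SAMPLES" ]; do
    echo "$(date +%T) df-used=$(used_kib "$TARGET") KiB du=$(tree_kib "$TARGET") KiB"
    n=$((n + 1))
    sleep 1
done
```

If the df figure keeps climbing while the du figure stays level, the space is being held by something that is no longer visible in the directory tree, which matches the symptom reported here.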

Changed in ecryptfs:
status: Incomplete → Confirmed
summary: - ecryptfs suck up disk space and doesn't seem to use swap while
- downloading torrent
+ eCryptfs sucks up all disk space with Oneiric kernel
tags: added: firefox oneiric regression

I think this bug belongs to the kernel and not the userland package.

tags: added: kernel
affects: ecryptfs → linux (Ubuntu)
no longer affects: ecryptfs-utils (Ubuntu)
Adam Porter (alphapapa) wrote :

Tyler, I saw your work on other eCryptfs bugs, so I added you to this one. I see that other bugs like #842647 are assigned to eCryptfs and ecryptfs-utils, so I'm switching it back.

affects: linux (Ubuntu) → ecryptfs
Changed in ecryptfs-utils (Ubuntu):
status: New → Confirmed

On 2012-01-11 07:06:09, Adam Porter wrote:
> 1. Boot 3.0.0 Oneiric kernel. ~1.9 GB free space.

Can you clarify what you mean by "free space" throughout your bug
comment? I suppose it is disk space, but would like to be sure that
you're not talking about memory usage.

Yes, free disk space on the ext4 root partition /dev/sda5 in which /home/username/.ecryptfs resides. Memory usage is not an issue here.

Chun-Yu (cshei) wrote :

I'm now seeing the same thing on a Gentoo system running the vanilla 3.3.0 kernel and ext4. I tried rebooting into 3.2.11, and still see the same problem. I've been running this setup for quite a while, so somewhere along the line, something broke recently.

As soon as I start Firefox 11, my free space constantly decreases until it hits 0, making eCryptfs unusable. Logging out and killing all my user's processes doesn't help, and looking for deleted files that were still open with "lsof +L1" showed nothing. As soon as I unmounted my encrypted home directory, though, the amount of free space returned to normal.

I was also able to reproduce this on an eCryptfs on ZFS filesystem, so it's not restricted to ext4.

Chun-Yu (cshei) wrote :

Update:

Here's an example of this happening (as root, after logging in as my user, running firefox, and logging out):

# lsof +L1
# df -k /home/cshei
Filesystem 1K-blocks Used Available Use% Mounted on
/home/cshei/.Private 26172888 19455440 5405296 79% /home/cshei
# umount /home/cshei
# df -k /home/cshei
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/root 26172888 19333016 5527720 78% /

Also, if I move ~/.mozilla to a separate (unencrypted) filesystem and symlink it, it seems to slow/stop(?) the bleeding.
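
For reference, the amount of space released by the umount can be worked out from the two df -k outputs above. The variable names are hypothetical; the four numbers are copied from that output.

```shell
# Figures copied from the df -k output above (1K-blocks).
used_mounted=19455440      # "Used" while eCryptfs was mounted
used_unmounted=19333016    # "Used" right after umount
avail_mounted=5405296      # "Available" while mounted
avail_unmounted=5527720    # "Available" right after umount

released_kib=$((used_mounted - used_unmounted))
regained_kib=$((avail_unmounted - avail_mounted))
echo "Used dropped by ${released_kib} KiB"        # 122424
echo "Available grew by ${regained_kib} KiB"      # 122424
echo "That is about $((released_kib / 1024)) MiB" # 119
```

Both columns move by the same 122424 KiB, so roughly 119 MiB was held until the eCryptfs mount went away, even though lsof +L1 listed no open deleted files.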

Adam Porter (alphapapa) wrote :

Thanks for chiming in on this, Chun-Yu--I didn't think I could really be the only one seeing it.

James Gifford (jamesgifford) wrote :

I'm seeing this on latest precise as well. However, I'm not using Firefox, I'm using chrome. Not sure if I'm having the same issue, but it seems related.

hexafraction (rarkenin) wrote :

I can confirm in an old, well-used virtual machine. After Firefox use, free space decreases from ~2.7GB to 1 GB, then no space at all within another few minutes.

Tyler Hicks (tyhicks) wrote :

Oddly enough, I've never been able to reproduce this bug until today while I was testing a fix for another bug. I've written a fix and built a set of amd64 test kernels, one for Precise and one for Quantal. If you're interested in giving one of the test kernels a try, you can find them here:

http://people.canonical.com/~tyhicks/ecryptfs/fixes/

Changed in ecryptfs:
status: Confirmed → In Progress
description: updated
summary: - eCryptfs sucks up all disk space with Oneiric kernel
+ Existing eCryptfs inodes are not evicted when they're the target of a
+ rename()/mv
Tyler Hicks (tyhicks) on 2012-09-13
description: updated
Changed in ecryptfs-utils (Ubuntu):
status: Confirmed → Invalid
Changed in linux (Ubuntu):
status: New → Triaged
Tyler Hicks (tyhicks) on 2012-09-14
description: updated
Colin Ian King (colin-king) wrote :

I've given this a good soak test on Quantal and Precise, and applied the patches to Oneiric. For each, I tested against ext2, ext3, ext4, xfs and btrfs lower file systems, and the fix does solve this bug. Thanks Tyler.

Tyler Hicks (tyhicks) on 2012-09-14
Changed in ecryptfs:
assignee: nobody → Tyler Hicks (tyhicks)
Colin Ian King (colin-king) wrote :

SRU for Oneiric and Precise, and apply to Quantal too.

== SRU Justification ==

== Impact ==

This bug is the result of existing eCryptfs inodes not
being evicted when they are the target of a rename()
syscall. The existing inodes are left around, meaning
that the lower inodes are also left around, until the
eCryptfs filesystem is unmounted. This means that disk
space is not properly freed when mv'ing a file on top
of another file.

This can be triggered for example by:

1 - Try to download some Ubuntu DVD versions with bit
    torrent (so it reserves space).
2 - Fill your disk to the maximum and leave something
    like 5GB.
3 - Wait
4 - Disk space will go to 0 (it doesn't make any sense
    since space has already been reserved)
5 - Ctrl+Alt+F1

There we can see some messages about ecryptfs :
ecryptfs_write_lower: octets_written [-28]; expected [4096]
ecryptfs_encrypt_page: Error Attempting to write lower page; rc = [-22]

== Fix ==

Apply commit 8335eafc2859e1a26282bef7c3d19f3d68868b8a

== Test Case ==

Can be tested on various file systems using the ecryptfs tests (from
lp:ecryptfs).

sudo mkdir /tmp/image /lower /upper
sudo ./tests/run_tests.sh -K -c safe -b 1000000 -D /tmp/image \
-l /lower -u /upper -f ext2,ext3,ext4,xfs,btrfs -t lp-561129.sh

Without the fix, this test fails; with the fix, it passes.


I have finally discovered what's going on here--at least, somewhat. I
discovered that, when I encounter this bug, the Adblock Plus Firefox
extension is in some kind of loop constantly rewriting one of its
filter ini files. If I constantly $(ls -l) in the directory, I can
see the same file constantly being rewritten, with its size going to
zero, then up to the proper size, and back down to zero, then back
up... It's the same filename being rewritten over and over again, and
the file is about 1.5 MB, so that explains the slow but steady
apparent loss of gigs of space. Since the space is recovered on
reboot (or perhaps on logout when the eCryptfs volume is closed), I'm
guessing it's some kind of bug related to truncating files and it not
releasing the space when the file is truncated, so the rewriting
eventually uses all available space. I can imagine a similar
situation happening with BitTorrent disk I/O patterns.

I haven't tried to reproduce the bug manually, outside of Firefox, but
perhaps this info will help point you in the right direction. I'm not
completely sure if the ABP extension is still exhibiting this
behavior--perhaps an update to it has fixed its looping bug.
Incidentally, in the few instances of my using torrents in eCryptfs, I
haven't had any problems.
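
As a rough sanity check on the "slow but steady" loss described above, the number of rewrites needed to exhaust the disk can be estimated. This is only back-of-the-envelope arithmetic: it combines the ~1.5 MB file size from this comment with the ~1.9 GB of free space reported earlier in the thread, and it assumes that every rewrite leaks one full copy of the file. The variable names are hypothetical.

```shell
# Rough estimate only; both inputs are approximate figures quoted in
# this thread, rounded to whole megabytes.
free_mb=1900          # ~1.9 GB free before Firefox starts
file_tenths_mb=15     # ~1.5 MB filter file, expressed in tenths of a MB

# Integer arithmetic: (1900 MB * 10) / 15 tenths-of-MB per rewrite.
rewrites=$((free_mb * 10 / file_tenths_mb))
echo "about ${rewrites} rewrites would consume ${free_mb} MB"
```

Under those assumptions, somewhat over a thousand rewrites of the file would be enough to use up the free space, which is plausible for an extension stuck in a rewrite loop.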

Thanks for your work on this bug, Tyler.

Adam Porter (alphapapa) wrote :

Haha, I noticed that Tyler tracked down the real reason...after I wrote my last comment. Oh well, I was close. Thanks for fixing this! I hope Ubuntu will release it soon!

Tyler Hicks (tyhicks) wrote :
Changed in ecryptfs:
status: In Progress → Fix Released
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 3.5.0-15.20

---------------
linux (3.5.0-15.20) quantal-proposed; urgency=low

  [ Tim Gardner ]

  * rebase to v3.5.4
  * SAUCE: CONFIG_HID_BATTERY_STRENGTH=y
    - LP: #1003090

  [ Upstream Kernel Changes ]

  * eCryptfs: Copy up attributes of the lower target inode after rename
    - LP: #561129
  * eCryptfs: Write out all dirty pages just before releasing the lower
    file
    - LP: #1047261
  * eCryptfs: Call lower ->flush() from ecryptfs_flush()
    - LP: #1047261
  * af_netlink: force credentials passing [CVE-2012-3520]
    - LP: #1052097
    - CVE-2012-3520
  * drm/i915: clarify IBX dp workaround
    - LP: #1011440
  * drm/i915: Implement w/a for sporadic read failures on waking from rc6
    - LP: #1011440
  * drm/i915: support Haswell force waking
    - LP: #1011440
  * drm/i915: add RPS configuration for Haswell
    - LP: #1011440
  * drm/i915: enable RC6 by default on Haswell
    - LP: #1011440
  * drm/i915: introduce haswell_init_clock_gating
    - LP: #1011440
  * drm/i915: enable RC6 workaround on Haswell
    - LP: #1011440
  * drm/i915: re-initialize DDI buffer translations after resume
    - LP: #1011440
  * drm/i915: fix PIPE_DDI_PORT_MASK
    - LP: #1011440
  * drm/i915: try to train DP even harder
    - LP: #1011440
  * drm/i915: add more Haswell PCI IDs
    - LP: #1011440
  * rebase to v3.5.4
    - LP: #1038651
 -- Leann Ogasawara <email address hidden> Mon, 17 Sep 2012 13:41:39 -0700

Changed in linux (Ubuntu):
status: Triaged → Fix Released
Colin Ian King (colin-king) wrote :

Tested on oneiric -proposed with ext2,ext3,ext4,xfs and btrfs lower file systems:

uname -a
Linux ubuntu 3.0.0-26-server #43-Ubuntu SMP Tue Sep 25 17:37:40 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

ubuntu@ubuntu:~/ecryptfs$ sudo ./tests/run_tests.sh -K -c safe -b 1000000 -D /tmp/image -l /lower -u /upper -t lp-561129.sh -f ext2,ext3,ext4,xfs,btrfs
Running eCryptfs filesystem tests on ext2
lp-561129 pass
Running eCryptfs filesystem tests on ext3
lp-561129 pass
Running eCryptfs filesystem tests on ext4
lp-561129 pass
Running eCryptfs filesystem tests on xfs
lp-561129 pass
Running eCryptfs filesystem tests on btrfs
lp-561129 pass

Test Summary:
5 passed
0 failed

Luis Henriques (henrix) wrote :

As per comment #21, I'm tagging this bug as verified in oneiric.

tags: added: verification-done-oneiric
Luis Henriques (henrix) wrote :

This bug is awaiting verification that the kernel for Precise in -proposed solves the problem (3.2.0-32.51). Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-precise' to 'verification-done-precise'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-precise

On Fri, Sep 28, 2012 at 8:40 AM, Luis Henriques
<email address hidden> wrote:
> If verification is not done by one week from today, this fix will be
> dropped from the source code, and this bug will be closed.

This is yet another example of Ubuntu's absolutely horrid bug handling
practices.

Here we have a bug that has been verified to exist by multiple people,
including the developer himself. The bug can cause a filesystem to
fill up completely, which causes all sorts of software to fail,
including causing programs to fail to exit, fail to save preferences,
fail to save user data (DATA LOSS), and even interfere with logging
out and shutting down the system, which is necessary to restore the
free space.

The developer has not only confirmed the bug, but has patched it and
released a patch upstream. Now the patch is ready to be tested for
release in the LONG TERM SUPPORT release, so that this LTS release
might actually BE a dependable release.

And what happens next? Non-specific people interested in the fixing
of the bug are virtually threatened with extortion!

I have very little free time right now to devote to reporting and
following up on and testing bugs. It may be one or two or even more
weeks between times that I am able to go through my bug-related
emails. I could easily not even notice this message until the time
had expired!

This bug is of CRITICAL importance. But what happens if no one is
able to test it in this arbitrary time limit? The fix will be
dropped, and THE BUG WILL BE CLOSED! CLOSED! This is a CONFIRMED bug
with SERIOUS consequences, and it will be CLOSED as if it were fixed!

There is absolutely NO EXCUSE for this unconscionable behavior.

1. A one-week time limit is not enough. There are plenty of reasons
why a fix like this might not be tested that quickly.
2. If the time limit does expire, it should be trivial to reopen the
testing process.
3. Whether or not anyone EVER tests the fix and reports on it, a bug
should NEVER, EVER be closed until it is FIXED! This goes DOUBLE for
bugs like this of CRITICAL IMPORTANCE!

What is Ubuntu thinking? "Oh well, no one has tested the fix yet, and
I'm too impatient to wait, so it must not matter anymore. Let's just
pretend it never happened, and we can all go on working in ignorant
bliss."

I don't know what the motivation behind this policy is, but that's the
way it comes off to people who spend their VALUABLE TIME helping to
investigate and fix bugs like this. And regardless of the motivation,
it is completely unreasonable.

I have said it so many times, and I will say it again: Ubuntu should
learn from Debian. Bugs in Debian are never, ever closed unless
they're truly fixed or the software is removed from the archive. A
bug might remain open for years, but until it's fixed, it is
documented and remains "open" for others to reference. Debian is
honest; Debian has integrity; Debian values truth over statistics and
convenience. Debian does not disparage the contributions of its
users--unlike Ubuntu, which makes unreasonable threats such as this,
while saying "Thank you for helping to make Ubuntu better" out of the
other side of its mouth.

Again, I do...

Colin Ian King (colin-king) wrote :

Tested on Precise -proposed 3.2.0-32-generic #51-Ubuntu with ext2,ext3,ext4,xfs and btrfs lower file system:

ubuntu@ubuntu:~/ecryptfs$ sudo ./tests/run_tests.sh -K -c safe -b 1000000 -D /tmp/image -l /lower -u /upper -f ext2,ext3,ext4,xfs,btrfs -t lp-561129.sh
Running eCryptfs filesystem tests on ext2
lp-561129 pass
Running eCryptfs filesystem tests on ext3
lp-561129 pass
Running eCryptfs filesystem tests on ext4
lp-561129 pass
Running eCryptfs filesystem tests on xfs
lp-561129 pass
Running eCryptfs filesystem tests on btrfs
lp-561129 pass

Test Summary:
5 passed
0 failed

tags: added: verification-done-precise
removed: verification-needed-precise

The verification of this Stable Release Update has completed successfully and the package has now been released to -updates. Subsequently, the Ubuntu Stable Release Updates Team is being unsubscribed and will not receive messages about this bug report. In the event that you encounter a regression using the package from -updates, please report a new bug using ubuntu-bug and tag the bug report regression-update so we can easily find any regressions.
