ecryptfs sometimes seems to add trailing garbage to encrypted files

Reported by Erik Carstensen on 2010-01-18
This bug affects 232 people
Affects          Importance   Assigned to
eCryptfs         High         Tyler Hicks
linux (Ubuntu)   Medium       Unassigned
Lucid            Undecided    Colin King
Maverick         Undecided    Tim Gardner
Natty            Undecided    Tim Gardner
Oneiric          Medium       Unassigned
Precise          Undecided    Unassigned
Quantal          Medium       Unassigned

Bug Description

Quite frequently (about once per month), a file in my ecryptfs-encrypted home directory gets a few KiBs of extra trailing garbage bytes (it's usually padded up to about 12 KiB). I have only noticed the error in git repositories so far, probably because git creates a huge number of files, and because it doesn't tend to ignore trailing garbage anywhere.

The trailing garbage usually consists mostly of zero bytes; sometimes I have also seen it contain a copy of parts of the original file.

If I re-mount the ecryptfs volume (by logging out and logging in again), the trailing garbage always disappears; this is why I think it's caused by an ecryptfs bug. I cannot rule out faulty RAM, either (I have only reproduced it on my laptop, which doesn't have ECC RAM).

I'm using x86-64 Ubuntu 9.10; my ecryptfs volume resides on an ext4 partition.

I understand that it's impossible for you to reproduce the problem given this report, but I'm willing to put some effort into tracking down the cause of this. Do you have any ideas on how I can extract useful debugging information the next time the problem occurs?

Dustin Kirkland  (kirkland) wrote :

Tyler-

You've chased down a few of these git-and-bzr-on-ecryptfs issues before... Any chance you can take a look at this?

Changed in ecryptfs:
importance: Undecided → High
assignee: nobody → Dustin Kirkland (kirkland)
assignee: Dustin Kirkland (kirkland) → Tyler Hicks (tyhicks)
deja_vu (deja-vu) wrote :

I've also run into this a few times (and again just now). The files in question were padded up to 12K, also with the original file data (plus zeros).

I'm using Debian testing with a 2.6.32.3 kernel, also on an ext4 partition.

Oh, wait ... are we talking about encrypted files?

All encrypted files are padded by ecryptfs. That's by design.

Are you seeing any bad data in your cleartext?

deja_vu (deja-vu) wrote :

The cleartext is affected, which in my case Unison catches when I try to synchronise two computers.

I haven't tried remounting the partition, but copying the lower file somewhere else allows me to recover the original file.

Dustin Kirkland  (kirkland) wrote :

Okay, then I am interested in Tyler's take.

Tyler Hicks (tyhicks) wrote :

Erik and deja_vu - next time you see this, please do the following and report your findings:

1.) Run `stat` on the decrypted file and paste the results.
2.) Run `stat` on the encrypted file and paste the results.
3.) Run `hexdump -Cn 8` against the encrypted file and paste the results.
4.) Remount the eCryptfs mount.
5.) Repeat steps 1 through 3.

Unfortunately, if you're using encrypted filenames, it is going to be difficult to figure out the encrypted filenames in order to run stat and hexdump against those files.

Also, if you don't mind running a bleeding edge kernel, I rewrote the previously buggy truncate path and that patch (http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=5f3ef64f4da1c587cdcfaaac72311225b7df094c) was released in 2.6.33-rc5. If I had to put my money on it, I'd guess that this is happening after truncating the eCryptfs inode.

I ran the git test suite in an eCryptfs mount several times and it only complained about the same tests that failed on plain ext4.

Dustin Kirkland  (kirkland) wrote :

If you're trying to map encrypted -> decrypted filenames, I use this
nasty little hack...

Chmod the file to a really odd permission, like "chmod 123 foo".

Then use find to locate your oddly permissioned file:
 find . -perm 123

Nasty, yes, but it works quite well.

:-Dustin
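
An alternative to the permission trick can be sketched in Python. It assumes that eCryptfs reports the lower file's inode number for the decrypted file; the `stat` output later in this thread shows the same inode (474495) for both, but this is not verified here for every kernel version. The name `find_by_inode` and the example paths are illustrative only and not part of any eCryptfs tool.

```python
import os

def find_by_inode(lower_root, inode):
    """Return all paths under lower_root whose inode number equals inode."""
    matches = []
    for dirpath, dirnames, filenames in os.walk(lower_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                if os.lstat(path).st_ino == inode:
                    matches.append(path)
            except OSError:
                # The file vanished or cannot be examined; skip it.
                continue
    return matches

# Example usage (both paths are placeholders):
#   ino = os.lstat("/home/user/some/decrypted/file").st_ino
#   print(find_by_inode("/home/.ecryptfs/user/.Private", ino))
```

Unlike the chmod approach, this does not modify the file, which may matter when the goal is to leave a suspect file untouched.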

Erik Carstensen (sandberg) wrote :

I got the error again today, here's the output of the commands you suggested:

Before remount:

$ stat decrypted
  File: `/home/sandberg/4.2/.git/objects/49/776103dc27a460c5210dc388c98f11658a272c'
  Size: 12288 Blocks: 24 IO Block: 4096 regular file
Device: 19h/25d Inode: 474495 Links: 1
Access: (0444/-r--r--r--) Uid: ( 7654/sandberg) Gid: ( 1000/sandberg)
Access: 2010-02-05 11:45:12.000000000 +0100
Modify: 2010-02-02 18:12:07.844575172 +0100
Change: 2010-02-05 11:44:59.413928291 +0100
$ stat encrypted
  File: `/home/.ecryptfs/sandberg/.Private/ECRYPTFS_FNEK_ENCRYPTED.FWaZJpbmp..tZUREe43P.qrEcyavEC2QZFw1ykQVXjRxZSzscBFVoMzQsk--/ECRYPTFS_FNEK_ENCRYPTED.FWaZJpbmp..tZUREe43P.qrEcyavEC2QZFw10yDq6kkbjn9lr7qrnxlAvk--/ECRYPTFS_FNEK_ENCRYPTED.FWaZJpbmp..tZUREe43P.qrEcyavEC2QZFw1la465N-wDANSG0ml4M8DpU--/ECRYPTFS_FNEK_ENCRYPTED.FWaZJpbmp..tZUREe43P.qrEcyavEC2QZFw107I-eQq24En6jzyC.APn1E--/ECRYPTFS_FNEK_ENCRYPTED.FYaZJpbmp..tZUREe43P.qrEcyavEC2QZFw1zNb4HZf1uQT2mFtWKsF287cTiKR8wMOGzmCY2IvqAtFOyyN3siOt83eXT.ooIMCf'
  Size: 12288 Blocks: 24 IO Block: 4096 regular file
Device: 806h/2054d Inode: 474495 Links: 1
Access: (0444/-r--r--r--) Uid: ( 7654/sandberg) Gid: ( 1000/sandberg)
Access: 2010-02-05 11:45:12.873957850 +0100
Modify: 2010-02-02 18:12:07.844575172 +0100
Change: 2010-02-05 11:44:59.413928291 +0100
$ hexdump -Cn 8 encrypted
00000000 00 00 00 00 00 00 00 cd |........|
00000008

After remount:
$ stat decrypted
  File: `/home/sandberg/4.2/.git/objects/49/776103dc27a460c5210dc388c98f11658a272c'
  Size: 205 Blocks: 24 IO Block: 4096 regular file
Device: 16h/22d Inode: 474495 Links: 1
Access: (0444/-r--r--r--) Uid: ( 7654/sandberg) Gid: ( 1000/sandberg)
Access: 2010-02-05 11:45:12.873957850 +0100
Modify: 2010-02-02 18:12:07.844575172 +0100
Change: 2010-02-05 11:44:59.413928291 +0100
$ stat encrypted
  File: `/home/.ecryptfs/sandberg/.Private/ECRYPTFS_FNEK_ENCRYPTED.FWaZJpbmp..tZUREe43P.qrEcyavEC2QZFw1ykQVXjRxZSzscBFVoMzQsk--/ECRYPTFS_FNEK_ENCRYPTED.FWaZJpbmp..tZUREe43P.qrEcyavEC2QZFw10yDq6kkbjn9lr7qrnxlAvk--/ECRYPTFS_FNEK_ENCRYPTED.FWaZJpbmp..tZUREe43P.qrEcyavEC2QZFw1la465N-wDANSG0ml4M8DpU--/ECRYPTFS_FNEK_ENCRYPTED.FWaZJpbmp..tZUREe43P.qrEcyavEC2QZFw107I-eQq24En6jzyC.APn1E--/ECRYPTFS_FNEK_ENCRYPTED.FYaZJpbmp..tZUREe43P.qrEcyavEC2QZFw1zNb4HZf1uQT2mFtWKsF287cTiKR8wMOGzmCY2IvqAtFOyyN3siOt83eXT.ooIMCf'
  Size: 12288 Blocks: 24 IO Block: 4096 regular file
Device: 806h/2054d Inode: 474495 Links: 1
Access: (0444/-r--r--r--) Uid: ( 7654/sandberg) Gid: ( 1000/sandberg)
Access: 2010-02-05 11:45:12.873957850 +0100
Modify: 2010-02-02 18:12:07.844575172 +0100
Change: 2010-02-05 11:44:59.413928291 +0100
$ hexdump -Cn 8 encrypted
00000000 00 00 00 00 00 00 00 cd |........|
00000008

I also saved copies of the decrypted and encrypted files from before and after the remount, if you are interested (it's a rather uninteresting git tree object).

Tyler Hicks (tyhicks) wrote :

Hey Erik - thanks for the *great* report!

So the eCryptfs inode's i_size is out of sync from what is stored in the eCryptfs metadata (the first 8 bytes of the encrypted file). Some initial guesses are that we're missing an i_size_write() somewhere (maybe down an error path?) or we could be incorrectly passing the lower inode's i_size to i_size_write() since the upper and lower i_sizes are the same.

Are you seeing any eCryptfs error messages in the logs? I'll begin looking at the suspect code paths.
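
The mismatch described above can be checked with a short Python sketch. It assumes the size is stored as an 8-byte big-endian integer at the start of the lower file, which is inferred from the `hexdump` output above (`00 00 00 00 00 00 00 cd`, i.e. 205, the size reported after the remount). The function names are hypothetical.

```python
import os
import struct

def lower_metadata_size(lower_path):
    """Read the first 8 bytes of the encrypted (lower) file as a big-endian size."""
    with open(lower_path, "rb") as f:
        header = f.read(8)
    if len(header) < 8:
        raise ValueError("lower file is shorter than 8 bytes")
    return struct.unpack(">Q", header)[0]

def size_mismatch(upper_path, lower_path):
    """Return (reported, stored) if the two sizes disagree, else None."""
    reported = os.stat(upper_path).st_size      # what the eCryptfs mount reports
    stored = lower_metadata_size(lower_path)    # what the metadata says
    return None if reported == stored else (reported, stored)
```

For the file in Erik's report, such a check would be expected to return `(12288, 205)` before the remount and `None` afterwards.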

Changed in ecryptfs:
status: New → Confirmed
Erik Carstensen (sandberg) wrote :

OK, I found a few ecryptfs errors in the logs. I'm attaching the output of 'grep ryptfs syslog'.

During the logged period, the computer was never rebooted, but it was suspended a couple of times.

It seems that the errors started on Feb 3, and that a pair of messages (valid ecryptfs headers not found etc.) appears about every 5 minutes except when the computer was suspended. It might be of interest that the computer was turned on between Feb 4 09:50 and Feb 4 16:12:20, with no messages displayed.

There is a bunch of different messages from ecryptfs_read_lower/ecryptfs_readpage/ecryptfs_decrypt_page on Feb 4. If I recall correctly, I didn't notice the broken git files until Feb 5, though.

Erik Carstensen (sandberg) wrote :

I got the error again today, this time I know the error appeared within a couple of hours; during this time I could only see the following message in the logs (it appears in both syslog and kern.log, and in no other log):

Feb 17 13:36:37 ockeghem kernel: [549471.301068] ecryptfs_read_lower: octets_read = [-4]; expected [4096]
Feb 17 13:36:37 ockeghem kernel: [549471.301098] ecryptfs_read_and_validate_header_region: Error reading header region; rc = [-22]

Another possible clue: The only command I ran around 13:36 was a 'git status', and when repeating the same 'git status' under strace, the only relevant system call related to the affected file is an lstat:
lstat("src/devices/PI7C9X1XX/pi7c9x110.dml", {st_mode=S_IFREG|0644, st_size=12288, ...}) = 0

Tyler Hicks (tyhicks) wrote :

Hi Erik - Once again, thanks for the excellent work on this bug. I've found the problem in the code, found a consistent way to reproduce the problem and will write a fix soon.

Changed in ecryptfs:
status: Confirmed → Triaged
Erik Carstensen (sandberg) wrote :

Do you know of any way to remount an ecryptfs mount without first having to log out? That would suffice as a workaround most of the time, and would make this bug a lot less annoying for me.

Serge Hallyn (serge-hallyn) wrote :

Well, if you're just trying to reproduce in a testcase, you can probably just do a custom ecryptfs mount as root:

mkdir testcrypt testplain
cat > testme.sh << EOF
mount -t ecryptfs <options> testcrypt testplain
echo ab >> testplain/ab
umount testplain
EOF
chmod ugo+x testme.sh

for i in `seq 1 100`; do
  ./testme.sh
done

tags: added: kj-triage
Jeremy Foshee (jeremyfoshee) wrote :

Hi Erik,

This bug was reported a while ago and there hasn't been any activity in it recently. We were wondering if this is still an issue? Can you try with the latest development release of Ubuntu? ISO CD images are available from http://cdimage.ubuntu.com/releases/ .

If it remains an issue, could you run the following command from a Terminal (Applications->Accessories->Terminal). It will automatically gather and attach updated debug information to this report.

apport-collect -p linux 509180

Also, if you could test the latest upstream kernel available that would be great. It will allow additional upstream developers to examine the issue. Refer to https://wiki.ubuntu.com/KernelMainlineBuilds . Once you've tested the upstream kernel, please remove the 'needs-upstream-testing' tag. This can be done by clicking on the yellow pencil icon next to the tag located at the bottom of the bug description and deleting the 'needs-upstream-testing' text. Please let us know your results.

Thanks in advance.

    [This is an automated message. Apologies if it has reached you inappropriately; please just reply to this message indicating so.]

tags: added: needs-kernel-logs
tags: added: needs-upstream-testing
Changed in linux (Ubuntu):
status: New → Incomplete

Architecture: amd64
ArecordDevices:
 **** List of CAPTURE Hardware Devices ****
 card 0: Intel [HDA Intel], device 0: STAC92xx Analog [STAC92xx Analog]
   Subdevices: 2/2
   Subdevice #0: subdevice #0
   Subdevice #1: subdevice #1
AudioDevicesInUse:
 USER PID ACCESS COMMAND
 /dev/snd/controlC0: sandberg 1987 F.... pulseaudio
CRDA: Error: [Errno 2] No such file or directory
Card0.Amixer.info:
 Card hw:0 'Intel'/'HDA Intel at 0xfe9fc000 irq 21'
   Mixer name : 'SigmaTel STAC9205'
   Components : 'HDA:838476a0,102801fe,00100204 HDA:14f12c06,14f1000f,00100000'
   Controls : 25
   Simple ctrls : 16
DistroRelease: Ubuntu 9.10
HibernationDevice: RESUME=UUID=5a83ce18-fd90-4c2c-ba30-96445496a9f9
InstallationMedia: Ubuntu 9.10 "Karmic Koala" - Release amd64 (20091027)
MachineType: Dell Inc. Latitude D830
Package: linux (not installed)
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
ProcCmdLine: BOOT_IMAGE=/boot/vmlinuz-2.6.31-20-generic root=UUID=5f04b461-c266-4082-80b5-36054e46688e ro quiet splash
ProcEnviron:
 SHELL=/bin/bash
 PATH=(custom, user)
 LANG=en_US.UTF-8
ProcVersionSignature: Ubuntu 2.6.31-20.58-generic
RelatedPackageVersions:
 linux-backports-modules-2.6.31-20-generic N/A
 linux-firmware 1.26
Uname: Linux 2.6.31-20-generic x86_64
UserGroups: adm admin cdrom dialout fuse lpadmin mail sambashare www-data
dmi.bios.date: 06/07/2007
dmi.bios.vendor: Dell Inc.
dmi.bios.version: A02
dmi.board.name: 0HN341
dmi.board.vendor: Dell Inc.
dmi.chassis.type: 8
dmi.chassis.vendor: Dell Inc.
dmi.modalias: dmi:bvnDellInc.:bvrA02:bd06/07/2007:svnDellInc.:pnLatitudeD830:pvr:rvnDellInc.:rn0HN341:rvr:cvnDellInc.:ct8:cvr:
dmi.product.name: Latitude D830
dmi.sys.vendor: Dell Inc.

Changed in linux (Ubuntu):
status: Incomplete → New
tags: added: apport-collected
Erik Carstensen (sandberg) wrote :

The above data was posted with my current kernel, which is the standard one from karmic. With this setup the problem still happens. I will upgrade to lucid shortly, we'll see if the problem still exists there.

Erik Carstensen (sandberg) wrote :

I got the same problem again today, with a new lucid kernel.

uname -a
Linux ockeghem 2.6.32-22-generic #36-Ubuntu SMP Thu Jun 3 19:31:57 UTC 2010 x86_64 GNU/Linux

The following error message is probably related:
Jun 21 14:22:07 ockeghem kernel: [191186.234951] ecryptfs_read_and_validate_header_region: Error reading header region; rc = [-4]

Tyler, you mentioned in February that you were working on a patch; what's its status?

Erik Carstensen (sandberg) wrote :

I got this traceback in syslog today, together with a different set of symptoms (no signs of padded files, but a process is helplessly hung; when I try to inspect the process with gdb or strace they hang too and must be SIGKILL:ed. And I cannot suspend). ecryptfs appears in the traceback so I thought it might be another symptom of the same problem. Using the standard lucid kernel.

Jun 22 09:12:05 ockeghem kernel: [220132.893407] general protection fault: 0000 [#1] SMP
Jun 22 09:12:05 ockeghem kernel: [220132.893412] last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0A:00/power_supply/BAT0/charge_full
Jun 22 09:12:05 ockeghem kernel: [220132.893414] CPU 0
Jun 22 09:12:05 ockeghem kernel: [220132.893416] Modules linked in: ppp_deflate zlib_deflate bsd_comp ppp_async crc_ccitt option usbserial nls_utf8 isofs usb_storage usbhid hid cryptd aes_x86_64 aes_generic binfmt_misc ppdev dm_crypt snd_hda_codec_idt snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event joydev snd_seq snd_timer snd_seq_device arc4 psmouse snd pcmcia soundcore snd_page_alloc serio_raw dell_wmi iwl3945 dell_laptop dcdbas iwlcore mac80211 led_class cfg80211 lp parport yenta_socket rsrc_nonstatic pcmcia_core fbcon tileblit font bitblit softcursor vga16fb vgastate i915 drm_kms_helper ohci1394 tg3 ieee1394 ahci intel_agp drm i2c_algo_bit video output
Jun 22 09:12:05 ockeghem kernel: [220132.893466] Pid: 28411, comm: simics-common Not tainted 2.6.32-22-generic #36-Ubuntu Latitude D830
Jun 22 09:12:05 ockeghem kernel: [220132.893469] RIP: 0010:[<ffffffff810f3859>] [<ffffffff810f3859>] find_get_page+0x39/0xa0
Jun 22 09:12:05 ockeghem kernel: [220132.893477] RSP: 0018:ffff880001c81918 EFLAGS: 00010203
Jun 22 09:12:05 ockeghem kernel: [220132.893479] RAX: 07371f01fffe00ff RBX: ffff88009b8be600 RCX: ffff8800bf6813c8
Jun 22 09:12:05 ockeghem kernel: [220132.893481] RDX: 0000000000000000 RSI: 0000000000000040 RDI: 07371f01fffe0100
Jun 22 09:12:05 ockeghem kernel: [220132.893483] RBP: ffff880001c81928 R08: 0000000000001000 R09: 0000000000000001
Jun 22 09:12:05 ockeghem kernel: [220132.893485] R10: ffff880001c81fd8 R11: 0000000000000000 R12: 0000000000000040
Jun 22 09:12:05 ockeghem kernel: [220132.893487] R13: 0000000000000040 R14: ffff880073b69d80 R15: 0000000000000000
Jun 22 09:12:05 ockeghem kernel: [220132.893490] FS: 00007f905ffaa700(0000) GS:ffff880028200000(0000) knlGS:0000000000000000
Jun 22 09:12:05 ockeghem kernel: [220132.893492] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Jun 22 09:12:05 ockeghem kernel: [220132.893494] CR2: 00007f905b31b008 CR3: 000000009228d000 CR4: 00000000000006f0
Jun 22 09:12:05 ockeghem kernel: [220132.893496] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Jun 22 09:12:05 ockeghem kernel: [220132.893498] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Jun 22 09:12:05 ockeghem kernel: [220132.893501] Process simics-common (pid: 28411, threadinfo ffff880001c80000, task ffff88008fd2dbc0)
Jun 22 09:12:05 ockeghem kernel: [220132.893502] Stack:
Jun 22 09:12:05 ockeghem kernel: [2201...

Andres Jaan Tack (ajtack) wrote :

I am running into this problem as well, running Ubuntu 10.04. I have git objects being padded up to 12kB.

Is there a formulaic workaround?

Is there some information I can provide about the incidence on my system?

Andres Jaan Tack (ajtack) wrote :

No need for a workaround: I think I have it, for my particular situation.

http://superuser.com/questions/162589/problem-with-git-corrupted-files/163034#163034

The offer for more details still stands, as I imagine this will occur in the future.

Erik Carstensen (sandberg) wrote :

One painful workaround is to not touch the file and remount the ecryptfs partition (which usually means that you have to log out and re-login).

Another workaround is to use a script that I just wrote, which automatically tries to drop the trailing garbage (kind of like the link you posted, but automated)

The link you posted didn't observe these things:
- In some cases the trailing garbage can contain non-zeros; if the original object size was s this is typically seen in the intervals [4096, 4096 + s) and [8192, 8192+s).
- For non-blobs, there is a small probability (1/256) that the original object ends with a 0 byte.

My script will therefore try all 12288 possible lengths, but in a smart order so the above cases are tried early.

To use it, go to the working tree's root directory and run the script, with the sha1 sum as only argument.
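
The idea behind such a script can be sketched as follows. This is not Erik's script (which is attached to the bug and not reproduced here) but a minimal illustration, assuming the padded file is a git loose object: a zlib stream whose decompressed bytes hash (SHA-1) to the object name. The function names and the exact candidate order are assumptions loosely based on the two observations above.

```python
import hashlib
import zlib

def candidate_lengths(data):
    """Yield every length in 1..len(data) once, likeliest candidates first."""
    total = len(data)
    stripped = len(data.rstrip(b"\x00"))  # length without trailing zero bytes
    # Likely cases: pure zero padding, a file that really ends in one zero
    # byte, and garbage that is a copy of the file at offset 4096 or 8192.
    likely = [stripped, stripped + 1, stripped - 4096, stripped - 8192]
    seen = set()
    for n in likely + list(range(1, total + 1)):
        if 0 < n <= total and n not in seen:
            seen.add(n)
            yield n

def is_valid_object(prefix, sha1_hex):
    """True if prefix is exactly one zlib stream whose content hashes to sha1_hex."""
    d = zlib.decompressobj()
    try:
        raw = d.decompress(prefix)
    except zlib.error:
        return False
    if not d.eof or d.unused_data:
        return False  # stream is truncated or followed by extra bytes
    return hashlib.sha1(raw).hexdigest() == sha1_hex

def recover_object(data, sha1_hex):
    """Return the shortest-fitting prefix of data that is the object, or None."""
    for n in candidate_lengths(data):
        if is_valid_object(data[:n], sha1_hex):
            return data[:n]
    return None
```

A caller would read the padded file from `.git/objects/xx/yyyy...`, pass its bytes together with the 40-character object name, and write the returned prefix to a new file rather than overwriting the original in place.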

The git corruption symptom still occurs in 10.10 with ext4 + ecryptfs home.

And Erik's solution worked perfectly for me:
https://bugs.launchpad.net/ecryptfs/+bug/509180/comments/41

So, thank you very much for the workaround !

Brad Figg (brad-figg) on 2010-12-03
tags: added: acpi-method-return

From my dmesg:

[111706.640947] ecryptfs_read_and_validate_header_region: Error reading header region; rc = [-4]
[111706.640952] Valid eCryptfs headers not found in file header region or xattr region
[111706.640953] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

Error occurred in a git repository. Don't know exactly what I did.

Gioele Barabucci (gioele) wrote :

This problem is still present in Ubuntu Lucid 10.04.1 LTS.

A completely up-to-date installation with eCryptFS just trashed some of the files stored under $HOME.

Thomas Perl (thp) wrote :

I still have this issue on Ubuntu 10.10, versions of ecryptfs packages:

ii ecryptfs-utils 83-0ubuntu3 ecryptfs cryptographic filesystem (utilities
ii libecryptfs0 83-0ubuntu3 ecryptfs cryptographic filesystem (library)

Kernel version: (uname -r) 2.6.35-25-generic

The git-remove-trailing-garbage.py script from comment 41 works as a workaround. This was in a Git repository inside my ecryptfs-mounted $HOME. The "outer" filesystem (i.e. that of "/") is ext3.

Tyler Hicks (tyhicks) wrote :

Hi Paolo - that patch is just masking the real problem behind this bug. This one is caused by a bad error path after failing to read the crypto metadata. I'll attach a fix for that.

We may want to still use something like the patch you linked to, in combination with the following fix, though.

Changed in ecryptfs:
status: Triaged → In Progress
Tyler Hicks (tyhicks) wrote :

I may end up breaking this one up into two patches before upstreaming it. The -EIO piece is a bit unrelated to the rest of the patch.

Tyler Hicks (tyhicks) wrote :

Just so I don't forget, a decent way of reproducing this is by doing the following:

(ext4 is mounted at /lower and eCryptfs is mounted at /upper and foo isn't created until the truncate below)

# truncate -s 1 /upper/foo
# hexedit /lower/foo
      Note: increment the 9th byte by 1 so that the eCryptfs marker fails validation
# umount /upper/foo
# mount -i /upper/foo
# hexdump -C /upper/foo
hexdump: /upper/foo: Invalid argument
hexdump: /upper/foo: Bad file descriptor
# hexedit /lower/foo
      Note: decrement the 9th byte by 1 so that the eCryptfs marker is correct again
# hexdump -C /upper/foo
      Note: You should see extra zeroes at the end of the file

Dustin Kirkland  (kirkland) wrote :

Tyler,

When you upstream this patch, would you please note that here with the git commits? We're going to want to pull this into the Ubuntu kernel for Natty.

Thanks,
Dustin

Gioele Barabucci (gioele) wrote :

Can you please also backport this fix to the 10.04.x LTS kernel?

With this problem eCryptfs for $HOME (advertised at installation time) is basically unusable in Lucid.

Lealcy B. Junior (lealcy) wrote :

I'm getting a bunch of these messages in my syslog:

Mar 15 16:54:40 lealcy kernel: [448054.760153] Valid eCryptfs headers not found in file header region or xattr region
Mar 15 16:54:40 lealcy kernel: [448054.760158] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
Mar 15 16:54:41 lealcy kernel: [448054.824814] Valid eCryptfs headers not found in file header region or xattr region
Mar 15 16:54:41 lealcy kernel: [448054.824819] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
Mar 15 16:54:41 lealcy kernel: [448054.842427] Valid eCryptfs headers not found in file header region or xattr region
Mar 15 16:54:41 lealcy kernel: [448054.842434] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
Mar 15 16:54:41 lealcy kernel: [448054.898242] Valid eCryptfs headers not found in file header region or xattr region
Mar 15 16:54:41 lealcy kernel: [448054.898247] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

My linux: Linux version 2.6.35-27-generic (buildd@crested) (gcc version 4.4.5 (Ubuntu/Linaro 4.4.4-14ubuntu5) ) #48-Ubuntu SMP Tue Feb 22 20:25:46 UTC 2011 x86_64 GNU/Linux

I don't have any 0 byte files on my ~/.Private folder.

Tyler Hicks (tyhicks) wrote :

After doing some more testing, I realized my previous fix is incorrect. It doesn't allow lookups of lower files that don't have proper metadata (plaintext files, 0 length files, etc.).

I'm attaching another fix, which will likely go upstream for 2.6.39-rc1.

Roland Dreier (roland.dreier) wrote :

Does this latest patch address all the cases of signals interrupting ecryptfs operations? I don't know enough about ecryptfs to know whether this metadata problem is the only place the issue hits.

I've started seeing problems where I get

    ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
    ecryptfs_write_begin: Error decrypting page at index [15315]; rc = [-4]

in the kernel log, and then later tasks hang in sync_page called from truncate_inode_pages from ecryptfs_evict_inode.

Tyler,

Should we probably carry this fix in Ubuntu's 2.6.38 kernel for 11.04?

Dustin

Tyler Hicks (tyhicks) wrote :

On Wed Mar 16, 2011 at 12:19:04AM -0000, Roland Dreier <email address hidden> wrote:
> Does this latest patch address all the cases of signals interrupting
> ecryptfs operations? I don't know enough about ecryptfs to know whether
> this metadata problem is the only place the issue hits.

The latest patch allows userspace to handle any interrupted eCryptfs
operations. When we're trying to read from the lower filesystem,
vfs_read() sometimes returns with -EINTR, so we'll just propagate that
to userspace and let the app deal with it.
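
As a rough illustration of what "letting the app deal with it" can look like, here is a Python sketch of a retry loop around an interrupted read. The helper name `read_with_retry` and the retry limit are made up for this example. Note that Python 3.5 and later already retry most interrupted system calls automatically (PEP 475), so this mainly shows the pattern an application in another runtime would need.

```python
import errno

def read_with_retry(read_once, max_attempts=5):
    """Call read_once() again whenever it fails with EINTR, up to max_attempts."""
    for _ in range(max_attempts):
        try:
            return read_once()
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise  # a real error, not an interrupted call
    raise OSError(errno.EINTR, "read still interrupted after retries")
```

Applications that do not retry will see the read fail outright, so the fix moves the problem from silent size corruption to a visible, recoverable error.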

>
> I've started seeing problems where I get
>
> ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
> ecryptfs_write_begin: Error decrypting page at index [15315]; rc = [-4]
>
> in the kernel log, and then later tasks hang in sync_page called from
> truncate_inode_pages from ecryptfs_evict_inode.

That's another bug that I just stumbled across myself and will have
fixed in the 2.6.39-rc1 time frame. We're not unlocking the page in the
ecryptfs_write_begin() error path.

Tyler Hicks (tyhicks) wrote :

On Wed Mar 16, 2011 at 02:48:37PM -0500, Dustin Kirkland <email address hidden> wrote:
> Should we probably carry this fix in Ubuntu's 2.6.38 kernel for 11.04?

Yes, once it goes upstream. I kept it simple for easy back porting.
While I don't think there's really any on-disk data corruption here,
this is an annoying problem for those who are affected.

BTW, I normally tag any important patches, which I feel are
backport-worthy, with 'Cc: <email address hidden>' and this one will get
that tag when it goes upstream.

Shahar Or (mightyiam) wrote :

Would it be possible to backport this to all supported releases?

One reason to do so is that users are getting scared about their data being corrupted, even if it is not so.

Sam Liddicott (sam-liddicott) wrote :

If the user copies or archives the "corrupted looking" file then their copy has become corrupted.

Copying files to a memory stick should be a good backup, not a corrupt backup.

The excuse "your data is OK.... until you try to read it" isn't much of a distinction....

Andy Whitcroft (apw) on 2011-03-17
tags: added: kernel-key
removed: needs-kernel-logs needs-upstream-testing
Andy Whitcroft (apw) on 2011-03-17
Changed in linux (Ubuntu):
status: New → Triaged
importance: Undecided → Medium
assignee: nobody → Andy Whitcroft (apw)
John Johansen (jjohansen) wrote :

Shahar Or,

We will pick this patch up for older releases when Tyler submits it upstream to the stable kernel trees.

Changed in linux (Ubuntu):
assignee: Andy Whitcroft (apw) → John Johansen (jjohansen)
Andy Whitcroft (apw) wrote :

OK, I have pulled Tyler's patch into the Natty kernel for testing. Could those of you who are hitting this regularly please try these kernels and let us know if they resolve the issue for you? Please report any testing here. Kernels are at the URL below:

    http://people.canonical.com/~apw/lp509180-natty/

Thanks!

Changed in linux (Ubuntu):
status: Triaged → Incomplete
Shahar Or (mightyiam) wrote :

Thanks, John,

Good work!

Nicolas Diogo (nicolasdiogo) wrote :

same error here

had my $HOME full at one point and there was no error warning about it.

after deleting some files to make room.
and rebooting. i can no longer login using KDE.

i have run:

find $HOME/.Private/ -size 0c -exec ls '{}' \; | wc -l

and got:

203

i also noticed that my $HOME/.ecryptfs is corrupted.
but i am able to access all my files still.

so what is the next option here? delete all these (as they are empty anyhow)?
how should i proceed?

i suppose we should try to fix the BUG of no alert when running out of space here

Paolo Bonzini (bonzini) wrote :

Can you please update the patch so that it avoids spamming the kernel log upon EINTR?

Tomi Pieviläinen (tpievila) wrote :

While it doesn't seem to be the focus of this bug, the relevant bugs have been marked as duplicates of this one.

I'm running 2.6.38-8 in Natty beta, and still getting a lot of

Apr 12 14:46:32 puppy-ubuntu kernel: [31106.234038] Valid eCryptfs headers not found in file header region or xattr region
Apr 12 14:46:32 puppy-ubuntu kernel: [31106.234047] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

so it would seem that either the patch hasn't fixed this, or #372014 is not actually a duplicate.

Quoting Tomi Pieviläinen (<email address hidden>):
> While it doesn't seem to be the focus of this bug, the relevant bugs
> have been marked as duplicates of this one.
>
> I'm running 2.6.38-8 in Natty beta, and still getting a lot of
>
> Apr 12 14:46:32 puppy-ubuntu kernel: [31106.234038] Valid eCryptfs headers not found in file header region or xattr region
> Apr 12 14:46:32 puppy-ubuntu kernel: [31106.234047] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
>
> so it would seem that either the patch hasn't fixed this, or #372014 is
> not actually a duplicate.

Is this with clean underlying fs? If non-ecryptfs or corrupted files are
already there, you'll keep getting those warnings until you remove them.
The bug would only be the creation of new files.

It seems like it might be helpful to log the underlying inode number
in these printks.

Tyler Hicks (tyhicks) wrote :

Linus has merged 3aeb86ea4cd15f728147a3bd5469a205ada8c767, which is the fix for this bug.

Changed in ecryptfs:
status: In Progress → Fix Committed
Tyler Hicks (tyhicks) wrote :

This fix was released in 2.6.39-rc5 as commit 3aeb86ea4cd15f728147a3bd5469a205ada8c767

Changed in ecryptfs:
status: Fix Committed → Fix Released
Brad Figg (brad-figg) on 2011-05-05
tags: added: b73a1py79
Shahar Or (mightyiam) wrote :

What about the damage that's already been done to files? This should be pretty common amongst Ubuntu users who've enabled their home directory encryption feature.

Mine is. I've upgraded to natty and my dmesg is still full of those errors.

Erik Carstensen (sandberg) wrote :

> What about the damage that's already been done to files

The damage is temporary and disappears as soon as you unmount the ecryptfs volume, unless you for some reason read the file and write it back to disk.

I wrote a script that fixes up git objects broken by this bug: https://bugs.launchpad.net/ecryptfs/+bug/509180/comments/41
If you have any non-git files that are affected by this, you might want to do something similar based on the script.

> Mine is. I've upgraded to natty and my dmesg is still full of those errors.

I think the dmesg errors are just one symptom of the bug; trailing garbage is another. Can you still see trailing garbage in any of your files?

Sam Liddicott (sam-liddicott) wrote :

Repair script works great and is painless! Thanks very much for posting that.

Thorsten Zachmann (t-zachmann) wrote :

I have had the error twice more since I updated to natty. From the bug messages I'm not sure the fix is included in natty. Has this been included in the kernel available on natty?

Another request for clarification on the status of the fix: has it been released for Maverick?

I'm continuing to get these error messages very frequently, followed immediately by the system becoming totally unresponsive, at which point I have to hard power off. This can't be good for the hardware nor the data and, I'm sorry, but the lack of clear information on such a critical bug is very, very frustrating.

I notice there is no mention of freezes/hangs on this bug, which is maybe why this has been considered as important, but that's consistently what I'm experiencing and there are several reports of that on this duplicate: https://bugs.launchpad.net/ecryptfs/+bug/372014

agent 8131 (agent-8131) wrote :

I found this to be helpful for finding the corrupted files. They may not be 0 length:

find . -type f -exec cat {} \; > /dev/null

and look for "Input/output error"

taken from:

https://bugs.launchpad.net/ubuntu/+source/ecryptfs-utils/+bug/372014/comments/68
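
The one-liner above shows the I/O errors on the terminal, but not in a form that is easy to collect. A minimal sketch that prints only the paths that fail to read, assuming a POSIX shell and a find that supports `-exec ... +`; `list_unreadable_files` is a made-up helper name, not part of ecryptfs-utils:

```shell
# list_unreadable_files DIR
# Print every regular file under DIR that cannot be read to the end
# (for example because read() returns EIO). Hypothetical helper name.
list_unreadable_files() {
    find "$1" -xdev -type f -exec sh -c '
        for f in "$@"; do
            # cat fails if the file cannot be opened or read completely
            cat -- "$f" > /dev/null 2>&1 || printf "%s\n" "$f"
        done
    ' sh {} +
}
```

Something like `list_unreadable_files "$HOME" > /tmp/unreadable.txt` would collect the paths. The function only reads files and changes nothing; a file that is unreadable for another reason (permissions, say) would be listed as well.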

Shahar Or (mightyiam) wrote :

Thanks for the tip, agent!

Ville Ranki (ville-ranki) wrote :

I just got this issue on an up to date Natty x86_32. Ubuntu crashed totally while I was compiling and after reboot I got Input/output errors on files modified by the build (Makefiles, .o's etc). Modifying files (for example with make) doesn't work, only deleting helps. This bug may cause loss of data, so I'd treat it with high priority.

avdd (avdd) wrote :

Will this fix be released for lucid? When?

avdd (avdd) wrote :

I found this bug because I am observing the padding effects (although to other multiples of 4K). To me this is filesystem corruption plain and simple. The only conclusion is that ecryptfs in lucid (LTS) is buggy and should not be used. This is somewhat disappointing as it was stable on karmic.

Incidentally, to confirm that I am seeing the same bug, I tried the steps from comment #50. First, I don't know what this means:

# umount /upper/foo
# mount -i /upper/foo

Is it a typo, or am I missing something about mounting files?

Second, assuming a typo, I cannot reproduce the padding effect.

This bug was filed against a series that is no longer supported and so is being marked as Won't Fix. If this issue still exists in a supported series, please file a new bug.

This change has been made by an automated script, maintained by the Ubuntu Kernel Team.

Changed in linux (Ubuntu):
status: Incomplete → Won't Fix
avdd (avdd) wrote :

What series? Where is the series stated? How do I report a bug against a series?

Tuomas Heino (iheino+ub) wrote :

Where should I report a bug on mentioned "automated script" ignoring duplicates' series?
This one was originally reported against Karmic, but duplicates include all later releases and some earlier ones as well.
Several other bug reports besides this one may have been marked "Won't Fix" by said script.

avdd (avdd) wrote :

As requested, new bug report filed here: https://bugs.launchpad.net/ecryptfs/+bug/810860

Sam Liddicott (sam-liddicott) wrote :

How did launchpad get to be so rubbish?

The typical ubuntu scenario is like this:

1. file a bug
2. wait 1-2 years
3. bug gets marked as invalid
4. if you are lucky it was fixed upstream and is fixed in a future release

the use of the system seems designed to dissuade people from using launchpad.
Rather than the point of "report all bugs" it is the point of "being insulted for reporting any bug".

And this is a serious bug - ubuntu is corrupting users' files if they use an offered feature.

Mint is where it is at; while ubuntu is "debian done nicely" - mint seems to be: "ubuntu with the bugs fixed"

I'm still on ubuntu but I'm not expecting to stay, I've had enough of being insulted by launchpad scripts.

elementz (memetical) wrote :

+1 for Sam Liddicott's comment!
I already tried the ecryptfs mailing list quite some time ago, but did not hear anything back from the developers until this day.
I believe our best option is simply NOT USING ecryptfs in a production environment. Especially since developers in ubuntu and ecryptfs itself are so darn unresponsive in this matter.

Sam Liddicott (sam-liddicott) wrote :

What makes it worse is that Shuttleworth abandoned his bounty idea; so it is impossible for individual users to financially support Canonical or Ubuntu, or get support or contribute to support on issues that matter.

Individual users (non-enterprise customers) are reduced to the level of beggars who can contribute if they don't mind being insulted by launchpad. And any value in the contributions will leak away.

Sam Liddicott (sam-liddicott) wrote :

I know I'm filing in the wrong forum. I did open a question on this: https://answers.launchpad.net/ubuntu/+question/157752 but someone re-filed it wrongly and left it.

I try to point out what is going wrong, but people actively don't want to hear.

avdd (avdd) wrote :

reddit bomb?

Sam Liddicott (sam-liddicott) wrote :

what does "reddit bomb" mean?

elementz (memetical) wrote :

This will be too off-topic, but since this bug is abandoned anyway, I guess avdd refers to the fact that it could help to raise awareness of this topic and the shortcomings of launchpad/ubuntu bugfixing over at reddit. If upvoted enough, this could lead to high exposure and gain traction simply by the sheer mass of users interested in the topic.
But I highly doubt that users over at reddit will really care. Maybe hackernews could be a better place for this.

On Fri Jul 15, 2011 at 09:34:47AM -0000, memetical <email address hidden> wrote:
> I already tried the ecryptfs mailing list quite some time ago, but did
> not hear anything back from the developers until this day. I believe
> our best option is simply NOT USING ecryptfs in a production
> environment. Especially since developers in ubuntu and ecryptfs itself
> are so darn unresponsive in this matter.

Sorry - I must have missed your message on the mailing list. This bug
has been fixed upstream (I'm the upstream maintainer) and will need to
be backported to Ubuntu kernels. I'll alert a member of the Ubuntu
kernel team about the severity of this bug.

Tim Gardner (timg-tpi) on 2011-07-15
Changed in linux (Ubuntu Oneiric):
status: Won't Fix → Fix Released
Changed in linux (Ubuntu Lucid):
assignee: nobody → John Johansen (jjohansen)
status: New → In Progress
Changed in linux (Ubuntu Maverick):
assignee: nobody → John Johansen (jjohansen)
status: New → In Progress
Changed in linux (Ubuntu Natty):
assignee: nobody → John Johansen (jjohansen)
status: New → In Progress
Tim Gardner (timg-tpi) wrote :

Tyler - is there a backportable commit? I see 3 with the word 'corrupt' in them.

Tyler Hicks (tyhicks) wrote :

Tim - upstream git commit is 3aeb86ea4cd15f728147a3bd5469a205ada8c767

Tim Gardner (timg-tpi) wrote :

Tyler - please review this backport to 2.6.38 (natty). The only thing I had to think about was the value to assign ECRYPTFS_I_SIZE_INITIALIZED. I chose to simply increment the mask by <<1.

Changed in linux (Ubuntu Natty):
assignee: John Johansen (jjohansen) → Tim Gardner (timg-tpi)
Tim Gardner (timg-tpi) wrote :
Changed in linux (Ubuntu Maverick):
assignee: John Johansen (jjohansen) → Tim Gardner (timg-tpi)
Changed in linux (Ubuntu Oneiric):
assignee: John Johansen (jjohansen) → nobody
Dustin Kirkland  (kirkland) wrote :

From Bug #810860:

Tyler Hicks (tyhicks) wrote:
  The upstream fix for this is http://git.kernel.org/linus/3aeb86ea4cd15f728147a3bd5469a205ada8c767

Tyler Hicks (tyhicks) wrote :

On Fri Jul 15, 2011 at 05:15:08PM -0000, Tim Gardner <email address hidden> wrote:
> Tyler - please review this backport to 2.6.38 (natty). The only thing
> I had to think about was the value to assign
> ECRYPTFS_I_SIZE_INITIALIZED. I chose to simply increment the mask by
> <<1.

Ack - looks good to me.

FYI, this trivial patch is why you had to use different flag values:
http://git.kernel.org/linus/fed8859b3ab94274c986cbdf7d27130e0545f02c

Those flags never hit the disk (only live in memory), so there is no
harm in the patch that you applied.

Tim Gardner (timg-tpi) on 2011-07-20
Changed in linux (Ubuntu Natty):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Maverick):
status: In Progress → Fix Committed
Changed in linux (Ubuntu Lucid):
status: In Progress → Fix Committed
Tim Gardner (timg-tpi) wrote :

SRU Justification

Impact: files mounted under ecryptfs can be corrupted.

Patch Description: backport upstream 3aeb86ea4cd15f728147a3bd5469a205ada8c767, eCryptfs: Handle failed metadata read in lookup

Tim Gardner (timg-tpi) wrote :

Tyler - I updated the patch for Lucid (2.6.32) to use num_header_bytes_at_front instead of metadata_size in ecryptfs_i_size_init(). Would you have another look to make sure this version isn't going to toast the on disk format ? Thanks.

Ian! D. Allen (idallen) wrote :

Linux linux 2.6.38-11-generic #47-Ubuntu SMP Fri Jul 15 19:27:09 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Description: Ubuntu 11.04
Codename: natty

After months of kernel and application crashes and corrupt files, I
was ready to abandon Ubuntu 11.04 and go to some other distribution.
Before doing that, I wrote a script to repeatedly md5sum the files in
my ecryptfs directory and compare the results. I went single-user to
ensure that nothing else was running and, sure enough, the script showed
that md5sums changed randomly on files that I never touched.

I went looking today and found this launchpad entry indicating that you
have known for 18 months (since 9.10!) that ecryptfs is broken and is
not suitable for production use.

Why didn't you disable it in recent releases? My 11.04 install offered
to encrypt my home directory during installation, yet you've known for
18 months that to do so would corrupt my system and crash it.

I am frustrated that nobody took prompt action to disable the use of
ecryptfs and to notify those of us using it.

Ian-

I'd love to understand why some users are affected by this, and other
users aren't. Some people, such as myself, have hundreds of GB,
thousands of files, using them all day, every day, and do not suffer
from this problem.

--
:-Dustin

Ian! D. Allen (idallen) wrote :

@Dustin
Things that might make a difference: 64-bit SMP kernel on AMD quad-core, 120GB SSD hard disk, 8GB memory.

Gioele Barabucci (gioele) wrote :

@idallen:

This also happens on a somewhat old Dell Optiplex SD280: HT Pentium 4, 1 GiB of RAM, rotational HD.

Mario Vukelic (mario-vukelic) wrote :

@Dustin: I have a 300 GB internal traditional HD and am running the amd64+mac version (Natty, fresh install) on an Intel Core 2 Duo P8700, 4 GB RAM

I've just been bitten by this. Something ate 500GB (yes GB!) one night when my machine was on. I couldn't find out what had done it, the machine eventually restarted and the culprit was gone so it must have been some crazy cache issue. It has made a right mess of KDE on me, Akonadi looks to be broken because of it. I continually get messages like:

[32223.735781] Valid eCryptfs headers not found in file header region or xattr region
[32223.735788] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

Kernel is: 2.6.38-11-generic #48-Ubuntu SMP Fri Jul 29 19:02:55 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Running find ../.ecryptfs/felim/.Private -xdev -size 0c -exec ls '{}' \; | wc -l

Gives me 351 files... I've tried following this thread but it's hard to know what can be done. Is a restore from backup the only real option?

Tim Gardner (timg-tpi) wrote :

According to the upstream maintainer (Tyler Hicks) the patch to fix this issue is https://bugs.launchpad.net/ecryptfs/+bug/509180/comments/96 which has been applied to Lucid/Maverick/Natty. You can get an experimental kernel containing this patch (and others) from https://launchpad.net/~kernel-ppa/+archive/pre-proposed

Hmmm I did that and I am running:

2.6.38-11-generic #49~pre201108030903-Ubuntu SMP Wed Aug 3 09:34:11 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

But just got:

[ 155.829002] Valid eCryptfs headers not found in file header region or xattr region
[ 155.829008] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

Herton R. Krzesinski (herton) wrote :

This bug is awaiting verification that the kernel for maverick in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-maverick' to 'verification-done-maverick'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-maverick
Nicolas Diogo (nicolasdiogo) wrote :

hello

just joining the choir

it is still present on Natty x64

should this not have a high level

do we require further info?

======================================================
# mount
/dev/sda3 on / type ext4 (rw,noatime,commit=0)
proc on /proc type proc (rw,noexec,nosuid,nodev)
none on /sys type sysfs (rw,noexec,nosuid,nodev)
fusectl on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
none on /dev type devtmpfs (rw,mode=0755)
none on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
none on /dev/shm type tmpfs (rw,nosuid,nodev)
none on /var/run type tmpfs (rw,nosuid,mode=0755)
none on /var/lock type tmpfs (rw,noexec,nosuid,nodev)
/dev/sda6 on /home type ext4 (rw,commit=0)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
/home/MYUSER/.Private on /home/MYUSER type ecryptfs (ecryptfs_check_dev_ruid,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_unlink_sigs,ecryptfs_sig=<KEY>,ecryptfs_fnek_sig=<KEY2>)
gvfs-fuse-daemon on /home/MYUSER/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=MYUSER)
/dev/sda1 on /media/System Reserved type fuseblk (rw,nosuid,nodev,allow_other,blksize=4096,default_permissions)
======================================================

dmesg
======================================================
[34279.235317] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[34279.235390] Valid eCryptfs headers not found in file header region or xattr region
======================================================

please let me know if you want more info

thanks,

Nicolas Diogo (nicolasdiogo) wrote :

added the option for 'proposed' updates in Synaptic

and the problem persists.

=======================================================
[ 481.927093] Valid eCryptfs headers not found in file header region or xattr region
[ 481.927099] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 481.927312] Valid eCryptfs headers not found in file header region or xattr region
[ 481.927317] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
=======================================================

my question is - how reliable is this method of storing my personal data encrypted?

besides keeping a backup of everything i have stored on my $HOME - is there an alternative to this?

thanks,

Steve Conklin (sconklin) wrote :

This bug is awaiting verification that the kernel for Lucid in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-lucid' to 'verification-done-lucid'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-lucid
Steve Conklin (sconklin) wrote :

For the people reporting that this is still present after updating the kernel from -proposed, can you absolutely confirm for us which version of the kernel you are running by pasting the output from 'uname -a' into this bug?

Thanks. It's not that we doubt you, we just want to double check.

Steve Conklin (sconklin) wrote :

This fix has not been verified as being fixed in the -proposed kernels for Lucid or Maverick, and the patch will be reverted from those series

tags: added: verification-reverted-lucid
removed: verification-needed-lucid
Herton R. Krzesinski (herton) wrote :

This bug is awaiting verification that the kernel for Natty in -proposed solves the problem. Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-natty' to 'verification-done-natty'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation how to enable and use -proposed. Thank you!

tags: added: verification-needed-natty
Tomi Pieviläinen (tpievila) wrote :

$ uname -a
Linux puppy-ubuntu 2.6.39-020639rc5-generic #201105041556 SMP Wed May 4 15:59:47 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

I enabled proposed on natty, but still get the messages. Does the new kernel fix a problem that has already occurred, or should I remove all the offending files and see if the problem reappears?

Herton R. Krzesinski (herton) wrote :

@Tomi: since the original problem is corruption of files (extra trailing garbage bytes added), I would expect that you have to ignore already corrupted files and see if it happens again with non corrupted ones.

tags: added: verification-reverted-maverick
removed: verification-needed-maverick
Tomi Pieviläinen (tpievila) wrote :

I have deleted (again) the unreadable files, but I think a week is unreasonably short time to verify the fix, as this bug does not have a certain way to reproduce. Also I don't see how it makes sense to remove the fix AND close the bug without verification leaving it certainly not fixed.

Tomi Pieviläinen (tpievila) wrote :

I tried "cat /dev/zero > zerofile" since heavy writing / disk exhaustion seemed to be related to the problem. And sure enough, after the empty space went to zero, the same errors appeared again in dmesg. After deleting the zerofile I logged out and tried to log in, but it failed. Then I tried to shut down cleanly, but it failed. After a hard reboot the disk space had been freed, and .Private again contained zero-length files. The perm trick reveals that zeroed files include gnome applet files and other stuff that logically could have been written into without user action. All of them are now completely unreadable due to input/output errors.

I'd say that ecryptfs is still extremely dangerous to use due to data corruption problems in a common scenario, writing too much. It simply cannot be offered to users without any kind of warning that this happens.

Tomi Pieviläinen (tpievila) wrote :

I'm now getting the same errors in dmesg, in fact I'm not doing anything and there's just more and more (seems like two errors every two seconds). *But* there are no zero-sized files under .Private so I can no longer even remove them and cannot get rid of the errors even temporarily...

Getting the same errors constantly in dmesg (several times a second) with 64-bit natty and no zero-length files in .Private. It first started to happen a bit after the upgrade to natty, then disappeared for a few days and started to happen again yesterday. I don't seem to be able to find the offending files anyhow, so even if the proposed kernel would help, I couldn't fix the existing problem...

As this is hanging up nautilus constantly, it makes normal usage almost impossible. Somehow btw nautilus runs fine when running with gksudo.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.35-30.59

---------------
linux (2.6.35-30.59) maverick-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #837449

  [ Upstream Kernel Changes ]

  * Revert "drm/nv50-nvc0: work around an evo channel hang that some people
    see"
  * Revert "eCryptfs: Handle failed metadata read in lookup"

linux (2.6.35-30.58) maverick-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #828376

  [ Upstream Kernel Changes ]

  * proc: fix oops on invalid /proc/<pid>/maps access, CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020

linux (2.6.35-30.57) maverick-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #823306

  [ Tim Gardner ]

  * SAUCE: rtl8192se: Force a build for a 2.6/3.0 kernel
    - LP: #805494
  * [Config] Add enic/fnic to udebs
    - LP: #801610

  [ Upstream Kernel Changes ]

  * taskstats: don't allow duplicate entries in listener mode,
    CVE-2011-2484
    - LP: #806390
    - CVE-2011-2484
  * dccp: handle invalid feature options length, CVE-2011-1770
    - LP: #806375
    - CVE-2011-1770
  * eCryptfs: Handle failed metadata read in lookup
    - LP: #509180
  * pagemap: close races with suid execve, CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020
  * report errors in /proc/*/*map* sanely, CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020
  * close race in /proc/*/environ, CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020
  * auxv: require the target to be tracable (or yourself), CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020
  * deal with races in /proc/*/{syscall, stack, personality}, CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020
  * rose: Add length checks to CALL_REQUEST parsing, CVE-2011-1493
    - LP: #816550
    - CVE-2011-1493
  * Bluetooth: l2cap and rfcomm: fix 1 byte infoleak to userspace.
    - LP: #819569
    - CVE-2011-2492
  * drm/nv50-nvc0: work around an evo channel hang that some people see
    - LP: #583760
 -- Herton Ronaldo Krzesinski <email address hidden> Tue, 30 Aug 2011 12:11:13 -0300

Changed in linux (Ubuntu Maverick):
status: Fix Committed → Fix Released
Herton R. Krzesinski (herton) wrote :

Please ignore the fixed messages from the janitor. The included fix was reverted because verification was not done or failed on the three releases (natty, maverick, lucid).

tags: added: verification-reverted-natty
removed: verification-needed-natty
Changed in linux (Ubuntu Maverick):
status: Fix Released → Incomplete
Changed in linux (Ubuntu Lucid):
status: Fix Committed → Incomplete
Changed in linux (Ubuntu Natty):
status: Fix Committed → Incomplete
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.38-11.50

---------------
linux (2.6.38-11.50) natty-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #848246

  [ Upstream Kernel Changes ]

  * Revert "eCryptfs: Handle failed metadata read in lookup"
  * Revert "KVM: fix kvmclock regression due to missing clock update"
  * Revert "ath9k: use split rx buffers to get rid of order-1 skb
    allocations"

linux (2.6.38-11.49) natty-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #836903

  [ Adam Jackson ]

  * SAUCE: drm/i915/pch: Fix integer math bugs in panel fitting
    - LP: #753994

  [ Keng-Yu Lin ]

  * SAUCE: Input: ALPS - Enable Intellimouse mode for Lenovo Zhaoyang E47
    - LP: #632884, #803005

  [ Stefan Bader ]

  * [Config] Force perf to use libiberty for demangling
    - LP: #783660

  [ Tim Gardner ]

  * [Config] Add enic/fnic to udebs
    - LP: #801610

  [ Upstream Kernel Changes ]

  * eeepc-wmi: add keys found on EeePC 1215T
    - LP: #812644
  * eCryptfs: Handle failed metadata read in lookup
    - LP: #509180
  * pagemap: close races with suid execve, CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020
  * report errors in /proc/*/*map* sanely, CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020
  * close race in /proc/*/environ, CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020
  * auxv: require the target to be tracable (or yourself), CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020
  * deal with races in /proc/*/{syscall, stack, personality}, CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020
  * vmscan: fix a livelock in kswapd
    - LP: #813797
  * mmc: Add PCI fixup quirks for Ricoh 1180:e823 reader
    - LP: #773524
  * mmc: Added quirks for Ricoh 1180:e823 lower base clock frequency
    - LP: #773524
  * rose: Add length checks to CALL_REQUEST parsing, CVE-2011-1493
    - LP: #816550
    - CVE-2011-1493
  * pata_marvell: Add support for 88SE91A0, 88SE91A4
    - LP: #777325
  * GFS2: make sure fallocate bytes is a multiple of blksize, CVE-2011-2689
    - LP: #819572
    - CVE-2011-2689
  * Bluetooth: l2cap and rfcomm: fix 1 byte infoleak to userspace.
    - LP: #819569
    - CVE-2011-2492
  * drm/nv50-nvc0: work around an evo channel hang that some people see
    - LP: #583760
  * KVM: fix kvmclock regression due to missing clock update
    - LP: #795717
  * Add mount option to check uid of device being mounted = expect uid,
    CVE-2011-1833
    - LP: #732628
    - CVE-2011-1833
  * proc: fix oops on invalid /proc/<pid>/maps access, CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020
  * ipv6: make fragment identifications less predictable, CVE-2011-2699
    - LP: #827685
    - CVE-2011-2699
  * ath9k: use split rx buffers to get rid of order-1 skb allocations
    - LP: #728835
  * perf: Fix software event overflow, CVE-2011-2918
    - LP: #834121
    - CVE-2011-2918
 -- Herton Ronaldo Krzesinski <email address hidden> Mon, 12 Sep 2011 17:23:38 -0300

Changed in linux (Ubuntu Natty):
status: Incomplete → Fix Released
Ian! D. Allen (idallen) wrote :

See also comment #104.

Corruption is still evident in Ubuntu 11.04 natty with this kernel:

Linux linux 2.6.38-11-generic #50-Ubuntu SMP Mon Sep 12 21:17:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

I mount a 17GB ecryptfs directory and run an enhanced md5sum on the
directory and save the results. (Enhanced means the md5sum is saved
along with a Perl "stat" of the file - see "man perlfunc" for the order
of the 13 fields from "stat".) I then loop, running the enhanced md5sum
and comparing the new results with the saved. When something differs, I
output info about the saved and the new. So what you see below is pairs
of lines where the inode numbers (third field) are the same between runs
but the md5sums differ.
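
The procedure described above can be sketched roughly as follows. This is not the script used for these results (that one records all 13 fields of a Perl "stat"); it is a reduced shell approximation that assumes GNU `md5sum` and GNU `stat -c`, and the names `snapshot` and `changed_files` are hypothetical:

```shell
# snapshot DIR
# Print one line per regular file under DIR: md5sum, inode, size, path.
# Files that cannot be read are skipped. Paths containing newlines are
# not handled in this sketch.
snapshot() {
    find "$1" -xdev -type f -exec sh -c '
        for f in "$@"; do
            sum=$(md5sum < "$f") || continue
            printf "%s %s %s\n" "${sum%% *}" "$(stat -c "%i %s" -- "$f")" "$f"
        done
    ' sh {} + | LC_ALL=C sort -k 4
}

# changed_files OLD NEW
# Print the lines of snapshot file NEW that do not appear verbatim in OLD,
# i.e. files whose checksum, inode or size differs between two passes.
# Like grep, returns non-zero when nothing differs.
changed_files() {
    grep -Fxv -f "$1" "$2"
}
```

A loop would save one `snapshot` as the baseline and then repeatedly take a fresh one and feed both to `changed_files`. On a directory nobody writes to, any output is suspicious.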

Not every run produces differences - it's about one run in four. A run
with differences usually shows only one or two differences, but one run
today turned up eight.

One unusual common factor is that all the files whose md5sums change
have between 2 and 186 hard links. No single-link files yet.

Most of the differences in md5sums are unique (only happened to that
inode once during my half-day of testing), but six inodes appear twice
and one appears three times (inode 5260333, the one with 186 hard links).

I did not run this half-day test single-user this time, so there is
the small chance that some process other than ecryptfs is causing these
problems. I got the same behaviour when I ran it single-user (see #104).

1f9656d3d4e6379cd02c81d6b5125a41 25 5125149 33188 2 777 777 0 875 1316720060 1001274528 1286771949 4096 24
4c19712378cc9074f5aa01c16a53dd8f 25 5125149 33188 2 777 777 0 875 1316720060 1001274528 1286771949 4096 24
464025ab8ecca844ef14927d2ebbfe7f 25 5124704 33188 2 777 777 0 147 1316720060 938293572 1286771949 4096 24
7806d1a26b5ce00b3079d97c29ff2bd6 25 5124704 33188 2 777 777 0 147 1316720060 938293572 1286771949 4096 24
19d7c14415e33af9cbd0e70fbb002e77 25 6685993 33056 2 777 777 0 10240 1316720036 1200422921 1286771775 4096 40
83a70208705f5934366b108aff9d3116 25 6685993 33056 2 777 777 0 10240 1316720036 1200422921 1286771775 4096 40
7c6351e26037a45d2279be4ffb4b6563 25 6686006 33056 2 777 777 0 10240 1316720036 1204740421 1286771775 4096 40
b7f958f96a2ec4a4835ed29fd2bbf70e 25 6686006 33056 2 777 777 0 10240 1316720036 1204740421 1286771775 4096 40
20df6ab0a39207fe7049ede1ea685f75 25 6686420 33056 2 777 777 0 10240 1316720036 1201029149 1286771775 4096 40
d222ff46ee1f1e489391e53c89f36909 25 6686420 33056 2 777 777 0 10240 1316720036 1201029149 1286771775 4096 40
22bfb8c1dd94b5f3813a2b25da67463f 25 5260333 33188 186 777 777 0 220 1316719636 1268985590 1306955878 4096 24
7faace8dee9768922d3dacc09723d1ae 25 5260333 33188 186 777 777 0 220 1316719636 1268985590 1306955878 4096 24
22bfb8c1dd94b5f3813a2b25da67463f 25 5260333 33188 186 777 777 0 220 1316719636 1268985590 1306955878 4096 24
7faace8dee9768922d3dacc09723d1ae 25 5260333 33188 186 777 777 0 220 1316719636 1268985590 1306955878 4096 24
260c0685b36144d0cad219a98a03079d 25 5124701 33188 2 777 777 0 125 1316720060 938295176 1286771948 4096 24
8e342f7915ece5de1b6a1924a2a00470 25 5124701 33188 2 777 777 0 125 1316720060 938295176 1286771948 4096 24
4ca1bba41e08525aa5...

Ian! D. Allen (idallen) wrote :

See also comments #104 and #128.

In parallel with the looping md5sum of everything in the 17G ecryptfs
read-only partition described above, I picked one of the files that
had produced different md5sums on different runs and wrote this script
to repeatedly md5sum the same file over and over (filename shortened
for clarity):

    #!/bin/sh -u
    # -Ian! D. Allen - <email address hidden> - www.idallen.com

    f=filename
    set -- $( md5sum "$f" )
    start=$1
    while : ; do
        set -- $( md5sum "$f" )
        if [ "$1" != "$start" ] ; then
            echo "$start != $*"
        fi
    done

Over the course of two days running my tests, the above script produced
two outputs, indicating two times when that one file was read and had
a different checksum from when the script started:

c68292cceb13f21de7375c46c0ffdf9a != 58a90406649b9e795f9e1e6b34b806f8 filename
c68292cceb13f21de7375c46c0ffdf9a != 58a90406649b9e795f9e1e6b34b806f8 filename

This file is on a read-only ecryptfs partition. The md5sum should not
be changing.

I note that when the md5sum does change, it always changes to the same
"other" value; it's not random.

In my tests, the ecryptfs corruption seems to happen only when reading
from files with more than one link.

The file itself is 1132 bytes. Its encrypted base file is 12288 bytes.
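
The 12288-byte figure is what one would expect if the lower file consists of an 8192-byte eCryptfs header followed by the data padded to whole 4096-byte extents. Both numbers are assumptions about the default on-disk layout and are not stated in this thread; `expected_lower_size` is a hypothetical helper that only does the arithmetic:

```shell
# expected_lower_size PLAINTEXT_BYTES
# Estimate the size of the encrypted lower file for a given plaintext size.
# Assumes an 8192-byte header and 4096-byte extents (assumed defaults).
expected_lower_size() {
    header=8192   # assumed default eCryptfs header size
    extent=4096   # assumed default extent size
    # round the plaintext size up to a whole number of extents
    echo $(( header + ( ($1 + extent - 1) / extent ) * extent ))
}
```

For the 1132-byte file this gives 8192 + 4096 = 12288, matching the size reported here, and it would also explain why the garbage seen in the original report pads files up to about 12 KiB.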

Rich Wales (richw) wrote :

Since you have identified a specific file which seems to be read in two different ways, it might be useful to capture the actual content of the file in each case (not just the md5sum values) and see if there is any discernible pattern to the corruption (e.g., extraneous trailing nulls in one of the two versions).
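
One way to look for such a pattern, once a good and a bad copy of the same file have been saved, is sketched below. It assumes `od`, `awk`, `cmp` and a `head` that supports `-c`; the function names are made up for illustration:

```shell
# trailing_nul_count FILE
# Print the number of consecutive NUL bytes at the end of FILE.
trailing_nul_count() {
    # -v keeps od from collapsing repeated lines into "*"
    od -An -v -tx1 < "$1" | tr -s ' ' '\n' |
        awk 'NF { if ($1 == "00") n++; else n = 0 } END { print n + 0 }'
}

# is_prefix_of GOOD BAD
# Succeed if BAD starts with exactly the contents of GOOD, i.e. the
# corruption only appended bytes and left the original data intact.
is_prefix_of() {
    head -c "$(( $(wc -c < "$1") ))" < "$2" | cmp -s - "$1"
}
```

If `is_prefix_of good bad` succeeds and `trailing_nul_count bad` is close to the size difference between the two copies, the bad copy is the good one plus mostly-zero padding, which is the pattern described in the original report.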

And if the corruption happens only with files that have more than one hard link, this might help explain why some people have not reported seeing the problem.

I've upgraded all five Ubuntu systems under my control to the latest Natty kernel (2.6.38-11.50), but I'm obviously still nervous as long as this bug appears to be evident.

Ian! D. Allen (idallen) wrote :

Ubuntu 11.04 natty
Linux ubuntu 2.6.38-11-generic #50-Ubuntu SMP Mon Sep 12 21:17:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

I'm now reproducing the problem in a VMware virtual machine by repeatedly
md5sum scanning a subset of files on that 17GB directory as small as
119 MB. (If I scan too small a directory, some sort of caching seems
to happen and I don't see any errors.)

What is happening is that the sizes of some files are changing on some
subsequent md5sum passes. Usually when one stat()s the file, one gets
the actual size of the unencrypted file. Occasionally, the size of the
underlying *encrypted* file is substituted for the real file size and
so the md5sum sums some extra bytes that it shouldn't.

It's tricky to get at the bad content of the file, since to do so I'd
have to actually copy every file to a temporary location, check to
see if the size of the copy matched the original, and complain if not.
I'll work on doing that. Until then, here is an example.

Here is the correct file, md5sum, and size (61440):

9a601197629e5c0b68ecad8d039d1b51 23 1575321 33056 2 777 777 0 61440 1316934939 1163069184 1316934615 4096 136 /home/idallen-ecryptfs/[filename]

When I repeated the md5sum scanning over and over, suddenly I got this:
a wrong md5sum and wrong file size (69632):

e26ef8caef4c1eca7672a6e0678b3190 23 1575321 33056 2 777 777 0 69632 1316934939 1163069184 1316934615 4096 136 /home/idallen-ecryptfs/[filename]

Where does 69632 come from? Well, look at the encrypted file size that
corresponds to that inode:

# find /mnt/sdb1/idallen-ecryptfs/.Private -inum 1575321 -ls
1575321 68 -r--r----- 2 idallen idallen 69632 Nov 9 2006 /mnt/sdb1/idallen-ecryptfs/.Private/[...]

There it is! The underlying *encrypted* file size is bleeding up to be
the (incorrect) size of the *unencrypted* file on some passes.
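
Both size pairs quoted in this thread (1132 vs. 12288 bytes, and 61440 vs. 69632 bytes) fit a lower file made of an 8192-byte header followed by the data padded to whole 4096-byte extents. Those two constants are an assumption that happens to match these examples; the thread itself does not state them. Under that assumption, a rough sketch of the arithmetic and of a check for the "bleeding" size might look like this (`expected_lower_size` and `size_looks_bled` are hypothetical helper names; `stat -c` is the GNU coreutils form):

```shell
#!/bin/sh
# Assumed layout constants; they fit the sizes quoted in this thread.
HEADER_BYTES=8192
EXTENT_BYTES=4096

# Hypothetical helper: plaintext size in, expected lower-file size out.
expected_lower_size() {
    plain=$1
    extents=$(( (plain + EXTENT_BYTES - 1) / EXTENT_BYTES ))
    echo $(( HEADER_BYTES + extents * EXTENT_BYTES ))
}

# Hypothetical helper: warn when the size reported through the eCryptfs
# mount equals the size of the lower (encrypted) file.
size_looks_bled() {
    upper=$1   # path under the decrypted mount
    lower=$2   # matching path under .Private
    usize=$(stat -c %s -- "$upper") || return 2
    lsize=$(stat -c %s -- "$lower") || return 2
    if [ "$usize" -eq "$lsize" ]; then
        echo "suspect: $upper reports $usize bytes, same as lower file"
        return 1
    fi
    return 0
}
```

The matching lower file can be located by inode number, as done with `find ... -inum` in this comment.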

I'm also seeing these errors in kern.log, but these messages don't always
appear when I'm seeing corruption, so I don't know how to relate them:

[...]
Sep 25 20:43:12 ubuntu kernel: [59348.543107] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
Sep 25 20:43:12 ubuntu kernel: [59348.543113] ecryptfs_readpage: Error decrypting page; rc = [-4]
Sep 25 21:59:10 ubuntu kernel: [63906.460167] ecryptfs_read_and_validate_header_region: Error reading header region; rc = [-4]
Sep 25 21:59:10 ubuntu kernel: [63906.460334] Valid eCryptfs headers not found in file header region or xattr region
Sep 25 21:59:10 ubuntu kernel: [63906.460336] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

Ian! D. Allen (idallen) wrote :

See also comments #104, #128, #129, #131.

Corruption (file size incorrect for ecryptfs files) also happens with this kernel from ppa:kernel-ppa/pre-proposed :

Linux ubuntu 2.6.38-11-generic #51~pre201109230902-Ubuntu SMP Fri Sep 23 09:15:48 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

dmesg shows:

[ 245.530576] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
[ 245.530600] ecryptfs_readpage: Error decrypting page; rc = [-4]
[ 245.531485] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
[ 245.531494] ecryptfs_readpage: Error decrypting page; rc = [-4]

As noted previously, corruption of file sizes happens even when no errors show in dmesg or kern.log.

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.32-34.77

---------------
linux (2.6.32-34.77) lucid-proposed; urgency=low

  [Steve Conklin]

  * Release Tracking Bug
    - LP: #849228

  [ Upstream Kernel Changes ]

  * Revert "drm/i915: Remove BUG_ON from i915_gem_evict_something"
  * Revert "drm/i915: Periodically flush the active lists and requests"
  * Revert "drm/i915/evict: Ensure we completely cleanup on failure"
  * Revert "drm/i915: Maintain LRU order of inactive objects upon access by
    CPU (v2)"
  * Revert "drm/i915: Implement fair lru eviction across both rings. (v2)"
  * Revert "drm/i915: Move the eviction logic to its own file."
  * Revert "drm/i915: prepare for fair lru eviction"

linux (2.6.32-34.76) lucid-proposed; urgency=low

  [Steve Conklin]

  * Release Tracking Bug
    - LP: #836914

  [ Upstream Kernel Changes ]

  * Revert "drm/nv50-nvc0: work around an evo channel hang that some people
    see"
  * Revert "eCryptfs: Handle failed metadata read in lookup"
  * Revert "tunnels: fix netns vs proto registration ordering"

linux (2.6.32-34.75) lucid-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #832332

  [ Upstream Kernel Changes ]

  * drm/i915: Remove BUG_ON from i915_gem_evict_something
    - LP: #828550

linux (2.6.32-34.74) lucid-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #828375

  [ Upstream Kernel Changes ]

  * proc: fix oops on invalid /proc/<pid>/maps access, CVE-2011-1020
    - LP: #813026
    - CVE-2011-1020

linux (2.6.32-34.73) lucid-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #824148

  [ Tim Gardner ]

  * SAUCE: rtl8192se: Force a build for a 2.6/3.0 kernel
    - LP: #805494
  * [Config] Add enic/fnic to udebs
    - LP: #801610

  [ Upstream Kernel Changes ]

  * tty: icount changeover for other main devices, CVE-2010-4076,
    CVE-2010-4077
    - LP: #720189
    - CVE-2010-4077
  * fs/partitions/efi.c: corrupted GUID partition tables can cause kernel
    oops
    - LP: #795418
    - CVE-2011-1577
  * ftrace: Only update the function code on write to filter files
    - LP: #802383
  * kmemleak: Do not return a pointer to an object that kmemleak did not
    get
    - LP: #802383
  * CPU hotplug, re-create sysfs directory and symlinks
    - LP: #802383
  * Fix memory leak in cpufreq_stat
    - LP: #802383
  * powerpc/kexec: Fix memory corruption from unallocated slaves
    - LP: #802383
  * powerpc/oprofile: Handle events that raise an exception without
    overflowing
    - LP: #802383
  * mtd: mtdconcat: fix NAND OOB write
    - LP: #802383
  * x86, 64-bit: Fix copy_[to/from]_user() checks for the userspace address
    limit
    - LP: #802383
  * ext3: Fix fs corruption when make_indexed_dir() fails
    - LP: #802383
  * jbd: Fix forever sleeping process in do_get_write_access()
    - LP: #802383
  * jbd: fix fsync() tid wraparound bug
    - LP: #802383
  * ext4: release page cache in ext4_mb_load_buddy error path
    - LP: #802383
  * Fix Ultrastor asm snippet
    - LP: #802383
  * x86, amd: Do not enable ARAT feature on AMD processors below family
    0x12
    - LP: #802383
  * x86, ...

Changed in linux (Ubuntu Lucid):
status: Incomplete → Fix Released
Ian! D. Allen (idallen) wrote :

> Launchpad Janitor (janitor) wrote 18 hours ago: #133
> This bug was fixed in the package linux - 2.6.32-34.77

It's broken in stock Ubuntu 11.04 using 2.6.38-11-generic #50-Ubuntu SMP Mon Sep 12 21:17:25 UTC 2011 x86_64.

Paolo Bonzini (bonzini) wrote :

I still see occasional errors:

[33945.269075] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
[33945.269084] ecryptfs_readpage: Error decrypting page; rc = [-4]

that are reported as "short read: Success" when doing a "git grep" on an encrypted repository. Redoing the grep fixes the problems for those files and may transfer it to others, until all of them are in the cache and the grep succeeds.

Please, anyone responding to this report, please please please tell us
exactly which kernel you're running.

uname -a

tankdriver (stoneraider) wrote :

kernel: [18393.960955] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-5]
kernel: [18393.960962] ecryptfs_writepage: Error encrypting page (upper index [0x000000000000006c])

$ uname -a
Linux thomas-VPCF13J0E 3.0.0-12-generic #19-Ubuntu SMP Fri Sep 23 21:23:39 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Tyler Hicks (tyhicks) wrote :

On 2011-10-06 15:18:11, tankdriver wrote:
> kernel: [18393.960955] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-5]

This is an -EIO error returned from the lower filesystem. It isn't
related to this bug and isn't an eCryptfs bug. If anything, it is
eCryptfs being too verbose when it sees an error code.

Tyler Hicks (tyhicks) wrote :

On 2011-10-06 14:34:12, Paolo Bonzini wrote:
> I still see occasional errors:
>
> [33945.269075] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
> [33945.269084] ecryptfs_readpage: Error decrypting page; rc = [-4]
>
> that are reported as "short read: Success" when doing a "git grep" on an
> encrypted repository. Redoing the grep fixes the problems for those
> files and may transfer it to others, until all of them are in the cache
> and the grep succeeds.

This is what the fix was intended to do. When eCryptfs is trying to
read from the lower filesystem and gets interrupted, pass the EINTR
error code onto userspace. However, eCryptfs doesn't need to write a log
message about it.

From your description, I can't tell if grep handled the EINTR error
correctly. If you can reproduce this, can you please strace grep and
attach the output so that I can take a better look at what grep is
seeing returned from system calls?

$ strace -o grep.strace grep [GREP ARGS]

Oh, and please listen to kirkland's request and give us `uname -a`
output.

harcesz (harcesz) wrote :

bump, one more guinea pig happy to test possible solutions (whole system drive encryption btw)

Linux parafia 2.6.38-11-generic #50-Ubuntu SMP Mon Sep 12 21:17:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Ian! D. Allen (idallen) wrote :

See also comments #104, #128, #129, #131, #132, #134.

Still seeing ecryptfs file size corruption (leading to file corruption) in latest Ubuntu 11.04 kernel update:

Linux linux 2.6.38-12-generic #51-Ubuntu SMP Wed Sep 28 14:27:32 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

eCryptfs occasionally returns the (larger) size of the underlying encrypted file, not the size of the
decrypted file being accessed in the decrypted and mounted partition.

Joe Edmonds (joee) wrote :

Ever since moving to an ext4+ecryptfs oneiric system, I receive this message about 6,000 times a day in /var/log/kern.log:

Oct 16 01:56:39 hostname kernel: [32384.222198] Valid eCryptfs headers not found in file header region or xattr region
Oct 16 01:56:39 hostname kernel: [32384.222211] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

I don't know whether trailing garbage is being added to my encrypted files. But the bug about the kernel message is marked as a duplicate of this one:

https://bugs.launchpad.net/ubuntu/+source/ecryptfs-utils/+bug/372014

$ uname -srvmo
Linux 3.0.0-12-generic #20-Ubuntu SMP Fri Oct 7 14:56:25 UTC 2011 x86_64 GNU/Linux

Ian! D. Allen (idallen) wrote :

See also my comments #104, #128, #129, #131, #132, #134, #141.

The above ecryptfs corruption is reduced but not gone in Ubuntu 11.10 Oneiric:

    Linux ubuntu 3.0.0-12-generic #20-Ubuntu SMP Fri Oct 7 14:56:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

I repeatedly ran md5sum on a read-only ecryptfs partition with 18GB of files (many with multiple hard links) and the md5sum results *changed* on a few files on some of the runs.

It took many more runs (several dozen) to discover the corruption than when using the 2.6.38 kernel, and *only* the md5sums changed between runs, not the file sizes as well as was true with 2.6.38. I also saw the corruption on files with only one link, where under 2.6.38 it seemed to happen only on files with more than one link.

I'm running these tests using a VMware install of a fully-updated Ubuntu 11.10 Oneiric with a separate 32GB virtual disk containing the 18GB ecryptfs partition.

I'm looping a full md5sum of all 18GB in one terminal window and in a second window I'm looping a simultaneous md5sum of a 4.2GB subdirectory. When corruption happens, both windows show it happening on the same file(s). Out of 48 runs over the full 18GB so far, four runs turned up files with different checksums. In the other loop, out of 171 runs over the 4.2GB subdirectory so far, 11 runs turned up files with different checksums.

I did see some kern.log errors early in the testing process:

Oct 20 14:57:49 ubuntu kernel: [ 65.128562] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
Oct 20 16:04:30 ubuntu kernel: [ 4066.054573] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
Oct 20 16:04:30 ubuntu kernel: [ 4066.054580] ecryptfs_readpage: Error decrypting page; rc = [-4]
Oct 20 16:47:30 ubuntu kernel: [ 6645.362178] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
Oct 20 16:47:30 ubuntu kernel: [ 6645.362263] ecryptfs_readpage: Error decrypting page; rc = [-4]

Those errors all appeared long before the runs showing corruption.

/dev/sdb1 on /mnt/sdb1 type ext4 (ro)
/mnt/sdb1/idallen-ecryptfs/.Private on /home/idallen-ecryptfs type ecryptfs (ro,ecryptfs_unlink_sigs,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_sig=xxxxxxxxxxxxxxxx,ecryptfs_fnek_sig=yyyyyyyyyyyyyyyy)

Ian! D. Allen (idallen) wrote :

See also my comments #104, #128, #129, #131, #132, #134, #141, #143.

Ubuntu 11.10 Oneiric desktop new install, fully updated
shows file corruption when reading ecryptfs files.

Linux ubuntu 3.0.0-12-generic #20-Ubuntu SMP Fri Oct 7 14:56:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Reading the same read-only 18GB ecryptfs file system I've been using
above, I took md5sums of all the files and created a base file of 435,649
md5sum file entries.

I ran a script that looped for every file in the base file. It copied
each file to a tmp file, did an md5sum on the tmp file, and compared
that md5sum with the saved md5sum in the base file. I ran three looped
iterations of this script in parallel, each offset to start at a different
third of the md5sum base file. On the third iteration, all three scripts
found the same md5sum difference on the same file name:

MD5MISMATCH at line 62461 d4015d1962291b00abff6d14d8e71bb6 639f9b2e0f6bce8b287b879a1b0e17b5

All three scripts, running separately, came up with the same incorrect
md5sum for the tmp file copy, at three different times. The tmp copy was
the same size as the original. Comparing the tmp file with the original:

$ cmp -bl [original] [tmp]
 4865 0 ^@ 110 H
 4866 320 M-P 127 W
 4867 314 M-L 254 M-,
 4868 360 M-p 370 M-x
 4869 3 ^C 147 g
 4870 4 ^D 73 ;
 4871 360 M-p 237 M-^_
 4872 340 M-` 16 ^N
 4873 60 0 272 M-:
 4875 0 ^@ 347 M-g
 4876 0 ^@ 220 M-^P
 4877 0 ^@ 36 ^^
 4878 0 ^@ 347 M-g
 4879 240 M- 336 M-^
 4880 5 ^E 231 M-^Y
 4896 157 o 177 ^?

On the next five subsequent passes by each of the three scripts, no
md5sum changes were detected in any of the files, including the problem
file identified above. The corruption is intermittent.
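
The copy-and-compare loop described in this comment could be sketched roughly as follows. This is not the script that was actually run: `verify_copies` is a hypothetical name, and the base file is assumed to hold plain `md5sum` text-mode output (one "checksum  path" pair per line, with no unusual characters in the paths).

```shell
#!/bin/sh
# Hypothetical helper. Its input is a file of "md5sum  path" lines,
# for example as written by: md5sum FILE... > base
verify_copies() {
    base=$1
    tmp=$(mktemp) || return 1
    line=0
    bad=0
    while read -r want path; do
        line=$((line + 1))
        # Copy first, then hash the copy rather than the original.
        cp -- "$path" "$tmp" || continue
        got=$(md5sum < "$tmp")
        got=${got%% *}
        if [ "$got" != "$want" ]; then
            echo "MD5MISMATCH at line $line $want $got"
            bad=$((bad + 1))
        fi
    done < "$base"
    rm -f -- "$tmp"
    # Succeed only if every copy matched its saved checksum.
    [ "$bad" -eq 0 ]
}
```

Running several instances in parallel, each starting at a different offset in the base file, would need an extra argument that this sketch leaves out.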

From the kern.log file, here are some ecryptfs errors, but the errors
below do *not* correspond to the time that the script found the md5sum
difference (which was on 06:10am Oct 23):

Oct 20 14:57:49 ubuntu kernel: [ 65.128562] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: (null)
Oct 20 16:04:30 ubuntu kernel: [ 4066.054573] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
Oct 20 16:04:30 ubuntu kernel: [ 4066.054580] ecryptfs_readpage: Error decrypting page; rc = [-4]
Oct 20 16:47:30 ubuntu kernel: [ 6645.362178] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
Oct 20 16:47:30 ubuntu kernel: [ 6645.362263] ecryptfs_readpage: Error decrypting page; rc = [-4]
Oct 22 15:49:25 ubuntu kernel: [175960.571423] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
Oct 22 15:49:25 ubuntu kernel: [175960.571503] ecryptfs_readpage: Error decrypting page; rc = [-4]
Oct 22 16:12:46 ubuntu kernel: [177361.428710] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
Oct 22 16:12:46 ubuntu kernel: [177361.428735] ecryptfs_readpage: Error decrypting page; rc = [-4]
Oct 22 16:23:12 ubuntu kernel: [177987.700577] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
Oct 22 16:23:12 ubuntu kernel: [177987.700593] ecryptfs_readpage: Error decrypting page; rc = [-4]
Oct 22 16:31:57 ubuntu kernel: [178512.471610] ecryptfs_decr...


hackel (hackel) wrote :

Since this bug is marked "fixed" in both Natty and Oneiric, is there another bug that people still experiencing this issue should follow, or does this one need to be re-opened?

Mikko Rantalainen (mira) wrote :

Ian! D. Allen (idallen) wrote: "It took many more runs (several dozen) to discover the corruption than when using the 2.6.38 kernel, and *only* the md5sums changed between runs, not the file sizes as well as was true with 2.6.38."

Have you ruled out hardware causes (e.g. tried the same test on a different computer)?

I'm asking this because just yesterday I tracked down a really hard-to-find hardware issue with my own computer. It manifested itself as random sha1sum changes in files when computing checksums for an 18 GB data set. The system had 8 GB of SDRAM. memtest86 did not detect any problems when I ran it for 5 hours, but I was always able to reproduce the checksum errors in less than 15 minutes by computing sha1sums repeatedly. In that case, the problem turned out to be an incorrectly behaving memory chip. The problem vanished when I stopped using that part!

vkapas (vkapas) wrote :

Ubuntu 10.04 x64 with all updates
Linux dell-inspiron-1501 2.6.32-36-generic #79~pre201110180903-Ubuntu SMP Tue Oct 18 10:11:27 UTC 2011 x86_64 GNU/Linux
(kernel from ppa:kernel-ppa/pre-proposed)

mount
=======================================================
/dev/sda1 on / type ext4 (rw,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
none on /sys type sysfs (rw,noexec,nosuid,nodev)
none on /sys/fs/fuse/connections type fusectl (rw)
none on /sys/kernel/debug type debugfs (rw)
none on /sys/kernel/security type securityfs (rw)
none on /dev type devtmpfs (rw,mode=0755)
none on /dev/pts type devpts (rw,noexec,nosuid,gid=5,mode=0620)
none on /dev/shm type tmpfs (rw,nosuid,nodev)
none on /var/run type tmpfs (rw,nosuid,mode=0755)
none on /var/lock type tmpfs (rw,noexec,nosuid,nodev)
none on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
/dev/sda5 on /home type xfs (rw)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
gvfs-fuse-daemon on /home/set/.gvfs type fuse.gvfs-fuse-daemon (rw,nosuid,nodev,user=set)
/home/set/.Private on /home/set type ecryptfs (ecryptfs_check_dev_ruid,ecryptfs_sig=34e21478f0216127,ecryptfs_fnek_sig=681325906c7014cb,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_unlink_sigs)
=======================================================

dmesg | tail
=======================================================
[80152.510213] Valid eCryptfs headers not found in file header region or xattr region
[80152.510218] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[80383.179948] Valid eCryptfs headers not found in file header region or xattr region
[80383.179955] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[80682.782143] Valid eCryptfs headers not found in file header region or xattr region
[80682.782149] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[80982.794356] Valid eCryptfs headers not found in file header region or xattr region
[80982.794364] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
=======================================================

Maybe this will be useful:

$ find $HOME/.Private/ -size 0c -exec ls '{}' \; | wc -l
=======================================================
find: `/home/set/.Private/ECRYPTFS_FNEK_ENCRYPTED.FWZc2mKEP5-ImkR7bJeHmOTHeUks.Mn0L6Ar9hQNfSzwO5pSLN6-VjOv----': Permission denied
192
=======================================================
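
To see which lower files are empty, rather than only how many, the names can be printed directly. A zero-length lower file has no room for an eCryptfs header, so such files might be one source of the "Valid eCryptfs headers not found" messages; the thread does not confirm this. A minimal sketch, where `list_empty_lower_files` is a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: print every zero-length regular file under the
# given lower directory (for example "$HOME/.Private").
list_empty_lower_files() {
    lower_dir=$1
    # -size 0c matches files with no data at all; errors such as
    # "Permission denied" on individual entries are left on stderr.
    find "$lower_dir" -type f -size 0c -print
}
```

Piping the output through `wc -l` gives the same kind of count as the command above.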

Codagnoni (codagnoni) on 2011-11-09
affects: linux (Ubuntu Lucid) → nautilus (Ubuntu Lucid)
affects: nautilus (Ubuntu) → linux (Ubuntu)
Martin (lodp) wrote :

I'm seeing this bug in a 3-week old fresh install of oneiric, with encrypted home and ext4fs. Out of the blue the system froze up, then X wouldn't start on the next boot, scared the living s*** out of me. Now X intermittently does start. So it seems this bug really isn't fixed.

here's my kernel: 3.0.0-12-generic #20-Ubuntu SMP Fri Oct 7 14:56:25 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Is there a way I can fix this without re-installing the whole thing? Much of the above thread is a mystery to me -- do I have to find corrupted empty files and delete them? How would I go about that? Thanks!

Martin (lodp) wrote :

As for the bug, I forgot to mention that my disk may well have been filled 100% right before the error occurred. I think somebody posting above reported a similar thing. So maybe that can explain why the supposedly fixed bug resurfaced on my system using the 3.0.0-12 kernel?

For the record, to clean up the corrupted files, I used a variant of the tip somebody above posted to find and remove corrupted files:

find . -type f -size -10k -exec cat {} \; > /dev/null

To explain for dummies like me: this reads all files smaller than 10k and yields an Input/Output error for the affected files (which in my case were all 0-byte files, mostly lockfiles). You can then go about removing those files manually.
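
A variant that prints only the names of the files that fail to read, instead of relying on the error messages from `cat`, might look like the sketch below. `list_unreadable_small_files` is a hypothetical name, and file names containing newlines are not handled.

```shell
#!/bin/sh
# Hypothetical helper: list small regular files whose contents cannot
# be read (for example because the read fails with an I/O error).
list_unreadable_small_files() {
    dir=$1
    find "$dir" -type f -size -10k -print | while IFS= read -r f; do
        # Print only the names whose contents cannot be read.
        cat -- "$f" > /dev/null 2>&1 || printf '%s\n' "$f"
    done
}
```

The resulting list can be reviewed by hand before anything is deleted.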

cfaber (cfaber) wrote :

I'd like to point out that I am seeing data corruption events due to this issue. It appears that once the files are mangled they cannot be decrypted and end up showing up as null. This is a very serious problem as there is no way to retrieve the data.

A quick search with `find ~ -type f -exec tail -n1 {} \; > /dev/null' indicates a large number of frequently written configuration files, database files and temporary lock files which are now unreadable with -EIO error returned.

jseabra (jseabra) wrote :

The problem remains on my Ubuntu 11.10 system.
Does anyone know how to fix this issue?

Sam Liddicott (sam-liddicott) wrote :

The only remedy right now is to be careful not to fill your file system.

ilia (ilia) wrote :

This bug no longer affects me, as well as all other ecryptfs related bugs! :)

It has turned out that my needs are fulfilled by another solution, much better in terms of stability AND performance: LUKS.
The difference from ecryptfs from the user's point of view:
1. whole partition/volume is encrypted (I use a separate LVM volume for /home anyway)
2. the simplest setup requires the user to enter a password at boot (this can be automated for those who need it, but I don't mind)
3. very stable:
     no more kernel errors, I/O errors and trailing garbage in your files!
4. much better performance:
    in my simple tests the improvement ranges from 5% for a 300M single-file read to 2000% (i.e. 21 times faster) for fstat'ing 5000 files! All tests were performed with cold caches. I show only the worst and best results here, but the general feeling is that the average improvement is close to an order of magnitude.

Quick howto "starting using LUKS right now":
1. Allocate new volume/partition, e.g.
     sudo lvcreate --size 10G --name test_luks vg00
    OR use fdisk to create something like /dev/sda6, then substitute your device name in place of /dev/mapper/vg00-test_luks below
2. Format it for LUKS
    sudo cryptsetup luksFormat /dev/mapper/vg00-test_luks
    provide a password when asked (anything, not your login one)
3. Setup automatic decryption
    echo "test_crypt /dev/mapper/vg00-test_luks none luks" | sudo tee -a /etc/crypttab
4. Format decrypted volume, e.g. as ext4
    sudo cryptdisks_start test_crypt
    sudo mkfs.ext4 /dev/mapper/test_crypt
5. Setup automatic mount, e.g.
    echo '/dev/mapper/test_crypt /mnt/test ext4 defaults 0 2' | sudo tee -a /etc/fstab
6. Reboot to test. You will need to provide a password during boot process.

When you are comfortable with LUKS you can migrate your /home to it and have fun :)

More info for the curious:
1. https://wiki.archlinux.org/index.php/System_Encryption_with_LUKS
2. man cryptsetup crypttab
3. google :)

P.S. While this comment is formally off-topic I think many users affected by this bug may find it useful.

Paolo Bonzini (bonzini) wrote :

The EINTR is not passed up, and leaves bad data in the cache as if ecryptfs thought the files were not encrypted. This is what causes the short writes. Getting an strace is quite difficult, but this time I succeeded:

lstat("include/asm-ia64/ia64_int.h", {st_mode=S_IFREG|0664, st_size=12288, ...}) = 0
lstat("include/asm-ia64/init.h", {st_mode=S_IFREG|0664, st_size=12288, ...}) = 0
lstat("include/asm-ia64/init.h", {st_mode=S_IFREG|0664, st_size=12288, ...}) = 0
lstat("include/asm-ia64/iocap.h", {st_mode=S_IFREG|0664, st_size=12288, ...}) = 0
lstat("include/asm-ia64/iocap.h", {st_mode=S_IFREG|0664, st_size=12288, ...}) = 0
lstat("include/asm-ia64/linux-null/README.origin", {st_mode=S_IFREG|0664, st_size=12288, ...}) = 0
lstat("include/asm-ia64/linux-null/README.origin", {st_mode=S_IFREG|0664, st_size=12288, ...}) = 0
lstat("include/asm-ia64/ia64_int.h", {st_mode=S_IFREG|0664, st_size=12288, ...}) = 0
open("include/asm-ia64/ia64_int.h", O_RDONLY) = 3
read(3, "#ifndef _ASM_IA64_INT_H\n#define "..., 12288) = 2139
read(3, "", 10149) = 0

$ uname -a
Linux yakj.usersys.redhat.com 3.1.0-7.fc16.x86_64 #1 SMP Tue Nov 1 21:10:48 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

$ dmesg | tail -4
[162849.845909] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
[162849.845913] ecryptfs_readpage: Error decrypting page; rc = [-4]
[162849.845923] ecryptfs_decrypt_page: Error attempting to read lower page; rc = [-4]
[162849.845925] ecryptfs_readpage: Error decrypting page; rc = [-4]

-4 being EINTR.
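
The symptom in the strace above (stat reports 12288 bytes, but reading returns only 2139) can be checked for without strace by comparing the reported size with the number of bytes actually read. A minimal sketch, assuming GNU `stat`; `short_read_check` is a hypothetical name. The file is piped through `cat` because, as far as I know, `wc -c` on a regular file may take the size from the file's metadata instead of reading it.

```shell
#!/bin/sh
# Hypothetical helper: compare the size stat reports with the number of
# bytes that reading the file actually returns.
short_read_check() {
    f=$1
    reported=$(stat -c %s -- "$f") || return 2
    # Pipe through cat so wc counts the bytes that were really read.
    actual=$(cat -- "$f" | wc -c) || return 2
    actual=$((actual))   # normalise any padding in the wc output
    if [ "$reported" -ne "$actual" ]; then
        echo "$f: stat says $reported bytes, read returned $actual"
        return 1
    fi
    return 0
}
```

A mismatch reported on a file under the eCryptfs mount would correspond to the inflated `st_size` values shown in the trace.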

cfaber (cfaber) wrote :

This problem seems to be exacerbated by files requiring many opens and closes and reads and writes (such as program locks).

I've noticed this the most with files generated by skype.

cfaber (cfaber) wrote :

This is NOT fixed in Oneiric.

I just noticed today that my syslog fills up with this:

Nov 28 22:28:29 oscar-laptop kernel: [87699.041735] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-5]
Nov 28 22:28:29 oscar-laptop kernel: [87699.041749] ecryptfs_writepage: Error encrypting page (upper index [0x000000000000003d])
Nov 28 22:28:59 oscar-laptop kernel: [87729.037043] ecryptfs_encrypt_page: Error attempting to write lower page; rc = [-5]
Nov 28 22:28:59 oscar-laptop kernel: [87729.037057] ecryptfs_writepage: Error encrypting page (upper index [0x000000000000003e])

Landed here from a bug with this exact output that said it was a duplicate of this bug, so I'm reporting here. Not sure if it is this bug, related, or what. Not seeing any special problems on my laptop and I don't have time to read the entire bug report. I have an encrypted /home the standard Ubuntu way from install. Tell me if I can provide something more to help.

Ubuntu 3.0.0-14-generic x86_64

Not sure if this is relevant info or not, but I just rebooted and saw in my boot.log that a lot of orphaned inodes were cleared. Maybe something related?

...
/dev/sda7: Clearing orphaned inode 2621447 (uid=1000, gid=1000, mode=0100664, size=40960)
/dev/sda7: Clearing orphaned inode 2621442 (uid=1000, gid=1000, mode=0100664, size=40960)
/dev/sda7: Clearing orphaned inode 2621959 (uid=1000, gid=1000, mode=0100664, size=40960)
...

Since reboot I no longer get the ecryptfs error.

I see this problem on a clean Ubuntu 11.10 64-bit install on a Thinkpad T420, and the disk is nowhere near full. (As an aside: I don't know if it is related to numerous random lockups, which sometimes appear on startup and sometimes appear shortly after desktop login. Sometimes they don't happen for hours.)

Linux T420-laptop 3.0.0-13-generic #22-Ubuntu SMP Wed Nov 2 13:27:26 UTC 2011 x86_64 x86_64 x86_64 GNU/Linux

Filesystem Size Used Avail Use% Mounted on
/dev/sda5 346G 16G 313G 5% /
udev 3.9G 4.0K 3.9G 1% /dev
tmpfs 1.6G 864K 1.6G 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 3.9G 1.2M 3.9G 1% /run/shm
/home/michcook/.Private
                      346G 16G 313G 5% /home/michcook

Please let me know if there is a fix and whether anyone else is chasing/fixing T420 random lockups.

Mathias Dietrich (theghost) wrote :

This issue is still not fixed under a fresh install of Ubuntu Oneiric x64. Dmesg is getting polluted with:

"Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO"

Please fix this finally.

Mathias Dietrich (theghost) wrote :

Ubuntu Oneiric 11.10 x64 (fresh install) with Kernel 3.1.3-030103-generic:

[ 94.300994] Valid eCryptfs headers not found in file header region or xattr region
[ 94.300998] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

Israel Vaughn (daaxix) wrote :

I am seeing similar behavior to #159, except my lockups are so frequent that I cannot even use my laptop anymore. I'm running 11.10 64-bit with all the latest updates.

I see tons of

"Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO"

messages in the log files (syslog and kern.log).

I noticed that this problem became much much worse after the most recent update.

I have ~600,000 files in my home directory, I don't know if that is a contributing factor.

Today my laptop froze, requiring reboot (could not even ssh into it) about 15 times. The last time it froze the matlab code that I was writing was completely corrupted (looked at it in a hex editor, totally mangled) upon reboot.

I'm going to convert to a non-encrypted home and I'll report back whether the freezes stop (if they don't then it must be something else).

Sam Liddicott (sam-liddicott) wrote :

Sounds like a plausible deniability feature is active.

"Honest, its not encrypted, it's corrupted. Ouch"

Of course, to be plausible, such a mode cannot be disabled.

However in England failure to decrypt could result in jail time so this bug
entry could be evidence.

Robert Simmons (rsimmons0) wrote :

I'm getting freezes as well. I was wondering what the problem was, until I checked syslog and noticed literally thousands of messages to the tune of:

12/10/11 06:53:39 PM example kernel [ 1398.520173] Valid eCryptfs headers not found in file header region or xattr region
12/10/11 06:53:39 PM example kernel [ 1398.520183] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

Israel Vaughn (daaxix) wrote :

Reference #162 for background. I converted to a plain ext4 home and am getting fewer freezes, but still getting freezes. It is baffling! I'm going to pull an Office Space on my laptop!

geez (geez) wrote :

I'm also seeing this on 11.10 Oneiric x86_64. I'm using an encrypted home partition, and I have filled the partition up on a few occasions. However, the massive number of errors being logged in my syslog and kern.log has exceeded 2GB, such that even my root partition was maxed out today.

This should *REALLY* be bumped up to critical and fixed ASAP. Data corruption is a deadly sin to operating systems.

Jacob Henner (jacobhenner) wrote :

[ 545.409840] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 545.410475] Valid eCryptfs headers not found in file header region or xattr region
[ 545.410478] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 545.410933] Valid eCryptfs headers not found in file header region or xattr region
[ 545.410936] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 550.590692] Valid eCryptfs headers not found in file header region or xattr region
[ 550.590695] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 550.763826] Valid eCryptfs headers not found in file header region or xattr region
[ 550.763830] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 779.602485] Valid eCryptfs headers not found in file header region or xattr region
[ 779.602489] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 779.607399] Valid eCryptfs headers not found in file header region or xattr region
[ 779.607402] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 780.397931] Valid eCryptfs headers not found in file header region or xattr region
[ 780.397934] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 788.868849] Valid eCryptfs headers not found in file header region or xattr region
[ 788.868853] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 874.393752] Valid eCryptfs headers not found in file header region or xattr region
[ 874.393756] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 874.555839] Valid eCryptfs headers not found in file header region or xattr region
[ 874.555843] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 875.243560] Valid eCryptfs headers not found in file header region or xattr region
[ 875.243564] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 1488.790864] Valid eCryptfs headers not found in file header region or xattr region
[ 1488.790867] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

These errors keep comi...


geez (geez) wrote :

In the meantime, I worked around the problem by removing the encryption from my home folder, using this:

http://www.satansgarden.org/2010/03/05/removing-encryption-from-home-directories-in-ubuntu-9-10/

By mounting the encrypted home folder at /home/username/Private it lets you (sort of) remove the encryption in-place. After mounting it elsewhere I simply did "cp -R /home/username/Private/* /home/username/". Note that you should not forget hidden folders (starting with a dot . ), which you can see using "ls -alh"

(Also, "alias l='ls -alh --color=always'" is a good idea imo ;))
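A note on the hidden-folder caveat: the `*` glob in the `cp -R` command does not match names starting with a dot, so those entries are skipped. A minimal sketch of a one-step copy that includes them, assuming GNU `cp`; the function name `copy_all` and the example paths are hypothetical:

```shell
#!/bin/bash
# copy_all SRC DST: copy every entry of SRC, hidden ones included, into DST.
# The trailing "/." makes cp take the directory's contents directly instead
# of relying on a "*" glob, which skips names that start with a dot.
copy_all() {
    local src=$1 dst=$2
    mkdir -p -- "$dst" && cp -a -- "$src"/. "$dst"/
}
```

For example, `copy_all /home/username/Private /home/username`. With GNU `cp`, `-a` also preserves ownership, permissions, and timestamps where possible.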

Tyler Hicks (tyhicks) wrote :

On 2012-01-03 22:43:27, Jacob Henner wrote:
> [ 1488.790864] Valid eCryptfs headers not found in file header region or xattr region
> [ 1488.790867] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

These error messages are not directly related to this bug. However, they
may be a result of hitting this bug. I've opened bug 911507 to track
the fix for this issue. It is the result of a long-standing design
decision that clearly needs to go away now that so many people are
having trouble with this part of the code. It is high on my priority
list.

Paolo Bonzini did point out a still-lingering issue with the fix for
this bug in comment 154. This bug report has become very long and I'd
like to track the fix for the lingering issue in bug 842647. It has
clear instructions on how to reproduce the lingering issue and I'm
almost done with the patch for it.

Tyler

geez (geez) wrote :

Also, in addition to #168, I feel I must add that some files had become unreadable. In my case there was no severe data loss, but I would stress that people should back up the data on their ecryptfs partitions.

Cinquero (cinquero) wrote :

I think we really need to improve the QA.

First step would be to have a test case/regression test that can be run automatically.

Some of the reporters/commenters probably already have some scripts flying around. Could they please provide them here?

Are there any existing efforts that try to address such issues?

Jalsot (jalsot) wrote :

I have the same issue on an up-to-date Oneiric (even though the issue is marked as resolved on Oneiric).

Plouj (plouj) wrote :

I use this script to find broken files where the encrypted file is empty:

#!/bin/bash

# Matches decrypted files in the home directory to empty files in the
# underlying encrypted directory by inode number.

echo "saving inodes of empty files to /tmp/empty-encrypted-file-inodes"
find $HOME/ -xdev $(
    for inode in $(find $HOME/.Private/ -xdev -size 0c -exec ls -ai '{}' \;|awk '{print $1}'|tee /tmp/empty-encrypted-file-inodes); do
        echo " -inum $inode"
    done)

This might break if it constructs a command line longer than what your system allows.

Plouj (plouj) wrote :

I see this problem with "Linux durandal 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20 17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux" after accidentally filling up my harddrive.

Plouj (plouj) wrote :

Actually the script in comment #173 so here is a better version:

#!/bin/bash

# Matches decrypted files in the home directory to empty files in the
# underlying encrypted directory by inode number. Limits search to the
# CWD sub-directories.

echo "saving inodes of empty files to /tmp/empty-encrypted-inodes"
# Fake Inode 0 is used to make construction of arguments easier
find . -xdev -inum 0 $(
    for inode in $(find $HOME/.Private/ -xdev -size 0c -exec ls -ai '{}' \;|awk '{print $1}'|tee /tmp/empty-encrypted-inodes); do
        echo -n " -o -inum $inode"
    done)
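
The same inode-matching idea can be sketched without building one long `find` command line, which sidesteps the argument-length limit mentioned earlier. This is only a sketch: it assumes GNU `find` (for `-printf`), assumes that a decrypted file and its encrypted lower file share an inode number (as the scripts above rely on), and the function name `list_upper_for_empty_lower` is hypothetical. Paths containing newlines are not handled.

```shell
#!/bin/bash
# list_upper_for_empty_lower LOWER_DIR UPPER_DIR
# Prints the paths under UPPER_DIR whose inode number equals that of a
# zero-length regular file under LOWER_DIR.
list_upper_for_empty_lower() {
    local lower=$1 upper=$2 inodes
    inodes=$(mktemp) || return 1
    # Collect the inode numbers of empty lower files, one per line.
    # -H follows LOWER_DIR itself if it is a symlink.
    find -H "$lower" -xdev -type f -size 0 -printf '%i\n' > "$inodes"
    # Print "inode path" for every upper file and keep only the matches.
    find "$upper" -xdev -type f -printf '%i %p\n' |
        awk -v list="$inodes" '
            BEGIN { while ((getline ino < list) > 0) empty[ino] = 1 }
            ($1 in empty) { sub(/^[0-9]+ /, ""); print }'
    rm -f -- "$inodes"
}
```

A possible invocation is `list_upper_for_empty_lower "$HOME/.Private" "$HOME"`. Because the inode list goes through a temporary file rather than command-line arguments, the number of empty lower files does not affect the command length.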

Plouj (plouj) wrote :

Missed the "is broken" (sorry).

Ulugutz Özdemir (ulugutz) wrote :

Getting thousands of these. Please provide a fix!!!

Feb 8 20:29:19 xxxxx kernel: [ 4935.274109] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
Feb 8 20:29:19 xxxxx kernel: [ 4935.274120] Valid eCryptfs headers not found in file header region or xattr region
Fe

$ uname -a
Linux xxxx 3.0.0-15-generic #26-Ubuntu SMP Fri Jan 20 17:23:00 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

$ find $HOME/.Private/ -size 0c -exec ls '{}' \; | wc -l
0

Same as ulugutz here. Is there at least an option for the kernel module to print any useful information about the broken files?

(actually, how come I get thousands of these messages but I couldn't detect any data loss yet? Seems improbable that it shouldn't hit an important file sometime?)

Andrey Andreev (andyceo) wrote :

Still can confirm this bug.

Dell Inspiron 1525, Ubuntu 11.10 amd64 (latest update)

dmesg shows:

....
[ 77.294511] init: plymouth-stop pre-start process (1944) terminated with status 1
[ 125.756619] init: bootchart post-stop process (3866) terminated with status 141
[ 248.972426] Valid eCryptfs headers not found in file header region or xattr region
[ 248.972433] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 248.985811] Valid eCryptfs headers not found in file header region or xattr region
[ 248.985819] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
[ 285.492825] Valid eCryptfs headers not found in file header region or xattr region
[ 285.492834] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
.... and so on.

tobyS (tobias-schlitt) wrote :

Issue occurred here yesterday, leaving a Git repo with ~8 not-pushed commits broken. :(

Linux tango 3.0.0-16-generic #28-Ubuntu SMP Fri Jan 27 17:44:39 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

The script for fixing (#41) does not work at all here:

$ git st
fatal: loose object df09705cdd19da7f674558d35c300748b3a2d113 (stored in .git/objects/df/09705cdd19da7f674558d35c300748b3a2d113) is corrupt
$ ./git-remove-trailing-garbage.py df09705cdd19da7f674558d35c300748b3a2d113
error: unable to find df09705cdd19da7f674558d35c300748b3a2d113
fatal: git cat-file df09705cdd19da7f674558d35c300748b3a2d113: bad file
fatal: invalid object type ""
Traceback (most recent call last):
  File "./git-remove-trailing-garbage.py", line 36, in <module>
    with open(obj, 'rb') as f:
IOError: [Errno 5] Input/output error: '/…/.git/objects/df/09705cdd19da7f674558d35c300748b3a2d113'

Any other way to get this fixed?

I'm on precise and this has happened to me.

Any way to do fsck on ecryptfs?

derp herp (junkmail-trash) wrote :

Still happening. Skype seems to exacerbate the situation.

$ find $HOME/.Private/ -size 0c -exec ls '{}' \; | wc -l
find: `/home/xxx/.Private/ECRYPTFS_FNEK_ENCRYPTED.FWYQViTKpRilBEQrvyhYzFbx53yokrMo5MdfE-hi0ZqRBXpRjN1jF7OjKU--/ECRYPTFS_FNEK_ENCRYPTED.FXYQViTKpRilBEQrvyhYzFbx53yokrMo5Mdf6-WXCmCjMQFUqNo4qH2x0HeJB218Tn.XH.DMCOJ9MTQ-': Permission denied
3
$ uname -a
Linux paperweight 3.0.0-16-generic #28-Ubuntu SMP Fri Jan 27 17:44:39 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

Christian (bolek2000) wrote :

Linux host 3.0.0-16-generic-pae #28-Ubuntu SMP Fri Jan 27 19:24:01 UTC 2012 i686 i686 i386 GNU/Linux
New Thinkpad X220, quite fresh install, all updates, using Docking station, disk never filled up
I became aware of it because Skype (which comes up at startup) kept asking me to accept the license agreement again and again.

Then I found:
Feb 27 22:53:54 CS-TP kernel: [ 99.745589] Valid eCryptfs headers not found in file header region or xattr region
Feb 27 22:53:54 CS-TP kernel: [ 99.745602] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

Then I tried to remember what might have caused this...
After I tried to hibernate (which, as I have now read, does not work with ecryptfs?), the system could neither hibernate nor shut down anymore and always came back to the login screen. Then I had to switch it off, because I was in a hurry. Not really sure what caused the problem.
During this up and down cycle one can see different error messages.

There are 3 zero byte files, I can provide more information...

tags: removed: verification-reverted-lucid
Herton R. Krzesinski (herton) wrote :

This bug is awaiting verification that the kernel for Lucid in -proposed solves the problem (2.6.32-40.87). Please test the kernel and update this bug with the results. If the problem is solved, change the tag 'verification-needed-lucid' to 'verification-done-lucid'.

If verification is not done by one week from today, this fix will be dropped from the source code, and this bug will be closed.

See https://wiki.ubuntu.com/Testing/EnableProposed for documentation on how to enable and use -proposed. Thank you!

tags: added: verification-needed-lucid
Changed in linux (Ubuntu Lucid):
status: Fix Released → Fix Committed
assignee: John Johansen (jjohansen) → Colin King (colin-king)
Colin King (colin-king) wrote :

Tested and verified working for Lucid -proposed i386 2.6.32-40.87

tags: added: verification-done-lucid
removed: verification-needed-lucid
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 2.6.32-40.87

---------------
linux (2.6.32-40.87) lucid-proposed; urgency=low

  [Herton R. Krzesinski]

  * Release Tracking Bug
    - LP: #947375

  [ Upstream Kernel Changes ]

  * IB/mlx4: pass SMP vendor-specific attribute MADs to firmware
    - LP: #932043
  * mm/filemap_xip.c: fix race condition in xip_file_fault()
    - LP: #932043
  * NFSv4: Fix up the callers of nfs4_state_end_reclaim_reboot
    - LP: #932043
  * NFSv4: The state manager shouldn't exit on errors that were handled
    - LP: #932043
  * NFSv4: Ensure the state manager handles NFS4ERR_NO_GRACE correctly
    - LP: #932043
  * NFSv4: Handle NFS4ERR_GRACE when recovering an expired lease.
    - LP: #932043
  * NFSv4: Fix open recovery
    - LP: #932043
  * rpc client can not deal with ENOSOCK, so translate it into ENOCONN
    - LP: #932043
  * udf: Mark LVID buffer as uptodate before marking it dirty
    - LP: #932043
  * eCryptfs: Infinite loop due to overflow in ecryptfs_write()
    - LP: #932043
  * atmel_lcdfb: fix usage of CONTRAST_CTR in suspend/resume
    - LP: #932043
  * Staging: asus_oled: fix image processing
    - LP: #932043
  * Staging: android: binder: Don't call dump_stack in binder_vma_open
    - LP: #932043
  * Staging: android: binder: Fix crashes when sharing a binder file
    between processes
    - LP: #932043
  * usb: gadget: zero: fix bug in loopback autoresume handling
    - LP: #932043
  * usb: Skip PCI USB quirk handling for Netlogic XLP
    - LP: #932043
  * USB: usbserial: add new PID number (0xa951) to the ftdi driver
    - LP: #932043
  * mmc: cb710 core: Add missing spin_lock_init for irq_lock of struct
    cb710_chip
    - LP: #932043
  * net: fix sk_forward_alloc corruptions
    - LP: #932043
  * net: sock_queue_err_skb() dont mess with sk_forward_alloc
    - LP: #932043
  * Linux 2.6.32.57
    - LP: #932043
  * Ban ecryptfs over ecryptfs
    - LP: #932987
  * eCryptfs: Remove mmap from directory operations
    - LP: #400443
  * eCryptfs: Use notify_change for truncating lower inodes
    - LP: #451368
  * ecryptfs: read on a directory should return EISDIR if not supported
    - LP: #719691
  * eCryptfs: Remove extra d_delete in ecryptfs_rmdir
    - LP: #723518
  * eCryptfs: Clear i_nlink in rmdir
    - LP: #723518
  * KVM: Device assignment permission checks
    - LP: #897812
    - CVE-2011-4347
  * block: Fix io_context leak after clone with CLONE_IO
    - LP: #940743
    - CVE-2012-0879
  * block: Fix io_context leak after failure of clone with CLONE_IO
    - LP: #940743
    - CVE-2012-0879
  * eCryptfs: Handle failed metadata read in lookup
    - LP: #509180
  * drm/i915: Fix TV Out refresh rate.
    - LP: #945114
  * Linux 2.6.32.57+drm33.23
    - LP: #945114
 -- Herton Ronaldo Krzesinski <email address hidden> Mon, 05 Mar 2012 16:09:18 -0300

Changed in linux (Ubuntu Lucid):
status: Fix Committed → Fix Released
Sam Liddicott (sam-liddicott) wrote :

On Thu, Mar 22, 2012 at 2:55 PM, Launchpad Bug Tracker <
<email address hidden>> wrote:

> This bug was fixed in the package linux - 2.6.32-40.87
>
>
hurrah, and thanks to everyone involved.

Colin King (colin-king) wrote :

verified on 2.6.38-14.58 -proposed with ext2, ext3, ext4, xfs, btrfs lower file systems.

tags: added: verification-done-natty
removed: verification-reverted-natty
Adam Porter (alphapapa) wrote :

The reports seem to indicate that this isn't fixed in Oneiric. Should that still be marked "Fix Released"?

Clint Byrum (clint-fewbar) wrote :

Excerpts from Adam Porter's message of Sat Mar 31 23:16:30 UTC 2012:
> The reports seem to indicate that this isn't fixed in Oneiric. Should
> that still be marked "Fix Released"?
>

It was confirmed to be fixed in at least this context. If there is an
additional case where it is not, then that would be a new bug, please
report a new bug with 'apport-bug linux'. You will want to reference
this bug report in that one, suggesting that it may be a regression or
at the very least, has similar symptoms.

Colin King (colin-king) wrote :

@Adam, it was verified for 2.6.38-14.58, which was in proposed at the time of testing. Check that you are using this kernel before filing a new bug. Thanks.

JD Rogers (rogersjd) wrote :

Colin, I'm confused.
Oneiric currently uses 3.0.0-17 and the messages are certainly not fixed for me.
This bug is listed as "fix released" for oneiric as well as older 9.10 in the original bug report. Should a new bug be opened for oneiric and 3.0 kernels even though 11.10 is listed as affected by this bug?

Colin King (colin-king) wrote :

@JD Rogers, the relevant fix is in Oneiric, commit 3aeb86ea4cd15f728147a3bd5469a205ada8c767, so perhaps you are seeing another related issue.

JD Rogers (rogersjd) wrote :

@Colin, ok, I'll gather more info and see if I need to file a new bug. Thanks.

James Gifford (jamesgifford) wrote :

This is still happening; it's also in 12.04.

vkapas (vkapas) wrote :

The bug still is happening in Ubuntu 10.04.3.

$ uname -a
Linux dell-inspiron-1501 2.6.32-41-generic #88-Ubuntu SMP Thu Mar 29 13:10:32 UTC 2012 x86_64 GNU/Linux

$ mount -l
/dev/sda1 on / type ext4 (rw,errors=remount-ro)
proc on /proc type proc (rw,noexec,nosuid,nodev)
/dev/sda5 on /home type xfs (rw)
binfmt_misc on /proc/sys/fs/binfmt_misc type binfmt_misc (rw,noexec,nosuid,nodev)
/user/set/.Private on /home/user type ecryptfs (ecryptfs_check_dev_ruid,ecryptfs_sig=34e21478f0216127,ecryptfs_fnek_sig=681325906c7014cb,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_unlink_sigs)

In /var/log/messages:

Apr 26 22:29:39 dell-inspiron-1501 kernel: [174527.741111] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
Apr 26 22:34:39 dell-inspiron-1501 kernel: [174827.558166] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
Apr 26 22:39:39 dell-inspiron-1501 kernel: [175128.064340] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
Apr 26 22:44:39 dell-inspiron-1501 kernel: [175427.531031] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO
Apr 26 22:49:39 dell-inspiron-1501 kernel: [175727.510556] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

And these are the last entries before the computer freezes.

vkapas (vkapas) wrote :

>The bug still is happening in Ubuntu 10.04.3.
10.04.4

hanfkeks (c-weber23) wrote :

Confirming it for 12.04, did not occur for me with earlier Ubuntu releases.

Just encountered this bug in 12.04. Due to bugs in evolution, I was trying to export my folders
by saving to mbox format, and ran across another evolution bug where it wrote out gigabytes of
data to the mbox file (the folder is only a few megabytes) filling up the home file system. I now have
corrupted files in evolution and chrome and not sure where else (95 files are corrupted).

This is a dangerous bug and needs mentioning in the release notes, especially given how long it's apparently
been in the wild.

I forgot to mention that remounting or rebooting does not resolve the corruption.

Christian Mertes (cmertes) wrote :

Yep Martin, this is the reason why I moved away from ecryptfs. It's dangerous and alpha and should not be used productively by anyone. Use a LUKS encrypted partition on $HOME/Private with pam_mount. With an LVM you can even grow it, albeit not automatically.

Torsten Krah (tkrah) wrote :

Maybe someone with sufficient permission can reopen the state of lucid + add series precise to this bug to track this non fixed bug.

Tyler Hicks (tyhicks) wrote :

On 2012-04-26 19:16:25, vkapas wrote:
> The bug still is happening in Ubuntu 10.04.3.

<snip>

> Apr 26 22:29:39 dell-inspiron-1501 kernel: [174527.741111] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

Seeing this particular error message alone does not mean that you've hit
this bug. It could be seen from the result of this bug, but there are
other reasons that you may see this error message.

Please see comment #78 for a 'find' command that should show you which
file is triggering this error.

Tyler Hicks (tyhicks) wrote :

On 2012-04-29 18:49:40, hanfkeks wrote:
> Confirming it for 12.04, did not occur for me with earlier Ubuntu
> releases.

Hello - Can you please be very clear about what it is that you're
confirming? There are a number of unrelated issues mentioned in the
comments to this bug and it can be difficult for developers to
understand if this bug is actually not fixed or if there is a
slightly different bug affecting you. Thanks!

On 2012-05-04 03:23:27, Martin Faltesek wrote:
> Just encountered this bug in 12.04. Due to bugs in evolution, I was trying to export my folders
> by saving to mbox format, and ran across another evolution bug where it wrote out gigabytes of
> data to the mbox file (the folder is only a few megabytes) filling up the home file system. I now have
> corrupted files in evolution and chrome and not sure where else (95 files are corrupted).

This is a completely unrelated (but serious!) issue. Please file a new
bug for this. In the meantime, I'll try to recreate what you're seeing.

Rhadamantys (rhadamantys) wrote :

Interestingly, I haven't had that message with kernel 2.6.38-14-generic on 11.04, but now, with 12.04, I get it regularly and the kernel panics right after that. Is this a coincidence? Is there a way to check the integrity of the whole eCryptFS?

Christian Mertes (cmertes) wrote :

Tyler Hicks (tyhicks) wrote on 2012-05-04:

> This is a completely unrelated (but serious!) issue. Please file a new
> bug for this. In the meantime, I'll try to recreate what you're seeing.

Well yes and no, I reported this issue in Bug #957843
https://bugs.launchpad.net/ubuntu/+source/ecryptfs-utils/+bug/957843 which had been marked a duplicate of this bug. I agree though that this was just a lazy way to get rid of one bug.

Vadim Peretokin (vperetokin) wrote :

I was still getting this error after reinstalling 12.04 on a brand-new SSD. So this issue still spawns.

As per comment 78, I got several files with the io error. I take it these files were unreachable because programs using those files were affected.

I'm not certain why this corruption happened. Is eCryptfs not reliable?

Vadim Peretokin (vperetokin) wrote :

Deleting them seems to have worked:

vadi@gooseberry:~$ find . -type f -exec cat {} \; > /dev/null
cat: ./.config/google-chrome/.com.google.Chrome.ok97wK: Input/output error
cat: ./.config/google-chrome/.com.google.Chrome.vQ5Sr2: Input/output error
cat: ./.config/google-chrome/.com.google.Chrome.wVBbQ2: Input/output error
cat: ./.config/google-chrome/Default/Cookies-journal: Input/output error
find: `./.gvfs': Transport endpoint is not connected
cat: ./.Skype/shared.xml: Input/output error
vadi@gooseberry:~$ rm ./.Skype/shared.xml
vadi@gooseberry:~$ rm ./.config/google-chrome/.com.google.Chrome.ok97wK
vadi@gooseberry:~$ rm ./.config/google-chrome/.com.google.Chrome.vQ5Sr2
vadi@gooseberry:~$ rm ./.config/google-chrome/.com.google.Chrome.wVBbQ2
vadi@gooseberry:~$ rm ./.config/google-chrome/Default/Cookies-journal
vadi@gooseberry:~$ find . -type f -exec cat {} \; > /dev/null
find: `./.gvfs': Transport endpoint is not connected
vadi@gooseberry:~$
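
The read check shown in this transcript can be turned into a list of affected paths instead of a stream of `cat` errors. A minimal sketch; the function name `list_unreadable` is hypothetical, and any read failure (not only the eCryptfs -EIO case) causes a file to be listed:

```shell
#!/bin/bash
# list_unreadable DIR: print every regular file under DIR that cannot be
# read to the end. Each file is read with cat; find prints the path only
# when cat exits with a non-zero status.
list_unreadable() {
    find "$1" -type f ! -exec sh -c 'cat -- "$1" > /dev/null 2>&1' sh {} \; -print
}
```

Running `list_unreadable "$HOME"` prints one path per line, which can be reviewed before deciding what to delete.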

Charl P. Botha (cpbotha) wrote :

It seems like this bug (also see duplicate https://bugs.launchpad.net/ubuntu/+source/linux/+bug/529002 ) is unfortunately still very much present on 12.04.

$ uname -a
Linux cpbotha-e6410 3.2.0-24-generic #38-Ubuntu SMP Tue May 1 16:18:50 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux

$ find $HOME/.Private/ -size 0c -exec ls '{}' \; | wc -l
253

Indeed, my dropbox is stuck on indexing 247 files (this is how I noticed). When I go to one of the files in question:

$ ls -la envedit.exe.manifest
-rw-rw-r-- 1 cpbotha cpbotha 0 May 14 17:17 envedit.exe.manifest

$ cat envedit.exe.manifest
cat: envedit.exe.manifest: Input/output error

... and then in my dmesg:

[ 6554.071505] Valid eCryptfs headers not found in file header region or xattr region, inode 4882470
[ 6554.071511] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

In other words, I have at least 253 files (some of them more important than envedit.exe.manifest) that are at risk.
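
Kernel messages in this newer format name the inode, which can help locate the affected file. A rough sketch, assuming the message format shown above and that the logged number matches the inode visible under the mounted directory; the function name `inodes_from_log` is hypothetical:

```shell
#!/bin/bash
# inodes_from_log: read kernel log lines on stdin and print the distinct
# inode numbers named in "Valid eCryptfs headers not found" messages.
# Lines without a trailing ", inode N" part are ignored.
inodes_from_log() {
    sed -n 's/.*Valid eCryptfs headers not found.*, inode \([0-9][0-9]*\)[[:space:]]*$/\1/p' |
        sort -un
}
```

One way to use it is `dmesg | inodes_from_log | while read -r ino; do find "$HOME" -xdev -inum "$ino"; done`, which prints the path for each reported inode if one is found.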

Got this back after upgrading to 12.04 too. 4 of 84 encrypted files are affected.

Charl P. Botha (cpbotha) wrote :

For those of you seeing specifically the 0-length lower file resulting in IO errors in the mounted ecryptfs filesystem, please go to the report for that: bug #957843 -- remember to indicate that the bug affects you, and to subscribe to it (both will increase the bug heat).

Kai Falkenberg (wooosh) wrote :

Hi,

I can confirm the bug's presence in 12.04:

$ uname -a
Linux ninux 3.0.0-19-generic #33-Ubuntu SMP Thu Apr 19 19:05:57 UTC 2012 i686 i686 i386 GNU/Linux

$ find $HOME/.Private/ -size 0c -exec ls '{}' \; | wc -l
4

[ 444.916666] Valid eCryptfs headers not found in file header region or xattr region, inode 2890741
[ 444.916670] Either the lower file is not in a valid eCryptfs format, or the key could not be retrieved. Plaintext passthrough mode is not enabled; returning -EIO

and
$ cat $HOME/.ICEauthority
cat: .ICEauthority: Input/Output error

Can't login to Xfce with this user.. :(

Kai Falkenberg (wooosh) wrote :

Sorry, just read #213. Please ignore my previous comment. :-)

ilUomo (menchini) wrote :

After trying the 'delete zero-length files in $HOME/.Private' approach, I found that deleting $HOME/.Skype and restarting Skype fixed it for me! (Oneiric on a Lenovo W520)

J (jonathan-vola) wrote :

I believe I'm seeing this on 12.04 with ecryptfs home folder: I have no 0-length errors (In fact I have no errors at all concerning ecryptfs in my logs) but files are being corrupted regularly.

Specifically: Downloading torrents (Which are supposed to be verified while downloading) then manually verifying the data results in the client finding a corrupted piece then redownloading it and saying it's 100% again. Verifying manually again makes it just do that same thing over.

Using the file then throws errors about corrupted source files - in one case I had a zip file (Downloaded through firefox - not torrent) that wouldn't even open.

None of these problems occur if I download directly to a non-encrypted partition.

Pavel Verba (pavel.s.verba) wrote :

Is there any chance that this issue might be negatively and severely affecting startup time of a system?

Tim Gardner (timg-tpi) wrote :

Maverick is no longer supported.

Changed in linux (Ubuntu Maverick):
status: Incomplete → Won't Fix
Changed in linux (Ubuntu Precise):
status: New → Fix Released
Changed in linux (Ubuntu Quantal):
status: Won't Fix → Fix Released
Changed in linux (Ubuntu):
status: Won't Fix → Fix Released