support "reverse" mode

Bug #376580 reported by Witold Baryluk on 2009-05-14
This bug affects 6 people
Affects: eCryptfs
Importance: Wishlist
Assigned to: Unassigned

Bug Description

Normally eCryptfs works by encrypting all writes made to the upper file system (highfs) and storing the encrypted data in the lower file system (lowfs). One very nice feature is that the encrypted files in lowfs can be safely synchronized with a backup server, for example using rsync, because modification times and other file properties remain accessible.

Example:
/home/baryluk/.Private (encrypted) ------ecryptfs-mount-----> /home/baryluk/Private (plain)
    |
    \------> rsync to server

I think it would be a very good idea to somehow support a reversed mode of operation. We have some directory with plain-text files (e.g. a whole file system) and want to securely back it up to an unsafe remote location. I would like to mount it on some directory and see virtual files there which are encrypted and can be securely backed up.

Example:
/home/baryluk/ (plain) ------ecryptfs-reverse-mount-----> /home/baryluk/encryptedHome (encrypted) ------> rsync to server

What are the advantages of this?

A whole file system can be encrypted this way.
Most of the time file contents do not actually need to be encrypted. Most backup tools look at metadata first, so only the size of the encrypted file needs to be computed correctly.
There are no performance issues from encrypting all your data.
The format of the encrypted files is compatible with eCryptfs, so backups can be restored easily given the full passphrase.

What do you think about this?

Witold Baryluk (baryluk) wrote :

The only problem I can see with this is if there is some additional data in the header of the encrypted file which can be random (for example, part of the encryption key). Then it is impossible to safely encrypt the data to exactly the same form every time.

There are some ways around this, but none is perfect:
  1. reading the headers of the encrypted files in the remote location,
  2. storing the headers in some other place,
  3. using some predefined value for all files,
  4. using a simple scheme based on the content of the file, so that reconstructing this data in a deterministic way is easy.

So the question is whether any additional random data is needed for encryption. I think it is, because simply running:
touch ~/Private/{a,b}; md5sum ~/.Private/* (different md5sums)
leads to the conclusion that encrypted files are keyed with some additional random information besides my key.
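
The observation above can be reproduced with a small Python sketch. This is not eCryptfs code: `toy_encrypt`, `encrypt_file`, the 16-byte key size, the header layout and the SHA-256 keystream are all assumptions made for illustration, standing in for the real cipher and header format. It only shows why a fresh random per-file key makes two encryptions of identical (here empty) files differ, and why a fixed key makes them reproducible.

```python
import hashlib
import os
from typing import Optional


def toy_encrypt(key: bytes, data: bytes) -> bytes:
    """Toy cipher for illustration only: XOR with a SHA-256 counter keystream."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(d ^ p for d, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


def encrypt_file(mount_key: bytes, plaintext: bytes,
                 fek: Optional[bytes] = None) -> bytes:
    """Return header + body; the (hypothetical) header is the wrapped per-file key."""
    if fek is None:
        fek = os.urandom(16)  # fresh random per-file key on every call
    header = toy_encrypt(mount_key, fek)
    return header + toy_encrypt(fek, plaintext)


mount_key = b"hypothetical-mount-key"

# Two empty files, as with `touch a b`: random keys give different headers.
random_a = encrypt_file(mount_key, b"")
random_b = encrypt_file(mount_key, b"")

# Reusing one predefined key (option 3 above) makes the output reproducible.
fixed_fek = bytes(16)
fixed_a = encrypt_file(mount_key, b"", fek=fixed_fek)
fixed_b = encrypt_file(mount_key, b"", fek=fixed_fek)

print(random_a != random_b, fixed_a == fixed_b)
```

In this toy model the only difference between the two empty files is the header, which matches the differing md5sums seen for freshly touched files.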

My preference is for solution number 3.

There is also a problem when data in lowfs changes while data is being read from highfs, but standard eCryptfs has similar problems, and actually any backup tool has this problem. It can be resolved using standard snapshotting techniques (particularly easy on btrfs or ZFS).

I forgot to add that reverse mode should only be mounted with the read-only flag.

Still waiting for some comments. :)

Tyler Hicks (tyhicks) wrote :

Hi Witold - I think this is a great idea. It has been discussed before and I agree with you that it would be useful. However, it isn't high on my priority list right now. I'd be willing to offer assistance to any developer that would like to work on this feature.

Tyler Hicks (tyhicks) wrote :

Can you elaborate on which random data in the header you are talking about? Each file has a unique file encryption key (FEK) that is encrypted with your mount key and stored in the header. As long as you have your mount key, you will be able to decrypt the encrypted FEK and then decrypt the file contents. Is that the random data you were referring to?

On Thu, 2009-05-14 at 17:54 +0000, Tyler Hicks wrote:
> Can you elaborate on which random data in the header you are talking
> about? Each file has a unique file encryption key (FEK) that is
> encrypted with your mount key and stored in the header. As long as you
> have your mount key, you will be able to decrypt the encrypted FEK and
> then decrypt the file contents. Is that the random data you were
> referring to?
>

Yes, exactly this random data.

Generally, if this is used for backup, we want lowfs to be encrypted
to predictable (but still secret) content in highfs, so that only
incremental data has to be sent over the network and stored on the
server. If the FEK changes on every reboot or remount of such a reverse
eCryptfs, it will be impossible to easily check whether files really
changed (we can check metadata first, but rsync, for example, still
checks the file content when metadata disagrees, so that only deltas
are sent, e.g. when only a single block was changed or data was
appended).

There are some ways to resolve this, but they involve mounting the
remote files (e.g. using sshfs) or reading their headers (so that the
original FEK can be recovered and reused). This isn't possible on dumb
servers and would be less efficient than the way the rsync protocol
normally works, especially on slow links. And if we consider mounting
the remote files, we don't actually need reverse mode eCryptfs at all.
Just rsync plain files there!

But reverse mode would still be useful even if a remote sshfs can be
mounted with eCryptfs for this purpose. Example: a full system backup to
DVD (OK, there are probably better ways, like tar plus GnuPG encryption).
When making a 4 GB backup we don't want to first create an eCryptfs
mount point, copy the 4 GB there, and then copy the encrypted lowfs
files to DVD. I would like to do this in a single pass.

BTW, I tried to mount eCryptfs on top of sshfs, but failed.

--
Witold Baryluk

Witold Baryluk (baryluk) wrote :

On Thu, 2009-05-14 at 18:17 +0000, Witold Baryluk wrote:
> On Thu, 2009-05-14 at 17:54 +0000, Tyler Hicks wrote:
> > Can you elaborate on which random data in the header you are talking
> > about? Each file has a unique file encryption key (FEK) that is
> > encrypted with your mount key and stored in the header. As long as you
> > have your mount key, you will be able to decrypt the encrypted FEK and
> > then decrypt the file contents. Is that the random data you were
> > referring to?
> >
>
> Yes, exactly this random data.

As an aside, I just want to add that the FNEK (file name encryption
key) will not be problematic, because, as I just checked, eCryptfs
encrypts file names in a deterministic way (which is particularly
good for detecting name conflicts) based only on the key and the real
file name. (For example, the path of the file relative to the mount
point isn't used; in the FUSE-based encfs file system this is called
IV chaining.)

--
Witold Baryluk

Witold Baryluk (baryluk) wrote :

On Thu, 2009-05-14 at 18:17 +0000, Witold Baryluk wrote:
> On Thu, 2009-05-14 at 17:54 +0000, Tyler Hicks wrote:
> > Can you elaborate on which random data in the header you are talking
> > about? Each file has a unique file encryption key (FEK) that is
> > encrypted with your mount key and stored in the header. As long as you
> > have your mount key, you will be able to decrypt the encrypted FEK and
> > then decrypt the file contents. Is that the random data you were
> > referring to?
> >
>
> Yes, exactly this random data.
>
> Generally, if this is used for backup, we want lowfs to be encrypted
> to predictable (but still secret) content in highfs, so that only
> incremental data has to be sent over the network and stored on the
> server. If the FEK changes on every reboot or remount of such a reverse
> eCryptfs, it will be impossible to easily check whether files really
> changed (we can check metadata first, but rsync, for example, still
> checks the file content when metadata disagrees, so that only deltas
> are sent, e.g. when only a single block was changed or data was
> appended).

Some new thoughts...

OK, anyway, I see this will not work so easily. It will work for an
append or a single block change, but it will fail when data is added or
removed inside a file. All data beyond the removed or added blocks will
need to be sent over the network. Not as efficient as the original
rsync, but secure, and it should just work. Anyway, even when only part
of a file changed and we need to send the whole file, this isn't a very
big problem. Only a small percentage of the files on a big file system
actually change, and when they change they mostly change completely.
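
The three cases can be compared with a toy model in Python. Everything here is an assumption for illustration, not eCryptfs's actual scheme: the file is split into 4096-byte extents (the 4k size mentioned later in this thread), and each extent is encrypted independently by XOR with a SHAKE-256 keystream derived from the key and the extent number, which stands in for a per-extent IV. `encrypt_extents` and `changed_extents` are hypothetical helper names.

```python
import hashlib
from typing import List

EXTENT_SIZE = 4096  # assumed extent size


def encrypt_extents(fek: bytes, plaintext: bytes) -> List[bytes]:
    """Encrypt each extent independently (toy keystream, not real CBC)."""
    extents = []
    for start in range(0, len(plaintext), EXTENT_SIZE):
        chunk = plaintext[start:start + EXTENT_SIZE]
        extent_no = (start // EXTENT_SIZE).to_bytes(8, "big")
        pad = hashlib.shake_256(fek + extent_no).digest(len(chunk))
        extents.append(bytes(c ^ p for c, p in zip(chunk, pad)))
    return extents


def changed_extents(old: List[bytes], new: List[bytes]) -> int:
    """Count extents that differ, plus extents present on only one side."""
    differing = sum(1 for a, b in zip(old, new) if a != b)
    return differing + abs(len(old) - len(new))


fek = bytes(16)                # fixed toy key so results are reproducible
base = bytes(range(256)) * 64  # 16384 bytes, i.e. 4 extents
before = encrypt_extents(fek, base)

patched = base[:5000] + b"\xff" + base[5001:]  # overwrite one byte in place
appended = base + b"x" * 100                   # append data at the end
inserted = base[:10] + b"\x00" + base[10:]     # insert one byte near the start

in_place = changed_extents(before, encrypt_extents(fek, patched))
after_append = changed_extents(before, encrypt_extents(fek, appended))
after_insert = changed_extents(before, encrypt_extents(fek, inserted))
print(in_place, after_append, after_insert)
```

Under these assumptions the in-place change and the append each touch a single extent, while the one-byte insertion shifts all later plaintext and so changes every extent from the insertion point onwards.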

This is exactly the same situation as when synchronising a normal
eCryptfs lowfs, so "reverse" mode can still be useful.

There is a tool which mitigates this problem: rsyncrypto. It keeps a
local encrypted copy of the files and synchronises them efficiently
(only what was changed, added or removed). It does this with an
interesting mode of encryption: a modified CBC mode.

http://rsyncrypto.lingnu.com/index.php/Algorithm

Unfortunately I think CBC mode (particularly a modified version of it)
isn't very secure.

BTW, in which mode does eCryptfs operate? CTR?

OK, I see it is more complicated ;)

Going to read
ecryptfs.sourceforge.net/ecryptfs_design_doc_v0_1.pdf

--
Witold Baryluk

Witold Baryluk (baryluk) wrote :

On Sat, 2009-05-16 at 18:50 +0000, Witold Baryluk wrote:
> On Thu, 2009-05-14 at 18:17 +0000, Witold Baryluk wrote:
> > On Thu, 2009-05-14 at 17:54 +0000, Tyler Hicks wrote:
> > > Can you elaborate on which random data in the header you are talking
> > > about? Each file has a unique file encryption key (FEK) that is
> > > encrypted with your mount key and stored in the header. As long as you
> > > have your mount key, you will be able to decrypt the encrypted FEK and
> > > then decrypt the file contents. Is that the random data you were
> > > referring to?
> > >
> >
> > Yes, exactly this random data.
> >

> Some new thoughts...

I was thinking more about FEKs and FNEKs.

I think I found a general and flexible solution:

1. mount the plain filesystem in reverse mode
2. start a user space daemon which will be asked for the FEKs of these files.
   - it can open the already encrypted (possibly remote) file, read the header
     and use the same FEKs (the files can be local, on sshfs,
     anywhere, or downloaded using rsync or anything else)
   - or it can maintain a local copy of the headers (which is safer, because
     a remote server could change the headers to something easier to crack).
      - in standard files (but containing only the header)
      - or in a DB (sqlite or Berkeley DB)
   - or provide the same FEK or some deterministic function of the available
     information (like file name + private key)
3. kernel space will ask for these FEKs and use them for encryption,
   preferably in ECB or CTR mode, so rsync will be quite happy.
4. additionally, the kernel will not ask for these keys if the file isn't
   going to be opened (rsync, for example, first checks the modification time).
   Maybe only the FNEK will need to be asked for, so we know what filenames
   to give to opendir()

The last point can be optimised to allow asking for multiple keys at once
(reducing context switches). This also keeps the kernel part simple.
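
The "local copy in a DB" variant of step 2 could look roughly like the following Python sketch. It is only an illustration under stated assumptions: `FekStore`, `get_fek` and `get_feks` are hypothetical names, the table layout is made up, and the store keeps raw 16-byte keys for brevity, whereas a real store would hold the header with the FEK wrapped by the mount key. The kernel/daemon communication is not modelled at all.

```python
import os
import sqlite3
from typing import Dict, Iterable


class FekStore:
    """Hypothetical per-file key store, so a remount reuses the same FEKs."""

    def __init__(self, db_path: str = ":memory:") -> None:
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS feks ("
            "path TEXT PRIMARY KEY, fek BLOB NOT NULL)"
        )

    def get_fek(self, path: str) -> bytes:
        """Return the stored key for path, creating a random one on first use."""
        row = self.db.execute(
            "SELECT fek FROM feks WHERE path = ?", (path,)
        ).fetchone()
        if row is not None:
            return bytes(row[0])
        fek = os.urandom(16)  # generated once, then reused on later lookups
        self.db.execute(
            "INSERT INTO feks (path, fek) VALUES (?, ?)", (path, fek)
        )
        self.db.commit()
        return fek

    def get_feks(self, paths: Iterable[str]) -> Dict[str, bytes]:
        """Batch lookup, mirroring the 'ask for multiple keys' optimisation."""
        return {path: self.get_fek(path) for path in paths}


store = FekStore()
first = store.get_fek("docs/report.txt")
second = store.get_fek("docs/report.txt")  # same key as the first lookup
batch = store.get_feks(["docs/report.txt", "docs/notes.txt"])
```

Passing a file path instead of the default `":memory:"` would make the keys survive a restart, which is the property needed for the encrypted view to stay stable across remounts.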

Part of this infrastructure is already present in ecryptfsd, right?

--
Witold Baryluk

Tyler Hicks (tyhicks) wrote :

Witold Baryluk wrote:
> On Sat, 2009-05-16 at 18:50 +0000, Witold Baryluk wrote:
>> On Thu, 2009-05-14 at 18:17 +0000, Witold Baryluk wrote:
>>> On Thu, 2009-05-14 at 17:54 +0000, Tyler Hicks wrote:
>>>> Can you elaborate on which random data in the header you are talking
>>>> about? Each file has a unique file encryption key (FEK) that is
>>>> encrypted with your mount key and stored in the header. As long as you
>>>> have your mount key, you will be able to decrypt the encrypted FEK and
>>>> then decrypt the file contents. Is that the random data you were
>>>> referring to?
>>>>
>>> Yes, exactly this random data.
>>>
>
>> Some new thoughts...
>
> I was thinking more about FEKs and FNEKs.
>
> I think I found a general and flexible solution:
>
> 1. mount the plain filesystem in reverse mode
> 2. start a user space daemon which will be asked for the FEKs of these files.
> - it can open the already encrypted (possibly remote) file, read the header
> and use the same FEKs (the files can be local, on sshfs,
> anywhere, or downloaded using rsync or anything else)

This is a possibility, but sounds a little too over-engineered. I like
your next idea much better.

> - or it can maintain a local copy of the headers (which is safer, because
> a remote server could change the headers to something easier to crack).
> - in standard files (but containing only the header)
> - or in a DB (sqlite or Berkeley DB)

Better yet, store the metadata in an xattr along with the plaintext file
itself. We already support storing the metadata in xattrs. I had been
thinking it wasn't a very useful feature and was wanting to rip that
code out, but this is a valid use case for it.

> - or provide the same FEK or some deterministic function of the available
> information (like file name + private key)

Deterministic FEK generation is not a good idea. Using the same FEK is
doable, but I still like idea #2 much more.

> 3. kernel space will ask for these FEKs and use them for encryption,
> preferably in ECB or CTR mode, so rsync will be quite happy.

Kernel space can read the xattrs itself, so it won't have to switch out
to user space. eCryptfs only supports CBC mode, but each 4k extent is
encrypted with a unique IV, so rsync should still be pretty happy.

> 4. additionally, the kernel will not ask for these keys if the file isn't
> going to be opened (rsync, for example, first checks the modification time).
> Maybe only the FNEK will need to be asked for, so we know what filenames
> to give to opendir()
>
> The last point can be optimised to allow asking for multiple keys at once
> (reducing context switches). This also keeps the kernel part simple.
>
> Part of this infrastructure is already present in ecryptfsd, right?
>

If we store the metadata in the xattr, there probably won't be a lot of
new code that will need to be written. We should be able to use many
functions that already exist and minimal user space changes will be needed.

Witold Baryluk (baryluk) wrote :

Yes, xattrs are a really good idea. I had completely forgotten about this option. Storing keys in xattrs will be much easier, will solve many problems, and won't need any additional daemon.

Changed in ecryptfs:
importance: Undecided → Wishlist
Changed in ecryptfs:
status: New → Confirmed
Witold Baryluk (baryluk) wrote :

Hi,

I discovered that encfs offers exactly this mode of operation; it is even called reverse mode.
http://sharp.hall.name/2008/12/encrypted-offsite-backup-with-encfs-amazon-s3-and-s3cmd/

Further details are in the manual. Unfortunately there are some drawbacks in their implementation.

Unfortunately I currently don't have enough time to implement this in eCryptfs (full-time job now). Maybe in October. Alternatively, maybe somebody else has the resources and knowledge to do this.

Regards.

Witold Baryluk (baryluk) wrote :

Hi,

I discovered that encfs offers exactly this mode of operation; it is even called reverse mode.
http://madduck.net/blog/2009.07.11:mirroring-data-remotely-on-untrusted-systems/

Further details are in the manual. Unfortunately there are some drawbacks in their implementation.

Unfortunately I currently don't have enough time to implement this in eCryptfs (full-time job now). Maybe in October. Alternatively, maybe somebody else has the resources and knowledge to do this.

Regards.

Tyler Hicks (tyhicks) wrote :

There's been no activity on this bug for years. Marking it as Won't Fix.

Changed in ecryptfs:
status: Confirmed → Won't Fix
Glenn Washburn (crass) on 2013-05-11
information type: Public → Public Security
Glenn Washburn (crass) on 2013-05-11
information type: Public Security → Public

This feature is especially useful when using block device encryption. After mounting, files are seen in plaintext, and a reverse mode would allow us to sync files, encrypted on the fly, to services like Dropbox.
