E-book viewer modifies the epub file upon closing

Bug #858957 reported by Li Fanxi on 2011-09-25
This bug affects 8 people
Affects: calibre
Importance: Undecided
Assigned to: Unassigned

Bug Description

calibre version: 0.8.20
OS: Linux (Debian)

How to reproduce:
- Open E-book viewer
- Check the MD5SUM of an epub file
- Open the epub file in the E-book viewer
- Close the E-book viewer
- Check the MD5SUM of the epub file again
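The steps above can be scripted. The following is a minimal sketch, not part of the original report: `md5_of` and `was_modified` are hypothetical helper names, and the `action` callback stands in for opening and closing the book in the viewer by hand.

```python
import hashlib


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def was_modified(path, action):
    """Run `action()` and report whether the file's MD5 changed across it."""
    before = md5_of(path)
    action()  # e.g. wait while the user opens and closes the book
    return md5_of(path) != before
```

One possible use is `was_modified("book.epub", lambda: input("Open and close the book, then press Enter: "))`, where `book.epub` is a placeholder path.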

Expected:
The MD5SUM value does not change. E-book viewer should not modify the e-book files.

Actual:
The MD5SUM value changes; the e-book file has been modified. The impact is that if the ebook files are managed by a synchronization service such as Dropbox, then each time an epub file is read, it has to be uploaded and synchronized to all other locations.

The ebook viewer stores bookmarks in the epub file. This is by design. You
can turn it off by unsetting the "Remember last page when closing" option in
the viewer preferences.

 status wontfix

Changed in calibre:
status: New → Won't Fix
Li Fanxi (lifanxi) wrote :

So there will be no way to have the "Remember last page when closing" function along with the ability to not change the epub file when closing?

I am wondering why mobi or txt files don't have a similar problem. Why can they remember the last reading position without changing the file? I didn't dive into the code, but I guess there must be a place outside the ebook files to record this information. Why not use the same mechanism for epub files?

Kovid Goyal (kovid) wrote :

Because there is no way to put arbitrary information into MOBI and txt files.

Li Fanxi (lifanxi) wrote :

Yes.

My question is why not use the same mechanism for epub files: store the bookmark information outside the epub files, as is already done for mobi or txt files. Then we would no longer need to modify the epub files to keep the bookmark.

Kovid Goyal (kovid) wrote :

Because storing bookmarks inside epub files is convenient. It allows you to
transmit the epub file and preserve the bookmarks.

Dan Muresan (danmbox) wrote :

This is actually a pretty surprising default. What do you do, silently ignore errors on read-only mounts? Perhaps you should warn the user the first time you modify an epub (once per installation, not per book).

avdd (avdd) wrote :

> This is by design

The design is wrong.

Kovid Goyal (kovid) wrote :

ROFL

I am not sure why you think there is a problem here, calibre by default
stores an extra copy of the bookmark data inside the ebook file if the
ebook format supports that, and there is an option to disable that if
you value unmodified files more than you value being able to store your
reading location and bookmarks together with the ebook.

Since you apparently feel so strongly about this that you revived a
6-year-old bug just to tell us the design is somehow wrong, I can only
assume you know of a better design.

Please tell us what the better design is, other than simply removing this
*optional* feature.

Imagine that someone reads a detective novel in which the identity of the murderer is revealed on the last page. He then closes the ebook and sends it to a friend, recommending it. The friend then opens the book and finds out who the murderer is.

ebook-viewer should by default, as the name suggests, be used only to view ebooks, not to modify them; the checksum should be unchanged. If ebook-viewer wants to modify an ebook, that should be explicitly approved by the user (perhaps once per ebook or per installation, as danmbox suggests). Or the program name should be changed to something less misleading.

Eli Schwartz (eschwartz) wrote :

So your rationale for disabling this feature by default is that when
saving the reading position of your book in order that you can continue
where you left off, you may accidentally spoil the book for your
mysterious friend who finds himself continuing where you left off?

That's a pretty weak objection. It is a fringe case to begin with, and
made yet more unlikely by the fact that it is really easy to just... go
back to the start of the book, unless an even more unlikely coincidence
happens and the exact saved location has an actual spoiler which is
unlikely in my experience.

Claiming that an "ebook-viewer" has a misleading name if it also
modifies the book, which it doesn't as it merely stores additional
information inside the book like a number of other document formats, is
just silly. There is no rule of etiquette or expectation that a "viewer"
is fundamentally incompatible with modifying the file. You are making an
arbitrary differentiation that exists only in your own mind and was
devised solely for the purpose of defending your claim right here, right
now. But if you truly believed that, then users should not even be
allowed to *opt in* to such an "untruthful" program behavior.

In my opinion, modifying a user’s files without his consent (either explicit or implicit) is bad manners and it will break things in ways that are hard to predict. It may create unnecessary copies of ebooks if they are distributed via IPFS, leading to fragmentation. It will create unnecessary entries if ebooks are distributed through the Dat protocol. If ebooks are signed, it will break signatures. It will add bloat to Dropbox/rsync/whatever synchronization, as mentioned in the original report. In my case, I’ve noticed that a MOBI file in one of my repositories had been regenerated without the source file being modified (because EPUB had been used as an intermediary), which had been confusing.

The above issues may not always be catastrophic and may not be frequent, but many people will expect that a viewer by default doesn’t modify their files and will be surprised by different behaviour. Actually, it’s hard for me to find an example of another interactive viewer or even an editor that acts similarly.
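To illustrate why checksums, signatures and content-addressed storage are all affected, here is a minimal sketch. It assumes only that an EPUB is a ZIP container; the entry name `META-INF/example_bookmarks.txt` is hypothetical and is not claimed to be the name calibre uses, and the archives built here are plain ZIPs rather than valid EPUBs.

```python
import hashlib
import io
import zipfile

FIXED_TIME = (2011, 9, 25, 0, 0, 0)  # fixed timestamp keeps archive bytes reproducible


def make_archive(entries):
    """Build an in-memory ZIP from a {name: bytes} mapping and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name, data in entries.items():
            zf.writestr(zipfile.ZipInfo(name, date_time=FIXED_TIME), data)
    return buf.getvalue()


def add_entry(archive_bytes, name, data):
    """Return a copy of the archive with one extra entry appended."""
    buf = io.BytesIO(archive_bytes)
    with zipfile.ZipFile(buf, "a") as zf:
        zf.writestr(zipfile.ZipInfo(name, date_time=FIXED_TIME), data)
    return buf.getvalue()


def read_entry(archive_bytes, name):
    """Return the uncompressed bytes of one entry in the archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        return zf.read(name)


original = make_archive({"chapter1.xhtml": b"<p>Hello</p>"})
# Hypothetical bookmark entry, added the way a viewer might on close.
modified = add_entry(original, "META-INF/example_bookmarks.txt", b"position=42")

# The book text is byte-for-byte the same inside the container...
same_text = read_entry(original, "chapter1.xhtml") == read_entry(modified, "chapter1.xhtml")
# ...but the container file is different, so any whole-file digest differs.
same_file = hashlib.sha256(original).digest() == hashlib.sha256(modified).digest()
```

In this sketch `same_text` is true while `same_file` is false, which is the distinction both sides of the thread are arguing over: the content is intact, but anything keyed on the file's bytes sees a new file.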

Paul C. Bryan (pbryan) wrote :

FWIW, I support the feature request that Calibre not write to an ebook by default (i.e. default "safe"). This automatic writing feature adversely affected me, modifying most of my ebook collection without my knowledge. Not what I was expecting.

Eli Schwartz (eschwartz) wrote :

> It may create unnecessary copies of ebooks if they are
> distributed via IPFS, leading to fragmentation. It will create
> unnecessary entries if ebooks are distributed through the Dat protocol. If
> ebooks are signed, it will break signatures. It will add bloat to
> Dropbox/rsync/whatever synchronization, as mentioned in the original
> report.

These are all things that cause no harm and the user having noticed them
can then check what changed and disable the feature. Essentially, you
are arguing that IPFS/Dat distributors are a more common case that
should be catered to, than users who want to continue reading their
calibre books on another device.

> In my case, I’ve noticed that a MOBI file in one of my
> repositories had been regenerated without the source file being modified
> (because EPUB had been used as an intermediary), which had been
> confusing.

I have no idea what this means, are you saying that you did a
MOBI-to-MOBI conversion or something? In what way does this relate to
what was mentioned here -- MOBIs cannot be edited the way EPUB can and
they therefore don't have these bookmarks anyway.

> The above issues may not always be catastrophic and may not be frequent,
> but many people will expect that a viewer by default doesn’t modify
> their files and will be surprised by different behaviour. Actually, it’s
> hard for me to find an example of another interactive viewer or even an
> editor that acts similarly.

Unfortunately, there is no standard for ebook annotations. If you can
convince the IDPF to be useful for once and publish one, calibre will be
delighted to migrate to that instead.

As for other viewers, PDF also does embedded annotations/bookmarks.

Eli Schwartz (eschwartz) wrote :

> [...] modifying most of my ebook collection without my knowledge.

This is scare-tactic language used to distract attention away from the
fact that your books are not, in fact, modified. All that happened was
that some additional metadata was bolted onto the side.

If you are worried that calibre opening the file in write mode will
somehow corrupt your files beyond all repair, there is no earthly reason
to think calibre will do that unless you have pre-existing issues.

Claiming that file modification adversely affects file synchronization
is at least an intelligible protest, but I have no idea what your issue
is, largely because you haven't actually mentioned it.

Paul C. Bryan (pbryan) wrote :

> This is scare-tactic language used to distract attention away from the
> fact that your books are not, in fact, modified. All that happened was
> that some additional metadata was bolted onto the side.

It's a fact. It altered my ebook files, and it did so without my knowledge or consent.

> I have no idea what this means, are you saying that you did a
> MOBI-to-MOBI conversion or something? In what way does this relate to
> what was mentioned here -- MOBIs cannot be edited the way EPUB can and
> they therefore don't have these bookmarks anyway.

I have a Markdown file and a Makefile which is used to generate other formats, in particular:

• EPUB (using Pandoc);
• MOBI (from the generated EPUB, using ebook-convert).

If I open the EPUB using ebook-viewer’s default settings and run “make” again, the MOBI file is regenerated.
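The rebuild follows from how make decides a target is out of date: a target is remade when it is missing or older than one of its prerequisites. A minimal sketch of that timestamp rule, with `needs_rebuild` as a hypothetical helper name (this is a simplification for one target and one prerequisite, not make's actual implementation):

```python
import os


def needs_rebuild(target, prerequisite):
    """Mimic make's basic rule for a single prerequisite.

    The target is out of date if it does not exist, or if the
    prerequisite's modification time is newer than the target's.
    """
    if not os.path.exists(target):
        return True
    return os.stat(prerequisite).st_mtime_ns > os.stat(target).st_mtime_ns
```

Under this rule, any write to the EPUB after the MOBI was generated, including one made by a viewer on close, bumps the EPUB's modification time and makes the MOBI look stale.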

> As for other viewers, PDF also does embedded annotations/bookmarks.

It may, but I doubt that PDF viewers save annotations/bookmarks without getting the user’s approval.

>> As for other viewers, PDF also does embedded annotations/bookmarks.

> It may, but I doubt that PDF viewers save annotations/bookmarks without getting the user’s approval.

They certainly don't modify files without saving/prompting. Plus, annotations and bookmarks don't happen automatically. A user has to do *something* to add a bookmark or annotation. Okular (most often used with KDE) will save annotations and bookmarks without prompting but it never modifies the original files.

ebook-viewer on the other hand modifies a file by just opening it and closing it.

I found this unexpected (to me) behaviour doing sanity checks on my file system. "WTH are these files changed? I've only viewed them…"

I can see how it's a useful feature but I think it really should be opt-in. Maybe there could be a one-time prompt when a book is first opened.
