Viewer displays DjVu text layer instead of image layer

Bug #1586827 reported by santropedro on 2016-05-29
This bug affects 1 person
Affects: calibre
Importance: Undecided
Assigned to: Unassigned

Bug Description

I have calibre 2.57.1 [64-bit] on Windows 10.

When trying to view ebooks in DjVu format WITH THE CALIBRE VIEWER, I encounter a problem that I described in a question on Ebooks Stack Exchange. Please read it: it describes my problem with images, and it even got a response from a former calibre developer (he says he wrote the DjVu viewer technology for calibre; he has a high reputation on Stack Exchange, so it seems true, and his username gives Google results that are consistent with that). I am linking the question because it sums up everything I need to say. Please don't skip the link.

http://ebooks.stackexchange.com/questions/6632/calibre-viewer-displays-books-wrong

The problem is that the viewer is displaying the hidden text layer used for selection and copying. This makes the book impossible to read. I want to see the scanned book, but it shows me the text layer, which I'm not interested in.

How to reproduce the problem: just open any DjVu ebook with the calibre viewer, from Windows Explorer or from calibre itself.

Also, it crashes when trying to load PDFs. The developer who answered the question said it could be because it tries to find the hidden text layer and there isn't one for that book.

How do I turn on image viewing instead of the hidden text layer view?

santropedro (santropedro) wrote :

I am willing to make experiments, so please contact me. I want to help. You are awesome.

Eli Schwartz (eschwartz) wrote :

calibre's E-Book Viewer works by internally converting an ebook to EPUB before displaying the result. Formats which have both text and image layers have never been handled well -- not just DJVU but PDF also.

To be honest, I think you are better off telling calibre to NOT open them in calibre's internal Viewer, but to open them using the OS default instead (whatever DJVU reader you have installed).

See Preferences ==> Behavior ==> Use internal viewer for
DJVU and PDF are disabled by default.

santropedro (santropedro) wrote :

Look, I'm amazed that this is not a primary concern. Calibre CLAIMS ITSELF to be "a solution for all your ebook needs". A working viewer is the absolute minimum one can ask of a program, and I'm asking about DJVU, the second most used format in the world for ebooks.

Thanks for your help. I think that this is not a "feature request"; this is a request for the program to do something it's supposed to do, and not doing it is hugely disappointing.

Probably I'm missing some reason why this is not a HUGE problem.

Eli Schwartz (eschwartz) wrote :

Well, don't mind me -- I am just a random user posting a suggestion. :)
(And I am a happy inmate of the Amazon ecosystem, too. I am not your go-to guy for being considerate of other formats, even EPUB...)

It is up to Kovid to decide what to do with your request.
(He is currently traveling, see: http://www.mobileread.com/forums/showthread.php?t=274572 so he should reply ~Friday.)

...

I do find your claim that DJVU is the second-most-used format to be somewhat... ambitious... but that is neither here nor there.

santropedro (santropedro) wrote :

Sorry Eli, that claim about DJVU books was a fabrication, or a lie. In my collection, I have 300 PDF and 90 DJVU, and the rest of the formats have very few books. But my collection is very specific: science books, especially maths, mostly downloaded from the same source (or 2 specific sources), so it may be biased. Still, DJVU is a major format when you download books (not paid) from the internet, less than PDF but still really popular. I hope they put attention into this because it's important. It's not a dumb bug; it's ruining the product.

Kovid Goyal (kovid) wrote :

calibre deals with *text*; the djvu files you refer to are page scans, not text. Just because something calls itself an ebook does not make it an ebook. A collection of page scans is not an ebook. If you have a djvu file with actual text in it, calibre will handle it fine.

Changed in calibre:
status: New → Invalid

santropedro (santropedro) wrote :

Sorry for being rude and attacking your program. I don't have any intention whatsoever of making anyone feel bad, and I seriously appreciate the work you do. I want to make clear that I intend to improve your program and I'm proud of you. I thank you, and I'm in an inferior and more ignorant position on this topic than you (you made a program, and what have I made?)

Now to the point. You misinterpret the problem. The ebook that I have has BOTH images and text, and calibre is displaying the text. Of course the text doesn't look good and it's wrong... that's the fault of the text, not your program. That's true....

But then your program lacks the ability to switch to another behaviour in which it displays the images...

To which you answer (I imagine, correct me if I don't get your point) "that's not a real ebook".

Now, equations can't be displayed like that, because they are unreadable. I beg you to look one last time at the images in my Stack Exchange question, linked in this bug's description (up there). You will see that it's impossible and dumb to try to read the text, but the images look gorgeous.

I'm a newbie on this issue and you surely know more than me. There are surely many things I don't know about this. I speak with humility.

But I have major points I ask you to consider:

0) Why "aren't they ebooks"? They satisfy this definition of ebook: https://en.wikipedia.org/wiki/E-book.
Of the 80 books I own in DJVU, all science books, I opened 11 and 9 displayed the text (
What do you mean when you say they aren't ebooks? They don't satisfy you? So that means that the majority of the books I own aren't actually ebooks? That seems a very elitist definition. With my definition, I could view all my DJVU books; with yours, I can't view them.

1) Is this about piracy at bottom? You don't want people to view pirated scanned books? If that's it, then I give up trying to convince you.

1) How hard would it be to OCR a math book, considering that math notation is highly complicated at the calligraphic level? For example, TeX and LaTeX were in considerable part created to deal with math notation being hard to handle. It's impossible to ask computers to digitize a math equation correctly. They can see letters, but math equations contain bars, sums, subindices, integrals with upper and lower limits... It's not reasonable to ask for a digital version of a printed book. If the publisher doesn't provide the original LaTeX file then... Scanning is much faster than typing. So we will never have math ebooks of most math titles with your elitist definition.

2) A scanned book is actually truer to the editorial intent than a text-based one. Yes, text-based is easier to manipulate, but scanned looks exactly like the book in print.

3) A huge portion of ebooks (the majority of the ones I have, at least) are in scanned-image formats that your app can't handle... Science books are mostly in scanned-image formats (at least those uploaded to LIBGEN).
I know, I have mostly downloaded the books I own from free sites, so my collection is biased towards "not ebooks". In this poor country, in which a dollar costs 14 pesos, a math book is not affordable. There are two alternatives: be ignorant or pirate.

4) It is not that hard and...

Eli Schwartz (eschwartz) wrote :

Saying "pirates need this feature" is an excellent way to get your request ignored. calibre's target userbase is most certainly not the lawbreakers (even if you try to justify yourself by saying you can't afford the book, that still doesn't make it legally acceptable).

And math textbooks can absolutely be created as EPUB using embedded images or better yet MathJax (both of which calibre will display)... though I doubt pirates will ever put in that effort.

And what makes you so positive that "It is not that hard" to add support for another fundamental way of processing ebook Conversion Input?

santropedro (santropedro) wrote :

I wouldn't claim piracy of books is legal, it's indeed illegal.
But it's still moral, and in this case, morality is above the law. It's also moral because my country is poor, and if I can't read, I can't educate myself, and my country will remain in a sea of ignorance and misery.

So yes, if you want to ignore my request for that, go ahead. I'm proud of my decision, and some day you will understand more than you can by just plainly reading some law.

For example, this book is 130 dollars (and the ebook is also 130, but let's try to get it cheaper). To me that means 130*14=1820. Given the Argentine minimum hourly wage: http://www.elsalario.com.ar/main/Salario/salario-minimo

That's a minimum of 36 hours of work. If you have a great salary, it can be 20.

That's too much for a book that you don't even know if you like.
That's what matters to me. I live in a democracy and most people in my country (70%) are in favour of piracy, so even if the rich elite minority, with the help of lawyers, tries to prevent me by writing laws, I do what's moral. I could donate to the author, but my donation would be small in dollars because I'm in a comparatively poor country.

I hope you guys understand.
