Find (ctrl-f) can be used even though the pdf-file has no searchable text.

Bug #590834 reported by Roman Brodylo
This bug affects 1 person
Affects          Status    Importance  Assigned to  Milestone
Evince           Expired   Medium
evince (Ubuntu)  Triaged   Wishlist    Unassigned

Bug Description

Binary package hint: evince

PDF files can be void of any searchable text, for example PDFs created using simple-scan. Enabling the Find function for this type of PDF is misleading. I can enter a string and Evince answers "0 found", although the string/word is actually in the text. The user is misled into believing the word/string doesn't appear in the text.

PDFs not containing any searchable text should be recognized by the application (Evince), and a search should not be offered to the user, because a search really isn't done.
This can really set you off track if you're used to "standard" PDF functionality (searchability) and are oblivious to this other kind of PDF.

Ubuntu GNU/Linux 10.04 "Lucid Lynx"
evince 2.30.1-0ubuntu3

Pedro Villavicencio (pedro) wrote :

Could you attach an example to the report? Thanks.

Changed in evince (Ubuntu):
importance: Undecided → Low
status: New → Incomplete
Roman Brodylo (roman-brodylo) wrote :

Just to clarify: my issue is the user interface. I feel I've been misled when the app says it didn't find a word, when in reality no search took place.

Okay, the example:
I printed this bug report page, scanned it with simple-scan, now it's a PDF.
Open with Evince, ctrl-f and search for "pedro". Result is "0 found".

This really is easily reproducible by anyone.

I think someone with a good knowledge of the technicalities of PDF should be able to quickly assess this and determine if it can be fixed.

Sebastien Bacher (seb128) wrote :

> Just to clarify: my issue is the user interface. I feel I've been misled when the app says it didn't find a word, when in reality no search took place.

The search took place, but your example has no text, only images. Evince could try to be smart about it, and that would work in cases where you get images only. Real-world examples often have both text and images, though, and you would be back to the issue that only the text can be searched this way, not the images.

Roman Brodylo (roman-brodylo) wrote :

Maybe the search-routine was invoked - that's an internal technical view. If the search function can only be applied to text (which is the case here) then evince should try to be "smart about it" when no object (text) for the function (find) is available.

I installed acrobat reader to see if it does what I'm suggesting but it doesn't.
It acts as if it's busy searching through the file with the result: "no matches found".
But I found a function in acrobat reader under Document, Accessibility Quick Check and this function, if let loose on the same document comes up with the result:
 "This document appears to contain no text. It may be a scanned image." Oh, and how right it is!
Now if this function could be called when the document is loaded, then if I press ctrl-f (or select Edit, Find) I could be presented with a grayed, not writable search-box and with the message:
 "This document appears to contain no text. It may be a scanned image." (So give up trying to find something, dummy!)
I like apps that are smart about what they're doing.

I'll be more precise (and concise) about the bug:
For any file type that Evince can show but that has no searchable content (via ctrl-f or Edit, Find), Evince shouldn't
 - display the search-bar
 - allow inputting a search term
 - and return "0 found on this page"
as though a search had taken place.
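
A minimal sketch of that behaviour in Python, assuming the viewer can obtain the extracted text of each page as a string (empty for image-only pages). The names `SearchUiState`, `search_ui_state` and `page_texts` are hypothetical and not part of Evince; the message is the one quoted from Acrobat Reader above.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Wording borrowed from the Acrobat Reader quick check quoted above.
NO_TEXT_MESSAGE = (
    "This document appears to contain no text. It may be a scanned image."
)


@dataclass(frozen=True)
class SearchUiState:
    enabled: bool            # whether the find bar accepts input
    message: Optional[str]   # hint shown instead of "0 found", if any


def search_ui_state(page_texts: Sequence[str]) -> SearchUiState:
    """Decide whether to offer Find, given the text extracted per page.

    Assumption: `page_texts` comes from the document backend, and an
    image-only page yields an empty or whitespace-only string.
    """
    has_text = any(text.strip() for text in page_texts)
    if has_text:
        return SearchUiState(enabled=True, message=None)
    return SearchUiState(enabled=False, message=NO_TEXT_MESSAGE)


# A scanned document: every page is an image, so no text is extracted.
scanned = ["", "  \n"]
print(search_ui_state(scanned).enabled)   # False
print(search_ui_state(scanned).message)
```

With this check run once when the document is loaded, the find bar could be grayed out and show the message rather than report "0 found".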

I know this might not be the most important issue on the planet. I just stumbled across it and thought I'd let you know. If you think no action needs to be taken, I can accept that.

Sebastien Bacher (seb128) wrote :

Thank you for your bug report. The issue is an upstream one, and it would be nice if somebody having it could send the bug to the people writing the software (https://wiki.ubuntu.com/Bugs/Upstream/GNOME).

Changed in evince (Ubuntu):
importance: Low → Wishlist
status: Incomplete → New
Sebastien Bacher (seb128) wrote :

You are right, there is one case where it would work. My point was that a 75-page document with only images and one line of text would not display this warning and would still fall into the "search is useless on this document" category. Your workaround is only for a very specific case.
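
The mixed case described here could in principle be covered by looking at how many pages lack text, not just whether any text exists. The following Python sketch is only an illustration: `pages_without_text`, `search_caveat` and the 0.5 threshold are hypothetical choices, and it assumes per-page extracted text is available as strings.

```python
from typing import List, Sequence


def pages_without_text(page_texts: Sequence[str]) -> List[int]:
    """Return 1-based numbers of pages whose extracted text is empty."""
    return [
        number
        for number, text in enumerate(page_texts, start=1)
        if not text.strip()
    ]


def search_caveat(page_texts: Sequence[str], min_text_ratio: float = 0.5) -> str:
    """Return a warning for the find bar, or "" if enough pages have text.

    `min_text_ratio` is an arbitrary illustrative threshold, not a value
    taken from Evince or any other viewer.
    """
    total = len(page_texts)
    if total == 0:
        return ""
    missing = pages_without_text(page_texts)
    text_ratio = (total - len(missing)) / total
    if text_ratio >= min_text_ratio:
        return ""
    return f"{len(missing)} of {total} pages contain no searchable text."


# 75 pages: 74 scanned images plus one page with a single line of text.
document = [""] * 74 + ["one line of text"]
print(search_caveat(document))  # 74 of 75 pages contain no searchable text.
```

Search stays available in this variant; the caveat only tells the user that a "0 found" result may not mean much.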

Roman Brodylo (roman-brodylo) wrote :

Agreed. Optimal solution would be a visual cue - images and text with different, very light colored backgrounds (or frames).
But that's it for this issue from my side.
A similar bug was already filed upstream; I added my stuff.
https://bugzilla.gnome.org/show_bug.cgi?id=596888
And a patch covering 80% of my troubles is also already there.
So this bug has become obsolete.
Thanks for your efforts.

Changed in evince (Ubuntu):
status: New → Triaged
Changed in evince:
status: Unknown → Confirmed
Changed in evince:
importance: Unknown → Medium
Pierre Slamich (pierre-slamich) wrote :

Just 2 additional comments:
 - simple-scan should OCR the image and add a text version in invisible characters, overlaid on the actual images
 - evince could OCR PDF files that only have images in them

Changed in evince:
status: Confirmed → Expired