Cannot copy text from specific pdf in evince

Bug #545176 reported by drpjkurian on 2010-03-23
This bug affects 1 person
Affects            Status      Importance   Assigned to   Milestone
Poppler            Won't Fix   Medium
evince (Ubuntu)    Invalid     Undecided    Unassigned

Bug Description

Binary package hint: evince

1) lsb_release -rd
Description: Ubuntu Vivid Vervet (development branch)
Release: 15.04

2) apt-cache policy evince
evince:
  Installed: 3.14.1-0ubuntu1
  Candidate: 3.14.1-0ubuntu1
  Version table:
 *** 3.14.1-0ubuntu1 0
        500 http://us.archive.ubuntu.com/ubuntu/ vivid/main amd64 Packages
        100 /var/lib/dpkg/status

3) What is expected to happen with https://bugs.launchpad.net/ubuntu/+source/evince/+bug/545176/+attachment/1239564/+files/emrscheme.pdf is that one may select any of the text, just like in Adobe Reader.

4) What happens instead is only a handful of letters are selectable as per https://launchpadlibrarian.net/41742091/Screenshot-2.png .

ProblemType: Bug
Architecture: i386
Date: Tue Mar 23 21:09:22 2010
DistroRelease: Ubuntu 9.10
ExecutablePath: /usr/bin/evince
Package: evince 2.28.1-0ubuntu1.2
ProcEnviron:
 LANGUAGE=en_IN.UTF-8
 PATH=(custom, user)
 LANG=en_IN.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.31-20.58-generic
SourcePackage: evince
Uname: Linux 2.6.31-20-generic i686

drpjkurian (drpjkurian-gmail) wrote :
madbiologist (me-again) wrote :

Can you attach the problem document to this bug report?

Changed in evince (Ubuntu):
status: New → Incomplete
drpjkurian (drpjkurian-gmail) wrote :

Hi
I am using Karmic Koala with the latest kernel.
When I try to copy text from a paragraph in a PDF file, the program selects a few random letters (usually the first letter of each word) from the paragraph. I was not able to select the whole paragraph to copy.

I hope this is what you meant (problem document). Sorry if I am wrong.

madbiologist (me-again) wrote :

The image you attached is useful. It shows us the problem you are experiencing, more clearly than a written description. Can you also attach the PDF file so that we can do some testing?

drpjkurian (drpjkurian-gmail) wrote :

Hi
Please find herewith the attached copy of the .pdf file

With regards
Dr Kurian

madbiologist (me-again) on 2010-03-27
Changed in evince (Ubuntu):
status: Incomplete → Confirmed
madbiologist (me-again) wrote :

Confirmed on Ubuntu 10.04 "Lucid Lynx" alpha 3 with updates.

Package versions:
evince 2.29.92-0ubuntu1
poppler 0.12.4-0ubuntu2
libcairo2 1.8.10-2ubuntu1

Still confirmed with evince 2.30 on lucid beta 2

drpjkurian (drpjkurian-gmail) wrote :

The problem still persists in the Lucid stable version.

madbiologist (me-again) wrote :

Still occurring on Maverick.

uname: Linux 2.6.35-22-generic i686
poppler 0.14.3-0ubuntu1
evince 2.32.0-0ubuntu1

description: updated
Changed in evince (Ubuntu):
importance: Undecided → Medium
status: Confirmed → Triaged
summary: - cannot copy text from pdf in evince
+ Cannot copy text from specific pdf in evince
Changed in evince:
importance: Unknown → Medium
status: Unknown → New
madbiologist (me-again) on 2014-12-18
tags: added: karmic lucid maverick
Sitaram Shelke (sitaramshelke) wrote :

Is the PDF format correct?
I have tried opening the PDF with Evince and Okular on Ubuntu and with the default PDF reader on Mac, and I have attached the respective screenshots if you want to take a look. Each reader gives a different selection, and none of them is perfect.

Changed in evince:
status: New → Confirmed

Created attachment 118948
emrscheme-1.pdf

Upstreaming as advised in:
https://bugzilla.gnome.org/show_bug.cgi?id=741623

Downstream report:
https://bugs.launchpad.net/ubuntu/+source/evince/+bug/545176

lsb_release -rd
Description: Ubuntu Wily Werewolf (development branch)
Release: 15.10

apt-cache policy poppler-utils
poppler-utils:
  Installed: 0.33.0-0ubuntu3
  Candidate: 0.33.0-0ubuntu3
  Version table:
 *** 0.33.0-0ubuntu3 0
        500 http://us.archive.ubuntu.com/ubuntu/ wily/main amd64 Packages
        100 /var/lib/dpkg/status

What is expected to happen with Evince is that when one opens the attached PDF file, one may select any of the text, just like in Adobe Reader.

What happens instead is only a handful of letters are selectable as per https://launchpadlibrarian.net/41742091/Screenshot-2.png . First reported against Ubuntu 9.10 evince 2.28.1-0ubuntu1.2 / poppler-utils 0.12.4-0ubuntu4.

Changed in evince:
status: Confirmed → Unknown

I don't see why it would be useful to select text from this document. You aren't going to be able to copy and paste any readable text from it. The document doesn't use a standard encoding or include a way to map from charcode to text, so at best you're going to get a bunch of random letters and control characters.
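
To illustrate the point above, here is a minimal Python sketch (not Poppler code) of why text selection depends on a charcode-to-text map. The character codes and the `to_unicode` table are invented for illustration; in a real PDF this role is played by a standard encoding or a ToUnicode CMap, and nothing here is taken from the attached file.

```python
# Hypothetical character codes, as they might appear in a PDF content
# stream for a font with a custom (non-standard) encoding.
codes = [1, 2, 3, 3, 4]

# A charcode-to-text map (the role a ToUnicode CMap plays). Invented values.
to_unicode = {1: "H", 2: "e", 3: "l", 4: "o"}


def extract_text(codes, mapping=None):
    """Map character codes to text; fall back to the raw code values."""
    if mapping is not None:
        # Unknown codes become U+FFFD (replacement character).
        return "".join(mapping.get(c, "\ufffd") for c in codes)
    # No map available: the only option left is to treat each code as a
    # character value, which here produces control characters.
    return "".join(chr(c) for c in codes)


with_map = extract_text(codes, to_unicode)
without_map = extract_text(codes)
print(repr(with_map))
print(repr(without_map))
```

With the map the glyph codes come out as readable text; without it, the same codes yield only control characters, which matches the behaviour described in the comment above.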

Changed in poppler:
importance: Unknown → Medium
status: Unknown → Confirmed

Jason Crain, thanks for taking a look. Since your point about the document's encoding is confirmed by Windows 10's built-in PDF reader, this is considered closed.

drpjkurian, as per upstream:
"The document doesn't use a standard encoding or include a way to map from charcode to text"

Hence, the root cause lies in how the original author created the PDF, rather than in a software bug in a PDF viewer.
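
For anyone who wants to check a PDF for this kind of problem, the following is a rough Python sketch built around `pdffonts` from poppler-utils, which lists a document's fonts and has a "uni" column indicating whether each font carries a ToUnicode map. The output layout assumed here (header line, dashed ruler, then one row per font ending in emb, sub, uni and the object ID) may differ between Poppler versions, and the `SAMPLE` text is made up; it is not the real output for emrscheme.pdf. The function names are hypothetical.

```python
import subprocess

# Made-up sample in the layout pdffonts is assumed to print.
# It is NOT the real output for the PDF attached to this bug.
SAMPLE = """\
name                                 type              encoding         emb sub uni object ID
------------------------------------ ----------------- ---------------- --- --- --- ---------
ABCDEF+CustomFont                    Type 1C           Custom           yes yes no       8  0
GHIJKL+Helvetica                     TrueType          WinAnsi          yes yes yes     12  0
"""


def fonts_without_unicode_map(pdffonts_output):
    """Return names of fonts whose 'uni' column reads 'no'."""
    missing = []
    # Skip the header line and the dashed ruler line.
    for row in pdffonts_output.splitlines()[2:]:
        fields = row.split()
        if len(fields) < 6:
            continue
        # Counting from the right: emb, sub, uni, object number, generation.
        name, uni = fields[0], fields[-3]
        if uni == "no":
            missing.append(name)
    return missing


def check_pdf(path):
    """Run pdffonts on a file and return fonts lacking a ToUnicode map."""
    result = subprocess.run(
        ["pdffonts", path], capture_output=True, text=True, check=True
    )
    return fonts_without_unicode_map(result.stdout)


print(fonts_without_unicode_map(SAMPLE))
```

A "no" in that column is not conclusive on its own, since a font with a standard encoding can still be extracted without a ToUnicode map. A custom encoding combined with no map is the situation described upstream.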

Changed in evince (Ubuntu):
importance: Medium → Undecided
status: Triaged → Invalid
no longer affects: evince
Changed in poppler:
status: Confirmed → Won't Fix