pdftotext -htmlmeta outputs incomplete metadata, while pdfinfo outputs all of it
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Poppler | New | Unknown | |
poppler (Ubuntu) | New | Undecided | Unassigned |
Bug Description
pdftotext -htmlmeta omits metadata from the PDF catalog, whereas pdfinfo outputs all known values.
For example, the pdfinfo output:
Title: Titel
Author: Word
Creator: WordToPDF 2.4 build 127
Producer: AFPL Ghostscript 8.54
CreationDate: Fri Jul 2 09:14:02 2007
ModDate: Fri Jul 2 09:14:02 2007
Tagged: no
Pages: 6
Encrypted: no
Page size: 595 x 842 pts (A4)
File size: 104664 bytes
Optimized: no
PDF version: 1.3
In contrast, the meta section of the pdftotext -htmlmeta output:
<head>
<title>Titel</title>
<meta name="Author" content="Word"/>
<meta name="Creator" content="WordToPDF 2.4 build 127"/>
<meta name="Producer" content="AFPL Ghostscript 8.54"/>
<meta name="CreationDate" content=""/>
</head>
The two outputs do not match, and some metadata is missing from the -htmlmeta output: CreationDate is empty and ModDate is absent.
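The comparison described above can be automated. The following is a minimal sketch, not part of poppler: it parses the text of a pdfinfo run and the HTML of a pdftotext -htmlmeta run, then lists the document information fields that pdfinfo reports but that are empty or absent in the HTML head. The function names and the `INFO_KEYS` list are assumptions made for illustration; the keys are the standard PDF document information entries.

```python
from html.parser import HTMLParser

# Standard PDF document information keys (assumed to be the set worth comparing).
INFO_KEYS = ["Title", "Author", "Subject", "Keywords",
             "Creator", "Producer", "CreationDate", "ModDate"]


def parse_pdfinfo(text):
    """Parse pdfinfo output ("Key: value" per line) into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so times like 09:14:02 stay intact.
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


class HeadMetaParser(HTMLParser):
    """Collect the <title> text and <meta name=... content=...> pairs."""

    def __init__(self):
        super().__init__()
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attributes = dict(attrs)
            if "name" in attributes:
                self.meta[attributes["name"]] = attributes.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["Title"] = self.meta.get("Title", "") + data


def parse_htmlmeta(html):
    """Return the metadata found in the HTML produced by pdftotext -htmlmeta."""
    parser = HeadMetaParser()
    parser.feed(html)
    parser.close()
    return parser.meta


def missing_metadata(pdfinfo_text, html_text):
    """List info keys that pdfinfo reports but the HTML leaves empty or omits."""
    info = parse_pdfinfo(pdfinfo_text)
    meta = parse_htmlmeta(html_text)
    return [key for key in INFO_KEYS
            if info.get(key) and not meta.get(key, "").strip()]
```

Calling `missing_metadata(pdfinfo_text, html_text)` with the two captured outputs returns the names of the affected fields, which makes it easier to check whether a given poppler version still shows the problem.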
ProblemType: Bug
DistroRelease: Ubuntu 11.10
Package: poppler-utils 0.16.7-2ubuntu2
Uname: Linux 3.3.3-030303-
NonfreeKernelMo
ApportVersion: 1.23-0ubuntu4
Architecture: amd64
Date: Wed May 2 15:44:06 2012
InstallationMedia: Ubuntu 11.04 "Natty Narwhal" - Release amd64+mac (20110427.1)
ProcEnviron:
LANGUAGE=
PATH=(custom, user)
LANG=de_DE.UTF-8
SHELL=/bin/bash
SourcePackage: poppler
UpgradeStatus: Upgraded to oneiric on 2012-02-16 (76 days ago)
Changed in poppler: | |
importance: | Unknown → Medium |
status: | Unknown → Confirmed |
tags: | added: precise |
Changed in poppler: | |
status: | Confirmed → Unknown |
Changed in poppler (Ubuntu): | |
status: | Incomplete → New |
no longer affects: | poppler |
Changed in poppler: | |
status: | Unknown → New |
This bug was originally reported at https://bugs.launchpad.net/ubuntu/+source/poppler/+bug/993292
The pdftotext -htmlmeta output is missing metadata from the PDF catalog, whereas pdfinfo outputs all known values.
For example, the pdfinfo output:
Title: Titel
Author: Word
Creator: WordToPDF 2.4 build 127
Producer: AFPL Ghostscript 8.54
CreationDate: Fri Jul 2 09:14:02 2007
ModDate: Fri Jul 2 09:14:02 2007
Tagged: no
Pages: 6
Encrypted: no
Page size: 595 x 842 pts (A4)
File size: 104664 bytes
Optimized: no
PDF version: 1.3
In contrast, the meta section of the pdftotext -htmlmeta output:
<head>
<title>Titel</title>
<meta name="Author" content="Word"/>
<meta name="Creator" content="WordToPDF 2.4 build 127"/>
<meta name="Producer" content="AFPL Ghostscript 8.54"/>
<meta name="CreationDate" content=""/>
</head>
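To reproduce the report on another file, both tools can be run on the same PDF and their outputs captured side by side. This is a hedged sketch: the helper names and the `sample.pdf` path are hypothetical, and it assumes the poppler-utils binaries `pdfinfo` and `pdftotext` are on the PATH. Passing `-` as the output file makes pdftotext write to standard output.

```python
import shutil
import subprocess


def build_commands(pdf_path):
    """Return the argv lists for the two tools compared in this report."""
    return [
        ["pdfinfo", pdf_path],
        # "-" as the text file tells pdftotext to write to stdout.
        ["pdftotext", "-htmlmeta", pdf_path, "-"],
    ]


def collect_outputs(pdf_path):
    """Run both tools on pdf_path and return their stdout keyed by tool name."""
    outputs = {}
    for argv in build_commands(pdf_path):
        if shutil.which(argv[0]) is None:
            raise FileNotFoundError(f"{argv[0]} not found on PATH")
        result = subprocess.run(argv, capture_output=True, text=True, check=True)
        outputs[argv[0]] = result.stdout
    return outputs


if __name__ == "__main__":
    # sample.pdf is a placeholder; substitute the file that shows the problem.
    for tool, text in collect_outputs("sample.pdf").items():
        print(f"--- {tool} ---")
        print(text)
```

Attaching both captured outputs to the report lets a maintainer see which fields differ without needing the original PDF.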