The solution was found by Rafał Mużyło in this bug: https://bugs.gentoo.org/show_bug.cgi?id=380429
This is now fixed for 3.1.90.
To fix broken documents run the following:
simple-scan --fix-pdf ~/Documents/*.pdf
It should be safe to run this on all PDF documents but PLEASE BACK UP FIRST. It will also copy each existing document to DocumentName.pdf~ so you have the originals in case anything goes wrong.
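For the "back up first" step, here is a minimal sketch of one way to copy the PDFs somewhere safe before running the fix. The function name backup_pdfs and the backup directory are made up for illustration; nothing here is part of simple-scan.

```python
import shutil
from pathlib import Path


def backup_pdfs(src_dir, backup_dir):
    """Copy every *.pdf directly inside src_dir into backup_dir.

    Returns the list of backup paths that were written.
    """
    src = Path(src_dir).expanduser()
    dst = Path(backup_dir).expanduser()
    dst.mkdir(parents=True, exist_ok=True)
    copies = []
    for pdf in sorted(src.glob('*.pdf')):
        # copy2 copies the contents and tries to preserve timestamps
        copies.append(Path(shutil.copy2(pdf, dst / pdf.name)))
    return copies
```

Calling backup_pdfs('~/Documents', '~/Documents-pdf-backup') would then leave untouched copies next to the originals; the second path is only an example.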
If you can't wait for the next simple-scan, you can also run the following Python 2 program (e.g. python fixpdf.py broken.pdf > fixed.pdf):
import sys
import re

lines = open(sys.argv[1], 'rb').readlines()

# Correction to apply to the byte offsets stored in the file;
# each '%' removed from the header shifts everything after it by one byte
xref_offset = 0
for (n, line) in enumerate(lines):
    # Fix PDF header and binary comment
    if (n == 0 or n == 1) and line.startswith('%%'):
        xref_offset -= 1
        line = line[1:]

    # Fix xref format
    match = re.match(r'(\d\d\d\d\d\d\d\d\d\d) 0000 n\n', line)
    if match is not None:
        offset = int(match.groups()[0])
        line = '%010d 00000 n \n' % (offset + xref_offset)

    # Fix xref offset
    if n == len(lines) - 2:
        line = '%d\n' % (int(line) + xref_offset)

    # Fix EOF marker
    if n == len(lines) - 1 and line.startswith('%%%%'):
        line = line[2:]

    print line,
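The script above is Python 2 (it uses the print statement). For anyone who only has Python 3, the same steps can be sketched roughly as below. This is an assumed port, not the code shipped in simple-scan: the names fix_pdf_lines and fix_pdf_file are invented here, and it works on bytes because a PDF is a binary file.

```python
import re

# A broken xref entry: 10-digit offset, 4-digit generation, no trailing space
BROKEN_XREF = re.compile(rb'(\d{10}) 0000 n\n')


def fix_pdf_lines(lines):
    """Return repaired copies of the lines (bytes) of a broken PDF."""
    fixed = []
    xref_offset = 0
    last = len(lines) - 1
    for n, line in enumerate(lines):
        # Fix PDF header and binary comment: drop the doubled '%'
        if n in (0, 1) and line.startswith(b'%%'):
            xref_offset -= 1
            line = line[1:]

        # Fix xref format: 5-digit generation and a space before the newline
        match = BROKEN_XREF.match(line)
        if match is not None:
            offset = int(match.group(1))
            line = b'%010d 00000 n \n' % (offset + xref_offset)

        # Fix xref offset (the second-to-last line holds the startxref value)
        if n == last - 1:
            line = b'%d\n' % (int(line) + xref_offset)

        # Fix EOF marker: '%%%%EOF' becomes '%%EOF'
        if n == last and line.startswith(b'%%%%'):
            line = line[2:]

        fixed.append(line)
    return fixed


def fix_pdf_file(src_path, dst_path):
    """Read src_path, repair it, and write the result to dst_path."""
    with open(src_path, 'rb') as f:
        lines = f.readlines()
    with open(dst_path, 'wb') as f:
        f.writelines(fix_pdf_lines(lines))
```

With this sketch, fix_pdf_file('broken.pdf', 'fixed.pdf') plays the role of the shell redirect used with the Python 2 version. Writing to a separate output file keeps the broken original around for comparison.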