Convert epub to pdf, pdf appearance looks correct, but some of the copied text is incorrect
Affects | Status | Importance | Assigned to | Milestone | |
---|---|---|---|---|---|
calibre | Fix Released | Undecided | Unassigned | | |
Bug Description
* The calibre version (get this by looking at the bottom of the main calibre screen)
I've tried both 4.6.0 and 4.7.0
* The operating system you are running calibre on (Windows, OS X, Linux)
I tried Windows 10, but I think other Windows versions will have the same problem.
* My issue
When an EPUB file with Chinese text is converted to PDF, some of the text comes out garbled. I suspect a problem in how the PDF output handles CMap / CID data. I would therefore like several parameters added to the PDF output options so that these output problems can be debugged. I would rather accept a larger document than incorrect text.
The following code is taken from "src\calibre\
if opts.pdf_
merge_
if opts.pdf_
num_removed = dedup_type3_
if num_removed:
if opts.pdf_
num_removed = remove_
if num_removed:
if opts.pdf_
num_removed = pdf_doc.
if num_removed:
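To make the CMap / CID suspicion above concrete, here is a minimal sketch of why a PDF can look right while copied text is wrong. It is a simplified, hypothetical model and not calibre's code: `glyphs`, `good_map`, `bad_map`, `render` and `extract` are names invented for illustration. It assumes only the general PDF behavior that drawing uses the glyph for each character code, while copy and search use the font's ToUnicode mapping.

```python
# Hypothetical, simplified model of one PDF font (not calibre's code).
# glyphs:     character code -> glyph drawn on the page
# to_unicode: character code -> text returned by copy or search


def render(codes, glyphs):
    """What the reader sees: the glyph drawn for each character code."""
    return "".join(glyphs[c] for c in codes)


def extract(codes, to_unicode):
    """What copy/paste returns: the ToUnicode lookup for each code.

    Codes missing from the mapping become U+FFFD (replacement character).
    """
    return "".join(to_unicode.get(c, "\ufffd") for c in codes)


glyphs = {1: "中", 2: "文", 3: "电", 4: "子"}

# A correct ToUnicode mapping agrees with the glyphs.
good_map = dict(glyphs)

# Assumed failure mode for illustration: two entries swapped, one dropped.
bad_map = {1: "中", 2: "电", 3: "文"}

page = [1, 2, 3, 4]
print(render(page, glyphs))     # appearance is unaffected by the mapping
print(extract(page, good_map))  # copied text matches the appearance
print(extract(page, bad_map))   # copied text is garbled
```

If the real cause is of this kind, an option that skips a post-processing step which rewrites fonts or their mappings would help narrow down which step damages the mapping, at the cost of a larger file.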
* If you are reporting a conversion problem, attach the input file and the output file and describe exactly what the problem is.
The attachment shows the PDF reader on the left, where the appearance is fine, and on the right the text selected on page 4 and copied into Notepad. All of the text marked in red in the figure is wrong.
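For anyone trying to reproduce this without comparing screenshots by eye, the following is a small sketch that lists where copied text differs from the source text. It uses only Python's standard `difflib`. The function name `find_garbled` is hypothetical, and it assumes you have already saved the expected text (from the EPUB) and the copied text (from the PDF reader) as plain strings.

```python
from difflib import SequenceMatcher


def find_garbled(expected, copied):
    """Return (position, expected_run, copied_run) for each differing span.

    position is the index into `expected` where the difference starts.
    """
    # autojunk=False keeps frequent characters from being ignored in long texts.
    matcher = SequenceMatcher(None, expected, copied, autojunk=False)
    return [
        (i1, expected[i1:i2], copied[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    # Placeholder strings; substitute the real source and copied text.
    for pos, want, got in find_garbled("abcdef", "abXdef"):
        print(f"at {pos}: expected {want!r}, copied {got!r}")
```

A list of the affected characters can make it easier to see whether the errors cluster around particular glyphs or fonts.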
Below is my command line and output:
D:\software\
Conversion options changed from defaults:
base_font_size: 14.0
pdf_serif_family: u'\u5fae\
pdf_sans_family: u'\u5fae\
1% Converting input to HTML...
InputFormatPlugin: EPUB Input running
on D:\software\
Found HTML cover titlepage.xhtml
Parsing all content...
34% Running transforms on e-book...
Merging user specified metadata...
Detecting structure...
Flattening CSS and remapping font sizes...
Source base font size is 13.20000pt
Removing fake margins...
Cleaning up manifest...
Trimming unused files from manifest...
Creating PDF Output...
67% Running PDF Output plugin
D:\software\
The cover image has an id != "cover". Renaming to work around bug in Nook Color
68% Parsed all content for markup transformation
70% Completed markup transformation
90% Rendered all HTML as PDF
91% Added links to PDF content
100% Updated metadata in PDF
PDF output written to D:\software\
Output saved to D:\software\
Embed the fonts you are using in the epub file and attach that, so I can
reproduce.
status incomplete