UnicodeEncodeError: 'utf-8' codec can't encode characters in position 6-7: surrogates not allowed

Bug #1881125 reported by Nelson A. de Oliveira
This bug affects 1 person
Affects           Status    Importance  Assigned to  Milestone
calibre           Invalid   Undecided   Unassigned
calibre (Debian)  New       Undecided   Unassigned

Bug Description

While trying to convert a PDF to MOBI I am seeing this:

$ ebook-convert itil-etc.pdf itil-etc.mobi
1% Convertendo entrada para HTML...
InputFormatPlugin: PDF Input running
on /home/naoliv/curso/itil-etc.pdf
pdftohtml log:
Syntax Error: Expected the optional content group list, but wasn't able to find it, or it isn't an Array
Page-1
Page-2
Page-3
(...)
Page-171
Syntax Error: Expected the optional content group list, but wasn't able to find it, or it isn't an Array
Syntax Error: Expected the optional content group list, but wasn't able to find it, or it isn't an Array
Syntax Error: Expected the optional content group list, but wasn't able to find it, or it isn't an Array
Traceback (most recent call last):
  File "/usr/bin/ebook-convert", line 20, in <module>
    sys.exit(main())
  File "/usr/lib/calibre/calibre/ebooks/conversion/cli.py", line 401, in main
    plumber.run()
  File "/usr/lib/calibre/calibre/ebooks/conversion/plumber.py", line 1108, in run
    self.oeb = self.input_plugin(stream, self.opts,
  File "/usr/lib/calibre/calibre/customize/conversion.py", line 245, in __call__
    ret = self.convert(stream, options, file_ext,
  File "/usr/lib/calibre/calibre/ebooks/conversion/plugins/pdf_input.py", line 73, in convert
    opf.render(opffile)
  File "/usr/lib/calibre/calibre/ebooks/metadata/opf2.py", line 1521, in render
    item = E.item(id=unicode_type(ref.id), href=href)
  File "src/lxml/builder.py", line 210, in lxml.builder.ElementMaker.__call__
  File "src/lxml/builder.py", line 195, in lxml.builder.ElementMaker.__init__.add_dict
  File "src/lxml/etree.pyx", line 2429, in lxml.etree._Attrib.__setitem__
  File "src/lxml/apihelpers.pxi", line 593, in lxml.etree._setAttributeValue
  File "src/lxml/apihelpers.pxi", line 1532, in lxml.etree._utf8
UnicodeEncodeError: 'utf-8' codec can't encode characters in position 6-7: surrogates not allowed
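
For reference, the same error message can be reproduced outside calibre. The sketch below is not taken from calibre's code, and the report does not establish where the surrogates came from. It assumes, purely for illustration, that the href was a name decoded from bytes that are not valid UTF-8 using Python's "surrogateescape" error handler (the handler os.fsdecode() uses on Linux). The file name and the helper find_surrogates are hypothetical.

```python
# Minimal reproduction sketch. The byte string below is a made-up example,
# not the real file name from the PDF in this report.

def find_surrogates(text):
    """Return (index, code point) pairs for every surrogate in text."""
    return [(i, hex(ord(ch))) for i, ch in enumerate(text)
            if 0xD800 <= ord(ch) <= 0xDFFF]

# Latin-1 bytes for two accented letters; they are invalid as UTF-8.
raw = b"image-\xe7\xe3.png"

# surrogateescape maps each undecodable byte to a lone surrogate
# (U+DC80..U+DCFF) instead of raising an error.
href = raw.decode("utf-8", errors="surrogateescape")

# A strict UTF-8 encode, which lxml performs on attribute values,
# rejects those surrogates.
try:
    href.encode("utf-8")
    error = None
except UnicodeEncodeError as exc:
    error = exc

print(find_surrogates(href))
print(error)

# Encoding with the same handler restores the original bytes.
assert href.encode("utf-8", errors="surrogateescape") == raw
```

If this assumption matches what happens here, the offending characters sit at the positions named in the error message, and locating them with a helper like find_surrogates would show which entry in the pdftohtml output triggers the failure.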

Calibre is Debian's 4.99.4+dfsg+really4.17.0-1
"calibre --version" gives calibre 4.99.5
Python 2.7.18 and 3.8.3 are both installed (I don't know which one calibre is using)
locale has:

LANG=pt_BR.UTF-8
LANGUAGE=pt_BR:pt:en
LC_CTYPE="pt_BR.UTF-8"
LC_NUMERIC="pt_BR.UTF-8"
LC_TIME="pt_BR.UTF-8"
LC_COLLATE="pt_BR.UTF-8"
LC_MONETARY="pt_BR.UTF-8"
LC_MESSAGES="pt_BR.UTF-8"
LC_PAPER="pt_BR.UTF-8"
LC_NAME="pt_BR.UTF-8"
LC_ADDRESS="pt_BR.UTF-8"
LC_TELEPHONE="pt_BR.UTF-8"
LC_MEASUREMENT="pt_BR.UTF-8"
LC_IDENTIFICATION="pt_BR.UTF-8"
LC_ALL=
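
Since the locale settings matter mainly through the encodings Python derives from them, a short script like the following could be run with the same interpreter to report those encodings. This is only a diagnostic sketch; the function name encoding_report is hypothetical and nothing here is specific to calibre.

```python
import locale
import sys

def encoding_report():
    """Collect the encodings Python derives from the current environment."""
    return {
        "python": sys.version.split()[0],
        # Encoding and error handler used for file names.
        "filesystem": sys.getfilesystemencoding(),
        "filesystem_errors": sys.getfilesystemencodeerrors(),
        # Encoding taken from the locale settings (LANG, LC_CTYPE, ...).
        "preferred": locale.getpreferredencoding(False),
        "stdout": getattr(sys.stdout, "encoding", None),
    }

for key, value in encoding_report().items():
    print(f"{key}: {value}")
```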

Please let me know if you need any more information.

Kovid Goyal (kovid) wrote : Re: calibre bug 1881125

I do not provide support for people using distro-provided packages. If you are using a distro-provided package, uninstall it and install the official calibre binary from https://calibre-ebook.com/download_linux. If you can reproduce the bug with that, re-open the ticket. Of course, if you are already using the official binary, apologies and feel free to re-open the ticket immediately :)

 status invalid

Changed in calibre:
status: New → Invalid
Nelson A. de Oliveira (naoliv) wrote :

I would test with the official installer, but it fails:

Extracting files to /opt/calibre ...
Extracting application files...
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "<string>", line 800, in script_launch
  File "<string>", line 756, in main
  File "<string>", line 707, in run_installer
  File "/usr/lib/python3.8/subprocess.py", line 340, in call
    with Popen(*popenargs, **kwargs) as p:
  File "/usr/lib/python3.8/subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "/usr/lib/python3.8/subprocess.py", line 1702, in _execute_child
    raise child_exception_type(errno_num, err_msg, err_filename)
FileNotFoundError: [Errno 2] No such file or directory: '/opt/calibre/calibre_postinstall'
