conversion from docx cannot handle images in footnotes

Bug #1221686 reported by idle on 2013-09-06
This bug affects 1 person
Affects: calibre
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

I have problems converting ebooks that contain images in footnotes. The attached test file contains an inline image in a footnote and when I try to convert it to epub, the conversion fails. When I remove the image from the source file, the conversion runs fine.

I tried calibre 1.1.0 and 1.2.0 on Windows. I probably hit the same problem with an older version on Linux, but that was with a different file, so I cannot confirm that the footnote images were the sole cause; the error was the same or very similar, though.

Another file (which I cannot share, but which contained several images in footnotes) behaved differently. It converted successfully, but the images in the footnotes were broken: they appeared to be XML files rather than image files, so they would not display in the book.

The details below are from a conversion to EPUB, but I tried several other output formats and they all failed as well:

Convert book 1 of 1 (Test file)
C:\Program Files\Calibre2\pylib.zip\dateutil\parser.py:336: UnicodeWarning: Unicode equal comparison failed to convert both arguments to Unicode - interpreting them as being unequal
Resolved conversion options
calibre version: 1.2.0
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'change_justification': u'original',
 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., '\\s*((chapter|book|section|part)\\s+)|((prolog|prologue|epilogue)(\\s+|$))', 'i')) or @class = 'chapter']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': None,
 'debug_pipeline': None,
 'dehyphenate': True,
 'delete_blank_paragraphs': True,
 'disable_font_rescaling': False,
 'docx_no_cover': False,
 'dont_split_on_page_breaks': False,
 'duplicate_links_in_toc': False,
 'embed_all_fonts': False,
 'embed_font_family': None,
 'enable_heuristics': False,
 'epub_flatten': False,
 'epub_inline_toc': False,
 'epub_toc_at_end': False,
 'extra_css': None,
 'extract_to': None,
 'filter_css': u'',
 'fix_indents': True,
 'flow_size': 260,
 'font_size_mapping': None,
 'format_scene_breaks': True,
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x00000000046DB630>,
 'insert_blank_line': False,
 'insert_blank_line_size': 0.5,
 'insert_metadata': False,
 'isbn': None,
 'italicize_common_cases': True,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'markup_chapter_headings': True,
 'max_toc_links': 50,
 'minimum_line_height': 120.0,
 'no_chapters_in_toc': False,
 'no_default_epub_cover': False,
 'no_inline_navbars': False,
 'no_svg_cover': False,
 'output_profile': <calibre.customize.profiles.SonyReaderOutput object at 0x00000000046DBEF0>,
 'page_breaks_before': u'/',
 'prefer_metadata_cover': False,
 'preserve_cover_aspect_ratio': False,
 'pretty_print': True,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': u'C:\\Users\\user\\AppData\\Local\\Temp\\calibre_n19wjm\\o1jzso.opf',
 'remove_fake_margins': True,
 'remove_first_image': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'renumber_headings': True,
 'replace_scene_breaks': u'',
 'search_replace': '[]',
 'series': None,
 'series_index': None,
 'smarten_punctuation': False,
 'sr1_replace': None,
 'sr1_search': None,
 'sr2_replace': None,
 'sr2_search': None,
 'sr3_replace': None,
 'sr3_search': None,
 'start_reading_at': None,
 'subset_embedded_fonts': False,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'unsmarten_punctuation': False,
 'unwrap_lines': True,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: DOCX Input running
on C:\Users\user\AppData\Local\Temp\calibre_n19wjm\9stdli.docx
Converting Word markup to HTML
Python function terminated unexpectedly
  u'word/../customXml/item1.xml' (Error Code: 1)
Traceback (most recent call last):
  File "site.py", line 132, in main
  File "site.py", line 109, in run_entry_point
  File "site-packages\calibre\utils\ipc\worker.py", line 189, in main
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 31, in gui_convert_override
  File "site-packages\calibre\gui2\convert\gui_conversion.py", line 25, in gui_convert
  File "site-packages\calibre\ebooks\conversion\plumber.py", line 1027, in run
  File "site-packages\calibre\customize\conversion.py", line 241, in __call__
  File "site-packages\calibre\ebooks\conversion\plugins\docx_input.py", line 29, in convert
  File "site-packages\calibre\ebooks\docx\to_html.py", line 126, in __call__
  File "site-packages\calibre\ebooks\docx\to_html.py", line 330, in convert_p
  File "site-packages\calibre\ebooks\docx\to_html.py", line 511, in convert_run
  File "site-packages\calibre\ebooks\docx\images.py", line 249, in to_html
  File "site-packages\calibre\ebooks\docx\images.py", line 161, in drawing_to_html
  File "site-packages\calibre\ebooks\docx\images.py", line 149, in pic_to_img
  File "site-packages\calibre\ebooks\docx\images.py", line 109, in generate_filename
  File "site-packages\calibre\ebooks\docx\container.py", line 115, in read
KeyError: u'word/../customXml/item1.xml'
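
A note on the KeyError above. One plausible reading, which is an assumption and is not confirmed anywhere in this report, is that the relationship ID of the footnote image was looked up in the relationships of word/document.xml instead of those of word/footnotes.xml, so it pointed at a customXml item rather than at an image. That would also fit the second file, where the footnote "images" turned out to be XML. The sketch below is not calibre's code; it builds a tiny in-memory package with made-up contents and uses hypothetical helpers (rels_path_for, resolve_target, make_rels) to show how the same ID can resolve to different files depending on which part owns it, and why the joined path should be normalized before reading from the zip.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

RELS_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def rels_path_for(part):
    """Name of the .rels member that holds the relationships of `part`."""
    folder, name = posixpath.split(part)
    return posixpath.join(folder, "_rels", name + ".rels")


def resolve_target(zf, part, rid):
    """Resolve relationship `rid` of `part` to a normalized zip member name."""
    root = ET.fromstring(zf.read(rels_path_for(part)))
    for rel in root.findall("{%s}Relationship" % RELS_NS):
        if rel.get("Id") == rid:
            # Relative targets are interpreted from the folder of the owning part.
            joined = posixpath.join(posixpath.dirname(part), rel.get("Target"))
            # normpath collapses "word/../customXml" into "customXml".
            return posixpath.normpath(joined)
    raise KeyError(rid)


def make_rels(rid, target):
    """Build a minimal relationships file (hypothetical test data)."""
    return ('<Relationships xmlns="%s">'
            '<Relationship Id="%s" Type="urn:example" Target="%s"/>'
            '</Relationships>') % (RELS_NS, rid, target)


# A made-up package: both parts define rId1, but with different targets.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("word/_rels/document.xml.rels",
                make_rels("rId1", "../customXml/item1.xml"))
    zf.writestr("word/_rels/footnotes.xml.rels",
                make_rels("rId1", "media/image1.png"))
    zf.writestr("customXml/item1.xml", "<x/>")
    zf.writestr("word/media/image1.png", b"placeholder, not a real PNG")

archive = zipfile.ZipFile(buf)
via_document = resolve_target(archive, "word/document.xml", "rId1")
via_footnotes = resolve_target(archive, "word/footnotes.xml", "rId1")
print(via_document)
print(via_footnotes)
```

In this sketch, only the lookup through word/footnotes.xml yields the image. The un-normalized name word/../customXml/item1.xml is not a member of the archive, which matches the key shown in the KeyError. Targets that begin with "/" (package-absolute) are not handled here.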

idle (idle) wrote :
  • Test file.docx (24.9 KiB, application/vnd.openxmlformats-officedocument.wordprocessingml.document)

Fixed in branch master. The fix will be in the next release. calibre is usually released every Friday.

 status fixreleased

Changed in calibre:
status: New → Fix Released