[EPUB->MOBI] Calibre MOBI file Size too small

Bug #2023189 reported by Upamanyu Santra
This bug affects 1 person
Affects   Status         Importance   Assigned to   Milestone
calibre   Fix Released   Undecided    Unassigned

Bug Description

Even with the output profile set to tablet, converting EPUB to MOBI in calibre 6.19.1 produces drastically smaller files. I tested 4.23.0, which did not have the problem.
All of the files are available at the pCloud links below.
Original epub
https://e.pcloud.link/publink/show?code=XZwEquZD3kHTXYo4RXgzattyuOYcuguOdNk
6.19.1 mobi
https://e.pcloud.link/publink/show?code=XZKEquZ0NxYpESu4eS0Ys6KsbozQkEY59BX
4.23.0 mobi
https://e.pcloud.link/publink/show?code=XZ9EquZ3P7Kw9DasiuLBdv84aelVm0Mji3V

Log of 6.19.1 installed
Convert book 1 of 1 (Hogarth (World of Art))
DeDRM v10.0.3: Trying to decrypt 4u3mnwdn.epub
DeDRM v10.0.3: Verifying zip archive integrity
DeDRM v10.0.3: Post-processing took 0.0 seconds
DeDRM v10.0.3: Finished after 5.3 seconds
Conversion options changed from defaults:
  page_breaks_before: "//*[name()='h1' or name()='h2']"
  personal_doc: None
  verbose: 2
  disable_font_rescaling: True
  cover: 'C:\\Users\\upama\\AppData\\Local\\Temp\\calibre_4xa_b8ea\\g8ylno4p.jpeg'
  output_profile: 'tablet'
  read_metadata_from_opf: 'C:\\Users\\upama\\AppData\\Local\\Temp\\calibre_4xa_b8ea\\wbvu5pa5.opf'
Resolved conversion options
calibre version: 6.19.1
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'change_justification': 'original',
 'chapter': "//*[((name()='h1' or name()='h2') and re:test(., "
            "'\\s*((chapter|book|section|part)\\s+)|((prolog|prologue|epilogue)(\\s+|$))', "
            "'i')) or @class = 'chapter']",
 'chapter_mark': 'pagebreak',
 'comments': None,
 'cover': 'C:\\Users\\upama\\AppData\\Local\\Temp\\calibre_4xa_b8ea\\g8ylno4p.jpeg',
 'debug_pipeline': None,
 'dehyphenate': True,
 'delete_blank_paragraphs': True,
 'disable_font_rescaling': True,
 'dont_compress': False,
 'duplicate_links_in_toc': False,
 'embed_all_fonts': False,
 'embed_font_family': None,
 'enable_heuristics': False,
 'expand_css': False,
 'extra_css': None,
 'extract_to': None,
 'filter_css': '',
 'fix_indents': True,
 'font_size_mapping': None,
 'format_scene_breaks': True,
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x00000219F0CE3C70>,
 'insert_blank_line': False,
 'insert_blank_line_size': 0.5,
 'insert_metadata': False,
 'isbn': None,
 'italicize_common_cases': True,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'markup_chapter_headings': True,
 'max_toc_links': 50,
 'minimum_line_height': 120.0,
 'mobi_file_type': 'old',
 'mobi_ignore_margins': False,
 'mobi_keep_original_images': False,
 'mobi_toc_at_start': False,
 'no_chapters_in_toc': False,
 'no_inline_navbars': False,
 'no_inline_toc': False,
 'output_profile': <calibre.customize.profiles.TabletOutput object at 0x00000219F0CE1CC0>,
 'page_breaks_before': "//*[name()='h1' or name()='h2']",
 'personal_doc': None,
 'prefer_author_sort': False,
 'prefer_metadata_cover': False,
 'pretty_print': False,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': 'C:\\Users\\upama\\AppData\\Local\\Temp\\calibre_4xa_b8ea\\wbvu5pa5.opf',
 'remove_fake_margins': True,
 'remove_first_image': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'renumber_headings': True,
 'replace_scene_breaks': '',
 'search_replace': '[]',
 'series': None,
 'series_index': None,
 'share_not_sync': False,
 'smarten_punctuation': False,
 'sr1_replace': None,
 'sr1_search': None,
 'sr2_replace': None,
 'sr2_search': None,
 'sr3_replace': None,
 'sr3_search': None,
 'start_reading_at': None,
 'subset_embedded_fonts': False,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'transform_css_rules': '[]',
 'transform_html_rules': '[]',
 'unsmarten_punctuation': False,
 'unwrap_lines': True,
 'use_auto_toc': False,
 'verbose': 2}
DeDRM v10.0.3: Trying to decrypt _d02i1ur.epub
DeDRM v10.0.3: Verifying zip archive integrity
DeDRM v10.0.3: Post-processing took 0.0 seconds
DeDRM v10.0.3: Finished after 5.3 seconds
InputFormatPlugin: EPUB Input running
on C:\Users\upama\AppData\Local\Temp\calibre_4xa_b8ea\8nkwls0f.epub
Parsing all content...
Parsing OEBPS/xhtml/12_Chapter05.xhtml ...
Parsing OEBPS/xhtml/01_Logo.xhtml ...
Parsing OEBPS/xhtml/18_Chapter11.xhtml ...
Parsing OEBPS/xhtml/11_Chapter04.xhtml ...
Parsing OEBPS/styles/stylesheet.css ...
Parsing OEBPS/xhtml/19_Bibliography.xhtml ...
Parsing OEBPS/xhtml/04_Abouttheauthor.xhtml ...
Parsing OEBPS/xhtml/03_Title.xhtml ...
Parsing OEBPS/xhtml/17_Chapter10.xhtml ...
Parsing OEBPS/xhtml/05_Contents.xhtml ...
Parsing OEBPS/xhtml/16_Chapter09.xhtml ...
Parsing OEBPS/xhtml/09_Chapter02.xhtml ...
Parsing OEBPS/xhtml/toc.xhtml ...
Parsing OEBPS/xhtml/15_Chapter08.xhtml ...
Parsing OEBPS/xhtml/10_Chapter03.xhtml ...
Parsing OEBPS/xhtml/02_Fm01.xhtml ...
Parsing OEBPS/xhtml/08_Chapter01.xhtml ...
Parsing OEBPS/xhtml/14_Chapter07.xhtml ...
Parsing OEBPS/xhtml/06_Preface.xhtml ...
Parsing OEBPS/xhtml/23_Mar.xhtml ...
Parsing OEBPS/xhtml/00_Cover.xhtml ...
Parsing OEBPS/xhtml/07_Introduction.xhtml ...
Parsing OEBPS/xhtml/22_Copyright.xhtml ...
Parsing OEBPS/xhtml/21_Index.xhtml ...
Parsing OEBPS/xhtml/20_Illu.xhtml ...
Parsing OEBPS/xhtml/13_Chapter06.xhtml ...
Reading TOC from NCX...
Merging user specified metadata...
Detecting structure...
 Detected chapter: Chapter 1
 Detected chapter: Chapter 2
 Detected chapter: Chapter 3
 Detected chapter: Chapter 4
 Detected chapter: Chapter 5
 Detected chapter: Chapter 6
 Detected chapter: Chapter 7
 Detected chapter: Chapter 8
 Detected chapter: Chapter 9
 Detected chapter: Chapter 10
 Detected chapter: Chapter 11
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Removing fake margins...
Found 905 items of level: p_1
Found 165 items of level: div_1
Found 335 items of level: p_2
Negative text indent detected at level p_1, ignoring this level
div_1 left margin stats: Counter()
div_1 right margin stats: Counter()
p_2 left margin stats: Counter({'0': 171, '1.5em': 164})
p_2 right margin stats: Counter({'0': 335})
Cleaning up manifest...
Trimming unused files from manifest...
Trimming 'OEBPS/toc.ncx' from manifest
Trimming 'OEBPS/xhtml/toc.xhtml' from manifest
Creating MOBI Output...
Serializing resources...
Creating MOBI 6 output
Generating in-line TOC...
Applying case-transforming CSS...
Parsing manglecase.css ...
Parsing tocstyle.css ...
Rasterizing SVG images...
Converting XHTML to Mobipocket markup...
Serializing markup content...
  Compressing markup content...
Generating MOBI index for a book
MOBI output written to C:\Users\upama\AppData\Local\Temp\calibre_4xa_b8ea\gaw63ibe.mobi

Log of 4.23.0 portable
Convert book 1 of 1 (Hogarth (World of Art))
Conversion options changed from defaults:
  cover: u'C:\\Users\\upama\\AppData\\Local\\Temp\\calibre_rvq_kn\\xtje_y.jpeg'
  read_metadata_from_opf: u'C:\\Users\\upama\\AppData\\Local\\Temp\\calibre_rvq_kn\\zldhxf.opf'
  output_profile: u'tablet'
  personal_doc: None
  disable_font_rescaling: True
  page_breaks_before: u"//*[name()='h1' or name()='h2']"
  verbose: 2
Resolved conversion options
calibre version: 4.23.0
{'asciiize': False,
 'author_sort': None,
 'authors': None,
 'base_font_size': 0.0,
 'book_producer': None,
 'change_justification': u'original',
 'chapter': u"//*[((name()='h1' or name()='h2') and re:test(., '\\s*((chapter|book|section|part)\\s+)|((prolog|prologue|epilogue)(\\s+|$))', 'i')) or @class = 'chapter']",
 'chapter_mark': u'pagebreak',
 'comments': None,
 'cover': u'C:\\Users\\upama\\AppData\\Local\\Temp\\calibre_rvq_kn\\xtje_y.jpeg',
 'debug_pipeline': None,
 'dehyphenate': True,
 'delete_blank_paragraphs': True,
 'disable_font_rescaling': True,
 'dont_compress': False,
 'duplicate_links_in_toc': False,
 'embed_all_fonts': False,
 'embed_font_family': None,
 'enable_heuristics': False,
 'expand_css': False,
 'extra_css': None,
 'extract_to': None,
 'filter_css': u'',
 'fix_indents': True,
 'font_size_mapping': None,
 'format_scene_breaks': True,
 'html_unwrap_factor': 0.4,
 'input_encoding': None,
 'input_profile': <calibre.customize.profiles.InputProfile object at 0x0670DA30>,
 'insert_blank_line': False,
 'insert_blank_line_size': 0.5,
 'insert_metadata': False,
 'isbn': None,
 'italicize_common_cases': True,
 'keep_ligatures': False,
 'language': None,
 'level1_toc': None,
 'level2_toc': None,
 'level3_toc': None,
 'line_height': 0.0,
 'linearize_tables': False,
 'margin_bottom': 5.0,
 'margin_left': 5.0,
 'margin_right': 5.0,
 'margin_top': 5.0,
 'markup_chapter_headings': True,
 'max_toc_links': 50,
 'minimum_line_height': 120.0,
 'mobi_file_type': u'old',
 'mobi_ignore_margins': False,
 'mobi_keep_original_images': False,
 'mobi_toc_at_start': False,
 'no_chapters_in_toc': False,
 'no_inline_navbars': False,
 'no_inline_toc': False,
 'output_profile': <calibre.customize.profiles.TabletOutput object at 0x06718030>,
 'page_breaks_before': u"//*[name()='h1' or name()='h2']",
 'personal_doc': None,
 'prefer_author_sort': False,
 'prefer_metadata_cover': False,
 'pretty_print': False,
 'pubdate': None,
 'publisher': None,
 'rating': None,
 'read_metadata_from_opf': u'C:\\Users\\upama\\AppData\\Local\\Temp\\calibre_rvq_kn\\zldhxf.opf',
 'remove_fake_margins': True,
 'remove_first_image': False,
 'remove_paragraph_spacing': False,
 'remove_paragraph_spacing_indent_size': 1.5,
 'renumber_headings': True,
 'replace_scene_breaks': u'',
 'search_replace': '[]',
 'series': None,
 'series_index': None,
 'share_not_sync': False,
 'smarten_punctuation': False,
 'sr1_replace': None,
 'sr1_search': None,
 'sr2_replace': None,
 'sr2_search': None,
 'sr3_replace': None,
 'sr3_search': None,
 'start_reading_at': None,
 'subset_embedded_fonts': False,
 'tags': None,
 'timestamp': None,
 'title': None,
 'title_sort': None,
 'toc_filter': None,
 'toc_threshold': 6,
 'toc_title': None,
 'transform_css_rules': '[]',
 'unsmarten_punctuation': False,
 'unwrap_lines': True,
 'use_auto_toc': False,
 'verbose': 2}
InputFormatPlugin: EPUB Input running
on C:\Users\upama\AppData\Local\Temp\calibre_rvq_kn\hmmmtb.epub
Parsing all content...
Parsing OEBPS/xhtml/06_Preface.xhtml ...
Parsing OEBPS/xhtml/toc.xhtml ...
Parsing OEBPS/xhtml/04_Abouttheauthor.xhtml ...
Parsing OEBPS/xhtml/13_Chapter06.xhtml ...
Parsing OEBPS/xhtml/00_Cover.xhtml ...
Parsing OEBPS/xhtml/03_Title.xhtml ...
Parsing OEBPS/xhtml/12_Chapter05.xhtml ...
Parsing OEBPS/xhtml/05_Contents.xhtml ...
Parsing OEBPS/xhtml/14_Chapter07.xhtml ...
Parsing OEBPS/xhtml/02_Fm01.xhtml ...
Parsing OEBPS/xhtml/10_Chapter03.xhtml ...
Parsing OEBPS/xhtml/19_Bibliography.xhtml ...
Parsing OEBPS/styles/stylesheet.css ...
Parsing OEBPS/xhtml/11_Chapter04.xhtml ...
Parsing OEBPS/xhtml/18_Chapter11.xhtml ...
Parsing OEBPS/xhtml/23_Mar.xhtml ...
Parsing OEBPS/xhtml/09_Chapter02.xhtml ...
Parsing OEBPS/xhtml/17_Chapter10.xhtml ...
Parsing OEBPS/xhtml/20_Illu.xhtml ...
Parsing OEBPS/xhtml/07_Introduction.xhtml ...
Parsing OEBPS/xhtml/16_Chapter09.xhtml ...
Parsing OEBPS/xhtml/22_Copyright.xhtml ...
Parsing OEBPS/xhtml/21_Index.xhtml ...
Parsing OEBPS/xhtml/01_Logo.xhtml ...
Parsing OEBPS/xhtml/08_Chapter01.xhtml ...
Parsing OEBPS/xhtml/15_Chapter08.xhtml ...
Reading TOC from NCX...
Merging user specified metadata...
Detecting structure...
 Detected chapter: Chapter 1
 Detected chapter: Chapter 2
 Detected chapter: Chapter 3
 Detected chapter: Chapter 4
 Detected chapter: Chapter 5
 Detected chapter: Chapter 6
 Detected chapter: Chapter 7
 Detected chapter: Chapter 8
 Detected chapter: Chapter 9
 Detected chapter: Chapter 10
 Detected chapter: Chapter 11
Flattening CSS and remapping font sizes...
Source base font size is 12.00000pt
Removing fake margins...
Found 165 items of level: div_1
Found 335 items of level: p_2
Found 905 items of level: p_1
div_1 left margin stats: Counter()
div_1 right margin stats: Counter()
p_2 left margin stats: Counter({u'0': 171, u'1.5em': 164})
p_2 right margin stats: Counter({u'0': 335})
Negative text indent detected at level p_1, ignoring this level
Cleaning up manifest...
Trimming unused files from manifest...
Trimming u'OEBPS/xhtml/toc.xhtml' from manifest
Trimming u'OEBPS/toc.ncx' from manifest
Creating MOBI Output...
Serializing resources...
Creating MOBI 6 output
Generating in-line TOC...
Applying case-transforming CSS...
Parsing manglecase.css ...
Parsing tocstyle.css ...
Rasterizing SVG images...
Converting XHTML to Mobipocket markup...
Serializing markup content...
  Compressing markup content...
Generating MOBI index for a book
MOBI output written to C:\Users\upama\AppData\Local\Temp\calibre_rvq_kn\hyfudd.mobi

Upamanyu Santra (upamanyu1999) wrote :

Kovid Goyal (kovid) wrote :

The files are indeed smaller, but the image quality is the same. The only difference is that the images have been processed to conform to the limitations of Amazon's Kindle renderer, which caused the JPEG images to be saved with higher compression. That higher compression does not lead to any visible differences. See, for example, the two attached images, one from the original EPUB and the other from the converted AZW3. (Do not use MOBI; it is a legacy/obsolete format, and AZW3 is much higher quality. Regardless, with the option to keep original images, the images in the MOBI output will be the same as in the AZW3 output.)
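For anyone who wants to try the "keep original images" option from the command line, here is a minimal sketch that only builds an ebook-convert invocation. The helper name build_convert_command and the file names are hypothetical; the flag is assumed to be the command-line form of the mobi_keep_original_images option shown in the logs above, and whether it changes the output for this particular book has not been checked here.

```python
import shlex


def build_convert_command(src, dst, profile="tablet", keep_original_images=True):
    """Build an argv list for calibre's ebook-convert (illustrative helper)."""
    cmd = ["ebook-convert", src, dst, "--output-profile", profile]
    # The keep-original-images switch is a MOBI output option, so it is
    # only added when the target file is a .mobi.
    if keep_original_images and dst.lower().endswith(".mobi"):
        cmd.append("--mobi-keep-original-images")
    return cmd


if __name__ == "__main__":
    # Hypothetical file names, printed as a copy-pasteable shell command.
    print(shlex.join(build_convert_command("book.epub", "book.mobi")))
```

The returned list can be passed to subprocess.run on a machine where calibre's command-line tools are on the PATH.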

Kovid Goyal (kovid) wrote :

and from the original epub

Kovid Goyal (kovid) wrote :

The reduction in size is purely because the original epub has images with very poor compression.
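One way to check how much of an EPUB's size comes from its images is to tally the stored sizes inside the archive, since an EPUB is a ZIP file. This is only a rough diagnostic sketch: the function name image_size_report and the extension list are assumptions made for illustration, and it says nothing about how well each image is compressed.

```python
import zipfile

# Assumed set of image extensions; adjust for the book at hand.
IMAGE_EXTS = (".jpg", ".jpeg", ".png", ".gif", ".svg", ".webp")


def image_size_report(epub_file):
    """Return (image_bytes, total_bytes) as stored inside the EPUB archive."""
    image_bytes = 0
    total_bytes = 0
    with zipfile.ZipFile(epub_file) as zf:
        for info in zf.infolist():
            # compress_size is the number of bytes the entry occupies in the ZIP.
            total_bytes += info.compress_size
            if info.filename.lower().endswith(IMAGE_EXTS):
                image_bytes += info.compress_size
    return image_bytes, total_bytes
```

Running it on the original EPUB shows what share of the file is image data, which is the part that conversion can shrink by re-encoding.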

Kovid Goyal (kovid) wrote :

Looking in some more detail, the reason these images get re-encoded by conversion is that they contain EXIF metadata, and the Kindle doesn't like images that have EXIF-metadata-based transpose operators, so conversion removes these. However, the check for the existence of EXIF was too loose: it caught images with any EXIF metadata, even if it didn't contain transpose operators. I can fix that, which will result in no changes to the images in this book, since they contain EXIF without transpose.
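The difference between the loose check and the stricter one can be illustrated with a small standalone sketch. It is not calibre's code: it assumes that "transpose operators" corresponds to an EXIF Orientation tag (0x0112) with a value other than 1, and the function names are made up for this example. The parser is deliberately simplified and only reads the first IFD of the first Exif segment.

```python
import struct

ORIENTATION_TAG = 0x0112  # standard EXIF/TIFF Orientation tag


def _orientation_from_tiff(tiff):
    """Scan IFD0 of a TIFF block for the Orientation tag."""
    if len(tiff) < 8:
        return None
    if tiff[:2] == b"II":
        endian = "<"  # little-endian
    elif tiff[:2] == b"MM":
        endian = ">"  # big-endian
    else:
        return None
    (ifd_offset,) = struct.unpack(endian + "I", tiff[4:8])
    if ifd_offset + 2 > len(tiff):
        return None
    (count,) = struct.unpack(endian + "H", tiff[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        entry = tiff[start:start + 12]
        if len(entry) < 12:
            return None
        tag, _type, _count = struct.unpack(endian + "HHI", entry[:8])
        if tag == ORIENTATION_TAG:
            # A SHORT value sits in the first two bytes of the value field.
            (value,) = struct.unpack(endian + "H", entry[8:10])
            return value
    return None


def exif_orientation(jpeg_bytes):
    """Return the EXIF Orientation value of a JPEG, or None if absent."""
    if jpeg_bytes[:2] != b"\xff\xd8":  # missing SOI marker: not a JPEG
        return None
    pos = 2
    while pos + 4 <= len(jpeg_bytes):
        if jpeg_bytes[pos] != 0xFF:
            return None  # malformed segment stream
        marker = jpeg_bytes[pos + 1]
        if marker == 0xDA:  # start of scan: metadata segments are over
            return None
        (length,) = struct.unpack(">H", jpeg_bytes[pos + 2:pos + 4])
        payload = jpeg_bytes[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload[:6] == b"Exif\x00\x00":
            return _orientation_from_tiff(payload[6:])
        pos += 2 + length
    return None


def needs_reencode(jpeg_bytes):
    """Stricter check: only a non-default orientation triggers re-encoding."""
    orientation = exif_orientation(jpeg_bytes)
    return orientation is not None and orientation != 1
```

Under this assumption, an image that carries EXIF data but no Orientation tag, or an Orientation of 1, is left untouched, which matches the behaviour described for the images in this book.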

Kovid Goyal (kovid) wrote :

Fixed in branch master. The fix will be in the next release. calibre is usually released every alternate Friday.

Changed in calibre:
status: New → Fix Released
Upamanyu Santra (upamanyu1999) wrote :

Thank You Kovid

Upamanyu Santra (upamanyu1999) wrote :

Can I request one small feature? If you have time, could you implement a feature, an output profile, or a toggle that limits and compresses the converted ebook to 200 MB or a little less? Thanks in advance.
