Error generating TOC on AZW3 format

Bug #1898394 reported by Neil Laslett
This bug affects 1 person
Affects: calibre
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned
Milestone: (none)

Bug Description

Calibre version: 5.1
OS: Windows 10 x64 (build 19041, aka "version 2004")

When defining the table of contents (TOC) in an AZW3 book, I am getting strange but consistent errors related to auto-generated ids. I can correctly set up the TOC using either the editor or directly in the toc.ncx file and the check shows no errors. But when I save, close, and re-open the book in the editor, all the TOC entries are changed to point to invalid IDs that seem to be some kind of default values that do not exist in the actual HTML.

This is true when:
* I set a TOC entry as "top of page" (no id in the link)
* I set a TOC entry to the first DIV in the document. Calibre generates a "top_1" id in the HTML.
* I set my own (valid) IDs on the top DIV or BODY tag and reference them in toc.ncx

In all cases, after saving and reopening, the TOC entries look something like:
<content src="text/part0000.html#1-85a1bc389af5469293974b51a082e2e5"/>
<content src="text/part0004.html#3Q281-85a1bc389af5469293974b51a082e2e5"/>
<content src="text/part0005.html#4OIQ2-85a1bc389af5469293974b51a082e2e5"/>

I see the same issue in metadata.opf:
<reference type="toc" href="text/part0034.html#10DJ41-85a1bc389af5469293974b51a082e2e5" title="Table of Contents"/>

Note the invalid auto-generated IDs all share the same GUID value at the end. I have been unable to determine the source of this GUID. Every time I save and reopen, the IDs are all re-generated with a new GUID. This means that I cannot manually use these IDs in the HTML even if they were valid, which most are not.
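
For anyone trying to reproduce this outside the editor, the symptom described above (TOC links whose fragment does not exist as an id in the target HTML) can be checked with a short script. This is only a sketch using the Python standard library, not calibre's own check: the function names `toc_targets` and `broken_toc_links` are made up for illustration, and it assumes the toc.ncx text and the HTML files have already been extracted and read into strings.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser
from urllib.parse import urldefrag


class IdCollector(HTMLParser):
    """Collect every id attribute value seen in an HTML document."""

    def __init__(self):
        super().__init__()
        self.ids = set()

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name == "id" and value:
                self.ids.add(value)


def toc_targets(ncx_text):
    """Return (path, fragment) pairs for each <content src="..."> in the NCX."""
    root = ET.fromstring(ncx_text)
    targets = []
    for el in root.iter():
        # Compare the local name only, so the NCX namespace does not matter.
        if el.tag.rsplit("}", 1)[-1] == "content" and el.get("src"):
            path, fragment = urldefrag(el.get("src"))
            targets.append((path, fragment))
    return targets


def broken_toc_links(ncx_text, html_by_path):
    """List (path, fragment) targets that do not resolve.

    html_by_path is a hypothetical mapping of file path -> HTML text.
    A link with no fragment ("top of page") is fine if the file exists.
    """
    broken = []
    for path, fragment in toc_targets(ncx_text):
        html = html_by_path.get(path)
        if html is None:
            broken.append((path, fragment))
            continue
        if fragment:
            collector = IdCollector()
            collector.feed(html)
            if fragment not in collector.ids:
                broken.append((path, fragment))
    return broken
```

Run against the entries quoted above, a link such as `text/part0000.html#1-85a1bc389af5469293974b51a082e2e5` would be reported as broken unless some element in part0000.html actually carries that id.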

I do not know whether the error occurs when saving the file or when re-opening it. When the book is loaded on my Kindle (Paperwhite 3), the TOC bookmarks mostly work, with one exception: the entry for one chapter does not. I can't see anything different about this chapter in the editor.

The HTML and CSS are otherwise fairly clean, with only a bit of DIV-itis. Nothing obviously strange.

This is a new book created with Calibre 5.1, converted from an epub format and merged from multiple source files. Attaching.

Neil Laslett (nlaslett) wrote :
Kovid Goyal (kovid) wrote : Re: calibre bug 1898394

I cannot replicate. Steps I tried:

1) Open azw3 file
2) Open the ToC editor; all the entries have red dots next to them,
indicating broken links
3) Change the location of the first entry to point to the top of the
first file.
4) Click OK to close the dialog
5) Click save and exit editor
6) Re-open editor with the same file
7) Open the ToC editor; the link is now green and clicking on it in the
table of contents panel (which you can access from the view menu) moves
the cursor to the correct place.
8) Checking by opening the toc.ncx shows the link going to text/part0000.html#toc_2
and ctrl clicking it jumps to the correct place.

 status invalid

Changed in calibre:
status: New → Invalid
Neil Laslett (nlaslett) wrote :

Thank you for the detailed assessment. I'm still seeing the behavior, however. I fix the invalid TOC entries (red dot) to top of page, and then when I save and reopen, all the TOC entries are invalid again and toc.ncx shows things like
<content src="text/part0000.html#1-d03e3d18457c48eeaab187cff385d739"/>

Any ideas? Is this worth a complete reinstall?

It sounds like we are doing the exact same things, just getting different results.

Kovid Goyal (kovid) wrote :

Let me suggest you first fix the ToC in the source epub then convert it
to azw3, that might solve your problems.

Neil Laslett (nlaslett) wrote :

I was able to manually place all the autogenerated IDs from toc.ncx onto the first element within the <body> tag (not <body> itself). That caused the TOC editor to show all green checks. When I saved and re-opened, the TOC entries were still valid and working - they didn't change as before.
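
The manual workaround described above (putting the id that toc.ncx expects onto the first element inside `<body>`) could be scripted roughly as follows. This is a hedged sketch, not something calibre provides: `add_id_to_first_body_child` is a hypothetical helper, and it assumes the file is well-formed XHTML in the standard XHTML namespace with no named entities such as `&nbsp;` that the XML parser would reject.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Serialise XHTML elements without an ns0: prefix.
ET.register_namespace("", XHTML_NS)


def add_id_to_first_body_child(xhtml_text, new_id):
    """Return the XHTML with new_id set on the first element inside <body>.

    Refuses to overwrite a different existing id, since other links
    may already point at it.
    """
    root = ET.fromstring(xhtml_text)
    body = root.find(f"{{{XHTML_NS}}}body")
    if body is None or len(body) == 0:
        raise ValueError("no element inside <body> to carry the id")
    first = body[0]
    existing = first.get("id")
    if existing is not None and existing != new_id:
        raise ValueError(f"first element already has id {existing!r}")
    first.set("id", new_id)
    return ET.tostring(root, encoding="unicode")
```

Note that the id goes on the first child of `<body>`, not on `<body>` itself, matching what worked in this report.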

I don't know if this is something weird with my book, my system, or a deep bug in Calibre. TOC entries at "top of page" were definitely not working, nor were IDs directly on the <body> tag. Every time, TOC entries would get fresh (invalid) IDs in the form of xxxxx-{GUID}.
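
To make the xxxxx-{GUID} observation concrete, the fragments can be matched and grouped by their shared suffix. The pattern below is an assumption inferred only from the examples quoted in this report (uppercase letters and digits, a hyphen, then 32 lowercase hex characters); it is not taken from calibre's source, and `group_auto_ids` is a name invented for this sketch.

```python
import re
from collections import defaultdict

# Assumed shape of the auto-generated ids, based solely on the examples
# in this report: PREFIX-<32 lowercase hex chars>.
AUTO_ID = re.compile(r"^([0-9A-Z]+)-([0-9a-f]{32})$")


def group_auto_ids(fragments):
    """Group fragments that look auto-generated by their 32-hex suffix.

    Fragments that do not match the assumed pattern (for example
    "toc_2" or "top_1") are ignored.
    """
    groups = defaultdict(list)
    for fragment in fragments:
        match = AUTO_ID.match(fragment)
        if match:
            groups[match.group(2)].append(fragment)
    return dict(groups)
```

If every matching fragment in a given toc.ncx falls into a single group, and that group's key changes after each save and re-open, that is consistent with the behavior reported here.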

My problem is fixed, but if it is a Calibre bug, happy hunting!

Thank you for your help and all you do on this project.

Neil Laslett (nlaslett)
Changed in calibre:
status: Invalid → New
status: New → Invalid
Neil Laslett (nlaslett) wrote :

Sorry to reopen this old one... I am seeing this behavior consistently with all books, from all sources, when working with AZW3 file formats.

Calibre version: 5.7.2
OS: Windows 10 x64 (build 19042)

This includes original content as well as content in MOBI and EPUB from other sources. I have completely removed all plugins.

For the current example, I am using an EPUB file from Tor (their monthly free download of Seanan McGuire's "Come Tumbling Down"). I haven't modified this EPUB in any way, and it works great, no problems. The ToC is fine. When I convert to AZW3, all ToC entries that had been pointing to "top of page" now point to an invalid id that is a GUID autogenerated by Calibre. This id does not exist anywhere in the files. If I manually edit the ToC to return it to top of file, the ToC looks good again, but after saving, closing, and re-opening, there is a new invalid autogenerated GUID id and the ToC entry shows the error.

It's not just this book; it's ALL books since the release of 5.x, including books from MOBI sources and self-created books.

I will also upload images.

Thank you for all your help, and let me know if there are any diagnostics I can help provide.

Changed in calibre:
status: Invalid → New
Neil Laslett (nlaslett) wrote :

Working EPUB ToC

Neil Laslett (nlaslett) wrote :

Invalid ToC IDs (AZW3)

Kovid Goyal (kovid) wrote : Fixed in master

Fixed in branch master. The fix will be in the next release. calibre is usually released every alternate Friday.

 status fixreleased

Changed in calibre:
status: New → Fix Released