Calibre Editor: Edit/Convert docx file, wrong indentation for lists

Bug #2069541 reported by Armin Geller
This bug affects 1 person
Affects: calibre
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

Hello Kovid,
there seems to be an unwanted offset for list items when I use an imported docx file as the starting document in the editor. Please find attached a set of files that shows the problem: test.docx (MS Word 2019, with the standard template), test.pdf and test.epub. The PDF is meant as a representation of the docx file, to show how the file should look.
There is also a difference for list items where the list marker isn't a dot or a number but a symbol (see the last example in the text). I'm not sure whether that was ever implemented.

There is also a discussion at MobileRead, https://www.mobileread.com/forums/showthread.php?t=361292 (not started by me). While the file mentioned by the thread starter is questionable (for me it includes a defective template), there is one valid point: there is an additional offset (maybe a wrong interpretation?) that isn't included in the template for such list items.

Workflow: Open docx file with calibre editor directly.

Calibre version 7.12, Windows 10-64bit

Tags: calibre editor
Armin Geller (armingeller) wrote :
Kovid Goyal (kovid) wrote :

Symbols are not supported, beyond the ones defined in CSS. As for the
extra padding, that comes from the default user agent stylesheet of the
browser/viewer used to view the EPUB. You can override it by adding

ul, ol { padding-inline-start: 0 }

to the extra css when converting. The docx input plugin doesn't do this
automatically because it breaks rendering of the lists in some cases.
That said, I can no longer get anything I currently have access to,
to break when setting the padding to zero, so I will change that for the
next release.
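
For reference, a minimal sketch of what the Extra CSS entry could look like. Only the first rule comes from the comment above; the class names `discs`, `circles` and `squares` are hypothetical and serve only to illustrate the marker styles that CSS itself defines. They are not a statement of what the docx input plugin emits.

```css
/* Rule quoted in the comment above: remove the start-side padding
   that the viewer's default stylesheet applies to lists. */
ul, ol { padding-inline-start: 0 }

/* Illustration only (hypothetical class names): the unordered-list
   markers that plain CSS provides through list-style-type. */
ul.discs   { list-style-type: disc }
ul.circles { list-style-type: circle }
ul.squares { list-style-type: square }
```

When converting from the command line, the same text can presumably be passed to `ebook-convert` through its `--extra-css` option, which accepts either raw CSS or a path to a stylesheet.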

Changed in calibre:
status: New → Invalid
Kovid Goyal (kovid) wrote :

Fixed in branch master. The fix will be in the next release. calibre is usually released every alternate Friday.

Changed in calibre:
status: Invalid → Fix Released
Armin Geller (armingeller) wrote :

Hi Kovid,

Thanks a lot for the fast fix. I will test it as soon as the new release is available.

Best regards,
Armin

Armin Geller (armingeller) wrote :

This looks much better than before. I'm happy with this change and the behavior.

While testing I found an additional problem with three-level lists that I had overlooked and that is present in the demo file as well:
In the numbered list with three levels, only the first level seems to be identified correctly as a numbered list. The numbering for levels 2 and 3 is treated as part of the text, although the list indentation works.
In the last example, which uses discs instead of numbering, it works fine again.

Best regards,
Armin

Kovid Goyal (kovid) wrote :

There are limits to how much complexity can be converted between docx
and html as they have extremely different list semantics. I'm afraid the
current code is about as robust as I have the time/inclination for,
patches to improve it are welcome.

Armin Geller (armingeller) wrote :

No problem, I only thought I should tell you about this behavior. :)

Again, thanks for your work. I'm really happy that the original problem is solved now.

Best regards,
Armin
