DOCX input known characters metadata

Bug #1321343 reported by Tamsusis on 2014-05-20
This bug affects 1 person

Bug Description

Calibre version 1.37
When adding a DOCX file as an ebook, if the metadata comment field contains a newline character, it is imported as _x000d_ in the calibre book metadata. See example.

The _x000d_ sequences are present in that docx file, which you can verify by
unzipping it and looking at docProps/core.xml
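
The check described above can be sketched in Python, assuming only that a DOCX file is a ZIP archive with its core metadata stored at docProps/core.xml. The function name read_core_xml is hypothetical and is not part of calibre.

```python
import io
import zipfile


def read_core_xml(docx_source):
    """Return the text of docProps/core.xml from a DOCX file.

    docx_source may be a file path or a file-like object, since a DOCX
    file is an ordinary ZIP archive.
    """
    with zipfile.ZipFile(docx_source) as zf:
        # Core properties (title, description/comments, etc.) live here.
        return zf.read("docProps/core.xml").decode("utf-8")
```

Calling read_core_xml on the affected file and searching the result for "_x000d_" should show whether the escape sequences are stored in the document itself.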

 status invalid

Changed in calibre:
status: New → Invalid
Tamsusis (tamsusis) wrote :

OK, that is clear. However, Word does not show those invalid characters. They are just plain line breaks in the comments field.
See the attached screenshot.
It would be wise for Calibre to strip such obvious garbage in metadata.

Charles Haley (cbhaley) wrote :

The characters aren't garbage. On Windows, line endings are 0x0d 0x0a, or CR LF.
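
As an illustration only, and not necessarily how calibre handles it, the _xHHHH_ sequences could be decoded back into the characters they stand for, with CR LF then normalized to a single newline. The function name decode_ooxml_escapes is hypothetical.

```python
import re

# Matches escapes of the form _xHHHH_, where HHHH is four hex digits.
_ESCAPE = re.compile(r"_x([0-9A-Fa-f]{4})_")


def decode_ooxml_escapes(text):
    """Replace _xHHHH_ escapes with the character they encode,
    then normalize CR LF and bare CR line endings to LF."""
    decoded = _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), text)
    return decoded.replace("\r\n", "\n").replace("\r", "\n")
```

For example, decode_ooxml_escapes("line one_x000d_\nline two") turns the escaped carriage return plus line feed into one plain line break.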

Fixed in branch master. The fix will be in the next release. calibre is usually released every Friday.

 status fixreleased

Changed in calibre:
status: Invalid → Fix Released
Tamsusis (tamsusis) wrote :

Thank you, that's great
