Markdown converter produces garbled output on large files

Bug #1212330 reported by Glyn Barlow on 2013-08-14
This bug affects 1 person
Affects: calibre
Importance: Undecided
Assigned to: John Schember

Bug Description

When converting a TXT file with Markdown, large files get additional text added.

For example, a text input of

‘So was the driver,’ chipped in Sarah. ‘He tried to kill us.’ ‘Good gracious\! Whatever for?’
Gives EPUB of

So was the driver,’ chipped in Sarah. ‘He tried to kill us.’ ‘Good graciousklzzwxh:10000 Whatever for?’

Once this has started happening in a file, it continues and gets worse (more occurrences). The number 10000 increments on each new occurrence.
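A quick way to see how far the corruption has spread is to scan the converted text for the stray token. The following is only a minimal sketch: the token shape (`klzzwxh:` followed by digits) is taken from the example output above, and the name `find_leaked_tokens` is made up for illustration and is not part of calibre or Python Markdown.

```python
import re

# Shape of the stray token seen in the converted output, e.g. "klzzwxh:10000".
# Assumption: the suffix is always a run of decimal digits.
LEAKED_TOKEN = re.compile(r"klzzwxh:(\d+)")


def find_leaked_tokens(text):
    """Return the numeric suffix of every stray token, in order of appearance."""
    return [int(match.group(1)) for match in LEAKED_TOKEN.finditer(text)]


sample = "‘Good graciousklzzwxh:10000 Whatever for?’"
print(find_leaked_tokens(sample))
```

Running this over the text extracted from the EPUB should list one number per occurrence, which makes it easy to confirm that the counter increments as described.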

The conversion job output is as follows; you can see that the chapter headings also get corrupted.

This can be easily reproduced with an existing (working) text file if you make the file large (copy and paste the text until the file is over 1M in size).
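The copy-and-paste step can also be scripted. This is a rough sketch, not a calibre tool: the function name `inflate_text` and the file names in the usage comment are hypothetical, and "1M" is assumed here to mean 1 MiB (1024 * 1024 bytes).

```python
def inflate_text(text, min_bytes=1024 * 1024, encoding="utf-8"):
    """Repeat `text` until its encoded size is strictly greater than `min_bytes`."""
    if not text:
        raise ValueError("text must not be empty")
    unit = len(text.encode(encoding))
    # floor(min_bytes / unit) + 1 copies always exceeds min_bytes.
    copies = min_bytes // unit + 1
    return text * copies


# Example usage (hypothetical file names):
# with open("working.txt", encoding="utf-8") as src:
#     big = inflate_text(src.read())
# with open("working_large.txt", "w", encoding="utf-8") as dst:
#     dst.write(big)
```

Converting the resulting large file with the Markdown option enabled should then show whether the stray text appears.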

Glyn Barlow (glyn-barlow) wrote :

This is the conversion job details

summary: - Markdown converter produces carbled output on large files
+ Markdown converter produces garbled output on large files

Changing the component for this bug.

 assignee user-none
 tag txt-input
 status triaged

Changed in calibre:
assignee: nobody → John Schember (user-none)
status: New → Triaged
John Schember (user-none) wrote :

Attach a file demonstrating the behavior.

Changed in calibre:
status: Triaged → Incomplete
Glyn Barlow (glyn-barlow) wrote :

This is a file that I have concatenated over and over again to make it large. The error occurs about 3/4 of the way down in the converted output.

Glyn Barlow (glyn-barlow) wrote :

And this is the converted output (EPUB)

John Schember (user-none) wrote :

The issue is in Python Markdown (a third party component used for Markdown input). A new release is due out very soon. I'll update calibre's internal copy to use the new release at that time and see if the issue is still present.

Changed in calibre:
status: Incomplete → Confirmed
status: Confirmed → Incomplete
John Schember (user-none) wrote :

Looks like Python Markdown 2.4.0 is taking a bit longer than expected to be released. I've tested with 2.3.1 and the issue is still present. I've opened a ticket upstream, https://github.com/waylan/Python-Markdown/issues/255, regarding the issue.

Changed in calibre:
status: Incomplete → In Progress

Fixed in branch master. The fix will be in the next release. calibre is usually released every Friday.

 status fixreleased

Changed in calibre:
status: In Progress → Fix Released