Simplified Chinese text file charset detection problem: GBK text files detected as GB2312

Bug #1175974 reported by Wenzhuo Zhang
Affects: calibre
Status: Fix Released
Importance: Undecided
Assigned to: Unassigned

Bug Description

Calibre 0.9.29 64-bit for Windows 7

When converting Simplified Chinese text files to ebooks, Calibre auto-detects the charset of GBK-encoded Chinese text files as GB2312, which makes GBK characters not in GB2312 unreadable (GBK is a superset of GB2312).
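
The mismatch described above can be reproduced with Python's standard codecs. The following is only a minimal sketch: `promote_to_superset` and `GB2312_ALIASES` are hypothetical names invented for illustration, not calibre's actual code or its actual fix. It assumes a detector has reported the label "GB2312" and shows that decoding with the GBK superset recovers a character that the strict GB2312 codec rejects.

```python
# Hypothetical helper for illustration only; not taken from calibre.
GB2312_ALIASES = {"gb2312", "euc-cn", "gb2312-80"}


def promote_to_superset(detected: str) -> str:
    """Return 'gbk' when a detector reports GB2312, else the label unchanged."""
    name = detected.strip().lower().replace("_", "-")
    return "gbk" if name in GB2312_ALIASES else detected


# The character 镕 exists in GBK but not in GB2312.
sample = "朱镕基".encode("gbk")

# Decoding with the strict GB2312 codec fails on the GBK-only character.
try:
    sample.decode("gb2312")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# Decoding with the superset succeeds, and GB2312-only text would too.
text = sample.decode(promote_to_superset("GB2312"))
```

Because every valid GB2312 byte sequence decodes to the same text under GBK, treating a "GB2312" guess as GBK should not change files that really are GB2312.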

Kovid Goyal (kovid) wrote : Re: calibre bug 1175974

There is no way to reliably detect the encoding of a txt file; you need
to specify the encoding in the conversion options.

 status invalid

Kovid Goyal (kovid) wrote :
Changed in calibre:
status: New → Invalid

Wenzhuo Zhang (wenzhuo) wrote : Re: [Bug 1175974] Re: calibre bug 1175974

GBK is obviously a better choice than GB2312 when auto-detection is needed.

On 2013/5/3 18:15, Kovid Goyal wrote:
> There is no way to reliably detect the encoding of a txt file; you need
> to specify the encoding in the conversion options.
>
> status invalid
>
> ** Changed in: calibre
> Status: New => Invalid
>

Kovid Goyal (kovid) wrote : Fixed in lp:calibre

Fixed in branch lp:calibre. The fix will be in the next release. calibre is usually released every Friday.

 status fixreleased

Changed in calibre:
status: Invalid → Fix Released