RTF charset ansicpg0 handling
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
calibre | Fix Released | Undecided | sengian |
Bug Description
Some RTF editors do not specify a valid charset. They use charset 0:
{\rtf1\
This is handled correctly by plenty of RTF editors and viewers, e.g. MS Word, WordPad, the Total Commander preview, etc.
I suspect charset 0 is treated as "use the system default" or "use the charset corresponding to the language" (deflang).
It would be nice either to:
- be able to override the codepage for RTF files with an ansicpg0 header (currently the charset cannot be overridden for RTF documents; the "input character encoding" value is ignored)
- use the system default charset instead of ANSI (which in my case replaces accented characters with their unaccented versions)
Word adds the charset (ansicpg with the correct codepage) when the document is re-saved; WordPad removes it completely. Nevertheless, it would be nice to be able to solve this within the Calibre conversion.
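To make the two options above concrete, here is a minimal Python sketch of the fallback logic I have in mind. It is not Calibre's actual code: the function name `resolve_rtf_codepage` and its `override` and `fallback` parameters are hypothetical, and mapping `\ansicpgN` to the Python codec `cpN` is an assumption that does not hold for every codepage (hence the lookup check).

```python
import codecs
import locale
import re
from typing import Optional

# Matches the \ansicpgN control word in the raw RTF header bytes.
ANSICPG_RE = re.compile(rb"\\ansicpg(\d+)")


def resolve_rtf_codepage(header: bytes,
                         override: Optional[str] = None,
                         fallback: str = "cp1252") -> str:
    """Pick a codec name for an RTF header (hypothetical helper).

    Order of preference:
      1. a non-zero \\ansicpg value that Python knows as "cpN";
      2. a user-supplied override (e.g. the "input character encoding" option);
      3. the system default encoding;
      4. a hard-coded fallback.
    """
    match = ANSICPG_RE.search(header)
    cpg = int(match.group(1)) if match else 0
    if cpg != 0:
        name = f"cp{cpg}"
        try:
            codecs.lookup(name)  # assumption: codepage N maps to codec "cpN"
            return name
        except LookupError:
            pass  # unknown codepage: treat it like ansicpg0
    if override:
        return override
    return locale.getpreferredencoding(False) or fallback
```

With something like this, a file declaring `\ansicpg0` would be decoded with the user's override if one is given, and with the system default otherwise. The choice matters for accented text: the byte 0xE8 is "č" in cp1250 but "è" in cp1252.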
Thanks!
Calibre version: Windows, 32-bit, 0.9.25
Sorry, I can't attach the document here and I am not able to create a one-page sample with the problem described. Maybe email will do?
Changing the component for this bug.
assignee sengian
tag rtf-input
status triaged