utf8 initial support
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Cuneiform for Linux | Fix Released | Undecided | Unassigned |
Bug Description
This patch adds UTF-8 support to the engine. Currently, I have enabled UTF-8 support only for the txt and html formats, because the rtf format requires some additional work (it also stores code page information internally, so I don't think we really need this).
Description of the patch:
1) It defines ROUT_CODE_UTF8 and PUMA_CODE_UTF8
2) It enables UTF-8 support for the HTML and TXT output formats
3) It adds an int getUTF8Char() function, which currently depends on iconv(). This is the only place where iconv() is called. It should not be a problem to rewrite it without iconv(), because the only source tables used are:
windows-1250
windows-1251
windows-1254
windows-1257
4) It modifies the OneChar function to recode the output when ROUT_CODE_UTF8 is selected
5) It defines PUMA_CODE_UTF8 inside the cli
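
To illustrate item 3, here is a minimal sketch of how a single windows-1251 byte could be recoded to UTF-8 without iconv(). It is not the code from the patch: the function names (cp1251_to_codepoint, utf8_encode, cp1251_char_to_utf8) are hypothetical, and the mapping deliberately covers only ASCII, the contiguous Cyrillic block 0xC0..0xFF, and the two letters Ё/ё. A complete replacement would need full 128-entry tables for each of the four code pages listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the patch): map one windows-1251 byte
 * to a Unicode code point. Only a subset is handled in this sketch;
 * every other byte yields -1. */
static int32_t cp1251_to_codepoint(unsigned char c)
{
    if (c < 0x80)  return c;                    /* ASCII is identical      */
    if (c >= 0xC0) return 0x0410 + (c - 0xC0);  /* U+0410..U+044F, in order */
    if (c == 0xA8) return 0x0401;               /* CYRILLIC CAPITAL IO     */
    if (c == 0xB8) return 0x0451;               /* CYRILLIC SMALL IO       */
    return -1;                                  /* not covered here        */
}

/* Encode one code point from the Basic Multilingual Plane as UTF-8.
 * Returns the number of bytes written (1..3), or 0 if the code point
 * is negative, a surrogate, or outside the BMP. */
static size_t utf8_encode(int32_t cp, unsigned char out[4])
{
    assert(out != NULL);
    if (cp < 0) return 0;
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000 && !(cp >= 0xD800 && cp <= 0xDFFF)) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    return 0;
}

/* Recode a single windows-1251 byte into out; returns the UTF-8 length,
 * or 0 when the byte is outside the subset handled by this sketch. */
static size_t cp1251_char_to_utf8(unsigned char c, unsigned char out[4])
{
    return utf8_encode(cp1251_to_codepoint(c), out);
}
```

A caller would pass each recognized character through cp1251_char_to_utf8() and append the returned bytes to the output buffer. The same two-step shape (table lookup, then UTF-8 encoding) should carry over to windows-1250, windows-1254 and windows-1257 with their own tables.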
I tested with different languages and files and saw no regressions from this patch. It is also 100% backward compatible, because the API is unchanged.
Changed in cuneiform-linux:
status: Fix Committed → Fix Released
Good work. It makes life easy.