2015-02-16 09:09:50 |
Nobuto Murata |
description |
This branch adds default charset handling for Windows archives in CJKV and Thai locales, inspired by Ubuntu Kylin's approach.
As a result of bug #580961, two options were added to unzip as an Ubuntu patch:
> -O CHARSET specify a character encoding for DOS, Windows and OS/2 archives
> -I CHARSET specify a character encoding for UNIX and other archives
Ubuntu Kylin then set a default encoding through environment variables for its distribution:
http://bazaar.launchpad.net/~ubuntukylin-members/ubuntukylin-default-settings/trunk/revision/171
In Ubuntu itself, we can go a step further:
- per-user settings derived from each user's locale instead of global settings
- a locale guard, so that other locales are not affected
I only set "-O", so behavior is unchanged for zip files created on Ubuntu or other Linux/UNIX systems. This branch only makes zip files created on localized Windows systems extract seamlessly.
The charset list is taken from:
https://msdn.microsoft.com/en-us/goglobal/bb964654
and
msdos/msdos.c in unzip package:
case 932: /* Japanese */
case 949: /* Korean */
case 936: /* Chinese, simple */
case 950: /* Chinese, traditional */
case 874: /* Thai */
case 1258: /* Vietnamese */
(Copied from @nobuto's branch description.) |
With the current unzip package in Ubuntu, we need to specify the charset explicitly to extract zip files sent from localized Windows systems.
For example, for zip files sent from a Japanese-localized Windows system:
$ zipinfo -O CP932 sent-from-localized-windows.zip
$ unzip -O CP932 sent-from-localized-windows.zip
This method does not work for GUI applications such as file-roller, because users have no way to specify the charset from the GUI.
The attached branch adds default charset handling for Windows archives in CJKV and Thai locales, inspired by Ubuntu Kylin's approach.
As a result of bug #580961, two options were added to unzip as an Ubuntu patch:
> -O CHARSET specify a character encoding for DOS, Windows and OS/2 archives
> -I CHARSET specify a character encoding for UNIX and other archives
Ubuntu Kylin then set a default encoding through environment variables for its distribution:
http://bazaar.launchpad.net/~ubuntukylin-members/ubuntukylin-default-settings/trunk/revision/171
In Ubuntu itself, we can go a step further:
- per-user settings derived from each user's locale instead of global settings
- a locale guard, so that other locales are not affected
I only set "-O", so behavior is unchanged for zip files created on Ubuntu or other Linux/UNIX systems. This branch only makes zip files created on localized Windows systems extract seamlessly.
The charset list is taken from:
https://msdn.microsoft.com/en-us/goglobal/bb964654
and
msdos/msdos.c in unzip package:
case 932: /* Japanese */
case 949: /* Korean */
case 936: /* Chinese, simple */
case 950: /* Chinese, traditional */
case 874: /* Thai */
case 1258: /* Vietnamese */
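To illustrate the idea of per-user defaults with a locale guard, here is a minimal shell sketch. It is not the code in the attached branch. The function name default_zip_charset and the exact locale patterns are hypothetical, and the locale-to-code-page pairing is my assumption based on the code pages listed above. It relies on Info-ZIP reading default options from the UNZIP and ZIPINFO environment variables, which is the environment-variable mechanism mentioned above.

```shell
# Hypothetical helper: print the Windows code page to use as the "-O" default
# for a given locale name. The locale patterns are assumptions for illustration.
default_zip_charset() {
    case "$1" in
        ja_JP*) echo CP932 ;;    # Japanese
        ko_KR*) echo CP949 ;;    # Korean
        zh_CN*) echo CP936 ;;    # Chinese, simplified
        zh_TW*) echo CP950 ;;    # Chinese, traditional
        th_TH*) echo CP874 ;;    # Thai
        vi_VN*) echo CP1258 ;;   # Vietnamese
        *) return 1 ;;           # locale guard: no default for any other locale
    esac
}

# Pick the effective locale in the usual precedence order: LC_ALL, LC_CTYPE, LANG.
user_locale="${LC_ALL:-${LC_CTYPE:-${LANG:-}}}"

# Only set "-O" (never "-I"), and only when the locale matched one of the cases.
if charset=$(default_zip_charset "$user_locale"); then
    export UNZIP="-O $charset"
    export ZIPINFO="-O $charset"
fi
```

Because the value is derived from each user's own locale, a user in an unlisted locale (en_US, for example) gets no default at all, and archives created on Linux/UNIX systems are untouched since "-I" is never set.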
(Copied from @nobuto's branch description.) |
|