Tested on Ubuntu 12.10 (US English UI, UTF-8 locale).
It seems that if you KNOW which platform a zip file comes from and which code page it used, you can unzip the archive without losing non-ASCII file names that are not encoded in UTF-8.
I ran one experiment: unpacking a zip file that was created on Korean Windows 7 and has Korean characters in both the archive name and the names of the compressed files.
First, let's get the locale-specific information:
$ locale
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
Let's check the version of the unzip utility:
$ unzip --help
UnZip 6.00 of 20 April 2009, by Debian. Original by Info-ZIP.
...
Usage: unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist] [-d exdir]
Default action is to extract files in list, except those in xlist, to exdir;
file[.zip] may be a wildcard. -Z => ZipInfo mode ("unzip -Z" for usage).
...
-O CHARSET specify a character encoding for DOS, Windows and OS/2 archives
-I CHARSET specify a character encoding for UNIX and other archives
The option to look at is this one:
-O CHARSET specify a character encoding for DOS, Windows and OS/2 archives
Note that it is not -0 (zero), it is -O (capital letter O)!
In my case, Korean Windows uses the EUC-KR code page (strictly speaking, Windows code page 949, which is a superset of EUC-KR). The zip file is named "2013년 설날".
So my command line looks like this:
$ unzip -O EUC-KR "2013년 설날"
After checking the unpacked files: it works! All file names come out in correct Korean, without strange characters.
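Why the charset has to be named at all: such an archive typically stores file names as raw bytes in the code page of the system that created it, and nothing in the name itself says which code page that was. The sketch below is only an illustration using `iconv`; it does not touch a real archive, and the variable names are made up for the example. It encodes a Korean name as EUC-KR, then decodes the same bytes once with the right charset and once with a wrong one (CP437, chosen here just as an example of a wrong guess).

```shell
# Illustration only: 'name' stands in for a file name inside the archive.
name='설날'

# The bytes a Korean Windows tool would typically store (EUC-KR), shown as hex.
raw_hex=$(printf '%s' "$name" | iconv -f UTF-8 -t EUC-KR | od -An -tx1 | tr -d ' \n')

# Decoding those bytes as EUC-KR gives the original name back.
roundtrip=$(printf '%s' "$name" | iconv -f UTF-8 -t EUC-KR | iconv -f EUC-KR -t UTF-8)

# Decoding the same bytes with a wrong charset (CP437 here) gives unrelated characters.
garbled=$(printf '%s' "$name" | iconv -f UTF-8 -t EUC-KR | iconv -f CP437 -t UTF-8)

echo "stored bytes: $raw_hex"
echo "as EUC-KR:    $roundtrip"
echo "as CP437:     $garbled"
```

The -O option plays the role of the `-f EUC-KR` step here: it tells unzip how to interpret the stored bytes before writing the names out in the local UTF-8 locale.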