Ubuntu
unzip package

Bug #1422290
Comment #8

Yuan Chao (yuanchao) wrote on 2015-03-01:

This is from one of my machines running Lubuntu:

$ export |grep LANG
declare -x LANG="en_US.UTF-8"

$ export |grep LC
declare -x LC_ADDRESS="en_US.UTF-8"
declare -x LC_IDENTIFICATION="en_US.UTF-8"
declare -x LC_MEASUREMENT="en_US.UTF-8"
declare -x LC_MONETARY="en_US.UTF-8"
declare -x LC_NAME="en_US.UTF-8"
declare -x LC_NUMERIC="en_US.UTF-8"
declare -x LC_PAPER="en_US.UTF-8"
declare -x LC_TELEPHONE="en_US.UTF-8"
declare -x LC_TIME="en_US.UTF-8"

$ unzip -h
UnZip 6.00 of 20 April 2009, by Debian. Original by Info-ZIP.
...

Use the file from here: http://www1.axfc.net/uploader/Sc/so/325701.zip (passwd: backer) (CP932)

$ unzip celluloid.zip
Archive: celluloid.zip
  inflating: celluloid/readme.txt
  inflating: celluloid/В╣ВщВчВдВ╟.ust
  inflating: celluloid/В╣ВщВчВдВ╟2Ф╘.ust
  inflating: celluloid/В╣ВщВчВдВ╟СхГTГrСOВйВч.ust

$ unzip -O cp932 celluloid.zip
Archive: celluloid.zip
  inflating: celluloid/readme.txt
  inflating: celluloid/せるらうど.ust
  inflating: celluloid/せるらうど2番.ust
  inflating: celluloid/せるらうど大サビ前から.ust

$ unzip -O cp936 celluloid.zip
Archive: celluloid.zip
  inflating: celluloid/readme.txt
  inflating: celluloid/偣傞傜偆偳.ust
  inflating: celluloid/偣傞傜偆偳2斣.ust
  inflating: celluloid/偣傞傜偆偳戝僒價慜偐傜.ust

$ unzip -O cp950 celluloid.zip
Archive: celluloid.zip
  inflating: celluloid/readme.txt
  inflating: celluloid/��炤��.ust
  inflating: celluloid/��炤��2��.ust
  inflating: celluloid/��炤�Ǒ��T�r�O��.ust

Another file, from here: http://3jf.wodemo.com/file/310894 (CP936)

$ unzip -L 王妃.zip
Archive: 王妃.zip
  inflating: ═їх·_a.ust
  inflating: ═їх·_b.ust

$ unzip -O cp932 王妃.zip
Archive: 王妃.zip
inflating: ﾍ銈A.ust
inflating: ﾍ銈B.ust

$ unzip -O cp936 王妃.zip
Archive: 王妃.zip
  inflating: 王妃_A.ust
  inflating: 王妃_B.ust

$ unzip -O cp950 王妃.zip
Archive: 王妃.zip
  inflating: 卼漦_A.ust
  inflating: 卼漦_B.ust

Actually, not all of the wrong cases map to illegal UTF-8 strings (question marks): a wrong code page can still produce valid, readable-looking characters. I guess that is why auto-detection is not so straightforward?
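
To illustrate the point outside of unzip, here is a minimal Python sketch (not how unzip works internally) that decodes the same raw filename bytes under several candidate code pages. The helper name `try_decodings` and the candidate list are my own assumptions for illustration; cp866 is included only as an example of a single-byte code page, which can never fail to decode.

```python
def try_decodings(raw, candidates):
    """Decode raw bytes under each candidate code page.

    Returns a dict mapping codec name to the decoded text,
    or to None when the bytes are not valid in that code page.
    """
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            results[name] = None
    return results


# Filename bytes as a CP932 (Shift-JIS) archive would store them.
raw_name = "せるらうど".encode("cp932")

# Hypothetical candidate list; only cp932 is the right answer here.
results = try_decodings(raw_name, ["cp932", "cp936", "cp950", "cp866"])

for name, text in results.items():
    print(name, "->", text if text is not None else "(not decodable)")
```

More than one candidate decodes without any error, so "does it decode cleanly" alone cannot pick the right code page. A detector would need extra hints, such as character frequency or the user's locale.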
