unzip missing code pages

Bug #1255640 reported by Ľubomír Mlích
This bug affects 6 people
Affects: unzip (Ubuntu)
Status: Expired
Importance: Undecided
Assigned to: Unassigned
Milestone: none

Bug Description

Bug 580961 was solved, but nobody filed a follow-up bug to fix the remaining encodings, as described in the comments below.

https://bugs.launchpad.net/ubuntu-jp-improvement/+bug/580961/comments/140
and
https://bugs.launchpad.net/ubuntu-jp-improvement/+bug/580961/comments/146

Test case: archive a file whose name is encoded in CP-1250, then unzip it. I have 12.04 with package unzip (6.0-4ubuntu2) installed.

I think a general revision would be better than waiting for people to request a fix for each specific encoding.

Thanks.

description: updated
Launchpad Janitor (janitor) wrote :

Status changed to 'Confirmed' because the bug affects multiple users.

Changed in unzip (Ubuntu):
status: New → Confirmed
Alkis Georgopoulos (alkisg) wrote :

I'm using this script to work around the bug:
http://bazaar.launchpad.net/~ts.sch.gr/sch-scripts/trunk/view/head:/sch-scripts/unzip

Click on "Download file", then run
sudo mv unzip /usr/local/bin/unzip
sudo chmod +x /usr/local/bin/unzip

Then edit the file:
sudo gedit /usr/local/bin/unzip

Find the line that says:
    el) charset=cp737 ;;

...and add a similar line directly above or below it for your own LANG <=> DOS codepage mapping.
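
For readers who cannot open the linked script, here is a minimal sketch of what such a mapping might look like. It is not the actual wrapper: the function name `lang_to_codepage` is made up for illustration, and only the two mappings mentioned in this thread (el and cs) are included.

```shell
# Hypothetical helper, not taken from the linked wrapper script.
# Prints the DOS code page for the language part of a locale name,
# or nothing if no mapping is known.
lang_to_codepage() {
    # Strip territory, encoding and modifier: "cs_CZ.UTF-8" -> "cs"
    lang=${1%%[_.@]*}
    case "$lang" in
        el) echo cp737 ;;   # Greek, from the line quoted above
        cs) echo cp852 ;;   # Czech, reported in this thread
        *)  ;;              # unknown language: print nothing
    esac
}
```

A wrapper could then call `charset=$(lang_to_codepage "$LANG")` and pass the result on to the real unzip. The Debian/Ubuntu builds of unzip that I know of accept `-O CHARSET` for this purpose, but that is an assumption about your build, so check `unzip -h` first.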

Maybe when we have several mappings, someone will take the time to reimplement it inside unzip.c so that we don't have to use a wrapper anymore.

Ľubomír Mlích (hater-zlin) wrote :

Thanks, nice script.

My line looks like this:

    cs) charset=cp852;;

If you give me three more missing code pages, I'll try to contact the unzip maintainers and ask them for a fix.

For help on codepages, look at: http://en.wikipedia.org/wiki/Code_page

summary: - unziping file with not utf-8 encoding error
+ unzip missing code pages
Dominik Viererbe (dviererbe) wrote :

> Testcase: Archive file with name encoded in CP-1250 and then unzip it. I have 12.04 and package unzip (6.0-4ubuntu2) installed.
Thank you for reporting this bug.

Ubuntu 12.04 (precise) reached end-of-life on April 28, 2017.

See this document for currently supported Ubuntu releases:
https://wiki.ubuntu.com/Releases

I tested on 22.04 (Jammy Jellyfish) whether the bug still exists, and it seems to have been fixed at some point.

Testcase:
1. create test file:
echo "Hello World!" > '~€…†‡‰Š‹ŚŤŽŹ•–—™š›śťžźNPˇ˘Ł¤Ą¦§¨©Ş«¬®Ż°±˛ł´µ¶·¸ąş»Ľ˝ľżŔÁÂĂÄĹĆÇČÉĘËĚÍÎĎĐŃŇÓÔŐÖ×ŘŮÚŰÜÝŢßŕáâăäĺćçčéęëěíîďđńňóôőö÷řůúűüýţ˙.txt'
2. force filename encoding:
convmv -f utf8 -t cp1250 --notest '~€…†‡‰Š‹ŚŤŽŹ•–—™š›śťžźNPˇ˘Ł¤Ą¦§¨©Ş«¬®Ż°±˛ł´µ¶·¸ąş»Ľ˝ľżŔÁÂĂÄĹĆÇČÉĘËĚÍÎĎĐŃŇÓÔŐÖ×ŘŮÚŰÜÝŢßŕáâăäĺćçčéęëěíîďđńňóôőö÷řůúűüýţ˙.txt'
3. create zip file:
zip test.zip '~€…†‡‰Š‹ŚŤŽŹ•–—™š›śťžźNPˇ˘Ł¤Ą¦§¨©Ş«¬®Ż°±˛ł´µ¶·¸ąş»Ľ˝ľżŔÁÂĂÄĹĆÇČÉĘËĚÍÎĎĐŃŇÓÔŐÖ×ŘŮÚŰÜÝŢßŕáâăäĺćçčéęëěíîďđńňóôőö÷řůúűüýţ˙.txt'
4. unzip file:
unzip -d extracted test.zip
5. check if there are decoding errors:
ls -la extracted/
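
Step 5 can also be checked without reading the listing by eye. The following is only a sketch, assuming a UTF-8 locale and that `iconv` is installed; the function name `names_are_utf8` is made up for this example.

```shell
# Hypothetical helper, not part of unzip: succeeds only if every path
# below the given directory is a valid UTF-8 byte sequence.
names_are_utf8() {
    # iconv exits non-zero when its input is not valid UTF-8
    find "$1" -print | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}
```

For example, `names_are_utf8 extracted && echo "names are valid UTF-8"`. Note that this only catches names left as raw non-UTF-8 bytes. A name that was decoded with the wrong code page is still valid UTF-8 and would pass, so a visual check of the characters remains useful.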

We appreciate that this bug may be old and you might not be interested in discussing it any more. But if you are then please upgrade to the latest Ubuntu version and re-test. If you then find the bug is still present in the newer Ubuntu version, please add a comment here telling us which new version it is in.

Changed in unzip (Ubuntu):
status: Confirmed → Incomplete
Ľubomír Mlích (lubomir-mlich) wrote :

Hello,
I've checked on Ubuntu 23.04 and this seems to work well. My steps:

$ touch šč.txt
$ zip test2.zip šč.txt
$ unzip test2.zip -d extracted2
$ ls -la extracted2
drwxrwxr-x 2 myusername mygroup 4096 čen 9 21:14 .
drwxrwxr-x 4 myusername mygroup 4096 čen 9 21:14 ..
-rw-rw-r-- 1 myusername mygroup 0 čen 9 21:13 šč.txt

Seems solved to me.

Ľubomír Mlích (lubomir-mlich) wrote :

Ah, I should have changed the locale in the process.

Ľubomír Mlích (lubomir-mlich) wrote :

Your convmv approach seems simpler; it works well on my system.

Launchpad Janitor (janitor) wrote :

[Expired for unzip (Ubuntu) because there has been no activity for 60 days.]

Changed in unzip (Ubuntu):
status: Incomplete → Expired