mkisofs aborts on malformed joliet filenames

Bug #23046 reported by José M. López-Cepero on 2005-10-03
Affects: cdrkit (Baltix) | Importance: Undecided | Assigned to: Unassigned
Affects: cdrkit (Ubuntu) | Importance: Medium | Assigned to: Unassigned

Bug Description

If mkisofs is invoked with Joliet-generating options and encounters a filename
that is not valid UTF-8, it aborts. That would not be so bad, except that it
aborts after image generation has already begun. Thus, any program that uses
mkisofs as a backend and writes on the fly (k3b and growisofs come to mind)
risks wasting a CD or DVD if any such file is buried somewhere in the tree being
written. A more graceful failure would certainly be welcome: for instance,
replacing the offending character with another one, or aborting before any part
of the image has been written. I think this behaviour is not found in the base
cdrtools, but comes from a so-called 'iconv patch' that is included in the
Ubuntu version.

Ian Jackson (ijackson) wrote :

It would assist me with reproducing and fixing this if you could supply me with:
1. a tarball containing a test tree which has at least one valid filename and at
least one invalid filename
2. a session transcript showing how you invoked mkisofs (including all of the
options) and the error message

José M. López-Cepero (cepe) wrote :

Please excuse the delay, I've had a hectic week at work.

You can use convmv to generate offending files. An example:

$ mkdir test
$ cd test
$ touch badfilename_ñ
$ convmv -f utf8 -t cp437 badfilename* --notest --nosmart
$ cd ..
$ mkisofs -R -J -o testimage.iso test
INFO: UTF-8 character encoding detected by locale settings.
        Assuming UTF-8 encoded filenames on source filesystem,
        use -input-charset to override.
Incorrectly encoded string (badfilename_\uffff) encountered.
Possibly creating an invalid Joliet extension. Aborting.
$ ls -sh testimage.iso
64K testimage.iso

Since 64K are written, if you had used mkisofs to write to the CD/DVD on the fly
(say growisofs -Z /dev/hdc -R -J test), you would have wasted a CD/DVD. If I
recall correctly, this does not happen on other errors (for instance, when the
-joliet-long option is needed), because in that case mkisofs does not begin writing.
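
Until the check happens before any data is written, a caller can guard itself by scanning the tree first. The following is a minimal sketch of such a pre-flight check, not part of mkisofs or genisoimage; the function name `find_invalid_utf8_names` is made up for illustration, and it assumes the source filesystem is expected to hold UTF-8 names.

```python
import os


def find_invalid_utf8_names(root):
    """Return the paths (as bytes) under root whose own name is not valid UTF-8.

    Hypothetical helper for illustration; it only inspects names and never
    touches file contents.
    """
    bad = []
    # Walking a bytes path makes os.walk yield raw bytes names, so nothing
    # is decoded or escaped before we get to look at it.
    for dirpath, dirnames, filenames in os.walk(os.fsencode(root)):
        for name in dirnames + filenames:
            try:
                name.decode("utf-8")  # strict decoding raises on malformed input
            except UnicodeDecodeError:
                bad.append(os.path.join(dirpath, name))
    return bad
```

A frontend could call this on the directory it is about to burn and refuse to start an on-the-fly write if the returned list is not empty.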

Maybe the test for the incorrectly encoded strings could be moved to the part
where the original Joliet name tree is created, so that no data is written to
the image before aborting. However, just replacing the offending characters with
'_' or something similar seems like the sanest idea.
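
The replacement idea can be sketched in a few lines. This is only an illustration of the technique, written in Python rather than in the C of the actual tool; the handler name `underscore` and the function `sanitize_name` are invented here, and nothing below is taken from the mkisofs or genisoimage sources.

```python
import codecs


def _underscore_handler(err):
    # Called by the decoder for each malformed sequence: emit one "_" and
    # resume decoding right after the offending bytes.
    return ("_", err.end)


# Register the handler under a made-up name so decode() can refer to it.
codecs.register_error("underscore", _underscore_handler)


def sanitize_name(raw: bytes) -> str:
    """Decode a raw filename as UTF-8, replacing each malformed sequence with '_'."""
    return raw.decode("utf-8", errors="underscore")
```

Valid names pass through unchanged, so a tool using this approach would only alter the entries that would otherwise have caused the abort.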

I understand that this problem is not specific to Ubuntu (I have tested and
reproduced it also on an FC2 machine), but since the mkisofs version shipping
with Ubuntu is modified (and the cdrtools maintainer is rather strict in not
accepting bug reports for modified versions) I'm reporting it here.

I'm attaching a zipfile of the above test directory. You can unzip it and run
the above mkisofs line and it will abort.

Interestingly enough, non-utf8 characters of the isolatin-1 charset do not seem
to trigger the error (you may test it by replacing cp437 with isolatin1 in the
convmv line above); cp437 and cp850 do, though. However, this could be just a
coincidence because of the particular non-Unicode character chosen (Spanish ñ).
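
One way to see why the chosen character might matter is to look at the byte each encoding produces for ñ and at the role that byte would play in a UTF-8 stream. The sketch below only inspects bytes; whether this difference is what makes the patched mkisofs react differently is a guess that nothing in this report confirms. The helper `utf8_role` is made up for illustration.

```python
def utf8_role(byte: int) -> str:
    """Classify a single byte value by the role it can play in UTF-8."""
    if byte < 0x80:
        return "ASCII"
    if byte < 0xC0:
        return "continuation byte"
    if 0xC2 <= byte <= 0xF4:
        return "lead byte"
    return "never valid"


# "\u00f1" is the Spanish letter used in the test filename above.
for encoding in ("latin-1", "cp437", "cp850"):
    raw = "\u00f1".encode(encoding)
    print(encoding, raw.hex(), utf8_role(raw[0]))
```

In ISO Latin-1 the letter becomes 0xF1, which looks like the start of a multi-byte UTF-8 sequence that is then cut short, while in cp437 and cp850 it becomes 0xA4, a continuation byte with no lead byte before it. Both are malformed UTF-8, but they are malformed in different ways.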

Let me know if you need any further assistance.

Best regards - CP

José M. López-Cepero (cepe) wrote :

Created an attachment (id=4484)
.zip of a tree that triggers the bug

José M. López-Cepero (cepe) wrote :

Ian,

have you been able to make any progress on this bug so far? (just asking nicely)

Best regards

Simon Law (sfllaw) on 2006-04-28
Changed in cdrtools:
status: Unconfirmed → Confirmed

Nils Pickert (nils-mipi) wrote :

Hi,

I had the same problem today with some German filenames containing umlauts. The files had been copied among various computers and had ended up in some badly garbled UTF/whatever encoding. It took me two blank DVDs to figure out what happened :-(

Has there been any progress?

Nils

Schily (schilling-fokus) wrote :

Please note that you are not using mkisofs but a
fork that introduces bugs.

Your problem has never been in the original mkisofs;
it has been introduced by the people who created
the fork.

Please upgrade to a recent cdrtools version:

http://cdrecord.berlios.de/

Schily (schilling-fokus) wrote :

Moved to cdrkit as this bug is specific to the way genisoimage handles UTF-8 encoding.

mkisofs uses better code to handle character encoding and replaces illegal characters with '_'.

Please try the mkisofs package from gutsy/multiverse.

Ian Jackson (ijackson) on 2007-10-23
Changed in cdrkit:
assignee: ijackson → nobody

Przemek K. (azrael) wrote :

Is this bug still present in the latest version of Ubuntu (9.10)?

Schily (schilling-fokus) wrote :

As Ubuntu unfortunately still distributes the fork "cdrkit"
instead of the original software, there is no hope for a
fix.

We are closing this bug report because it lacks the information we need to investigate the problem, as described in the previous comments. Please reopen it if you can give us the missing information, and don't hesitate to submit bug reports in the future. To reopen the bug report you can click on the current status, under the Status column, and change the Status back to "New". Thanks again!

Changed in cdrkit (Baltix):
status: New → Invalid
Changed in cdrkit (Ubuntu):
status: Confirmed → Invalid

Thank you for taking the time to report this bug and helping to make Ubuntu better. My apologies as I should not have marked this Invalid. The issue that you reported is one that should be reproducible with the live environment of the Desktop CD of the development release - Maverick Meerkat. It would help us greatly if you could test with it so we can work on getting it fixed in the next release of Ubuntu. You can find out more about the development release at http://www.ubuntu.com/testing/ . Thanks again and we appreciate your help.

Changed in cdrkit (Ubuntu):
status: Invalid → Incomplete
Changed in cdrkit (Baltix):
status: Invalid → New

Schily (schilling-fokus) wrote :

The original software never used the broken UTF-8 patch that is the reason for the buggy behavior of the fork on Ubuntu.

The original software introduced a working UTF-8 solution in August 2006.

I recommend using the original software (current release is cdrtools-3.00), as it handles UTF-8 correctly.

Changed in cdrkit (Ubuntu):
status: Incomplete → Confirmed