Pdftk cannot work with non-ascii named files

Bug #158025 reported by Joe_Bishop
This bug affects 2 people
Affects          Status         Importance   Assigned to   Milestone
pdftk (Debian)   Fix Released   Unknown
pdftk (Ubuntu)   Fix Released   Undecided    Unassigned

Bug Description

Binary package hint: pdftk

Pdftk cannot work with non-ascii named files. For example:
--------------------------------------------------------------------------------------------
master@master:~/Desktop$ ls -l
total 7112
drwxr-xr-x 3 master master 8 2007-10-28 15:04 examples
-rw-r--r-- 1 master master 5842516 2007-10-28 15:03 tutorial.tar.gz
-rw-rw-r-- 1 master master 1437044 2007-10-02 08:46 Панкратьев.pdf
master@master:~/Desktop$ pdftk Панкратьев.pdf dump_data output ahha
Error: Failed to open PDF file:
   Панкратьев.pdf
Errors encountered. No output created.
Done. Input errors, so no output created.
--------------------------------------------------------------------------------------------
This happens because the input and output file names are decoded as Latin-1, with code like this:
java::String* jv_output_filename_p= JvNewStringLatin1( output_filename.c_str() );
You can see it in the pdftk/pdftk.cc file.
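
To illustrate the effect described above, here is a small Python sketch (not pdftk code). It assumes that a Latin-1 decoder such as JvNewStringLatin1 maps every input byte to exactly one character, and that the shell passes the file name as UTF-8 bytes:

```python
# Illustration only: mimics decoding UTF-8 file name bytes as Latin-1.
name = "Панкратьев.pdf"

# Bytes the program receives for this name in a UTF-8 locale.
raw = name.encode("utf-8")

# A Latin-1 decoder turns each byte into one character,
# so every two-byte Cyrillic letter becomes two unrelated characters.
as_latin1 = raw.decode("latin-1")

# Decoding the same bytes as UTF-8 recovers the original name.
as_utf8 = raw.decode("utf-8")

# Pure ASCII names survive either way, which is why only
# non-ASCII names fail to open.
ascii_name = "tutorial.tar.gz"
ascii_roundtrip = ascii_name.encode("utf-8").decode("latin-1")

print(len(name), len(as_latin1))
print(as_latin1 == name, as_utf8 == name, ascii_roundtrip == ascii_name)
```

The mis-decoded string names a file that does not exist on disk, which matches the "Failed to open PDF file" error in the session above.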

ClonedAgain (clonedagain) wrote :

I can confirm that UTF-8 seems to confuse pdftk: I get "Error: Failed to open PDF file:" when I use filenames with French accents on Hardy.

Here's a blog entry from someone who claims to have a fix (source+win32 exe):
http://blog.rubypdf.com/2007/07/19/pdftk-supports-chinese-path-now/
I haven't had the time to try it yet, sorry.

Changed in pdftk:
status: New → Confirmed
Adam Buchbinder (adam-buchbinder) wrote :

The attached patch (adapted from the above link) is confirmed as fixing the issue for me. My knowledge of Debian packaging is skinny enough that I can't conjure up an actual debdiff, but I've tested adding this package via quilt and running dpkg-buildpackage; the built version does work with accented characters in filenames.

Joe_Bishop (denis-cheremisov-gmail) wrote : Re: [Bug 158025] Re: Pdftk cannot work with non-ascii named files

Great! It comes a little late, as I had already made a workaround with temporary files for this issue,
but better late than never.
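
The workaround itself was not posted, so the following Python sketch is only a guess at the general idea: copy the PDF to an ASCII-only temporary name and run pdftk on the copy. The helper name and the injectable `run` parameter are hypothetical; the pdftk invocation is the one from the bug description. It assumes the temporary directory path and the output path are themselves ASCII.

```python
import os
import shutil
import subprocess
import tempfile


def pdftk_dump_data_via_temp_copy(pdf_path, output_path, run=subprocess.run):
    """Run `pdftk <copy> dump_data output <output_path>` on an ASCII-named copy.

    Hypothetical helper; `run` can be replaced to try the sketch
    without pdftk installed.
    """
    with tempfile.TemporaryDirectory(prefix="pdftk_ascii_") as tmp_dir:
        # The copy gets a fixed ASCII name, whatever the original is called.
        tmp_pdf = os.path.join(tmp_dir, "input.pdf")
        shutil.copyfile(pdf_path, tmp_pdf)
        cmd = ["pdftk", tmp_pdf, "dump_data", "output", output_path]
        # The temporary copy is removed when the block exits.
        return run(cmd, check=True)
```

For example, `pdftk_dump_data_via_temp_copy("Панкратьев.pdf", "ahha")` would pass only ASCII file names to pdftk.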

2008/9/8 Adam Buchbinder <email address hidden>

> The attached patch (adapted from the above link) is confirmed as fixing
> the issue for me. My knowledge of Debian packaging is skinny enough that
> I can't conjure up an actual debdiff, but I've tested adding this
> package via quilt and running dpkg-buildpackage; the built version does
> work with accented characters in filenames.
>
> ** Attachment added: "Patch to enable non-Latin-1 filename support."
> http://launchpadlibrarian.net/17402709/non_latin1_filename_support

Danilo Piazzalunga (danilopiazza) wrote :

A simple but somewhat crude workaround would be replacing JvNewStringLatin1 with JvNewStringUTF, which assumes UTF-8. This was proposed for Debian bug #461169 (http://bugs.debian.org/461169), which affects the update_info command.
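
A short Python sketch (again, not pdftk code) of why assuming UTF-8 is crude: a file name stored as Latin-1 bytes is generally not valid UTF-8. The example name is made up, and how JvNewStringUTF itself reacts to invalid input is not covered here; Python simply raises an error.

```python
# Illustration only: a name encoded in a Latin-1 locale, decoded as UTF-8.
name = "résumé.pdf"

latin1_bytes = name.encode("latin-1")
utf8_bytes = name.encode("utf-8")

def decodes_as_utf8(raw):
    """Return True if the byte string is valid UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

# UTF-8 names work under the UTF-8 assumption; Latin-1 names do not.
print(decodes_as_utf8(utf8_bytes), decodes_as_utf8(latin1_bytes))
```

A more careful fix would decode file names according to the user's locale rather than hard-coding either encoding.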

Changed in pdftk:
status: Unknown → Confirmed
Jose M. Albarrán (yomismo-jmalbarran) wrote :

I can confirm the same bug with Spanish characters in a UTF-8 locale.

Changed in pdftk (Debian):
status: Confirmed → Fix Released
Johann Felix Soden (johfel) wrote :

Fixed in 1.41+dfsg-1.

Changed in pdftk (Ubuntu):
status: Confirmed → Fix Released