pdfshuffler does not recognize all PDF files in directory

Bug #542755 reported by N7DR
This bug affects 1 person
Affects               Status     Importance  Assigned to  Milestone
PDF-Shuffler          Unknown    Unknown
pdfshuffler (Ubuntu)  Confirmed  Undecided   Unassigned

Bug Description

Binary package hint: pdfshuffler

When importing a PDF file, the dialogue lists only the files whose names end in ".pdf", regardless of whether:
1) those files really are PDF files
2) there are other PDF files in the directory that do not end in ".pdf".

The second bug is particularly a problem. It means that one cannot open some PDFs unless one happens to remember their name exactly. (I list the first bug simply for completeness as something that should be fixed; it doesn't actually affect my workflow.)

For example, I have a file "CO 106" which is a PDF file. The OS recognizes it as such (if I type 'file "CO 106"' the OS correctly informs me that it is a PDF 1.4 file). But PDF-Shuffler does not list it in the dialogue box when I try to import the file.

ProblemType: Bug
Architecture: amd64
Date: Sat Mar 20 09:19:03 2010
DistroRelease: Ubuntu 9.10
NonfreeKernelModules: fglrx
Package: pdfshuffler 0.4.2-1
PackageArchitecture: all
ProcEnviron:
 LANGUAGE=
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.31-20.58-generic
SourcePackage: pdfshuffler
Uname: Linux 2.6.31-20-generic x86_64

Lorenzo De Liso (blackz) wrote :

I can confirm this bug.

Regards,

Lorenzo D. (alias BlackZ).

Changed in pdfshuffler (Ubuntu):
status: New → Confirmed
Marcel Stimberg (marcelstimberg) wrote :

Note that you can choose "all files" in the bottom right corner of the dialogue and get a list of all files in the directory, regardless of their file extension.
I do not see a better solution, unless you are suggesting that the dialogue first check the true type of every file before displaying the file list? That would slow the dialogue down considerably. For example, I have a directory with ~500 PDFs, and running "file" on all of them takes ~4s. For directories containing movies etc. it might get even worse.

Changed in pdfshuffler (Ubuntu):
status: Confirmed → Incomplete
N7DR (doc-evans) wrote :

That's exactly what I'm suggesting.

If that isn't going to be implemented, then the wording of the dialogue should be changed. As a user, if I see a dialogue that tells me that it is offering to list "PDF files" then I expect it to... list the PDF files.

I'm not sure that a few seconds of delay (especially if accompanied by a "working" notification) is a problem. Given how long it takes to open file dialogues in many applications (yes, OpenOffice.org, I'm thinking of you), a few seconds doesn't seem like much. But you may not agree.

----

I guess I see the following possibilities:

1. Do nothing. This basically means that the dialogue is misleading the user.

2. Reword the dialogue. This would work, but would ultimately force the user to switch to the "All files" option and find the file.

3. Replace the current "PDF Files" option with a proper search for PDF files. This would work, but might cause a few seconds' delay. That wouldn't bother me, but maybe it would bother other users.

4. Reword the dialogue AND add a new option that implements a proper search for PDF files.

Personally, I'd put the desirability of these in the order:
  3, 4, 2, 1

Marcel Stimberg (marcelstimberg) wrote :

OK, I have to admit you're right. I was under the impression that pdfshuffler's behaviour was the standard one, i.e. that all or at least most file dialogues filter only on the extension and offer the "all files" option to reach other files. But I was wrong; the behaviour you expect is indeed the standard one. My comment on speed was not really fitting either: in my case (a folder full of PDFs with the correct extension) it won't make any speed difference, because the slow detection would only be used for files that have no extension.
The only thing that would remain is point 1) from your description, but that really is standard behaviour: e.g. eog would list a file with a .jpeg extension regardless of whether it really is an image or not.
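
The two-stage behaviour described above can be sketched in a few lines of Python. This is only an illustration of the idea, not GTK's actual detection code: the function name looks_like_pdf and the rule "sniff the content only when the name has no extension" are assumptions taken from the comment. The check itself relies on the fact that a well-formed PDF starts with the bytes "%PDF-".

```python
import os

PDF_MAGIC = b"%PDF-"  # a well-formed PDF file begins with this header


def looks_like_pdf(path):
    """Decide whether a file belongs under a "PDF files" filter (illustrative)."""
    extension = os.path.splitext(path)[1].lower()
    # Fast path: a .pdf extension is trusted without opening the file,
    # so a mislabelled file is still listed (point 1 of the report).
    if extension == ".pdf":
        return True
    # Any other extension is trusted too, which keeps large directories fast.
    if extension:
        return False
    # Slow path: only extensionless names such as "CO 106" are opened,
    # and only their first few bytes are read.
    try:
        with open(path, "rb") as handle:
            return handle.read(len(PDF_MAGIC)) == PDF_MAGIC
    except OSError:
        return False  # unreadable or missing: leave it out of the list
```

With this rule a directory of 500 correctly named PDFs costs no file reads at all, while a file such as "CO 106" is found by its content.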

To conclude, this issue is an upstream bug, and I'll report it in the upstream bug tracker. The patch is actually trivial: in the /usr/bin/pdfshuffler file, replace the line
         filter_pdf.add_pattern('*.pdf')
with
         filter_pdf.add_mime_type('application/pdf')
(Note that whitespace is significant in Python, i.e. the new line must have the same indentation as the old one.)

Sorry for my negative earlier response, I learned something new today ;)
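
For anyone who prefers not to edit the file by hand, the one-line change above could be applied with a small script along these lines. It is a sketch only: the helper name patch_filter_line is made up for illustration, and the two call strings are copied from the comment above. The helper keeps whatever indentation the original line has, which is the point of the whitespace warning.

```python
OLD_CALL = "filter_pdf.add_pattern('*.pdf')"
NEW_CALL = "filter_pdf.add_mime_type('application/pdf')"


def patch_filter_line(source):
    """Return source with the extension filter replaced by the MIME-type filter."""
    patched_lines = []
    for line in source.splitlines(keepends=True):
        if line.strip() == OLD_CALL:
            # Reuse the leading whitespace and the line ending exactly as found.
            indent = line[: len(line) - len(line.lstrip())]
            ending = line[len(line.rstrip("\r\n")):]
            line = indent + NEW_CALL + ending
        patched_lines.append(line)
    return "".join(patched_lines)
```

To use it, read the text of /usr/bin/pdfshuffler, pass it through patch_filter_line, and write the result back (this needs root privileges, and keeping a backup copy first is sensible). A package upgrade will overwrite the change.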

Changed in pdfshuffler (Ubuntu):
status: Incomplete → Confirmed
Mélodie (meets) wrote :

Hello,

I wonder why you use spaces in your file names.

Otherwise, on Ubuntu Xenial (16.04), provided all PDF files in a directory have the .pdf extension (which, according to your bug report, is not the case on your computer), you could do something like this in the console:

find . -name "*.pdf" -exec pdfshuffler {} \;

It works, except that you can't simply close the program once the job is done; you need to press Ctrl+C. I don't know how to write the command line so that it ends properly.

A question for marcelstimberg: how should I change this command line so that the files are detected as PDFs by their content (application/pdf) instead of by their file name extension?
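
One possible answer, as a Python sketch rather than a find one-liner: walk the directory, keep every regular file whose first bytes are the PDF header "%PDF-", and pass the results to pdfshuffler. The function names are made up for illustration, and it is an assumption that pdfshuffler accepts several file names in a single invocation; if it does not, the paths can be opened one at a time instead.

```python
import os
import subprocess

PDF_MAGIC = b"%PDF-"  # a well-formed PDF file begins with this header


def find_pdfs_by_content(root):
    """Return the sorted paths under root whose content starts like a PDF."""
    matches = []
    for directory, _subdirs, names in os.walk(root):
        for name in names:
            path = os.path.join(directory, name)
            if not os.path.isfile(path):
                continue  # skip sockets, FIFOs and broken symlinks
            try:
                with open(path, "rb") as handle:
                    if handle.read(len(PDF_MAGIC)) == PDF_MAGIC:
                        matches.append(path)
            except OSError:
                continue  # unreadable file: ignore it
    return sorted(matches)


def open_in_pdfshuffler(paths):
    """Launch pdfshuffler once with all paths (assumes multi-file support)."""
    # Arguments are passed as a list, so names with spaces need no quoting.
    return subprocess.run(["pdfshuffler", *paths], check=False)
```

Calling open_in_pdfshuffler(find_pdfs_by_content(".")) would then open every PDF below the current directory, including files such as "CO 106" that lack the extension.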
