cups-pdf should not convert PDF print jobs to PS then to PDF

Bug #820820 reported by Adrian Johnson on 2011-08-04
This bug affects 19 people
Affects: cups-pdf (Ubuntu)
Importance: Wishlist
Assigned to: Unassigned

Bug Description

When printing to cups-pdf from applications that use PDF as the printing format, the PDF output from the application is converted to PostScript using pdftops then back to PDF using Ghostscript. This turns a perfectly good PDF file into a crappy PDF file that does not display images correctly and the text is not selectable. If the print job is already in PDF format, cups-pdf should just pass the file through unmodified or at least avoid the PDF->PS->PDF conversion.
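The round trip described above can be approximated outside CUPS to reproduce the degradation. The sketch below only builds the two command lines. It assumes the Poppler `pdftops` command-line tool and Ghostscript's `gs` with its `pdfwrite` device are installed; the function name and file names are placeholders. The exact options used by the CUPS filter and the cups-pdf backend are not stated in this report, so this is an approximation, not the actual pipeline.

```python
import shlex


def roundtrip_commands(src_pdf, tmp_ps, out_pdf):
    """Return the argv lists for a PDF -> PostScript -> PDF round trip.

    Hypothetical helper: it mimics the detour described in the report,
    not the exact commands CUPS or cups-pdf run.
    """
    # Step 1: PDF to PostScript (Poppler's command-line pdftops).
    to_ps = ["pdftops", src_pdf, tmp_ps]
    # Step 2: PostScript back to PDF with Ghostscript's pdfwrite device.
    to_pdf = [
        "gs", "-dBATCH", "-dNOPAUSE",
        "-sDEVICE=pdfwrite",
        "-sOutputFile=" + out_pdf,
        tmp_ps,
    ]
    return [to_ps, to_pdf]


if __name__ == "__main__":
    for argv in roundtrip_commands("firefox.pdf", "firefox.ps", "roundtrip.pdf"):
        print(shlex.join(argv))
```

Comparing `firefox.pdf` with the file produced by running both commands should show whether text selection and images survive the detour.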

ProblemType: Bug
DistroRelease: Ubuntu 11.04
Package: cups-pdf 2.5.1-2
ProcVersionSignature: Ubuntu 2.6.38-10.46-generic 2.6.38.7
Uname: Linux 2.6.38-10-generic x86_64
Architecture: amd64
CupsErrorLog:

Date: Thu Aug 4 18:41:17 2011
InstallationMedia: Ubuntu 10.10 "Maverick Meerkat" - Release amd64 (20101007)
Papersize: a4
ProcEnviron:
 LANGUAGE=en_AU:en
 PATH=(custom, user)
 LANG=en_AU.UTF-8
 SHELL=/bin/bash
SourcePackage: cups-pdf
UpgradeStatus: Upgraded to natty on 2011-05-01 (95 days ago)

This test case is the PDF output from Firefox. It displays correctly in evince and acroread. The text is selectable and can be copied and pasted from the PDF.

This is the cups-pdf output after running:

lpr -P PDF firefox.pdf

It does not display correctly in evince or acroread (images are black). Text cannot be copied and pasted from evince or acroread.

Till Kamppeter (till-kamppeter) wrote :

This is caused by bug 817049.

Hadmut Danisch (hadmut) wrote :

This is more than just bug 817049. As Adrian pointed out, the resulting PDF files are often degraded (e.g. I sometimes experience shifted contents, where something at the very top or bottom is missing on the printout since it was shifted outside the printer's range).

Converting everything, including PDF, to PostScript, applying some dirty modifications to it and converting it back to PDF is definitely not a good idea. The whole print pipeline is outdated and - sorry to say that - very questionable.

Till Kamppeter (till-kamppeter) wrote :

I already modified the cups-pdf package some time ago to use a straight PDF workflow without a detour through PostScript, but this was apparently changed back to allow certain configurable options. I then gave up on this package, also because most apps/desktop toolkits have their own PDF export feature ("Print to file" in the GTK printing dialog and "Export to PDF" in OpenOffice and LibreOffice).

Till Kamppeter (till-kamppeter) wrote :

In Oneiric I have switched CUPS' pdftops filter from Poppler to Ghostscript. Please try whether this helps.

Martin-Éric Racine (q-funk) wrote :

There are two sides to this story: those who only use pure PDF documents and need something to print them to a spool, and those who actually want to be able to manipulate the output via Ghostscript first. The main issue with the PDF workflow patch was that it prevented those who want to manipulate the output via cups-pdf.conf Ghostscript options from doing so; it hard-coded too many things and it removed upstream functionality from CUPS-PDF. That produced endless criticism of the patch for the flexibility the package lost. If someone ever came up with a less intrusive version of this patch, upstream would gladly merge it.

Changed in cups-pdf (Ubuntu):
status: New → Confirmed
Changed in cups-pdf (Ubuntu):
status: Confirmed → Triaged
Till Kamppeter (till-kamppeter) wrote :

At the last UDS we decided to finally deprecate PostScript as the (former) standard print job format, because all standard applications now send print jobs in PDF format and the CUPS filters (in /usr/lib/cups/filter/) are moving from CUPS to OpenPrinting upstream. The PostScript-specific filters will be dropped, so the standard way is a workflow with PDF as the standard print job format. See

https://blueprints.launchpad.net/ubuntu/+spec/desktop-p-new-cups-filters-package
https://www.linuxfoundation.org/collaborate/workgroups/openprinting/pdfasstandardprintjobformat
http://www.openprinting.org/

So a PDF->PostScript->PDF path is more than awkward. Manipulating the output should be done with other methods:

1. Use the CUPS options for N-up, reverse order, selected pages, scale-to-fit, ... They are all executed by the pdftopdf CUPS filter and user-accessible by the GTK printing dialog (later by the Common Print Dialog).

2. Enhance the pdftopdf CUPS filter by adding more page manipulations to it (patches welcome). This would offer these page manipulations also to queues which print to real printers.

3. Use pdftk instead of Ghostscript in the cups-pdf backend.
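The job options from item 1 can also be given on the command line instead of through a print dialog. A minimal sketch, assuming the standard CUPS option names `number-up`, `outputorder`, `page-ranges` and `fit-to-page` and a queue named `PDF` as in the report; the helper name and its parameters are made up for illustration.

```python
def lpr_command(path, queue="PDF", n_up=None, reverse=False,
                page_ranges=None, fit_to_page=False):
    """Build an lpr argv that applies CUPS page-management options.

    Hypothetical helper; only the option names come from CUPS itself.
    """
    argv = ["lpr", "-P", queue]
    if n_up is not None:
        argv += ["-o", "number-up=%d" % n_up]      # N-up
    if reverse:
        argv += ["-o", "outputorder=reverse"]      # reverse order
    if page_ranges:
        argv += ["-o", "page-ranges=" + page_ranges]  # selected pages
    if fit_to_page:
        argv += ["-o", "fit-to-page"]              # scale to fit
    argv.append(path)
    return argv


if __name__ == "__main__":
    print(" ".join(lpr_command("firefox.pdf", n_up=2, page_ranges="1-4")))
```

Because these options are handled by the pdftopdf filter, they apply to any queue, which is the point made in item 2.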

Martin-Éric Racine (q-funk) wrote :

Using a filter other than Ghostscript would probably be suitable, just as long as it remains configurable. The main complaint that people had with the old PDF Workflow patch was that it completely disabled CUPS-PDF's output formatting configurability.

Martin-Éric Racine (q-funk) wrote :

I forwarded this to the upstream author. Here is his response:

    So a PDF->PostScript->PDF path is more than awkward.

The reasons for not skipping this step are known to you (it will severely impede the functionality of CUPS-PDF). Furthermore, once again, CUPS-PDF is not meant for processing PDF-input (i.e., it is not meant to be a PDF-manipulation tool).

    1. Use the CUPS options for N-up, reverse order, selected pages, scale-
    to-fit, ... They are all executed by the pdftopdf CUPS filter and user-
    accessible by the GTK printing dialog (later by the Common Print
    Dialog).

Those apply to the pdftopdf filter of CUPS. As for pdftopdf: see above.

    2. Enhance the pdftopdf CUPS filter by adding more page manipulations to
    it (patches welcome). This would offer these page manipulations also to
    queues which print to real printers.

I am certain that pdftopdf can be used as a great PDF-manipulation tool - but once again: that is not what CUPS-PDF is made for.

    3. Use pdftk instead of Ghostscript in the cups-pdf backend.

And once more: with pdftk one can modify PDFs. CUPS-PDF is made to convert non-PDF to PDF.

So, what is being suggested here is to have a tool that can drop printjobs to defined directories as CUPS-PDF does but does not convert PS to PDF but manipulates PDF instead.

Just for your understanding why I am so adamant about CUPS-PDF not relying on PDF-input: I never developed CUPS-PDF as "latest-and-greatest"-tool, but as a tool we employ in our labs with several hundred students/workers and a very diverse IT-infrastructure. With respect to printing, PS really IS a common ground to start from. PDF is entirely unknown to several of our installations.


This way cups-pdf as provided by upstream disqualifies itself for being part of Debian and Ubuntu distributions. Practically all applications send their print jobs in PDF format and CUPS in Debian and Ubuntu is configured to respect this by maintaining the data stream format in PDF until sending the job to the renderer/driver, so a tool like cups-pdf should work well with PDF input.

Most users simply want a tool which drops the job as a PDF file in their home directory. This is nicely fulfilled by my patch which was applied to cups-pdf earlier. The job principally stayed the original PDF output of the application (speed, reliability) and only got modified by the pdftopdf filter of CUPS in the case of the user using CUPS' page management options (N-up, reverse order, selected pages, ...).

Some users want to have the possibility of applying additional manipulation to their print-to-PDF jobs via the settings in the /etc/cups/cups-pdf.conf config file. What can be done there (except file name, user, and security settings, which are independent of the use of Ghostscript) is merely editing the Ghostscript command line, which was intended to turn PostScript into PDF, and setting the PDF level to convert to.

So what is mostly needed of the original cups-pdf is the mechanism to save print jobs as PDF files in a given subdirectory of the user's home directory, with the desired permissions and the desired file name scheme. In most cases the user wants to have the PDF coming out quickly and not being blown up by unneeded conversions. As the data already comes in PDF, this PDF should be dropped in the user's directory if the user did not express special desires via the /etc/cups/cups-pdf.conf file. So default should be a pass-through of incoming PDF. A conversion with Ghostscript should be only done either if the input is PostScript (usually does not happen in Debian or Ubuntu) or if the user explicitly desires a conversion by giving a Ghostscript command line or a PDF level in /etc/cups/cups-pdf.conf. As Ghostscript also understands PDF, PDF input should never be converted to PostScript only to feed it into Ghostscript. Ghostscript simply re-renders incoming PDF to the desired alternative PDF format (other level, quality, ...).

So I suggest the following changes on cups-pdf (being our fork if upstream does not accept them):

- The PPD file makes the tool accept both PostScript and PDF, PDF with a lower cost factor, to assure a PDF printing workflow on systems with the pdftopdf filter installed (Debian, Ubuntu):

*cupsFilter: "application/vnd.cups-postscript 100 -"
*cupsFilter: "application/vnd.cups-pdf 0 -"

- The backend only calls Ghostscript if at least one of these three conditions is fulfilled:
1. The input data is PostScript
2. The user specifies a custom Ghostscript command line via "GSCall" in /etc/cups/cups-pdf.conf
3. The user specifies a custom PDF level via "PDFVer" in /etc/cups/cups-pdf.conf
Otherwise Ghostscript is not called and incoming PDF is simply saved in the user's directory. Default is "GSCall" and "PDFVer" not being set so that PDF gets simply saved.
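The three conditions above amount to a small decision function. The following is a sketch of that logic only, not code from the cups-pdf backend: the function names are hypothetical, the configuration is assumed to be already parsed into a dict keyed by the option names "GSCall" and "PDFVer", and the input format is guessed from the leading bytes ("%PDF-" for PDF, "%!" for PostScript), which ignores any wrapper that might precede the job data.

```python
def detect_format(head):
    """Classify a print job by its first bytes (illustrative heuristic)."""
    if head.startswith(b"%PDF-"):
        return "pdf"
    if head.startswith(b"%!"):
        return "postscript"
    return "unknown"


def needs_ghostscript(head, config):
    """Return True if the job should go through Ghostscript.

    Mirrors the proposal: convert only for PostScript input, or when the
    user set GSCall or PDFVer. `config` is an assumed, already-parsed dict.
    """
    fmt = detect_format(head)
    if fmt == "unknown":
        # The proposal assumes CUPS only delivers PostScript or PDF.
        raise ValueError("input is neither PostScript nor PDF")
    if fmt == "postscript":
        return True                      # condition 1
    if config.get("GSCall"):
        return True                      # condition 2
    if config.get("PDFVer"):
        return True                      # condition 3
    return False                         # plain PDF: save as-is


if __name__ == "__main__":
    print(needs_ghostscript(b"%PDF-1.4\n", {}))
    print(needs_ghostscript(b"%!PS-Adobe-3.0\n", {}))
```

With both options unset, a PDF job falls through to the last line and is saved unmodified, which is the proposed default.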

This allows quick, reliable, and resource-saving PDF printin...


Additional remarks:

1. The backend should not contain anything which forces the conversion of incoming PDF to PostScript, as Ghostscript understands PDF as well.
2. We should propose my suggested architecture upstream, as it still allows PostScript as input and conversion of the incoming data to an alternative PDF format with Ghostscript, controlled by the same variables as before in the config file.

Martin-Éric Racine (q-funk) wrote :

It mostly makes sense, except for one important detail: The assumption about input format being PDF is wrong, because CUPS-PDF is meant to *convert* input documents into PDF e.g. I can 'cat' an ASCII text file to 'lpr' and it comes out as PDF.

Martin-Éric Racine (q-funk) wrote :

So, for your above proposal to work, condition #1 would have to be "the input format is not PDF" instead.

Martin-Éric, due to the filter mechanisms of CUPS the data format arriving at cups-pdf is always PostScript or PDF. The upstream PPD file of cups-pdf forces CUPS to turn everything into PostScript by omitting any cupsFilter line, which makes CUPS default to

*cupsFilter: "application/vnd.cups-postscript 0 -"

All conversion to PostScript is done by the CUPS filters, texttops for plain text, pdftops for PDF, ... cups-pdf then receives PostScript and nothing else.

In my suggested architecture the PPD file has

*cupsFilter: "application/vnd.cups-postscript 100 -"
*cupsFilter: "application/vnd.cups-pdf 0 -"

allowing both PostScript and PDF as input. Any other format is converted by the filters coming with CUPS, so it is guaranteed that the input is PostScript or PDF. PDF is the desired destination format, so if the incoming format is PDF and the user has no special desires one could simply save it. PostScript can be turned to PDF by the backend using Ghostscript, as it was done before. If the user expresses special desires via the config file, the input data is passed through Ghostscript. In this case it does not matter whether the input data is PostScript or PDF, as Ghostscript understands both and switches automatically.
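To make the role of the cost factor concrete, here is a small sketch that parses `*cupsFilter` lines of the form shown above (MIME type, cost, program) and reports which input type the queue prefers. It is a simplification: CUPS itself compares the summed cost of whole filter chains, while this hypothetical helper only compares the costs declared in the PPD.

```python
import re

# Matches lines such as: *cupsFilter: "application/vnd.cups-pdf 0 -"
CUPS_FILTER_RE = re.compile(r'^\*cupsFilter:\s*"(\S+)\s+(\d+)\s+(\S+)"\s*$')


def parse_cups_filters(ppd_text):
    """Return (mime_type, cost, program) tuples for each cupsFilter line."""
    entries = []
    for line in ppd_text.splitlines():
        match = CUPS_FILTER_RE.match(line.strip())
        if match:
            entries.append((match.group(1), int(match.group(2)), match.group(3)))
    return entries


def preferred_input(entries):
    """Pick the MIME type with the lowest declared cost (simplified)."""
    return min(entries, key=lambda entry: entry[1])[0]


if __name__ == "__main__":
    ppd = (
        '*cupsFilter: "application/vnd.cups-postscript 100 -"\n'
        '*cupsFilter: "application/vnd.cups-pdf 0 -"\n'
    )
    print(preferred_input(parse_cups_filters(ppd)))
```

With the two lines from the suggested PPD, PDF has the lower cost, so a PDF job would reach the backend without being converted to PostScript first.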

In Debian and Ubuntu CUPS is configured for a PDF-centric printing workflow. Desktop applications send PDF, then a pdftopdf filter applies CUPS' page management options and the result is fed into cups-pdf when one uses my suggested architecture.

If the user cats a text file into lpr, CUPS calls texttopdf, then pdftopdf and finally cups-pdf, again feeding PDF into cups-pdf.

Conversion of arbitrary input files is always done by CUPS and not by cups-pdf. In the good old times of PostScript-centric CUPS, cups-pdf only needed to do the final step of turning PostScript into PDF. With the new PDF-centric CUPS workflow this is not needed any more, because CUPS already delivers PDF. cups-pdf then only needs to cater for special wishes, like changing the PDF level, and can simply save the file in most cases, as most users do not have these special wishes.

Martin-Éric Racine (q-funk) wrote :

If I understood correctly, what you're saying is that CUPS itself would handle conversion of any format into PDF (i.e. ASCII via text2pdf and PostScript via ps2pdf) before formatting the printable document via pdf2pdf according to the settings of whichever PPD is currently used and then sending it to a device driver backend? Would that cover any other input format than those I've mentioned?

Yes, CUPS turns everything to PDF (at least on Debian and Ubuntu), by using the filters texttopdf, imagetopdf, and pstopdf. In most cases these filters are not used at all, as practically all desktop applications send print jobs in PDF. texttopdf usually only kicks in if a system admin wants to quickly print a config file with "lpr". All in all, one gets PDF and CUPS applies the pdftopdf filter to execute PPD-independent CUPS options like N-up, reverse order, .... After that the printer driver filters are called: pdftoraster and rasterto... for CUPS-Raster-based drivers, foomatic-rip for Ghostscript-based drivers, cpdftocps for PostScript printers. The driver filters apply the options from the PPD file, and some options of the PPD, like PageSize, are additionally used by the other filters.

For cups-pdf in my suggested architecture no driver filter is used. The cups-pdf backend then simply saves the PDF as it comes from CUPS, or re-renders it with Ghostscript if the user desires.
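The chain described in this comment can be written down as a lookup. This is only an illustration of the order of filters named above for the suggested cups-pdf architecture; the input labels ("text", "image", "postscript", "pdf") are simplified stand-ins rather than real MIME types, and the function is hypothetical.

```python
# Converters named in the comment above; PDF input needs none.
TO_PDF_FILTER = {
    "text": "texttopdf",
    "image": "imagetopdf",
    "postscript": "pstopdf",
    "pdf": None,
}


def filter_chain(input_kind):
    """Return the sequence of steps a job passes through on its way to cups-pdf."""
    chain = []
    converter = TO_PDF_FILTER[input_kind]
    if converter is not None:
        chain.append(converter)   # turn the input into PDF first
    chain.append("pdftopdf")      # page management (N-up, reverse order, ...)
    chain.append("cups-pdf")      # backend saves or re-renders the PDF
    return chain


if __name__ == "__main__":
    for kind in ("text", "pdf"):
        print(kind, "->", " -> ".join(filter_chain(kind)))
```

For a text file sent through lpr this gives texttopdf, pdftopdf, cups-pdf, matching the example given earlier in the thread.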

Martin-Éric Racine (q-funk) wrote :

Upstream says that he never noticed how those recent internal changes in CUPS could make the production of PDF documents much easier than before and no longer require Ghostscript by default.

What he is likely to do is to fix outstanding bugs within the 2.6 series and then, if his initial testing in his work environment is conclusive, fork the code to a 3.0 series that specifically depends upon recent upstream CUPS features for producing the PDF documents, all while retaining the existing configuration file format.

Upstream would however like to point out that, at his workplace, CUPS-PDF is often used to produce reports by automated means e.g. by networked laboratory equipment, so his code cannot make the assumption that CUPS-PDF would be used within a desktop environment or even from a Free Software printing client.

In any case, he cannot return to these issues until his Christmas vacations. Until then, if we have any patch to propose to him, that could perhaps help speed up the migration process.

Changed in cups-pdf (Ubuntu):
importance: Undecided → Wishlist
pt123 (pt123) wrote :

Is there a temporary fix we can do to have cups-pdf working like it was in 10.10?

Can I also ask whether there is a temporary fix to get searchable text from CUPS-PDF?

I'm currently being forced to use Windoze - the pain, the pain....

pt123 (pt123) wrote :

You need to use print-to-file; CUPS has become buggy in the last release.

Unfortunately print-to-file doesn't work for me. My particular problem is a Windows program running under Wine, and print-to-file in that environment has a similar problem. I have found a fairly diabolical work-around for my current problem (involving Access 2003 to RTF, to Word 2003 to DOC, to LibreOffice to PDF), but it's tedious to say the least, and the resulting layout far from perfect.

As I get more experience using Ubuntu I suspect I will encounter other native Linux programs that I want to output to searchable PDF and that don't have their own PDF facility (I am aware that LibreOffice and some browsers already do have that facility).

Doesn't Wine let you access GNOME's "Print to file" option?

On Sun, Oct 14, 2012 at 7:00 AM, Tim Passingham <email address hidden> wrote:

> unfortunately print-to-file doesn't work for me. My particular problem
> is a Windows program running under Wine, and print-to-file in that
> environment has a similar problem. I have found a fairly diabolical
> work-around for my current problem (involving Access 2003 to RTF, to
> Word 2003 to DOC, to LibreOffice to PDF), but it's tedious to say the
> least, and the resulting layout far from perfect.
>
> As I get more experience using Ubuntu I suspect I will encounter other
> native Linux programs that I want to output to searchable PDF and that
> don't have their own PDF facility (I am aware that Libreoffice and some
> browsers already do have that facility).

The Wine print-to-file does not appear to work well. Some applications under Wine produce a PostScript file, but this does not seem to provide searchable text. This may be a Wine bug.

However, the print-to-file option for native Linux programs does seem to provide what CUPS-PDF should provide (I tried it with Thunderbird), which rather begs the question of why CUPS-PDF can't be made to work.

CUPS-PDF converts PostScript to PDF. PostScript does not support searchable text. So CUPS-PDF does not have the information available to make a searchable PDF. Many applications now generate print output in PDF format. So using CUPS-PDF means the searchable PDF from the application is converted to PostScript then converted back to PDF. That's why the print to file from the Linux print dialog is the better option.

Does wine allow printer drivers to be installed? There are PDF printer drivers such as PDFCreator that will create searchable PDFs.

pt123 (pt123) wrote :

Adrian, cups-pdf used to create searchable PDFs until 12.04, when it became infested with many bugs including this one.

It depends on the font encoding used in the PostScript. It is possible for it to work in some cases. It may be a CUPS-PDF regression or it may be a change in the way the fonts are encoded in the PostScript supplied to CUPS-PDF.

It is possible to get searchable PDF when converting PostScript to PDF but it is not guaranteed to work for every PostScript file.
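Whether a given output file ended up searchable can be checked mechanically rather than by trying to select text in a viewer. The sketch below assumes the Poppler `pdftotext` tool is installed (invoked as `pdftotext FILE -` to write to standard output); the function names and the 20-character threshold are arbitrary choices for illustration, and a badly encoded font can still yield text that passes this crude test but is gibberish.

```python
import subprocess


def extract_text(pdf_path):
    """Run pdftotext on a file and return whatever text it finds."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def looks_searchable(text, min_chars=20):
    """Crude heuristic: enough letters or digits were extracted."""
    return sum(ch.isalnum() for ch in text) >= min_chars


if __name__ == "__main__":
    import sys
    for path in sys.argv[1:]:
        print(path, looks_searchable(extract_text(path)))
```

Running this on the application's own PDF and on the cups-pdf output for the same job would show whether the text layer was lost on the way.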

Unfortunately there are some things that are not fully supported in Wine. I don't understand the details, but it seems there's a missing redirected port monitor (Redmon) feature that means that PDFCreator doesn't work. There are reports that there is also a problem with the fonts which would prevent searchable PDFs from being produced even if the rest worked. I have parallel discussions going on the Wine bug list and on the PDFCreator forum, just in case a solution comes from that angle.

Rob Crow (r-crow) wrote :

Is there any resolution to this issue? - i.e. CUPS-PDF converting PDFs to PS and back.

I am still getting mangled PDFs out of CUPS-PDF (Mac OS).

Martin Wildam (mwildam) wrote :

The point is that printing e.g. from Firefox to cups-pdf works with the
older versions of cups-pdf and does not with recent versions (for the same
websites).

Why not print to file, you ask: It is simply a lot more steps (more clicks
and manually entering a filename).

I do not really understand, why this long discussion instead of reverting
the changes that caused the issue. pdf-printing was a solved problem for
years and now suddenly became an issue.

--
Regards, Martin Wildam.
On 17.09.2013 at 15:21, "Rob Crow" <email address hidden> wrote:

> Is there any resolution to this issue? - i.e Cups-PDF converting PDFs to
> PS and back.
>
> I am still getting mangled PDFs out of CUPS PDF (mac os)

Rob Crow (r-crow) wrote :

CUPS-PDF is almost perfect for my task; it is just the conversion of PDFs to PostScript (and then back to PDF) which is introducing errors in the final PDFs, when it could simply skip that conversion.

I think the architecture suggested in this discussion seems rational and would be welcomed.

If there is a version, branch or alternative backend with this functionality somewhere I would like to know.

I discovered this bug just yesterday as I used pdfgrep to find a special expression in the PDF archive of my mails. Only mails before August 2011 were searchable.

Until this bug is fixed: What exactly do I have to do to prevent the PS-and-back conversion? I didn't understand the thing with *cupsFilter: "application/vnd.cups-postscript 100 -" and *cupsFilter: "application/vnd.cups-pdf 0 -". Where does this have to be added? Anything else?

To use “print to file” for every single mail is unsuitable, I need the mass printing where the filename is automatically generated from the mail subject.

Hello all,

I've created a patch against cups-pdf_2.6.1-9 which addresses this issue. See attachment.

kind regards,

Björgvin

The attachment "PDF input support for cups-pdf" seems to be a patch. If it isn't, please remove the "patch" flag from the attachment, remove the "patch" tag, and if you are a member of the ~ubuntu-reviewers, unsubscribe the team.

[This is an automated message performed by a Launchpad user owned by ~brian-murray, for any issues please contact him.]

tags: added: patch

Björgvin, is the adding of JCL by the PPD file really needed? The resulting PDF should be pure PDF without any JCL around it, as it is not for an actual printer.

No, you're right, it's not needed, I've attached an updated version of the patch.

I did not read this bug report carefully enough, but this is essentially the same as 70_cups-pdf_support-pdf-workflow.patch which was dropped as discussed earlier.

I'm using cups-pdf as a stopgap until LibreOffice improves the usability of its "print to file" feature (not the same as export to PDF).

Jethro Beekman (jethrogb) wrote :

Thanks Björgvin. PPA here: https://launchpad.net/~jethrogb/+archive/ppa . My PPA does not currently contain any standard packages, but I'm not guaranteeing it won't in the future, so if you don't want to risk updating those, set your apt_preferences accordingly:

Package: *
Pin: release o=LP-PPA-jethrogb
Pin-Priority: -1

Package: cups-pdf
Pin: release o=LP-PPA-jethrogb
Pin-Priority: 500

Jethro Beekman (jethrogb) wrote :

I noticed two things about the patch:

1) I needed to reinstall the printer for it to work. If this ever makes it into distribution, that should be automatic.
2) Files printed from Firefox this way no longer have a title (output file is just called "PDF-job_###.pdf")
