hplip-data ballooned by 2.5 MB in lucid

Bug #493282 reported by Steve Langasek on 2009-12-06
This bug affects 1 person
Affects          Importance   Assigned to
HPLIP            Undecided    Unassigned
hplip (Ubuntu)   Medium       Till Kamppeter
Lucid            Medium       Till Kamppeter

Bug Description

Binary package hint: hplip

The hplip-data package is 5MB larger in lucid than it was in karmic, making this package the single top contributor to the fact that the amd64 alternate CD is currently 22MB oversized.

The change in size is due to a huge increase in the size of all the PPDs in the package, e.g.:
-120883 hp-color_laserjet_cp6015-ps.ppd
+447513 hp-color_laserjet_cp6015-ps.ppd

This needs to be remedied so we can have usable CDs for alpha-1.

ProblemType: Bug
Architecture: amd64
CupsErrorLog: W [06/Dec/2009:08:06:19 -0800] No limit for CUPS-Get-Document defined in policy default - using Send-Document's policy
Date: Sun Dec 6 11:48:16 2009
DistroRelease: Ubuntu 10.04
Lpstat:
 device for davidson-printer: smb://WORKGROUP/EXCALIBUR/Printer2
 device for PSC_750: ipp://192.168.13.6:631/printers/PSC_750
 device for PSC_750_legal: ipp://192.168.13.6:631/printers/PSC_750_legal
MachineType: LENOVO 6371CTO
Package: hplip 3.9.10-0ubuntu2
Papersize: letter
PccardctlIdent:
 Socket 0:
   no product info available
PccardctlStatus:
 Socket 0:
   no card
PpdFiles: davidson-printer: HP Color LaserJet CP3505 Postscript (recommended)
ProcCmdLine: root=/dev/mapper/hostname-root ro quiet splash
ProcEnviron:
 PATH=(custom, user)
 LANG=en_US.UTF-8
 SHELL=/bin/bash
ProcVersionSignature: Ubuntu 2.6.32-6.8-generic
SourcePackage: hplip
Tags: lucid
Uname: Linux 2.6.32-6-generic x86_64
dmi.bios.date: 12/27/2006
dmi.bios.vendor: LENOVO
dmi.bios.version: 7IET23WW (1.04 )
dmi.board.name: 6371CTO
dmi.board.vendor: LENOVO
dmi.board.version: Not Available
dmi.chassis.asset.tag: No Asset Information
dmi.chassis.type: 10
dmi.chassis.vendor: LENOVO
dmi.chassis.version: Not Available
dmi.modalias: dmi:bvnLENOVO:bvr7IET23WW(1.04):bd12/27/2006:svnLENOVO:pn6371CTO:pvrThinkPadT60:rvnLENOVO:rn6371CTO:rvrNotAvailable:cvnLENOVO:ct10:cvrNotAvailable:
dmi.product.name: 6371CTO
dmi.product.version: ThinkPad T60
dmi.sys.vendor: LENOVO

Steve Langasek (vorlon) wrote :
Changed in hplip (Ubuntu):
importance: Undecided → High
milestone: none → lucid-alpha-1
status: New → Triaged
Till Kamppeter (till-kamppeter) wrote :

The change is in the PostScript printer PPDs supplied by HP. They have internationalized them, which means that all option menu items are translated into around 10 languages now. This has blown up these PPDs.

The internationalization is based on a CUPS PPD extension (http://www.cups.org/documentation.php/doc-1.4/spec-ppd.html), which I have been promoting recently when communicating with the printer manufacturers, as it will lead to translated option and choice names in the Common Printing Dialog (https://www.linuxfoundation.org/collaborate/workgroups/openprinting/commonprintingdialog), one of my main projects at OpenPrinting. It also provides translations for the web-based printer setup tool of CUPS (http://localhost:631/).

I cannot remove the PostScript PPDs from the HPLIP package, as there is a very big user base with PostScript printers from HP. They are also not auto-downloadable at OpenPrinting as HP is maintaining them in HPLIP, which comes with all distributions. I have removed the HP PPDs from OpenPrinting some time ago as no one kept them up-to-date.

Possible solutions are:

Do not ship the openprinting-ppds package. This would free some space, but for any non-HP PostScript printer the user must be connected to the internet for setting up the printer, as the PPD will get auto-downloaded from OpenPrinting.

Do not ship HP's PPDs and post them on OpenPrinting again. This would require that the PPDs at OpenPrinting are quickly updated after each release of HPLIP (or even before, in cooperation with HP). This would mean that users of HP PostScript printers need an internet connection to set up their printers.

Currently, we make use of OpenPrinting to save space by letting the PPDs for the Ricoh family and its OEMs be auto-downloaded from OpenPrinting.

In the future PPD files will take up even more space, as more manufacturers discover internationalization and more manufacturers provide PPD files in the first place.

I am thinking about opening a Google Summer of Code project next year about highly compressing ready-made PPDs shipping with Linux distributions, based on a PPD generator (program in /usr/lib/cups/driver) which uncompresses them on-the-fly.
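As a rough illustration of the idea in the paragraph above, here is a minimal Python sketch: many near-identical PPDs are packed into one compressed stream with an index, and a single PPD is recovered on request. It is only a sketch under stated assumptions, not the eventual implementation. The function names `build_archive` and `cat_ppd`, the JSON index layout and the sample PPD text are all made up for this example, and the wiring into a program under /usr/lib/cups/driver is left out.

```python
import json
import lzma


def build_archive(ppds: dict[str, str]) -> bytes:
    """Pack PPD texts into one LZMA stream preceded by a JSON index.

    Hypothetical layout: 4-byte header length, JSON index mapping each
    name to (offset, length), then the concatenated PPD bodies.
    """
    index = {}
    chunks = []
    offset = 0
    for name, text in sorted(ppds.items()):
        data = text.encode("utf-8")
        index[name] = (offset, len(data))
        chunks.append(data)
        offset += len(data)
    header = json.dumps(index).encode("utf-8")
    payload = len(header).to_bytes(4, "big") + header + b"".join(chunks)
    # Compressing everything together lets the compressor exploit the
    # text that the PPDs have in common.
    return lzma.compress(payload)


def cat_ppd(archive: bytes, name: str) -> str:
    """Decompress the archive on the fly and return one PPD by name."""
    payload = lzma.decompress(archive)
    header_len = int.from_bytes(payload[:4], "big")
    index = json.loads(payload[4:4 + header_len])
    offset, length = index[name]
    start = 4 + header_len + offset
    return payload[start:start + length].decode("utf-8")


if __name__ == "__main__":
    # Made-up sample data: two PPDs that differ in a single line.
    shared = '*OpenUI *PageSize/Media Size: PickOne\n' * 500
    ppds = {
        "example-a.ppd": shared + '*ModelName: "Example A"\n',
        "example-b.ppd": shared + '*ModelName: "Example B"\n',
    }
    archive = build_archive(ppds)
    raw = sum(len(text) for text in ppds.values())
    print(f"raw: {raw} bytes, archive: {len(archive)} bytes")
```

Compressing each file on its own (as with *.ppd.gz) cannot share anything between files, which is why a single archive is used here.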

Steve Langasek (vorlon) wrote :

It appears the CDs are down to size again for alpha-1, so I am deferring this bug.

Changed in hplip (Ubuntu Lucid):
milestone: lucid-alpha-1 → lucid-alpha-2
Martin Pitt (pitti) wrote :

There's also much easier potential for trimming the package size, as the livefs build's fdupes check shows:

BEGIN fdupes
46062897 2461 3910
4551360 12 379280 usr/share/ppd/hplip/HP/hp-laserjet_p4014-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4014dn-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4014n-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4015-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4015dn-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4015n-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4015tn-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4015x-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4515-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4515n-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4515tn-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4515x-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p4515xm-ps.ppd
1524824 7 217832 usr/share/doc/libgtk2.0-common/changelog.gz usr/share/doc/libgtk2.0-common/ChangeLog.gz usr/share/doc/libgtk2.0-0/changelog.gz usr/share/doc/libgail-common/changelog.gz usr/share/doc/libgail18/changelog.gz usr/share/doc/libgtk2.0-bin/changelog.gz usr/share/doc/gtk2-engines-pixbuf/changelog.gz usr/share/doc/gtk2-engines-pixbuf/ChangeLog.gz
922590 3 307530 usr/share/ppd/hplip/HP/hp-color_laserjet_cm2320_mfp-ps.ppd usr/share/ppd/hplip/HP/hp-color_laserjet_cm2320fxi_mfp-ps.ppd usr/share/ppd/hplip/HP/hp-color_laserjet_cm2320n_mfp-ps.ppd usr/share/ppd/hplip/HP/hp-color_laserjet_cm2320nf_mfp-ps.ppd
855376 1 855376 usr/share/doc/libsnmp-base/changelog.gz usr/share/doc/libsnmp15/changelog.gz
749940 3 249980 usr/share/ppd/hplip/HP/hp-laserjet_p2015_series-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p2015dn_series-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p2015n_series-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p2015x_series-ps.ppd
731726 2 365863 usr/share/doc/libdrm2/changelog.gz usr/share/doc/libdrm-intel1/changelog.gz usr/share/doc/libdrm-radeon1/changelog.gz
721857 3 240619 usr/share/ppd/hplip/HP/hp-laserjet_p2055-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p2055d-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p2055dn-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_p2055x-ps.ppd
669490 5 133898 usr/share/themes/New\ Wave\ Dark\ Menus/gtk-2.0/default-gtkrc usr/share/themes/New\ Wave/gtk-2.0/gtkrc
614156 4 153539 usr/share/ppd/hplip/HP/hp-laserjet_1320-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_1320_series-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_1320n-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_1320nw-ps.ppd usr/share/ppd/hplip/HP/hp-laserjet_1320tn-ps.ppd
549650 2 274825 usr/share/ppd/hplip/HP/hp-color_laserjet_cp1514n-ps.ppd usr/share/ppd/hplip/HP/hp-color_laserjet_cp1515n-ps.ppd usr/share/ppd/hplip/HP/hp-color_laserjet_cp1518ni-ps.ppd
531902 14 37993 usr/share/doc/libmono-corlib2.0-cil/copyright usr/share/doc/libmono-system-data2.0-cil/copyright usr/share/doc/libmono-system2.0-cil/copyright usr/share/doc/libmono-security2.0-cil/copyright usr/share/doc/mono-2.0-gac/copyright usr/share/doc/mono-gac/copyright usr/share/doc/mono-runtime/copyright usr/share/doc/libmono-data-tds2.0-cil/copyright usr/share/doc/libmono-posix2.0-cil/copyright usr/share/doc/libmono-sharpzip2.84-cil/copyright usr/share/doc/libmono-sqlite2.0-cil/copyright usr/share/doc/libmono2.0-ci...
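Each record in the listing above appears to consist of the wasted bytes, the number of redundant copies, the size of one file, and then the paths. That reading is only inferred from the numbers (for example 12 × 379280 = 4551360), not from documentation of the livefs build script. Under that assumption, a short Python sketch can add up the space wasted by the duplicate PPD groups visible in the excerpt:

```python
# (wasted_bytes, redundant_copies, file_size) for the PPD groups quoted
# in the fdupes excerpt; the column meaning is an inference, see above.
ppd_groups = [
    (4551360, 12, 379280),  # hp-laserjet_p4014 / p4015 / p4515 family
    (922590, 3, 307530),    # hp-color_laserjet_cm2320 family
    (749940, 3, 249980),    # hp-laserjet_p2015 family
    (721857, 3, 240619),    # hp-laserjet_p2055 family
    (614156, 4, 153539),    # hp-laserjet_1320 family
    (549650, 2, 274825),    # hp-color_laserjet_cp1514n / cp1515n / cp1518ni
]

# Check the inferred relationship: wasted = redundant copies * file size.
for wasted, copies, size in ppd_groups:
    assert wasted == copies * size

total_wasted = sum(wasted for wasted, _, _ in ppd_groups)
print(f"{total_wasted} bytes ({total_wasted / 2**20:.1f} MiB) in duplicate PPDs")
```

These are uncompressed on-disk sizes for the listed groups only; the listing is truncated, and the effect on the compressed .deb will differ.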

Till Kamppeter (till-kamppeter) wrote :

To the developers of HPLIP at HP: Please do not ship absolutely identical files more than once. For PPD files in particular, CUPS chooses the PPD only by the model, nickname and device ID entries in the PPD, not by the file name under which the PPD is saved on disk. So adding the same PPD under another file name does not add any new model to CUPS' list of supported printers. It only leads to wasted disk space and ugly duplicate entries in the lists of supported printers in printer setup tools. Please remove all these duplicate PPDs from the package. Please do not add symlinks in their place either, so that duplicate entries in printer/driver lists are avoided.
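To make the point about file names concrete, the following Python sketch reads the identifying entries from a PPD and compares two files by those entries alone. `*ModelName`, `*NickName` and `*1284DeviceID` are regular PPD keywords; the helper `ppd_identity`, the simplified parsing and the sample printer are assumptions for illustration and do not reproduce how CUPS itself matches PPDs.

```python
import re

# Matches lines such as:  *ModelName: "Some Printer"
_IDENTITY_LINE = re.compile(
    r'^\*(ModelName|NickName|1284DeviceID):\s*"([^"]*)"', re.MULTILINE
)


def ppd_identity(ppd_text: str) -> dict[str, str]:
    """Return the identifying keywords found in a PPD's text."""
    return {key: value for key, value in _IDENTITY_LINE.findall(ppd_text)}


if __name__ == "__main__":
    # Made-up PPD fragment for a fictional printer.
    sample = (
        '*PPD-Adobe: "4.3"\n'
        '*ModelName: "Example Printer 1000"\n'
        '*NickName: "Example Printer 1000 Postscript"\n'
        '*1284DeviceID: "MFG:Example;MDL:Printer 1000;"\n'
    )
    # The same content stored under two different file names.
    files = {"example-1000-ps.ppd": sample, "example-1000n-ps.ppd": sample}
    identities = {name: ppd_identity(text) for name, text in files.items()}
    same = identities["example-1000-ps.ppd"] == identities["example-1000n-ps.ppd"]
    print("identical identity despite different file names:", same)
```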

Till Kamppeter (till-kamppeter) wrote :

pitti, I have implemented this now and committed it to the Debian BZR repository of HPLIP, but unfortunately it does not reduce the size a lot. There are 783 PPD files but only 65 duplicates. So the expected reduction is less than 10%.

A general check of /usr/share/ppd with

fdupes -r /usr/share/ppd

shows that there are no other packages with duplicate PPDs.
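For readers without fdupes at hand, a comparable check can be sketched in Python by grouping files on a content hash. This is a simplified stand-in, not a reimplementation of fdupes: it relies on SHA-256 digests alone, and the function name `find_duplicates` is made up for this example.

```python
import hashlib
import os
from collections import defaultdict


def find_duplicates(root: str) -> list[list[str]]:
    """Group regular files under root that have identical content."""
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            with open(path, "rb") as handle:
                digest = hashlib.sha256(handle.read()).hexdigest()
            by_digest[digest].append(path)
    # Only digests shared by more than one file indicate duplicates.
    return [sorted(paths) for paths in by_digest.values() if len(paths) > 1]


if __name__ == "__main__":
    for group in find_duplicates("/usr/share/ppd"):
        print(" ".join(group))
```

Each printed line lists one set of identical files; an empty output corresponds to the "no duplicates" result described above.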

Suma Byrappa (suma-byrappa) wrote :

Hi Till,

These issues are already on our list. Future HPLIP releases will definitely address them. We won't be able to commit to a time frame, but they are certainly priority issues for us too.

Thanks for your support for HPLIP.
Suma

Launchpad Janitor (janitor) wrote :

This bug was fixed in the package hplip - 3.9.10-3ubuntu3

---------------
hplip (3.9.10-3ubuntu3) lucid; urgency=low

  * debian/rules: Remove identical PPD files, they only waste space and
    cause duplicate entries in printer setup tools (LP: #493282).
 -- Till Kamppeter <email address hidden> Wed, 09 Dec 2009 16:45:18 +0100

Changed in hplip (Ubuntu Lucid):
status: Triaged → Fix Released
Steve Langasek (vorlon) wrote :

14M /home/lp_archive/ubuntu/pool/main/h/hplip/hplip-data_3.9.10-3ubuntu2_all.deb
14M /home/lp_archive/ubuntu/pool/main/h/hplip/hplip-data_3.9.10-3ubuntu3_all.deb

The new upload has done nothing to change the package size.

Changed in hplip (Ubuntu Lucid):
status: Fix Released → Triaged

Till Kamppeter (till-kamppeter) wrote :

Sorry, I did not mean to close this bug with the last upload. I only added the "LP: #..." to reference this bug.

It seems that the real solution is what I mentioned in the last paragraph of comment #2. As it takes more programming effort (finding algorithms to determine which PPDs are very similar and can be replaced by one full PPD plus diffs, finding other compression strategies specific to PPDs, coding everything, ...) and would also make a nice student project, I thought about having someone do it in the next Google Summer of Code, but then it will only be available for Lucid+1. Or has this become too urgent now that the PostScript printer PPD part of the distro has reached a critical mass?

I have found a problem in the package which prevented the elimination of duplicate PPDs from working on our build servers. I have now uploaded a fixed version (hplip 3.9.10-3ubuntu4). With it, the size of the hplip-data package actually drops from 13.3 MiB to 10.7 MiB. That is not a lot, but I hope it helps already.

I also tried removing the compression of the individual PPD files (*.ppd instead of *.ppd.gz) in foomatic-db, but this does not help much. The binary packages openprinting-ppds and openprinting-ppds-extra shrink by less than 10% (o-p: 3311892 -> 3129502, o-p-e: 18184504 -> 17508048), but the installed size on uncompressed file systems (hard disk installation) gets much bigger.
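The size changes quoted in the two paragraphs above can be put on a common footing with a few lines of Python. The numbers are taken as given in the comments; the helper name `reduction_percent` is made up for this example.

```python
def reduction_percent(before: float, after: float) -> float:
    """Relative size reduction in percent."""
    return (before - after) / before * 100


# Sizes as quoted above: hplip-data in MiB, the other two as given.
sizes = {
    "hplip-data": (13.3, 10.7),
    "openprinting-ppds": (3311892, 3129502),
    "openprinting-ppds-extra": (18184504, 17508048),
}

for package, (before, after) in sizes.items():
    print(f"{package}: {reduction_percent(before, after):.1f}% smaller")
```

This comes to roughly 19.5% for hplip-data from removing duplicates, against about 5.5% and 3.7% for the two foomatic-db packages from dropping the per-file gzip compression.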

Johannes Meixner (jsmeix) wrote :

FYI:
In particular regarding PostScript printer PPDs,
see the related bug
https://bugs.launchpad.net/hplip/+bug/485218

Only a guess regarding the above comment #5
https://bugs.launchpad.net/ubuntu/+source/hplip/+bug/493282/comments/5
Perhaps HP's own setup tool hp-setup uses some
additional special magic via the PPD file name?

Martin Pitt (pitti) wrote :

The fdupes fix mitigated the situation a lot already. Instead of 5 MB, the package is now just 2.2 MB bigger than in karmic.

Changed in hplip (Ubuntu Lucid):
importance: High → Medium
Jason Scurtu (solard3ity) wrote :

Why not take out PPDs that are a few years old, perhaps 5 years?! Or take all those that HP no longer supports, pack them into a separate package named "hplip-unsupported-ppd", and provide them by auto download+install when needed. Wouldn't that work?

Martin Pitt (pitti) on 2010-01-22
summary: - hplip-data ballooned by 5MB in lucid
+ hplip-data ballooned by 2.5 MB in lucid
Changed in hplip (Ubuntu Lucid):
milestone: lucid-alpha-2 → ubuntu-10.04-beta-1
assignee: nobody → Till Kamppeter (till-kamppeter)
Steve Langasek (vorlon) wrote :

It seems the size increase is under control for Lucid and no longer causes problems for the CDs, so unmilestoning/untargeting.

I think it's still a flawed design to embed all the translations in the PPDs and think this should be revisited at a later date to see if the same benefits can be achieved without having to ship all this duplicate information in each file.

Changed in hplip (Ubuntu Lucid):
milestone: ubuntu-10.04-beta-1 → none
status: Triaged → Won't Fix
Changed in hplip (Ubuntu):
status: Triaged → In Progress

Till Kamppeter (till-kamppeter) wrote :

The student project mentioned in comment #10 has now been successfully concluded. I mentored Vitor Baptista in the Google Summer of Code 2010 for exactly this. He finished it today and I applied it to the HPLIP package (binary: hplip-data) and also to foomatic-db (binaries: openprinting-ppds and openprinting-ppds-extra) in Maverick, saving nearly 60 MB in an installed system.

Changed in hplip (Ubuntu):
status: In Progress → Fix Committed
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package hplip - 3.10.6-1ubuntu1

---------------
hplip (3.10.6-1ubuntu1) maverick; urgency=low

  * debian/local/pyppd/, debian/rules: Compressed all the physical PPD files
    for the PostScript printers of the hplip-data package into an archive file
    reducing the disk space occupation by a factor of 10, freeing several tens
    of megabytes on the Ubuntu Desktop CDs (or on any live CD based on Debian
    or a derivative distribution). The archives are self-extracting and located
    in /usr/lib/cups/driver/, so that CUPS automatically extracts the PPD
    files. Thank you very much to Vitor Baptista who developed this great PPD
    compressor in the Google Summer of Code 2010 (LP: #493282).
  * debian/hplip.postinst: Updated auto updater for the PPDs of the already
    existing print queues to work with the new PPD archive.
  * debian/control: Changed versioned conflict of hpijs-ppds with
    foomatic-filters-ppds. Now it conflicts for versions bigger than
    20000101 (real foomatic-filters-ppds packages) and not with
    foomatic-filters-ppds with a small version number (transitional packages).
 -- Till Kamppeter <email address hidden> Tue, 10 Aug 2010 23:16:18 +0200

Changed in hplip (Ubuntu):
status: Fix Committed → Fix Released