xsane PDF file sizes could be optimized
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
xsane (Ubuntu) | Confirmed | Wishlist | Unassigned |
Bug Description
I've been using xsane for years. I was very happy to see that version 0.991 in Ubuntu Edgy 6.10 has the PDF and multi-page output options enabled. These features work, but the PDF files could be smaller.
For example, when I scanned a one-page document with xsane to an intermediate format and converted it to PDF myself, both conversion routes produced a noticeably smaller file than xsane's own PDF output (approximately 8% smaller).
Example: I scanned a US Letter size document three different ways in xsane.
101769 -- test_0001.pdf -- scanned to pdf
1016760 -- test_0002.pnm -- scanned to pnm
93843 -- test_0002x.pdf -- converted using my procedure
101107 -- test_0003.ps -- scanned to ps
94081 -- test_0003x.pdf -- converted using ps2pdf
As you can see, xsane saved a single-page pdf file 101769 bytes long, but the same document scanned as a pnm or a ps file then converted produced a 93843 byte or 94081 byte file, respectively.
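The "approximately 8%" figure can be checked from the byte counts listed above. This is only a small sketch; the function name pct_smaller is made up for illustration and is not part of xsane or any other tool.

```shell
# Percentage by which a converted file is smaller than the original.
# Usage: pct_smaller ORIGINAL_BYTES CONVERTED_BYTES
# (pct_smaller is a hypothetical helper name, not an existing command.)
pct_smaller() {
    awk -v orig="$1" -v conv="$2" \
        'BEGIN { printf "%.1f\n", (orig - conv) * 100 / orig }'
}

pct_smaller 101769 93843   # pnm route:    prints 7.8
pct_smaller 101769 94081   # ps2pdf route: prints 7.6
```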
When scanning a multi-page document this difference in size adds up.
Here's the procedure I have used for years to get optimally-sized pdf files:
1) scan all pages to pnm lineart at 300 dpi (tiff also works but you must wait for xsane to convert each file)
2) convert -density 300x300 file*.pnm temp.ps
3) ps2pdf -dPDFSETTINGS=
4) rm temp.ps
This procedure depends on ImageMagick's convert command and Ghostscript's ps2pdf command. It requires specifying the density in convert and setting the page size in ps2pdf. (Without these refinements, the resulting PDF image may have the wrong density or the wrong size bounding box. I suspect these issues may depend on bugs/features of specific versions of ImageMagick and Ghostscript.)
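The steps above could be wrapped in a small script along these lines. This is a sketch under assumptions, not the exact procedure: the names run, scans_to_pdf and DRY_RUN are invented for illustration, and -sPAPERSIZE=letter is only one plausible way to set the page size for a US Letter scan. The -dPDFSETTINGS= value in step 3 is cut off in the report, so it is left out here rather than guessed.

```shell
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$*"; else "$@"; fi
}

# Sketch of the pnm -> ps -> pdf procedure.
# Usage: scans_to_pdf OUTPUT.pdf PAGE1.pnm [PAGE2.pnm ...]
# Assumption: -sPAPERSIZE=letter stands in for "setting the page size";
# the report's own ps2pdf options are truncated and not reproduced.
scans_to_pdf() {
    pdf_out="$1"; shift
    tmp_ps="${pdf_out%.pdf}.tmp.ps"
    # Step 2: combine all pages into one PostScript file at 300 dpi.
    run convert -density 300x300 "$@" "$tmp_ps" &&
    # Step 3: convert the PostScript file to PDF.
    run ps2pdf -sPAPERSIZE=letter "$tmp_ps" "$pdf_out" &&
    # Step 4: remove the temporary PostScript file.
    run rm "$tmp_ps"
}
```

Setting DRY_RUN=1 before calling scans_to_pdf doc.pdf file*.pnm prints the three commands without running them, which is a cheap way to check the arguments against the installed ImageMagick and Ghostscript versions.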
Changed in xsane (Ubuntu):
status: New → Confirmed
Thank you for taking the time to report this bug and helping to make Ubuntu better. I have reproduced this issue by using xsane to produce both a pdf (1.4MB) and ps (1.8MB) scan, and then converting the ps to a pdf (103.8KB). Upon visual inspection, it is apparent that this large difference in file size is due to image compression (there are clear visual artifacts on the low quality version), and not because of poor optimization. I am therefore marking this bug as invalid. Please don't hesitate to submit bug reports in the future. Thanks again!