PDF To PNG Or JPG Conversion

Overview

The following programs can be used under Linux to convert PDF files to JPG or PNG.

For the examples, the input file is named 'HP_HDSP-2112.pdf'. It is a datasheet for a part, containing text and a few graphics but no color.

ImageMagick

The -density option sets the resolution, in dots per inch (DPI), at which the PDF is rendered to JPG. For a letter-sized page (8.5 x 11 inches) at 300 DPI, this results in a JPG of approximately 2550 x 3300 pixels.

convert -density 300 HP_HDSP-2112.pdf HP_HDSP-2112.jpg
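
The pixel dimensions are the page size in inches multiplied by the density. As a quick check, the arithmetic can be sketched in the shell. The function name page_pixels is hypothetical, and a letter page of 8.5 x 11 inches is assumed.

```shell
# Print the approximate pixel dimensions of a page rendered at a given DPI.
# Usage: page_pixels WIDTH_INCHES HEIGHT_INCHES DPI
# (page_pixels is an illustrative helper, not part of ImageMagick.)
page_pixels() {
    awk -v w="$1" -v h="$2" -v dpi="$3" \
        'BEGIN { printf "%dx%d\n", w * dpi + 0.5, h * dpi + 0.5 }'
}

page_pixels 8.5 11 300    # letter page at 300 DPI: 2550x3300
page_pixels 8.5 11 150    # same page at half the density: 1275x1650
```

Halving the density halves each dimension, which cuts the pixel count to a quarter, so the density is the main lever for trading file size against legibility.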

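To control the pixel dimensions of the output file(s), use -resize with the desired dimensions (-size generally has no effect on PDF input). This example scales each page to fit within 1024 x 768 while preserving its aspect ratio. To force exactly 1024 x 768 regardless of aspect ratio, use the quoted geometry '1024x768!' instead.

convert HP_HDSP-2112.pdf -resize 1024x768 HP_HDSP-2112.jpg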

Note that for the examples above, if the source PDF (HP_HDSP-2112.pdf) has 16 pages, 16 individual output files are created, named HP_HDSP-2112-x.jpg, where 'x' is the page index. The index starts at 0, so the files run from HP_HDSP-2112-0.jpg to HP_HDSP-2112-15.jpg.

To produce PNG files instead of JPG, replace HP_HDSP-2112.jpg with HP_HDSP-2112.png in the commands above.
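
If the per-page files should carry one-based page numbers that also sort correctly, they can be renamed after conversion. The following is a minimal sketch, assuming the outputs are numbered from 0 without leading zeros, as ImageMagick typically names them. The function name renumber_pages and the '-pNNN' naming scheme are hypothetical choices for illustration.

```shell
# Rename zero-based outputs (BASE-0.EXT, BASE-1.EXT, ...) to one-based,
# zero-padded page names (BASE-p001.EXT, BASE-p002.EXT, ...).
# renumber_pages and the "-pNNN" naming are illustrative, not an ImageMagick feature.
renumber_pages() {
    base=$1
    ext=$2
    for f in "$base"-*."$ext"; do
        [ -e "$f" ] || continue          # glob matched nothing
        idx=${f#"$base"-}                # strip the "BASE-" prefix
        idx=${idx%."$ext"}               # strip the ".EXT" suffix
        case $idx in
            ''|*[!0-9]*) continue ;;     # skip names that are not a plain index
        esac
        mv -- "$f" "$(printf '%s-p%03d.%s' "$base" "$(($idx + 1))" "$ext")"
    done
}

# Example (run in the directory holding the converted pages):
#   renumber_pages HP_HDSP-2112 jpg
```

Files whose suffix is not a plain number are left alone, so running the function a second time does not rename the already renumbered pages.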

Ghostscript

Ghostscript substitutes fonts when it can't find a font used by the PDF file, which can cause pages to render differently than intended. I don't know why, but ImageMagick hasn't shown this problem on the pages I've tried converting. Perhaps it's better at finding fonts. This needs more research.

The '%03d' in the output file name zero-pads the page number to three digits, so the first page is named 'out-001.jpg', the tenth 'out-010.jpg', etc. This results in names that 'ls' sorts in page order. '%d' can be used instead, which names the pages without leading zeros ('out-1.jpg', 'out-10.jpg'), but 'ls' will not sort them in numerical order.

The '-dFirstPage' and '-dLastPage' options can be omitted if the full document is to be converted.

gs -sDEVICE=jpeg -dFirstPage=2 -dLastPage=11 -o out-%03d.jpg HP_HDSP-2112.pdf
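
The output pattern uses printf-style conversions, so the effect of padding on sort order can be previewed with the shell's own printf, without running Ghostscript. This is only a sketch: the helper name page_names is hypothetical, and byte-order sorting (LC_ALL=C) is assumed as a stand-in for how 'ls' orders the names.

```shell
# Generate the names a printf-style pattern yields for pages 1, 2 and 10,
# then sort them in plain byte order.
# page_names is an illustrative helper, not a Ghostscript feature.
page_names() {
    for page in 1 2 10; do
        printf "$1\n" "$page"
    done | LC_ALL=C sort
}

page_names 'out-%03d.jpg'   # out-001.jpg, out-002.jpg, out-010.jpg
page_names 'out-%d.jpg'     # out-1.jpg, out-10.jpg, out-2.jpg
```

With '%d', page 10 sorts ahead of page 2 because the names are compared character by character. Zero padding avoids this as long as the page count fits in the padded width.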

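To specify the output size of the image in pixels, use the '-g' argument, as shown below. The geometry is given as width x height, so 768x1024 describes a portrait page, the reverse of the 1024x768 used in the ImageMagick example. Note that '-g' sets the size of the output raster. If the page content should also be scaled to fill it, the '-dPDFFitPage' option may be needed.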

gs -sDEVICE=jpeg -g768x1024 -dFirstPage=1 -dLastPage=1 -o out-%d.jpg HP_HDSP-2112.pdf

This produces the following output, which shows the font substitution:

GPL Ghostscript 8.63 (2008-08-01)
Copyright (C) 2008 Artifex Software, Inc.  All rights reserved.
This software comes with NO WARRANTY: see the file PUBLIC for details.
Processing pages 1 through 1.
Page 1
Substituting font Helvetica-Narrow-Bold for AgilentCond-Bold.
Loading NimbusSanL-BoldCond font from /usr/share/fonts/default/ghostscript/n019044l.pfb... 2579384 1094648 11389452 10100772 3 done.
Substituting font Helvetica-Narrow for AgilentCond-Regular.
Loading NimbusSanL-ReguCond font from /usr/share/fonts/default/ghostscript/n019043l.pfb... 2683852 1283689 16460520 15160316 3 done.
Substituting font Helvetica-Narrow-Bold for Myriad-CnBold.
Substituting font Helvetica-Narrow for Myriad-Condensed.

Poppler

http://poppler.freedesktop.org/

Adobe SDK

http://www.adobe.com/devnet/pdf/library/