wkhtmltopdf 0.9.6 Manual
This file documents wkhtmltopdf, a program capable of converting html documents into PDF documents.
Contact
If you experience bugs or want to request new features, please visit http://code.google.com/p/wkhtmltopdf/issues/list. If you have any problems or comments, please feel free to contact me: see http://www.madalgo.au.dk/~jakobt/#about
Reduced Functionality
Some versions of wkhtmltopdf are compiled against a version of QT without the wkhtmltopdf patches. These versions are missing some features. You can find out whether your version of wkhtmltopdf is one of these by running wkhtmltopdf --version. If your version is built against an unpatched QT, you can use the static version to get all functionality.
Currently the list of features only supported with patched QT includes:
Printing more than one HTML document into a PDF file.
Running without an X11 server.
Adding a document outline to the PDF file.
Adding headers and footers to the PDF file.
Generating a table of contents.
Adding links in the generated PDF file.
Printing using the screen media-type.
Disabling the smart shrink feature of webkit.
License
Copyright (C) 2008,2009 Wkhtmltopdf Authors.
License GPLv3+: GNU GPL version 3 or later http://gnu.org/licenses/gpl.html. This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law.
Authors
Written by Jakob Truelsen. Patches by Mário Silva, Benoit Garret and Emmanuel Bouthenot.
Synopsis
wkhtmltopdf [OPTIONS]... <input file> [More input files] <output file>
General Options
    --allow                       Allow the file or files from the specified folder to be loaded (repeatable)
-b, --book*                       Set the options one would usually set when printing a book
    --collate                     Collate when printing multiple copies
    --cookie                      Set an additional cookie (repeatable)
    --cookie-jar                  Read and write cookies from and to the supplied cookie jar file
    --copies                      Number of copies to print into the pdf file (default 1)
    --cover*                      Use html document as cover. It will be inserted before the toc with no headers and footers
    --custom-header               Set an additional HTTP header (repeatable)
    --debug-javascript            Show javascript debugging output
-H, --default-header*             Add a default header, with the name of the page to the left, and the page number to the right, this is short for: --header-left='[webpage]' --header-right='[page]/[toPage]' --top 2cm --header-line
    --disable-external-links*     Do not make links to remote web pages
    --disable-internal-links*     Do not make local links
-n, --disable-javascript          Do not allow web pages to run javascript
    --disable-pdf-compression*    Do not use lossless compression on pdf objects
    --disable-smart-shrinking*    Disable the intelligent shrinking strategy used by WebKit that makes the pixel/dpi ratio non-constant
    --disallow-local-file-access  Do not allow conversion of a local file to read in other local files, unless explicitly allowed with --allow
-d, --dpi                         Change the dpi explicitly (this has no effect on X11 based systems)
    --enable-plugins              Enable installed plugins (such as flash)
    --encoding                    Set the default text encoding, for input
    --extended-help               Display more extensive help, detailing less common command switches
    --forms*                      Turn HTML form fields into pdf form fields
-g, --grayscale                   PDF will be generated in grayscale
-h, --help                        Display help
    --htmldoc                     Output program html help
    --ignore-load-errors          Ignore pages that claim to have encountered an error during loading
-l, --lowquality                  Generates lower quality pdf/ps. Useful to shrink the result document space
    --manpage                     Output program man page
-B, --margin-bottom               Set the page bottom margin (default 10mm)
-L, --margin-left                 Set the page left margin (default 10mm)
-R, --margin-right                Set the page right margin (default 10mm)
-T, --margin-top                  Set the page top margin (default 10mm)
    --minimum-font-size           Minimum font size (default 5)
    --no-background               Do not print background
-O, --orientation                 Set orientation to Landscape or Portrait
    --page-height                 Page height (default unit millimeter)
    --page-offset*                Set the starting page number (default 1)
-s, --page-size                   Set paper size to: A4, Letter, etc.
    --page-width                  Page width (default unit millimeter)
    --password                    HTTP Authentication password
    --post                        Add an additional post field (repeatable)
    --post-file                   Post an additional file (repeatable)
    --print-media-type*           Use print media-type instead of screen
-p, --proxy                       Use a proxy
-q, --quiet                       Be less verbose
    --read-args-from-stdin        Read command line arguments from stdin
    --readme                      Output program readme
    --redirect-delay              Wait some milliseconds for js-redirects (default 200)
    --replace*                    Replace [name] with value in header and footer (repeatable)
    --stop-slow-scripts           Stop slow running javascripts
    --title                       The title of the generated pdf file (the title of the first document is used if not specified)
-t, --toc*                        Insert a table of contents at the beginning of the document
    --use-xserver*                Use the X server (some plugins and other stuff might not work without X11)
    --user-style-sheet            Specify a user style sheet, to load with every page
    --username                    HTTP Authentication username
-V, --version                     Output version information and exit
    --zoom                        Use this zoom factor (default 1)
Items marked * are only available using patched QT.
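As a rough illustration of how several of the general options combine, the sketch below wraps one invocation in a shell function. The function name convert_landscape_gray and the 20mm margins are made up for this example; the switches themselves are taken from the table above.

```shell
# Hypothetical helper: landscape orientation, grayscale output,
# and 20mm top and bottom margins instead of the 10mm default.
# $1 is the input document, $2 is the output file.
convert_landscape_gray() {
    wkhtmltopdf -O Landscape -g -T 20mm -B 20mm "$1" "$2"
}
```

It could then be called as: convert_landscape_gray my.html my.pdf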
Headers And Footer Options
    --footer-center*              Centered footer text
    --footer-font-name*           Set footer font name (default Arial)
    --footer-font-size*           Set footer font size (default 11)
    --footer-html*                Adds a html footer
    --footer-left*                Left aligned footer text
    --footer-line*                Display line above the footer
    --footer-right*               Right aligned footer text
    --footer-spacing*             Spacing between footer and content in mm (default 0)
    --header-center*              Centered header text
    --header-font-name*           Set header font name (default Arial)
    --header-font-size*           Set header font size (default 11)
    --header-html*                Adds a html header
    --header-left*                Left aligned header text
    --header-line*                Display line below the header
    --header-right*               Right aligned header text
    --header-spacing*             Spacing between header and content in mm (default 0)
Items marked * are only available using patched QT.
Table Of Content Options
    --toc-depth*                  Set the depth of the toc (default 3)
    --toc-disable-back-links*     Do not link from section header to toc
    --toc-disable-links*          Do not link from toc to sections
    --toc-font-name*              Set the font used for the toc (default Arial)
    --toc-header-font-name*       The font of the toc header (if unset use --toc-font-name)
    --toc-header-font-size*       The font size of the toc header (default 15)
    --toc-header-text*            The header text of the toc (default Table Of Contents)
    --toc-l1-font-size*           Set the font size on level 1 of the toc (default 12)
    --toc-l1-indentation*         Set indentation on level 1 of the toc (default 0)
    --toc-l2-font-size*           Set the font size on level 2 of the toc (default 10)
    --toc-l2-indentation*         Set indentation on level 2 of the toc (default 20)
    --toc-l3-font-size*           Set the font size on level 3 of the toc (default 8)
    --toc-l3-indentation*         Set indentation on level 3 of the toc (default 40)
    --toc-l4-font-size*           Set the font size on level 4 of the toc (default 6)
    --toc-l4-indentation*         Set indentation on level 4 of the toc (default 60)
    --toc-l5-font-size*           Set the font size on level 5 of the toc (default 4)
    --toc-l5-indentation*         Set indentation on level 5 of the toc (default 80)
    --toc-l6-font-size*           Set the font size on level 6 of the toc (default 2)
    --toc-l6-indentation*         Set indentation on level 6 of the toc (default 100)
    --toc-l7-font-size*           Set the font size on level 7 of the toc (default 0)
    --toc-l7-indentation*         Set indentation on level 7 of the toc (default 120)
    --toc-no-dots*                Do not use dots in the toc
Items marked * are only available using patched QT.
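A table of contents could be requested along the following lines. This is only a sketch: the function name convert_with_toc, the depth of 2 and the header text "Contents" are arbitrary choices for illustration, and the switches require a wkhtmltopdf built against patched QT.

```shell
# Hypothetical helper: insert a table of contents limited to two levels,
# with a custom header text. $1 is the input document, $2 the output file.
convert_with_toc() {
    wkhtmltopdf --toc --toc-depth 2 --toc-header-text "Contents" "$1" "$2"
}
```

It could then be called as: convert_with_toc my.html my.pdf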
Outline Options
    --dump-outline*               Dump the outline to a file
    --outline*                    Put an outline into the pdf
    --outline-depth*              Set the depth of the outline (default 4)
Items marked * are only available using patched QT.
Specifying A Proxy
By default, proxy information is read from the environment variables proxy, all_proxy and http_proxy. Proxy options can also be specified with the -p switch.
<type> := "http://" | "socks5://"
<userinfo> := <username> (":" <password>)? "@"
<proxy> := "None" | <type>? <userinfo>? <host> (":" <port>)?
Here are some examples (In case you are unfamiliar with the BNF):
http://user:password@myproxyserver:8080
socks5://myproxyserver
None
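The two ways of supplying a proxy described above might be used as in the following sketch. The function names are invented for this example, and the proxy addresses are the placeholder ones from the examples, not real servers.

```shell
# Hypothetical helper: pass the proxy explicitly with the -p switch.
# $1 is the proxy specification, $2 the input URL, $3 the output file.
convert_via_proxy() {
    wkhtmltopdf -p "$1" "$2" "$3"
}

# Hypothetical helper: let wkhtmltopdf pick the proxy up from the
# http_proxy environment variable for this one invocation.
convert_via_env_proxy() {
    http_proxy="$1" wkhtmltopdf "$2" "$3"
}
```

For example: convert_via_proxy socks5://myproxyserver http://www.google.com google.pdf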
Footers And Headers
Headers and footers can be added to the document with the --header-* and --footer-* arguments respectively. In header and footer text strings supplied to e.g. --header-left, the following variables will be substituted.
* [page] Replaced by the number of the page currently being printed
* [frompage] Replaced by the number of the first page to be printed
* [topage] Replaced by the number of the last page to be printed
* [webpage] Replaced by the URL of the page being printed
* [section] Replaced by the name of the current section
* [subsection] Replaced by the name of the current subsection
* [date] Replaced by the current date in system local format
* [time] Replaced by the current time in system local format
As an example, specifying --header-right "Page [page] of [toPage]" will result in the text "Page x of y", where x is the number of the current page and y is the number of the last page, appearing in the upper right corner of the document.
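Written out as a complete command, the header text example could look like the sketch below. The function name convert_with_page_numbers is hypothetical; the header string must be quoted so that the shell passes it as a single argument.

```shell
# Hypothetical helper: print "Page x of y" in the right part of the header.
# $1 is the input document, $2 the output file.
convert_with_page_numbers() {
    wkhtmltopdf --header-right "Page [page] of [toPage]" "$1" "$2"
}
```

For example: convert_with_page_numbers my.html my.pdf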
Headers and footers can also be supplied with HTML documents. As an example one could specify --header-html header.html, and use the following content in header.html:
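The snippet below is a minimal sketch of such a file, not necessarily the exact content intended by the manual. It assumes that the variables are passed in the query string of the header URL under the same names as the substitution variables listed above (page, topage, section). The function name with_html_header is hypothetical.

```shell
# Write an illustrative header.html. The query parameter names used here
# (page, topage, section) are an assumption based on the variable list above.
cat > header.html <<'EOF'
<html><head><script>
function subst() {
  // Parse "?name=value&name=value" from the URL of this header document.
  var vars = {};
  var pairs = document.location.search.substring(1).split('&');
  for (var i = 0; i < pairs.length; ++i) {
    var kv = pairs[i].split('=', 2);
    vars[kv[0]] = decodeURIComponent(kv[1] || '');
  }
  // Copy each value into the elements whose class is the variable name.
  var names = ['page', 'topage', 'section'];
  for (var n = 0; n < names.length; ++n) {
    var nodes = document.getElementsByClassName(names[n]);
    for (var j = 0; j < nodes.length; ++j) {
      nodes[j].textContent = vars[names[n]] || '';
    }
  }
}
</script></head>
<body style="border:0; margin:0;" onload="subst()">
<table style="border-bottom: 1px solid black; width: 100%">
  <tr>
    <td class="section"></td>
    <td style="text-align:right">Page <span class="page"></span> of <span class="topage"></span></td>
  </tr>
</table>
</body></html>
EOF

# Hypothetical helper: use the file written above as the page header.
# $1 is the input document, $2 the output file.
with_html_header() {
    wkhtmltopdf --header-html header.html "$1" "$2"
}
```

For example: with_html_header my.html my.pdf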
As can be seen from the example, the arguments are sent to the header/footer HTML documents in GET fashion, that is, as parameters in the query string of the URL.
Outlines
Wkhtmltopdf with patched QT has support for PDF outlines, also known as bookmarks. This can be enabled by specifying the --outline switch. The outlines are generated based on the HTML heading tags; for an in-depth description of how this is done see the "Table Of Contents" section.
The outline tree can sometimes be very deep if the heading tags were spread too generously in the HTML document. The --outline-depth switch can be used to bound this.
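The two outline switches can be combined as in this sketch. The function name convert_with_outline is made up for the example, and the switches only take effect with patched QT.

```shell
# Hypothetical helper: add an outline to the PDF, bounded to a given depth.
# $1 is the maximum outline depth, $2 the input document, $3 the output file.
convert_with_outline() {
    wkhtmltopdf --outline --outline-depth "$1" "$2" "$3"
}
```

For example, to keep only the two topmost levels: convert_with_outline 2 my.html my.pdf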
Page Breaking
The current page breaking algorithm of WebKit leaves much to be desired. Basically, WebKit renders everything into one long page and then cuts it up into pages. This means that if you have two columns of text where one is vertically shifted by half a line, WebKit will cut a line into two pieces, displaying the top half on one page and the bottom half on another. It will also break images in two, and so on. If you are using the patched version of QT, you can use the CSS page-break-inside property to remedy this somewhat. There is no easy solution to this problem; until it is solved, try organising your HTML documents so that they contain many lines on which pages can be cut cleanly.
See also: http://code.google.com/p/wkhtmltopdf/issues/detail?id=9, http://code.google.com/p/wkhtmltopdf/issues/detail?id=33 and http://code.google.com/p/wkhtmltopdf/issues/detail?id=57.
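As an illustration of the page-break-inside remedy mentioned above, the sketch below writes a small test document and converts it. The file name report.html, the class name keep-together and the function name convert_report are all invented for this example, and the property is only expected to help with patched QT.

```shell
# Write an illustrative document in which each block marked
# "keep-together" should not be split across two pages.
cat > report.html <<'EOF'
<html><head><style>
  /* Ask the renderer to avoid a page break inside these blocks. */
  .keep-together { page-break-inside: avoid; }
</style></head>
<body>
  <div class="keep-together">
    <h2>First block</h2>
    <p>Text that should stay on one page together with its heading.</p>
  </div>
  <div class="keep-together">
    <h2>Second block</h2>
    <p>More text that should not be cut in half.</p>
  </div>
</body></html>
EOF

# Hypothetical helper: convert the document written above.
# $1 is the output file.
convert_report() {
    wkhtmltopdf report.html "$1"
}
```

For example: convert_report report.pdf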
Page sizes
The default page size of the rendered document is A4, but using the --page-size option this can be changed to almost anything else, such as A3, Letter and Legal. For a full list of supported page sizes please see http://doc.trolltech.com/4.6/qprinter.html#PageSize-enum.
For more fine-grained control over the page size, the --page-height and --page-width options may be used.
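Both ways of choosing a page size are sketched below. The function names are hypothetical, and the width and height are given as plain numbers on the assumption that they are then read in the default unit (millimeter) stated in the option list.

```shell
# Hypothetical helper: use one of the named paper sizes.
# $1 is the size name (e.g. Letter), $2 the input document, $3 the output file.
convert_named_size() {
    wkhtmltopdf --page-size "$1" "$2" "$3"
}

# Hypothetical helper: use an explicit page width and height.
# $1 is the width, $2 the height, $3 the input document, $4 the output file.
convert_custom_size() {
    wkhtmltopdf --page-width "$1" --page-height "$2" "$3" "$4"
}
```

For example: convert_named_size Letter my.html my.pdf, or convert_custom_size 100 150 my.html label.pdf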
Reading arguments from stdin
If you need to convert a lot of pages in a batch, and you feel that wkhtmltopdf is a bit too slow to start up, then you should try --read-args-from-stdin.
When --read-args-from-stdin is given, each line of input sent to wkhtmltopdf on stdin will act as a separate invocation of wkhtmltopdf, with the arguments specified on the given line combined with the arguments given to wkhtmltopdf.
For example one could do the following:
echo "http://doc.trolltech.com/4.5/qapplication.html qapplication.pdf" >> cmds
echo "--cover google.com http://en.wikipedia.org/wiki/Qt_(toolkit) qt.pdf" >> cmds
wkhtmltopdf --read-args-from-stdin --book < cmds
Static version
On the wkhtmltopdf website you can download a static version of wkhtmltopdf: http://code.google.com/p/wkhtmltopdf/downloads/list. This static binary will work on most systems and comes with a built-in patched QT.
Unfortunately the static binary is not particularly static. On Linux it depends on both glibc and openssl; furthermore, you will need to have an X server installed, but not necessarily running. You will also need to have various fonts installed, including xfonts-scalable (Type1) and msttcorefonts. See http://code.google.com/p/wkhtmltopdf/wiki/static for troubleshooting.
Compilation
It can happen that the static binary does not work on your system for one reason or another. In that case you might need to compile wkhtmltopdf yourself.
GNU/Linux:
Before compilation you will need to install dependencies: X11, gcc, git and openssl. On Debian/Ubuntu this can be done as follows:
sudo apt-get build-dep libqt4-gui libqt4-network libqt4-webkit
sudo apt-get install openssl build-essential xorg git-core git-doc libssl-dev
On other systems you must use your own package manager; the packages might be named differently.
First you must check out the modified version of QT:
git clone git://gitorious.org/+wkhtml2pdf/qt/wkhtmltopdf-qt.git wkhtmltopdf-qt
Next you must configure, compile and install QT. Note that this will take quite some time, depending on what arguments you use to configure QT:
cd wkhtmltopdf-qt
./configure -nomake tools,examples,demos,docs,translations -opensource -prefix ../wkqt
make -j3
make install
cd ..
All that is needed now is to compile wkhtmltopdf:
git clone git://github.com/antialize/wkhtmltopdf.git wkhtmltopdf
cd wkhtmltopdf
../wkqt/bin/qmake
make -j3
You should now have a binary called wkhtmltopdf in the current folder that you can use. You can optionally install it by running:
make install
Other operating systems and advanced features
If you want more details or want to compile under operating systems other than GNU/Linux, please see http://code.google.com/p/wkhtmltopdf/wiki/compilation.
Installation
There are several ways to install wkhtmltopdf. You can download an already compiled binary, or you can compile wkhtmltopdf yourself. On Windows the easiest way to install wkhtmltopdf is to download the latest installer. On Linux you can download the latest static binary; however, you still need to install some other pieces of software. To learn more about this, read the static version section of the manual.
Examples
This section presents a number of examples of how to invoke wkhtmltopdf.
To convert a remote HTML file to PDF:
wkhtmltopdf http://www.google.com google.pdf
To convert a local HTML file to PDF:
wkhtmltopdf my.html my.pdf
You can also convert to PS files if you like:
wkhtmltopdf my.html my.ps
Produce the eler2.pdf sample file:
wkhtmltopdf http://geekz.co.uk/lovesraymond/archive/eler-highlights-2008 eler2.pdf -H --outline