HTML to PDF converter for Java™

PD4ML Reference Manual

(updated June 12, 2007)

The PD4ML API reference and the PD4ML taglib reference are provided as separate documents.

Table of contents

1. PD4ML Distribution
2. Running PD4ML Converter as a standalone application
3. PD4ML Java API Installation
4. PD4ML JSP taglib and PD4ML Web Installation
5. Adding PD4ML API calls to your Java application
    #1. Converting HTML addressed by URL to PDF
    #2. Converting HTML obtained from input stream to PDF
    #3. Generating PDF documents with header and/or footer
    #4. Protecting PDF document
    #5. Converting HTML headings or named anchors to PDF bookmarks
    #6. Inserting page breaks
    #7. Example: generating and sending PDF by email
6. Web scenarios
    #1. Using PD4ML custom tags in JSP
    #2. Using PD4ML custom tags with ColdFusion
    #3: Using PD4ML custom tags with Struts or with any other
        J2EE UI frameworks, if JSP taglib integration is problematic

7. Using professional version features. TTF embedding
    7.1 TTF embedding
        #1. System requirements
        #2. Configuring fonts directory
        #3: Embedding fonts to PDF from Java API
        #4: Embedding fonts to PDF from JSP
        #5: Known issues
    7.2 PDF headers and footers in HTML format
        #1. Defining header/footer from Java API
        #2. Defining header/footer in JSP
    7.3 Watermark images
        #1. Specifying watermark image from Java API
        #2. Specifying watermark image in JSP
    7.4 Table of contents generation
8. General notes

1. PD4ML Distribution

PD4ML is available in two variants:

  • PD4ML Standard - offers the basic Java API for HTML-to-PDF conversion and includes the PD4ML JSP tag library.
  • PD4ML Professional - adds further features to PD4ML Standard (see section 7).

PD4ML Standard contains the PD4ML library pd4ml.jar (or pd4ml_demo.jar) in the lib/ directory, the PD4ML JSP custom tag library (pd4ml_tl.jar or pd4ml_tl_demo.jar), the library description (pd4ml.tld), the PD4ML Browser/Converter (part of the main PD4ML library) and the relevant documentation.
The PD4ML Pro package is similar in content to PD4ML Web, but includes a more fully featured version of pd4ml.jar/pd4ml_demo.jar.

Both versions rely on the open source CSS Parser library, available from the original project site (http://cssparser.sourceforge.net/) as well as from the PD4ML Software download area. We recommend using the CSS Parser version patched by our team (http://pd4ml.com/cssparser-0.9.4.patched.2007.zip), which supports underscores in CSS properties, resolves a number of minor bugs and provides more informative error messages.

2. Running PD4ML Converter as a standalone application

To run PD4ML Converter as a standalone application, no special installation procedure is necessary. Simply copy the PD4ML library and the CSS parser (ss_css2.jar) to your working directory and make sure that your JRE environment (JAVA_HOME, CLASSPATH) is properly configured.

Run an evaluation copy of PD4ML as a GUI application:

D:\tools>java -jar pd4ml_demo.jar <params>

Note. Older versions require the following syntax:
D:\tools>java -Xbootclasspath/a:ss_css2.jar -jar pd4ml_demo.jar <params>
or
D:\tools>java -cp pd4ml_demo.jar org.zefer.pd4ml.tools.PD4Browser

Run a commercial copy of PD4ML as a GUI application:
D:\tools>java -jar pd4ml.jar <params>

Note. Older versions require the following syntax:
D:\tools>java -Xbootclasspath/a:ss_css2.jar -jar pd4ml.jar <params>
or
D:\tools>java -cp pd4ml.jar org.zefer.pd4ml.tools.PD4Browser

Note: The application automatically creates and updates the pd4browser.properties file, which holds the options specified in its options dialog.

Running PD4ML as a command line converter tool

To start the application in command-line mode, simply append two parameters (the source URL and the output file name) to the command used above.

D:\tools>java -jar pd4ml.jar http://localhost:80/ test.pdf
or
D:\tools>java -jar pd4ml.jar file:d:/test.html test.pdf
Note: HTML-to-PDF conversion parameters cannot be passed on the command line. The application takes them from pd4browser.properties (if it exists), which the application creates automatically.

3. PD4ML Java API Installation

Installation of the PD4ML API is straightforward: the downloaded pd4ml.jar (or pd4ml_demo.jar in the case of the evaluation version) and ss_css2.jar should be added to the classpath of your project.

PD4ML is intended to be used with JDK1.3.1 and above.
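
As a quick sanity check of the classpath setup described above, a small probe can try to load the main PD4ML class by name. This is only a sketch: the class name org.zefer.pd4ml.PD4ML is taken from the examples in this manual, while the ClasspathCheck class and its isOnClasspath method are hypothetical helpers and not part of PD4ML.

```java
class ClasspathCheck {

    // Returns true if the named class can be located on the current classpath.
    // The class is looked up without being initialized.
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Class name as used in the import statements of this manual.
        System.out.println("PD4ML found: " + isOnClasspath("org.zefer.pd4ml.PD4ML"));
    }
}
```

If the probe prints false, pd4ml.jar (or pd4ml_demo.jar) is most likely missing from the classpath.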

4. PD4ML JSP taglib and PD4ML Web Installation

As a starting point for developing your PDF-enabled Web application, you can use the example Web application supplied with the PD4ML Pro and PD4ML Web distributions.

Copy the taglib/ directory and all its contents into your working directory.

Then copy pd4ml.jar (or pd4ml_demo.jar) to the WEB-INF/lib directory of the webapp. Make sure that pd4ml_tl.jar (or pd4ml_tl_demo.jar) and ss_css2.jar are already there.

Now deploy the application to your JSP runtime engine. The procedure usually differs between engines from different vendors. Check the documentation for your version.

The following are possible steps to deploy the application to a Tomcat 4.1.30 server.

  • Create a pd4ml.xml file in [tomcat_root_dir]/webapps/ with content like the following:

<Context path="/pd4ml" docBase="D:/tools/web" debug="0" privileged="true" reloadable="true">

  <ResourceLink name="users" global="UserDatabase" type="org.apache.catalina.UserDatabase" />

</Context>

  • Change the docBase attribute to match your web application location
  • Restart Tomcat
  • Access the application with a URL like http://localhost:8080/pd4ml/ (change host and port to match your Tomcat installation). The taglib/index.jsp page is activated by default and an example PDF should be generated. See the "Using PD4ML custom tags in JSP" section of this document for more info.

5. Adding PD4ML API calls to your Java application

Converting HTML addressed by URL to PDF

   import java.awt.Dimension;
   import java.awt.Insets;
   import java.io.File;
   import java.io.IOException;

1  import org.zefer.pd4ml.PD4ML;
   import org.zefer.pd4ml.PD4Constants;

   ...

2  protected Dimension format = PD4Constants.A4;
   protected boolean landscapeValue = false;
   protected int topValue = 10;
   protected int leftValue = 10;
   protected int rightValue = 10;
   protected int bottomValue = 10;
   protected String unitsValue = "mm";
   protected String proxyHost = "";
   protected int proxyPort = 0;

3  protected int userSpaceWidth = 780;

   ...

   private void runConverter(String urlstring, File output) throws IOException {

       if (urlstring.length() > 0) {
           // assume HTTP if no supported protocol prefix is given
           if (!urlstring.startsWith("http://") && !urlstring.startsWith("file:")) {
               urlstring = "http://" + urlstring;
           }

4          java.io.FileOutputStream fos = new java.io.FileOutputStream(output);

5          if ( proxyHost != null && proxyHost.length() != 0 && proxyPort != 0 ) {
               System.getProperties().setProperty("proxySet", "true");
               System.getProperties().setProperty("proxyHost", proxyHost);
               System.getProperties().setProperty("proxyPort", "" + proxyPort);
           }

6          PD4ML pd4ml = new PD4ML();

7          try {
               pd4ml.setPageSize( landscapeValue ? pd4ml.changePageOrientation( format ): format );
           } catch (Exception e) {
               e.printStackTrace();
           }

           if ( unitsValue.equals("mm") ) {
               pd4ml.setPageInsetsMM( new Insets(topValue, leftValue, bottomValue, rightValue) );
           } else {
               pd4ml.setPageInsets( new Insets(topValue, leftValue, bottomValue, rightValue) );
           }

           pd4ml.setHtmlWidth( userSpaceWidth );

8          pd4ml.render( urlstring, fos );

           // release the output file handle
           fos.close();
       }
   }

   ...


Comments:
1. Import the PD4ML converter class.
2. Define HTML-to-PDF conversion parameter values if needed. See the API reference for more info.
3. Specify the user space width. It is analogous to the horizontal size of a Web browser window. As with a browser, changing this size can affect the rendering of the HTML document: the arrangement of HTML elements, the vertical size etc. See the API reference for more info.
4. Prepare the output stream for PDF generation.
5. Specify proxy settings if the source HTML document is behind a firewall.
6. Instantiate the PD4ML converter.
7. Pass the HTML-to-PDF conversion parameters to it.
8. Perform the HTML-to-PDF translation. Note: using a URL is not mandatory. PD4ML can read the source HTML from an input stream. See the API reference for more info.
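
The URL check at the top of runConverter() can be factored into a small helper. The following is a minimal sketch using only the standard library; the UrlNormalizer class is hypothetical, and accepting the "https://" prefix is an assumption made for illustration (the listing above only tests for "http://" and "file:").

```java
import java.util.Locale;

class UrlNormalizer {

    // Prefixes the input with "http://" unless it already names a protocol
    // that this sketch recognizes.
    static String normalize(String url) {
        String trimmed = url.trim();
        if (trimmed.isEmpty()) {
            throw new IllegalArgumentException("empty URL");
        }
        String lower = trimmed.toLowerCase(Locale.ROOT);
        if (lower.startsWith("http://")
                || lower.startsWith("https://")   // assumption, not in the original listing
                || lower.startsWith("file:")) {
            return trimmed;
        }
        return "http://" + trimmed;
    }

    public static void main(String[] args) {
        System.out.println(normalize("localhost:8080/report.jsp"));
        System.out.println(normalize("file:d:/test.html"));
    }
}
```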

Converting HTML obtained from input stream to PDF

File f = new File("D:/tools/test.pdf");
java.io.FileOutputStream fos = new java.io.FileOutputStream(f);
OutputStream sos = System.out;

File fz = new File("D:/tools/yahoo.htm");
java.io.FileInputStream fis = new java.io.FileInputStream(fz);
InputStreamReader isr = new InputStreamReader( fis, "UTF-8" );

PD4ML html = new PD4ML();
html.setPageSize( new Dimension(450, 450) );
html.setPageInsets( new Insets(20, 50, 10, 10) );
html.setHtmlWidth( 750 );
html.enableImgSplit( false );

URL base = new URL( "file:D:/tools/" );

// alternatively base can be specified with <base href="..."> tag
html.render( isr, fos, base );
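
The base URL passed to render() tells the converter where to resolve relative references such as images. Instead of hard-coding it, the base can be derived from the location of the source file. The sketch below relies only on the standard library; the BaseUrlDemo class and its baseUrlOf method are hypothetical helpers, not part of PD4ML.

```java
import java.io.File;
import java.net.MalformedURLException;
import java.net.URL;

class BaseUrlDemo {

    // Returns the "file:" URL of the directory that contains the given HTML file.
    static URL baseUrlOf(File htmlFile) {
        File dir = htmlFile.getAbsoluteFile().getParentFile();
        try {
            return dir.toURI().toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException("cannot build base URL for " + dir, e);
        }
    }

    public static void main(String[] args) {
        URL base = baseUrlOf(new File("yahoo.htm"));
        // The result could then be passed as the third argument of render().
        System.out.println(base);
    }
}
```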

Generating PDF documents with text-only header and/or footer

 

PD4PageMark header = new PD4PageMark();
header.setAreaHeight( 20 );
header.setTitleTemplate( "title: $[title]" );
header.setTitleAlignment( PD4PageMark.CENTER_ALIGN );
header.setPageNumberAlignment( PD4PageMark.LEFT_ALIGN );
header.setPageNumberTemplate( "#$[page]" );

PD4PageMark footer = new PD4PageMark();
footer.setAreaHeight( 30 );
footer.setFontSize( 20 );
footer.setColor( Color.red );
footer.setPagesToSkip( 1 );
footer.setTitleTemplate( "[ $[title] ]" );
footer.setPageNumberTemplate( "page: $[page]" );
footer.setTitleAlignment( PD4PageMark.RIGHT_ALIGN );
footer.setPageNumberAlignment( PD4PageMark.LEFT_ALIGN );

pd4ml.setPageHeader( header );
pd4ml.setPageFooter( footer );

 

See PD4ML API reference for the PD4PageMark methods description.  
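
To make the effect of the $[title] and $[page] placeholders concrete, the following sketch imitates the substitution with plain string replacement. It is an illustration only and not PD4ML's implementation; the PageMarkTemplateDemo class and its expand method are hypothetical.

```java
class PageMarkTemplateDemo {

    // Replaces the $[title] and $[page] placeholders with concrete values.
    static String expand(String template, String title, int page) {
        return template
                .replace("$[title]", title)
                .replace("$[page]", String.valueOf(page));
    }

    public static void main(String[] args) {
        // Templates taken from the header/footer listing above.
        System.out.println(expand("title: $[title]", "Monthly report", 1));
        System.out.println(expand("page: $[page]", "Monthly report", 2));
    }
}
```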

Protecting PDF documents

A PDF document can be encrypted to protect its contents from unauthorized access. PD4ML supports the PDF access permissions concept and allows a password to be specified for a document.

If any passwords or access restrictions are specified with PD4ML.setPermissions(), the document is encrypted, and the permissions and the information required to validate the passwords are stored in the resulting document.

If a user attempts to open an encrypted document that has a password, the viewer application should prompt for a password. Correctly supplying either password allows the user to open the document, decrypt it, and display it on the screen.

If the document is encrypted with a password set to "empty", no password is requested; the viewer application can simply open, decrypt, and display the document. Whether additional operations are allowed on a decrypted document depends on any access restrictions that were specified when the document was created.

The possible restrictions:

  • Modifying the document’s contents
  • Copying or otherwise extracting text and graphics from the document
  • Adding or modifying text annotations
  • Printing the document
See PD4ML API reference (PD4Constants.Allow*) for others.

The PDF document produced by PD4ML can be protected with 40-bit or 128-bit encryption.

...

String password = "empty";
boolean strongEncryption = true;
int permissions = PD4Constants.AllowPrint | PD4Constants.AllowCopy;

pd4ml.setPermissions( password, permissions, strongEncryption );

...

See PD4ML API reference for PD4ML methods description.  
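
The permissions argument is a bit mask: individual permissions are combined with the bitwise OR operator, as in the snippet above. The sketch below shows the mechanics with stand-in constants. Their numeric values are hypothetical and chosen for illustration only; the actual PD4Constants.Allow* values should be taken from the API reference.

```java
class PermissionFlagsDemo {

    // Hypothetical flag values for illustration; not the PD4Constants values.
    static final int ALLOW_PRINT    = 1 << 2;
    static final int ALLOW_MODIFY   = 1 << 3;
    static final int ALLOW_COPY     = 1 << 4;
    static final int ALLOW_ANNOTATE = 1 << 5;

    // Tests whether a combined permission mask contains the given flag.
    static boolean isAllowed(int permissions, int flag) {
        return (permissions & flag) == flag;
    }

    public static void main(String[] args) {
        int permissions = ALLOW_PRINT | ALLOW_COPY;
        System.out.println("print:  " + isAllowed(permissions, ALLOW_PRINT));
        System.out.println("modify: " + isAllowed(permissions, ALLOW_MODIFY));
    }
}
```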

Converting HTML headings or named anchors to PDF bookmarks

PD4ML supports two different methods to generate PDF bookmarks (also known as outlines):

  1. Converting the HTML headings structure to a corresponding bookmark structure.
  2. Listing HTML named anchors (HTML destinations) as bookmarks.

By default PD4ML does not generate document bookmarks. To enable the generation, call the API method generateOutlines(boolean) before the render(...) call. It accepts one of two parameter values: true to use the headings structure for bookmarks, false to use named anchors.

(In the PD4ML taglib the generation is controlled by the outline attribute of <pd4ml:transform>. It can be set to "none", "headings" or "anchors".)


The picture above shows a bookmarks list generated from named anchors of PD4ML FAQ page.

  • What do "named anchors" mean? Named anchors (or destinations) are defined in HTML code like the following:
     
    <a name="destination">label</a>.
     
    Note: do not use nested HTML tags for label definition.

  • What do "headings" mean? Headings are HTML tags <h1> ... </h1> to <h6> ... </h6> whose hierarchy can be used by PD4ML to generate tree-like bookmarks structure.

See PD4ML API reference for PD4ML methods description.  
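
To illustrate how a heading hierarchy maps to a tree-like bookmark structure, the following sketch extracts <h1> to <h6> elements with a regular expression and indents each entry by its level. It is a simplified illustration under the assumption of well-formed, non-nested headings, and it does not reflect how PD4ML parses HTML internally; the HeadingOutlineDemo class is hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeadingOutlineDemo {

    // Matches <h1>..</h1> to <h6>..</h6>, with optional attributes on the opening tag.
    private static final Pattern HEADING = Pattern.compile(
            "<h([1-6])[^>]*>(.*?)</h\\1\\s*>",
            Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Returns one line per heading, indented by two spaces per nesting level.
    static List<String> outline(String html) {
        List<String> lines = new ArrayList<>();
        Matcher m = HEADING.matcher(html);
        while (m.find()) {
            int level = Integer.parseInt(m.group(1));
            // Drop inline markup inside the heading text.
            String text = m.group(2).replaceAll("<[^>]*>", "").trim();
            StringBuilder line = new StringBuilder();
            for (int i = 1; i < level; i++) {
                line.append("  ");
            }
            lines.add(line.append(text).toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        String html = "<h1>Manual</h1><p>text</p><h2>Install</h2><h2>Usage</h2>";
        for (String line : outline(html)) {
            System.out.println(line);
        }
    }
}
```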

Inserting page breaks

PD4ML introduces a special HTML tag, <pd4ml:page.break>, which the conversion engine interprets as a page-break command. In JSP the tag should have an XHTML-like closing slash: <pd4ml:page.break/>
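
When the HTML is assembled in Java before conversion, the page-break tag can be inserted between sections as ordinary markup. A minimal sketch, using only the tag name given above; the PageBreakDemo class and its joinSections method are hypothetical helpers.

```java
import java.util.Arrays;
import java.util.List;

class PageBreakDemo {

    // Page-break tag as described in this section.
    static final String PAGE_BREAK = "<pd4ml:page.break>";

    // Joins HTML fragments so that each one starts on a new PDF page.
    static String joinSections(List<String> sections) {
        return String.join("\n" + PAGE_BREAK + "\n", sections);
    }

    public static void main(String[] args) {
        String body = joinSections(Arrays.asList("<p>Hello, World!</p>", "<p>Hello, New Page!</p>"));
        System.out.println("<html><body>\n" + body + "\n</body></html>");
    }
}
```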

Example: generating and sending PDF by email

Note: in order to run the example you should install Java Mail API and JavaBeans Activation Framework.


import java.awt.*;
import java.io.*;
import java.net.URL;
import java.util.Properties;
import javax.activation.*;
import javax.mail.*;
import javax.mail.internet.*;
import org.zefer.pd4ml.PD4ML;

class SendPdf {
    private static String login = "your smtp login";
    private static String password = "your smtp password";
    private static String smtphost = "your smtp host";

    public static void main(String[] args) {
        byte[] att = generateAttachment();
        sendMessage("test_pd4ml@gmx.net", "test", "test message", att);
        System.out.println("done");
    }

    public static byte[] generateAttachment() {
        try {
            // Output
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            // Input
            StringReader sr = new StringReader("hello,World!");
            PD4ML html = new PD4ML();
            html.setPageSize(new Dimension(450, 450));
            html.setPageInsets(new Insets(20, 50, 10, 10));
            html.setHtmlWidth(750);
            html.enableImgSplit(false);
            URL base = new URL("file:C:/images/");
            html.render(sr, bos, base);
            return bos.toByteArray();
        } catch (Exception ex) {
            ex.printStackTrace();
        }
        return null;
    }

    public static void sendMessage(String to, String subj, String text, byte[] att) {
        try {
            Properties props = System.getProperties();
            props.put("mail.smtp.host", smtphost);
            Session session = Session.getDefaultInstance(props, null);
            // Define message
            Message message = new MimeMessage(session);
            message.setFrom(new InternetAddress("pd4ml@gmx.de"));
            message.addRecipient(Message.RecipientType.TO, new InternetAddress(to));
            message.setSubject(subj);
            // Create the message part
            BodyPart messageBodyPart = new MimeBodyPart();
            // Fill the message
            messageBodyPart.setText(text);
            Multipart multipart = new MimeMultipart();
            multipart.addBodyPart(messageBodyPart);
            // Part two is attachment
            messageBodyPart = new MimeBodyPart();
            DataSource source = new BufferedDataSource(att, "attachment");
            messageBodyPart.setDataHandler(new DataHandler(source));
            messageBodyPart.setFileName("test.pdf");
            multipart.addBodyPart(messageBodyPart);
            // Put parts in message
            message.setContent(multipart);
            // Send the message
            message.saveChanges(); // implicit with send()
            Transport transport = session.getTransport("smtp");
            transport.connect(smtphost, login, password);
            transport.sendMessage(message, message.getAllRecipients());
            transport.close();
        } catch (MessagingException e) {
            e.printStackTrace();
        }
    }

    // In-memory DataSource that wraps the generated PDF bytes
    public static class BufferedDataSource implements DataSource {
        private byte[] _data;
        private java.lang.String _name;

        public BufferedDataSource(byte[] data, String name) {
            _data = data;
            _name = name;
        }

        public String getContentType() {
            return "application/octet-stream";
        }

        public InputStream getInputStream() throws IOException {
            return new ByteArrayInputStream(_data);
        }

        public String getName() {
            return _name;
        }

        public OutputStream getOutputStream() throws IOException {
            OutputStream out = new ByteArrayOutputStream();
            out.write(_data);
            return out;
        }
    }
}

6. Web scenarios

#1: Using PD4ML custom tags in JSP

Surrounding HTML/JSP content with <pd4ml:transform> tags.

Note: some combinations of MS Internet Explorer and Adobe Acrobat reader plugin versions are buggy. Instead of a PDF generation result MS IE displays a blank page. Check our online Support/HelpDesk for a possible workaround.

 

1  <%@ taglib uri="http://pd4ml.com/tlds/pd4ml/2.5" prefix="pd4ml" %><%@page
   contentType="text/html; charset=ISO8859_1"%><pd4ml:transform
      screenWidth="400"
      pageFormat="A5"
      pageOrientation="landscape"
      pageInsets="100,100,100,100,points"
      enableImageSplit="false">

  <html>
      <head>
            <title>pd4ml test</title>
            <style type="text/css">
                  body {
                        color: red;
                        background-color: #FFFFFF;
                        font-family: Tahoma, "Sans-Serif";
                        font-size: 10pt;
                  }
            </style>
      </head>
      <body>
2           <img src="images/logos.gif" width="125" height="74">
            <p>
            Hello, World!
3 <pd4ml:page.break/>
            <table width="100%" style="background-color: #f4f4f4; color: #000000">
            <tr>
            <td>
                  Hello, New Page!
            </td>
            </tr>
            </table>
      </body>
  </html>
4 </pd4ml:transform>

 

Comments:
1. PD4ML JSP taglib declaration and opening transform tag. JSP content surrounded by the <pd4ml:transform> and </pd4ml:transform> tags is passed to the PD4ML converter.
2. Images should be referenced with a relative path. Absolute URLs, like src="http://myserver:80/path/to/img.gif", are allowed as well, but server-relative paths like src="/path/to/img.gif" are not.
3. The directive forces the PD4ML converter to insert a page break into the output PDF.
4. Closing of the transformation tag. Any content that appears after the tag is ignored.
5. There is a CSS bug in JDKs older than v1.5b2. To avoid it, use lowercase CSS class names. (Irrelevant since PD4ML v3.x)


 

Defining PDF document footer (or header) with JSP custom tag.

The <pd4ml:header> and <pd4ml:footer> JSP tags as well as inline, fileName and interpolateImages attributes of <pd4ml:transform> tag are available since v1.0.5

<%@ taglib uri="http://pd4ml.com/tlds/pd4ml/2.5" prefix="pd4ml" %><%@page
contentType="text/html; charset=ISO8859_1"%><pd4ml:transform
       screenWidth="400"
       pageFormat="A5"
       pageOrientation="landscape"
       pageInsets="15,15,15,15,points"
       enableImageSplit="false"
       inline="true"
       fileName="footer.pdf"
       interpolateImages="false">

<pd4ml:footer
1      titleTemplate="[${title}]"
2      pageNumberTemplate="page ${page}"
       titleAlignment="left"
       pageNumberAlignment="right"
       color="#008000"
3      initialPageNumber="1"
4      pagesToSkip="1"
       fontSize="14"
5      areaHeight="18"/>

<html>
       <head>
             <title>pd4ml header/footer test</title>
             <style type="text/css">
                    body {
                           color: #000000;
                           background-color: #FFFFFF;
                           font-family: Tahoma, "Sans-Serif";
                           font-size: 10pt;
                    }
             </style>
       </head>
       <body>
             <img src="images/logos.gif" width="125" height="74">
             <p>
             Hello, World!
<pd4ml:page.break/>
             <table width="100%" style="background-color: #f4f4f4; color: #000000">
             <tr>
             <td>
                    Hello, New Page!
             </td>
             </tr>
             </table>
       </body>
</html>
</pd4ml:transform>

 

Comments:
1. Title template definition. A string that can optionally contain placeholders ${title} for a title value taken from HTML's <title> tag, ${page} for a page counter value.
2. Page number template definition. A string with placeholder ${page} for a page counter value.
3. The attribute initializes internal page counter with the given value.
4. The attribute specifies that 1 page should not contain footer information.
5. Footer area height in points.

The syntax like ${var} has special meaning in the most recent Java Servlet API versions. In order to avoid notation conflicts PD4ML additionally supports $[var] placeholders since v3.x.
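
Existing templates written with ${var} placeholders can be migrated to the $[var] notation mechanically. The sketch below does this with a regular expression; the PlaceholderSyntaxDemo class is a hypothetical helper, and it assumes placeholder names consist of letters, digits and underscores only.

```java
class PlaceholderSyntaxDemo {

    // Rewrites ${name} placeholders as $[name], leaving all other text untouched.
    static String toBracketSyntax(String template) {
        return template.replaceAll("\\$\\{(\\w+)\\}", "\\$[$1]");
    }

    public static void main(String[] args) {
        System.out.println(toBracketSyntax("[${title}]"));
        System.out.println(toBracketSyntax("page ${page}"));
    }
}
```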


 

How to add dynamic data (like current date) to PDF header or footer

 

<%
String template = getFormattedDate() + ", page ${page} ";
%>

<pd4ml:footer
    pageNumberTemplate="<%=template%>"
    titleAlignment="left"
    pageNumberAlignment="right"
    color="#008000"
    initialPageNumber="1"
    pagesToSkip="1"
    fontSize="14"
    areaHeight="18"/>
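
The getFormattedDate() method used in the scriptlet above is not defined in this manual. One possible implementation, offered only as a sketch, formats the current date with java.text.SimpleDateFormat; the FooterDateDemo class name, the date pattern and the formatDate helper are assumptions made for illustration.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class FooterDateDemo {

    // Formats the given date as yyyy-MM-dd in the given time zone.
    static String formatDate(Date date, TimeZone zone) {
        SimpleDateFormat format = new SimpleDateFormat("yyyy-MM-dd", Locale.US);
        format.setTimeZone(zone);
        return format.format(date);
    }

    // Possible body for the getFormattedDate() call used in the JSP scriptlet.
    static String getFormattedDate() {
        return formatDate(new Date(), TimeZone.getDefault());
    }

    public static void main(String[] args) {
        String template = getFormattedDate() + ", page ${page} ";
        System.out.println(template);
    }
}
```

In a JSP page the method could be placed in a declaration block or in a helper class referenced from the scriptlet.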

Temporary saving generated PDF to hard drive.

With the <pd4ml:savefile> tag you can store the just-generated PDF on the hard drive and either redirect the user's browser to read the PDF as a static resource, or forward the request to another URL for PDF post-processing.

Note: the tag should be nested in <pd4ml:transform> and have no body.

Usage 1.

<pd4ml:savefile
    uri="/WEB/savefile/saved/"
    dir="D:/spool/generated_pdfs"
    redirect="pdf"
    debug="false"/>


The tag above forces PD4ML to save the generated PDF to D:/spool/generated_pdfs with an autogenerated name.

It is expected that the local directory D:/spool/generated_pdfs corresponds to the URL http://yourserver.com/WEB/savefile/saved/ (as given in the "uri" attribute).

After generation PD4ML sends a redirect command to the client's browser with a URL like this:

http://yourserver.com/WEB/savefile/saved/generated_name.pdf

Usage 2.

<pd4ml:savefile
    dir="D:/spool/generated_pdfs"
    redirect="/mywebapp/send_pdf_by_email.jsp"
    debug="false"/>


The tag above forces PD4ML to save the generated PDF to D:/spool/generated_pdfs with an autogenerated name.

After that it forwards to /mywebapp/send_pdf_by_email.jsp with a parameter filename=<pdfname>.

So send_pdf_by_email.jsp can read the file name

String fileName = request.getParameter("filename");

build the full path

String path = "D:/spool/generated_pdfs" + "/" + fileName;

and then read the just-generated PDF file and perform post-processing or other actions (like sending email).

In both cases above you can predefine the PDF file name with the "name" attribute. If a file with that name already exists in D:/spool/generated_pdfs, then an auto-incremented numeric value is appended to the new file name.
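
Because the filename parameter arrives with the request, a post-processing page such as send_pdf_by_email.jsp may want to validate it before building the path. The following sketch rejects names that contain path separators or parent-directory references. The SavedPdfLocator class and its resolve method are hypothetical, and the validation rules are an assumption rather than PD4ML behaviour.

```java
import java.io.File;

class SavedPdfLocator {

    // Builds the path of a saved PDF inside the spool directory,
    // refusing names that could point outside of it.
    static File resolve(File spoolDir, String fileName) {
        if (fileName == null || fileName.isEmpty()
                || fileName.contains("/")
                || fileName.contains("\\")
                || fileName.contains("..")) {
            throw new IllegalArgumentException("unexpected file name: " + fileName);
        }
        return new File(spoolDir, fileName);
    }

    public static void main(String[] args) {
        File pdf = resolve(new File("D:/spool/generated_pdfs"), "report.pdf");
        System.out.println(pdf.getPath());
    }
}
```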
 

#2: Using PD4ML custom tags with ColdFusion
 

The described integration method was tested with ColdFusion MX 6.1 enterprise and development editions running under Jrun4

Making PD4ML available in ColdFusion web application.

Copy pd4ml.jar, pd4ml_tl.jar and pd4ml.tld (or pd4ml_demo.jar, pd4ml_tl_demo.jar and pd4ml.tld for the evaluation version) to the WEB-INF/lib directory of your CF-enabled application. By default it is ${jrun4}/servers/cfusion/cfusion-ear/cfusion-war/WEB-INF/lib.

Restart the CF runtime!

Creating a PD4ML-enabled .cfm page

<cfimport taglib="/WEB-INF/lib/pd4ml.tld" prefix="pd4ml"><pd4ml:transform

       screenWidth="400"

       pageFormat="A5"

       pageOrientation="landscape"

       pageInsets="15,15,15,15,points"

       enableImageSplit="false"

       inline="true"

       fileName="myreport.pdf"

       encoding="ISO8859_1"

       interpolateImages="false">