HTML to PDF converter for Java™

Frequently Asked Questions


  1. General questions
  2. Legal/Support questions
  3. Technical questions
  4. Troubleshooting
General questions

Which version of HTML is supported by PD4ML?

PD4ML provides basic HTML 3.2 and HTML 4.0 support with some exceptions. Here is the list of supported tags.

Is CSS supported as well?

Yes. The list of supported CSS properties is not complete yet, but it is growing steadily. It is important that inline CSS sections and references to external CSS style sheets are located within the <head> tag.

Which fonts are supported by PD4ML?

The Standard and Web versions of PD4ML output PDFs that refer only to the standard fonts that come with Adobe Acrobat Reader: Courier, Times Roman and Helvetica. PD4ML Pro has no such limitation: it supports embedding of TrueType and OpenType fonts.

Which version of JDK is supported?

PD4ML is intended for use with JDK 1.4 and above, although it also works under JDK 1.3.1 with minor restrictions.

Which version of PDF is output?

PDF v1.4

Are HTML framesets supported by PD4ML?

No

Legal/Support questions

What is the difference between commercial and evaluation copies of PD4ML?

The evaluation copy of PD4ML matches the commercial version functionally, except that PDFs generated by the evaluation version are “watermarked” with a PD4ML ad.

How can I request a new feature?

You can send us an e-mail describing the required feature. We then analyze your request and fulfil it in one of the following ways:
  • If the requested feature is useful for the library as a whole, we can include it in our to-do list for the next version or update.
  • If the requested feature is specific to your needs, we can help you for free, provided it does not take a lot of time and resources.
  • If the requested feature is specific to your needs and the work is time-constrained and needs substantial effort, we can estimate it. The estimate describes the time and pricing for the scope of work.

How many licenses do I need?

There are 2 types of PD4ML licenses: Single and Volume.

PD4ML Java Library Standard, PD4ML Java Library Web and PD4ML Java Library Pro are single licenses. A single license is valid for a group of developers OR for a single desktop product title OR for a single web deployment.

A single web deployment means a deployment to a single hardware web server that is identified by a single IP/DNS address.

Special situations:

  • If your PD4ML-enabled web application runs in a clustered environment, then you need a single license for each clustered server.
  • If your single server runs multiple PD4ML-enabled web applications accessible at different addresses, then you need a single license per address.
  • If a cluster of servers runs multiple PD4ML-enabled web applications accessible at different addresses, then you should compare the number of servers in the cluster with the number of external addresses and take the greater number as the quantity of PD4ML licenses needed.
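
The three special situations above reduce to taking the larger of two counts. A minimal sketch of that arithmetic follows; the class and method names are illustrative, not part of PD4ML, and the result is no substitute for the actual license terms.

```java
/** Illustrative helper, not part of PD4ML: counts single licenses for web deployments. */
class LicenseCount {

    /**
     * @param servers   number of hardware servers running PD4ML-enabled web applications
     * @param addresses number of distinct external IP/DNS addresses serving them
     * @return the greater of the two counts, as the special situations above describe
     */
    static int singleLicensesNeeded(int servers, int addresses) {
        if (servers < 1 || addresses < 1) {
            throw new IllegalArgumentException("servers and addresses must be at least 1");
        }
        return Math.max(servers, addresses);
    }

    public static void main(String[] args) {
        System.out.println(singleLicensesNeeded(4, 1)); // cluster of four servers, one address
        System.out.println(singleLicensesNeeded(1, 3)); // one server, three addresses
        System.out.println(singleLicensesNeeded(4, 6)); // four servers, six addresses
    }
}
```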

PD4ML Volume License and its extension, PD4ML Source Code License, allow you to deploy PD4ML libraries (Standard, Web or Pro) with your web or desktop applications to an unlimited number of client computers or web servers.

Technical questions

How to use PD4ML with ColdFusion?

See the dedicated section in the reference manual.

How big can the original HTML or the resulting PDF be?

The HTML-to-PDF conversion includes a step called “HTML rendering”, in which the original HTML document is completely loaded and processed in RAM: HTML elements must be correctly positioned, images loaded, text aligned, table column widths balanced, styles applied and so forth. Because of the way HTML rendering works, PD4ML cannot convert a document to PDF in portions. The maximum sizes of the HTML and the PDF therefore depend on the memory available to your JVM.
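
To see how much heap the JVM is allowed to use before converting a large document, the standard Runtime API can be queried. This is a small sketch with an illustrative class name; the limit itself is normally raised with the JVM's -Xmx option.

```java
/** Illustrative check of the heap limit that bounds in-memory HTML rendering. */
class HeapCheck {

    /** Maximum amount of memory the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```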

Which additional tools/libraries are needed to run PD4ML?

PD4ML requires the open source CSS parser library ss_css2.jar. A patched version of the library and the modified source code are available for download from our site.

Is PD4ML thread-safe?

PD4ML instances access shared resources (the shared image and font cache, for example) in a thread-safe way. If you need to run multiple HTML-to-PDF conversions simultaneously, create a separate PD4ML instance for each converting thread.
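
The one-instance-per-thread rule can be sketched with a standard thread pool. The Converter class below is a hypothetical stand-in so that the example runs without the library; in real code each task would create its own PD4ML instance at that point.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PerThreadConversion {

    /** Hypothetical stand-in for a converter object; not the PD4ML API. */
    static class Converter {
        byte[] convert(String html) {
            // Placeholder work: a real converter would return PDF bytes here.
            return html.getBytes(StandardCharsets.UTF_8);
        }
    }

    /** Converts each document on a pool thread, creating a fresh Converter per task. */
    static List<byte[]> convertAll(List<String> documents, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> futures = new ArrayList<>();
            for (final String html : documents) {
                // The converter is created inside the task, so it is never shared.
                Callable<byte[]> task = () -> new Converter().convert(html);
                futures.add(pool.submit(task));
            }
            List<byte[]> results = new ArrayList<>();
            for (Future<byte[]> future : futures) {
                results.add(future.get()); // results are collected in submission order
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("conversion failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```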

Can I pre-define font sizes in resulting PDF documents?

No. PD4ML scales the HTML document according to the HTML width given in screen pixels and the PDF page width (minus horizontal insets) given in typographical points. Font sizes are scaled along with the entire document so that the page layout is not broken.
If you need a specific font size in your PDF documents, try to reach it empirically by "playing" with the HTML width and the right PDF page inset value.
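
As a starting point for that experimentation, the scaling can be approximated with simple arithmetic. The sketch below assumes the scale is exactly the printable page width in points divided by the HTML width in pixels, and that insets are given in points; neither is confirmed here, and the names are illustrative rather than part of the PD4ML API.

```java
/** Illustrative arithmetic only; names are not part of the PD4ML API. */
class FontScale {

    /** Assumed document scale: printable page width in points over HTML width in pixels. */
    static double scale(double pageWidthPt, double leftInsetPt, double rightInsetPt, double htmlWidthPx) {
        return (pageWidthPt - leftInsetPt - rightInsetPt) / htmlWidthPx;
    }

    /** Approximate size in points at which a font of cssSizePx pixels would appear in the PDF. */
    static double pdfFontSizePt(double cssSizePx, double scale) {
        return cssSizePx * scale;
    }

    /** HTML width in pixels that would map cssSizePx to targetPt on the given printable width. */
    static double htmlWidthFor(double printableWidthPt, double cssSizePx, double targetPt) {
        return printableWidthPt * cssSizePx / targetPt;
    }

    public static void main(String[] args) {
        // A4 is about 595 pt wide; the 50 pt insets and 700 px HTML width are example values.
        double s = scale(595, 50, 50, 700);
        System.out.println("scale = " + s);
        System.out.println("16 px font -> " + pdfFontSizePt(16, s) + " pt");
        System.out.println("HTML width for 12 pt: " + htmlWidthFor(495, 16, 12) + " px");
    }
}
```

Treat the numbers as a first guess and adjust the HTML width from there.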

Update: the API call protectPhysicalUnitDimensions() forces PD4ML to respect sizes/dimensions given in pt, in, cm etc. when outputting to PDF. Use with care.

Is there an automated way to split a table that does not fit on a single page?

The simplest approach is to define the "page-break-inside: avoid" CSS property for the table rows.

PD4ML also provides limited experimental support for table header replication:

You need to invoke the pd4ml.enableTableBreaks(true) method. It implicitly defines "page-break-inside: avoid" for <tr> elements and tries to copy the first row of a split table to each page if that row consists of <th> elements only.

In both cases the table border and the table background cannot reflect the splits at <tr> level. So the best way is to disable the table border and background and to define borders and background colors for single cells. See example 1 and example 2 (more sophisticated CSS, but not supported by MS IE yet).

Update:
TD { border: 1px solid red }
TABLE { border: none; border-collapse: collapse }

also do the trick in current versions of PD4ML.

How to add dynamic data (like current date) to PDF header or footer?

Java:

String pageNumberTemplate = getFormattedDate() + ", page ${page}";
footer.setPageNumberTemplate( pageNumberTemplate );

JSP:

<%
String template = getFormattedDate() + ", page ${page} ";
%>

<pd4ml:footer
    pageNumberTemplate="<%=template%>"
    titleAlignment="left"
    pageNumberAlignment="right"
    color="#008000"
    initialPageNumber="1"
    pagesToSkip="1"
    fontSize="14"
    areaHeight="18"/>
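
The snippets above call getFormattedDate() without defining it. A minimal sketch of such a helper is shown below; the class name, the date pattern and the Date parameter are assumptions made for illustration, and "${page}" stays literal text in the string for PD4ML to replace.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

class FooterTemplate {

    /** Hypothetical helper standing in for getFormattedDate() from the snippets above. */
    static String getFormattedDate(Date date) {
        // The pattern is an example; a new formatter per call avoids sharing it across threads.
        return new SimpleDateFormat("yyyy-MM-dd", Locale.US).format(date);
    }

    /** Builds the template string; "${page}" is kept as plain text in the result. */
    static String pageNumberTemplate(Date date) {
        return getFormattedDate(date) + ", page ${page}";
    }

    public static void main(String[] args) {
        System.out.println(pageNumberTemplate(new Date()));
    }
}
```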

Why is the gap between the header and the content bigger on the first PDF page than on the following ones?

The content on the first page is placed respecting the HTML document top margin. In order to suppress it define the following CSS properties:
<body style="padding-top: 0; margin-top: 0">

How to determine pd4ml(_demo).jar version?

Test it with the simple tool

Troubleshooting

What causes the error "Can't connect to X11 window server" (or any other exception that mentions GraphicsEnvironment) when I access PD4ML from a server-side Java application running on a UNIX/Linux machine?

Unix Java users who just want to use AWT or Swing components (PD4ML relies on them), even without displaying them, get an error message if no X server is installed (which is the case on many server systems), because these components require an X server. Some tips on how to solve the problem:
  • The recommended solution is to run your application or servlet engine with -Djava.awt.headless=true given as a parameter to the virtual machine. This works only with JDK 1.4 and above. Java 1.4 includes a new image I/O API that reportedly does not require an X server.
  • Install xvfb. It provides an X server emulator that can run on machines with no display hardware and no physical input devices.
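
Where the JVM command line cannot be changed, the same property can be set programmatically, provided this happens before any AWT class is initialized. The sketch below uses only standard JDK calls; the class name is illustrative.

```java
import java.awt.Graphics2D;
import java.awt.GraphicsEnvironment;
import java.awt.image.BufferedImage;

class HeadlessCheck {

    /** Requests headless mode and reports whether AWT is actually running headless. */
    static boolean enableHeadless() {
        // Has an effect only if it runs before AWT is first initialized.
        System.setProperty("java.awt.headless", "true");
        return GraphicsEnvironment.isHeadless();
    }

    public static void main(String[] args) {
        System.out.println("Headless: " + enableHeadless());
        // Off-screen drawing does not need an X server in headless mode.
        BufferedImage image = new BufferedImage(10, 10, BufferedImage.TYPE_INT_RGB);
        Graphics2D graphics = image.createGraphics();
        graphics.dispose();
    }
}
```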

"getOutputStream() has already been called" Exception by servlet engine

Your JSP has whitespace characters (e.g. newline characters) that cause the output writer to be opened before PD4ML takes control. Solution: remove all whitespace before "<pd4ml:transform ".
Wrong:
<%@ taglib uri="http://pd4ml.com/tlds/pd4ml/2.5" prefix="pd4ml" %>
<pd4ml:transform
    screenWidth="700"
    pageFormat="A4"
Correct:
<%@ taglib uri="http://pd4ml.com/tlds/pd4ml/2.5" prefix="pd4ml"
%><pd4ml:transform
    screenWidth="700"
    pageFormat="A4"

In the first ("wrong") example there is a new line character between "%>" and "<pd4ml:transform", which causes the exception.
 

The combination of MS Internet Explorer 6 and Adobe Acrobat Reader 6.01 does not open generated PDF documents.

First of all, this is not a problem with the PD4ML software. It seems that Adobe Acrobat Reader plug-ins, starting from version 6.01, do not correctly process the content type supplied in the HTTP response header and reject all PDF documents whose URL strings do not end with the .pdf extension.
The only workaround we know of is to add the following directives to your web.xml file:

<servlet>
    <servlet-name>converter</servlet-name>
    <jsp-file>/test.jsp</jsp-file>
</servlet>

<servlet-mapping>
    <servlet-name>converter</servlet-name>
    <url-pattern>/pd4ml.pdf</url-pattern>
</servlet-mapping>

Replace "test.jsp" with the actual name of your JSP.
Now you can access the generated page with the url:
http://yourserver/yourapp/pd4ml.pdf

How to interpret InvocationTargetException?

In most cases the exception is caused by invalid usage of PD4ML. It is thrown by the Java dispatcher thread, which processes PDF rendering. The exception itself is not informative; the important information comes with the encapsulated "original" exception. You can find it in the stack trace by searching for the "Caused by:" pattern (closer to the end of the trace).
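
The same "original" exception can be reached in code by following getCause() to the end of the chain. A minimal sketch with standard JDK classes follows; the class name and the sample message are made up for illustration.

```java
import java.lang.reflect.InvocationTargetException;

class RootCause {

    /** Follows the cause chain and returns the innermost exception. */
    static Throwable rootCause(Throwable throwable) {
        Throwable current = throwable;
        while (current.getCause() != null && current.getCause() != current) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Simulated wrapper; the inner message is an invented example.
        Throwable wrapped = new InvocationTargetException(
                new IllegalArgumentException("example cause"));
        System.out.println(rootCause(wrapped));
    }
}
```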

In the generated PDF file hyperlinks point to my local drive, instead of HTTP location?

You can define the correct document base with an HTML tag like this: <base href="http://127.0.0.1:8080/pd4ml/">

All PDFs generated by PD4ML running under Resin application server are blank.

The Resin application server always uses the default OS character encoding in ServletOutputStream methods, ignoring explicitly given settings.
If the default OS encoding differs from Latin 1 (ISO-8859-1), Resin tries to convert the output, corrupting binary data (in our case, the generated PDF).
To avoid this conversion, add the following parameter to the Resin command line:
-Dfile.encoding=ISO-8859-1
Under particular conditions the encoding="default" attribute of <pd4ml:transform> also helps.
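
Why a character conversion damages binary output can be shown with plain JDK classes. The sketch below does not reproduce Resin's internals; it only illustrates that a round trip through ISO-8859-1 preserves every byte while a round trip through another charset (UTF-8 here, as an example) does not. The class name is illustrative.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class EncodingCorruption {

    /** Decodes bytes to text with the given charset and encodes them back again. */
    static byte[] roundTrip(byte[] data, Charset charset) {
        return new String(data, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // "%PDF-" followed by a few high-bit bytes, as is typical near the start of a PDF.
        byte[] binary = {'%', 'P', 'D', 'F', '-', (byte) 0xE2, (byte) 0xE3, (byte) 0xCF, (byte) 0xD3};

        boolean latin1Intact = Arrays.equals(binary, roundTrip(binary, StandardCharsets.ISO_8859_1));
        boolean utf8Intact = Arrays.equals(binary, roundTrip(binary, StandardCharsets.UTF_8));

        System.out.println("ISO-8859-1 round trip intact: " + latin1Intact);
        System.out.println("UTF-8 round trip intact: " + utf8Intact);
    }
}
```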

Browser error "File does not begin with '%PDF-'"

Usually the problem occurs when an application server or servlet engine runs in combination with the Apache web server. By default Apache tries to optimize PDF delivery with the so-called "byteserve" method, which is not interpreted correctly by every browser/Acrobat Reader plug-in version. To disable the optimization, add the following directive to Apache's httpd.conf file:
Header unset Accept-Ranges
Check the Apache documentation to learn which config section the directive should go in.
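
To tell a delivery problem from a generation problem, it can help to check the bytes produced on the server side, before they pass through the web server, for the "%PDF-" signature. A small sketch with an illustrative class name:

```java
import java.nio.charset.StandardCharsets;

class PdfSignature {

    private static final byte[] MAGIC = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /** Returns true if the data begins with the "%PDF-" signature. */
    static boolean startsWithPdfHeader(byte[] data) {
        if (data == null || data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] sample = "%PDF-1.4\n".getBytes(StandardCharsets.US_ASCII);
        System.out.println(startsWithPdfHeader(sample));
    }
}
```

If the generated bytes pass this check but the browser still reports the error, the delivery path is the more likely culprit.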

"Can not change default encoding AAA to BBB as specified in the source HTML" exception

The exception means that PD4ML encountered a charset directive whose value differs from the default. It tries to re-open the input stream in order to apply the new encoding, but cannot (reset() is not supported).

The simplest workaround is to use
public void render( URL url, OutputStream os )
instead of your approach.
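
If the HTML is only available in memory, one possible way to obtain a URL for the render( URL url, OutputStream os ) call is to save the bytes to a temporary file first. This is a sketch under that assumption, using only JDK classes; the class and method names are illustrative, the render call itself is not shown, and relative links in the HTML would then resolve against the temporary file's location.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.file.Files;

class HtmlToUrl {

    /** Writes the raw HTML bytes unchanged to a temporary file and returns its file: URL. */
    static URL toTempFileUrl(byte[] htmlBytes) {
        try {
            File file = File.createTempFile("pd4ml-input", ".html");
            file.deleteOnExit(); // best-effort cleanup when the JVM exits
            Files.write(file.toPath(), htmlBytes);
            return file.toURI().toURL();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        URL url = toTempFileUrl("<html><body>test</body></html>".getBytes());
        // The resulting URL would be passed as the first argument of render( URL, OutputStream ).
        System.out.println(url);
    }
}
```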

org/w3c/dom/Document LinkageError under WebSphere or WebLogic.

Some application servers deploy org/w3c/dom/* classes by default. The open source CSS parser library (ss_css2.jar) also includes that package, which causes the conflict. A solution is to remove the conflicting package from ss_css2.jar using WinZip or any other JAR/ZIP tool.
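
The same removal can also be scripted with the JDK's own ZIP classes. The sketch below copies a JAR stream and skips every entry under a given prefix such as "org/w3c/dom/". Class and method names are illustrative, entry metadata such as timestamps is not preserved, and sampleJar and entryNames exist only to make the example self-checking.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class JarPackageStripper {

    /** Copies a JAR/ZIP stream to out, skipping entries whose names start with prefix. */
    static int strip(InputStream in, OutputStream out, String prefix) {
        int removed = 0;
        try (ZipInputStream zin = new ZipInputStream(in);
             ZipOutputStream zout = new ZipOutputStream(out)) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.getName().startsWith(prefix)) {
                    removed++; // conflicting entry: do not copy it
                    continue;
                }
                zout.putNextEntry(new ZipEntry(entry.getName()));
                int n;
                while ((n = zin.read(buffer)) != -1) {
                    zout.write(buffer, 0, n);
                }
                zout.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return removed;
    }

    /** Builds a tiny in-memory archive with the given entry names (demo helper). */
    static byte[] sampleJar(String... names) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            for (String name : names) {
                zout.putNextEntry(new ZipEntry(name));
                zout.write(name.getBytes("UTF-8"));
                zout.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Lists the entry names of an in-memory archive (demo helper). */
    static List<String> entryNames(byte[] jar) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(jar))) {
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

For a real file, wrap a FileInputStream and a FileOutputStream around the original and the stripped JAR, and keep a backup of the original.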

Copyright ©2004-08 zefer|org. All rights reserved.