PD4ML. New Generation
With version 3.0 we introduce a new generation of PD4ML tools based on our own implementation of an HTML rendering engine.
Previous PD4ML versions used a general-purpose HTML renderer, JEditorPane/HTMLEditorKit, which ships as a standard component of the JDK. That component provides basic HTML layout functionality, but it does not allow us to extend it with new features that are important for efficient and accurate PDF generation.
The new rendering engine is still based on the HTML 3.2 specification, but adds some of the most popular features of HTML 4.x.
New features in PD4ML v3.0:
- XHTML-like syntax supported
- Table border style and width can be controlled
- Added implementation of a number of widely used tags (e.g. <span>); the new tag styles are correctly applied to nested content.
- Element visibility control added
- Justification of text blocks is possible.
- Type, descendant and child selectors of CSS2 are supported.
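To make the three selector kinds in the last item concrete, here is a minimal sketch in Java. It is an illustration only and says nothing about how the PD4ML engine matches selectors internally: the class SelectorSketch and its method matches are hypothetical names, and an element is modeled simply as the list of tag names from the document root down to the element itself.

```java
import java.util.List;

// Hypothetical illustration only; not part of the PD4ML API.
class SelectorSketch {

    // Returns true if the element described by 'path' (tag names from the
    // root down to the element itself) is matched by 'selector'.
    // Handles type selectors ("p"), descendant combinators ("div p")
    // and child combinators ("div > p"); nothing else.
    static boolean matches(String selector, List<String> path) {
        // Pad '>' with spaces so "div>p" and "div > p" tokenize identically.
        String[] tokens = selector.replace(">", " > ").trim().split("\\s+");
        return matchFrom(tokens, tokens.length - 1, path, path.size() - 1);
    }

    // Matches tokens[0..t] against path[0..p]; tokens[t] must be a type
    // selector equal to the tag name at path[p].
    private static boolean matchFrom(String[] tokens, int t, List<String> path, int p) {
        if (p < 0 || !tokens[t].equalsIgnoreCase(path.get(p))) {
            return false;
        }
        if (t == 0) {
            return true; // the leftmost type selector matched
        }
        if (tokens[t - 1].equals(">")) {
            // Child combinator: the direct parent must match the selector to the left.
            return t >= 2 && matchFrom(tokens, t - 2, path, p - 1);
        }
        // Descendant combinator: any ancestor may match the selector to the left.
        for (int a = p - 1; a >= 0; a--) {
            if (matchFrom(tokens, t - 1, path, a)) {
                return true;
            }
        }
        return false;
    }
}
```

For a <p> nested as html, body, div, span, p, the selector "div p" matches because <div> is an ancestor, while "div > p" does not because the direct parent is <span>.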
The new rendering engine is faster than HTMLEditorKit and is intended to be thread-safe. PDF generation requests no longer need to be implicitly serialized.
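In practice, a thread-safe renderer means an application can issue conversions from several threads at once without placing a lock or a queue around the call. The sketch below shows that calling pattern only. The HtmlToPdf interface is a hypothetical stand-in for the actual conversion call, not the PD4ML API, and the example assumes java.util.concurrent, which requires Java 5 or newer (on JDK 1.4 plain threads would be needed instead).

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a thread-safe HTML-to-PDF conversion call.
interface HtmlToPdf {
    void convert(String html, ByteArrayOutputStream out) throws Exception;
}

class ParallelRenderSketch {

    // Converts each document on a worker thread. Note that there is no
    // synchronized block or request queue around the conversion call.
    static List<byte[]> renderAll(final HtmlToPdf converter, List<String> pages, int threads)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<byte[]>> pending = new ArrayList<Future<byte[]>>();
            for (final String html : pages) {
                pending.add(pool.submit(new Callable<byte[]>() {
                    public byte[] call() throws Exception {
                        // Every task writes to its own output stream.
                        ByteArrayOutputStream out = new ByteArrayOutputStream();
                        converter.convert(html, out);
                        return out.toByteArray();
                    }
                }));
            }
            List<byte[]> results = new ArrayList<byte[]>();
            for (Future<byte[]> f : pending) {
                results.add(f.get()); // results come back in input order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }
}
```

The only shared object is the converter itself; each task owns its input string and output stream, so the pattern is safe only if the converter really is thread-safe.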
From a developer's perspective there are no significant API changes. Existing PD4ML-enabled applications can be upgraded simply by overwriting the JAR and TLD files with the new versions. The only deployment difference is that PD4ML now uses an open source library from the CSS Parser project (ss_css2.jar, licensed under the LGPL). The library can be obtained from the original site:
or from our download area:
Although the new version of PD4ML is compatible with previous versions at the API level, the resulting PDF layout can differ from what was generated before.
Known problems of HTML rendering:
- For the time being the new version can be used with JDK1.4 and newer.
- Right-to-left scripts are not supported yet.
- The <basefont> and <caption> tags and image maps are not supported yet.
Unsupported CSS features:
- Pseudo-elements in selectors (first-letter, first-line, etc.)
- display property values other than 'none' are ignored.
- background-attachment and background-position properties
- word-spacing and letter-spacing
- text-transform and text-indent
- float and clear
- list-style-image and list-style-position
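The property names in the list above can be checked mechanically before a document is converted. The following is a minimal sketch, assuming a plain text scan is good enough: CssPreflight and findUnsupported are hypothetical names, the helper is not part of PD4ML, and it covers only the listed property names (not pseudo-elements or display values). It is not a real CSS parser, so comments and unusual formatting may produce false results.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PD4ML: reports CSS properties that
// appear in the "not supported" list. Naive text scan, not a CSS parser.
class CssPreflight {

    private static final List<String> UNSUPPORTED = Arrays.asList(
            "background-attachment", "background-position",
            "word-spacing", "letter-spacing",
            "text-transform", "text-indent",
            "float", "clear",
            "list-style-image", "list-style-position");

    // A property name followed by ':' that starts a declaration, i.e. it
    // comes right after '{' or ';' or sits at the very start of the input
    // (the last case covers the content of an inline style attribute).
    private static final Pattern DECLARATION =
            Pattern.compile("(?:^|[{;])\\s*([a-zA-Z-]+)\\s*:");

    // Returns the unsupported property names found, in order of first use.
    static Set<String> findUnsupported(String css) {
        Set<String> found = new LinkedHashSet<String>();
        Matcher m = DECLARATION.matcher(css);
        while (m.find()) {
            String property = m.group(1).toLowerCase(Locale.ENGLISH);
            if (UNSUPPORTED.contains(property)) {
                found.add(property);
            }
        }
        return found;
    }
}
```

Running such a check over a stylesheet before upgrading gives a quick list of rules whose effect may be missing from the generated PDF.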
A. Rendering of the Yahoo front page.
The page below is generated with the previous version of PD4ML. Some table borders are missing,
the right column has wrongly stretched elements, and there are other minor visual defects.
The next PDF is generated with the new version of PD4ML.
B. Rendering of an HTML table.
<table>
<tr><td width="50" height="50"> </td>
<td width="50" style="border: solid 3 red" rowspan="2"> </td></tr>
<tr><td style="border-bottom: solid 7 blue; border-left: solid 7 green"> </td></tr>
</table>
HTMLEditorKit ignores table border style directives, so the previous versions of PD4ML
require HTML workarounds in order to get solid borders.
The new HTML renderer allows the border style to be controlled for the entire table as well as for selected cells.