PD4ML: HTML to raster image conversion

PD4ML, as an HTML-to-PDF converter, consists of two relatively separate modules: an HTML rendering engine and a PDF output pseudo-device derived from java.awt.Graphics. This makes it fairly trivial to output a rendered HTML document to an image (or to any other Graphics device).

To make the task even easier, we added an image output mode to the PD4ML API. With a simple output format switch you can produce a PNG or a multipage TIFF:
    pd4ml.outputFormat(PD4Constants.PNG8);
    // or
    pd4ml.outputFormat(PD4Constants.PNG24);
    // or
    pd4ml.outputFormat(PD4Constants.TIFF);

The equivalents in the JSP taglib:

    <pd4ml:transform ... outputFormat="png8"> ... </pd4ml:transform>
    <pd4ml:transform ... outputFormat="png24"> ... </pd4ml:transform>
    <pd4ml:transform ... outputFormat="tiff"> ... </pd4ml:transform>

(In this case the transform tag automatically sets the corresponding Content-Type HTTP header, "image/png" or "image/tiff".)

In the command line tool:

    java -Xmx512m -Djava.awt.headless=true -cp ./pd4ml.jar Pd4Cmd <URL> 1200 -out thumbnail.png -outformat png8
    java -Xmx512m -Djava.awt.headless=true -cp ./pd4ml.jar Pd4Cmd <URL> 1200 -out thumbnail.png -outformat png24
    java -Xmx512m -Djava.awt.headless=true -cp ./pd4ml.jar Pd4Cmd <URL> 1200 -out thumbnail.tiff -outformat tiff

When producing PNG output, PD4ML ignores page breaks; when generating a multipage TIFF, it respects them. There are some other limitations in the HTML-to-image conversion mode.
For cases where you need further image processing, the PD4ML API also provides a couple of specialized renderAsImages() methods, which return an array of BufferedImage objects representing the document pages.

The biggest source of trouble with image output is memory allocation. Even a relatively small 1000x5000 px HTML layout requires allocating at least 20 MB for the image bytes (plus the overhead of the BufferedImage class infrastructure).
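The 20 MB figure can be reproduced with simple arithmetic. The following is a minimal sketch, not part of the PD4ML API: the class and method names are hypothetical, and it assumes 4 bytes per pixel (a 32-bit ARGB raster), which is an inference that happens to match the number quoted above. It may be useful as a rough pre-check before requesting a large image.

```java
// Hypothetical helper, not part of PD4ML: estimates raw pixel memory for a rendered image.
public class ImageMemoryEstimate {

    /** Raw pixel bytes for a width x height raster at the given bytes per pixel. */
    public static long estimateBytes(int widthPx, int heightPx, int bytesPerPixel) {
        // Cast to long first so large layouts do not overflow int arithmetic.
        return (long) widthPx * heightPx * bytesPerPixel;
    }

    /** True if the estimate fits into the given share (0.0 to 1.0) of the JVM's maximum heap. */
    public static boolean fitsInHeap(long bytes, double share) {
        long budget = (long) (Runtime.getRuntime().maxMemory() * share);
        return bytes <= budget;
    }

    public static void main(String[] args) {
        // Assumption: 4 bytes per pixel (32-bit ARGB).
        long bytes = estimateBytes(1000, 5000, 4);
        System.out.println(bytes + " bytes"); // prints "20000000 bytes"

        // Leave headroom for the rest of the conversion; the 0.5 share is an arbitrary choice.
        if (!fitsInHeap(bytes, 0.5)) {
            System.out.println("Consider a larger -Xmx value or a smaller layout.");
        }
    }
}
```

The estimate covers pixel data only; the actual footprint is higher because of the BufferedImage overhead mentioned above and any intermediate buffers used during rendering.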