Re: Re: Large PDF Issue

February 3, 2012 at 13:57

#28857

I hope you compared like with like, i.e. the HTML-to-PDF conversion functions of the tools.

The PDF output step itself is quite quick in PD4ML: 2-3% of the conversion time.
But HTML rendering (the first phase of the conversion) is a resource-consuming task.

PD4ML does not use native HTML rendering engines (like WebKit or Mozilla). PD4ML is managed .NET code (and, in its Java version, 100% Java). Managed code has its benefits and, of course, disadvantages such as performance penalties.

For example, it instantiates a .NET object even for every standalone whitespace. Bearing in mind the generic .NET overhead in CPU/RAM usage, we simply would not recommend converting such big documents.

If you definitely need that, I would recommend revising the document layout. First, make sure the document is not one huge table. PD4ML lays out all the pages in memory before it writes anything out. Any cell on, let’s say, page #350 that is a bit wider than the previous cells of the same column forces a re-layout of the previous 349 pages, as it impacts the entire table layout.

So split big tables into smaller ones.
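
As a rough illustration of splitting, here is a minimal Python sketch that builds the HTML as several small tables instead of one big one, before it is handed to the converter. The function name rows_to_tables and the chunk size of 200 rows are made up for this example; nothing here is PD4ML API.

```python
from html import escape


def rows_to_tables(rows, rows_per_table=200):
    """Render rows as several small HTML tables instead of one huge table.

    rows is a list of rows, each row a list of cell values.
    rows_per_table is an arbitrary illustrative chunk size.
    """
    tables = []
    for start in range(0, len(rows), rows_per_table):
        chunk = rows[start:start + rows_per_table]
        body = "".join(
            "<tr>"
            + "".join(f"<td>{escape(str(cell))}</td>" for cell in row)
            + "</tr>"
            for row in chunk
        )
        # Each chunk becomes its own table, so a wide cell only
        # affects the layout of the rows in the same chunk.
        tables.append(f"<table>{body}</table>")
    return "\n".join(tables)


if __name__ == "__main__":
    data = [[i, f"item {i}"] for i in range(450)]
    html = rows_to_tables(data)
    print(html.count("<table>"), "tables")
```

Note that independent tables compute their column widths independently, so the columns may no longer line up from one table to the next unless you give them explicit widths.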

Second, try to avoid nesting tables where possible. Each table cell is laid out 3 times: MIN, MAX, OPTIMAL. Each cell of a nested table is laid out 9 times. If the nesting level is 2, a cell is laid out 27 times, etc.
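
The numbers above follow a simple power rule: a cell at nesting level n is laid out 3^(n+1) times. A small Python sketch of that arithmetic (the function name layout_passes is just for illustration, and it assumes exactly three passes per level, as described above):

```python
def layout_passes(nesting_level, passes_per_level=3):
    """Number of times a cell is laid out at the given nesting level.

    Level 0 is a cell of a top-level table, level 1 is a cell of a
    table nested inside a cell, and so on.
    """
    return passes_per_level ** (nesting_level + 1)


if __name__ == "__main__":
    for level in range(4):
        print(f"nesting level {level}: {layout_passes(level)} layout passes")
```

So every extra level of nesting triples the layout work for the cells inside it.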