An HTML->PDF conversion procedure can be logically split into 3 phases:
1. document parsing and loading of external resources (images, CSS)
2. document layout/rendering
3. PDF output
PDF page output is a swift procedure which, as a rule, takes no more than 5-10% of the entire conversion time.
Parsing is also relatively fast, but loading external resources may take a significant amount of time because of network delays. To detect such delays, you can use the existing debug mode: it writes a message to STDOUT/the server log as soon as a particular resource has been loaded.
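If you want to cross-check the resource loading times outside of the converter, a rough sketch like the one below can help. It is not part of PD4ML: the class name `ResourceLoadTimer` and the idea of passing each resource fetch as a `Callable` are assumptions made only for illustration. It simply runs each loader in turn and records how long it took.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical helper (not a PD4ML class): times arbitrary "load" actions.
final class ResourceLoadTimer {

    /**
     * Runs every loader once, in map order, and returns the elapsed
     * wall-clock time per resource name in milliseconds.
     */
    static Map<String, Long> timeAll(Map<String, Callable<?>> loaders) throws Exception {
        Map<String, Long> elapsedMillis = new LinkedHashMap<>();
        for (Map.Entry<String, Callable<?>> entry : loaders.entrySet()) {
            long start = System.nanoTime();
            entry.getValue().call(); // e.g. fetch an image or a CSS file
            long elapsed = (System.nanoTime() - start) / 1_000_000;
            elapsedMillis.put(entry.getKey(), elapsed);
        }
        return elapsedMillis;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Callable<?>> loaders = new LinkedHashMap<>();
        // Stand-ins for real fetches; replace with your own download code.
        loaders.put("fast resource", () -> null);
        loaders.put("slow resource", () -> { Thread.sleep(200); return null; });

        for (Map.Entry<String, Long> e : timeAll(loaders).entrySet()) {
            System.out.println(e.getKey() + ": " + e.getValue() + " ms");
        }
    }
}
```

Resources that stand out in such a list are the ones worth caching locally or serving from a faster host before looking any deeper into the conversion itself.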
The most resource-consuming phase is HTML layout/rendering. Its cost depends on the document size and structure. For example, table layout is done in three passes (min, max, optimal width). That means any nested table is laid out in 9 passes, a table at the next nesting level in 27 passes, and so on. Even if we added detailed debug output for this phase, it would only lead you to quite obvious discoveries: source document huge in size -> long conversion time; source document has multiple levels of table nesting -> long conversion time…
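To make the growth concrete, here is a small sketch of the arithmetic described above. It only models the "three passes per nesting level" rule from this post; the class and method names are made up for the example, and real layout cost also depends on table size and content.

```java
// Illustrative only: models the 3-passes-per-level rule, not PD4ML internals.
final class TablePassEstimate {

    // Passes for one table at one level: min, max and optimal width.
    static final int PASSES_PER_LEVEL = 3;

    /**
     * Number of layout passes for a table at the given nesting depth,
     * where depth 1 is a top-level table and depth 2 is a table nested in it.
     */
    static long passesAtDepth(int depth) {
        if (depth < 1) {
            throw new IllegalArgumentException("depth must be >= 1");
        }
        long passes = 1;
        for (int level = 0; level < depth; level++) {
            passes = Math.multiplyExact(passes, (long) PASSES_PER_LEVEL);
        }
        return passes;
    }

    public static void main(String[] args) {
        for (int depth = 1; depth <= 5; depth++) {
            System.out.println("depth " + depth + ": " + passesAtDepth(depth) + " passes");
        }
    }
}
```

Under this rule the count is 3 to the power of the nesting depth, so five levels of nesting already mean 243 passes for the innermost table. Flattening nested tables is therefore usually the most effective way to cut conversion time.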